Supported formats
- TXT — always supported, including on Free
- PDF — Pro only. Extracts embedded text. For scanned PDFs, use Kaizen OCR first to make them text-bearing.
- DOC — Pro. Older Word format.
- DOCX — Pro. Modern Word format.
How to import
- In the Text-to-Speech editor, click Import from TXT / PDF / DOC in the footer.
- Pick your file. Speech Studio extracts the text and loads it into the editor.
- Review, edit if needed, pick voice, generate.
What gets extracted
- TXT: verbatim.
- PDF / DOC / DOCX: body text only. Headers, footers, tables, and heavy formatting often get dropped or reflowed.
Pre-generation cleanup
Imported text usually needs light editing before TTS:
- Remove page numbers that appear mid-flow
- Fix line breaks in hyphenated words (neigh- bourhood → neighbourhood)
- Add paragraph breaks where the import removed them
- Spell out abbreviations TTS might mispronounce (use SSML if you want to preserve the written form but change pronunciation)
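If you prefer to clean the text before pasting it into the editor, the first two steps can be roughly automated. The following is a minimal Python sketch, not a Speech Studio feature: the function name `clean_imported_text` is hypothetical, and both rules are heuristics. A line holding only a number is assumed to be a page number, and any lowercase word split as "neigh- bourhood" is assumed to be a line-break hyphen, so phrases like "pre- and post-" would be joined incorrectly. Review the result by hand.

```python
import re


def clean_imported_text(text: str) -> str:
    """Heuristic cleanup of extracted text before TTS (illustrative only)."""
    # Drop lines that contain nothing but a page number, e.g. "12" or "Page 12".
    kept = [
        line
        for line in text.splitlines()
        if not re.fullmatch(r"\s*(?:page\s+)?\d{1,4}\s*", line, flags=re.IGNORECASE)
    ]
    text = "\n".join(kept)

    # Rejoin words split by a hyphen followed by a space or line break:
    # "neigh- bourhood" or "neigh-\nbourhood" becomes "neighbourhood".
    text = re.sub(r"([a-z])-\s+([a-z])", r"\1\2", text)
    return text


sample = "The neigh- bourhood was quiet.\n12\nIt stayed neigh-\nbourhood wide."
print(clean_imported_text(sample))
```

Paragraph breaks and abbreviations are left out on purpose, since they depend on the document and are easier to fix in the editor.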
Character count
After import, the character counter updates. Free users: keep the text under 250,000 characters. Pro: no cap.
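To check a file against the Free limit before importing it, a rough estimate is enough. This sketch assumes the counter counts characters the way Python's `len` does (Unicode code points); the in-app counter may differ slightly, and `chars_over_limit` is a hypothetical helper name.

```python
FREE_CHAR_LIMIT = 250_000  # Free-plan limit stated above


def chars_over_limit(text: str, limit: int = FREE_CHAR_LIMIT) -> int:
    """Return how many characters to trim to get under the limit (0 if already under)."""
    # "Under" the limit means at most limit - 1 characters.
    return max(0, len(text) - (limit - 1))


print(chars_over_limit("Hello, world."))
```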
Long documents
For novel-length content (200k+ chars), split into chapters and generate separately. Then concatenate the MP3s with any audio tool. This gives you chapter markers and reduces re-run cost if a later chapter needs a tweak.
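The split-and-concatenate workflow can be sketched in Python. Everything here is an assumption for illustration: the helper names are hypothetical, chapters are assumed to start with a line beginning "Chapter ...", and ffmpeg's concat demuxer stands in for "any audio tool". The second function only builds the file list and the command; it does not run anything.

```python
import re

# Assumed heading style: a line that starts with "Chapter" and a number or word.
CHAPTER_HEADING = re.compile(r"^chapter[ \t]+\w+.*$", flags=re.IGNORECASE | re.MULTILINE)


def split_into_chapters(text: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs; text before the first heading gets an empty heading."""
    matches = list(CHAPTER_HEADING.finditer(text))
    if not matches:
        return [("", text.strip())]

    chapters = []
    front_matter = text[: matches[0].start()].strip()
    if front_matter:
        chapters.append(("", front_matter))

    for i, match in enumerate(matches):
        # Each body runs from the end of its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group(0).strip(), text[match.end():end].strip()))
    return chapters


def ffmpeg_concat_plan(mp3_paths: list[str], list_file: str = "chapters.txt",
                       output: str = "book.mp3") -> tuple[str, list[str]]:
    """Build the concat list-file contents and the ffmpeg command (nothing is executed)."""
    # Paths containing single quotes would need escaping; not handled in this sketch.
    list_contents = "".join(f"file '{path}'\n" for path in mp3_paths)
    command = ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", output]
    return list_contents, command


book = "Preface text\nChapter 1 Start\nOnce upon a time.\nChapter 2\nThe end."
for heading, body in split_into_chapters(book):
    print(repr(heading), len(body))
```

Generate each chapter body separately in the editor, save the MP3s in order, write the returned list contents to `chapters.txt`, then run the command. If one chapter needs a tweak later, only that MP3 has to be regenerated before concatenating again.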