Kaizen Speech Studio Help

Frequently asked questions

Quick answers to the most-asked Speech Studio questions.

Getting started

Do I need an Azure account to use Speech Studio?

Not on the Free plan: you can generate up to 9 hours of audio per month using our shared Azure key. For more audio (or to save keys locally), get Pro and set up your own Azure Speech Service resource.

Which Azure region?

Currently only East US. Support for other regions is on the roadmap.

Voices & TTS

What's the difference between Neural and Dragon HD?

Dragon HD Latest Neural is Azure's newest, highest-quality voice model, and it sounds noticeably more natural on longer passages. Pricing is the same as standard Neural. Not every voice is available in HD yet.

Can I clone my own voice?

Yes, using Azure's Custom Neural Voice feature, which requires a separate Azure approval process (Microsoft gates access to prevent misuse). Once approved, Speech Studio works with your custom voice like any other.

Why does my voice sound robotic?

You might be on an older Neural voice or have prosody (rate/pitch) set to extreme values. Try switching to a Dragon HD voice and resetting rate and pitch to their defaults. Also try SSML with explicit pauses instead of relying on punctuation alone.
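
If you have not written SSML before, here is a minimal sketch of what "explicit pauses" can look like. It uses the standard SSML `<break>` element between sentences. The helper name `build_ssml`, the 400 ms pause, and the voice `en-US-JennyNeural` are assumptions chosen for illustration; substitute whichever voice and timing you actually use.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, voice="en-US-JennyNeural", pause_ms=400):
    """Join sentences with explicit <break> pauses inside one <voice> element.

    `voice` and `pause_ms` are illustrative defaults, not recommendations.
    """
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, <, > so plain text cannot break the XML.
    parts = [escape(s.strip()) for s in sentences if s.strip()]
    body = pause.join(parts)
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">{body}</voice>'
        "</speak>"
    )


ssml = build_ssml(["Welcome back.", "Today we cover three topics."])
print(ssml)
```

Start with short pauses and adjust by ear; long breaks after every sentence can sound just as unnatural as none.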

Transcription

What audio formats work?

WAV, MP3, OGG, and FLAC. For video files (MP4, MOV), extract the audio first or use Dub Video, which includes transcription.
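
One common way to extract audio from a video is the separate `ffmpeg` command-line tool, which is not part of Speech Studio. The sketch below assumes `ffmpeg` is installed and on your PATH. The function names are hypothetical, and the 16 kHz mono WAV output is an assumption that suits most speech recognition, not a Speech Studio requirement.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_extract_cmd(video, wav=None):
    """Build the ffmpeg command that drops the video stream and writes a WAV."""
    video = Path(video)
    wav = Path(wav) if wav else video.with_suffix(".wav")
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(video),        # input video
        "-vn",                   # no video stream in the output
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", "16000",          # 16 kHz sample rate (assumed, see above)
        "-ac", "1",              # mono
        str(wav),
    ]


def extract_audio(video, wav=None):
    """Run ffmpeg and return the path of the extracted WAV file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    cmd = ffmpeg_extract_cmd(video, wav)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


# Show the command without running it.
print(" ".join(ffmpeg_extract_cmd("talk.mp4")))
```

Call `extract_audio("talk.mp4")` to actually run the conversion, then load the resulting WAV for transcription.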

Why is transcription inaccurate?

Usually the cause is one of: noisy source audio, the wrong language specified, or a heavy accent. Try: (1) cleaner input, (2) picking the specific regional variant (e.g. en-IN instead of en-US), (3) splitting long audio into 10-minute chunks.
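
For step (3), here is a minimal sketch that splits a WAV file into fixed-length chunks using only Python's standard `wave` module. The function name `split_wav` and the `_partNNN` file naming are hypothetical. It handles uncompressed WAV only and cuts at fixed times, so a chunk boundary may land mid-word.

```python
import wave
from pathlib import Path


def split_wav(path, chunk_seconds=600, out_dir=None):
    """Split a WAV file into chunks of at most `chunk_seconds` (default 10 min)."""
    path = Path(path)
    out_dir = Path(out_dir) if out_dir else path.parent
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with wave.open(str(path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of file
            out_path = out_dir / f"{path.stem}_part{index:03d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy channels, sample width, and rate; the frame count
                # in the header is corrected when the file is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            chunks.append(out_path)
            index += 1
    return chunks
```

For example, `split_wav("interview.wav")` would write `interview_part000.wav`, `interview_part001.wav`, and so on next to the source file. For MP3, OGG, or FLAC input, convert to WAV first.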

Dubbing

Why is video dubbing so slow?

Azure Video Translation runs in batch mode, so expect processing to take 2–4× the video length. A 10-minute video takes 20–40 minutes.
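
To plan around longer videos, the same rule of thumb can be written as a tiny calculation. The function name is hypothetical, and the 2× and 4× factors are only the rough range quoted above, not a guarantee.

```python
def dubbing_time_range(video_minutes, low=2.0, high=4.0):
    """Return the (shortest, longest) expected dubbing time in minutes."""
    return video_minutes * low, video_minutes * high


shortest, longest = dubbing_time_range(10)
print(f"Expect roughly {shortest:.0f} to {longest:.0f} minutes")
```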

What language pairs work well?

English ↔ Spanish, German, French, Portuguese, Japanese, and Mandarin work well. Niche pairs (e.g. Welsh → Vietnamese) are weaker, so test on a short clip first.

Licensing

What's the difference between Yearly and Lifetime?

Yearly = renews each year, cancellable. Lifetime = one-time purchase, yours forever (includes all future updates).

Can I use Speech Studio on multiple machines?

One license = one machine at a time. Deactivate on one to activate on another — no hassle.

Couldn't find your answer?

See troubleshooting or email [email protected].