
Supported Languages

Speech Studio supports over 80 languages and regional dialects through Azure Cognitive Services. This page provides an overview of the available languages for both text-to-speech and speech-to-text features.

Text-to-Speech Languages

The following is a selection of the most commonly used languages. Speech Studio includes 603 voices across all supported languages.

| Language | Region Variants | Example Voices |
| --- | --- | --- |
| English | US, UK, Australia, India, Canada, Ireland | JennyNeural, GuyNeural, SoniaNeural |
| Spanish | Spain, Mexico, Argentina, Colombia | ElviraNeural, AlvaroNeural |
| French | France, Canada, Belgium, Switzerland | DeniseNeural, HenriNeural |
| German | Germany, Austria, Switzerland | KatjaNeural, ConradNeural |
| Chinese | Mandarin, Cantonese, Taiwanese | XiaoxiaoNeural, YunxiNeural |
| Japanese | Japan | NanamiNeural, KeitaNeural |
| Korean | Korea | SunHiNeural, InJoonNeural |
| Portuguese | Brazil, Portugal | FranciscaNeural, AntonioNeural |
| Italian | Italy | ElsaNeural, DiegoNeural |
| Arabic | Saudi Arabia, Egypt, UAE | ZariyahNeural, HamedNeural |
| Hindi | India | SwaraNeural, MadhurNeural |
| Russian | Russia | SvetlanaNeural, DmitryNeural |
| Dutch | Netherlands, Belgium | ColetteNeural, MaartenNeural |
| Turkish | Turkey | EmelNeural |
| Polish | Poland | AgnieszkaNeural |
| Swedish | Sweden | SofieNeural |
| Thai | Thailand | PremwadeeNeural |
| Vietnamese | Vietnam | HoaiMyNeural |
| Indonesian | Indonesia | GadisNeural |

Full List

The complete list of supported languages and voices is available within the Speech Studio app under the voice selection panel. Filter by language to see all available voices for a specific language.
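
For readers who work with voice lists outside the app, filtering by language can be sketched as a match on the locale prefix of each voice identifier. This is a minimal sketch, assuming voices are identified in the Azure style `<locale>-<VoiceName>` (for example `en-US-JennyNeural`). The short `VOICES` sample and the `voices_for_language` helper are illustrative names, not part of Speech Studio.

```python
# Illustrative sample only (assumed identifiers); the real voice list is much larger.
VOICES = [
    "en-US-JennyNeural",
    "en-US-GuyNeural",
    "en-GB-SoniaNeural",
    "fr-FR-DeniseNeural",
    "fr-FR-HenriNeural",
    "ja-JP-NanamiNeural",
]


def voices_for_language(voices, language):
    """Return the voices whose locale starts with a language code such as "en"."""
    prefix = language.lower() + "-"
    # Compare case-insensitively so "EN" and "en" behave the same.
    return [voice for voice in voices if voice.lower().startswith(prefix)]


# All English voices in the sample, regardless of region.
english_voices = voices_for_language(VOICES, "en")
print(english_voices)
```

Matching on the language code alone returns every regional variant at once, which mirrors what the language filter in the voice selection panel is described as doing.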

Speech-to-Text Languages

Speech-to-text transcription supports a similar range of languages. For best accuracy, always select the correct language before starting transcription.

Regional Dialects

Many languages have multiple regional variants. For example, English includes US, UK, Australian, and Indian variants, among others. Choosing the correct regional variant helps with:

  • Proper pronunciation of region-specific words
  • Appropriate intonation patterns
  • Accurate speech-to-text recognition

Language Selection

If you are creating content for a specific audience, choose the regional variant that matches your target market. An English (UK) voice sounds noticeably different from an English (US) voice.
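
The selection rule above can be sketched as a lookup from language and target market to a locale tag. This is only a sketch under stated assumptions: the `REGIONAL_VARIANTS` table and the `pick_locale` helper are hypothetical, the locale tags follow the common `language-REGION` convention (such as `en-GB`), and the fallback to the first listed variant is a design choice made for this example rather than documented Speech Studio behavior.

```python
# Hypothetical mapping from language and target market to a locale tag.
# Only a few languages are shown; the tags are assumptions for illustration.
REGIONAL_VARIANTS = {
    "English": {
        "US": "en-US",
        "UK": "en-GB",
        "Australia": "en-AU",
        "India": "en-IN",
        "Canada": "en-CA",
        "Ireland": "en-IE",
    },
    "Portuguese": {"Brazil": "pt-BR", "Portugal": "pt-PT"},
    "Dutch": {"Netherlands": "nl-NL", "Belgium": "nl-BE"},
}


def pick_locale(language, market):
    """Return the locale for a target market, or the first listed variant as a fallback."""
    variants = REGIONAL_VARIANTS.get(language)
    if variants is None:
        raise ValueError(f"No variants listed for language: {language}")
    if market in variants:
        return variants[market]
    # Fallback (assumption): use the first variant listed for the language.
    return next(iter(variants.values()))


print(pick_locale("English", "UK"))
print(pick_locale("Portuguese", "Angola"))
```

Keeping the fallback explicit makes it obvious when content for an unlisted market is using a default variant, so the choice can be reviewed before publishing.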

