Supported Languages
Speech Studio supports over 80 languages and regional dialects through Azure Cognitive Services. This page provides an overview of the available languages for both text-to-speech and speech-to-text features.
Text-to-Speech Languages
Speech Studio includes 603 voices across all supported languages. The table below lists a selection of the most commonly used languages with their region variants and a few example voices; it is not exhaustive.
| Language | Region Variants | Example Voices |
|---|---|---|
| English | US, UK, Australia, India, Canada, Ireland | JennyNeural, GuyNeural, SoniaNeural |
| Spanish | Spain, Mexico, Argentina, Colombia | ElviraNeural, AlvaroNeural |
| French | France, Canada, Belgium, Switzerland | DeniseNeural, HenriNeural |
| German | Germany, Austria, Switzerland | KatjaNeural, ConradNeural |
| Chinese | Mainland China (Mandarin), Hong Kong (Cantonese), Taiwan (Taiwanese Mandarin) | XiaoxiaoNeural, YunxiNeural |
| Japanese | Japan | NanamiNeural, KeitaNeural |
| Korean | Korea | SunHiNeural, InJoonNeural |
| Portuguese | Brazil, Portugal | FranciscaNeural, AntonioNeural |
| Italian | Italy | ElsaNeural, DiegoNeural |
| Arabic | Saudi Arabia, Egypt, UAE | ZariyahNeural, HamedNeural |
| Hindi | India | SwaraNeural, MadhurNeural |
| Russian | Russia | SvetlanaNeural, DmitryNeural |
| Dutch | Netherlands, Belgium | ColetteNeural, MaartenNeural |
| Turkish | Turkey | EmelNeural |
| Polish | Poland | AgnieszkaNeural |
| Swedish | Sweden | SofieNeural |
| Thai | Thailand | PremwadeeNeural |
| Vietnamese | Vietnam | HoaiMyNeural |
| Indonesian | Indonesia | GadisNeural |
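
The Example Voices column lists short voice names. In Azure's neural voice naming, a full identifier usually prefixes the short name with a locale code, for example `en-US-JennyNeural`. The sketch below illustrates that convention only; the helper names `full_voice_name` and `split_voice_name` are hypothetical, and whether Speech Studio shows full identifiers anywhere in its interface is an assumption.

```python
from __future__ import annotations


def full_voice_name(locale: str, short_name: str) -> str:
    """Join a locale code and a short voice name (hypothetical helper)."""
    return f"{locale}-{short_name}"


def split_voice_name(full_name: str) -> tuple[str, str]:
    """Split a full identifier at its last hyphen into (locale, short name)."""
    locale, _, short_name = full_name.rpartition("-")
    return locale, short_name


print(full_voice_name("en-US", "JennyNeural"))   # en-US-JennyNeural
print(split_voice_name("en-GB-SoniaNeural"))     # ('en-GB', 'SoniaNeural')
```
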
Full List
The complete list of supported languages and voices is available within the Speech Studio app under the voice selection panel. Filter by language to see every voice available for it.
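
Filtering by language amounts to matching on the language part of each voice's locale. The following is a minimal sketch of that idea, not how the app implements its filter: the `VOICES` sample and the `voices_for_language` helper are hypothetical, and the identifiers assume Azure-style locale prefixes (such as `en-US-`) on voices named in the table.

```python
# Hypothetical sample catalog; a real list would come from the voice selection panel.
VOICES = [
    "en-US-JennyNeural",
    "en-US-GuyNeural",
    "en-GB-SoniaNeural",
    "fr-FR-DeniseNeural",
    "fr-FR-HenriNeural",
    "ja-JP-NanamiNeural",
]


def voices_for_language(voices, language):
    """Return the voices whose locale starts with the given language code, e.g. 'en'."""
    prefix = language.lower() + "-"
    return [v for v in voices if v.lower().startswith(prefix)]


print(voices_for_language(VOICES, "en"))
# ['en-US-JennyNeural', 'en-US-GuyNeural', 'en-GB-SoniaNeural']
```

Matching on the prefix plus a hyphen keeps a two-letter code from accidentally matching a longer language code that begins with the same letters.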
Speech-to-Text Languages
Speech-to-text transcription supports a similar range of languages. For best accuracy, always select the correct language before starting transcription.
Regional Dialects
Many languages have multiple regional variants. For example, English includes US, UK, Australian, Indian, and other accents. Choosing the correct regional variant helps with:
- Proper pronunciation of region-specific words
- Appropriate intonation patterns
- Accurate speech-to-text recognition
Language Selection
If you are creating content for a specific audience, choose the regional variant that matches your target market. An English (UK) voice sounds noticeably different from an English (US) voice.
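
One way to think about matching a variant to a target market is to prefer an exact language and region match, and fall back to another variant of the same language when that region is not offered. The sketch below assumes standard locale codes such as `en-GB`; the `AVAILABLE_LOCALES` list and the `pick_locale` helper are hypothetical and cover only a few variants from the table.

```python
# Hypothetical subset of locales, based on region variants listed in the table.
AVAILABLE_LOCALES = ["en-US", "en-GB", "en-AU", "en-IN", "es-ES", "es-MX", "fr-FR", "fr-CA"]


def pick_locale(language, region, available=AVAILABLE_LOCALES):
    """Return an exact language-region match if present, otherwise the first
    available locale for that language, otherwise None."""
    language = language.lower()
    wanted = f"{language}-{region.upper()}"
    if wanted in available:
        return wanted
    for locale in available:
        if locale.split("-")[0] == language:
            return locale  # fallback: same language, different region
    return None


print(pick_locale("en", "GB"))  # en-GB
print(pick_locale("es", "AR"))  # es-ES (fallback, since es-AR is not in the sample list)
print(pick_locale("sv", "SE"))  # None
```

A fallback still produces a voice in the right language, but as noted above it may sound noticeably different from what the audience expects, so an exact match is preferable when one exists.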