SSML Support

Speech Synthesis Markup Language (SSML) gives you fine-grained control over how Speech Studio generates audio. With SSML, you can adjust pronunciation, add pauses, change speaking styles, and control emphasis on individual words and phrases.

What Is SSML?

SSML is an XML-based markup language used to control text-to-speech output. Instead of relying solely on the engine's default interpretation of your text, SSML lets you explicitly specify how each word or phrase should be spoken.

Enabling SSML Mode

  1. In the text-to-speech interface, toggle the SSML switch to enable SSML mode.
  2. The text input area now accepts SSML markup instead of plain text.
  3. Write or paste your SSML-formatted content.
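Because SSML mode expects XML, plain text pasted into it must have characters such as `&` and `<` escaped. The following is a minimal sketch of wrapping plain text in an SSML document before pasting it. The helper name `wrap_plain_text` is hypothetical, and the default voice is simply the one used in the example document later on this page.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def wrap_plain_text(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document (hypothetical helper)."""
    body = escape(text)  # replaces &, <, > with &amp;, &lt;, &gt;
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(wrap_plain_text("Terms & conditions apply if the total is < 5."))
```

The printed string can be pasted into the text input area once SSML mode is enabled.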

Common SSML Tags

Pauses

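Insert a pause at any point in the speech. The time attribute takes a duration in milliseconds (for example 500ms) or seconds (for example 2s):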

<break time="500ms"/>

Speaking Rate and Pitch

Adjust speed and pitch for a section of text:

<prosody rate="slow" pitch="+10%">
  This will be spoken slowly at a higher pitch.
</prosody>

Emphasis

Add emphasis to specific words:

<emphasis level="strong">important</emphasis>

Voice Style

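Switch the speaking style mid-speech (supported on select voices). The mstts prefix must be declared on the enclosing speak element, as in the example document below: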

<mstts:express-as style="cheerful">
  Great news! Your order has shipped.
</mstts:express-as>

Pronunciation

Specify how a word should be pronounced with a phonetic transcription, here in the International Phonetic Alphabet (IPA):

<phoneme alphabet="ipa" ph="təˈmeɪtoʊ">tomato</phoneme>

Example SSML Document

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <mstts:express-as style="friendly">
      Welcome to our tutorial!
    </mstts:express-as>
    <break time="300ms"/>
    <prosody rate="-10%">
      Let me walk you through the steps carefully.
    </prosody>
  </voice>
</speak>
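If you generate SSML from a script rather than typing it by hand, an XML library handles namespaces and escaping for you. The sketch below rebuilds the document above with Python's standard `xml.etree.ElementTree`. The function name `build_tutorial_ssml` is hypothetical, and the namespace URIs are copied from the example as written.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize with the same prefixes as the hand-written document.
ET.register_namespace("", SSML_NS)
ET.register_namespace("mstts", MSTTS_NS)


def build_tutorial_ssml() -> str:
    """Build the example SSML document and return it as a string."""
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
    )
    voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": "en-US-JennyNeural"})

    # Styled greeting, followed by a short pause.
    greeting = ET.SubElement(voice, f"{{{MSTTS_NS}}}express-as", {"style": "friendly"})
    greeting.text = "Welcome to our tutorial!"
    ET.SubElement(voice, f"{{{SSML_NS}}}break", {"time": "300ms"})

    # Slightly slower delivery for the instructions.
    steps = ET.SubElement(voice, f"{{{SSML_NS}}}prosody", {"rate": "-10%"})
    steps.text = "Let me walk you through the steps carefully."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_tutorial_ssml())
```

The output is a single line without indentation, which is equivalent XML.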

SSML Editor

Speech Studio validates your SSML before conversion, highlighting any errors so you can correct them before generating audio.
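To catch basic mistakes before pasting, you can run a local well-formedness check. The sketch below is not Speech Studio's validator. It only confirms that the markup parses as XML and that the root element is `speak`, and the function name `check_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def check_well_formed(ssml: str) -> Optional[str]:
    """Return None if the markup parses as XML with a <speak> root, else an error message."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as err:
        line, column = err.position  # location reported by the XML parser
        return f"line {line}, column {column}: {err}"
    # Strip the "{namespace}" part of the tag before comparing.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "speak":
        return f"root element is <{local_name}>, expected <speak>"
    return None


if __name__ == "__main__":
    # The closing </prosody> tag is missing, so this reports a parse error.
    print(check_well_formed('<speak><prosody rate="slow">Hello</speak>'))
```

A snippet that uses `mstts:express-as` without declaring the mstts namespace on `speak` also fails this check, because the prefix is unbound.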

Voice Compatibility

Not every voice supports every SSML feature. Style tags such as mstts:express-as only work with voices that offer multiple speaking styles. See the Voice Styles & Emotions guide for compatible voices.
