Voice Styles & Emotions

Many voices in Speech Studio support multiple speaking styles and emotional tones. This guide explains how to use these styles to make your audio more expressive and engaging.

What Are Voice Styles?

Voice styles change the way a voice delivers text. The same words can sound cheerful, serious, sad, or excited depending on the selected style. Styles are powered by Azure's neural voice technology and go far beyond simple pitch or speed adjustments.

Available Styles

Not every voice supports every style. The table below lists the most common styles, with a description of each and the kinds of content it suits best.

Style             Description                     Best For
cheerful          Upbeat and positive tone        Marketing, announcements
sad               Somber and mournful delivery    Dramatic content, storytelling
angry             Forceful and intense            Character dialog, dramatic reads
excited           High energy and enthusiastic    Promotions, sports
friendly          Warm and approachable           Customer service, tutorials
terrified         Fearful, trembling voice        Storytelling, audiobooks
whispering        Soft, quiet delivery            ASMR, intimate narration
shouting          Loud, projected voice           Announcements, alerts
newscast          Professional news anchor style  News reading, reports
narration         Storytelling delivery           Audiobooks, documentaries
customer-service  Polite and helpful              IVR systems, support bots

How to Apply Styles

Method 1: Style Dropdown

  1. Select a voice that supports styles (indicated by a style icon next to the voice name)
  2. Open the Style dropdown that appears below the voice selector
  3. Choose a style from the list
  4. Generate the audio; the entire text is spoken in that style

Method 2: SSML Tags

For more granular control, use SSML to apply different styles to different parts of your text:

<mstts:express-as style="cheerful">
  Welcome to our show!
</mstts:express-as>
<mstts:express-as style="serious">
  Now let's discuss the important details.
</mstts:express-as>

See SSML Support for the complete SSML reference.
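When a script has many styled passages, the tags can be generated rather than typed by hand. The following is a minimal Python sketch, not part of Speech Studio itself: the helper names `express_as` and `build_styled_ssml` are hypothetical, and it only produces the `mstts:express-as` fragments shown above. Whether a surrounding `<speak>` or `<voice>` element is also required depends on where the SSML is pasted, so none is added here.

```python
from xml.sax.saxutils import escape, quoteattr


def express_as(style: str, text: str) -> str:
    """Wrap one passage in an mstts:express-as element (hypothetical helper)."""
    # escape() protects &, < and > in the spoken text;
    # quoteattr() quotes and escapes the style attribute value.
    return (
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        f"</mstts:express-as>"
    )


def build_styled_ssml(segments: list[tuple[str, str]]) -> str:
    """Join (style, text) pairs into one SSML fragment, one element per line."""
    return "\n".join(express_as(style, text) for style, text in segments)


ssml = build_styled_ssml([
    ("cheerful", "Welcome to our show!"),
    ("serious", "Now let's discuss the important details."),
])
print(ssml)
```

Escaping matters because a stray `&` or `<` in the spoken text would otherwise make the SSML invalid. The sketch does not check that the selected voice supports each style; that still has to be verified per voice.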

Style Intensity

Some voices support adjustable style intensity using the styledegree attribute in SSML. Values range from 0.01 to 2.0, where 1.0 is the default; values below 1.0 soften the style and values above 1.0 strengthen it:

<mstts:express-as style="cheerful" styledegree="2">
  This is extremely cheerful!
</mstts:express-as>
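If intensity values come from user input or a config file, it can help to keep them inside the documented 0.01 to 2.0 range before building the tag. The sketch below is one way to do that in Python. The function name `express_with_degree` and the choice to clamp out-of-range values (rather than reject them) are assumptions made for illustration, not Speech Studio behavior.

```python
from xml.sax.saxutils import escape, quoteattr

# Range stated in this guide for the styledegree attribute.
MIN_DEGREE = 0.01
MAX_DEGREE = 2.0


def express_with_degree(style: str, text: str, degree: float = 1.0) -> str:
    """Build an express-as element, clamping degree into the allowed range."""
    clamped = min(max(degree, MIN_DEGREE), MAX_DEGREE)
    # The :g format drops trailing zeros, so 2.0 is written as "2".
    return (
        f'<mstts:express-as style={quoteattr(style)} styledegree="{clamped:g}">'
        f"{escape(text)}"
        f"</mstts:express-as>"
    )


print(express_with_degree("cheerful", "This is extremely cheerful!", 2))
```

Clamping is a lenient choice; raising an error on out-of-range input would be equally reasonable if silent correction is undesirable.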

Finding Styled Voices

In the voice selection panel, use the Has Styles filter to show only voices that support multiple speaking styles. The most expressive voices are typically the newer neural voices for major languages like English, Chinese, and Japanese.
