Elevate Your Content with Text-to-Speech Live Avatars


In this article, you will learn how to implement a Text-to-Speech Live Avatar and use it to create more engaging videos with Azure Speech Services. The article walks through the implementation step by step and covers the elements that make up such a video, including the talking avatar video, background images or videos, and background music.

Text-to-Speech Live Avatar

A text-to-speech avatar transforms written text into a digital video featuring a lifelike human figure (either a pre-designed avatar or a personalized one) speaking with a convincingly natural voice. This avatar-generated video can be produced either asynchronously or in real time.

The text-to-speech avatar feature is only available in the following service regions: West US 2, West Europe, and Southeast Asia.
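
Because the feature is limited to a few regions, it can help to check the region before creating a resource. The sketch below is only an illustration: the helper names are hypothetical, the region list is copied from this article, and the short names (such as westeurope) are assumed to be the usual Azure region identifiers. The supported list may change, so verify it against the current Azure documentation.

```python
# Regions copied from this article; the supported list may change over time.
AVATAR_REGIONS = {
    "westus2": "West US 2",
    "westeurope": "West Europe",
    "southeastasia": "Southeast Asia",
}


def normalize_region(name: str) -> str:
    """Turn a display name such as 'West US 2' into the short form 'westus2'."""
    return "".join(name.split()).lower()


def supports_avatar(region: str) -> bool:
    """Return True if the region is in the list of avatar regions above."""
    return normalize_region(region) in AVATAR_REGIONS


if __name__ == "__main__":
    for candidate in ("West Europe", "eastus"):
        print(candidate, supports_avatar(candidate))
```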

Capabilities of text-to-speech Live Avatar

  • Transformation of text into a digital video depicting a lifelike human speaker with natural-sounding voices, powered by Azure AI text-to-speech.
  • Provision of a variety of preconstructed avatars.
  • Generation of the avatar's voice via Azure AI text-to-speech. Further details can be found in the Avatar voice and language section.
  • Synthesis of text-to-speech avatar videos either asynchronously through the batch synthesis API or in real-time.
  • Provision of a content creation tool within Speech Studio, facilitating the development of video content without the need for coding.
  • Facilitation of real-time avatar conversations through the live chat avatar tool available in Speech Studio.
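
For readers who prefer code over Speech Studio, the asynchronous path goes through the batch synthesis API. The following is a minimal sketch of how a request body for such a job might be assembled. The field names (inputKind, synthesisConfig, avatarConfig, talkingAvatarCharacter, talkingAvatarStyle) and the sample values (lisa, graceful-sitting, en-US-JennyNeural) are assumptions based on Microsoft's published samples as I recall them, and build_avatar_job is a hypothetical helper. Check the current API reference before sending anything. No network call is made here.

```python
import json
import uuid


def build_avatar_job(text, voice="en-US-JennyNeural",
                     character="lisa", style="graceful-sitting"):
    """Return (job_id, body) for a batch avatar synthesis request.

    Field names and default values are assumptions; verify them against
    the current batch synthesis API reference.
    """
    job_id = str(uuid.uuid4())  # client-chosen identifier for the job
    body = {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        "synthesisConfig": {"voice": voice},
        "avatarConfig": {
            "talkingAvatarCharacter": character,
            "talkingAvatarStyle": style,
        },
    }
    return job_id, body


if __name__ == "__main__":
    job_id, body = build_avatar_job("Welcome to our store!")
    print(job_id)
    print(json.dumps(body, indent=2))
```

The body would then be sent, together with your Speech resource key, to the batch synthesis endpoint of your region, and the job polled until the video is ready.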

Elements of text-to-speech live avatar

Live Avatar videos are typically composed of several elements, including:

  1. Talking avatar video
  2. Background images or videos
  3. Background music and other elements that make the video more engaging
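
Speech Studio combines these elements for you. If you want to assemble them yourself, one option outside the article's workflow is the ffmpeg command-line tool. The sketch below only builds the command and does not run it. It assumes that the avatar clip was exported with a transparent background and contains the speech audio, that the background is a video rather than a still image, and that ffmpeg is installed. The file names and the build_compose_command helper are hypothetical.

```python
import shlex


def build_compose_command(avatar, background, music, output, music_volume=0.2):
    """Build an ffmpeg command that layers the avatar over a background
    video and mixes quiet background music under the avatar's speech."""
    filter_graph = (
        # Place the avatar (input 1) centered at the bottom of the background (input 0).
        "[0:v][1:v]overlay=(W-w)/2:H-h[v];"
        # Lower the music (input 2) so it does not drown out the speech.
        f"[2:a]volume={music_volume}[bg];"
        # Mix speech and music; the mix ends when the speech ends.
        "[1:a][bg]amix=inputs=2:duration=first[a]"
    )
    return [
        "ffmpeg", "-y",
        "-i", background,
        "-i", avatar,
        "-i", music,
        "-filter_complex", filter_graph,
        "-map", "[v]", "-map", "[a]",
        "-shortest",
        output,
    ]


if __name__ == "__main__":
    cmd = build_compose_command("avatar.webm", "background.mp4",
                                "music.mp3", "final.mp4")
    print(shlex.join(cmd))
```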

Steps to create engaging videos with Text-to-speech live avatar

Step 1. Go to the Azure portal and sign in with your Azure account.

Step 2. Search for Speech services in the search bar and select Speech services from the search results.

Step 3. Click the Create Speech Service button.

Step 4. In the Basics tab, provide the following information. First, choose the Subscription.

Step 5. Create a Resource Group named testRG.

Step 6. Choose West Europe as the Region (one of the supported regions listed earlier) and enter retailttsavatar as the name.

Step 7. Select Standard S0 as the Pricing tier.

Step 8. Click the Next button on the speech service page.

Step 9. Click the Review + Create button.

Step 10. Once validation passes, click the Create button.

Step 11. The deployment starts initializing and completes successfully within a minute or two.
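
The portal steps so far can also be scripted. As a rough sketch, the snippet below builds the equivalent Azure CLI commands (az group create and az cognitiveservices account create) using the names from this walkthrough. It assumes the Azure CLI is installed and signed in, and that the region is West Europe, one of the regions where the avatar feature is available. The helper name build_provision_commands is hypothetical, and the commands are only printed, not executed.

```python
import shlex


def build_provision_commands(name="retailttsavatar", resource_group="testRG",
                             location="westeurope", sku="S0"):
    """Return the two Azure CLI commands that mirror the portal steps."""
    create_group = [
        "az", "group", "create",
        "--name", resource_group,
        "--location", location,
    ]
    create_speech = [
        "az", "cognitiveservices", "account", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--kind", "SpeechServices",  # Speech resource kind
        "--sku", sku,                # Standard S0 pricing tier
        "--location", location,
        "--yes",                     # skip the terms confirmation prompt
    ]
    return [create_group, create_speech]


if __name__ == "__main__":
    for command in build_provision_commands():
        print(shlex.join(command))
```

To actually run them, each list could be passed to subprocess.run(command, check=True).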


Step 12. Click the Go to Speech Studio button.

Step 13. In Speech Studio, click the Text to Speech Avatar (preview) option.

Step 14. Choose the Avatar type and background from the menu listed on the right side of the window.


Step 15. Add your text in the content section of the Text to Speech Avatar tool, then choose the Language and Voice type.

Step 16. You can explore further options such as Insert break, Insert gesture, and speaking speed from the menu above.
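
Breaks and speaking speed map naturally onto SSML, the markup that Azure text-to-speech accepts. The sketch below shows roughly what the text might look like in SSML: the break and prosody elements are standard SSML, while the bookmark with a gesture. prefix is how I recall the Azure documentation describing avatar gestures, so treat it as an assumption. The gesture name wave-left-1 and the build_ssml helper are placeholders for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(sentences, voice="en-US-JennyNeural", lang="en-US",
               rate="+10%", pause_ms=500, gesture=None):
    """Join sentences with pauses, set the speaking rate, and optionally
    start with a gesture bookmark (assumed convention, see above)."""
    break_tag = f'<break time="{pause_ms}ms"/>'
    body = break_tag.join(escape(s) for s in sentences)
    if gesture:
        body = f'<bookmark mark="gesture.{gesture}"/>' + body
    ssml = (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">'
        f'<prosody rate="{rate}">{body}</prosody>'
        '</voice></speak>'
    )
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    return ssml


if __name__ == "__main__":
    print(build_ssml(["Hello and welcome.", "Let's get started."],
                     gesture="wave-left-1"))
```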

Step 17. Click the Preview video button.

Step 18. Finally, the Text to Speech Live Avatar video is ready to use.

Use cases of Text-to-speech live avatars

  1. Education: Creating dynamic and interactive learning materials that cater to diverse audiences.
  2. Organizational Communication: Enhancing internal communication channels for disseminating job opportunities, well-being initiatives, training updates, HR notices, and more.
  3. Broadcasting: Utilizing live avatars for engaging storytelling narration and simulated interviews.
  4. Branding: Strengthening marketing campaigns and effectively introducing products to the market by aligning with business strategies.


In this article, we learned about the Text to Speech Live Avatar and created one successfully using Speech Studio. We also covered its capabilities and real-world use cases.

I hope you enjoyed reading this article!

Happy Learning and see you soon in another interesting article!
