Quickstart: Convert Text to Speech Using Azure AI Speech Service


Hey developers!

If you’ve ever wanted your application to talk back or just needed to convert some plain text into natural-sounding audio, you're in the right place. In this blog post, I’ll walk you through a simple and quick way to convert text to speech using Microsoft Azure's powerful Speech service. Don’t worry—this is going to be super easy, and we’ll be using C# for the coding part.

Let’s get started!

What are we building?

By the end of this quickstart, you’ll have a C# console application that sends text to Azure's Speech service and gets back audio that plays out loud. Cool, right?

Prerequisites

Before we get into the code, make sure you’ve got the following ready.

An Azure subscription: if you don't have one, no worries. You can create one for free here: https://azure.microsoft.com/en-us/free
An Azure AI Services (Speech) resource. To create one:
- Head over to the Azure portal.
- Create a new resource of type Speech (you'll find it under AI Services).
- Click Create Speech Service and wait while the deployment is in progress.
- Once deployed, click on Go to Resource.
- Open your Speech resource (mine is named AzureSpeechServiceByGowtham).
- Note down the Key and the Location/Region shown in the Keys and Endpoint section. We'll use these in our code.

Setting Up the Project

Let’s now create a .NET Core console app and install the required NuGet package.

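Step 1. Create a Console App. From a terminal, run dotnet new console -n TextToSpeechDemo and then cd TextToSpeechDemo (the project name is just an example; pick any name you like).

Step 2. Install the Speech SDK. Run dotnet add package Microsoft.CognitiveServices.Speech to add the NuGet package that the code below depends on.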

Writing the Code

Open the Program.cs file and replace the content with the following.

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static async Task Main(string[] args)
    {
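        // Replace these with the key and region of your own Speech resource.
        string speechKey = "YOUR_SPEECH_KEY";
        string region = "YOUR_REGION"; // for example, "eastus"

        // With no audio config supplied, the synthesizer plays through the default speaker.
        var config = SpeechConfig.FromSubscription(speechKey, region);
        using var synthesizer = new SpeechSynthesizer(config);

        Console.WriteLine("Enter text to convert to speech:");
        var input = Console.ReadLine();
        if (string.IsNullOrWhiteSpace(input))
        {
            Console.WriteLine("No text entered, nothing to synthesize.");
            return;
        }

        // Send the text to the Speech service and wait for synthesis to finish.
        var result = await synthesizer.SpeakTextAsync(input);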

        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
        {
            Console.WriteLine("✅ Speech synthesis completed successfully.");
        }
        else if (result.Reason == ResultReason.Canceled)
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            Console.WriteLine($"❌ Canceled: {cancellation.Reason}");
            Console.WriteLine($"Error Code: {cancellation.ErrorCode}");
            Console.WriteLine($"Details: {cancellation.ErrorDetails}");
        }
    }
}

Don't forget to replace the speechKey and region values in the code with the key and region of your own Speech resource.
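Hardcoding the key is fine for a quick local test, but you probably don't want it ending up in source control. One option is to read both values from environment variables instead. Here's a minimal sketch of that idea; the SpeechSettings class and the SPEECH_KEY and SPEECH_REGION variable names are my own picks for illustration, not something the Speech SDK requires.

```csharp
using System;

static class SpeechSettings
{
    // Hypothetical variable names; use whatever names you set on your machine.
    public const string KeyVariable = "SPEECH_KEY";
    public const string RegionVariable = "SPEECH_REGION";

    // Returns the value of the named environment variable,
    // or throws a descriptive error if it is missing or blank.
    public static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{name}' is not set.");
        }
        return value;
    }
}
```

With this class added to the project, the two assignments in Main could become string speechKey = SpeechSettings.Require(SpeechSettings.KeyVariable); and string region = SpeechSettings.Require(SpeechSettings.RegionVariable);. Set the variables in your shell before you run the app.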


Run and Test

Run dotnet run from the project folder, type in some text like "Hi Gowtham!", and press Enter.

Your computer should speak the text out loud!

If you hear the voice, congrats—you’ve just added speech capability to your app!

Conclusion

Text-to-speech is one of those features that instantly adds a wow factor to your apps, and with the Azure AI Speech service, it’s surprisingly simple to implement. Whether it’s for accessibility, user engagement, or just having fun with your apps, speech synthesis opens up a lot of possibilities.

Give it a try, and let me know how it goes. If you get stuck somewhere or want to explore more advanced scenarios, drop a comment below. I’d be happy to help!