What is Azure Text-to-Speech API?
Azure Text-to-Speech is a cloud-based Microsoft Azure service that lets developers convert text into synthesized speech. Its API lets you integrate natural-sounding voice synthesis into your applications, which is useful for scenarios such as accessibility features and voice assistants.
Getting Started
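To use the Azure Text-to-Speech API, you'll need an Azure account and an API key (also called a subscription key) for a Speech resource. The basic steps are:
- Sign in to the Azure portal.
- Create a Speech resource, which is the resource type that provides Text-to-Speech.
- Copy the resource's API key and its region or endpoint from the Keys and Endpoint page.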
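Once you have the key and region, it helps to keep them out of source code. The following is a minimal sketch, assuming the key and region are stored in environment variables; the names SPEECH_KEY and SPEECH_REGION and both helper functions are hypothetical choices for illustration, and the URL pattern is the regional REST synthesis endpoint.

```python
import os


def build_tts_endpoint(region: str) -> str:
    """Return the regional REST synthesis URL for a Speech resource."""
    return f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"


def load_speech_config(env=os.environ):
    """Read the key and region from the environment.

    SPEECH_KEY and SPEECH_REGION are assumed variable names, not
    something the service itself requires.
    """
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise RuntimeError("Set SPEECH_KEY and SPEECH_REGION first.")
    return key, build_tts_endpoint(region)
```

Calling load_speech_config() returns a (key, endpoint) pair that can stand in for the hard-coded placeholders in the sample code.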
Sample Code
Here's a simple example of how to use the Azure Text-to-Speech API in Python:
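import requests
from xml.sax.saxutils import escape

subscription_key = 'YOUR_SUBSCRIPTION_KEY'
# Regional endpoint, e.g. https://<region>.tts.speech.microsoft.com/cognitiveservices/v1
endpoint = 'YOUR_ENDPOINT'
text_to_speak = 'Hello, this is a sample text to be synthesized.'

headers = {
    'Content-Type': 'application/ssml+xml',
    'X-Microsoft-OutputFormat': 'audio-16khz-128kbitrate-mono-mp3',
    # The resource key is sent in this header. 'Authorization: Bearer' is
    # only for a short-lived access token, not for the key itself.
    'Ocp-Apim-Subscription-Key': subscription_key,
    'User-Agent': 'tts-sample',
}

# Escape the text so characters such as & or < cannot break the SSML.
# en-US-GuyNeural replaces the older standard voice en-US-Guy24kRUS, which may no longer be available.
data = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
    f'<voice name="en-US-GuyNeural">{escape(text_to_speak)}</voice></speak>'
)

response = requests.post(endpoint, headers=headers, data=data.encode('utf-8'))
if response.status_code == 200:
    # The response body is the raw audio in the requested output format.
    with open('output.mp3', 'wb') as audio_file:
        audio_file.write(response.content)
    print('Audio file created.')
else:
    print('Error:', response.status_code, response.text)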
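The request body in the sample is an SSML document, so any text placed inside it has to be valid XML. As a rough sketch of how that step could be factored out, the hypothetical helper build_ssml below escapes the text and quotes the attributes using only the standard library. The default voice name is an assumption; substitute any voice your resource supports.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-GuyNeural", lang="en-US"):
    """Wrap plain text in a minimal SSML document.

    escape() handles &, < and > in the text; quoteattr() produces a
    safely quoted attribute value for the voice and language.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    ssml = build_ssml("Tom & Jerry <3")
    ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed
    print(ssml)
```

The result of build_ssml(text_to_speak) can be passed as the request body in place of a hand-built string, which avoids malformed requests when the input text contains markup characters.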
Conclusion
The Azure Text-to-Speech API offers powerful voice synthesis capabilities that can enhance your applications and services. With a few simple steps, you can integrate natural-sounding speech into your projects.