Amazon Polly
What is it
A Text-to-Speech (TTS) service that converts text into realistic speech.
What is it for
Building applications that speak and generating audio content for a variety of use cases.
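A minimal sketch of what "an application that speaks" can look like, assuming the boto3 AWS SDK for Python is installed and AWS credentials and a region are configured. The helper name `synthesize_to_file`, the output path, and the choice of the `Joanna` voice are illustrative assumptions, not part of these notes.

```python
from pathlib import Path


def synthesize_to_file(text, out_path, voice_id="Joanna", client=None):
    """Convert plain text to an MP3 file and return the number of bytes written.

    `client` is optional so a stub can be passed in tests (hypothetical helper).
    """
    if client is None:
        # Imported lazily; boto3 is a third-party package and needs credentials.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # The synthesized audio comes back as a stream in the response.
    audio = response["AudioStream"].read()
    Path(out_path).write_bytes(audio)
    return len(audio)
```

Calling `synthesize_to_file("Hello from Polly", "hello.mp3")` would, under those assumptions, write a playable MP3 next to the script.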
Use cases
- Interactive voice applications (e.g., virtual assistants, IVRs)
- Creation of audio content for e-learning, audiobooks, and podcasts
- Video and presentation narration
- Applications for visually impaired people
- Games and entertainment applications
Key points
- Realistic speech: Uses deep learning technologies to produce human-like voices
- Multiple voices and languages: Supports dozens of voices in various languages
- SSML (Speech Synthesis Markup Language): Allows control over speech aspects such as volume, pitch, speed, and emphasis
- Lexicons: Allows customization of specific word pronunciations
- Audio streaming: Converts text into an audio stream in real time
- Pay-per-use: You pay per character converted to speech
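The SSML and lexicon points above can be sketched as follows. This is an illustration under stated assumptions: `build_ssml` and `ssml_request` are hypothetical helpers, the lexicon name is made up, and support for individual SSML tags (for example `<emphasis>` or pitch control) depends on the voice engine, so the Polly SSML tag reference should be checked before relying on them.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(sentence, emphasized_word=None, rate="medium", volume="medium"):
    """Wrap a sentence in SSML that sets speaking rate and volume.

    Optionally marks the first occurrence of `emphasized_word` with strong
    emphasis (a naive substring substitution, good enough for a sketch).
    """
    body = escape(sentence)  # escape &, <, > so the result stays valid XML
    if emphasized_word:
        word = escape(emphasized_word)
        marked = f'<emphasis level="strong">{word}</emphasis>'
        body = body.replace(word, marked, 1)
    return (
        f"<speak><prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
        f"{body}</prosody></speak>"
    )


def ssml_request(ssml, voice_id="Joanna", lexicon_names=()):
    """Build keyword arguments for boto3's Polly `synthesize_speech` call."""
    params = {
        "Text": ssml,
        "TextType": "ssml",  # tells Polly to interpret the text as SSML
        "OutputFormat": "mp3",
        "VoiceId": voice_id,
    }
    if lexicon_names:
        # Lexicons must already be uploaded (e.g. via `put_lexicon`).
        params["LexiconNames"] = list(lexicon_names)
    return params
```

With a boto3 Polly client in hand, the result would be passed as `client.synthesize_speech(**ssml_request(build_ssml("Start now", "now", rate="slow")))`.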
Comparison
- Amazon Polly: Offers a scalable and cost-effective solution for generating speech, without the need to hire voice actors or manage recording studios. Allows for quick and consistent audio content updates.
- Human voice recording: Can offer more natural voice quality and emotional nuances, but is more expensive, time-consuming, and less flexible for content updates.
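Because billing is per character, the cost side of this comparison is easy to estimate. The sketch below is a rough calculator only: `estimate_cost` is a hypothetical helper, `len()` is used as an approximation of billed characters, and the rate must be supplied by the caller from the current Polly pricing page, since it varies by voice type and changes over time.

```python
def estimate_cost(texts, usd_per_million_chars):
    """Return (total_characters, estimated_cost_in_usd) for a batch of texts.

    `usd_per_million_chars` is an assumption supplied by the caller; this
    function does not know Amazon's actual prices.
    """
    total_chars = sum(len(t) for t in texts)
    cost = total_chars * usd_per_million_chars / 1_000_000
    return total_chars, cost


# Example with a placeholder rate of 4.00 USD per million characters.
chars, cost = estimate_cost(["Welcome to the course."] * 1000, 4.00)
print(f"{chars} characters, about ${cost:.2f}")
```

The same function can be rerun whenever the script changes, which is the "quick and consistent updates" advantage expressed in numbers.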