Amazon Polly

What is it

A Text-to-Speech (TTS) service that converts text into realistic speech.

It is used to build applications that speak and to create audio content for a variety of use cases.
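
As a rough illustration of what "converting text into speech" looks like from code, here is a minimal Python sketch using the boto3 SDK's `synthesize_speech` call. The helper name `synthesize_to_file`, the output path, and the choice of the "Joanna" voice are assumptions made for this example; it also assumes boto3 is installed and AWS credentials and a region are already configured.

```python
from typing import Any, Optional


def synthesize_to_file(
    text: str,
    path: str,
    voice_id: str = "Joanna",  # assumed voice; any voice available in your region works
    client: Optional[Any] = None,
) -> int:
    """Send text to Polly, write the returned MP3 bytes to `path`,
    and return the number of bytes written."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # The response carries the audio as a stream-like object.
    audio = response["AudioStream"].read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)
```

Calling `synthesize_to_file("Hello from Polly", "hello.mp3")` would, under those assumptions, leave an MP3 file on disk. Passing the client in as a parameter is a design choice for this sketch so the function can be tried without calling AWS.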

Realistic speech: Uses deep learning technologies to produce human-like voices
Multiple voices and languages: Supports dozens of voices in various languages
SSML (Speech Synthesis Markup Language): Allows control over speech aspects such as volume, pitch, speed, and emphasis
Lexicons: Allows customization of specific word pronunciations
Audio streaming: Converts text into an audio stream in real time
Pay-per-use: You pay per character converted to speech
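
To make the SSML feature above more concrete, here is a small Python sketch that wraps plain text in a `<prosody>` element to control speaking rate and volume. The helper name `build_ssml` and its defaults are assumptions for illustration; only the standard library is used, and the input text is XML-escaped so characters such as `&` do not break the markup.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", volume: str = "medium") -> str:
    """Wrap `text` in SSML that sets speaking rate and volume.

    `rate` and `volume` are passed through as prosody attribute values
    (for example "slow" or "loud"); this sketch does not validate them.
    """
    return (
        "<speak>"
        f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>"
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )


ssml = build_ssml("Fish & chips", rate="slow", volume="loud")
print(ssml)
# <speak><prosody rate="slow" volume="loud">Fish &amp; chips</prosody></speak>
```

The resulting string would be sent to Polly as the `Text` argument of `synthesize_speech` together with `TextType="ssml"`, so that the service interprets the tags instead of reading them aloud.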

Amazon Polly: Offers a scalable and cost-effective solution for generating speech, without the need to hire voice actors or manage recording studios. Allows for quick and consistent audio content updates.
Human voice recording: Can offer more natural voice quality and emotional nuances, but is more expensive, time-consuming, and less flexible for content updates.