https://play.ht/news/introducing-play-3-0-mini/
October 11, 2024

Introducing Play 3.0 mini - A lightweight, reliable and cost-efficient Multilingual Text-to-Speech model

Today we're releasing our most capable and conversational voice model that can speak in 30+ languages using any voice or accent, with industry-leading speed and accuracy. We're also releasing 50+ new conversational AI voices across languages.

Our mission is to make voice AI accessible, personal and capable for all. Part of that mission is to advance the current state of interactive voice technology in conversational AI and elevate user experience.

When you're building real-time applications using TTS, a few things really matter: latency, reliability, quality and naturalness of speech. While we've been leading on latency and naturalness of speech with our previous generation models, Play 3.0 mini makes significant improvements to reliability and audio quality while still being the fastest and most conversational voice model.

Play 3.0 mini is the first in a series of efficient multilingual AI text-to-speech models we plan to release over the coming months. Our goal is to make the models smaller and more cost-efficient so they can be run on devices and at scale.

Play 3.0 mini is our fastest, most conversational speech model yet

3.0 mini achieves a mean time to first byte (TTFB) of 189 milliseconds, making it our fastest AI Text to Speech model. It supports text-in streaming from LLMs and audio-out streaming, and can be used via our HTTP REST API, websockets API or SDKs.
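To make the TTFB figure concrete, here is a minimal Python sketch of how time to first byte could be measured on any streamed audio response. It is not tied to the Play API: `measure_ttfb` and `fake_audio_stream` are hypothetical names introduced for illustration, and the fake stream simply stands in for whatever iterator of audio chunks your HTTP or websocket client returns. The sketch assumes the iterator issues its request lazily, so the clock starts before any network work happens.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_ttfb(chunks: Iterable[bytes],
                 clock: Callable[[], float] = time.perf_counter) -> Tuple[float, int]:
    """Return (seconds until the first non-empty chunk, total bytes received)."""
    start = clock()
    ttfb = None
    total = 0
    for chunk in chunks:
        # Record the elapsed time only once, at the first chunk carrying data.
        if chunk and ttfb is None:
            ttfb = clock() - start
        total += len(chunk)
    if ttfb is None:
        raise ValueError("stream ended without any audio bytes")
    return ttfb, total


def fake_audio_stream(delay_s: float = 0.05, n_chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (assumption, not a real API)."""
    time.sleep(delay_s)  # simulated network and model latency before the first chunk
    for _ in range(n_chunks):
        yield b"\x00" * 1024


if __name__ == "__main__":
    ttfb, total = measure_ttfb(fake_audio_stream())
    print(f"TTFB: {ttfb * 1000:.0f} ms, received {total} bytes")
```

A mean TTFB would be obtained by repeating the measurement over many requests and averaging. Results measured from a client will also include network round-trip time, so they are not directly comparable to a server-side figure.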
3.0 mini is also more efficient than Play 2.0, and runs inference 28% faster.

Play 3.0 mini supports 30+ languages across any voice

Play 3.0 mini now supports more than 30 languages, many with multiple male and female voice options out of the box. Our English, Japanese, Hindi, Arabic, Spanish, Italian, German, French, and Portuguese voices are available now for production use cases through our API and on our playground. Additionally, Afrikaans, Bulgarian, Croatian, Czech, Hebrew, Hungarian, Indonesian, Malay, Mandarin, Polish, Serbian, Swedish, Tagalog, Thai, Turkish, Ukrainian, Urdu, and Xhosa are available for testing.

Play 3.0 mini is more accurate

Our goal with Play 3.0 mini was to build the best TTS model for conversational AI. To achieve this, the model had to outperform competitor models in latency and accuracy while generating speech in the most conversational tone.

LLMs hallucinate, and voice LLMs are no different. Hallucinations in voice LLMs can take the form of extra or missing words or numbers in the output audio that do not match the input text. Sometimes they are just random sounds in the audio. This makes it difficult to use generative voice models reliably.

Here are some challenging text prompts that most TTS models struggle to get right:

"Okay, so your flight UA2390 from San Francisco to Las Vegas on November 3rd is confirmed. And, your ticket number is F X 2, 3 9 A, 7 R T. The flight is scheduled to depart at 2:45 p.m. Is there anything else I can assist you with?"

"Now, when people RSVP, they can call the event coordinator at 555 342 1234, but if they need more details, they can also call the backup number, which is 416 789 0123."

"I've successfully processed your order and I'd like to confirm your product ID. It is A as in Alpha, 1, 2, 3, B as in Bravo, 5, 6, 7, Z as in Zulu, 8, 9, 0, X as in X-ray."
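One way to check for the extra or missing words described above is to transcribe the generated audio and compare the transcript with the input text. The Python sketch below shows only the comparison step, using the standard library's `difflib`. It assumes you already have a transcript from a speech recognizer of your choice; `tokenize` and `diff_words` are hypothetical helper names, not part of any Play SDK.

```python
import difflib
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into runs of letters/digits, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def diff_words(input_text: str, transcript: str) -> Dict[str, List[str]]:
    """List tokens missing from, or extra in, the transcript relative to the input."""
    expected = tokenize(input_text)
    heard = tokenize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    missing: List[str] = []
    extra: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.extend(expected[i1:i2])  # in the input, not in the audio
        if tag in ("insert", "replace"):
            extra.extend(heard[j1:j2])       # in the audio, not in the input
    return {"missing": missing, "extra": extra}


if __name__ == "__main__":
    prompt = "they can call the event coordinator at 555 342 1234"
    transcript = "they can call the event coordinator at 555 342 1244 okay"
    print(diff_words(prompt, transcript))
```

In practice a recognizer may write numbers and times differently from the prompt (for example, spelling digits out as words), so both sides would need the same text normalization before a comparison like this is meaningful.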
3.0 mini was fine-tuned specifically on a diverse dataset of alphanumeric phrases to make it reliable for critical use cases where important information such as phone numbers, passport numbers, dates, and currencies can't be misread.

Play 3.0 mini reads alphanumeric sequences more naturally

We've trained the model to read numbers and acronyms just like humans do. The model adjusts its pace and slows down for alphanumeric characters. Phone numbers, for instance, are read out with more natural pacing, as are acronyms and abbreviations. This makes the overall conversational experience more natural.

"Alright, let's troubleshoot your laptop issue. First, let's confirm your device's ID so we're on the same page. The I D is 894-d94-774-496-438-9b0-d2. Did I get that right?"

Play 3.0 mini achieves the best voice similarity for voice cloning

When cloning voices, close often isn't good enough. Play 3.0 voice cloning achieves state-of-the-art performance, ensuring accurate reproduction of the accent, tone, and inflection of cloned voices. In benchmarking using a popular open source embedding model, we lead competitor models by a wide margin for similarity to the original voice. Try it for yourself by cloning your own voice and talking to yourself on https://play.ai

Websockets API Support

3.0 mini's API now supports websockets, which significantly reduces the overhead of opening and closing HTTP connections, and makes it easier than ever to enable text-in streaming from LLMs or other sources.

Play 3.0 mini is a cost-efficient model

We're happy to announce reduced pricing for our higher-volume Startup and Growth tiers, and have now introduced a new Pro tier at $49 a month for businesses with more modest requirements. Check out our new pricing table here.

We look forward to seeing what you build with us! If you have custom, high-volume requirements, feel free to contact our sales team.
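For readers wiring up the text-in streaming mentioned in the websockets section, one common pattern is to buffer LLM tokens and forward text to the TTS connection one sentence at a time. The Python sketch below illustrates only that buffering step. `sentences_from_tokens` is a hypothetical helper, and the actual websocket message format is left out because this post does not describe it.

```python
import re
from typing import Iterable, Iterator

# A sentence boundary is ., ! or ? followed by whitespace (a deliberately naive rule).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Group a stream of text fragments into complete sentences."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence; the last may still grow.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever remains once the token stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_tokens = ["Your flight ", "is confirmed", ". Anything ", "else", "?"]
    for sentence in sentences_from_tokens(llm_tokens):
        # In a real client, each sentence would be sent over the open websocket here.
        print(sentence)
```

The naive boundary rule will also split after abbreviations such as "p.m." when they appear mid-sentence, so a production client would likely need a more careful segmenter.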