ElevenLabs is launching its own speech-to-text model – TechCrunch

ElevenLabs, an AI startup that just raised a $180 million mega-funding round, has been primarily known for its audio-generation prowess. The company has now taken a step in another technological direction by launching Scribe, its first stand-alone speech-to-text model.

The startup, valued at $3.3 billion, has helped many other companies provide text-to-speech services through its vast library of voices. However, the company is now looking to get into speech detection and compete with the likes of Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper models.

ElevenLabs’ Scribe model supports over 99 languages at launch. The company places more than 25 of them in its “excellent” accuracy category, where the word error rate is below 5%. This list includes English (claimed accuracy rate of 97%), French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. The remaining languages fall into the high (5% to 10% word error rate), good (10% to 20%), and moderate (25% to 50%) categories.

The company said that the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 across multiple languages in the FLEURS and Common Voice benchmark tests.

ElevenLabs had already developed a speech-to-text component for its AI conversational agent platform, which was released last year. However, this is the first time the company is releasing a stand-alone speech detection model. In a conversation with TechCrunch last month, CEO Mati Staniszewski talked about improving speech detection models.

“We want to understand what’s being said by you in a conversation better. We are working on ways to move away from only generating content and understanding and transcribing speech,” Staniszewski said at that time. “Many people say that speech-to-text is a solved problem. But for many languages, it is pretty bad. We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback.”

The model also offers smart speaker diarization to tell you who is speaking, word-level timestamps for accurate subtitles, and auto-tagging of sound events such as audience laughter. The startup is also giving customers a way to transcribe video content directly in its studio to add subtitles or captions.

Scribe currently works only with pre-recorded audio, which means it is not yet suited to meeting transcription or voice note-taking. The company said it will release a low-latency, real-time version of the model soon.

ElevenLabs is pricing Scribe at $0.40 per hour of transcribed audio. While the rate is competitive, some rivals currently offer lower prices for audio transcription, with some differences in features.
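For readers unfamiliar with the metric behind the accuracy categories mentioned earlier, here is a minimal Python sketch of how word error rate is commonly computed: the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. It is an illustration only, not ElevenLabs' evaluation code. The function names, the sample sentences, and the handling of boundary values are assumptions, and the article does not say which category covers rates between 20% and 25%, so the sketch leaves that range unlabeled.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def accuracy_tier(wer: float) -> str:
    """Map a word error rate to the categories quoted in the article.

    Boundary handling is an assumption; the article gives ranges only.
    """
    if wer < 0.05:
        return "excellent"
    if wer <= 0.10:
        return "high"
    if wer <= 0.20:
        return "good"
    if 0.25 <= wer <= 0.50:
        return "moderate"
    return "unspecified"  # ranges the article does not define


# Hypothetical example: one substituted word out of six.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{wer:.1%} -> {accuracy_tier(wer)}")
```

Production benchmarks typically normalize case and punctuation before scoring, which this sketch does not attempt.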
Source: https://techcrunch.com/2025/02/26/elevenlabs-is-launching-its-own-speech-to-text-model/