Relatively light at just 2 billion parameters, the model is designed to run on consumer-grade GPUs for those who want to self-host it. It currently supports 14 languages.
Cohere's open-weight ASR model Transcribe tops the Hugging Face leaderboard with a 5.42% word error rate, outperforming Whisper Large v3 and ElevenLabs Scribe v2, and runs on local GPU infrastructure ...
Google's AI Edge Eloquent app uses AI to edit out mid-sentence mistakes to provide you with a polished transcription of your ...
In a newsroom post, the tech giant introduced the three new large language models (LLMs). All of them are currently available via Microsoft Foundry and the MAI Playground. The biggest highlight is the ...
Mistral launches Voxtral TTS, extending its model family into speech generation and enabling end-to-end voice workflows.
Google AI Edge Eloquent for iOS is now available in the App Store.
OpenAI has introduced a series of AI audio models, fundamentally redefining how voice-based AI can be integrated into modern applications with ChatGPT. These advancements include state-of-the-art ...
Microsoft launches three in-house MAI models for transcription, voice and image generation through Foundry, hedging its ...
Microsoft has launched MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2, offering fast, high-quality AI models for ...