Mistral AI has launched Voxtral Transcribe 2, a new on-device speech-to-text model family featuring real-time transcription, ...
Multi-modal models that can process both ...
Cory Benfield discusses the evolution of ...
NVIDIA has debuted a new experimental generative AI model, which it describes as "a Swiss Army knife for sound." The model, called Foundational Generative Audio Transformer Opus 1, or Fugatto, can take ...
Realtime API supports multimodal text and speech experiences, including natural speech-to-speech conversations using preset voices already supported in the API. OpenAI has introduced a public beta of ...
Microsoft has introduced a new AI model that, it says, can process speech, vision, and text locally on-device using less compute capacity than previous models. Innovation in generative artificial ...
Bulbul V3 is a text-to-speech AI model that looks to make the output audio sound more natural by rendering pauses, emphasis, ...