Friday, August 1, 2025

Emotive voice AI startup Hume launches new EVI 3 model with rapid custom voice creation

New York-based AI startup Hume has unveiled its latest Empathic Voice Interface (EVI) conversational AI model, EVI 3 (pronounced “Evee” Three, like the Pokémon character), targeting everything from powering customer support systems and health coaching to immersive storytelling and virtual companionship.

EVI 3 lets users create their own voices by talking to the model (it’s voice-to-voice/speech-to-speech), and aims to set a new standard for naturalness, expressiveness, and “empathy,” according to Hume: that is, how users perceive the model’s understanding of their emotions and its ability to mirror or adjust its own responses through tone and word choice.

Designed for businesses, developers, and creators, EVI 3 expands on Hume’s previous voice models by offering more sophisticated customization, faster responses, and enhanced emotional understanding.

Individual users can interact with it today through Hume’s live demo on its website and iOS app, but developer access through Hume’s proprietary application programming interface (API) is said to be made available in “the coming weeks,” as a blog post from the company states.

At that point, developers will be able to embed EVI 3 into their own customer service systems, creative projects, or virtual assistants, for a price (see below).

My own usage of the demo allowed me to create a new, custom synthetic voice in seconds based on qualities I described to it: a mix of warm and confident, with a masculine tone. Speaking to it felt more naturalistic and easy than other AI models, and certainly more so than the stock voices from legacy tech leaders such as Apple with Siri and Amazon with Alexa.

What developers and businesses should know about EVI 3

Hume’s EVI 3 is built for a range of uses, from customer service and in-app interactions to content creation in audiobooks and gaming.

It allows users to specify precise personality traits, vocal qualities, emotional tone, and conversation topics.

This means it can produce anything from a warm, empathetic guide to a quirky, mischievous narrator, down to requests like “a squeaky mouse whispering urgently in a French accent about its scheme to steal cheese from the kitchen.”

EVI 3’s core strength lies in its ability to integrate emotional intelligence directly into voice-based experiences.

Unlike traditional chatbots or voice assistants that rely heavily on scripted or text-based interactions, EVI 3 adapts to how people naturally speak, picking up on pitch, prosody, pauses, and vocal bursts to create more engaging, humanlike conversations.

However, one big feature Hume’s models currently lack, and which is offered by rivals both open source and proprietary, such as ElevenLabs, is voice cloning: the rapid replication of a user’s or another person’s voice, such as a company CEO’s.

Yet Hume has indicated it will add such a capability to its Octave text-to-speech model, as it is noted as “coming soon” on Hume’s website, and prior reporting by yours truly on the company found it will allow users to replicate voices from as little as five seconds of audio.

Hume has stated it is prioritizing safeguards and ethical considerations before making this feature broadly available. Currently, this cloning capability is not available in EVI itself, with Hume emphasizing flexible voice customization instead.

Internal benchmarks show users prefer EVI 3 to OpenAI’s GPT-4o voice model

According to Hume’s own tests with 1,720 users, EVI 3 was preferred over OpenAI’s GPT-4o in every category evaluated: naturalness, expressiveness, empathy, interruption handling, response speed, audio quality, voice emotion/style modulation on request, and emotion understanding on request (the “on request” features are covered under “instruction following”).

It also generally bested Google’s Gemini model family and the new open source AI model firm Sesame from former Oculus co-creator Brendan Iribe.

It also boasts lower latency (~300 milliseconds), robust multilingual support (English and Spanish, with more languages coming), and effectively unlimited custom voices.

Key capabilities include:

  • Prosody generation and expressive text-to-speech with modulation.
  • Interruptibility, enabling dynamic conversational flow.
  • In-conversation voice customizability, so users can adjust speaking style in real time.
  • API-ready architecture (coming soon), so developers can integrate EVI 3 directly into apps and services.

Pricing and developer access

Hume offers flexible, usage-based pricing across its EVI, Octave TTS, and Expression Measurement APIs.

While EVI 3’s specific API pricing has not been announced yet (marked as TBA), the pattern suggests it will be usage-based, with enterprise discounts available for large deployments.

For reference, EVI 2 is priced at $0.072 per minute, roughly 30% lower than its predecessor, EVI 1 ($0.102/minute).
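To make the per-minute figures concrete, here is a minimal Python sketch that compares the two quoted rates. The function names and the 1,000-minute monthly volume are illustrative assumptions, not part of Hume’s API or published examples, and the sketch ignores any enterprise discounts.

```python
# Per-minute rates quoted in the article (USD).
EVI_1_RATE = 0.102
EVI_2_RATE = 0.072


def conversation_cost(minutes: float, rate_per_minute: float) -> float:
    """Usage-based cost in USD for the given number of conversation minutes."""
    return round(minutes * rate_per_minute, 2)


def percent_reduction(old_rate: float, new_rate: float) -> float:
    """Price reduction of new_rate relative to old_rate, as a percentage."""
    return (old_rate - new_rate) / old_rate * 100


if __name__ == "__main__":
    minutes = 1_000  # hypothetical monthly usage, chosen only for illustration
    print(f"EVI 1: ${conversation_cost(minutes, EVI_1_RATE):.2f}")
    print(f"EVI 2: ${conversation_cost(minutes, EVI_2_RATE):.2f}")
    print(f"Reduction: {percent_reduction(EVI_1_RATE, EVI_2_RATE):.1f}%")
```

Computed from the two quoted rates, the reduction comes out slightly under the rounded 30% figure.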

For creators and developers working with text-to-speech projects, Hume’s Octave TTS plans range from a free tier (10,000 characters of speech, ~10 minutes of audio) to enterprise-level plans. Here’s the breakdown:

  • Free: 10,000 characters, unlimited custom voices, $0/month
  • Starter: 30,000 characters (~30 minutes), 20 projects, $3/month
  • Creator: 100,000 characters (~100 minutes), 1,000 projects, usage-based overage ($0.20/1,000 characters), $10/month
  • Pro: 500,000 characters (~500 minutes), 3,000 projects, $0.15/1,000 extra, $50/month
  • Scale: 2,000,000 characters (~2,000 minutes), 10,000 projects, $0.13/1,000 extra, $150/month
  • Business: 10,000,000 characters (~10,000 minutes), 20,000 projects, $0.10/1,000 extra, $900/month
  • Enterprise: Custom pricing and unlimited usage
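As a rough way to compare these tiers, the following Python sketch estimates the monthly bill for a given character volume and picks the cheapest listed plan. It rests on several assumptions that the article does not confirm: overage is billed pro rata per character, tiers with no listed overage rate are treated as hard caps, and the labels “Pro” and “Business” are used here for the $50 and $900 tiers. The custom-priced Enterprise tier is left out because no figures are given for it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_fee: float               # USD per month
    included_chars: int              # characters included each month
    overage_per_1k: Optional[float]  # USD per 1,000 extra characters; None if none is listed


# Figures as listed in the article; plan labels are illustrative.
PLANS = [
    Plan("Free", 0, 10_000, None),
    Plan("Starter", 3, 30_000, None),
    Plan("Creator", 10, 100_000, 0.20),
    Plan("Pro", 50, 500_000, 0.15),
    Plan("Scale", 150, 2_000_000, 0.13),
    Plan("Business", 900, 10_000_000, 0.10),
]


def monthly_cost(plan: Plan, chars: int) -> Optional[float]:
    """Estimated monthly cost in USD, or None if the plan cannot cover the volume."""
    extra = max(0, chars - plan.included_chars)
    if extra == 0:
        return float(plan.monthly_fee)
    if plan.overage_per_1k is None:
        return None  # assumption: no listed overage rate means a hard cap
    # Assumption: overage is charged pro rata, not in whole 1,000-character blocks.
    return round(plan.monthly_fee + extra / 1000 * plan.overage_per_1k, 2)


def cheapest_plan(chars: int) -> Tuple[str, float]:
    """Return (plan name, estimated cost) for the cheapest plan covering chars."""
    best: Optional[Tuple[str, float]] = None
    for plan in PLANS:
        cost = monthly_cost(plan, chars)
        if cost is not None and (best is None or cost < best[1]):
            best = (plan.name, cost)
    if best is None:
        raise ValueError("no listed plan covers this volume")
    return best


if __name__ == "__main__":
    for volume in (5_000, 150_000, 400_000):
        print(volume, cheapest_plan(volume))
```

Under these assumptions, paying overage on a lower tier can be cheaper than upgrading: 150,000 characters costs less on Creator with overage than on the next tier up, while 400,000 characters is cheaper on the $50 plan.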

For developers working on real-time voice interactions or emotional analysis, Hume also offers a Pay as You Go plan with $20 in free credits and no upfront commitment. High-volume enterprise customers can opt for a dedicated Enterprise plan featuring dataset licenses, on-premises solutions, custom integrations, and advanced support.

Hume’s history of emotive AI voice models

Founded in 2021 by Alan Cowen, a former researcher at Google DeepMind, Hume aims to bridge the gap between human emotional nuance and AI interaction.

The company trained its models on an expansive dataset drawn from hundreds of thousands of participants worldwide, capturing not just speech and text, but also vocal bursts and facial expressions.

“Emotional intelligence includes the ability to infer intentions and preferences from behavior. That’s the very core of what AI interfaces are trying to achieve,” Cowen told VentureBeat. Hume’s mission is to make AI interfaces more responsive, humanlike, and ultimately more useful, whether that’s helping a customer navigate an app or narrating a story with just the right blend of drama and humor.

In early 2024, the company launched EVI 2, which offered 40% lower latency and 30% reduced pricing compared to EVI 1, alongside new features like dynamic voice customization and in-conversation style prompts.

February 2025 saw the debut of Octave, a text-to-speech engine for content creators capable of adjusting emotions at the sentence level with text prompts.

With EVI 3 now available for hands-on exploration and full API access just around the corner, Hume hopes to allow developers and creators to reimagine what’s possible with voice AI.
