Groq and PlayAI announced a partnership today to bring Dialog, an advanced text-to-speech model, to market through Groq’s high-speed inference platform.
The partnership combines PlayAI’s expertise in voice AI with Groq’s specialized processing infrastructure, creating what the companies claim is one of the most natural-sounding and responsive text-to-speech systems available.
“Groq offers a complete, low latency system for automatic speech recognition (ASR), GenAI, and text-to-speech, all in one place,” said Ian Andrews, Chief Revenue Officer at Groq, in an exclusive interview with VentureBeat. “With Dialog now running on GroqCloud, this means customers won’t have to use multiple providers for a single use case: Groq is a one stop solution.”
Groq powers first Arabic voice AI, expanding Middle East tech presence
Dialog is notable for being available in both English and Arabic, with the Arabic version representing the first voice AI specifically designed for the Middle East region. The inclusion of Arabic as one of the initial offerings was strategic for both companies.
“Arabic is the fourth most spoken language globally. By partnering with PlayAI to offer an Arabic TTS model, Groq is unlocking a key global market and enabling broader access to fast AI inference,” Andrews told VentureBeat.
The companies claim their solution addresses key shortcomings in existing voice AI technologies, particularly around natural speech patterns and response speed. According to benchmark testing conducted by third-party evaluator Podonos, Dialog was preferred by users at a rate of 10:1 versus ElevenLabs v2.5 Turbo and over 3:1 against ElevenLabs Multilingual v2.0.
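To put those ratios in more familiar terms, a preference ratio can be converted into a share of head-to-head comparisons. The short sketch below is illustrative arithmetic only: it assumes each comparison produced a clear winner with no ties, which the article does not confirm, and the function name `preference_share` is made up for this example.

```python
def preference_share(wins: float, losses: float) -> float:
    """Convert a wins:losses preference ratio into a fraction of comparisons won.

    Assumes every comparison has a clear winner (no ties); that is an
    assumption for illustration, not a detail reported by Podonos.
    """
    return wins / (wins + losses)


# Ratios quoted in the article.
vs_turbo = preference_share(10, 1)        # 10:1 versus ElevenLabs v2.5 Turbo
vs_multilingual = preference_share(3, 1)  # "over 3:1", so this is a lower bound

print(f"10:1 corresponds to about {vs_turbo:.1%} of comparisons")
print(f"3:1 corresponds to at least {vs_multilingual:.1%} of comparisons")
```

Under that assumption, a 10:1 ratio means Dialog was chosen in roughly 10 of every 11 comparisons, and a ratio above 3:1 means it was chosen in at least 3 of every 4.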
Innovative ‘adaptive speech contextualizer’ transforms conversational AI
What sets Dialog apart is its sophisticated approach to context. Rather than treating each vocalization as an isolated event, the system maintains awareness of the full conversation flow.
“We built a novel architecture that we call an ‘adaptive speech contextualizer’ (ASC), which allows the model to use the full context and history of a conversation,” said Mahmoud Felfel, co-founder and CEO of PlayAI, in an interview with VentureBeat. “This means that every response isn’t just a standalone output; it’s enriched with appropriate prosody, tone, and emotion that reflect the flow of the conversation.”
For enterprises looking to implement conversational AI, latency, the delay between request and response, has been a persistent challenge. Groq’s specialized Language Processing Units (LPUs) appear to provide a significant advantage in this area.
“Based on initial internal testing, Groq is delivering up to 140 characters per second on PlayAI’s Dialog model, a significant boost compared to the same model running on GPUs at 86 characters per second,” explained Andrews. “That means that Dialog generates text up to 10 times faster than real-time.”
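The throughput figures Andrews quotes can be checked with some quick arithmetic. The sketch below uses only the numbers from the quote; the helper names are hypothetical, and the "implied real-time rate" assumes the 10x claim refers to the 140 characters-per-second figure, which the article does not state explicitly.

```python
# Figures quoted by Groq (preliminary internal testing, per the article).
GROQ_CHARS_PER_SEC = 140
GPU_CHARS_PER_SEC = 86
REALTIME_MULTIPLE = 10  # "up to 10 times faster than real-time"


def speedup(fast: float, slow: float) -> float:
    """Ratio of two throughputs measured in the same units."""
    return fast / slow


def implied_realtime_rate(throughput: float, multiple: float) -> float:
    """Playback rate (chars/sec) implied if `throughput` is `multiple` x real-time.

    Assumption for illustration: the 10x claim is measured against the
    140 chars/sec figure.
    """
    return throughput / multiple


def seconds_to_synthesize(text: str, chars_per_sec: float) -> float:
    """Rough generation time for `text` at a steady character throughput."""
    return len(text) / chars_per_sec


print(f"Groq vs. GPU speedup: {speedup(GROQ_CHARS_PER_SEC, GPU_CHARS_PER_SEC):.2f}x")
print(f"Implied real-time rate: "
      f"{implied_realtime_rate(GROQ_CHARS_PER_SEC, REALTIME_MULTIPLE):.0f} chars/sec")
```

On these numbers, the LPU result is about 1.6 times the GPU throughput, and a 10x real-time claim implies spoken playback of roughly 14 characters per second. Both are back-of-the-envelope readings of preliminary figures rather than independent measurements.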
Groq secures $1.5 billion Saudi funding to build world-class AI infrastructure
The partnership comes at a time of significant expansion for Groq, which recently secured a $1.5 billion commitment from Saudi Arabia to fund additional infrastructure. The company has established a data center in Dammam, which it describes as “the region’s largest inference cluster.”
“Partnering with Groq was a no brainer; they’re the industry leader in advanced AI inference infrastructure,” said Felfel. “With TTS and agents, low latency is critical. We’ve already optimized Dialog for these real-time applications, but partnering with Groq allows us to deliver the lowest latency voice model on the market.”
The voice AI market has seen rapid growth as businesses look to automate customer interactions while maintaining a natural, human-like experience. Applications range from customer service and sales automation to voice-overs and accessibility features for the visually impaired.
Enterprise applications extend beyond traditional customer service use cases
“Beyond customer service, other enterprise use cases include automating sales and appointment scheduling, on-boarding and personal assistants, creating voice overs to present content, translating English audio and video content into Arabic, increasing website and static content accessibility for the visually impaired, and more,” Andrews said.
For PlayAI, which was founded by entrepreneurs from the Middle East and North Africa region, the inclusion of Arabic language capabilities was particularly meaningful.
“As MENA founders, we know the region is heavily investing in AI capabilities and infrastructure as reflected in investments like Groq, but also world-leading adoption,” said Felfel. “Arabic is a global business language and one that we grew up speaking, so it was a natural choice as one of our core languages.”
The companies have made the Dialog technology available through GroqCloud’s tiered service model, which includes both free and paid options. This approach allows developers to experiment with the technology before committing to larger implementations.
“GroqCloud offers both free and paid plans. Anyone can create an account and create an API key for free,” Andrews explained. “Our paid Developer Tier is self-serve, meaning anyone with a credit card can sign up themselves.”
As voice becomes an increasingly important interface for AI systems, this partnership positions both companies to capitalize on the growing demand for more natural and responsive conversational experiences. By addressing the technical challenges of latency and natural speech patterns, Groq and PlayAI may have removed significant barriers to wider adoption of voice AI in enterprise settings.