Together AI’s $305M bet: Reasoning models like DeepSeek-R1 are increasing, not decreasing, GPU demand

February 21, 2025

When DeepSeek-R1 first emerged, the prevailing fear that shook the industry was that advanced reasoning could be achieved with far less infrastructure.

As it turns out, that's not necessarily the case. At least according to Together AI, the rise of DeepSeek and open-source reasoning has had the exact opposite effect: Instead of reducing the need for infrastructure, it is increasing it.

That increased demand has helped fuel the growth of Together AI's platform and business. Today the company announced a $305 million series B round of funding, led by General Catalyst and co-led by Prosperity7. Together AI first emerged in 2023 with an aim to simplify enterprise use of open-source large language models (LLMs). The company expanded in 2024 with the Together enterprise platform, which enables AI deployment in virtual private cloud (VPC) and on-premises environments. In 2025, Together AI is growing its platform once again with reasoning clusters and agentic AI capabilities.

The company claims that its AI deployment platform has more than 450,000 registered developers and that the business has grown 6X overall year-over-year. The company's customers include enterprises as well as AI startups such as Krea AI, Captions and Pika Labs.

“We are now serving models across all modalities: language and reasoning and images and audio and video,” Vipul Prakash, CEO of Together AI, told VentureBeat.

The huge impact DeepSeek-R1 is having on AI infrastructure demand

DeepSeek-R1 was hugely disruptive when it first debuted, for a number of reasons. One was the implication that a leading-edge open-source reasoning model could be built and deployed with less infrastructure than a proprietary model.

However, Prakash explained, Together AI has grown its infrastructure in part to help support increased demand for DeepSeek-R1-related workloads.

“It’s a fairly expensive model to run inference on,” he said. “It has 671 billion parameters and you need to distribute it over multiple servers. And because the quality is higher, there’s generally more demand on the top end, which means you need more capacity.”
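To see why a 671-billion-parameter model has to be spread over multiple servers, a rough back-of-the-envelope sketch helps. The parameter count comes from the quote above; the bytes per parameter, the 80 GB of memory per GPU and the 8 GPUs per server are illustrative assumptions, not figures from Together AI. The estimate covers weights only and ignores the memory needed for caches and activations, so it is a lower bound.

```python
import math

PARAMS = 671e9  # parameter count quoted in the article


def servers_needed(bytes_per_param, gpu_mem_gb=80, gpus_per_server=8):
    """Lower bound on hardware needed just to hold the model weights.

    gpu_mem_gb and gpus_per_server are assumed values for illustration.
    """
    weight_gb = PARAMS * bytes_per_param / 1e9
    gpus = math.ceil(weight_gb / gpu_mem_gb)
    servers = math.ceil(gpus / gpus_per_server)
    return weight_gb, gpus, servers


# Compare two assumed weight precisions: 1 byte and 2 bytes per parameter.
for label, bytes_per_param in [("8-bit", 1), ("16-bit", 2)]:
    weight_gb, gpus, servers = servers_needed(bytes_per_param)
    print(f"{label}: {weight_gb:.0f} GB of weights, "
          f"at least {gpus} GPUs, at least {servers} servers")
```

Under these assumptions the weights alone do not fit on a single eight-GPU server at either precision, which is consistent with Prakash's point about distributing the model.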

Additionally, he noted that DeepSeek-R1 generally has longer-lived requests that can last two to three minutes. Tremendous user demand for DeepSeek-R1 is further driving the need for more infrastructure.
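The link between long-lived requests and capacity can be illustrated with Little's law, which says the average number of requests in flight equals the arrival rate multiplied by the average time each request takes. The sketch below is only an illustration: the arrival rate of 10 requests per second and the 5-second baseline are hypothetical, while the 150-second figure is the midpoint of the two-to-three-minute range mentioned above.

```python
def in_flight_requests(arrivals_per_second, seconds_per_request):
    """Little's law: average concurrency = arrival rate x average duration."""
    return arrivals_per_second * seconds_per_request


ARRIVALS_PER_SECOND = 10  # hypothetical traffic level

short_lived = in_flight_requests(ARRIVALS_PER_SECOND, 5)    # assumed short request
long_lived = in_flight_requests(ARRIVALS_PER_SECOND, 150)   # 2.5-minute request

print(f"short requests: {short_lived} in flight on average")
print(f"long requests:  {long_lived} in flight on average")
print(f"ratio: {long_lived / short_lived:.0f}x more concurrent work")
```

At the same traffic level, the longer requests keep far more work resident on the hardware at any moment, which is one way to read the claim that reasoning workloads need more capacity.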

To meet that demand, Together AI has rolled out a service it calls “reasoning clusters” that provision dedicated capacity, ranging from 128 to 2,000 chips, to run models at the best possible performance.

How Together AI is helping organizations use reasoning AI

There are a number of specific areas where Together AI is seeing usage of reasoning models. These include:

Coding agents: Reasoning models help break down larger problems into steps.
Reducing hallucinations: The reasoning process helps to verify the outputs of models, thus reducing hallucinations, which is important for applications where accuracy is crucial.
Improving non-reasoning models: Customers are distilling and improving the quality of non-reasoning models.
Enabling self-improvement: The use of reinforcement learning with reasoning models allows models to recursively self-improve without relying on large amounts of human-labeled data.

Agentic AI is also driving increased demand for AI infrastructure

Together AI is also seeing increased infrastructure demand as its users embrace agentic AI.

Prakash explained that agentic workflows, where a single user request results in thousands of API calls to complete a task, are putting more compute demand on Together AI’s infrastructure.

To help support agentic AI workloads, Together AI recently acquired CodeSandbox, whose technology provides lightweight, fast-booting virtual machines (VMs) to execute arbitrary code securely within the Together AI cloud, where the language models also reside. This allows Together AI to reduce the latency between the agentic code and the models that need to be called, improving the performance of agentic workflows.
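A small sketch shows why running agent code next to the models can matter when one request fans out into many model calls. The numbers here are hypothetical: 1,000 calls stands in for the "thousands" Prakash mentions, the calls are assumed to run one after another, and the 50 ms and 1 ms round-trip times are illustrative values for a remote and a co-located setup rather than measurements.

```python
def network_overhead_seconds(num_calls, round_trip_ms):
    """Total time spent on network round trips for sequential calls."""
    return num_calls * round_trip_ms / 1000


NUM_CALLS = 1000  # hypothetical stand-in for "thousands of API calls"

remote = network_overhead_seconds(NUM_CALLS, 50)     # assumed cross-network RTT
colocated = network_overhead_seconds(NUM_CALLS, 1)   # assumed same-cloud RTT

print(f"remote agent code:     {remote:.1f} s spent on round trips")
print(f"co-located agent code: {colocated:.1f} s spent on round trips")
```

The per-call difference is tiny, but it is multiplied by every call in the workflow, so the total adds up quickly when calls cannot be parallelized.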

Nvidia Blackwell is already having an impact

All AI platforms are facing increased demands.

That’s one of the reasons why Nvidia keeps rolling out new silicon that provides more performance. Nvidia’s latest chip is the Blackwell GPU, which is now being deployed at Together AI.

Prakash said Nvidia Blackwell chips cost around 25% more than the previous generation, but provide 2X the performance. The GB200 platform with Blackwell chips is particularly well-suited for training and inference of mixture-of-experts (MoE) models, which are trained across multiple InfiniBand-connected servers. He noted that Blackwell chips are also expected to provide a bigger performance boost for inference of larger models, compared to smaller models.
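Taken at face value, the two figures Prakash cites imply a sizable drop in cost per unit of performance. The sketch below simply works through that arithmetic using the quoted 25% price increase and 2X performance; it treats both as exact, which is an assumption, and it ignores power, networking and other operating costs.

```python
def relative_cost_per_performance(cost_ratio, performance_ratio):
    """Cost per unit of performance, relative to the previous generation."""
    return cost_ratio / performance_ratio


COST_RATIO = 1.25         # "around 25% more" than the previous generation
PERFORMANCE_RATIO = 2.0   # "2X the performance"

relative = relative_cost_per_performance(COST_RATIO, PERFORMANCE_RATIO)
savings = 1 - relative

print(f"cost per unit of performance: {relative:.3f}x the previous generation")
print(f"implied reduction: {savings:.1%}")
```

On those numbers, each unit of performance costs roughly five-eighths of what it did on the prior generation.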

The competitive landscape of agentic AI

The market for AI infrastructure platforms is fiercely competitive.

Together AI faces competition from both established cloud providers and AI infrastructure startups. All the hyperscalers, including Microsoft, AWS and Google, have AI platforms. There is also an emerging category of AI-focused players such as Groq and SambaNova that are all aiming for a slice of the lucrative market.

Together AI has a full-stack offering, including GPU infrastructure with software platform layers on top. This allows customers to easily build with open-source models or develop their own models on the Together AI platform. The company also has a focus on research, developing optimizations and accelerated runtimes for both inference and training.

“For instance, we serve the DeepSeek-R1 model at 85 tokens per second and Azure serves it at 7 tokens per second,” said Prakash. “There is a fairly widening gap in the performance and cost that we can provide to our customers.”
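To put the two throughput figures in user-facing terms, the sketch below converts tokens per second into the time needed to generate a response. The 85 and 7 tokens-per-second rates are the ones Prakash quotes; the 1,000-token response length is a hypothetical example, and the calculation assumes the rate stays constant for the whole response.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to generate a response at a constant decoding rate."""
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 1000  # hypothetical response length

fast = generation_seconds(RESPONSE_TOKENS, 85)  # rate quoted for Together AI
slow = generation_seconds(RESPONSE_TOKENS, 7)   # rate quoted for Azure

print(f"at 85 tokens/s: {fast:.1f} s")
print(f"at 7 tokens/s:  {slow:.1f} s")
print(f"speed ratio: {85 / 7:.1f}x")
```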
