As AI companies mature, the fight for high-quality data has become one of the most competitive areas in the industry, launching companies like Mercor, Surge, and, most prominently, Alexandr Wang’s Scale AI. But now that Wang has moved on to run AI at Meta, many funders see an opening and are willing to fund companies with compelling new strategies for collecting training data.
The Y Combinator graduate Datacurve is one such company, focusing on high-quality data for software development. On Thursday, the company announced a $15 million Series A round, led by Mark Goldberg at Chemistry with participation from employees at DeepMind, Vercel, Anthropic, and OpenAI. The Series A comes after a $2.7 million seed round, which drew investment from former Coinbase CTO Balaji Srinivasan.
Datacurve uses a “bounty hunter” system to attract skilled software engineers to complete the hardest-to-source datasets. The company pays for these contributions, distributing over $1 million in bounties so far.
But co-founder Serena Ge (pictured above with co-founder Charley Lee) says the biggest motivation isn’t financial. For high-value services like software development, the pay will always be far lower for data work than for conventional employment, so the company’s most important edge is a positive user experience.
“We treat this as a consumer product, not a data labeling operation,” Ge said. “We spend a lot of time thinking about: How can we optimize it so that the people we want are interested and get onto our platform?”
That’s particularly important as the needs of post-training data grow more complex. While earlier models were trained on simple datasets, today’s AI products rely on complex RL environments, which need to be constructed through specific and strategic data collection. As the environments grow more sophisticated, the data requirements become more intense in both quantity and quality, a factor that could give high-quality data collection companies like Datacurve an edge.
As an early-stage company, Datacurve is focused on software engineering, but Ge says the model could apply just as easily to fields like finance, marketing, or even medicine.
“What we’re doing right now is we’re creating an infrastructure for post-training data collection that attracts and retains highly competent people in their own domains,” Ge says.