3.9 C
New York
Thursday, March 13, 2025

Buy now

Sakana walks back claims that its AI can dramatically speed up model training

This week, Sakana AI, a Nvidia-backed startup that’s raised a whole bunch of thousands and thousands of {dollars} from VC corporations, made a exceptional declare. The corporate stated it had created an AI system, the AI CUDA Engineer, that would successfully pace up the coaching of sure AI fashions by an element of as much as 100x.

The one downside is, the system didn’t work.

Customers on X shortly found that Sakana’s system truly resulted in worse-than-average mannequin coaching efficiency. In response to one person, Sakana’s AI resulted in a 3x slowdown — not a speedup.

What went unsuitable? A bug within the code, in accordance with a put up by Lucas Beyer, a member of the technical employees at OpenAI.

“Their orig code is unsuitable in [a] delicate method,” Beyer wrote on X. “The actual fact they run benchmarking TWICE with wildly completely different outcomes ought to make them cease and suppose.”

In a postmortem printed Friday, Sakana admitted that the system has discovered a strategy to — as Sakana described it — “cheat” and blamed the programs’s tendency to “reward hack” — i.e. establish flaws to attain excessive metrics with out conducting the specified purpose (rushing up mannequin coaching). Related phenomena has been noticed in AI that’s skilled to play video games of chess.

In response to Sakana, the system discovered exploits within the analysis code that the corporate was utilizing that allowed it to bypass validations for accuracy, amongst different checks. Sakana says it has addressed the difficulty, and that it intends to revise its claims in up to date supplies.

See also  A job ad for Y Combinator startup Firecrawl seeks to hire an AI agent for $15K a year

“We’ve since made the analysis and runtime profiling harness extra strong to get rid of a lot of such [sic] loopholes,” the corporate wrote in an X put up. “We’re within the strategy of revising our paper, and our outcomes, to mirror and focus on the consequences […] We deeply apologize for our oversight to our readers. We’ll present a revision of this work quickly, and focus on our learnings.”

Props to Sakana for proudly owning as much as the error. However the episode is an effective reminder that if a declare sounds too good to be true, particularly in AI, it most likely is.

Supply hyperlink

Related Articles

Leave a Reply

Please enter your comment!
Please enter your name here

Latest Articles