Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users “interact” with streaming video.
Available on the web in an “early demo,” the model generates and streams video frames every 40 milliseconds. Using basic controls, viewers can explore areas within a video, much as they would in a 3D-rendered video game.
“Given the current state of the world, an incoming action, and a history of states and actions, the model attempts to predict the next state of the world,” explains Odyssey in a blog post. “Powering this is a new world model, demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and outputting coherent video streams for five minutes or more.”
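The prediction loop Odyssey describes (current state, incoming action, and a history of states and actions in, next state out) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `predict_next_state`, `rollout`, and `MOVES` are hypothetical, the "state" is a 2D grid position rather than a video frame, and the predictor is a hand-written rule rather than a learned model. It is not Odyssey's implementation.

```python
from collections import deque

# Hypothetical action set: each action maps to a displacement on a 2D grid.
MOVES = {
    "forward": (0, 1),
    "back": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def predict_next_state(state, action, history):
    """Toy stand-in for a world model's prediction step.

    A learned model would condition on `history` as well; this rule-based
    version accepts it only to mirror the interface and otherwise ignores it.
    """
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def rollout(initial_state, actions, history_len=8):
    """Feed actions in one at a time, keeping a bounded history of
    (state, action) pairs, and return every state visited."""
    history = deque(maxlen=history_len)  # oldest entries drop off automatically
    state = initial_state
    states = [state]
    for action in actions:
        next_state = predict_next_state(state, action, history)
        history.append((state, action))
        state = next_state
        states.append(state)
    return states


if __name__ == "__main__":
    print(rollout((0, 0), ["forward", "forward", "right"]))
```

In a real system, each loop iteration would have to complete within the per-frame time budget, and the quality of the prediction step determines whether the generated world stays consistent as the history grows.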
A number of startups and big tech companies are chasing after world models, including DeepMind, influential AI researcher Fei-Fei Li’s World Labs, Microsoft, and Decart. They believe that world models could one day be used to create interactive media, such as games and movies, and to run realistic simulations like training environments for robots.
But creatives have mixed feelings about the tech. A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and combat attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI in the coming months.
For its part, Odyssey is pledging to collaborate with creative professionals, not replace them.
“Interactive video […] opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production,” writes the company in its blog post. “Over time, we believe everything that’s video today (entertainment, ads, education, training, travel, and more) will evolve into interactive video, all powered by Odyssey.”
Odyssey’s demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry and distorted, and unstable in the sense that their layouts don’t always remain the same. Walk forward in one direction for a while or turn around, and the surroundings might suddenly look different.
But the company is promising to rapidly improve on the model, which can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs at a cost of $1 to $2 per “user-hour.”
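For a sense of scale, the figures quoted in this article can be turned into per-frame numbers. The sketch below is back-of-the-envelope arithmetic only: the helper names are hypothetical, and it assumes a constant frame rate over a full hour, which the article does not state. Note that a 40-millisecond frame interval works out to 25 frames per second, somewhat below the "up to 30" peak figure; the article does not reconcile the two.

```python
def fps_from_interval(interval_ms):
    """Frames per second implied by a fixed per-frame interval in milliseconds."""
    return 1000.0 / interval_ms


def frame_interval_ms(fps):
    """Per-frame time budget in milliseconds at a given frame rate."""
    return 1000.0 / fps


def frames_per_hour(fps):
    """Frames streamed in one hour, assuming a constant frame rate."""
    return fps * 3600


def cost_per_frame(cost_per_user_hour, fps):
    """Dollar cost of one frame, assuming a constant frame rate for the hour."""
    return cost_per_user_hour / frames_per_hour(fps)


if __name__ == "__main__":
    print(f"40 ms per frame -> {fps_from_interval(40):.0f} fps")
    print(f"30 fps -> {frame_interval_ms(30):.1f} ms per frame")
    print(f"30 fps -> {frames_per_hour(30)} frames per hour")
    for dollars in (1.0, 2.0):
        print(f"${dollars:.0f}/user-hour at 30 fps -> ${cost_per_frame(dollars, 30):.7f} per frame")
```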
“Looking ahead, we’re researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state,” writes Odyssey in its post. “In parallel, we’re expanding the action space from motion to world interaction, learning open actions from large-scale video.”
Odyssey is taking a different approach from many AI labs in the world modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey thinks can serve as a basis for higher-quality models than those trained solely on publicly available data.
To date, Odyssey has raised $27 million from investors including EQT Ventures, GV, and Air Street Capital. Ed Catmull, one of the co-founders of Pixar and former president of Walt Disney Animation Studios, is on the startup’s board of directors.
Last December, Odyssey said it was working on software that allows creators to load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects so that they can be hand-edited.