Popular AI image generation service Midjourney has launched its first AI video generation model, V1, marking a pivotal shift for the company from image generation toward full multimedia content creation.
Starting today, Midjourney's nearly 20 million users can animate images through the website, transforming their generated or uploaded stills into 5-second clips, with options for extending the generation up to 20 seconds (in 5-second bursts) and guiding it with text.
With the launch, the bootstrapped small lab Midjourney positions itself in a rapidly intensifying AI video race. At the same time, it is also confronting serious legal challenges from two of the biggest entertainment studios in the world.
What does it mean for AI creators and enterprises looking to harness the latest in creative tech for advertising, marketing, or user engagement? And how does Midjourney stack up against a long and growing list of AI video model rivals? Read on to find out.
A new product built directly atop Midjourney's popular AI image generator
Midjourney's new offering extends its familiar image-based workflow, including its new V7 text-to-image model.
Users generate a still image, either within the Midjourney platform or by uploading an external file, then press "Animate" to turn that image into video.
Two primary modes exist: one uses automated motion synthesis, while the other lets users write a custom motion prompt to dictate, via text, how elements should move in the scene. Midjourney video thus arrives with support for both image-to-video animation and text-guided edits and modifications.
From a creative standpoint, users can toggle between two motion settings. Low motion mode is optimized for ambient or minimalist movement, such as a character blinking or a light breeze shifting scenery, while high motion mode attempts more dynamic animation of both subject and camera, though this can increase the chance of visual errors.
These are accessed below a generated or uploaded image on the Midjourney website, in the right-hand options pane under a field labeled "Animate Image," as seen here:
Each video job generates four different 5-second clips as options, and users can extend the animation by 4 seconds per clip, up to a total of 20 seconds.
While this is relatively short-form, the company has indicated that video duration and features will expand in future updates.
Midjourney, launched in summer 2022, is still widely considered by many AI image creators to be the premier or "gold standard" service in AI image generation, thanks to its relatively frequent updates and its more realistic and varied creation options, so there were high expectations surrounding its entry into the AI video space.
Initial reactions from users we've seen have been largely positive, with some, like Perplexity AI designer Phi Hoang (@apostraphi on X), commenting in a post on X: "it's surpassing all my expectations."
Here's a sample generation I created with my personal Midjourney account:
Affordable pricing
Midjourney is offering video access as part of its existing subscription plans, starting at $10 per month.
The company states that each video job will cost roughly 8x more than an image generation task. However, since each video job produces 20 seconds of content, the cost per second is roughly equivalent to generating one still image, a pricing model that appears to undercut many competitors.
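The arithmetic behind that claim can be sketched out. The figures below (8x per job, four 5-second clips) come from the article; the image-job cost is an arbitrary unit, not a published price, since actual billing is metered in GPU minutes by subscription tier:

```python
# Rough sketch of Midjourney's stated video pricing arithmetic.
# IMAGE_JOB_COST is a placeholder unit, not a real dollar figure.
IMAGE_JOB_COST = 1.0                      # one image generation task
VIDEO_JOB_COST = 8 * IMAGE_JOB_COST       # company states ~8x an image job

CLIPS_PER_JOB = 4                         # each job returns four clip options
SECONDS_PER_CLIP = 5

total_seconds = CLIPS_PER_JOB * SECONDS_PER_CLIP    # 20 seconds of footage
cost_per_second = VIDEO_JOB_COST / total_seconds    # in image-job units

print(f"{total_seconds} s per job, {cost_per_second:.2f} image-jobs per second")
```

Under these assumptions a second of video works out to a fraction of one image job; the exact ratio a user experiences depends on how Midjourney meters an "image job" against GPU minutes on each tier.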
A "video relax mode" is being tested for "Pro" subscribers and above. This mode, like its counterpart in image generation, would offer delayed processing in exchange for reduced compute costs. Fast generation remains metered through GPU minutes based on tiered subscription plans.
Community commentators have largely received the pricing positively. AI content creator @BLVCKLIGHTai emphasized on social media that the cost is roughly in line with what users pay for upscaling images, making the tool surprisingly affordable for short-form video experimentation.
It's comparable to rival Luma AI's "Web Lite Plan" at $9.99 per month and below Runway's "Standard" plan ($15 monthly).
Here are some of the other options available:
No sound yet, and a more limited built-in editor than AI video rivals such as Runway, Sora, and Luma
The model's most noticeable limitation is its lack of sound.
Unlike rivals such as Google's Veo 3 and Luma Labs' Dream Machine, Midjourney's system doesn't generate accompanying audio tracks or ambient sound effects.
For now, any soundtrack would need to be added manually in post-production using separate tools.
In addition, Midjourney's outputs remain short, capped at 20 seconds. There is no current support for editing timelines, scene transitions, or continuity between clips.
Midjourney has stated this is only the beginning, and that the initial launch is meant to be exploratory, accessible, and scalable.
Rising stakes in a crowded AI video market
The launch lands at a time when AI video generation is rapidly becoming one of the most competitive corners of the generative AI landscape.
Tech giants, venture-backed startups, and open-source initiatives are all moving fast.
This week, Chinese startup MiniMax released Hailuo 02, an upgrade to its earlier video model. Early feedback has praised its realism, motion adherence to prompts, and 1080p resolution, though some reviewers noted that render times are still relatively slow.
The model appears especially adept at interpreting complex motion and cinematic camera angles, putting it in direct comparison with Western offerings like Runway's Gen-3 Alpha and Google's Veo line.
Meanwhile, Luma Labs' Dream Machine has gained traction for its ability to co-generate audio alongside high-fidelity video, a feature missing from Midjourney's new release, and, like Runway, it allows for re-stylizing or "re-skinning" video with a new feature called Modify Video.
Google's Veo 3 and OpenAI's upcoming Sora model are similarly working toward broader multimodal synthesis, integrating text, image, video, and sound into cohesive, editable scenes.
Midjourney's bet appears to be on simplicity and cost-effectiveness, a "good enough" solution priced for scale, but that also means it launches without many advanced features now standard in the premium AI video tier.
The shadow of litigation from Disney and Universal over IP infringement
Just days before the launch, Midjourney was named in a sweeping copyright infringement lawsuit filed by Disney and Universal in U.S. District Court.
The complaint, spanning more than 100 pages, accuses Midjourney of training its models on copyrighted characters, including those from Marvel, Star Wars, The Simpsons, and Shrek, without authorization, and of continuing to allow users to generate derivative content.
The studios allege that Midjourney has created a "bottomless pit of plagiarism," intentionally enabling users to produce downloadable images featuring characters like Darth Vader, Elsa, Iron Man, Bart Simpson, Shrek, and Toothless with little friction.
They further claim that Midjourney used data-scraping tools and web crawlers to ingest copyrighted materials and failed to implement technical safeguards to block outputs resembling protected IP.
Of particular note: the lawsuit preemptively names Midjourney's video service as a likely source of future infringement, stating that the company had begun training the model before launch and was likely already replicating protected characters in motion.
According to the complaint, Midjourney earned $300 million in revenue in 2024 and serves nearly 21 million users. The studios argue that this scale gives the platform a commercial advantage built atop uncompensated creative labor.
Disney's general counsel, Horacio Gutierrez, stated plainly: "Piracy is piracy. And the fact that it's done by an AI company doesn't make it any less infringing."
The lawsuit is expected to test the limits of U.S. copyright law as it pertains to AI training data and output control, and could influence how platforms like Midjourney, OpenAI, and others must structure future content filters and licensing agreements.
For enterprises concerned about infringement risks, services with built-in indemnification, like OpenAI's Sora or Adobe Firefly Video, are probably safer options for AI video creation.
A 'world model' and real-time world generation are the goal
Despite the immediate risks, Midjourney's long-term roadmap is clear and ambitious. In public statements surrounding the video model's launch, the company said its goal is to eventually merge static image generation, animated motion, 3D spatial navigation, and real-time rendering into a single, unified system, also known as a world model.
These systems aim to let users navigate through dynamically generated environments, spaces where visuals, characters, and user inputs evolve in real time, like immersive video games or VR experiences.
They envision a future where users can issue commands like "walk through a market in Morocco at sunset," and the system responds with an explorable, interactive simulation, complete with evolving visuals and perhaps, eventually, generative sound.
For now, the video model serves as an early step in this direction. Midjourney has described it as a "technical stepping stone" to more complex systems.
But Midjourney is far from the only AI research lab pursuing such ambitious plans.
Odyssey, a startup co-founded by self-driving tech veterans Oliver Cameron and Jeff Hawke, recently debuted a system that streams video at 30 frames per second with spatial interaction capabilities. Its model attempts to predict the "next state of the world" based on prior states and actions, letting users look around and explore scenes as if navigating a 3D space.
Odyssey combines AI modeling with its own 360-degree camera hardware and is pursuing integrations with 3D platforms like Unreal Engine and Blender for post-generation editing. However, it doesn't yet allow much user control beyond moving the camera and seeing what sights the model produces as the user navigates the generated space.
Similarly, Runway, a longtime player in AI video generation, has begun folding world modeling into its public roadmap. The company's AI video models (the latest among them, Gen-4, released in April 2025) support advanced AI camera controls that let users arc around subjects, zoom in and out, or smoothly glide across environments, features that begin to blur the line between video generation and scene simulation.
In a 2023 blog post, Runway's CTO Anastasis Germanidis described general world models as systems that understand environments deeply enough to simulate future events and interactions within them. In other words, they're not just producing what a scene looks like; they're predicting how it behaves.
Other major AI efforts in this space include:
- DeepMind, which has conducted foundational research into world modeling for robotics training and reinforcement learning;
- World Labs, the new venture led by AI researcher Fei-Fei Li, focused specifically on simulation-centric models;
- Microsoft, which is exploring world models for enterprise applications like digital twins and simulation-based training;
- Decart, a stealthier but well-funded startup working on multi-agent simulation models.
While Midjourney's approach has so far emphasized accessibility and ease of use, it is now signaling an evolution toward these more sophisticated simulation frameworks. The company says that to get there, it must first build the necessary components: static visuals (its original image models), motion (video models), spatial control (3D positioning), and real-time responsiveness. Its new video model, then, serves as one foundational block in this longer arc.
This puts Midjourney in a global race, not just to generate beautiful media, but to define the infrastructure of interactive, AI-generated worlds.
A calculated and promising leap into an increasingly complicated competitive field
Midjourney's entry into video generation is a logical extension of its popular image platform, priced for broad access and designed to lower the barrier to animation experimentation. It offers a straightforward path for creators to bring their visuals to life, at a cost structure that, for now, appears both competitive and sustainable.
But this launch also places the company squarely in the crosshairs of multiple challenges. On the product side, it faces capable and fast-moving rivals with more features and less legal baggage. On the legal front, it must defend its practices in a lawsuit that could reshape how AI companies are allowed to train and deploy generative models in the U.S.
For enterprise leaders evaluating AI creative platforms, Midjourney's launch presents a double-edged sword: a low-cost, fast-evolving tool with strong user adoption, but with unresolved regulatory and IP exposure that could affect reliability or continuity in enterprise deployments.
The question going forward is whether Midjourney can maintain its velocity without hitting a legal wall, or whether it will have to significantly restructure its business and technology to stay viable in a maturing AI content ecosystem.