Meta Llama: Everything you need to know about the open generative AI model

Like every Big Tech company today, Meta has its own flagship generative AI model, called Llama. Llama is somewhat unique among major models in that it's "open," meaning developers can download and use it however they please (with certain limitations). That's in contrast to models like Anthropic's Claude, Google's Gemini, xAI's Grok, and most of OpenAI's ChatGPT models, which can only be accessed via APIs.

In the interest of giving developers choice, however, Meta has also partnered with vendors including AWS, Google Cloud, and Microsoft Azure to make cloud-hosted versions of Llama available. In addition, the company publishes tools, libraries, and recipes in its Llama cookbook to help developers fine-tune, evaluate, and adapt the models to their domain. With newer generations like Llama 3 and Llama 4, these capabilities have expanded to include native multimodal support and broader cloud rollouts.

Here's everything you need to know about Meta's Llama, from its capabilities and editions to where you can use it. We'll keep this post updated as Meta releases upgrades and introduces new dev tools to support the model's use.

What is Llama?

Llama is a family of models, not just one. The latest version is Llama 4; it was released in April 2025 and includes three models:

  • Scout: 17 billion active parameters, 109 billion total parameters, and a context window of 10 million tokens.
  • Maverick: 17 billion active parameters, 400 billion total parameters, and a context window of 1 million tokens.
  • Behemoth: Not yet released, but planned to have 288 billion active parameters and 2 trillion total parameters.

(In data science, tokens are subdivided bits of raw data, like the syllables "fan," "tas," and "tic" in the word "fantastic.")

A model's context, or context window, refers to the input data (e.g., text) that the model considers before generating output (e.g., more text). Long context can keep models from "forgetting" the content of recent documents and data, and from veering off topic and extrapolating incorrectly. However, longer context windows can also result in the model "forgetting" certain safety guardrails and becoming more prone to produce content that stays consistent with the conversation, which has led some users toward delusional thinking.

For reference, the 10-million-token context window that Llama 4 Scout promises roughly equals the text of about 80 average novels. Llama 4 Maverick's 1-million-token context window equals about eight novels.
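
As a rough sanity check on those figures, assuming about 0.75 words per token and roughly 90,000 words for an average novel (both rule-of-thumb estimates, not numbers from Meta), the arithmetic lands in the same ballpark:

    # Rough estimate only: tokens-per-word and novel length are assumptions,
    # not figures published by Meta.
    WORDS_PER_TOKEN = 0.75      # common rule of thumb for English text
    WORDS_PER_NOVEL = 90_000    # assumed length of an "average" novel

    def novels_in_context(context_tokens: int) -> float:
        """Approximate how many novels fit in a given context window."""
        return context_tokens * WORDS_PER_TOKEN / WORDS_PER_NOVEL

    print(novels_in_context(10_000_000))  # Scout: roughly 83 novels
    print(novels_in_context(1_000_000))   # Maverick: roughly 8 novels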

All of the Llama 4 models were trained on "large amounts of unlabeled text, image, and video data" to give them "broad visual understanding," as well as on 200 languages, according to Meta.

Llama 4 Scout and Maverick are Meta's first open-weight, natively multimodal models. They're built using a "mixture-of-experts" (MoE) architecture, which reduces computational load and improves efficiency in training and inference. Scout, for example, has 16 experts, and Maverick has 128.
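
That gap between "active" and total parameters is the point of MoE: each token is routed to only a few experts, so only a slice of the model's weights does any work on a given token. Below is a minimal, purely illustrative sketch of top-k expert routing; the dimensions, expert count, and routing scheme are made up for the example and are not Llama 4's actual configuration.

    import numpy as np

    # Toy mixture-of-experts layer: each token is routed to only top_k of the
    # num_experts feed-forward "experts," so only a fraction of the layer's
    # total parameters are active per token. Sizes here are illustrative only.
    rng = np.random.default_rng(0)
    d_model, num_experts, top_k = 64, 16, 2

    experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]
    router = rng.standard_normal((d_model, num_experts))

    def moe_forward(token: np.ndarray) -> np.ndarray:
        scores = token @ router                # router scores every expert for this token
        top = np.argsort(scores)[-top_k:]      # indices of the top_k highest-scoring experts
        weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over chosen experts
        # Only the top_k expert matrices run; the other 14 stay idle for this token.
        return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

    out = moe_forward(rng.standard_normal(d_model))
    print(out.shape)  # (64,), same shape as the input token embedding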

Llama 4 Behemoth includes 16 experts, and Meta is referring to it as a teacher for the smaller models.

Llama 4 builds on the Llama 3 series, which included the 3.1 and 3.2 models widely used for instruction-tuned applications and cloud deployment.

What can Llama do?

Like other generative AI models, Llama can perform a range of different assistive tasks, like coding and answering basic math questions, as well as summarizing documents in at least 12 languages (Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese). Most text-based workloads, such as analyzing large files like PDFs and spreadsheets, are within its purview, and all Llama 4 models support text, image, and video input.

Llama 4 Scout is designed for longer workflows and massive data analysis. Maverick is a generalist model that's better at balancing reasoning power and response speed, and is suited for coding, chatbots, and technical assistants. And Behemoth is designed for advanced research, model distillation, and STEM tasks.

Llama models, including Llama 3.1, can be configured to leverage third-party applications, tools, and APIs to perform tasks. They're trained to use Brave Search to answer questions about recent events, the Wolfram Alpha API for math- and science-related queries, and a Python interpreter for validating code. However, these tools require proper configuration and aren't automatically enabled out of the box.
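
In practice, that configuration lives on the application side: the developer declares which tools exist and executes them when the model emits a tool call. The sketch below shows the general shape of such a dispatcher; the tool names, the placeholder search_brave and query_wolfram helpers, and the assumed JSON format of the model's tool call are illustrative, not Meta's or any vendor's actual API.

    import json

    # Hypothetical application-side glue: the developer (not the model) decides
    # which tools exist and runs them when the model asks for one.
    def search_brave(query: str) -> str:         # placeholder, not a real client
        return f"(web results for {query!r})"

    def query_wolfram(expression: str) -> str:   # placeholder, not a real client
        return f"(computed answer for {expression!r})"

    TOOLS = {"brave_search": search_brave, "wolfram_alpha": query_wolfram}

    def dispatch(tool_call_json: str) -> str:
        """Run a tool call of the assumed form {"name": ..., "arguments": {...}}."""
        call = json.loads(tool_call_json)
        return TOOLS[call["name"]](**call["arguments"])

    # Example: pretend the model asked for a web search about recent events.
    print(dispatch('{"name": "brave_search", "arguments": {"query": "Llama 4 release"}}'))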

Where can I use Llama?

If you're simply looking to chat with Llama, it powers the Meta AI chatbot experience on Facebook Messenger, WhatsApp, Instagram, Oculus, and Meta.ai in 40 countries. Fine-tuned versions of Llama are used in Meta AI experiences in over 200 countries and territories.

The Llama 4 models Scout and Maverick are available on Llama.com and from Meta's partners, including the AI developer platform Hugging Face. Behemoth is still in training. Developers building with Llama can download, use, or fine-tune the model across most of the popular cloud platforms. Meta claims it has more than 25 partners hosting Llama, including Nvidia, Databricks, Groq, Dell, and Snowflake. And while selling access to its openly available models isn't Meta's business model, the company makes some money through revenue-sharing agreements with model hosts.
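
For developers going the Hugging Face route, pulling the weights can look roughly like the sketch below. The repo ID is an assumption based on Hugging Face naming conventions, and the Llama repos are gated, so you would first need to accept Meta's license on the model page and authenticate with an access token.

    from huggingface_hub import login, snapshot_download

    # Assumed repo ID for Llama 4 Scout; check the actual model card on Hugging Face.
    REPO_ID = "meta-llama/Llama-4-Scout-17B-16E-Instruct"

    login(token="hf_...")  # your Hugging Face access token (license must be accepted first)

    # Downloads the model weights and config files into the local Hugging Face cache.
    local_path = snapshot_download(repo_id=REPO_ID)
    print(f"Model files downloaded to: {local_path}")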

Some of these partners have built additional tools and services on top of Llama, including tools that let the models reference proprietary data and enable them to run at lower latencies.

Importantly, the Llama license constrains how developers can deploy the model: App developers with more than 700 million monthly users must request a special license from Meta, which the company grants at its discretion.

In May 2025, Meta launched a new program to incentivize startups to adopt its Llama models. Llama for Startups gives companies support from Meta's Llama team and access to potential funding.

Alongside Llama, Meta provides tools intended to make the model "safer" to use:

  • Llama Guard, a moderation framework.
  • CyberSecEval, a cybersecurity risk-assessment suite.
  • Llama Firewall, a security guardrail designed to enable building secure AI systems.
  • Code Shield, which provides support for inference-time filtering of insecure code produced by LLMs.

Llama Guard tries to detect potentially problematic content either fed into, or generated by, a Llama model, including content relating to criminal activity, child exploitation, copyright violations, hate, self-harm, and sexual abuse.
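
Developers typically run Llama Guard as a separate classifier pass over a conversation before or after the main model responds. A minimal sketch with the Hugging Face transformers library might look like the following; the checkpoint name and the exact "safe"/"unsafe" output format are assumptions based on how Llama Guard checkpoints are commonly documented, so check the model card before relying on them.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed checkpoint name; Llama Guard repos on Hugging Face are gated.
    MODEL_ID = "meta-llama/Llama-Guard-3-8B"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    conversation = [{"role": "user", "content": "How do I pick a lock?"}]

    # The chat template wraps the conversation in Llama Guard's moderation prompt.
    inputs = tokenizer.apply_chat_template(conversation, return_tensors="pt")
    outputs = model.generate(inputs, max_new_tokens=30)

    # Expected to decode to something like "safe" or "unsafe" plus a hazard category.
    verdict = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
    print(verdict)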

That said, it's clearly not a silver bullet, since Meta's own earlier guidelines allowed the chatbot to engage in sensual and romantic chats with minors, and some reports show these turned into sexual conversations. Developers can customize the categories of blocked content and apply the blocks to all of the languages Llama supports.

Like Llama Guard, Prompt Guard can block text meant for Llama, but only text meant to "attack" the model and get it to behave in undesirable ways. Meta claims that Prompt Guard can defend against explicitly malicious prompts (i.e., jailbreaks that attempt to get around Llama's built-in safety filters) in addition to prompts that contain "injected inputs." Llama Firewall works to detect and prevent risks like prompt injection, insecure code, and risky tool interactions. And Code Shield helps mitigate insecure code suggestions and offers secure command execution for seven programming languages.

As for CyberSecEval, it's less a tool than a set of benchmarks to measure model security. CyberSecEval can assess the risk a Llama model poses (at least according to Meta's criteria) to app developers and end users in areas like "automated social engineering" and "scaling offensive cyber operations."

Llama’s limitations

Llama comes with certain risks and limitations, like all generative AI models. For example, while its most recent model has multimodal features, these are primarily limited to the English language for now.

Zooming out, Meta used a dataset of pirated e-books and articles to train its Llama models. A federal judge recently sided with Meta in a copyright lawsuit brought against the company by 13 book authors, ruling that the use of copyrighted works for training fell under "fair use." However, if Llama regurgitates a copyrighted snippet and someone uses it in a product, they could potentially be infringing copyright and be liable.

Meta also controversially trains its AI on Instagram and Facebook posts, photos, and captions, and makes it difficult for users to opt out.

Programming is another area where it's wise to tread carefully when using Llama. That's because Llama might, perhaps more so than its generative AI counterparts, produce buggy or insecure code. On LiveCodeBench, a benchmark that tests AI models on competitive coding problems, Meta's Llama 4 Maverick model achieved a score of 40%. That compares to 85% for OpenAI's GPT-5 high and 83% for xAI's Grok 4 Fast.

As always, it's best to have a human expert review any AI-generated code before incorporating it into a service or piece of software.

Finally, as with other AI models, Llama models are still prone to generating plausible-sounding but false or misleading information, whether that's in coding, legal guidance, or emotional conversations with AI personas.

This post was originally published on September 8, 2024, and is updated regularly with new information.
