Meet OpenAI Codex: Cloud-based Software Engineering Agent

May 17, 2025



“Software engineering is changing, and by the end of 2025 it’s going to look fundamentally different.” Greg Brockman’s opening line at OpenAI’s launch event set the tone for what followed. OpenAI launched Codex, a cloud‑native software agent designed to work alongside developers.

Codex is not a single product but a family of agents powered by codex‑1, OpenAI’s latest coding model. Codex CLI arrived a few weeks ago as a lightweight companion that runs inside your terminal. Today the spotlight shifts to its bigger, remote agent, which is available entirely within ChatGPT. You can spin up thousands of parallel “mini‑computers” and tackle multiple tasks while you’re off grabbing coffee. This article is an overview of Codex in ChatGPT, and we will soon be releasing some project-based articles on the topic.

From Autocomplete to Autonomous Vibe Coding

OpenAI’s journey toward agent-like coding began in 2021 with the original Codex model, which powered GitHub Copilot. At the time, it worked like a smart autocomplete, helping you finish lines of code. Since then, with years of progress in reinforcement learning, Codex has become more capable.

Today, in the era of vibe coding, you simply describe what you want in plain language, and Codex figures out how to build it. The latest model, codex‑1, is built on OpenAI’s o3 architecture and fine-tuned on real pull requests. It is trained to generate code and to follow best practices such as linting, testing, and consistent style, which makes it useful for real-world development.


How to Access Codex in the ChatGPT Interface?

Open ChatGPT and go to the “Codex” sidebar: in the left navigation rail you’ll see a new “Codex (beta)” icon. Click it to reveal the agent dashboard.

Connect GitHub (first‑time only): A single OAuth click authorises Codex to read/write in your repos. You can restrict it to specific organisations or personal projects.

Select a repository & branch: Pick the project you’d like Codex to work on (e.g., main or feature/ui‑overhaul). The agent clones this branch into its own sandbox.
Configure the environment (optional): Add environment variables, secrets, or setup commands, similar to a CI job. Linters and formatters are pre‑installed, but you can override versions.

Choose a task template:
- Ask: “Explain the architecture.”
- Code: “Find and fix the flaky test in test_api.py.”
- Suggest: Let Codex scan the repo and propose maintenance chores.
- Or just type a custom instruction in natural language.

Run & multitask: Press “Launch”. Each task spins up its own micro‑VM; you can queue dozens in parallel and continue chatting elsewhere in ChatGPT.
Review results: Green check‑marks indicate passing tests. Click a task card to see the diff, the model’s explanation, and the full work‑log.
Merge or iterate: Hit “Open PR” to push the branch back to GitHub, or reply to the task with follow‑up instructions if changes are needed.

OpenAI Codex Demo

In this section, I’m sharing different examples demonstrating how this new software development agent can make your life easier!

Example 1: Accelerate Development

OpenAI engineer Nacho Soto demonstrates how Codex helps him start new tasks faster by setting up project scaffolding, such as Swift packages. Using prompts, he can offload setup work and focus on building features, while Codex handles the rest in the background.

Example 2: Review Workflows

Codex supports not just code generation but also review workflows. Developers review AI-generated pull requests, identify issues such as formatting, and prompt Codex to make corrections.

Example 3: Fixing Papercuts with Codex

Engineer Max Johnson describes how Codex helps address small bugs and code quality issues without disrupting focus. Instead of switching contexts, he delegates these tasks to Codex and reviews the output later, improving the codebase.

Example 4: Finding Errors in the Codebase While On Call

Calvin explains how Codex assists with urgent tasks during on-call shifts. By sending stack traces to Codex, he quickly gets diagnostics or fixes. It also helps tune alerts and manage routine ops work, reducing manual overhead.

o3 vs Codex

Prompt: “Please fix the following issue in the matplotlib/matplotlib repository. Please resolve the issue in the problem below by modifying and testing code files in your current code execution session. The repository is cloned in the /testbed folder. You must fully resolve the problem for your answer to be considered correct.”

Problem statement: [Bug]: Windows correction is not correct in `mlab._spectral_helper`
### Bug summary
Windows correction is not correct in `mlab._spectral_helper`:
https://github.com/matplotlib/matplotlib/blob/3418bada1c1f44da1f73916c5603e3ae79fe58c1/lib/matplotlib/mlab.py#L423-L430
The `np.abs` is not needed, and gives wrong result for window with negative value, such as `flattop`.
For reference, the implementation of scipy can be found here :
https://github.com/scipy/scipy/blob/d9f75db82fdffef06187c9d8d2f0f5b36c7a791b/scipy/signal/_spectral_py.py#L1854-L1859
### Code for reproduction
```python
import numpy as np
from scipy import signal
window = signal.windows.flattop(512)
print(np.abs(window).sum()**2-window.sum()**2)
```
### Actual outcome
4372.942556173262
### Expected outcome
0
### Additional information
_No response_
### Operating system
_No response_
### Matplotlib Version
latest
### Matplotlib Backend
_No response_
### Python version
_No response_
### Jupyter version
_No response_
### Installation
None

Output:

Observation:

The Codex-generated fix is more accurate and complete than the o3 output, as it correctly removes the unnecessary use of np.abs() in window normalization within mlab._spectral_helper, which caused incorrect results for windows with negative values like flattop. Codex replaces the faulty normalization with mathematically appropriate expressions, for example (window**2).sum() instead of (np.abs(window)**2).sum(), in line with SciPy’s implementation. It also adds a unit test to validate the behavior, so the fix is verifiable and robust. In contrast, the o3 output appears incomplete and does not clearly address the core bug, making Codex the better solution.
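To see why the np.abs() call matters, here is a minimal standard-library sketch that builds a flat top window by hand and compares the normalization terms with and without abs(). The coefficients are the commonly quoted five-term flat top values and are an assumption of this sketch, as are the helper names `flattop` and `normalization_terms`; none of this is the matplotlib or Codex code itself.

```python
import math

# Commonly quoted 5-term flat top coefficients (assumed for this sketch;
# check your signal-processing library's documentation for its exact values).
FLATTOP_COEFFS = (0.21557895, 0.41663158, 0.277263158, 0.083578947, 0.006947368)


def flattop(n_points):
    """Return a symmetric cosine-sum flat top window as a list of floats."""
    if n_points == 1:
        return [1.0]
    window = []
    for n in range(n_points):
        phase = 2.0 * math.pi * n / (n_points - 1)
        # Alternating-sign cosine sum: a0 - a1*cos(x) + a2*cos(2x) - ...
        value = sum(
            (-1) ** k * a * math.cos(k * phase)
            for k, a in enumerate(FLATTOP_COEFFS)
        )
        window.append(value)
    return window


def normalization_terms(window):
    """Compute the two window-correction terms, each with and without abs()."""
    return {
        "sum_sq_abs": sum(abs(w) ** 2 for w in window),  # (|w|**2).sum()
        "sum_sq": sum(w ** 2 for w in window),           # (w**2).sum()
        "sq_sum_abs": sum(abs(w) for w in window) ** 2,  # |w|.sum()**2
        "sq_sum": sum(window) ** 2,                      # w.sum()**2
    }


window = flattop(512)
terms = normalization_terms(window)
has_negative = min(window) < 0

print("window has negative samples:", has_negative)
print("sum-of-squares difference:", terms["sum_sq_abs"] - terms["sum_sq"])
print("squared-sum difference:", terms["sq_sum_abs"] - terms["sq_sum"])
```

For a real-valued window the sum of squares is the same with or without abs(). The squared sum is the term that changes once the window has negative samples, and that is the quantity the reproduction snippet in the issue prints.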

How Codex Works

Codex writes code: The model starts by generating code to solve a given task.
It runs the code: The output is not just evaluated for plausibility, but actually executed.
It checks test results: Codex observes whether the generated code passes the relevant tests.
It gets rewarded only if the task is completed successfully: Unlike traditional LLMs that focus on next-word prediction, Codex only gets a high score if the code works end-to-end.
It learns through feedback: If the code fails, Codex retries: creating repro scripts, fixing lint errors, and adjusting formatting until it meets standards.
It evolves like a junior developer: This training method teaches Codex to behave less like a text generator and more like a thoughtful engineer following real-world coding practices.
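The write, run, check, reward loop above can be sketched in a few lines of Python. This is only an illustration of the idea of an execution-based reward, not OpenAI's training code: the function names (`run_candidate`, `reward`, `attempt_until_passing`) and the toy `add` task are hypothetical, and a real setup would run inside an isolated sandbox rather than a temporary directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(solution_code, test_code, timeout=10):
    """Execute a candidate solution against its tests in a scratch directory."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "solution.py").write_text(solution_code)
        Path(workdir, "test_solution.py").write_text(test_code)
        try:
            result = subprocess.run(
                [sys.executable, "test_solution.py"],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False
        # A zero exit code means every assertion in the test script passed.
        return result.returncode == 0


def reward(solution_code, test_code):
    """All-or-nothing reward: 1.0 only if the code actually runs and passes."""
    return 1.0 if run_candidate(solution_code, test_code) else 0.0


def attempt_until_passing(candidates, test_code):
    """Try candidates in order; return (attempt number, code) of the first pass."""
    for attempt, code in enumerate(candidates, start=1):
        if reward(code, test_code) == 1.0:
            return attempt, code
    return None, None


# Hypothetical toy task: implement add(a, b).
TEST_CODE = "from solution import add\nassert add(2, 3) == 5\n"
BUGGY = "def add(a, b):\n    return a - b\n"
FIXED = "def add(a, b):\n    return a + b\n"

if __name__ == "__main__":
    print(attempt_until_passing([BUGGY, FIXED], TEST_CODE)[0])
```

The key design choice is that plausibility earns nothing: the buggy candidate looks like valid code but scores zero because its test fails when executed.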

Codex‑1 outperforms earlier models both in standardized benchmarks and in internal OpenAI workflows. In OpenAI’s reported results, it achieves higher accuracy on the SWE-Bench Verified benchmark across all attempt counts and leads on OpenAI’s internal software engineering tasks. This highlights codex‑1’s real-world reliability, especially for developers integrating it into daily workflows.

A Peek Inside the Cloud Workshop

Every time you press ⏎ Run in the Codex sidebar, the system creates a micro‑VM sandbox with its own file‑system, CPU, RAM, and locked‑down network policy. Your repository is cloned, environment variables are injected, and common developer tools (linters, formatters, test runners) come pre‑installed. That isolation delivers two immediate benefits:

Safety & Reproducibility – Rogue scripts can’t touch your laptop or leak secrets; the whole run can be replayed later.
Parallelism at Scale – Need to fix typos, harmonise time‑outs, and hunt a mysterious bug? Launch three tasks and compare the results side‑by‑side.
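As a rough mental model of "one isolated workspace per task, many tasks at once", here is a small Python sketch. It uses a temporary directory as a stand-in for the micro‑VM, which is an assumption made purely for illustration (a directory offers none of a VM's CPU, memory, or network isolation), and the repo contents and task functions are hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical "repository" contents, copied fresh into each workspace.
REPO_FILES = {"README.md": "Teh project\n", "config.txt": "timeout=5\n"}


def fix_typo(root):
    """Task 1: correct a typo in the README."""
    path = root / "README.md"
    path.write_text(path.read_text().replace("Teh", "The"))
    return path.read_text()


def harmonise_timeout(root):
    """Task 2: set the timeout to a single agreed value."""
    path = root / "config.txt"
    path.write_text("timeout=30\n")
    return path.read_text()


def run_in_sandbox(task):
    """Give the task its own private copy of the repo, then discard it."""
    with tempfile.TemporaryDirectory() as workdir:
        root = Path(workdir)
        for name, content in REPO_FILES.items():
            (root / name).write_text(content)
        return task.__name__, task(root)


def run_tasks(tasks):
    """Run every task concurrently, each in its own workspace."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return dict(pool.map(run_in_sandbox, tasks))


results = run_tasks([fix_typo, harmonise_timeout])
for name, output in results.items():
    print(name, "->", output.strip())
```

Because every task works on its own copy, the tasks cannot interfere with each other or with the original files, and their results can be compared side by side afterwards.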

An optional AGENTS.md file acts like a README for robots: you describe the project layout, how to run tests, preferred commit style, even a request to print ASCII cats between steps. The richer the instructions, the more smoothly Codex behaves.
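To make that concrete, the short script below writes a small AGENTS.md into a scratch directory. Everything in the file body (section names, paths, the test command, the commit convention) is a hypothetical example of the kind of guidance described above, not a required schema.

```python
import tempfile
from pathlib import Path

# Hypothetical AGENTS.md contents; adapt the sections to your own project.
AGENTS_MD = """\
# AGENTS.md (hypothetical example)

## Project layout
- src/: application code
- tests/: unit tests

## How to run tests
- python -m unittest discover tests

## Commit style
- Short imperative subject line, e.g. "Fix flaky API test"
"""


def write_agents_file(repo_root):
    """Write AGENTS.md at the top level of the given repository directory."""
    path = Path(repo_root) / "AGENTS.md"
    path.write_text(AGENTS_MD)
    return path


# Demonstrate against a throwaway directory instead of a real repository.
with tempfile.TemporaryDirectory() as repo_root:
    written = write_agents_file(repo_root)
    first_line = written.read_text().splitlines()[0]
    print(first_line)
```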

Availability, Limits & What’s Next

Codex is currently available to ChatGPT Pro, Enterprise, and Team users. Free-tier and EDU users are expected to gain access soon. During the research preview, usage is subject to generous limits, but these may evolve based on demand. Future plans include an API for Codex, integration into CI pipelines, and unification between the CLI and ChatGPT versions to allow seamless handoffs between local and cloud development.


Conclusion

“I just landed a multi‑file refactor that never touched my laptop.”

– OpenAI Engineer

Stories like that hint at a future where coding resembles high‑level orchestration: you provide intent, the agent grinds through the details. Codex represents a shift in how developers interact with code, moving from writing everything manually to orchestrating high-level tasks. Engineers now focus more on intent and validation, while Codex handles execution. For many, this signals the beginning of a new development workflow, where human and agent collaboration becomes the standard rather than the exception.

How are you planning to use Codex? Let me know in the comment section below!

Hello, I’m Nitika, a tech-savvy Content Creator and Marketer. Creativity and learning new things come naturally to me. I have expertise in creating result-driven content strategies. I’m well versed in SEO Management, Keyword Operations, Web Content Writing, Communication, Content Strategy, Editing, and Writing.
