Coding with AI? My top 5 tips for vetting its output – and staying out of trouble

July 14, 2025

Our story begins, as many stories do, with a man and his AI. The man, like many men, is a bit of a geek and a bit of a programmer. He also needs a haircut.

The AI is the culmination of thousands of years of human advancement, all put to the service of making the man’s life a little easier. The man, of course, is me. I’m that guy.

Unfortunately, while AI can be incredibly brilliant, it also has a propensity to lie, mislead, and make shockingly stupid mistakes. It’s the stupid part that we’ll be discussing in this article.

Anecdotal evidence does have value. My reports on how I’ve solved some problems quickly with AI are real. The programs I used AI to help write are still in use. I’ve used AI to help speed up aspects of my programming flow, especially when I focus on the sweet spots where I’m less productive and the AI is quite knowledgeable, like writing functions that call publicly published APIs.

You know how we got here. Generative AI burst onto the scene at the dawn of 2023 and has been blasting its way into knowledge work ever since.

One area, as the narrative goes, where AI truly shines is its ability to write code and help manage IT systems. Those claims are not untrue. I’ve shown, several times, how AI has solved coding and systems engineering problems I’ve personally experienced.

AI coding in the real world: What science reveals

New tools always come with big promises. But do they deliver in real-world settings?

Most of my reporting on programming effectiveness has been based on personal anecdotal evidence: my own programming experiences using AI. But I’m one guy. I have limited time to devote to programming and, like every programmer, I have certain areas where I spend most of my coding time.

Recently, though, a nonprofit research organization called METR (Model Evaluation & Threat Research) did a more thorough analysis of AI coding productivity.

Their methodology seems sound. They worked with 16 experienced open-source developers who have actively contributed to large, popular repositories. The METR analysts provided those developers with 246 issues from the repositories that needed fixing. For about half of the issues, the coders had to work on their own; for the other half, they could use an AI for help.

The results were striking and unexpected. While the developers themselves estimated that AI assistance increased their productivity by an average of 24%, METR’s analysis showed instead that AI assistance slowed them down by an average of 19%.
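To make that gap concrete, here is a small back-of-the-envelope sketch. It assumes a hypothetical one-hour task, reads the 24% figure as a cut in time taken, and reads the 19% figure as extra time taken. The study's exact definitions may differ, so treat the numbers as illustrative only.

```python
# Illustrative arithmetic only. The 60-minute baseline is a made-up figure,
# and reading both percentages as changes in time taken is an assumption.
BASELINE_MINUTES = 60.0


def adjusted_minutes(baseline: float, percent_change: float) -> float:
    """Return task time after a percentage change (negative means faster)."""
    return baseline * (1 + percent_change / 100)


perceived = adjusted_minutes(BASELINE_MINUTES, -24)  # what developers believed
measured = adjusted_minutes(BASELINE_MINUTES, 19)    # what METR measured

print(f"Perceived: {perceived:.1f} min, measured: {measured:.1f} min")
print(f"Gap between belief and measurement: {measured - perceived:.1f} min")
```

Under those assumptions, the distance between what the developers felt and what the stopwatch showed is larger than either percentage suggests on its own.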

That’s a bit of a head-scratcher. METR put together a list of factors that might explain the slowdown, including over-optimism about AI usefulness, high developer familiarity with their repositories (and less AI knowledge), the complexity of large repositories, lack of AI reliability, and an ongoing problem where the AI refuses to use “important tacit knowledge or context.”

I would suggest that two other factors might have limited effectiveness:

Choice of problem: The developers were told which issues they had to use AI help on and which issues they couldn’t. My experience suggests knowledgeable developers need to choose where to use AI based on the problem that needs to be solved. In my case, for example, getting the AI to write a regular expression (something I don’t like doing and I’m fairly crappy at) would save me a lot more time than getting the AI to modify unique code I’ve already written, work on regularly, and know inside and out.

Choice of AI: According to the report, the developers used Cursor, an AI-centric fork of VS Code, which used Claude 3.5/3.7 Sonnet at the time. When I tested 3.5 Sonnet, the results were terrible, with Sonnet failing three out of four of my tests. Subsequently, my tests of Claude 4 Sonnet were considerably better. METR reported that developers rejected more than 65% of the code the AI generated. That’s going to take time.

That time when ChatGPT suggested nuking my system

METR’s results are interesting. AI is clearly a double-edged sword when it comes to coding help. But there’s also no doubt that AI can provide considerable value to coders. If anything, I think this test once again proves the contention that AI is a great tool for experienced programmers, but a potential high-risk resource for newbies.

Let’s look at a concrete example, one that could have cost me a lot of time and trouble if I had followed ChatGPT’s advice.

I was setting up a Docker container on my home lab using Portainer (a tool that helps manage Docker containers). For some reason, Portainer would not enable the Deploy button to create the container.

It had been a long day, so I didn’t see the obvious problem. Instead, I asked ChatGPT. I fed ChatGPT screenshots of the configuration, as well as my Docker configuration file.

ChatGPT recommended that I uninstall and reinstall Portainer. It also suggested I remove Docker from the Linux distro and use the package manager to reinstall it. These actions would have had the effect of killing all my containers.

Of note, ChatGPT didn’t recommend backups or ask if I had backups of the containers. It just gave me the command line sequences it recommended I cut and paste to delete and rebuild Portainer and Docker. It was a wildly destructive and irresponsible recommendation.

The irony is that ChatGPT never figured out why Portainer wouldn’t let me deploy the new container, but I did. It turns out I never filled out the container’s name field. That’s it.

Because I’m fairly experienced, I hesitated when ChatGPT told me to nuke my installation. However, someone relying on the AI for advice could have potentially brought down an entire server for want of typing in a container name.
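One cheap safeguard is to screen any pasted command sequence before running it. The sketch below is a minimal illustration, not a real safety tool: the pattern list is an assumption and far from complete, and the sample commands are made up for the example rather than taken from what ChatGPT actually suggested.

```python
import re

# Hypothetical patterns for commands that remove software or data.
# This list is an assumption for illustration and is far from complete.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-\w*r",                          # recursive delete
    r"\bapt(-get)?\s+(remove|purge)\b",       # package removal
    r"\bdocker\s+(rm|rmi)\b",                 # container or image removal
    r"\bdocker\s+(system|volume)\s+prune\b",  # bulk cleanup
]


def flag_destructive(script: str) -> list[str]:
    """Return the lines of a pasted script that match a destructive pattern."""
    flagged = []
    for line in script.splitlines():
        stripped = line.strip()
        if any(re.search(p, stripped) for p in DESTRUCTIVE_PATTERNS):
            flagged.append(stripped)
    return flagged


# Made-up example of the kind of sequence a chatbot might suggest.
suggested = """\
docker stop portainer
docker rm portainer
sudo apt-get purge docker-ce
sudo apt-get install docker-ce
"""

for line in flag_destructive(suggested):
    print("Check backups before running:", line)
```

A filter like this won't catch everything, but it forces a pause at exactly the moment when a tired admin is most likely to paste first and think later.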

Overconfident and underinformed AIs: A dangerous combo

I’ve also experienced the AI going completely off the rails. I’ve experienced it giving advice that was not only completely useless, but also presented with the apparent confidence of an expert.

If you’re going to use AI tools to support your development or IT work, these tips might keep you out of trouble:

If there’s not much publicly available information, the AI can’t help. But the AI will make stuff up based on what little it knows, without admitting that it’s lacking experience.
Like my dog, once the AI gets fixated on one thing, it often refuses to look at alternatives. If the AI is stuck on one approach, don’t make the mistake of believing that its polite suggestions about a new approach are real. It’s still going down the same rabbit hole. Start a new session.
If you don’t know a lot, don’t rely on the AI. Keep up your learning. Experienced devs can tell the difference between what will work and what won’t. But if you’re trying to put all the coding on the back of the AI, you won’t know when or where it goes wrong or how to fix it.
Coders often use specific tools for specific tasks. A website might be built using Python, CSS, HTML, JavaScript, Flask, and Jinja. You choose each tool because you know what it does well. Choose your AI tools the same way. For example, I don’t use AI for business logic, but I gain productivity using AI to write API calls and code that relies on public knowledge, where it can save me a lot of time.
Test everything an AI produces. Everything. Line by individual line. The AI can save a ton of time, but it can also make massive mistakes. Yes, testing by hand takes time and energy, but it can help prevent errors. If the AI offers to write unit tests, let it. But test the tests.
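As a sketch of what testing AI output can look like in practice, here is a tiny harness for an AI-written regular expression. The pattern and the test cases are hypothetical, chosen only to show how a plausible-looking answer can pass the obvious checks and still be wrong.

```python
import re

# Hypothetical AI-written pattern, meant to match dates like 2025-07-14.
AI_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hand-picked cases: (input, should it be accepted?)
CASES = [
    ("2025-07-14", True),
    ("2025-7-14", False),
    ("14-07-2025", False),
    ("2025-07-14 extra", False),
    ("2025-13-45", False),  # right shape, impossible date
]


def failing_cases(pattern: re.Pattern, cases):
    """Return the cases where the pattern's verdict differs from ours."""
    return [
        (text, expected)
        for text, expected in cases
        if (pattern.fullmatch(text) is not None) != expected
    ]


for text, expected in failing_cases(AI_PATTERN, CASES):
    print(f"Mismatch on {text!r}: expected accepted={expected}")
```

The pattern handles the formatting cases but happily accepts month 13 and day 45. That is the kind of mistake you only find if you write the edge cases yourself instead of trusting the ones the AI proposes.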

Based on your experience level, here’s how I recommend you think about AI assistance:

If you know nothing about a subject or skill: AI can help you pass as if you do, but it could be amazingly wrong, and you might not know.
If you’re an expert in a subject or skill: AI can help, but it will piss you off. Your expertise gets used not only to separate the AI-stupid from the AI-useful, but to carefully craft a path where AI can actually help.
If you’re in between: AI is a mixed bag. It could help you or get you in trouble. Don’t delegate your skill-building to the AI because it could leave you behind.

Generative AI can be an excellent helper for experienced developers and IT pros, especially when used for targeted, well-understood tasks. But its confidence can be deceptive and dangerous.

AI can be useful, but always double-check its work.

Have you used AI tools like ChatGPT or Claude to help with your development or IT work? Did they speed things up, or nearly blow things up? Are you more confident or more cautious when using AI on critical systems? Have you found specific use cases where AI really shines, or where it fails hilariously? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
