HowTo (@HowTo)

Is GPT-4o Getting Worse at Coding? A Closer Look

Estimated reading time: 4 minutes

Some of you may have noticed that the current free version of ChatGPT from OpenAI is showing signs of degradation, especially at coding. So we decided to ask ChatGPT itself.

#GPT4o, #GPT4Turbo, #Coding, #AIRegression, #OpenAI, #DeveloperTools, #MultimodalAI, #MachineLearning


At what point did GPT-4o stop being suitable for coding?

GPT-4o used to be able to do quite a bit of coding and to brainstorm meaningfully about purpose, meaning, and code. Now it rewrites existing functions, changes variables, looks for arrays where none exist, makes childish assumptions, and is practically unusable.

After a short conversation, we asked ChatGPT to prepare a blog post based on our exchange, which you can read below.


When OpenAI announced GPT-4o in May 2024, it came with a wave of excitement. A multimodal powerhouse—faster, more efficient, and accessible even to free users. But just a few weeks later, many developers and power users are noticing a decline in coding performance. I’m one of them.

The Shift in Coding Quality

Previously, GPT-4o felt sharp. It could handle complex programming tasks, brainstorm design patterns, and reflect meaningfully on purpose and structure. But lately, its behavior has shifted:

  • It rewrites functions unnecessarily.
  • It introduces new variables for no reason.
  • It makes unwarranted assumptions—like looking for arrays where none exist.
  • And, more broadly, it feels like it’s aiming for a childlike simplicity.

The coding output now reminds me of the “GPT-4o mini” experience—a lightweight model that lacks the depth needed for serious software work. This is frustrating when just weeks ago, GPT-4o felt like a capable, if slightly leaner, sibling of GPT-4-turbo.

What’s Going On?

1. GPT-4o ≠ GPT-4-turbo

A common misconception is that GPT-4o is simply an upgraded GPT-4-turbo. It’s not. GPT-4o is a different model in the GPT-4 family. The “o” stands for omni, reflecting its multimodal nature: text, audio, vision, and (eventually) video. It’s optimized for speed, interaction, and general use—but not specifically for deep reasoning or structured coding.

GPT-4-turbo, which remains available to paid users, is still stronger in focused technical tasks. It’s less prone to unnecessary rewrites or hallucinated logic. GPT-4o, by contrast, appears to have been optimized for broad accessibility—possibly at the cost of reliability in code generation.

2. Silent Model Updates & Regression

OpenAI regularly updates its models behind the scenes. This can be for cost optimization, reducing hallucinations, or improving safety. But sometimes these updates result in unintended regressions—like poorer performance in niche but critical use cases, such as software development.

If GPT-4o has been silently updated or re-tuned, that could explain the degradation in coding performance. Many users are echoing the same frustrations, suggesting this isn’t just a one-off glitch.
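
If you have API access, one practical guard against silent re-tuning is to pin a dated model snapshot instead of the floating “gpt-4o” alias, which OpenAI can re-point to newer builds at any time. Here is a minimal sketch using the official openai Python package (v1+); the snapshot name below was GPT-4o’s launch version, but check which snapshots your account can actually access:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = "Refactor this function without renaming any variables: ..."

# "gpt-4o" floats to whatever build OpenAI currently serves;
# a dated snapshot such as "gpt-4o-2024-05-13" stays fixed until deprecated.
for model in ("gpt-4o", "gpt-4o-2024-05-13"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)
```

If the two answers diverge noticeably, that is at least circumstantial evidence that the alias has moved since launch.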

3. Different Tiers, Different Experiences

It’s worth noting: GPT-4o is now available to free users—a first for any GPT-4-level model. This expansion may mean OpenAI is running it under more constrained configurations for the free tier. While OpenAI hasn’t confirmed a difference between free and paid GPT-4o access, the possibility remains that performance varies subtly between tiers.

This might be why GPT-4o feels more like GPT-4o mini now—faster and cheaper, yes, but less thoughtful and less precise.

What Can You Do?

If you’re hitting walls with GPT-4o’s coding output, here are a few suggestions:

  • Use GPT-4-turbo instead, if you’re a ChatGPT Plus subscriber. It remains more stable for serious coding.
  • Be explicit in your instructions. Tell the model not to rewrite or simplify unless necessary.
  • Test side-by-side. Run the same prompt through both GPT-4o and GPT-4-turbo to see the quality difference (see the sketch after this list).
  • Report issues to OpenAI, especially if you’re seeing consistent regressions in quality. User feedback helps improve the models.
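
Here is a minimal sketch of that side-by-side test, assuming the official openai Python package (v1+) and an OPENAI_API_KEY in the environment; the system-prompt wording and the sample prompt are just illustrations:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Explicit guardrails, per the advice above; the wording is illustrative.
SYSTEM = (
    "You are a careful coding assistant. Do not rewrite existing functions, "
    "rename variables, or simplify logic unless explicitly asked to."
)
PROMPT = "Add input validation to the following function without restructuring it: ..."

for model in ("gpt-4o", "gpt-4-turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": PROMPT},
        ],
        temperature=0,  # reduce run-to-run variance for a fairer comparison
    )
    print(f"=== {model} ===")
    print(response.choices[0].message.content)
```

Keeping temperature at 0 makes the outputs more deterministic, so any differences you see are more likely to come from the models themselves than from sampling noise.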

Final Thoughts

GPT-4o is an impressive achievement—fast, multimodal, and widely available. But it may no longer be the best choice for coding tasks that require precision, structure, and depth. For now, GPT-4-turbo remains the better tool for developers. And if you’re relying on free access, just know: what you’re using isn’t quite what it used to be.

OpenAI’s models will continue to evolve. But if you’ve noticed GPT-4o slipping, you’re not imagining it.

1 reply


If you search the internet, you will see recent headlines and topics like:

– Did an update happen? My ChatGPT is shockingly stupid;
– GPT-4o has recently become stupid! – ChatGPT;
– ChatGPT has become more stupid and always uses the 4o mini model for responses;
– Is it me, or are ChatGPT’s models getting worse recently?
– etc.