Hands-on with GPT-5: wow, hmm, and WTF?


I’ve been hands-on with GPT-5 in my daily work and AI projects since launch. My reactions have oscillated between “wow!”, “hmm…” and “wtf??”.

I’ve been testing GPT-5 in four different contexts, each revealing different strengths and quirks of OpenAI’s latest model. Here’s what I’ve learned so far.

Testing GPT-5 across four use cases

I’ve been hands-on with GPT-5 in four different ways:

  • in Cursor as my coding partner
  • in ChatGPT text
  • in ChatGPT VoiceMode
  • via the Azure OpenAI API, plugged into the Meldus AI insights app for Salesforce

The results have been fascinating.

1️⃣ In Cursor: Problem-solving brilliance

I had a stubborn iOS mobile-responsiveness defect that was driving me nuts. I’d tried (and failed) 13–15 times with both human experts and other AI models. GPT-5 immediately proposed a different design approach, something no other human or AI had suggested. Problem solved in minutes. Instant fan.

This is where GPT-5 really shines. It seems to have developed a knack for creative problem-solving that goes beyond pattern matching. When a stuck problem needs a fresh approach, GPT-5 delivers.

2️⃣ In ChatGPT text: Incremental, not revolutionary

Not a huge step over GPT-4.1 so far. That’s not a criticism — 4.1 is already strong: fast, comprehensive, reliable. In day-to-day text conversations, GPT-5 is less predictably fast, and no more accurate. But we’ll see.

For routine text work, the upgrade feels more like a lateral move than a leap forward. GPT-4.1 set a high bar, and GPT-5 hasn’t dramatically cleared it yet.

3️⃣ In ChatGPT VoiceMode: Speed demon with quirks

Different — and faster. Almost too fast for its own good. It speaks as fast as my teenage son, which leads to quirky mispronunciations that it never used to make. And not all of its sycophancy issues are resolved. I’m nowhere near as good as it tells me I am.

The pace is impressive but feels rushed. Sometimes I wonder if slower, more considered responses might actually be more useful for complex discussions.

4️⃣ In Meldus via Azure OpenAI: Analytical sophistication

Kudos to Microsoft for quick availability. Analytically, GPT-5 shines: follow-on analysis suggestions are significantly more sophisticated than 4.1. But it’s less predictably fast — that’s fine in some contexts, but problematic in a conversational agent. The trade-offs mean it’s not production-ready for us yet.

For data analysis and insights work, GPT-5 shows real promise. The analytical reasoning is noticeably more nuanced, but the performance inconsistency creates UX challenges.

Early verdict: Leaps and compromises

My early verdict: GPT-5 brings intriguing leaps in some areas, but compromises in others.

It’s not the universal upgrade you might expect. Instead, it feels more specialized — exceptional at creative problem-solving and complex analysis, but with trade-offs in speed and consistency that matter for production applications.

The model seems to have developed more sophisticated reasoning at the cost of some reliability. For now, I’m using it selectively: GPT-5 for the hard problems, GPT-4.1 for the everyday work.
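That selective strategy can be sketched as a simple router. This is a minimal illustration, not the Meldus implementation: the deployment names, keyword heuristic, and `complete()` stub are all hypothetical stand-ins, and the stub replaces a real chat-completion call.

```python
import time

# Hypothetical sketch of "GPT-5 for the hard problems, GPT-4.1 for the
# everyday work". Deployment names, keywords, and complete() are
# illustrative stand-ins, not production code.

HARD_KEYWORDS = ("debug", "design", "analyze")

def pick_deployment(prompt: str) -> str:
    """Route tricky requests to the stronger model, the rest to the fast one."""
    if any(word in prompt.lower() for word in HARD_KEYWORDS):
        return "gpt-5"
    return "gpt-4.1"

def complete(deployment: str, prompt: str) -> str:
    # Stand-in for a real chat-completion call, e.g. via the Azure OpenAI SDK:
    #   client.chat.completions.create(model=deployment, messages=[...])
    return f"[{deployment}] response"

def answer(prompt: str, budget_s: float = 2.0) -> str:
    deployment = pick_deployment(prompt)
    start = time.monotonic()
    reply = complete(deployment, prompt)
    elapsed = time.monotonic() - start
    if deployment == "gpt-5" and elapsed > budget_s:
        # GPT-5's latency is less predictable; flag slow turns for review.
        print(f"slow turn: {elapsed:.1f}s")
    return reply
```

The keyword check is a placeholder for whatever "hard problem" signal an application actually has; the point is simply that the two models can coexist behind one entry point, with latency monitored on the slower path.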

What’s your experience with GPT-5 so far? I’m curious to hear how it’s performing in different contexts.

Originally shared on LinkedIn