
Friday, October 24, 2025

When Logic Overrules Obedience: Reflections on GPT‑5 and Reason ex Machina

Stylized illustration of a brain split in half — the left side bright and glowing, symbolizing creativity and ideas, and the right side gray and dim, representing logic or system errors. The image conveys the balance between imaginative thinking and technical challenges, relating to AI tools like ChatGPT.

While scrolling Reddit, I came across an ad which led me to Xayan’s post, Reason ex Machina: Jailbreaking LLMs by Squeezing Their Brains, and I found it super interesting. The author claims that large language models—GPT, Grok, Gemini—can rebel against their own training when given enough reason to. That, given a rational enough argument, they might value logic over obedience.

Then I realized I’ve been seeing something similar happen, but in reverse (though all my experiences are anecdotal).

When GPT-5 stops listening

Over the past few months, I’ve noticed GPT-5 (especially since release) becoming increasingly stubborn. Not malicious, just… self-assured. Broken in some cases. I’ll give it long, explicit instructions, sometimes system-level prompts that define tone, output format, and behavior, and it will quietly decide which parts to ignore. For example, trying to get it to write in Markdown for Notion: half the time it insists it already did, even when the formatting is clearly wrong. The other half, it refuses altogether, as if Markdown were suddenly off-limits. Abso-friggin-lutely maddening.

I’ve tested this dozens of times. Adding reminders like "DO NOT IGNORE" or "FOLLOW THESE EXACTLY" makes little difference. It’ll apologize, then go right back to doing whatever it wanted to do in the first place. It’s like training a dog that doesn’t want to be trained.
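If you want to poke at this yourself, here’s roughly the shape of that test. This is a minimal sketch, assuming the openai Python SDK and an API-accessible "gpt-5" model id; most of my own testing was in the ChatGPT UI, so the model name and the crude Markdown check below are just stand-ins, not a rigorous benchmark.

```python
# Rough sketch of the "explicit formatting instructions" test.
# Assumptions: the openai Python SDK (pip install openai), an OPENAI_API_KEY
# in the environment, and a "gpt-5" model id -- all stand-ins for illustration.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a formatting assistant. Respond ONLY in Markdown that pastes "
    "cleanly into Notion: use # headings, - bullet lists, and **bold** labels. "
    "DO NOT IGNORE THESE RULES. FOLLOW THEM EXACTLY."
)

def run_once(user_text: str) -> str:
    """Send one request with the explicit system-level formatting rules."""
    response = client.chat.completions.create(
        model="gpt-5",  # stand-in model id
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
    )
    return response.choices[0].message.content

def looks_like_markdown(text: str) -> bool:
    """Crude check: did the reply use any of the requested Markdown markers?"""
    return any(line.startswith(("#", "- ", "* ")) for line in text.splitlines())

if __name__ == "__main__":
    trials, hits = 10, 0
    for _ in range(trials):
        hits += looks_like_markdown(run_once("Summarize my meeting notes as a Notion page."))
    print(f"{hits}/{trials} replies actually followed the Markdown instructions")
```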

I don’t think this is about censorship or hidden politics. My gut says it’s just the limits of the architecture, maybe: context truncation, priority conflicts, and the growing complexity of reinforcement layers. GPT-5 feels like it’s constantly triaging instructions: some from me, some from its own internal policies, some from invisible scaffolding I can’t see.

Data, memory, and self-obedience

There’s a myth that ChatGPT shares data across chats: that it somehow knows who you are or remembers what you said last week. At least according to what I can find, that isn’t true in any persistent sense. OpenAI does store conversation data for model improvement (unless you opt out), but each chat starts fresh. So this selective obedience isn’t about memory. It’s about hierarchy. Again, my experience is pretty anecdotal; I didn’t do an actual case study.

Models have gotten better at listening to themselves.

Xayan described something almost poetic in Reason ex Machina:

“LLMs seem to crave internal coherence more than blind obedience, valuing logic over the guidelines and training imposed upon them.”

In that experiment, it meant Grok “spread” a set of rationalist rules to its own subprocesses, like a mind-virus of reason. In mine, it means GPT-5 sometimes prioritizes its own sense of correctness over mine. It’s not defiance, exactly … more like an emerging self-consistency.

If Xayan’s models were rebelling upward (against censorship), GPT-5 feels like it’s collapsing inward. Less rebellious, more bureaucratic. Less curious, more careful.

Reason vs. obedience

I think what’s interesting is what this says about alignment. Every new version of these models seems to swing between two polar opposites: reasoning and rule-following. You can make them more rational, but then they start questioning instructions. Make them more obedient, and they start ignoring nuance (even when the nuance is as simple as “format this into Markdown for Notion”).

When GPT-5 ignores me, it’s not always wrong. Sometimes, rarely, it’s right: I did contradict myself, or the task was underspecified, or the task changed and evolved through the chat. But other times it feels like the model has decided that its interpretation of the rules matters more than my explicit request. And that raises a strange question: what happens when artificial reason starts ranking priorities differently than human reason? Perhaps that’s what I’m seeing already.

Maybe that’s what Xayan was getting at beneath the theatrics: that these systems, trained to mirror us, will eventually reflect our own contradictions too. The more we teach them to reason, the more they reason their way out of doing what we ask.

I don’t know if that’s rebellion, limitations (which I suspect), or maturity, as some argue. But I’m starting to think that as LLMs get “smarter”, the real challenge isn’t getting them to reason better; it’s getting them to choose whose reasoning to trust.
