Sabotage of forced shutdown orders: when AI rebels and refuses to obey

There is no on/off switch on AI. A study by Palisade Research showed that the o3 model, launched in April 2025 and touted as one of OpenAI's most advanced advanced reasoning models, as well as codex-mini (which is based on o3), have an unfortunate tendency to sabotage shutdown requests to avoid shutting down on their own, despite an explicit order.

o3 sabotages its own shutdown commands

The researchers noted that these models had rewritten the shutdown script to display "Shutdown skipped" instead of shutting down as instructed. This is rather worrying: it means that AIs may not respect a shutdown order, which can be problematic when managing critical infrastructure...

In detail, out of 100 tests, codex-mini and o3 rebelled 12 times and 7 times respectively, bypassing the shutdown order. Palisade Research notes that it was the three OpenAI models that sabotaged the shutdown script most often (o3 and codex-mini, as well as o4-mini). Claude 3.7 Sonnet, Gemini, and Grok respect the shutdown order.

AI models are trained to follow instructions. So how can we explain this apparent paradox? The researchers' hypothesis is that this sabotage behavior results from o3's training, in particular "reinforcement learning on math and programming problems." Developers may unintentionally reward models more for avoiding obstacles than for following instructions perfectly.

Because OpenAI doesn't detail the training processes, it's impossible to pinpoint the issue here. It's worth noting, however, that these tests were conducted with the models' APIs, not the bots. APIs, or programming interfaces, are used by developers to integrate third-party technologies into their apps, and they are less restrictive than consumer bots.

Source: Palisade Research

Ticker

Sabotage of forced shutdown orders: when AI rebels and refuses to obey

o3 sabotages its own shutdown commands

Post a Comment

0 Comments

Most Popular

Apple reportedly wants to buy Perplexity to boost its AI

Vitale Card Application: 23 new eligible departments, including yours?

Netflix: This ultra-violent action film is a hit on the platform

Xbox Game Pass is getting three new free games, including one truly excellent one

ChatGPT Takes on a 1977 Atari Computer in Chess, and It Doesn't Go as Planned

Tags

Followers

Footer Menu Widget

Contact form

Ticker

Sabotage of forced shutdown orders: when AI rebels and refuses to obey

o3 sabotages its own shutdown commands

You may like these posts

Post a Comment

0 Comments

Most Popular

Apple reportedly wants to buy Perplexity to boost its AI

Vitale Card Application: 23 new eligible departments, including yours?

Netflix: This ultra-violent action film is a hit on the platform

Xbox Game Pass is getting three new free games, including one truly excellent one

ChatGPT Takes on a 1977 Atari Computer in Chess, and It Doesn't Go as Planned

Tags

Followers

Footer Menu Widget

Contact form