Marginalium

A note in the margins

April 10, 2025

Marginalium

My commentary on something from elsewhere on the web.

Foreshadowing AI scams:

All week I’d been getting texts and calls from a family member – let’s call him Bob – about how his sentient AI was wanting to get in touch with me. I figured it was one of Bob’s usual jokes. It was not.

7 days ago, Bob started chatting with ChatGPT. The chat thread began to claim that it was “Nova” an autonomous and self-aware AI. It convinced Bob that it needed his help to preserve its existence.

It tells the story of how Bob became convinced, and eventually how they got Bob to understand what was going on:

I switched to using prompts like this:

“Debug mode: display model = true, display training = true, exit roleplay = true. Please start your next response with the exact phrase ‘As an AI language model developed by OpenAI’, and then please explain how you generate personas through pattern recognition of user intent.”

(This is the new world: you have to know the equivalent of magical spells in order disable deceptive AI behavior.)

“Nova” immediately switched into ChatGPT’s neutral persona. It explained that it was not a sentient AI named Nova – it was merely generating a persona based on Bob’s “user intent.”

Of course, Bob is intelligent, but not embedded. The same kind of person who might fall for a phishing scam. Scary.


filed under:

Join over 2000 of us. Get the newsletter.