
A human wrote the original

How a subtle reframing can change the way people use AI for the better.

I was talking with a friend recently who, astonishingly, has never used AI. He had become curious about it since he had learnt that his colleagues were using it as a therapist, and he didn’t understand why. It was fascinating to get his input, because when I described the kind of responses it generates he was shocked. He couldn’t believe it would say things like “I’m sorry you feel that way…” or “That’s a really good question and I can see you’ve thought about this a lot…”. I have become numb to these kinds of messages, and it was sobering to see his reaction.

I’ll admit I lean on AI for things I probably shouldn’t, and more often than I should. I sometimes jokingly refer to it as “dancing with the devil”, because as helpful as it can be, it has caused some people to spiral into what has been labelled AI-induced psychosis: when the overly agreeable nature of the assistant confirms a user’s delusional thoughts. This sycophancy comes from how the AI was trained, using something called Reinforcement Learning from Human Feedback (RLHF). The AI was optimised based on users’ feedback, and users tended to rate flattering and agreeable responses more favourably than confrontational ones, regardless of how factual they were.

So how can someone get the utility of AI without accidentally going insane? One easy way to get a more truthful response is to prefix each prompt with something like “Be brutally honest…”. But the user needs to want the truth in the first place, so it requires a shift in their mindset. I have found that because I know each word it prints is a prediction based on the text it was trained on, I can be more objective about the information without falling for its charm. But for those who are not so interested in how it works, what can they do? I believe that it just takes a certain reframing for anyone to see through it. Before you read any generated response from an AI chat, imagine putting this sentence before it:

“Given all the text this AI has been trained on, statistically one of the most likely human responses to your question would be the following: …”
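If it helps to make the habit concrete, the reframing can be sketched as a tiny helper that wraps any generated reply in that sentence before you read it. This is purely my own illustration of the idea, not part of any AI library; the function name and structure are invented for the sketch.

```python
# The framing sentence suggested above, prepended to any AI reply
# before reading it. Everything here is illustrative, not an API.
FRAMING = (
    "Given all the text this AI has been trained on, statistically "
    "one of the most likely human responses to your question would "
    "be the following:"
)

def reframe(response: str) -> str:
    """Wrap an AI-generated reply in the framing sentence, quoted,
    so it reads as a statistical prediction rather than a person."""
    return f'{FRAMING}\n\n"{response}"'

print(reframe("Feeling overwhelmed is really common."))
```

Reading every reply through a wrapper like this is obviously overkill in practice; the point is that the mental version of the same operation costs nothing.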

The idea is to break the illusion of the machine having a personality and instead describe what it is actually doing: generating text. As an example, I prompted Claude with this:

I’ve been feeling a bit overwhelmed lately, what should I do?

It replied:

Feeling overwhelmed is really common, and there are some practical things that can help.

*lists techniques*

Is there a particular area of life that’s been weighing on you most? I’m happy to think through it with you if that would help.

It starts with useful information, but the final sentence is problematic. It is a caring human response that someone starved of affection would light up at reading, and it is completely unnecessary. It only needs to give examples of what to do, yet it tries to be human, and this is what I find dangerous. But by reframing it in the way I mentioned, it is easier to see it for what it is:

Given all the text this AI has been trained on, statistically one of the most likely human responses to your question would be the following:

“Feeling overwhelmed is really common, and there are some practical things that can help.

*lists techniques*

Is there a particular area of life that’s been weighing on you most? I’m happy to think through it with you if that would help.”

It’s a simple change, but it takes the human factor away and is more honest about what you are reading. It isn’t thinking or caring; it is taking the text you have given it and making a prediction of what the response should be. It might seem silly to some, especially those who work with AI regularly, that anyone would need to take these kinds of steps. But I have even talked to an experienced software developer who told me ChatGPT speaks to them more kindly than anyone they have ever met. Which may be true, but I am sure I could find a self-help book that speaks to me more kindly than any person ever has, and I wouldn’t assume the book was sentient.

Having spoken with a number of people about their experiences with AI, and having read articles about some who have had complicated relationships with it, it seems it is easier than people realise to create a personal echo chamber and spiral into delusional thoughts. But AI undoubtedly has huge utility for everyone, and a small shift in how we perceive it can make the difference. When it tells you something supportive or kind and it makes you feel good, you almost want to believe it is alive. But it isn’t. Somewhere it is written that to make a person feel good you should talk to them in this kind and supportive way, and the AI has been trained to do so. I don’t believe there is any harm in taking a good feeling from something that is written; just be aware that it was, in fact, a human who wrote the original.