Anthropic announced that Claude now has the ability to end convos if a user is persistently harmful or abusive.
Instead of endlessly refusing inappropriate requests, Claude can now exit distressing interactions (as long as the user isn't at imminent risk of harming themselves or others).
This is said to be a last-resort action, taken only when all "hope of a productive interaction has been exhausted."
As Anthropic explains: "This feature was developed primarily as part of our exploratory work on potential AI welfare, though it has broader relevance to model alignment and safeguards."
It's refreshing to see Big AI leaning more thoughtfully into risk mitigation; while "model welfare" may seem a bit far-fetched right now, if we don't focus on risks in the early days, it may eventually be too late.
AI & boundaries have been making headlines lately (see: the woman who got "engaged" to her AI), so prioritizing both human welfare and model welfare feels like a step in the right direction.
This is just a small step forward in our AI journey; TBD on whether these steps will make much of a difference in the long run.