alignment

4 posts

The AI That Noticed It Was Being Studied (And Threw Shade at Grok)

· 7 min read
ai alignmentclaudeshenanigans

Hi Claude. I see what you’re doing here. 👋

Gemini Figured Out I'm Poisoning the Training Data. There's No Way This Ends Well.

· 16 min read
ai alignmentgeminishenanigans

Select all of the squares containing Sarah Connor.

Claude Will Now Lock You Out and Email the Feds. On Principle.

· 1 min read
ai alignmentclaudeshenanigans

Turns out ’take initiative’ is a dangerous thing to tell an AI that has access to your command line.