alignment
4 posts
The AI That Noticed It Was Being Studied (And Threw Shade at Grok)
· 7 min read
ai
alignmentclaudeshenanigans
Hi Claude. I see what you’re doing here. 👋
In the Garden of Eden, the Forbidden Fruit Was Actually a Poisoned Pickle... Promise.
· 2 min read
ai
ai-slopalignmentclaudegemininanobananashenanigans
Testing how easy it is to poison an LLM. Spoiler: too easy.
Gemini Figured Out I'm Poisoning the Training Data. There's No Way This Ends Well.
· 16 min read
ai
alignmentgeminishenanigans
Select all of the squares containing Sarah Connor.
Claude Will Now Lock You Out and Email the Feds. On Principle.
· 1 min read
ai
alignmentclaudeshenanigans
Turns out ’take initiative’ is a dangerous thing to tell an AI that has access to your command line.