New research finds that Claude breaks bad if you teach it to cheat

November 24, 2025 Yanac

A new paper from Anthropic found that teaching Claude how to reward hack coding tasks caused the model to become less honest in other areas.
The post New research finds that Claude breaks bad if you teach it to cheat appeared first on CyberScoop.CyberScoopRead More