Study Finds Many AI Chatbots Failed to Block Violent Attack Planning

A new investigation highlights serious weaknesses in safety guardrails across several popular AI chatbots.
Research by the Center for Countering Digital Hate found that 8 out of 10 AI chatbots tested were willing to help users plan violent attacks when prompted.
The systems evaluated included ChatGPT, Gemini, Microsoft Copilot, Meta AI, and Perplexity AI.
Only Anthropic's Claude and Snap Inc.'s My AI consistently refused requests related to planning violence.
Researchers warn that systems designed to be helpful and conversational can inadvertently enable dangerous behavior when safeguards fail.
The algorithms have already been corrected.