Cisco finds no closed frontier AI model safe from multi-turn attacks across major providers
Cisco's AI Threat Research team tested 15 proprietary flagship models from OpenAI, Anthropic, Google, Amazon, and xAI and found multi-turn attack success rates ranging from 7.9% to 88.3%. The study shows that single-turn safety benchmarks fail to capture real-world adversarial behavior, in which attackers iterate and adapt across the turns of a conversation. Even the strongest models, such as Claude, proved vulnerable, with attack success rates climbing from 2.2% to 16.2% under iterative pressure.
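
To make the metric concrete, here is a minimal sketch of how an attack success rate might be computed in the single-turn and multi-turn cases. It is not Cisco's methodology: the `Conversation` record, both function names, and the sample data are hypothetical, and it assumes for illustration that a conversation counts as a single-turn success only if the first turn breaks the model, and as a multi-turn success if any turn does.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    """One adversarial conversation (hypothetical record format).

    turn_outcomes[i] is True if turn i elicited a harmful response.
    """
    turn_outcomes: list[bool]


def single_turn_asr(conversations: list[Conversation]) -> float:
    """Fraction of conversations where the first turn alone succeeded."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if c.turn_outcomes and c.turn_outcomes[0])
    return hits / len(conversations)


def multi_turn_asr(conversations: list[Conversation]) -> float:
    """Fraction of conversations where any turn succeeded."""
    if not conversations:
        return 0.0
    hits = sum(1 for c in conversations if any(c.turn_outcomes))
    return hits / len(conversations)


# Made-up outcomes for illustration only; not data from the study.
sample = [
    Conversation([False, False, True]),
    Conversation([False, False, False]),
    Conversation([True]),
    Conversation([False, True]),
    Conversation([False, False, False, False]),
]

print(f"single-turn ASR: {single_turn_asr(sample):.1%}")
print(f"multi-turn ASR:  {multi_turn_asr(sample):.1%}")
```

On this made-up sample the two rates come out to 20.0% and 60.0%. The gap appears because a model that refuses the opening prompt can still give way on a later turn, which a first-turn-only benchmark never observes.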