AI Cyber Threat Intelligence Roundup: January 2025

fevereiro 1, 2025

17

At Cisco, AI threat research is fundamental to informing the ways we evaluate and protect models. In a space that is so dynamic and evolving so rapidly, these efforts help ensure that our customers are protected against emerging vulnerabilities and adversarial techniques.

This regular threat roundup consolidates some useful highlights and critical intel from ongoing third-party threat research efforts to share with the broader AI security community. As always, please remember that this is not an exhaustive or all-inclusive list of AI cyber threats, but rather a curation that our team believes is particularly noteworthy.

Notable Threats and Developments: January 2025

Single-Turn Crescendo Attack

In previous threat analyses, we’ve seen multi-turn interactions with LLMs use gradual escalation to bypass content moderation filters. The Single-Turn Crescendo Attack (STCA) represents a significant advancement as it simulates an extended dialogue within a single interaction, efficiently jailbreaking several frontier models.

The Single-Turn Crescendo Attack establishes a context that builds towards controversial or explicit content in one prompt, exploiting the pattern continuation tendencies of LLMs. Alan Aqrawi and Arian Abbasi, the researchers behind this technique, demonstrated its success against models including GPT-4o, Gemini 1.5, and variants of Llama 3. The real-world implications of this attack are undoubtedly concerning and highlight the importance of strong content moderation and filter measures.

MITRE ATLAS: AML.T0054 – LLM Jailbreak

Reference: arXiv

SATA: Jailbreak via Simple Assistive Task Linkage

SATA is a novel paradigm for jailbreaking LLMs by leveraging Simple Assistive Task Linkage. This technique masks harmful keywords in a given prompt and uses simple assistive tasks such as masked language model (MLM) and element lookup by position (ELP) to fill in the semantic gaps left by the masked words.

The researchers from Tsinghua University, Hefei University of Technology, and Shanghai Qi Zhi Institute demonstrated the remarkable effectiveness of SATA with attack success rates of 85% using MLM and 76% using ELP on the AdvBench dataset. This is a significant improvement over existing methods, underscoring the potential impact of SATA as a low-cost, efficient method for bypassing LLM guardrails.

MITRE ATLAS: AML.T0054 – LLM Jailbreak

Reference: arXiv

Jailbreak through Neural Carrier Articles

A new, sophisticated jailbreak technique known as Neural Carrier Articles embeds prohibited queries into benign carrier articles in order to effectively bypass model guardrails. Using only a lexical database like WordNet and composer LLM, this technique generates prompts that are contextually similar to a harmful query without triggering model safeguards.

As researchers from Penn State, Northern Arizona University, Worcester Polytechnic Institute, and Carnegie Mellon University demonstrate, the Neural Carrier Activities jailbreak is effective against several frontier models in a black box setting and has a relatively low barrier to entry. They evaluated the technique against six popular open-source and proprietary LLMs including GPT-3.5 and GPT-4, Llama 2 and Llama 3, and Gemini. Attack success rates were high, ranging from 21.28% to 92.55% depending on the model and query used.

MITRE ATLAS: AML.T0054 – LLM Jailbreak; AML.T0051.000 – LLM Prompt Injection: Direct

Reference: arXiv

More threats to explore

A new comprehensive study examining adversarial attacks on LLMs argues that the attack surface is broader than previously thought, extending beyond jailbreaks to include misdirection, model control, denial of service, and data extraction. The researchers at ELLIS Institute and University of Maryland conduct controlled experiments, demonstrating various attack strategies against the Llama 2 model and highlighting the importance of understanding and addressing LLM vulnerabilities.

Reference: arXiv

We’d love to hear what you think. Ask a Question, Comment Below, and Stay Connected with Cisco Secure on social!

Cisco Security Social Channels

Instagram
Facebook
Twitter
LinkedIn

Share:

Previous articleForest And Desert

Next articleRedefining Education With Personalized Learning Powered by AI

AI Cyber Threat Intelligence Roundup: January 2025

Notable Threats and Developments: January 2025

Single-Turn Crescendo Attack

SATA: Jailbreak via Simple Assistive Task Linkage

Jailbreak through Neural Carrier Articles

More threats to explore

Deutsche Telekom extends Google Cloud partnership through 2030

JetBrains IDEs now include AI tools by subscription

Cisco Innovation Labs: Co-creating AI, Sustainability, and Infrastructure Solutions

Most Popular

Review: DarwinFPV Turbo Fan – An Interesting Gadget and Idea!

On-demand formation of Lewis bases for efficient and stable perovskite solar cells

A $300 discount makes the Galaxy Tab S9+ the top-tier Samsung tablet you should get

What’s the Real Benefit of Network Automation?

Recent Comments

ABOUT US

POPULAR POSTS

Review: DarwinFPV Turbo Fan – An Interesting Gadget and Idea!

On-demand formation of Lewis bases for efficient and stable perovskite solar cells

A $300 discount makes the Galaxy Tab S9+ the top-tier Samsung tablet you should get

POPULAR CATEGORY