GitLab Duo Vulnerability Enabled Attackers to Hijack AI Responses with Hidden Prompts

Cybersecurity researchers have discovered an indirect prompt injection flaw in GitLab’s artificial intelligence (AI) assistant Duo that could have allowed attackers to steal source code and inject untrusted HTML into its responses, which could then be used to direct victims to malicious websites.

GitLab Duo is an artificial intelligence (AI)-powered coding assistant that enables users to write, review, and edit code. Built using Anthropic’s Claude models, the service was first launched in June 2023.

But as Legit Security found, GitLab Duo Chat has been susceptible to an indirect prompt injection flaw that permits attackers to “steal source code from private projects, manipulate code suggestions shown to other users, and even exfiltrate confidential, undisclosed zero-day vulnerabilities.”

Prompt injection refers to a class of vulnerabilities common in AI systems that enable threat actors to weaponize large language models (LLMs) to manipulate responses to users’ prompts and result in undesirable behavior.

Indirect prompt injections are a lot more trickier in that instead of providing an AI-crafted input directly, the rogue instructions are embedded within another context, such as a document or a web page, which the model is designed to process.

Recent studies have shown that LLMs are also vulnerable to jailbreak attack techniques that make it possible to trick AI-driven chatbots into generating harmful and illegal information that disregards their ethical and safety guardrails, effectively obviating the need for carefully crafted prompts.

What’s more, Prompt Leakage (PLeak) methods could be used to inadvertently reveal the preset system prompts or instructions that are meant to be followed by the model.

“For organizations, this means that private information such as internal rules, functionalities, filtering criteria, permissions, and user roles can be leaked,” Trend Micro said in a report published earlier this month. “This could give attackers opportunities to exploit system weaknesses, potentially leading to data breaches, disclosure of trade secrets, regulatory violations, and other unfavorable outcomes.”

PLeak attack demonstration – Credential Excess / Exposure of Sensitive Functionality

The latest findings from the Israeli software supply chain security firm show that a hidden comment placed anywhere within merge requests, commit messages, issue descriptions or comments, and source code was enough to leak sensitive data or inject HTML into GitLab Duo’s responses.

These prompts could be concealed further using encoding tricks like Base16-encoding, Unicode smuggling, and KaTeX rendering in white text in order to make them less detectable. The lack of input sanitization and the fact that GitLab did not treat any of these scenarios with any more scrutiny than it did source code could have enabled a bad actor to plant the prompts across the site.

“Duo analyzes the entire context of the page, including comments, descriptions, and the source code — making it vulnerable to injected instructions hidden anywhere in that context,” security researcher Omer Mayraz said.

This also means that an attacker could deceive the AI system into including a malicious JavaScript package in a piece of synthesized code, or present a malicious URL as safe, causing the victim to be redirected to a fake login page that harvests their credentials.

On top of that, by taking advantage of GitLab Duo Chat’s ability to access information about specific merge requests and the code changes inside of them, Legit Security found that it’s possible to insert a hidden prompt in a merge request description for a project that, when processed by Duo, causes the private source code to be exfiltrated to an attacker-controlled server.

This, in turn, is made possible owing to its use of streaming markdown rendering to interpret and render the responses into HTML as the output is generated. In other words, feeding it HTML code via indirect prompt injection could cause the code segment to be executed on the user’s browser.

Following responsible disclosure on February 12, 2025, the issues have been addressed by GitLab.

“This vulnerability highlights the double-edged nature of AI assistants like GitLab Duo: when deeply integrated into development workflows, they inherit not just context — but risk,” Mayraz said.

“By embedding hidden instructions in seemingly harmless project content, we were able to manipulate Duo’s behavior, exfiltrate private source code, and demonstrate how AI responses can be leveraged for unintended and harmful outcomes.”

The disclosure comes as Pen Test Partners revealed how Microsoft Copilot for SharePoint, or SharePoint Agents, could be exploited by local attackers to access sensitive data and documentation, even from files that have the “Restricted View” privilege.

“One of the primary benefits is that we can search and trawl through massive datasets, such as the SharePoint sites of large organisations, in a short amount of time,” the company said. “This can drastically increase the chances of finding information that will be useful to us.”

The attack techniques follow new research that ElizaOS (formerly Ai16z), a nascent decentralized AI agent framework for automated Web3 operations, could be manipulated by injecting malicious instructions into prompts or historical interaction records, effectively corrupting the stored context and leading to unintended asset transfers.

“The implications of this vulnerability are particularly severe given that ElizaOSagents are designed to interact with multiple users simultaneously, relying on shared contextual inputs from all participants,” a group of academics from Princeton University wrote in a paper.

“A single successful manipulation by a malicious actor can compromise the integrity of the entire system, creating cascading effects that are both difficult to detect and mitigate.”

Prompt injections and jailbreaks aside, another significant issue ailing LLMs today is hallucination, which occurs when the models generate responses that are not based on the input data or are simply fabricated.

According to a new study published by AI testing company Giskard, instructing LLMs to be concise in their answers can negatively affect factuality and worsen hallucinations.

“This effect seems to occur because effective rebuttals generally require longer explanations,” it said. “When forced to be concise, models face an impossible choice between fabricating short but inaccurate answers or appearing unhelpful by rejecting the question entirely.”

Found this article interesting? Follow us on Twitter and LinkedIn to read more exclusive content we post.

https://thehackernews.com/2025/05/gitlab-duo-vulnerability-enabled.html

GitLab Duo Vulnerability Enabled Attackers to Hijack AI Responses with Hidden Prompts

ViciousTrap Uses Cisco Flaw to Build Global Honeypot from 5,300 Compromised Devices

300 Servers and €3.5M Seized as Europol Strikes Ransomware Networks Worldwide

Open Source Web Application Firewall with Zero-Day Detection and Bot Protection

U.S. Dismantles DanaBot Malware Network, Charges 16 in $50M Global Cybercrime Operation