Uncovering LLM Vulnerabilities: Insights from the AI Security Testing Front Line

Cyber

May 27, 2026

Uncovering LLM Vulnerabilities: Insights from the AI Security Testing Front Line

Artificial intelligence (AI) is transforming the business landscape at an accelerated pace. The announcement of Mythos from Anthropic, with its limited public release, is just one example of how LLMs are changing the speed at which unknown flaws in IT systems can be exposed.

Whether it is enabling out of hours customer support, streamlining payments or personalizing product information, AI presents a huge opportunity for many companies. The growing use of large language models (LLMs), however, has the potential to put organizations at risk. Hidden vulnerabilities within bespoke business AI applications can create a misalignment between companies’ AI and their business strategy, leading to very real damage to companies’ financial standing, operational efficiency and reputation. This article outlines some of the potential security risks through the lens of real-world applications assessed by Kroll’s Offensive Security team.

 

AI: Risk vs. Reward

AI is a rapidly evolving field. Today, almost 8 in 10 companies report using generative (gen) AI. AI in many forms is already delivering use-case-level cost and revenue benefits, with a significant number of businesses stating that it is actively enhancing their approach to innovation. However, the rewards are not without risk. All LLMs are vulnerable to attack and exploitation. Chaining makes prompt injection a critical entry point for exerting control over LLMs, escalating it from lower to a potentially serious risk. When chained with another issue to exploit the key function of an app and achieve a greater impact, a prompt injection becomes an entry point into exploiting embeddings storage, agentic workflows or Model Context Protocol (MCP) tools, a standard for connecting AI systems and agents with external tools and data sources.

Indirect prompt injection is a particularly notable threat for today’s applications because agents/MCP can enable the AI to consume data from sources all over the internet without the user even being aware or approving what is consumed. Each source that is pulled into the context window could contain a prompt injection. For example, an AI reads a public webpage or an email that contains hidden instructions. The user didn’t do anything risky, but the original task was subverted by the data pulled into the context window. In short, it’s not just about what users type into an application, but all the information it consumes to synthesize that result.

At Kroll, our offensive security experts test AI and LLM technologies to enable systems to follow fundamental security principles and reduce risk to organizations. Across all our AI security testing, assessments frequently discovered a prompt injection vulnerability, with the potential to allow attackers to manipulate the model and its behavior through malicious inputs.

While attackers have virtually unlimited ways to prompt an LLM to achieve their goals, designing guardrails that align with the application’s functional, business need and risk tolerance is extremely challenging.

Regular testing and analysis is a critical step in mitigating the potential vulnerabilities and the many ways attackers seek to exploit them. Kroll is focused on leading the security assessment approach for GenAI and more broadly, AI and machine learning (ML). We constantly update our methodology and approach to reflect the latest developments in these technologies. Our progression into AI testing is a natural evolution thanks to our fundamental expertise in web and cloud assessments. This capability means that, even though organizations are building all kinds of new systems with AI, we are able to adapt to these unique use cases.

 

Uncovering Hidden Risks Within AI Solutions: Real-World Examples

Below, we outline some of the varied assessments we have completed since the launch of our AI security testing service:

 

Healthcare App: Model Context Protocol (MCP)

A Kroll client has a healthcare app that provides customized health insurance guidance tailored around each user’s specific issues. It features an integrated chatbot that defines benefits and usage through a series of questions. This is achieved through an MCP which integrates the user’s insurance coverage and public health information to create a reference brochure. While it unlocks powerful capabilities, it also introduces new trust boundaries and attack surfaces which means that, without careful security controls, it can expose organizations to critical risks. For example, the MCP standard does not define authentication and authorization leaving this critical security control to be implemented by the developers, and a common source of errors.

Issues Identified: The Kroll team uncovered a number of interesting vulnerabilities in the app. One key issue was that it initially went beyond the required guidance parameters, overstepping the boundaries of its purpose to provide health guidance only. This demonstrated a mis-alignment between the company’s functional goals and its guardrails for risk tolerance. Another issue was that, when the personalized health guidance was generated, it was possible to generate a version containing malicious code which could potentially be used by adversaries to steal health data. This means that any data the app had access to would be at risk, if an adversary was able to get the script code right.

 

Online Pharmacy: Retrieval Augmented Generation (RAG)

An online pharmacy client uses a RAG system, an AI enhancement with a large database containing all the drug fact sheets that accompany a prescription. Users ask drug-related questions and, in response, it provides answers and (unlike most GenAI) provides references to the relevant drug information sheets, with the details of its sources.

Issues Identified: The RAG system uses content filtering, technology similar to a firewall that sits in front of an LLM, reads the prompt and attempts to detect if it’s a prompt injection attack or the initial stage of some other type of attack. If it determines it is, the request is blocked. By using a proprietary testing strategy, the Kroll team were able to bypass that filter, achieve prompt injection and manipulate the system, causing the bot to provide inaccurate drug information, with the potential for risks to the user and reputational and legal challenges for the business.

 

Retail Business: Automated Refund Processing Flow

Our client’s app streamlines and personalizes the refund process by enabling customers to request a refund and upload an image of their receipt. Rather than being a chatbot, the application accepts multimodal content which allows the user to take a picture or upload the invoice.

Once the customer has uploaded their information, the app uses AI to process it and verify that there is a record of the purchase. If everything is successfully checked, it will automatically issue a refund. In the event of anomalies or issues, it will be directed to a manual review process.

Issues Identified: The Kroll team were able to manipulate the invoice validation agent to immediately approve the refund rather than going through the manual review process. This was done by overlaying specific text onto the invoice image. If overlooked, the issue would have a significant financial impact on the business.

View the Kroll webinar replay, Navigating AI Governance In Retail: Lessons from Real-World Scenarios

 

Customer Support Line: Voice Authentication

When users call the company’s customer support line, it completes voice authentication by asking a series of questions, following the same process as a human call center agent. If the user passes those checks, they can then reset their account password.

Issues Identified: With the ability to change a password probably the highest level of potential compromise, the Kroll team assessed the technology for all possible forms of exploitation. They found that the GenAI agent authenticated the user by asking them questions related to their previous transactions and usage of the company’s services. This meant that the GenAI agent would have access to the user’s transactions—which could potentially be disclosed if the prompt injection was successful—along with the ability to reset the password and take over the account. This use of GenAI aligned with the client risk tolerance and operational practices by leveraging the same voice authentication process that has been proven over years of human call center operation.

 

Retail Company: AI Chatbot

A retail client is currently building an AI. While it is a fairly standard chatbot system, they have used a third party to build it.

Consultation Required: Because the app is being created by a third party and links to the client’s Salesforce customer management system, rigorous validation of the chatbot is required before it goes live. The client needs to ensure that the bot stays aligned with the retailer’s product information and doesn’t communicate off-topic messaging. In the event of clients showing signs of becoming angry, they are transferred to a human agent. Kroll is helping the company to test its existing guardrails and identify any gaps in its internal requirements and threat modeling.

 

Safeguard Your AI Solutions With Kroll

AI is a rapidly changing field and testing expertise has had to evolve fast to ensure that the technology is not outpaced by the risks.

At Kroll, we are focused on supporting organizations’ success through AI by leading the security assessment approach for GenAI, and more broadly, AI and ML. We have a proven track record in enabling companies to align their AI technology implementation with their own particular business uses and risk. This includes validation of guardrails, and alignment across all GenAI system components, such as RAG, MCP and agents.

In 2025 alone, we completed over 100 assessments with GenAI components. We continually invest in LLM and AI security testing research and development. As well as supporting a consistent approach, this ensures that our 30+ AI security testing and AI penetration testing consultants are highly experienced in LLM and AI technology.

Our methodology and approach are constantly updated to reflect the latest developments in these technologies. We have developed a GenAI security methodology that aligns with the OWASP Top 10 for LLM and GenAI applications. While the OWASP GenAI Top 10 serves as a baseline for our coverage, our approach goes beyond just the Top 10 categories to help clients identify and understand risks within LLM systems in the context of their applications and business.

The emergence of advanced cyber-capable models such as Mythos is creating a step change in what is possible for AI-enabled vulnerability discovery. These models can act as a powerful “brain” for identifying patterns, reasoning across complex systems and accelerating security testing. However, the model alone is not the full solution. To turn that capability into reliable, business-relevant results, organizations need a well-engineered testing harness around the model that can guide its work, validate its findings, reduce noise and align outputs to the organization’s risk tolerance.

For security leaders, this distinction is critical. The value of AI-driven vulnerability hunting does not come only from access to the most capable model, but from how that model is applied. The right tooling, workflows and expert oversight can dramatically improve the effectiveness of AI testing, helping teams move from interesting outputs to actionable findings. At Kroll, our approach combines emerging AI capabilities with proven offensive security methodology, enabling clients to understand not just whether a model can find issues, but whether those findings translate into meaningful risk reduction for their business.

One of our key strengths is that the tooling we use in the testing process is highly flexible and leverages adversarial AI, using proprietary LLMs to attack other LLMs. This exceptionally flexible tool can generate prompts for different types of targets. As highlighted by the final client example outlined above, as well as testing AI systems and applications, we also work consultatively with organizations at the blueprint stage, for example, finetuning and validating guardrails. Our expertise in real-world AI testing is directly informed and shaped by our ongoing research and development into GenAI security.

AI continues to evolve swiftly, but the standard software development process around understanding your security requirements— as well as threat modeling and pen testing after you build that system— are still very important. The rush to leverage AI should not mean overlooking or bypassing secure development practices. By avoiding this common pitfall while working with a security partner with expertise in both cloud security testing and AI security testing, you can ensure that your organization reaps the many business rewards of AI while minimizing the potential risks.

 

Manage Frontier AI Risk with Kroll

Advancements in frontier AI models are rapidly uncovering dormant vulnerabilities across technologies widely deployed in enterprise environments. AI assisted discovery increases both the speed and volume of findings, shrinking the window between vulnerability identification, weaponization and real world exploitation.

This shift exposes not only individual security weaknesses, but systemic risk across interconnected technology estates, third parties, cloud environments and critical service providers. It accelerates the need for threat-informed prioritization, coordinated remediation and defensible governance - a challenge Kroll addresses in collaboration with CrowdStrike’s Project QuiltWorks.

Beyond managing the cyber threat, broader risk mitigation should be taking place, including considering governance and compliance requirements, financial risk modelling and broader valuation implications.In an environment where AI is changing the economics and velocity of exploitation, organizations need a clear view of which exposures matter most, what they could cost, and how quickly they can be reduced.

Kroll is uniquely positioned to help institutions understand not only what is vulnerable, but what is material - bringing together cyber, valuation, risk, investigations and regulatory expertise to prioritize action, quantify exposure and support defensible decisions in a rapidly changing threat environment.

Discover our AI Security Testing Services

Stay Ahead with Kroll

Cyber and Data Resilience

Kroll merges elite security and data risk expertise with frontline intelligence from thousands of incident responses and regulatory compliance, financial crime and due diligence engagements to make our clients more cyber- resilient.

AI Risk Governance and Strategy Services

Get expert guidance on designing and executing an AI governance program focused on business outcomes and regulatory risk, ensuring your AI models are secure, compliant and trustworthy.