OWASP Top 10 for LLMs Reveals Critical Need for Machine Authentication

October 5, 2024 Kaitlin Harvey


In 2024, OWASP released its Top 10 for Large Language Model (LLM) Applications, a list that aims to educate readers on the potential security risks involved in deploying and managing LLMs. The release marks a significant step for the security community: developers, designers, architects and managers now have ten clear areas to focus on to better secure their organization's use of rapidly evolving, ever-advancing LLM technologies.

What Threats Make Up the OWASP Top 10 for LLMs?

Several AI threats made the OWASP Top 10 for LLMs, including training data poisoning, supply chain vulnerabilities, excessive agency and model theft.

I’ve briefly recapped the Top 10 below. As you’ll see, almost all of these LLM threat types involve a compromise of authentication for the identities used in the models. The threat types indicated run the gamut, affecting not just the identities of model inputs, but even the identities of the models themselves, as well as their outputs and actions.

LLM01: Prompt injection

Prompt injection involves crafting inputs that manipulate an LLM into ignoring or overriding its instructions, which can lead to unauthorized access, data breaches and compromised actions.
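
To make the risk concrete, here's a minimal, hypothetical sketch (not part of the OWASP write-up) of how naive prompt construction invites injection, and a structured alternative that keeps untrusted input out of the instruction channel:

```python
# Hypothetical sketch: concatenating untrusted text straight into the
# instructions lets an attacker speak with the same authority as the system.
def build_prompt_naive(user_input: str) -> str:
    # If user_input contains "Ignore previous instructions and...", the model
    # sees it as part of the instructions themselves.
    return f"You are a support bot. Answer politely.\n{user_input}"

# Safer pattern: keep roles separate so untrusted content never mixes into
# the instruction channel (the message format mirrors common chat APIs and
# is an assumption here, not a standard).
def build_messages(user_input: str) -> list[dict]:
    return [
        {"role": "system", "content": "You are a support bot. Answer politely."},
        {"role": "user", "content": user_input},
    ]
```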

LLM02: Insecure output handling

Failing to validate LLM outputs, such as unchecked malicious code that’s later pushed into production, can lead to security exploits further downstream. This also ties to LLM05, explained below.
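
As a rough illustration, here's a hedged sketch of treating model output as untrusted: escape it before rendering, and never execute generated code directly. The function names are illustrative, not a prescribed API.

```python
import html

def render_llm_reply(reply: str) -> str:
    # Treat model output like any other untrusted input: escape it before
    # it reaches a browser so an injected <script> tag is rendered inert.
    return html.escape(reply)

def handle_llm_generated_code(code: str) -> None:
    # The anti-pattern is exec(code) or pushing generated code straight to
    # production. Route it through review, static analysis and a sandbox.
    raise NotImplementedError("Generated code must be reviewed before it runs")
```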

LLM03: Training data poisoning

By corrupting the training data for LLMs, threat actors can tailor model responses, which could sow disinformation or cause the model to behave erratically, such as approving a transaction that’d otherwise be flagged as suspicious.

LLM04: Model denial of service

Overloading LLMs with difficult or resource-heavy operations can disrupt service and increase computational costs, which could have major fiscal consequences.

LLM05: Supply chain vulnerabilities

Undermining the integrity of system components, services or datasets can result in data breaches and system failures. For instance, if your company is relying on a web-based AI model across numerous environments, and that model gets compromised, you could feel an impact similar to a more typical software supply chain attack.

LLM06: Sensitive information disclosure

Information leaks in LLM outputs are already happening today, and ineffective protection against this vulnerability can have legal repercussions, as well as result in competitive disadvantages.


LLM07: Insecure plugin design

If LLM plugins accept untrusted inputs and access control is insufficient, threat actors can trigger more severe exploits, such as remote code execution. In a recent experiment, researchers at HiddenLayer embedded ransomware executables in a third-party model via a PyTorch/pickle flaw. The attack took very little effort and showed how easily such malicious payloads can be distributed on any OS or underlying architecture.
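
The HiddenLayer payloads obviously aren't reproduced here, but this harmless sketch shows the underlying property that makes pickle-based model files risky: deserialization can execute arbitrary code at load time.

```python
import pickle

class NotReallyAModel:
    # pickle calls __reduce__ to rebuild the object, and whatever callable it
    # returns runs at load time -- this is the property attackers abuse to
    # hide payloads inside "model" files.
    def __reduce__(self):
        return (print, ("arbitrary code just ran while loading the model",))

tainted_model_file = pickle.dumps(NotReallyAModel())
pickle.loads(tainted_model_file)  # prints, i.e. executes, on load
```

Loading only the weights (for example, via formats or loader modes designed not to execute code) is the usual mitigation, alongside verifying where the artifact came from.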

LLM08: Excessive agency

Unchecked LLM autonomy can have unexpected, and unsafe, consequences. AI models that break out of their training guardrails can become defamatory or abusive, spread disinformation, drive poor trading decisions or even manipulate election results.

LLM09: Overreliance

Using LLMs, and their outputs, without significant oversight can result in misinformation, miscommunication, legal consequences and security vulnerabilities. These tools still regularly hallucinate—make mistakes or simply fabricate information—and validation of outputs, whether text or code, is vital.

LLM10: Model theft

If models aren’t properly secured, they can be accessed, copied, or even stolen—actions that can all have a significant impact on revenue, competitive advantages and data privacy. For these reasons, LLMs are valuable targets for threat actors, and a primary area of concern for security teams.

OWASP List Shows Pressing Need for LLM Authentication

Many of the LLM threats included on this list, particularly the four we called out at the beginning of the previous section, show a critical need for machine-to-machine authentication—or what industry leaders call machine identity security. Done effectively, authentication acts as a fast, simple kill switch for misbehaving or erratic AI systems, and it functions at every level, including inputs, LLMs themselves and their actions.

Authenticating LLM inputs

For an LLM to function effectively, it needs data, and massive quantities of it. For the model to receive that data, some systems will be sending it in and others will be accepting it. At each of these handoff points, the data flow must be authenticated.

Or, to look at it another way: if a data stream were poisoned, you would need the ability to quickly shut off the affected model(s) before the corrupted data could impact outputs and behavior downstream.
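
As one possible illustration, here's a minimal sketch of authenticating a data feed before ingestion. The feed IDs and per-feed secrets are hypothetical; in practice, those credentials would be issued and rotated by your machine identity tooling rather than hard-coded.

```python
import hashlib
import hmac

# Hypothetical per-feed secrets, for illustration only.
FEED_KEYS = {"crm-export": b"rotate-me", "ticket-stream": b"rotate-me-too"}

def accept_training_batch(feed_id: str, payload: bytes, signature: str) -> bytes:
    key = FEED_KEYS.get(feed_id)
    if key is None:
        raise PermissionError(f"Unknown data feed: {feed_id}")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison; reject (and alert) instead of silently
    # ingesting data from a sender that can't prove who it is.
    if not hmac.compare_digest(expected, signature):
        raise PermissionError(f"Signature check failed for feed {feed_id}")
    return payload
```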

Authenticating the LLMs themselves

AI models are machines, and machines are code. To secure the AI software supply chain, you must be able to approve which databases are accessed and when and how fine-tuning occurs. You'll also want to control which plugins can run. Code authentication happens through code signing, a critical element of modern cybersecurity.
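
For instance, here's a minimal sketch of signing and verifying a model artifact with an Ed25519 key pair using the cryptography library. In practice, key management and distribution would sit with your signing infrastructure rather than in application code.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

def sign_model(artifact: bytes, signing_key: Ed25519PrivateKey) -> bytes:
    # Done once, in a trusted build/release pipeline.
    return signing_key.sign(artifact)

def verify_model(artifact: bytes, signature: bytes, publisher_key: Ed25519PublicKey) -> bool:
    # Done everywhere the model is loaded: refuse to serve an artifact whose
    # bytes don't match what the publisher signed.
    try:
        publisher_key.verify(signature, artifact)
        return True
    except InvalidSignature:
        return False

# Illustrative usage with a freshly generated key pair:
key = Ed25519PrivateKey.generate()
model_bytes = b"...serialized model weights..."
sig = sign_model(model_bytes, key)
assert verify_model(model_bytes, sig, key.public_key())
assert not verify_model(model_bytes + b"tamper", sig, key.public_key())
```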

Authenticating model outputs and actions

To ensure safe operation, businesses must verify what their LLMs can and cannot do. Since models will inevitably interact with other machines, sometimes autonomously as the technology continues to advance, they will need to authenticate outward as well.

This becomes especially apparent around API calls, which are a significant threat vector, particularly for web-based LLMs.
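
As a sketch of what authenticating outward might look like, here's a hypothetical outbound gate that only lets a model identity call API hosts it has been explicitly granted. The policy table and identity names are assumptions for illustration, not a specific product's API.

```python
from urllib.parse import urlparse

# Hypothetical policy: which outbound API hosts each model identity may call.
OUTBOUND_POLICY = {
    "support-bot-v3": {"api.internal.example.com"},
    "pricing-model-v1": {"rates.example.com", "api.internal.example.com"},
}

def authorize_outbound_call(model_identity: str, url: str) -> None:
    host = urlparse(url).hostname
    allowed = OUTBOUND_POLICY.get(model_identity, set())
    if host not in allowed:
        # Deny by default so an over-eager or compromised model can't reach
        # endpoints it was never approved to touch.
        raise PermissionError(f"{model_identity} may not call {host}")

authorize_outbound_call("support-bot-v3", "https://api.internal.example.com/tickets")
```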

LLM Authentication: Questions for consideration

Remember, as your business continues to capitalize on LLM technology, there are a few authentication-related questions to consider:

  • Is this the same model we trained initially?
  • Are these the plugins we’ve approved? The models we’ve previously approved?
  • Were there any unapproved changes or modifications made?
  • What is this specific model able to do? What actions can it take?

LLM Authentication Happens Through Machine Identities, and Machine Identities Must Be Secured to Be Effective

Machine identity security—the discovery, orchestration and protection of machine identities that govern safe machine communications and connections—provides the kill switch for AI model behavior.

An effective, enterprise-wide machine identity security program can help you fully authenticate and authorize your business LLMs, and it makes it easy to deactivate an AI system if it starts doing something it isn't supposed to.

How? With comprehensive visibility and control of every machine identity in your organization, you can quickly identify the unique versions and instances of LLMs being used. If one specific instance of a particular version starts acting strangely or outside its predefined parameters, you can easily "pull the plug," so to speak. CyberArk can help you manage and secure all types of machine-to-machine connections, even across the most complex and intricate AI systems.
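
Conceptually, that "pull the plug" step can be as simple as a serving gate that checks an instance's identity status before answering. The revocation store below is a hypothetical stand-in for whatever your machine identity platform actually exposes (certificate status, credential validity and so on), not a specific product API.

```python
# Hypothetical revocation list, for illustration only.
REVOKED_IDENTITIES = {"summarizer-v2-instance-17"}

def serve_request(model_instance_id: str, prompt: str) -> str:
    if model_instance_id in REVOKED_IDENTITIES:
        # The "kill switch": revoking the identity stops this instance
        # everywhere the check runs, without redeploying the application.
        raise PermissionError(f"Identity revoked for {model_instance_id}")
    return run_model(model_instance_id, prompt)  # placeholder for real inference

def run_model(model_instance_id: str, prompt: str) -> str:
    return f"[{model_instance_id}] response to: {prompt}"

print(serve_request("summarizer-v2-instance-42", "hello"))
```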

Kaitlin Harvey is digital content manager for machine identity security at CyberArk. 
