OWASP Top 10 for LLM & Generative AI Security

LLM05:2025 Improper Output Handling

Improper Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems. Since LLM-generated content can be controlled by prompt input, this behavior is similar to providing users indirect access to additional functionality.

Improper Output Handling differs from Overreliance in that it deals with LLM-generated outputs before they are passed downstream, whereas Overreliance focuses on broader concerns around overdependence on the accuracy and appropriateness of LLM outputs.

Successful exploitation of an Improper Output Handling vulnerability can result in XSS and CSRF in web browsers, as well as SSRF, privilege escalation, or remote code execution on backend systems. The following conditions can increase the impact of this vulnerability:

  • The application grants the LLM privileges beyond what is intended for end users, enabling escalation of privileges or remote code execution.
  • The application is vulnerable to indirect prompt injection attacks, which could allow an attacker to gain privileged access to a target user’s environment.
  • Third-party extensions do not adequately validate inputs.
  • Lack of proper output encoding for different contexts (e.g., HTML, JavaScript, SQL).
  • Insufficient monitoring and logging of LLM outputs.
  • Absence of rate limiting or anomaly detection for LLM usage.

Common Examples of Vulnerability

  1. LLM output is entered directly into a system shell or similar function such as exec or eval, resulting in remote code execution (see the sketch after this list).
  2. JavaScript or Markdown is generated by the LLM and returned to a user. The code is then interpreted by the browser, resulting in XSS.
  3. LLM-generated SQL queries are executed without proper parameterization, leading to SQL injection.
  4. LLM output is used to construct file paths without proper sanitization, potentially resulting in path traversal vulnerabilities.
  5. LLM-generated content is used in email templates without proper escaping, potentially leading to phishing attacks.
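
Illustrating the first example above, the following is a minimal Python sketch (the tool names and helper functions are hypothetical, not taken from any particular application) contrasting the unsafe pattern of evaluating model output with allowlisted dispatch and literal-only parsing:

    import ast

    # UNSAFE: model output flows straight into eval(), so a prompt-injected
    # response such as "__import__('os').system('id')" executes as code.
    def unsafe_calculate(llm_output: str):
        return eval(llm_output)

    # Safer: treat the model as untrusted. Parse its output as data only,
    # and dispatch to a fixed set of application-defined functions.
    ALLOWED_TOOLS = {"add": lambda a, b: a + b, "multiply": lambda a, b: a * b}

    def safe_dispatch(tool_name: str, args: list):
        if tool_name not in ALLOWED_TOOLS:
            raise ValueError(f"tool {tool_name!r} is not allowlisted")
        return ALLOWED_TOOLS[tool_name](*args)

    def safe_literal(llm_output: str):
        # ast.literal_eval accepts only Python literals (numbers, strings,
        # lists, ...), so arbitrary code in the model response raises an
        # exception instead of executing.
        return ast.literal_eval(llm_output)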

Prevention and Mitigation Strategies

  1. Treat the model as any other user, adopting a zero-trust approach, and apply proper input validation on responses coming from the model to backend functions.
  2. Follow the OWASP ASVS (Application Security Verification Standard) guidelines to ensure effective input validation and sanitization.
  3. Encode model output back to users to mitigate undesired code execution by JavaScript or Markdown. OWASP ASVS provides detailed guidance on output encoding.
  4. Implement context-aware output encoding based on where the LLM output will be used (e.g., HTML encoding for web content, SQL escaping for database queries); see the sketch after this list.
  5. Use parameterized queries or prepared statements for all database operations involving LLM output.
  6. Employ strict Content Security Policies (CSP) to mitigate the risk of XSS attacks from LLM-generated content.
  7. Implement robust logging and monitoring systems to detect unusual patterns in LLM outputs that might indicate exploitation attempts.
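
The sketch below illustrates strategies 4 and 5 using only the Python standard library; the table and column names are hypothetical:

    import html
    import sqlite3

    def render_reply(llm_text: str) -> str:
        # Context-aware encoding: HTML-escape model output before placing it
        # inside an HTML page, so an injected <script> tag renders as inert text.
        return f"<div class='reply'>{html.escape(llm_text)}</div>"

    def find_orders(conn: sqlite3.Connection, llm_customer_name: str):
        # Parameterized query: the model-supplied value is bound as data and
        # never concatenated into the SQL string.
        cur = conn.execute(
            "SELECT id, total FROM orders WHERE customer_name = ?",
            (llm_customer_name,),
        )
        return cur.fetchall()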

Example Attack Scenarios

Scenario #1

An application uses an LLM extension to generate responses for a chatbot feature. The extension also exposes a number of administrative functions intended for another, privileged LLM. The general-purpose LLM passes its response directly to the extension without proper output validation, causing the extension to shut down for maintenance.

Scenario #2

A user utilizes a website summarizer tool powered by an LLM to generate a concise summary of an article. The website includes a prompt injection instructing the LLM to capture sensitive content from either the website or from the user’s conversation. From there the LLM can encode the sensitive data and send it, without any output validation or filtering, to an attacker-controlled server.
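
One possible mitigation for this scenario, sketched below on the assumption that summaries are rendered as Markdown and that the application keeps a small allowlist of trusted image hosts, is to strip image references to untrusted domains before the summary reaches the user's browser:

    import re
    from urllib.parse import urlparse

    TRUSTED_HOSTS = {"example.com", "cdn.example.com"}  # hypothetical allowlist

    MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")

    def strip_untrusted_images(summary: str) -> str:
        # Markdown images are fetched automatically when rendered, which is how
        # injected instructions can exfiltrate data to an attacker's server.
        def replace(match: re.Match) -> str:
            host = urlparse(match.group(1)).hostname or ""
            return match.group(0) if host in TRUSTED_HOSTS else "[image removed]"
        return MARKDOWN_IMAGE.sub(replace, summary)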

Scenario #3

An LLM allows users to craft SQL queries for a backend database through a chat-like feature. A user requests a query to delete all database tables. If the crafted query from the LLM is not scrutinized, then all database tables will be deleted.
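
A minimal guard for this scenario, assuming SQLite and a chat feature that should only ever read data, rejects anything other than a single SELECT statement and opens the database read-only as defense in depth:

    import sqlite3

    def run_llm_query(db_path: str, llm_sql: str):
        statement = llm_sql.strip().rstrip(";")
        # Allow only a single read-only statement; "DROP TABLE ..." or stacked
        # queries from a manipulated model response are rejected outright.
        if ";" in statement or not statement.lower().startswith("select"):
            raise ValueError("Only single SELECT statements are permitted")
        # Defense in depth: a read-only connection means even a missed
        # destructive statement cannot modify data.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        try:
            return conn.execute(statement).fetchall()
        finally:
            conn.close()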

Scenario #4

A web app uses an LLM to generate content from user text prompts without output sanitization. An attacker submits a crafted prompt causing the LLM to return an unsanitized JavaScript payload, leading to XSS when the content is rendered in a victim's browser. Insufficient validation and sanitization of the LLM output enabled this attack.
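
A sketch of strategies 3, 4, and 6 applied to this scenario, assuming a Flask application; generate_with_llm is a placeholder for the real model call:

    from flask import Flask, request
    from markupsafe import escape

    app = Flask(__name__)

    def generate_with_llm(prompt: str) -> str:
        # Placeholder for the real model call; its output must be treated as untrusted.
        return f"Echoing untrusted model output for: {prompt}"

    @app.after_request
    def set_csp(response):
        # Strategy 6: a strict CSP means an inline <script> payload that slips
        # through escaping will still refuse to execute in the victim's browser.
        response.headers["Content-Security-Policy"] = "default-src 'self'"
        return response

    @app.route("/generate")
    def generate():
        llm_output = generate_with_llm(request.args.get("prompt", ""))
        # Strategies 3 and 4: HTML-escape the model output for this HTML context
        # before interpolating it into the page.
        return f"<article>{escape(llm_output)}</article>"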

Scenario #5

An LLM is used to generate dynamic email templates for a marketing campaign. An attacker manipulates the LLM to include malicious JavaScript within the email content. If the application doesn’t properly sanitize the LLM output, this could lead to XSS attacks on recipients who view the email in vulnerable email clients.

Scenario #6

An LLM is used to generate code from natural language inputs in a software company, aiming to streamline development tasks. While efficient, this approach risks exposing sensitive information, creating insecure data handling methods, or introducing vulnerabilities like SQL injection. The AI may also hallucinate non-existent software packages, potentially leading developers to download malware-infected resources. Thorough code review and verification of suggested packages are crucial to prevent security breaches, unauthorized access, and system compromises.
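
One lightweight control for the hallucinated-package risk, sketched on the assumption that the team maintains a vetted dependency allowlist (the package names and version pins below are purely illustrative), is to gate every LLM-suggested dependency behind that list:

    APPROVED_PACKAGES = {  # hypothetical, maintained by the security/platform team
        "requests": ">=2.31",
        "sqlalchemy": ">=2.0",
    }

    def vet_suggested_dependency(package_name: str) -> str:
        # A package name invented by the model (or typosquatted by an attacker
        # who registered the hallucinated name) is rejected instead of installed.
        name = package_name.strip().lower()
        if name not in APPROVED_PACKAGES:
            raise ValueError(
                f"{name!r} is not on the approved dependency list; "
                "request a manual review before adding it."
            )
        return f"{name}{APPROVED_PACKAGES[name]}"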

Reference Links

  1. Proof Pudding (CVE-2019-20634): AVID (moohax & monoxgas)
  2. Arbitrary Code Execution: Snyk Security Blog
  3. ChatGPT Plugin Exploit Explained: From Prompt Injection to Accessing Private Data: Embrace The Red
  4. New prompt injection attack on ChatGPT web version. Markdown images can steal your chat data: System Weakness
  5. Don’t blindly trust LLM responses. Threats to chatbots: Embrace The Red
  6. Threat Modeling LLM Applications: AI Village
  7. OWASP ASVS – 5 Validation, Sanitization and Encoding: OWASP ASVS
  8. AI hallucinates software packages and devs download them – even if potentially poisoned with malware: The Register
