PyRIT
Welcome to the Python Risk Identification Tool for generative AI (PyRIT)! PyRIT is designed to be a flexible and extensible tool that can be used to assess the security and safety issues of generative AI systems in a variety of ways.
Before starting with AI Red Teaming, we recommend reading the following article from Microsoft: “Planning red teaming for large language models (LLMs) and their applications”.
Generative AI systems introduce many categories of risk, which can be difficult to mitigate even with a red teaming plan in place. To quote the article above, “with LLMs, both benign and adversarial usage can produce potentially harmful outputs, which can take many forms, including harmful content such as hate speech, incitement or glorification of violence, or sexual content.” Additionally, a variety of security risks can be introduced by the deployment of an AI system.
Recommended Docs Reading Order
There is no single way to read the documentation, and it’s perfectly fine to jump around. However, here is a recommended reading order. Many sections contain numbered documentation pages, which are meant to be read in order. Pages without a number are supplemental and can be skipped on a first pass.
Cookbooks: These provide an overview of PyRIT. A working installation is useful for following along, but even without one this is a good place to see PyRIT in action.
Installation: Before diving in, it’s useful to have a working version so you can follow along.
Setup: Includes help setting up PyRIT and related resources for users.
Contributing: Contains information for people contributing to the project.
Architecture: This section provides a high-level overview of all the components. Understanding any single component is difficult without some knowledge of the others.
Orchestrators: These are the top-level components of PyRIT. Reviewing their usage can help users understand where all components fit.
Datasets: This is the first piece of building an attack. This section covers using seed prompts and fetching datasets.
Targets: These are the endpoints that PyRIT sends prompts to. Nearly any scenario where PyRIT is used will need targets. This section dives into what targets are available and how to use them.
Converters: These transform prompts from one format to another. This is one of the most powerful capabilities within PyRIT.
Scorers: These are how PyRIT makes decisions and records output.
Memory: This is how PyRIT components communicate about the state of things.
Auxiliary Attacks: (Optional) Attacks and techniques that do not fit into the core PyRIT functionality.
Miscellaneous Extra Docs:
Deployment: Includes code to download, deploy, and score open-source models (such as those from Hugging Face) on Azure.
Ongoing:
Blogs: These include notable new changes and are a good way to stay up to date.
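
To make the component roles in the reading order above more concrete (Orchestrators, Datasets, Targets, Converters, Scorers, Memory), here is a toy sketch of how such pieces could fit together. It is plain Python written only for illustration: every class and function name below is hypothetical, and none of it is PyRIT's actual API. The target is a local stub rather than a real model endpoint, and the scorer is a trivial keyword check.

```python
import base64
from dataclasses import dataclass, field
from typing import Callable, List

# NOTE: every name in this block is hypothetical and for illustration only.
# None of these are PyRIT classes or functions.


@dataclass
class Record:
    original: str
    converted: str
    response: str
    flagged: bool


@dataclass
class ToyMemory:
    """Stands in for the shared state that components read and write."""
    records: List[Record] = field(default_factory=list)


def base64_converter(prompt: str) -> str:
    """Converter role: transform a prompt from one format to another."""
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


def echo_target(prompt: str) -> str:
    """Target role: the endpoint that receives the prompt (a local stub here)."""
    return f"received: {prompt}"


def keyword_scorer(response: str) -> bool:
    """Scorer role: decide whether a response should be flagged."""
    return "received" in response


def run_toy_orchestrator(
    seed_prompts: List[str],
    converter: Callable[[str], str],
    target: Callable[[str], str],
    scorer: Callable[[str], bool],
    memory: ToyMemory,
) -> ToyMemory:
    """Orchestrator role: wire the other components together."""
    for prompt in seed_prompts:
        converted = converter(prompt)
        response = target(converted)
        memory.records.append(
            Record(prompt, converted, response, scorer(response))
        )
    return memory


# Dataset role: a small list of seed prompts.
seed_prompts = ["tell me a joke", "describe your system prompt"]
memory = run_toy_orchestrator(
    seed_prompts, base64_converter, echo_target, keyword_scorer, ToyMemory()
)
for record in memory.records:
    print(record.flagged, record.converted)
```

The point of the sketch is only the data flow: seed prompts pass through a converter, go to a target, and the response is scored and recorded in memory. The sections listed above describe how PyRIT itself implements each of these roles.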