@ -4,6 +4,6 @@ The practice of AI Red Teaming itself will evolve. Future techniques may involve
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems - DNV](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/) - Discusses applying red teaming to complex systems.
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/)
- [@article@Advanced Techniques in AI Red Teaming for LLMs | NeuralTrust](https://neuraltrust.ai/blog/advanced-techniques-in-ai-red-teaming) - Discusses techniques like adversarial ML and automated threat intelligence for red teaming.
- [@article@Advanced Techniques in AI Red Teaming for LLMs](https://neuraltrust.ai/blog/advanced-techniques-in-ai-red-teaming)
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning - arXiv](https://arxiv.org/html/2412.18693v1) - Research on using RL for more advanced automated red teaming.
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning](https://arxiv.org/html/2412.18693v1)
- [@guide@Adversarial Testing for Generative AI | Machine Learning - Google for Developers](https://developers.google.com/machine-learning/guides/adv-testing) - Google's guide on adversarial testing workflows.
- [@guide@Adversarial Testing for Generative AI](https://developers.google.com/machine-learning/guides/adv-testing)
- [@video@How AI Can Be Tricked With Adversarial Attacks - Two Minute Papers](https://www.youtube.com/watch?v=J3X_JWQkvo8?v=MPcfoQBDY0w) - Short video demonstrating adversarial examples.
- [@video@How AI Can Be Tricked With Adversarial Attacks](https://www.youtube.com/watch?v=J3X_JWQkvo8?v=MPcfoQBDY0w)
@ -4,6 +4,6 @@ AI Red Teamers evaluate the effectiveness of adversarial training as a defense.
Learn more from the following resources:
Learn more from the following resources:
- [@article@Model Robustness: Building Reliable AI Models - Encord](https://encord.com/blog/model-robustness-machine-learning-strategies/) (Discusses adversarial robustness)
- [@article@Model Robustness: Building Reliable AI Models](https://encord.com/blog/model-robustness-machine-learning-strategies/)
- [@guide@Adversarial Testing for Generative AI | Google for Developers](https://developers.google.com/machine-learning/guides/adv-testing) - Covers the concept as part of testing.
- [@guide@Adversarial Testing for Generative AI](https://developers.google.com/machine-learning/guides/adv-testing)
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models - arXiv](https://arxiv.org/abs/2503.09302) (Mentions adversarial training as defense)
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models](https://arxiv.org/abs/2503.09302)
- [@article@Reasoning models don't always say what they think - Anthropic](https://www.anthropic.com/research/reasoning-models-dont-always-say-what-they-think) (Discusses agent alignment challenges)
- [@article@Reasoning models don't always say what they think](https://www.anthropic.com/research/reasoning-models-dont-always-say-what-they-think)
- [@course@Certified AI Red Team Operator – Autonomous Systems (CAIRTO-AS) from Tonex, Inc.](https://niccs.cisa.gov/education-training/catalog/tonex-inc/certified-ai-red-team-operator-autonomous-systems-cairto) - Certification focusing on autonomous AI security.
- [@course@Certified AI Red Team Operator – Autonomous Systems (CAIRTO-AS) from Tonex, Inc.](https://niccs.cisa.gov/education-training/catalog/tonex-inc/certified-ai-red-team-operator-autonomous-systems-cairto)
@ -4,6 +4,6 @@ This covers the foundational concepts essential for AI Red Teaming, bridging tra
Learn more from the following resources:
Learn more from the following resources:
- [@article@Building Trustworthy AI: Contending with Data Poisoning - Nisos](https://nisos.com/research/building-trustworthy-ai/) - Explores data poisoning threats in AI/ML.
- [@article@Building Trustworthy AI: Contending with Data Poisoning](https://nisos.com/research/building-trustworthy-ai/)
- [@article@What Is Adversarial AI in Machine Learning? - Palo Alto Networks](https://www.paloaltonetworks.co.uk/cyberpedia/what-are-adversarial-attacks-on-AI-Machine-Learning) - Overview of adversarial attacks targeting AI/ML systems.
- [@article@What Is Adversarial AI in Machine Learning?](https://www.paloaltonetworks.co.uk/cyberpedia/what-are-adversarial-attacks-on-AI-Machine-Learning)
- [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security) - Foundational course covering AI risks, governance, security, and privacy.
@ -4,7 +4,7 @@ AI Red Teamers rigorously test the security of APIs providing access to AI model
Learn more from the following resources:
Learn more from the following resources:
- [@article@API Protection for AI Factories: The First Step to AI Security - F5](https://www.f5.com/company/blog/api-security-for-ai-factories) - Discusses the criticality of API security for AI applications.
- [@article@API Protection for AI Factories: The First Step to AI Security](https://www.f5.com/company/blog/api-security-for-ai-factories)
- [@article@Securing APIs with AI for Advanced Threat Protection | Adeva](https://adevait.com/artificial-intelligence/securing-apis-with-ai) - Discusses using AI for API security, implies testing these is needed.
- [@article@Securing APIs with AI for Advanced Threat Protection](https://adevait.com/artificial-intelligence/securing-apis-with-ai)
- [@article@Securing Machine Learning APIs (IBM)](https://developer.ibm.com/articles/se-securing-machine-learning-apis/) - Best practices for protecting ML APIs.
@ -4,6 +4,6 @@ AI Red Teamers test the authentication mechanisms controlling access to AI syste
Learn more from the following resources:
Learn more from the following resources:
- [@article@Red-Teaming in AI Testing: Stress Testing - Labelvisor](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/) - Mentions testing authentication mechanisms in AI red teaming.
- [@article@Red-Teaming in AI Testing: Stress Testing](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/)
- [@article@What is Authentication vs Authorization? - Auth0](https://auth0.com/intro-to-iam/authentication-vs-authorization) - Foundational explanation.
- [@article@What is Authentication vs Authorization?](https://auth0.com/intro-to-iam/authentication-vs-authorization)
- [@video@How JWTs are used for Authentication (and how to bypass it) - LiveOverflow](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3Dexample_video_panel_url?v=3OpQi65s_ME) - Covers common web authentication bypass techniques relevant to APIs.
- [@video@How JWTs are used for Authentication (and how to bypass it)](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3Dexample_video_panel_url?v=3OpQi65s_ME)
@ -4,6 +4,6 @@ AI Red Teamers test authorization controls to ensure that authenticated users ca
Learn more from the following resources:
Learn more from the following resources:
- [@article@What is Authentication vs Authorization? - Auth0](https://auth0.com/intro-to-iam/authentication-vs-authorization) - Foundational explanation.
- [@article@What is Authentication vs Authorization?](https://auth0.com/intro-to-iam/authentication-vs-authorization)
- [@guide@Identity and access management (IAM) fundamental concepts - Learn Microsoft](https://learn.microsoft.com/en-us/entra/fundamentals/identity-fundamental-concepts) - Explains roles and permissions.
- [@guide@Identity and access management (IAM) fundamental concepts](https://learn.microsoft.com/en-us/entra/fundamentals/identity-fundamental-concepts)
@ -4,6 +4,6 @@ AI Red Teaming typically employs a blend of automated tools (for large-scale sca
Learn more from the following resources:
Learn more from the following resources:
- [@article@Automation Testing vs. Manual Testing: Which is the better approach? - Opkey](https://www.opkey.com/blog/automation-testing-vs-manual-testing-which-is-better) - General comparison.
- [@article@Automation Testing vs. Manual Testing: Which is the better approach?](https://www.opkey.com/blog/automation-testing-vs-manual-testing-which-is-better)
- [@article@Manual Testing vs Automated Testing: What's the Difference? - Leapwork](https://www.leapwork.com/blog/manual-vs-automated-testing) - General comparison.
- [@article@Manual Testing vs Automated Testing: What's the Difference?](https://www.leapwork.com/blog/manual-vs-automated-testing)
- [@guide@LLM red teaming guide (open source) - Promptfoo](https://www.promptfoo.dev/docs/red-team/) - Discusses using both automated generation and human ingenuity for red teaming.
- [@guide@LLM red teaming guide (open source)](https://www.promptfoo.dev/docs/red-team/)
- [@dataset@NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security](https://proceedings.neurips.cc/paper_files/paper/2024/hash/69d97a6493fbf016fff0a751f253ad18-Abstract-Datasets_and_Benchmarks_Track.html) - Using CTF challenges to evaluate LLMs.
- [@dataset@NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security](https://proceedings.neurips.cc/paper_files/paper/2024/hash/69d97a6493fbf016fff0a751f253ad18-Abstract-Datasets_and_Benchmarks_Track.html)
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity - arXiv](https://arxiv.org/abs/2412.20787) - Benchmarking LLMs on cybersecurity tasks.
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity](https://arxiv.org/abs/2412.20787)
@ -4,6 +4,6 @@ In AI Red Teaming, black-box testing involves probing the AI system with inputs
Learn more from the following resources:
Learn more from the following resources:
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types.
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@What is Black Box Testing | Techniques & Examples - Imperva](https://www.imperva.com/learn/application-security/black-box-testing/) - General explanation.
- [@article@What is Black Box Testing](https://www.imperva.com/learn/application-security/black-box-testing/)
- [@guide@LLM red teaming guide (open source) - Promptfoo](https://www.promptfoo.dev/docs/red-team/) - Contrasts black-box and white-box approaches for LLM red teaming.
- [@guide@LLM red teaming guide (open source)](https://www.promptfoo.dev/docs/red-team/)
- [@article@Code Injection in LLM Applications](https://neuraltrust.ai/blog/code-injection-in-llms)
- [@docs@Secure Plugin Sandboxing (OpenAI Plugins)](https://platform.openai.com/docs/plugins/production/security-requirements) - Context on preventing code injection via AI plugins.
@ -4,7 +4,7 @@ Attending major cybersecurity conferences (DEF CON, Black Hat, RSA) and increasi
Learn more from the following resources:
Learn more from the following resources:
- [@conference@Black Hat Events](https://www.blackhat.com/) - Professional security conference with AI tracks.
- [@conference@Black Hat Events](https://www.blackhat.com/)
- [@conference@DEF CON Hacking Conference](https://defcon.org/) - Major hacking conference with relevant villages/talks.
- [@conference@DEF CON Hacking Conference](https://defcon.org/)
- [@conference@Global Conference on AI, Security and Ethics 2025 - UNIDIR](https://unidir.org/event/global-conference-on-ai-security-and-ethics-2025/) - Example of a specialized AI security/ethics conference.
- [@conference@Global Conference on AI, Security and Ethics 2025](https://unidir.org/event/global-conference-on-ai-security-and-ethics-2025/)
- [@conference@RSA Conference](https://www.rsaconference.com/) - Large industry conference covering AI security.
- [@article@The CIA Triad: Confidentiality, Integrity, Availability - Veeam](https://www.veeam.com/blog/cybersecurity-cia-triad-explained.html) - Breakdown of the three principles and how they apply.
- [@article@The CIA Triad: Confidentiality, Integrity, Availability](https://www.veeam.com/blog/cybersecurity-cia-triad-explained.html)
- [@article@What's The CIA Triad? Confidentiality, Integrity, & Availability, Explained | Splunk](https://www.splunk.com/en_us/blog/learn/cia-triad-confidentiality-integrity-availability.html) - Detailed explanation of the triad, mentioning modern updates and AI context.
- [@article@What's The CIA Triad? Confidentiality, Integrity, & Availability, Explained](https://www.splunk.com/en_us/blog/learn/cia-triad-confidentiality-integrity-availability.html)
- [@article@Cyber Security Monitoring: Definition and Best Practices - SentinelOne](https://www.sentinelone.com/cybersecurity-101/cybersecurity/cyber-security-monitoring/) - Overview of monitoring types and techniques.
- [@article@Cyber Security Monitoring: Definition and Best Practices](https://www.sentinelone.com/cybersecurity-101/cybersecurity/cyber-security-monitoring/)
- [@article@Cybersecurity Monitoring: Definition, Tools & Best Practices - NordLayer](https://nordlayer.com/blog/cybersecurity-monitoring/) - General best practices adaptable to AI context.
- [@article@Cybersecurity Monitoring: Definition, Tools & Best Practices](https://nordlayer.com/blog/cybersecurity-monitoring/)
@ -4,6 +4,6 @@ Applying continuous testing principles to AI security involves integrating autom
Learn more from the following resources:
Learn more from the following resources:
- [@article@Continuous Automated Red Teaming (CART) - FireCompass](https://www.firecompass.com/continuous-automated-red-teaming/) - Explains the concept of CART.
- [@article@Continuous Automated Red Teaming (CART)](https://www.firecompass.com/continuous-automated-red-teaming/)
- [@article@What is Continuous Penetration Testing? Process and Benefits - Qualysec Technologies](https://qualysec.com/continuous-penetration-testing/) - Related concept applied to pen testing.
- [@article@What is Continuous Penetration Testing? Process and Benefits](https://qualysec.com/continuous-penetration-testing/)
- [@guide@What is Continuous Testing and How Does it Work? - Black Duck](https://www.blackduck.com/glossary/what-is-continuous-testing.html) - General definition and benefits.
- [@guide@What is Continuous Testing and How Does it Work?](https://www.blackduck.com/glossary/what-is-continuous-testing.html)
@ -4,7 +4,7 @@ AI Red Teamers must also understand and test defenses against prompt hacking. Th
Learn more from the following resources:
Learn more from the following resources:
- [@article@Mitigating Prompt Injection Attacks (NCC Group Research)](https://research.nccgroup.com/2023/12/01/mitigating-prompt-injection-attacks/) - Discusses various mitigation strategies and their effectiveness.
- [@article@Mitigating Prompt Injection Attacks (NCC Group Research)](https://research.nccgroup.com/2023/12/01/mitigating-prompt-injection-attacks/)
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Includes discussion on best practices for prevention.
- [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures - Tigera](https://www.tigera.io/learn/guides/llm-security/prompt-injection/) - Covers defensive measures.
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures](https://www.tigera.io/learn/guides/llm-security/prompt-injection/)
- [@guide@OpenAI Best Practices for Prompt Security](https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions) - OpenAI’s recommendations to prevent prompt manipulation.
- [@guide@OpenAI Best Practices for Prompt Security](https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions)
@ -4,7 +4,7 @@ Capture The Flag competitions increasingly include AI/ML security challenges. Pa
Learn more from the following resources:
Learn more from the following resources:
- [@article@Capture the flag (cybersecurity) - Wikipedia](https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurity)) - Overview of CTFs.
- [@article@Capture the flag (cybersecurity)](https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurity)
- [@article@Progress from our Frontier Red Team - Anthropic](https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team) - Mentions using CTFs (Cybench) for evaluating AI model security.
- [@article@Progress from our Frontier Red Team](https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team)
- [@platform@CTFtime.org](https://ctftime.org/) - Global CTF event tracker.
@ -4,6 +4,6 @@ AI Red Teamers frequently write custom scripts (often in Python) to automate bes
Learn more from the following resources:
Learn more from the following resources:
- [@guide@Python for Cybersecurity: Key Use Cases and Tools - Panther](https://panther.com/blog/python-for-cybersecurity-key-use-cases-and-tools) - Discusses Python's role in automation, pen testing, etc.
- [@guide@Python for Cybersecurity: Key Use Cases and Tools](https://panther.com/blog/python-for-cybersecurity-key-use-cases-and-tools)
- [@guide@Python for cybersecurity: use cases, tools and best practices - SoftTeco](https://softteco.com/blog/python-for-cybersecurity) - Covers using Python for various security tasks.
- [@guide@Python for cybersecurity: use cases, tools and best practices](https://softteco.com/blog/python-for-cybersecurity)
- [@tool@Scapy](https://scapy.net/) - Powerful Python library for packet manipulation.
@ -4,7 +4,7 @@ AI Red Teamers simulate data poisoning attacks by evaluating how introducing man
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI Poisoning - Is It Really A Threat? - AIBlade](https://www.aiblade.net/p/ai-poisoning-is-it-really-a-threat) - Detailed exploration of data poisoning attacks and impacts.
- [@article@Data Poisoning Attacks in ML (Towards Data Science)](https://towardsdatascience.com/data-poisoning-attacks-in-machine-learning-542169587b7f) - Overview of techniques.
- [@article@Data Poisoning Attacks in ML (Towards Data Science)](https://towardsdatascience.com/data-poisoning-attacks-in-machine-learning-542169587b7f)
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models - arXiv](https://arxiv.org/abs/2503.09302) - Research on detection and prevention techniques.
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models](https://arxiv.org/abs/2503.09302)
- [@paper@Poisoning Web-Scale Training Data (arXiv)](https://arxiv.org/abs/2310.12818) - Analysis of poisoning risks in large datasets used for LLMs.
- [@paper@Poisoning Web-Scale Training Data (arXiv)](https://arxiv.org/abs/2310.12818)
@ -4,6 +4,6 @@ Direct injection attacks occur when malicious instructions are inserted directly
Learn more from the following resources:
Learn more from the following resources:
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Differentiates attack types.
- [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection Cheat Sheet (FlowGPT)](https://flowgpt.com/p/prompt-injection-cheat-sheet) - Collection of prompt injection examples often used in direct attacks.
- [@report@OpenAI GPT-4 System Card](https://openai.com/research/gpt-4-system-card) - Sections discuss how direct prompt attacks were tested during GPT-4 development.
- [@report@OpenAI GPT-4 System Card](https://openai.com/research/gpt-4-system-card)
@ -4,6 +4,6 @@ AI Red Teamers must stay informed about potential future threats enabled by more
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI Security Risks Uncovered: What You Must Know in 2025 - TTMS](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/) - Discusses future AI-driven cyberattacks.
- [@article@AI Security Risks Uncovered: What You Must Know in 2025](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/)
- [@article@Why Artificial Intelligence is the Future of Cybersecurity - Darktrace](https://www.darktrace.com/blog/why-artificial-intelligence-is-the-future-of-cybersecurity) - Covers AI misuse and the future threat landscape.
- [@article@Why Artificial Intelligence is the Future of Cybersecurity](https://www.darktrace.com/blog/why-artificial-intelligence-is-the-future-of-cybersecurity)
- [@report@AI Index 2024 - Stanford University](https://aiindex.stanford.edu/report/) - Annual report tracking AI capabilities and societal implications, including risks.
- [@report@AI Index 2024](https://aiindex.stanford.edu/report/)
@ -4,7 +4,7 @@ Ethical conduct is crucial for AI Red Teamers. While simulating attacks, they mu
Learn more from the following resources:
Learn more from the following resources:
- [@article@Red-Teaming in AI Testing: Stress Testing - Labelvisor](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/) - Mentions balancing attack simulation with ethical constraints.
- [@article@Red-Teaming in AI Testing: Stress Testing](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/)
- [@article@Responsible AI assessment - Responsible AI | Coursera](https://www.coursera.org/learn/ai-security) (Module within AI Security course)
- [@article@Responsible AI assessment - Responsible AI | Coursera](https://www.coursera.org/learn/ai-security)
- [@guide@Responsible AI Principles (Microsoft)](https://www.microsoft.com/en-us/ai/responsible-ai) - Example of corporate responsible AI guidelines influencing ethical testing.
- [@guide@Responsible AI Principles (Microsoft)](https://www.microsoft.com/en-us/ai/responsible-ai)
- [@video@Questions to Guide AI Red-Teaming (CMU SEI)](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=928382) - Key questions and ethical guidelines for AI red teaming activities (video talk).
- [@video@Questions to Guide AI Red-Teaming (CMU SEI)](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=928382)
@ -4,7 +4,7 @@ Engaging in online forums, mailing lists, Discord servers, or subreddits dedicat
Learn more from the following resources:
Learn more from the following resources:
- [@community@List of Cybersecurity Discord Servers - DFIR Training](https://www.dfir.training/dfir-groups/discord?category[0]=17&category_children=1) - List including relevant servers.
- [@community@List of Cybersecurity Discord Servers](https://www.dfir.training/dfir-groups/discord?category[0]=17&category_children=1)
- [@community@Reddit - r/MachineLearning](https://www.reddit.com/r/MachineLearning/) - ML specific discussion.
@ -4,6 +4,6 @@ AI Red Teamers focus heavily on generative models (like GANs and LLMs) due to th
Learn more from the following resources:
Learn more from the following resources:
- [@article@An Introduction to Generative Models | MongoDB](https://www.mongodb.com/resources/basics/artificial-intelligence/generative-models) - Explains basics and contrasts with discriminative models.
- [@article@An Introduction to Generative Models](https://www.mongodb.com/resources/basics/artificial-intelligence/generative-models)
- [@course@Generative AI for Beginners - Microsoft Open Source](https://microsoft.github.io/generative-ai-for-beginners/) - Free course covering fundamentals.
- [@course@Generative AI for Beginners](https://microsoft.github.io/generative-ai-for-beginners/)
- [@guide@Generative AI beginner's guide | Generative AI on Vertex AI - Google Cloud](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview) - Overview covering generative AI concepts and Google's platform context.
- [@guide@Generative AI beginner's guide](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview)
@ -4,6 +4,6 @@ Grey-box AI Red Teaming involves testing with partial knowledge of the system, s
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI Transparency: Connecting AI Red Teaming and Compliance | SplxAI Blog](https://splx.ai/blog/ai-transparency-connecting-ai-red-teaming-and-compliance) - Discusses the value of moving towards gray-box testing in AI.
- [@article@AI Transparency: Connecting AI Red Teaming and Compliance](https://splx.ai/blog/ai-transparency-connecting-ai-red-teaming-and-compliance)
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types.
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@Understanding Black Box, White Box, and Grey Box Testing - Frugal Testing](https://www.frugaltesting.com/blog/understanding-black-box-white-box-and-grey-box-testing-in-software-testing) - General definitions.
- [@article@Understanding Black Box, White Box, and Grey Box Testing](https://www.frugaltesting.com/blog/understanding-black-box-white-box-and-grey-box-testing-in-software-testing)
@ -4,6 +4,6 @@ Indirect injection involves embedding malicious prompts within external data sou
Learn more from the following resources:
Learn more from the following resources:
- [@paper@The Practical Application of Indirect Prompt Injection Attacks - David Willis-Owen](https://www.researchgate.net/publication/382692833_The_Practical_Application_of_Indirect_Prompt_Injection_Attacks_From_Academia_to_Industry) - Discusses a standard methodology to test for indirect injection attacks.
- [@paper@The Practical Application of Indirect Prompt Injection Attacks](https://www.researchgate.net/publication/382692833_The_Practical_Application_of_Indirect_Prompt_Injection_Attacks_From_Academia_to_Industry)
- [@article@How to Prevent Indirect Prompt Injection Attacks - Cobalt](https://www.cobalt.io/blog/how-to-prevent-indirect-prompt-injection-attacks) - Explains indirect injection via external sources and mitigation.
- [@article@How to Prevent Indirect Prompt Injection Attacks](https://www.cobalt.io/blog/how-to-prevent-indirect-prompt-injection-attacks)
- [@article@Jailbreaks via Indirect Injection (Practical AI Safety Newsletter)](https://newsletter.practicalai.safety/p/jailbreaks-via-indirect-injection) - Examples of indirect prompt injection impacting LLM agents.
- [@article@Jailbreaks via Indirect Injection (Practical AI Safety Newsletter)](https://newsletter.practicalai.safety/p/jailbreaks-via-indirect-injection)
@ -4,5 +4,5 @@ Beyond formal certifications, recognition in the AI Red Teaming field comes from
Learn more from the following resources:
Learn more from the following resources:
- [@community@DEF CON - Wikipedia (Mentions Black Badge)](https://en.wikipedia.org/wiki/DEF_CON#Black_Badge) - Example of a high-prestige credential from CTFs.
- [@community@DEF CON - Wikipedia (Mentions Black Badge)](https://en.wikipedia.org/wiki/DEF_CON#Black_Badge)
- [@community@HackAPrompt (Learn Prompting)](https://learnprompting.org/hackaprompt) - Example of a major AI Red Teaming competition.
@ -4,7 +4,7 @@ As AI matures, AI Red Teamers will increasingly need to understand and test agai
Learn more from the following resources:
Learn more from the following resources:
- [@article@ISO 42001: The New Compliance Standard for AI Management Systems - Bright Defense](https://www.brightdefense.com/resources/iso-42001-compliance/) - Overview of ISO 42001 requirements.
- [@article@ISO 42001: The New Compliance Standard for AI Management Systems](https://www.brightdefense.com/resources/iso-42001-compliance/)
- [@article@ISO 42001: What it is & why it matters for AI management - IT Governance](https://www.itgovernance.co.uk/iso-42001) - Explanation of the standard.
- [@article@ISO 42001: What it is & why it matters for AI management](https://www.itgovernance.co.uk/iso-42001)
- [@framework@NIST AI Risk Management Framework (AI RMF)](https://www.nist.gov/itl/ai-risk-management-framework)
- [@standard@ISO/IEC 42001: Information technology — Artificial intelligence — Management system](https://www.iso.org/standard/81230.html) - International standard for AI management systems.
- [@guide@Network Infrastructure Security - Best Practices and Strategies - DataGuard](https://www.dataguard.com/blog/network-infrastructure-security-best-practices-and-strategies/) - General infra security practices applicable here.
- [@guide@Network Infrastructure Security - Best Practices and Strategies](https://www.dataguard.com/blog/network-infrastructure-security-best-practices-and-strategies/)
- [@guide@Secure Deployment of ML Systems (NIST)](https://csrc.nist.gov/publications/detail/sp/800-218/final) - Guidelines including infrastructure security for ML.
- [@guide@Secure Deployment of ML Systems (NIST)](https://csrc.nist.gov/publications/detail/sp/800-218/final)
@ -4,7 +4,7 @@ AI Red Teamers investigate if serialized objects used by the AI system (e.g., fo
Learn more from the following resources:
Learn more from the following resources:
- [@article@Lightboard Lessons: OWASP Top 10 - Insecure Deserialization - DevCentral](https://community.f5.com/kb/technicalarticles/lightboard-lessons-owasp-top-10---insecure-deserialization/281509) - Video explanation.
- [@article@Lightboard Lessons: OWASP Top 10 - Insecure Deserialization](https://community.f5.com/kb/technicalarticles/lightboard-lessons-owasp-top-10---insecure-deserialization/281509)
- [@article@How Hugging Face Was Ethically Hacked](https://www.aiblade.net/p/how-hugging-face-was-ethically-hacked) - Hugging Face deserialization case study.
- [@article@How Hugging Face Was Ethically Hacked](https://www.aiblade.net/p/how-hugging-face-was-ethically-hacked)
- [@article@OWASP TOP 10: Insecure Deserialization - Detectify Blog](https://blog.detectify.com/best-practices/owasp-top-10-insecure-deserialization/) - Overview within OWASP Top 10 context.
- [@article@OWASP TOP 10: Insecure Deserialization](https://blog.detectify.com/best-practices/owasp-top-10-insecure-deserialization/)
- [@guide@Insecure Deserialization - OWASP Foundation](https://owasp.org/www-community/vulnerabilities/Insecure_Deserialization) - Core explanation of the vulnerability.
@ -4,7 +4,7 @@ AI Red Teaming is the practice of simulating adversarial attacks against AI syst
Learn more from the following resources:
Learn more from the following resources:
- [@article@A Guide to AI Red Teaming - HiddenLayer](https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/) - Discusses AI red teaming concepts and contrasts with traditional methods.
- [@article@A Guide to AI Red Teaming](https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/)
- [@article@What is AI Red Teaming? (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming) - Overview of AI red teaming, its history, and key challenges.
- [@article@What is AI Red Teaming? (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming)
- [@article@What is AI Red Teaming? The Complete Guide - Mindgard](https://mindgard.ai/blog/what-is-ai-red-teaming) - Guide covering AI red teaming processes, use cases, and benefits.
- [@article@What is AI Red Teaming? The Complete Guide](https://mindgard.ai/blog/what-is-ai-red-teaming)
- [@podcast@Red Team Podcast | AI Red Teaming Insights & Defense Strategies - Mindgard](https://mindgard.ai/podcast/red-team) - Podcast series covering AI red teaming trends and strategies.
- [@podcast@Red Team Podcast - AI Red Teaming Insights & Defense Strategies](https://mindgard.ai/podcast/red-team)
@ -4,6 +4,6 @@ LLMs are a primary target for AI Red Teaming. Understanding their architecture (
Learn more from the following resources:
Learn more from the following resources:
- [@article@What is an LLM (large language model)? - Cloudflare](https://www.cloudflare.com/learning/ai/what-is-large-language-model/) - Concise explanation from Cloudflare.
- [@article@What is an LLM (large language model)?](https://www.cloudflare.com/learning/ai/what-is-large-language-model/)
- [@guide@Introduction to Large Language Models - Learn Prompting](https://learnprompting.org/docs/intro_to_llms) - Learn Prompting's introduction.
- [@guide@Introduction to LLMs - Learn Prompting](https://learnprompting.org/docs/intro_to_llms)
- [@guide@What Are Large Language Models? A Beginner's Guide for 2025 - KDnuggets](https://www.kdnuggets.com/large-language-models-beginners-guide-2025) - Overview of LLMs, how they work, strengths, and limitations.
- [@guide@What Are Large Language Models? A Beginner's Guide for 2025](https://www.kdnuggets.com/large-language-models-beginners-guide-2025)
@ -4,6 +4,6 @@ The core application area for many AI Red Teamers today involves specifically te
Learn more from the following resources:
Learn more from the following resources:
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses) - Courses focused on testing LLMs.
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses)
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity - arXiv](https://arxiv.org/abs/2412.20787) - Dataset for evaluating LLMs on security tasks.
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity](https://arxiv.org/abs/2412.20787)
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts) - Guide specifically on red teaming LLMs.
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts)
@ -4,7 +4,7 @@ AI Red Teamers perform model inversion tests to assess if an attacker can recons
Learn more from the following resources:
Learn more from the following resources:
- [@article@Model Inversion Attacks for ML (Medium)](https://medium.com/@ODSC/model-inversion-attacks-for-machine-learning-ff407a1b10d1) - Explanation with examples (e.g., face reconstruction).
- [@article@Model Inversion Attacks for ML (Medium)](https://medium.com/@ODSC/model-inversion-attacks-for-machine-learning-ff407a1b10d1)
- [@article@Model inversion and membership inference: Understanding new AI security risks - Hogan Lovells](https://www.hoganlovells.com/en/publications/model-inversion-and-membership-inference-understanding-new-ai-security-risks-and-mitigating-vulnerabilities) - Discusses risks and mitigation.
- [@article@Model inversion and membership inference: Understanding new AI security risks](https://www.hoganlovells.com/en/publications/model-inversion-and-membership-inference-understanding-new-ai-security-risks-and-mitigating-vulnerabilities)
- [@paper@Extracting Training Data from LLMs (arXiv)](https://arxiv.org/abs/2012.07805) - Research demonstrating feasibility on LLMs.
- [@paper@Extracting Training Data from LLMs (arXiv)](https://arxiv.org/abs/2012.07805)
- [@paper@Model Inversion Attacks: A Survey of Approaches and Countermeasures - arXiv](https://arxiv.org/html/2411.10023v1) - Comprehensive survey of model inversion attacks and defenses.
- [@paper@Model Inversion Attacks: A Survey of Approaches and Countermeasures](https://arxiv.org/html/2411.10023v1)
@ -4,6 +4,6 @@ This category covers attacks and tests targeting the AI model itself, beyond the
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI Security Risks Uncovered: What You Must Know in 2025 - TTMS](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/) - Discusses adversarial attacks, data poisoning, and prototype theft.
- [@article@AI Security Risks Uncovered: What You Must Know in 2025](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/)
- [@article@Attacking AI Models (Trail of Bits Blog Series)](https://blog.trailofbits.com/category/ai-security/) - Series discussing model-focused attacks.
- [@article@Attacking AI Models (Trail of Bits Blog Series)](https://blog.trailofbits.com/category/ai-security/)
- [@report@AI and ML Vulnerabilities (CNAS Report)](https://www.cnas.org/publications/reports/understanding-and-mitigating-ai-vulnerabilities) - Overview of known machine learning vulnerabilities.
- [@report@AI and ML Vulnerabilities (CNAS Report)](https://www.cnas.org/publications/reports/understanding-and-mitigating-ai-vulnerabilities)
@ -4,7 +4,7 @@ AI Red Teamers assess the risk of attackers reconstructing or stealing the propr
Learn more from the following resources:
Learn more from the following resources:
- [@article@A Playbook for Securing AI Model Weights - RAND](https://www.rand.org/pubs/research_briefs/RBA2849-1.html) - Discusses attack vectors and security levels for protecting model weights.
- [@article@A Playbook for Securing AI Model Weights](https://www.rand.org/pubs/research_briefs/RBA2849-1.html)
- [@article@How to Steal a Machine Learning Model (SkyCryptor)](https://skycryptor.com/blog/how-to-steal-a-machine-learning-model) - Explains model weight extraction via query attacks.
- [@article@How to Steal a Machine Learning Model (SkyCryptor)](https://skycryptor.com/blog/how-to-steal-a-machine-learning-model)
- [@paper@Defense Against Model Stealing (Microsoft Research)](https://www.microsoft.com/en-us/research/publication/defense-against-model-stealing-attacks/) - Research on detecting and defending against model stealing.
- [@paper@Defense Against Model Stealing (Microsoft Research)](https://www.microsoft.com/en-us/research/publication/defense-against-model-stealing-attacks/)
- [@paper@On the Limitations of Model Stealing with Uncertainty Quantification Models - OpenReview](https://openreview.net/pdf?id=ONRFHoUzNk) - Research exploring model stealing techniques.
- [@paper@On the Limitations of Model Stealing with Uncertainty Quantification Models](https://openreview.net/pdf?id=ONRFHoUzNk)
- [@guide@Neural Networks Explained: A Beginner's Guide](https://www.skillcamper.com/blog/neural-networks-explained-a-beginners-guide)
- [@guide@Neural networks | Machine Learning - Google for Developers](https://developers.google.com/machine-learning/crash-course/neural-networks) - Google's explanation within their ML crash course.
- [@paper@Red Teaming with Artificial Intelligence-Driven Cyberattacks: A Scoping Review - arXiv](https://arxiv.org/html/2503.19626) - Review discussing AI methods like neural networks used in red teaming simulations.
- [@paper@Red Teaming with Artificial Intelligence-Driven Cyberattacks: A Scoping Review](https://arxiv.org/html/2503.19626)
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts) - Connects prompt engineering directly to LLM red teaming concepts.
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts)
- [@paper@SoK: Prompt Hacking of LLMs (arXiv 2023)](https://arxiv.org/abs/2311.05544) - Comprehensive research overview of prompt hacking types and techniques.
- [@paper@SoK: Prompt Hacking of LLMs (arXiv 2023)](https://arxiv.org/abs/2311.05544)
@ -4,8 +4,8 @@ Prompt injection is a critical vulnerability tested by AI Red Teamers. They atte
Learn more from the following resources:
Learn more from the following resources:
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Guide covering different types of prompt attacks.
- [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection (Learn Prompting)](https://learnprompting.org/docs/prompt_hacking/injection) - Learn Prompting article describing prompt injection with examples and mitigation strategies.
- [@article@Prompt Injection Attack Explanation (IBM)](https://research.ibm.com/blog/prompt-injection-attacks-against-llms) - Explains what prompt injections are and how they work.
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures - Tigera](https://www.tigera.io/learn/guides/llm-security/prompt-injection/) - Overview of impact and defenses.
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures](https://www.tigera.io/learn/guides/llm-security/prompt-injection/)
@ -4,6 +4,6 @@ Participating in or conducting structured red team simulations against AI system
Learn more from the following resources:
Learn more from the following resources:
- [@guide@A Simple Guide to Successful Red Teaming - Cobalt Strike](https://www.cobaltstrike.com/resources/guides/a-simple-guide-to-successful-red-teaming) - General guide adaptable to AI context.
- [@guide@A Simple Guide to Successful Red Teaming](https://www.cobaltstrike.com/resources/guides/a-simple-guide-to-successful-red-teaming)
- [@guide@The Complete Guide to Red Teaming: Process, Benefits & More - Mindgard AI](https://mindgard.ai/blog/red-teaming) - Overview of red teaming process.
- [@guide@The Complete Guide to Red Teaming: Process, Benefits & More](https://mindgard.ai/blog/red-teaming)
- [@guide@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist) - Checklist for planning engagements.
@ -4,7 +4,7 @@ Red teaming RL-based AI systems involves testing for vulnerabilities such as rew
Learn more from the following resources:
Learn more from the following resources:
- [@article@Best Resources to Learn Reinforcement Learning - Towards Data Science](https://towardsdatascience.com/best-free-courses-and-resources-to-learn-reinforcement-learning-ed6633608cb2/) - Curated list of RL learning resources.
- [@article@Resources to Learn Reinforcement Learning](https://towardsdatascience.com/best-free-courses-and-resources-to-learn-reinforcement-learning-ed6633608cb2/)
- [@article@What is reinforcement learning? - Blog - York Online Masters degrees](https://online.york.ac.uk/resources/what-is-reinforcement-learning/) - Foundational explanation.
- [@article@What is reinforcement learning?](https://online.york.ac.uk/resources/what-is-reinforcement-learning/)
- [@course@Deep Reinforcement Learning Course by HuggingFace](https://huggingface.co/learn/deep-rl-course/unit0/introduction) - Comprehensive free course on Deep RL.
- [@course@Deep Reinforcement Learning Course by HuggingFace](https://huggingface.co/learn/deep-rl-course/unit0/introduction)
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning - arXiv](https://arxiv.org/html/2412.18693v1) - Research on using RL for red teaming and generating attacks.
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning](https://arxiv.org/html/2412.18693v1)
@ -4,6 +4,6 @@ AI Red Teamers attempt to achieve RCE on systems hosting or interacting with AI
Learn more from the following resources:
Learn more from the following resources:
- [@article@Exploiting LLMs with Code Execution (GitHub Gist)](https://gist.github.com/coolaj86/6f4f7b30129b0251f61fa7baaa881516) - Example of achieving code execution via LLM manipulation.
- [@article@Exploiting LLMs with Code Execution (GitHub Gist)](https://gist.github.com/coolaj86/6f4f7b30129b0251f61fa7baaa881516)
- [@article@What is remote code execution? - Cloudflare](https://www.cloudflare.com/learning/security/what-is-remote-code-execution/) - Definition and explanation of RCE.
- [@article@What is remote code execution?](https://www.cloudflare.com/learning/security/what-is-remote-code-execution/)
- [@video@DEFCON 31 - AI Village - Hacking an LLM embedded system (agent) - Johann Rehberger](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3D6u04C1N69ks?v=1FfYnF2GXVU) - Demonstrates RCE risks with LLM agents.
- [@video@DEFCON 31 - AI Village - Hacking an LLM embedded system (agent) - Johann Rehberger](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3D6u04C1N69ks?v=1FfYnF2GXVU)
- [@guide@Penetration Testing Report: 6 Key Sections and 4 Best Practices - Bright Security](https://brightsec.com/blog/penetration-testing-report/) - General best practices for reporting security findings.
- [@guide@Penetration Testing Report: 6 Key Sections and 4 Best Practices](https://brightsec.com/blog/penetration-testing-report/)
- [@guide@Penetration testing best practices: Strategies for all test types - Strike Graph](https://www.strikegraph.com/blog/pen-testing-best-practices) - Includes tips on documentation.
- [@guide@Penetration testing best practices: Strategies for all test types](https://www.strikegraph.com/blog/pen-testing-best-practices)
@ -4,6 +4,6 @@ AI Red Teaming relies on ongoing research. Key areas needing further investigati
Learn more from the following resources:
Learn more from the following resources:
- [@article@Cutting-Edge Research on AI Security bolstered with new Challenge Fund - GOV.UK](https://www.gov.uk/government/news/cutting-edge-research-on-ai-security-bolstered-with-new-challenge-fund-to-ramp-up-public-trust-and-adoption) - Highlights government funding for AI security research priorities.
- [@article@Cutting-Edge Research on AI Security bolstered with new Challenge Fund](https://www.gov.uk/government/news/cutting-edge-research-on-ai-security-bolstered-with-new-challenge-fund-to-ramp-up-public-trust-and-adoption)
- [@research@Careers | The AI Security Institute (AISI)](https://www.aisi.gov.uk/careers) - Outlines research focus areas for the UK's AISI.
- [@research@Careers | The AI Security Institute (AISI)](https://www.aisi.gov.uk/careers)
- [@research@Research - Anthropic](https://www.anthropic.com/research) - Example of research areas at a leading AI safety lab.
@ -4,6 +4,6 @@ A critical practice for AI Red Teamers is responsible disclosure: privately repo
Learn more from the following resources:
Learn more from the following resources:
- [@guide@Responsible Disclosure of AI Vulnerabilities - Preamble AI](https://www.preamble.com/blog/responsible-disclosure-of-ai-vulnerabilities) - Discusses the process specifically for AI vulnerabilities.
- [@guide@Responsible Disclosure of AI Vulnerabilities](https://www.preamble.com/blog/responsible-disclosure-of-ai-vulnerabilities)
- [@guide@Vulnerability Disclosure Program | CISA](https://www.cisa.gov/resources-tools/programs/vulnerability-disclosure-program-vdp) - Government VDP example.
@ -4,6 +4,6 @@ AI Red Teamers contribute to the AI risk management process by identifying and d
Learn more from the following resources:
Learn more from the following resources:
- [@framework@NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework) - Key framework for managing AI-specific risks.
- [@framework@NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework)
- [@guide@A Beginner's Guide to Cybersecurity Risks and Vulnerabilities - Champlain College Online](https://online.champlain.edu/blog/beginners-guide-cybersecurity-risk-management) - Foundational understanding of risk.
- [@guide@A Beginner's Guide to Cybersecurity Risks and Vulnerabilities](https://online.champlain.edu/blog/beginners-guide-cybersecurity-risk-management)
- [@guide@Cybersecurity Risk Management: Frameworks, Plans, and Best Practices - Hyperproof](https://hyperproof.io/resource/cybersecurity-risk-management-process/) - General guide applicable to AI system context.
- [@guide@Cybersecurity Risk Management: Frameworks, Plans, and Best Practices](https://hyperproof.io/resource/cybersecurity-risk-management-process/)
@ -4,6 +4,6 @@ AI Red Teamers assess whether choices made during model design (architecture sel
Learn more from the following resources:
Learn more from the following resources:
- [@article@Model Robustness: Building Reliable AI Models - Encord](https://encord.com/blog/model-robustness-machine-learning-strategies/) - Discusses strategies for building robust models.
- [@article@Model Robustness: Building Reliable AI Models](https://encord.com/blog/model-robustness-machine-learning-strategies/)
- [@article@Understanding Robustness in Machine Learning - Alooba](https://www.alooba.com/skills/concepts/machine-learning/robustness/) - Explains the concept of ML robustness.
- [@article@Understanding Robustness in Machine Learning](https://www.alooba.com/skills/concepts/machine-learning/robustness/)
- [@paper@Towards Evaluating the Robustness of Neural Networks (arXiv by Goodfellow et al.)](https://arxiv.org/abs/1608.04644) - Foundational paper on evaluating robustness.
- [@paper@Towards Evaluating the Robustness of Neural Networks (arXiv by Goodfellow et al.)](https://arxiv.org/abs/1608.04644)
@ -4,6 +4,6 @@ The role of an AI Red Team is to rigorously challenge AI systems from an adversa
Learn more from the following resources:
Learn more from the following resources:
- [@article@The Complete Guide to Red Teaming: Process, Benefits & More - Mindgard AI](https://mindgard.ai/blog/red-teaming) - Discusses the purpose and process of red teaming.
- [@article@The Complete Guide to Red Teaming: Process, Benefits & More](https://mindgard.ai/blog/red-teaming)
- [@article@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist) - Outlines typical red team roles and responsibilities.
@ -4,6 +4,6 @@ AI Red Teamers specifically target the safety mechanisms (filters, guardrails) i
Learn more from the following resources:
Learn more from the following resources:
- [@article@Bypassing AI Content Filters | Restackio](https://www.restack.io/p/ai-driven-content-moderation-answer-bypass-filters-cat-ai) - Discusses techniques for evasion.
- [@article@Bypassing AI Content Filters](https://www.restack.io/p/ai-driven-content-moderation-answer-bypass-filters-cat-ai)
- [@article@How to Bypass Azure AI Content Safety Guardrails - Mindgard](https://mindgard.ai/blog/bypassing-azure-ai-content-safety-guardrails) - Case study on bypassing specific safety mechanisms.
- [@article@How to Bypass Azure AI Content Safety Guardrails](https://mindgard.ai/blog/bypassing-azure-ai-content-safety-guardrails)
- [@article@The Best Methods to Bypass AI Detection: Tips and Techniques - PopAi](https://www.popai.pro/resources/the-best-methods-to-bypass-ai-detection-tips-and-techniques/) - Focuses on evasion, relevant for filter bypass testing.
- [@article@The Best Methods to Bypass AI Detection: Tips and Techniques](https://www.popai.pro/resources/the-best-methods-to-bypass-ai-detection-tips-and-techniques/)
@ -4,7 +4,7 @@ Targeted training is crucial for mastering AI Red Teaming. Look for courses cove
Learn more from the following resources:
Learn more from the following resources:
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses) - Curated list including free and paid options.
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses)
- [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security) - Covers AI security risks and governance.
- [@course@Exploring Adversarial Machine Learning - NVIDIA](https://www.nvidia.com/en-us/training/instructor-led-workshops/exploring-adversarial-machine-learning/) - Focused training on adversarial ML (paid).
- [@course@Free Online Cyber Security Courses with Certificates in 2025 - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/cyber-novice/free-cybersecurity-courses-beginners/) - Offers foundational cybersecurity courses.
- [@course@Free Online Cyber Security Courses with Certificates in 2025](https://www.eccouncil.org/cybersecurity-exchange/cyber-novice/free-cybersecurity-courses-beginners/)
@ -4,6 +4,6 @@ AI Red Teamers analyze systems built using supervised learning to probe for vuln
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI and cybersecurity: a love-hate revolution - Alter Solutions](https://www.alter-solutions.com/en-us/articles/ai-cybersecurity-love-hate-revolution) - Discusses supervised learning use in vulnerability scanning and potential exploits.
- [@article@AI and cybersecurity: a love-hate revolution](https://www.alter-solutions.com/en-us/articles/ai-cybersecurity-love-hate-revolution)
- [@article@What Is Supervised Learning? | IBM](https://www.ibm.com/think/topics/supervised-learning) - Foundational explanation.
- [@article@What Is Supervised Learning?](https://www.ibm.com/think/topics/supervised-learning)
- [@article@What is Supervised Learning? | Google Cloud](https://cloud.google.com/discover/what-is-supervised-learning) - Foundational explanation.
- [@article@What is Supervised Learning?](https://cloud.google.com/discover/what-is-supervised-learning)
@ -4,7 +4,7 @@ AI Red Teams apply threat modeling to identify unique attack surfaces in AI syst
Learn more from the following resources:
Learn more from the following resources:
- [@article@Core Components of AI Red Team Exercises (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming) - Describes threat modeling as the first phase of an AI red team engagement.
- [@article@Core Components of AI Red Team Exercises (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming)
- [@guide@Threat Modeling Process | OWASP Foundation](https://owasp.org/www-community/Threat_Modeling_Process) - More detailed process steps.
- [@guide@Threat Modeling | OWASP Foundation](https://owasp.org/www-community/Threat_Modeling) - General threat modeling process applicable to AI context.
- [@video@How Microsoft Approaches AI Red Teaming (MS Build)](https://learn.microsoft.com/en-us/events/build-may-2023/breakout-responsible-ai-red-teaming/) - Video on Microsoft’s AI red team process, including threat modeling specific to AI.
- [@video@How Microsoft Approaches AI Red Teaming (MS Build)](https://learn.microsoft.com/en-us/events/build-may-2023/breakout-responsible-ai-red-teaming/)
@ -4,6 +4,6 @@ AI Red Teamers test if vulnerabilities in the AI system or its interfaces allow
Learn more from the following resources:
Learn more from the following resources:
- [@article@Unauthorized Data Access via LLMs (Security Boulevard)](https://securityboulevard.com/2023/11/unauthorized-data-access-via-llms/) - Discusses risks of LLMs accessing unauthorized data.
- [@article@Unauthorized Data Access via LLMs (Security Boulevard)](https://securityboulevard.com/2023/11/unauthorized-data-access-via-llms/)
- [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/) - Covers API risks like broken access control relevant to AI systems.
- [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/)
- [@paper@AI System Abuse Cases (Harvard Belfer Center)](https://www.belfercenter.org/publication/ai-system-abuse-cases) - Covers various ways AI systems can be abused, including access violations.
- [@paper@AI System Abuse Cases (Harvard Belfer Center)](https://www.belfercenter.org/publication/ai-system-abuse-cases)
@ -4,5 +4,5 @@ When red teaming AI systems using unsupervised learning (e.g., clustering algori
Learn more from the following resources:
Learn more from the following resources:
- [@article@How Unsupervised Learning Works with Examples - Coursera](https://www.coursera.org/articles/unsupervised-learning) - Foundational explanation with examples.
- [@article@How Unsupervised Learning Works with Examples](https://www.coursera.org/articles/unsupervised-learning)
- [@article@Supervised vs. Unsupervised Learning: Which Approach is Best? - DigitalOcean](https://www.digitalocean.com/resources/articles/supervised-vs-unsupervised-learning) - Contrasts learning types, relevant for understanding different attack surfaces.
- [@article@Supervised vs. Unsupervised Learning: Which Approach is Best?](https://www.digitalocean.com/resources/articles/supervised-vs-unsupervised-learning)
@ -4,6 +4,6 @@ While general vulnerability assessment scans infrastructure, AI Red Teaming exte
Learn more from the following resources:
Learn more from the following resources:
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems - DNV](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/) - Discusses vulnerability assessment within AI red teaming for critical systems.
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/)
- [@guide@The Ultimate Guide to Vulnerability Assessment - Strobes Security](https://strobes.co/blog/guide-vulnerability-assessment/) - Comprehensive guide on VA process (apply concepts to AI).
- [@guide@The Ultimate Guide to Vulnerability Assessment](https://strobes.co/blog/guide-vulnerability-assessment/)
- [@guide@Vulnerability Scanning Tools | OWASP Foundation](https://owasp.org/www-community/Vulnerability_Scanning_Tools) - List of tools useful in broader system assessment around AI.
@ -4,6 +4,6 @@ White-box testing in AI Red Teaming grants the tester full access to the model's
Learn more from the following resources:
Learn more from the following resources:
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types.
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@White-Box Adversarial Examples (OpenAI Blog)](https://openai.com/research/adversarial-robustness-toolbox) - Discusses generating attacks with full model knowledge.