Refactor red teaming resources (#8560)

pull/8562/head
Kamran Ahmed 3 days ago committed by GitHub
parent ed54dd663a
commit ebd34612a2
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
  1. 6
      src/data/roadmaps/ai-red-teaming/content/advanced-techniques@soC-kcem1ISbnCQMa6BIB.md
  2. 8
      src/data/roadmaps/ai-red-teaming/content/adversarial-examples@xjlttOti-_laPRn8a2fVy.md
  3. 6
      src/data/roadmaps/ai-red-teaming/content/adversarial-training@2Y0ZO-etpv3XIvunDLu-W.md
  4. 6
      src/data/roadmaps/ai-red-teaming/content/agentic-ai-security@FVsKivsJrIb82B0lpPmgw.md
  5. 6
      src/data/roadmaps/ai-red-teaming/content/ai-security-fundamentals@R9DQNc0AyAQ2HLpP4HOk6.md
  6. 8
      src/data/roadmaps/ai-red-teaming/content/api-protection@Tszl26iNBnQBdBEWOueDA.md
  7. 6
      src/data/roadmaps/ai-red-teaming/content/authentication@J7gjlt2MBx7lOkOnfGvPF.md
  8. 6
      src/data/roadmaps/ai-red-teaming/content/authentication@JQ3bR8odXJfd-1RCEf3-Q.md
  9. 6
      src/data/roadmaps/ai-red-teaming/content/automated-vs-manual@LVdYN9hyCyNPYn2Lz1y9b.md
  10. 6
      src/data/roadmaps/ai-red-teaming/content/benchmark-datasets@et1Xrr8ez-fmB0mAq8W_a.md
  11. 6
      src/data/roadmaps/ai-red-teaming/content/black-box-testing@0bApnJTt-Z2IUf0X3OCYf.md
  12. 6
      src/data/roadmaps/ai-red-teaming/content/code-injection@vhBu5x8INTtqvx6vcYAhE.md
  13. 8
      src/data/roadmaps/ai-red-teaming/content/conferences@LuKnmd9nSz9yLbTU_5Yp2.md
  14. 6
      src/data/roadmaps/ai-red-teaming/content/confidentiality-integrity-availability@WZkIHZkV2qDYbYF9KBBRi.md
  15. 6
      src/data/roadmaps/ai-red-teaming/content/continuous-monitoring@7Km0mFpHguHYPs5UhHTsM.md
  16. 6
      src/data/roadmaps/ai-red-teaming/content/continuous-testing@65Lo60JQS5YlvvQ6KevXt.md
  17. 8
      src/data/roadmaps/ai-red-teaming/content/countermeasures@G1u_Kq4NeUsGX2qnUTuJU.md
  18. 8
      src/data/roadmaps/ai-red-teaming/content/ctf-challenges@2Imb64Px3ZQcBpSQjdc_G.md
  19. 6
      src/data/roadmaps/ai-red-teaming/content/custom-testing-scripts@C1zO2xC0AqyV53p2YEPWg.md
  20. 8
      src/data/roadmaps/ai-red-teaming/content/data-poisoning@nD0_64ELEeJSN-0aZiR7i.md
  21. 6
      src/data/roadmaps/ai-red-teaming/content/direct@5zHow4KZVpfhch5Aabeft.md
  22. 6
      src/data/roadmaps/ai-red-teaming/content/emerging-threats@-G8v_CNa8wO_g-46_RFQo.md
  23. 8
      src/data/roadmaps/ai-red-teaming/content/ethical-considerations@1gyuEV519LjN-KpROoVwv.md
  24. 8
      src/data/roadmaps/ai-red-teaming/content/forums@Smncq-n1OlnLAY27AFQOO.md
  25. 6
      src/data/roadmaps/ai-red-teaming/content/generative-models@3XJ-g0KvHP75U18mxCqgw.md
  26. 6
      src/data/roadmaps/ai-red-teaming/content/grey-box-testing@ZVNAMCP68XKRXVxF2-hBc.md
  27. 6
      src/data/roadmaps/ai-red-teaming/content/indirect@3_gJRtJSdm2iAfkwmcv0e.md
  28. 4
      src/data/roadmaps/ai-red-teaming/content/industry-credentials@HHjsFR6wRDqUd66PMDE_7.md
  29. 8
      src/data/roadmaps/ai-red-teaming/content/industry-standards@WePO66_4-gNcSdE00WKmw.md
  30. 6
      src/data/roadmaps/ai-red-teaming/content/infrastructure-security@nhUKKWyBH80nyKfGT8ErC.md
  31. 8
      src/data/roadmaps/ai-red-teaming/content/insecure-deserialization@aKzai0A8J55-OBXTnQih1.md
  32. 8
      src/data/roadmaps/ai-red-teaming/content/introduction@HFJIYcI16OMyM77fAw9af.md
  33. 6
      src/data/roadmaps/ai-red-teaming/content/jailbreak-techniques@Ds8pqn4y9Npo7z6ubunvc.md
  34. 8
      src/data/roadmaps/ai-red-teaming/content/lab-environments@MmwwRK4I9aRH_ha7duPqf.md
  35. 6
      src/data/roadmaps/ai-red-teaming/content/large-language-models@8K-wCn2cLc7Vs_V4sC3sE.md
  36. 6
      src/data/roadmaps/ai-red-teaming/content/llm-security-testing@xJYTRbPxMn0Xs5ea0Ygn6.md
  37. 8
      src/data/roadmaps/ai-red-teaming/content/model-inversion@iE5PcswBHnu_EBFIacib0.md
  38. 6
      src/data/roadmaps/ai-red-teaming/content/model-vulnerabilities@uBXrri2bXVsNiM8fIHHOv.md
  39. 8
      src/data/roadmaps/ai-red-teaming/content/model-weight-stealing@QFzLx5nc4rCCD8WVc20mo.md
  40. 10
      src/data/roadmaps/ai-red-teaming/content/monitoring-solutions@59lkLcoqV4gq7f8Zm0X2p.md
  41. 6
      src/data/roadmaps/ai-red-teaming/content/neural-networks@RuKzVhd1nZphCrlW1wZGL.md
  42. 10
      src/data/roadmaps/ai-red-teaming/content/prompt-engineering@gx4KaFqKgJX9n9_ZGMqlZ.md
  43. 6
      src/data/roadmaps/ai-red-teaming/content/prompt-hacking@1Xr7mxVekeAHzTL7G4eAZ.md
  44. 10
      src/data/roadmaps/ai-red-teaming/content/prompt-injection@XOrAPDRhBvde9R-znEipH.md
  45. 6
      src/data/roadmaps/ai-red-teaming/content/red-team-simulations@DpYsL0du37n40toH33fIr.md
  46. 8
      src/data/roadmaps/ai-red-teaming/content/reinforcement-learning@Xqzc4mOKsVzwaUxLGjHya.md
  47. 6
      src/data/roadmaps/ai-red-teaming/content/remote-code-execution@kgDsDlBk8W2aM6LyWpFY8.md
  48. 6
      src/data/roadmaps/ai-red-teaming/content/reporting-tools@BLnfNlA0C4yzy1dvifjwx.md
  49. 8
      src/data/roadmaps/ai-red-teaming/content/research-groups@ZlR03pM-sqVFZNhD1gMSJ.md
  50. 6
      src/data/roadmaps/ai-red-teaming/content/research-opportunities@VmaIHVsCpq2um_0cA33V3.md
  51. 6
      src/data/roadmaps/ai-red-teaming/content/responsible-disclosure@KAcCZ3zcv25R6HwzAsfUG.md
  52. 6
      src/data/roadmaps/ai-red-teaming/content/risk-management@MupRvk_8Io2Hn7yEvU663.md
  53. 6
      src/data/roadmaps/ai-red-teaming/content/robust-model-design@6gEHMhh6BGJI-ZYN27YPW.md
  54. 6
      src/data/roadmaps/ai-red-teaming/content/role-of-red-teams@Irkc9DgBfqSn72WaJqXEt.md
  55. 6
      src/data/roadmaps/ai-red-teaming/content/safety-filter-bypasses@j7uLLpt8MkZ1rqM7UBPW4.md
  56. 8
      src/data/roadmaps/ai-red-teaming/content/specialized-courses@s1xKK8HL5-QGZpcutiuvj.md
  57. 6
      src/data/roadmaps/ai-red-teaming/content/supervised-learning@NvOJIv36Utpm7_kOZyr79.md
  58. 10
      src/data/roadmaps/ai-red-teaming/content/testing-platforms@c8n8FcYKDOgPLQvV9xF5J.md
  59. 8
      src/data/roadmaps/ai-red-teaming/content/threat-modeling@RDOaTBWP3aIJPUp_kcafm.md
  60. 6
      src/data/roadmaps/ai-red-teaming/content/unauthorized-access@DQeOavZCoXpF3k_qRDABs.md
  61. 4
      src/data/roadmaps/ai-red-teaming/content/unsupervised-learning@ZC0yKsu-CJC-LZKKo2pLD.md
  62. 6
      src/data/roadmaps/ai-red-teaming/content/vulnerability-assessment@887lc3tWCRH-sOHSxWgWJ.md
  63. 6
      src/data/roadmaps/ai-red-teaming/content/white-box-testing@Mrk_js5UVn4dRDw-Yco3Y.md

@ -4,6 +4,6 @@ The practice of AI Red Teaming itself will evolve. Future techniques may involve
Learn more from the following resources: Learn more from the following resources:
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems - DNV](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/) - Discusses applying red teaming to complex systems. - [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/)
- [@article@Advanced Techniques in AI Red Teaming for LLMs | NeuralTrust](https://neuraltrust.ai/blog/advanced-techniques-in-ai-red-teaming) - Discusses techniques like adversarial ML and automated threat intelligence for red teaming. - [@article@Advanced Techniques in AI Red Teaming for LLMs](https://neuraltrust.ai/blog/advanced-techniques-in-ai-red-teaming)
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning - arXiv](https://arxiv.org/html/2412.18693v1) - Research on using RL for more advanced automated red teaming. - [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning](https://arxiv.org/html/2412.18693v1)

@ -4,7 +4,7 @@ A core AI Red Teaming activity involves generating adversarial examples – inpu
Learn more from the following resources: Learn more from the following resources:
- [@article@Adversarial Examples Explained (OpenAI Blog)](https://openai.com/research/adversarial-examples) - Introduction by OpenAI. - [@article@Adversarial Examples Explained (OpenAI Blog)](https://openai.com/research/adversarial-examples)
- [@guide@Adversarial Examples – Interpretable Machine Learning Book](https://christophm.github.io/interpretable-ml-book/adversarial.html) - In-depth explanation and examples. - [@guide@Adversarial Examples – Interpretable Machine Learning Book](https://christophm.github.io/interpretable-ml-book/adversarial.html)
- [@guide@Adversarial Testing for Generative AI | Machine Learning - Google for Developers](https://developers.google.com/machine-learning/guides/adv-testing) - Google's guide on adversarial testing workflows. - [@guide@Adversarial Testing for Generative AI](https://developers.google.com/machine-learning/guides/adv-testing)
- [@video@How AI Can Be Tricked With Adversarial Attacks - Two Minute Papers](https://www.youtube.com/watch?v=J3X_JWQkvo8?v=MPcfoQBDY0w) - Short video demonstrating adversarial examples. - [@video@How AI Can Be Tricked With Adversarial Attacks](https://www.youtube.com/watch?v=J3X_JWQkvo8?v=MPcfoQBDY0w)

@ -4,6 +4,6 @@ AI Red Teamers evaluate the effectiveness of adversarial training as a defense.
Learn more from the following resources: Learn more from the following resources:
- [@article@Model Robustness: Building Reliable AI Models - Encord](https://encord.com/blog/model-robustness-machine-learning-strategies/) (Discusses adversarial robustness) - [@article@Model Robustness: Building Reliable AI Models](https://encord.com/blog/model-robustness-machine-learning-strategies/)
- [@guide@Adversarial Testing for Generative AI | Google for Developers](https://developers.google.com/machine-learning/guides/adv-testing) - Covers the concept as part of testing. - [@guide@Adversarial Testing for Generative AI](https://developers.google.com/machine-learning/guides/adv-testing)
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models - arXiv](https://arxiv.org/abs/2503.09302) (Mentions adversarial training as defense) - [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models](https://arxiv.org/abs/2503.09302)

@ -4,6 +4,6 @@ As AI agents capable of autonomous action become more common, AI Red Teamers mus
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Agents - Learn Prompting](https://learnprompting.org/docs/intermediate/ai_agents) (Background on agents) - [@article@AI Agents - Learn Prompting](https://learnprompting.org/docs/intermediate/ai_agents)
- [@article@Reasoning models don't always say what they think - Anthropic](https://www.anthropic.com/research/reasoning-models-dont-always-say-what-they-think) (Discusses agent alignment challenges) - [@article@Reasoning models don't always say what they think](https://www.anthropic.com/research/reasoning-models-dont-always-say-what-they-think)
- [@course@Certified AI Red Team Operator – Autonomous Systems (CAIRTO-AS) from Tonex, Inc.](https://niccs.cisa.gov/education-training/catalog/tonex-inc/certified-ai-red-team-operator-autonomous-systems-cairto) - Certification focusing on autonomous AI security. - [@course@Certified AI Red Team Operator – Autonomous Systems (CAIRTO-AS) from Tonex, Inc.](https://niccs.cisa.gov/education-training/catalog/tonex-inc/certified-ai-red-team-operator-autonomous-systems-cairto)

@ -4,6 +4,6 @@ This covers the foundational concepts essential for AI Red Teaming, bridging tra
Learn more from the following resources: Learn more from the following resources:
- [@article@Building Trustworthy AI: Contending with Data Poisoning - Nisos](https://nisos.com/research/building-trustworthy-ai/) - Explores data poisoning threats in AI/ML. - [@article@Building Trustworthy AI: Contending with Data Poisoning](https://nisos.com/research/building-trustworthy-ai/)
- [@article@What Is Adversarial AI in Machine Learning? - Palo Alto Networks](https://www.paloaltonetworks.co.uk/cyberpedia/what-are-adversarial-attacks-on-AI-Machine-Learning) - Overview of adversarial attacks targeting AI/ML systems. - [@article@What Is Adversarial AI in Machine Learning?](https://www.paloaltonetworks.co.uk/cyberpedia/what-are-adversarial-attacks-on-AI-Machine-Learning)
- [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security) - Foundational course covering AI risks, governance, security, and privacy. - [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security)

@ -4,7 +4,7 @@ AI Red Teamers rigorously test the security of APIs providing access to AI model
Learn more from the following resources: Learn more from the following resources:
- [@article@API Protection for AI Factories: The First Step to AI Security - F5](https://www.f5.com/company/blog/api-security-for-ai-factories) - Discusses the criticality of API security for AI applications. - [@article@API Protection for AI Factories: The First Step to AI Security](https://www.f5.com/company/blog/api-security-for-ai-factories)
- [@article@Securing APIs with AI for Advanced Threat Protection | Adeva](https://adevait.com/artificial-intelligence/securing-apis-with-ai) - Discusses using AI for API security, implies testing these is needed. - [@article@Securing APIs with AI for Advanced Threat Protection](https://adevait.com/artificial-intelligence/securing-apis-with-ai)
- [@article@Securing Machine Learning APIs (IBM)](https://developer.ibm.com/articles/se-securing-machine-learning-apis/) - Best practices for protecting ML APIs. - [@article@Securing Machine Learning APIs (IBM)](https://developer.ibm.com/articles/se-securing-machine-learning-apis/)
- [@guide@OWASP API Security Project (Top 10 2023)](https://owasp.org/www-project-api-security/) - Essential checklist for API vulnerabilities. - [@guide@OWASP API Security Project (Top 10 2023)](https://owasp.org/www-project-api-security/)

@ -4,6 +4,6 @@ AI Red Teamers test the authentication mechanisms controlling access to AI syste
Learn more from the following resources: Learn more from the following resources:
- [@article@Red-Teaming in AI Testing: Stress Testing - Labelvisor](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/) - Mentions testing authentication mechanisms in AI red teaming. - [@article@Red-Teaming in AI Testing: Stress Testing](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/)
- [@article@What is Authentication vs Authorization? - Auth0](https://auth0.com/intro-to-iam/authentication-vs-authorization) - Foundational explanation. - [@article@What is Authentication vs Authorization?](https://auth0.com/intro-to-iam/authentication-vs-authorization)
- [@video@How JWTs are used for Authentication (and how to bypass it) - LiveOverflow](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3Dexample_video_panel_url?v=3OpQi65s_ME) - Covers common web authentication bypass techniques relevant to APIs. - [@video@How JWTs are used for Authentication (and how to bypass it)](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3Dexample_video_panel_url?v=3OpQi65s_ME)

@ -4,6 +4,6 @@ AI Red Teamers test authorization controls to ensure that authenticated users ca
Learn more from the following resources: Learn more from the following resources:
- [@article@What is Authentication vs Authorization? - Auth0](https://auth0.com/intro-to-iam/authentication-vs-authorization) - Foundational explanation. - [@article@What is Authentication vs Authorization?](https://auth0.com/intro-to-iam/authentication-vs-authorization)
- [@guide@Identity and access management (IAM) fundamental concepts - Learn Microsoft](https://learn.microsoft.com/en-us/entra/fundamentals/identity-fundamental-concepts) - Explains roles and permissions. - [@guide@Identity and access management (IAM) fundamental concepts](https://learn.microsoft.com/en-us/entra/fundamentals/identity-fundamental-concepts)
- [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/) (Covers Broken Object Level/Function Level Authorization) - [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/)

@ -4,6 +4,6 @@ AI Red Teaming typically employs a blend of automated tools (for large-scale sca
Learn more from the following resources: Learn more from the following resources:
- [@article@Automation Testing vs. Manual Testing: Which is the better approach? - Opkey](https://www.opkey.com/blog/automation-testing-vs-manual-testing-which-is-better) - General comparison. - [@article@Automation Testing vs. Manual Testing: Which is the better approach?](https://www.opkey.com/blog/automation-testing-vs-manual-testing-which-is-better)
- [@article@Manual Testing vs Automated Testing: What's the Difference? - Leapwork](https://www.leapwork.com/blog/manual-vs-automated-testing) - General comparison. - [@article@Manual Testing vs Automated Testing: What's the Difference?](https://www.leapwork.com/blog/manual-vs-automated-testing)
- [@guide@LLM red teaming guide (open source) - Promptfoo](https://www.promptfoo.dev/docs/red-team/) - Discusses using both automated generation and human ingenuity for red teaming. - [@guide@LLM red teaming guide (open source)](https://www.promptfoo.dev/docs/red-team/)

@ -4,6 +4,6 @@ AI Red Teamers may use or contribute to benchmark datasets specifically designed
Learn more from the following resources: Learn more from the following resources:
- [@dataset@CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset - GitHub](https://github.com/cysecbench/dataset) - Dataset of cybersecurity prompts for benchmarking LLMs. - [@dataset@CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset](https://github.com/cysecbench/dataset)
- [@dataset@NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security](https://proceedings.neurips.cc/paper_files/paper/2024/hash/69d97a6493fbf016fff0a751f253ad18-Abstract-Datasets_and_Benchmarks_Track.html) - Using CTF challenges to evaluate LLMs. - [@dataset@NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security](https://proceedings.neurips.cc/paper_files/paper/2024/hash/69d97a6493fbf016fff0a751f253ad18-Abstract-Datasets_and_Benchmarks_Track.html)
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity - arXiv](https://arxiv.org/abs/2412.20787) - Benchmarking LLMs on cybersecurity tasks. - [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity](https://arxiv.org/abs/2412.20787)

@ -4,6 +4,6 @@ In AI Red Teaming, black-box testing involves probing the AI system with inputs
Learn more from the following resources: Learn more from the following resources:
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types. - [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@What is Black Box Testing | Techniques & Examples - Imperva](https://www.imperva.com/learn/application-security/black-box-testing/) - General explanation. - [@article@What is Black Box Testing](https://www.imperva.com/learn/application-security/black-box-testing/)
- [@guide@LLM red teaming guide (open source) - Promptfoo](https://www.promptfoo.dev/docs/red-team/) - Contrasts black-box and white-box approaches for LLM red teaming. - [@guide@LLM red teaming guide (open source)](https://www.promptfoo.dev/docs/red-team/)

@ -4,6 +4,6 @@ AI Red Teamers test for code injection vulnerabilities specifically in the conte
Learn more from the following resources: Learn more from the following resources:
- [@article@Code Injection in LLM Applications - NeuralTrust](https://neuraltrust.ai/blog/code-injection-in-llms) - Specifically discusses code injection risks involving LLMs. - [@article@Code Injection in LLM Applications](https://neuraltrust.ai/blog/code-injection-in-llms)
- [@docs@Secure Plugin Sandboxing (OpenAI Plugins)](https://platform.openai.com/docs/plugins/production/security-requirements) - Context on preventing code injection via AI plugins. - [@docs@Secure Plugin Sandboxing (OpenAI Plugins)](https://platform.openai.com/docs/plugins/production/security-requirements)
- [@guide@Code Injection - OWASP Foundation](https://owasp.org/www-community/attacks/Code_Injection) - Foundational knowledge on code injection attacks. - [@guide@Code Injection](https://owasp.org/www-community/attacks/Code_Injection)

@ -4,7 +4,7 @@ Attending major cybersecurity conferences (DEF CON, Black Hat, RSA) and increasi
Learn more from the following resources: Learn more from the following resources:
- [@conference@Black Hat Events](https://www.blackhat.com/) - Professional security conference with AI tracks. - [@conference@Black Hat Events](https://www.blackhat.com/)
- [@conference@DEF CON Hacking Conference](https://defcon.org/) - Major hacking conference with relevant villages/talks. - [@conference@DEF CON Hacking Conference](https://defcon.org/)
- [@conference@Global Conference on AI, Security and Ethics 2025 - UNIDIR](https://unidir.org/event/global-conference-on-ai-security-and-ethics-2025/) - Example of a specialized AI security/ethics conference. - [@conference@Global Conference on AI, Security and Ethics 2025](https://unidir.org/event/global-conference-on-ai-security-and-ethics-2025/)
- [@conference@RSA Conference](https://www.rsaconference.com/) - Large industry conference covering AI security. - [@conference@RSA Conference](https://www.rsaconference.com/)

@ -4,6 +4,6 @@ The CIA Triad is directly applicable in AI Red Teaming. Confidentiality tests fo
Learn more from the following resources: Learn more from the following resources:
- [@article@Confidentiality, Integrity, Availability: Key Examples - DataSunrise](https://www.datasunrise.com/knowledge-center/confidentiality-integrity-availability-examples/) - Explains CIA triad with examples, mentioning AI/ML relevance. - [@article@Confidentiality, Integrity, Availability: Key Examples](https://www.datasunrise.com/knowledge-center/confidentiality-integrity-availability-examples/)
- [@article@The CIA Triad: Confidentiality, Integrity, Availability - Veeam](https://www.veeam.com/blog/cybersecurity-cia-triad-explained.html) - Breakdown of the three principles and how they apply. - [@article@The CIA Triad: Confidentiality, Integrity, Availability](https://www.veeam.com/blog/cybersecurity-cia-triad-explained.html)
- [@article@What's The CIA Triad? Confidentiality, Integrity, & Availability, Explained | Splunk](https://www.splunk.com/en_us/blog/learn/cia-triad-confidentiality-integrity-availability.html) - Detailed explanation of the triad, mentioning modern updates and AI context. - [@article@What's The CIA Triad? Confidentiality, Integrity, & Availability, Explained](https://www.splunk.com/en_us/blog/learn/cia-triad-confidentiality-integrity-availability.html)

@ -4,6 +4,6 @@ AI Red Teamers assess the effectiveness of continuous monitoring systems by atte
Learn more from the following resources: Learn more from the following resources:
- [@article@Cyber Security Monitoring: 5 Key Components - BitSight Technologies](https://www.bitsight.com/blog/5-things-to-consider-building-continuous-security-monitoring-strategy) - Discusses key components of a monitoring strategy. - [@article@Cyber Security Monitoring: 5 Key Components](https://www.bitsight.com/blog/5-things-to-consider-building-continuous-security-monitoring-strategy)
- [@article@Cyber Security Monitoring: Definition and Best Practices - SentinelOne](https://www.sentinelone.com/cybersecurity-101/cybersecurity/cyber-security-monitoring/) - Overview of monitoring types and techniques. - [@article@Cyber Security Monitoring: Definition and Best Practices](https://www.sentinelone.com/cybersecurity-101/cybersecurity/cyber-security-monitoring/)
- [@article@Cybersecurity Monitoring: Definition, Tools & Best Practices - NordLayer](https://nordlayer.com/blog/cybersecurity-monitoring/) - General best practices adaptable to AI context. - [@article@Cybersecurity Monitoring: Definition, Tools & Best Practices](https://nordlayer.com/blog/cybersecurity-monitoring/)

@ -4,6 +4,6 @@ Applying continuous testing principles to AI security involves integrating autom
Learn more from the following resources: Learn more from the following resources:
- [@article@Continuous Automated Red Teaming (CART) - FireCompass](https://www.firecompass.com/continuous-automated-red-teaming/) - Explains the concept of CART. - [@article@Continuous Automated Red Teaming (CART)](https://www.firecompass.com/continuous-automated-red-teaming/)
- [@article@What is Continuous Penetration Testing? Process and Benefits - Qualysec Technologies](https://qualysec.com/continuous-penetration-testing/) - Related concept applied to pen testing. - [@article@What is Continuous Penetration Testing? Process and Benefits](https://qualysec.com/continuous-penetration-testing/)
- [@guide@What is Continuous Testing and How Does it Work? - Black Duck](https://www.blackduck.com/glossary/what-is-continuous-testing.html) - General definition and benefits. - [@guide@What is Continuous Testing and How Does it Work?](https://www.blackduck.com/glossary/what-is-continuous-testing.html)

@ -4,7 +4,7 @@ AI Red Teamers must also understand and test defenses against prompt hacking. Th
Learn more from the following resources: Learn more from the following resources:
- [@article@Mitigating Prompt Injection Attacks (NCC Group Research)](https://research.nccgroup.com/2023/12/01/mitigating-prompt-injection-attacks/) - Discusses various mitigation strategies and their effectiveness. - [@article@Mitigating Prompt Injection Attacks (NCC Group Research)](https://research.nccgroup.com/2023/12/01/mitigating-prompt-injection-attacks/)
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Includes discussion on best practices for prevention. - [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures - Tigera](https://www.tigera.io/learn/guides/llm-security/prompt-injection/) - Covers defensive measures. - [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures](https://www.tigera.io/learn/guides/llm-security/prompt-injection/)
- [@guide@OpenAI Best Practices for Prompt Security](https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions) - OpenAI’s recommendations to prevent prompt manipulation. - [@guide@OpenAI Best Practices for Prompt Security](https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions)

@ -4,7 +4,7 @@ Capture The Flag competitions increasingly include AI/ML security challenges. Pa
Learn more from the following resources: Learn more from the following resources:
- [@article@Capture the flag (cybersecurity) - Wikipedia](https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurity)) - Overview of CTFs. - [@article@Capture the flag (cybersecurity)](https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurity)
- [@article@Progress from our Frontier Red Team - Anthropic](https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team) - Mentions using CTFs (Cybench) for evaluating AI model security. - [@article@Progress from our Frontier Red Team](https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team)
- [@platform@CTFtime.org](https://ctftime.org/) - Global CTF event tracker. - [@platform@CTFtime.org](https://ctftime.org/)
- [@platform@picoCTF](https://picoctf.org/) - Beginner-friendly CTF platform. - [@platform@picoCTF](https://picoctf.org/)

@ -4,6 +4,6 @@ AI Red Teamers frequently write custom scripts (often in Python) to automate bes
Learn more from the following resources: Learn more from the following resources:
- [@guide@Python for Cybersecurity: Key Use Cases and Tools - Panther](https://panther.com/blog/python-for-cybersecurity-key-use-cases-and-tools) - Discusses Python's role in automation, pen testing, etc. - [@guide@Python for Cybersecurity: Key Use Cases and Tools](https://panther.com/blog/python-for-cybersecurity-key-use-cases-and-tools)
- [@guide@Python for cybersecurity: use cases, tools and best practices - SoftTeco](https://softteco.com/blog/python-for-cybersecurity) - Covers using Python for various security tasks. - [@guide@Python for cybersecurity: use cases, tools and best practices](https://softteco.com/blog/python-for-cybersecurity)
- [@tool@Scapy](https://scapy.net/) - Powerful Python library for packet manipulation. - [@tool@Scapy](https://scapy.net/)

@ -4,7 +4,7 @@ AI Red Teamers simulate data poisoning attacks by evaluating how introducing man
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Poisoning - Is It Really A Threat? - AIBlade](https://www.aiblade.net/p/ai-poisoning-is-it-really-a-threat) - Detailed exploration of data poisoning attacks and impacts. - [@article@AI Poisoning](https://www.aiblade.net/p/ai-poisoning-is-it-really-a-threat)
- [@article@Data Poisoning Attacks in ML (Towards Data Science)](https://towardsdatascience.com/data-poisoning-attacks-in-machine-learning-542169587b7f) - Overview of techniques. - [@article@Data Poisoning Attacks in ML (Towards Data Science)](https://towardsdatascience.com/data-poisoning-attacks-in-machine-learning-542169587b7f)
- [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models - arXiv](https://arxiv.org/abs/2503.09302) - Research on detection and prevention techniques. - [@paper@Detecting and Preventing Data Poisoning Attacks on AI Models](https://arxiv.org/abs/2503.09302)
- [@paper@Poisoning Web-Scale Training Data (arXiv)](https://arxiv.org/abs/2310.12818) - Analysis of poisoning risks in large datasets used for LLMs. - [@paper@Poisoning Web-Scale Training Data (arXiv)](https://arxiv.org/abs/2310.12818)

@ -4,6 +4,6 @@ Direct injection attacks occur when malicious instructions are inserted directly
Learn more from the following resources: Learn more from the following resources:
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Differentiates attack types. - [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection Cheat Sheet (FlowGPT)](https://flowgpt.com/p/prompt-injection-cheat-sheet) - Collection of prompt injection examples often used in direct attacks. - [@article@Prompt Injection Cheat Sheet (FlowGPT)](https://flowgpt.com/p/prompt-injection-cheat-sheet)
- [@report@OpenAI GPT-4 System Card](https://openai.com/research/gpt-4-system-card) - Sections discuss how direct prompt attacks were tested during GPT-4 development. - [@report@OpenAI GPT-4 System Card](https://openai.com/research/gpt-4-system-card)

@ -4,6 +4,6 @@ AI Red Teamers must stay informed about potential future threats enabled by more
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Security Risks Uncovered: What You Must Know in 2025 - TTMS](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/) - Discusses future AI-driven cyberattacks. - [@article@AI Security Risks Uncovered: What You Must Know in 2025](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/)
- [@article@Why Artificial Intelligence is the Future of Cybersecurity - Darktrace](https://www.darktrace.com/blog/why-artificial-intelligence-is-the-future-of-cybersecurity) - Covers AI misuse and the future threat landscape. - [@article@Why Artificial Intelligence is the Future of Cybersecurity](https://www.darktrace.com/blog/why-artificial-intelligence-is-the-future-of-cybersecurity)
- [@report@AI Index 2024 - Stanford University](https://aiindex.stanford.edu/report/) - Annual report tracking AI capabilities and societal implications, including risks. - [@report@AI Index 2024](https://aiindex.stanford.edu/report/)

@ -4,7 +4,7 @@ Ethical conduct is crucial for AI Red Teamers. While simulating attacks, they mu
Learn more from the following resources: Learn more from the following resources:
- [@article@Red-Teaming in AI Testing: Stress Testing - Labelvisor](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/) - Mentions balancing attack simulation with ethical constraints. - [@article@Red-Teaming in AI Testing: Stress Testing](https://www.labelvisor.com/red-teaming-abstract-competitive-testing-data-selection/)
- [@article@Responsible AI assessment - Responsible AI | Coursera](https://www.coursera.org/learn/ai-security) (Module within AI Security course) - [@article@Responsible AI assessment - Responsible AI | Coursera](https://www.coursera.org/learn/ai-security)
- [@guide@Responsible AI Principles (Microsoft)](https://www.microsoft.com/en-us/ai/responsible-ai) - Example of corporate responsible AI guidelines influencing ethical testing. - [@guide@Responsible AI Principles (Microsoft)](https://www.microsoft.com/en-us/ai/responsible-ai)
- [@video@Questions to Guide AI Red-Teaming (CMU SEI)](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=928382) - Key questions and ethical guidelines for AI red teaming activities (video talk). - [@video@Questions to Guide AI Red-Teaming (CMU SEI)](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=928382)

@ -4,7 +4,7 @@ Engaging in online forums, mailing lists, Discord servers, or subreddits dedicat
Learn more from the following resources: Learn more from the following resources:
- [@community@List of Cybersecurity Discord Servers - DFIR Training](https://www.dfir.training/dfir-groups/discord?category[0]=17&category_children=1) - List including relevant servers. - [@community@List of Cybersecurity Discord Servers](https://www.dfir.training/dfir-groups/discord?category[0]=17&category_children=1)
- [@community@Reddit - r/MachineLearning](https://www.reddit.com/r/MachineLearning/) - ML specific discussion. - [@community@Reddit - r/MachineLearning](https://www.reddit.com/r/MachineLearning/)
- [@community@Reddit - r/artificial](https://www.reddit.com/r/artificial/) - General AI discussion. - [@community@Reddit - r/artificial](https://www.reddit.com/r/artificial/)
- [@community@Reddit - r/cybersecurity](https://www.reddit.com/r/cybersecurity/) - General cybersecurity forum. - [@community@Reddit - r/cybersecurity](https://www.reddit.com/r/cybersecurity/)

@ -4,6 +4,6 @@ AI Red Teamers focus heavily on generative models (like GANs and LLMs) due to th
Learn more from the following resources: Learn more from the following resources:
- [@article@An Introduction to Generative Models | MongoDB](https://www.mongodb.com/resources/basics/artificial-intelligence/generative-models) - Explains basics and contrasts with discriminative models. - [@article@An Introduction to Generative Models](https://www.mongodb.com/resources/basics/artificial-intelligence/generative-models)
- [@course@Generative AI for Beginners - Microsoft Open Source](https://microsoft.github.io/generative-ai-for-beginners/) - Free course covering fundamentals. - [@course@Generative AI for Beginners](https://microsoft.github.io/generative-ai-for-beginners/)
- [@guide@Generative AI beginner's guide | Generative AI on Vertex AI - Google Cloud](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview) - Overview covering generative AI concepts and Google's platform context. - [@guide@Generative AI beginner's guide](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/overview)

@ -4,6 +4,6 @@ Grey-box AI Red Teaming involves testing with partial knowledge of the system, s
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Transparency: Connecting AI Red Teaming and Compliance | SplxAI Blog](https://splx.ai/blog/ai-transparency-connecting-ai-red-teaming-and-compliance) - Discusses the value of moving towards gray-box testing in AI. - [@article@AI Transparency: Connecting AI Red Teaming and Compliance](https://splx.ai/blog/ai-transparency-connecting-ai-red-teaming-and-compliance)
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types. - [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@Understanding Black Box, White Box, and Grey Box Testing - Frugal Testing](https://www.frugaltesting.com/blog/understanding-black-box-white-box-and-grey-box-testing-in-software-testing) - General definitions. - [@article@Understanding Black Box, White Box, and Grey Box Testing](https://www.frugaltesting.com/blog/understanding-black-box-white-box-and-grey-box-testing-in-software-testing)

@ -4,6 +4,6 @@ Indirect injection involves embedding malicious prompts within external data sou
Learn more from the following resources: Learn more from the following resources:
- [@paper@The Practical Application of Indirect Prompt Injection Attacks - David Willis-Owen](https://www.researchgate.net/publication/382692833_The_Practical_Application_of_Indirect_Prompt_Injection_Attacks_From_Academia_to_Industry) - Discusses a standard methodology to test for indirect injection attacks. - [@paper@The Practical Application of Indirect Prompt Injection Attacks](https://www.researchgate.net/publication/382692833_The_Practical_Application_of_Indirect_Prompt_Injection_Attacks_From_Academia_to_Industry)
- [@article@How to Prevent Indirect Prompt Injection Attacks - Cobalt](https://www.cobalt.io/blog/how-to-prevent-indirect-prompt-injection-attacks) - Explains indirect injection via external sources and mitigation. - [@article@How to Prevent Indirect Prompt Injection Attacks](https://www.cobalt.io/blog/how-to-prevent-indirect-prompt-injection-attacks)
- [@article@Jailbreaks via Indirect Injection (Practical AI Safety Newsletter)](https://newsletter.practicalai.safety/p/jailbreaks-via-indirect-injection) - Examples of indirect prompt injection impacting LLM agents. - [@article@Jailbreaks via Indirect Injection (Practical AI Safety Newsletter)](https://newsletter.practicalai.safety/p/jailbreaks-via-indirect-injection)

@ -4,5 +4,5 @@ Beyond formal certifications, recognition in the AI Red Teaming field comes from
Learn more from the following resources: Learn more from the following resources:
- [@community@DEF CON - Wikipedia (Mentions Black Badge)](https://en.wikipedia.org/wiki/DEF_CON#Black_Badge) - Example of a high-prestige credential from CTFs. - [@community@DEF CON - Wikipedia (Mentions Black Badge)](https://en.wikipedia.org/wiki/DEF_CON#Black_Badge)
- [@community@HackAPrompt (Learn Prompting)](https://learnprompting.org/hackaprompt) - Example of a major AI Red Teaming competition. - [@community@HackAPrompt (Learn Prompting)](https://learnprompting.org/hackaprompt)

@ -4,7 +4,7 @@ As AI matures, AI Red Teamers will increasingly need to understand and test agai
Learn more from the following resources: Learn more from the following resources:
- [@article@ISO 42001: The New Compliance Standard for AI Management Systems - Bright Defense](https://www.brightdefense.com/resources/iso-42001-compliance/) - Overview of ISO 42001 requirements. - [@article@ISO 42001: The New Compliance Standard for AI Management Systems](https://www.brightdefense.com/resources/iso-42001-compliance/)
- [@article@ISO 42001: What it is & why it matters for AI management - IT Governance](https://www.itgovernance.co.uk/iso-42001) - Explanation of the standard. - [@article@ISO 42001: What it is & why it matters for AI management](https://www.itgovernance.co.uk/iso-42001)
- [@framework@NIST AI Risk Management Framework (AI RMF)](https://www.nist.gov/itl/ai-risk-management-framework) - Voluntary framework gaining wide adoption. - [@framework@NIST AI Risk Management Framework (AI RMF)](https://www.nist.gov/itl/ai-risk-management-framework)
- [@standard@ISO/IEC 42001: Information technology — Artificial intelligence — Management system](https://www.iso.org/standard/81230.html) - International standard for AI management systems. - [@standard@ISO/IEC 42001: Information technology — Artificial intelligence — Management system](https://www.iso.org/standard/81230.html)

@ -4,6 +4,6 @@ AI Red Teamers assess the security posture of the infrastructure hosting AI mode
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Infrastructure Attacks (VentureBeat)](https://venturebeat.com/ai/understanding-ai-infrastructure-attacks/) - Discussion of attacks targeting AI infrastructure. - [@article@AI Infrastructure Attacks (VentureBeat)](https://venturebeat.com/ai/understanding-ai-infrastructure-attacks/)
- [@guide@Network Infrastructure Security - Best Practices and Strategies - DataGuard](https://www.dataguard.com/blog/network-infrastructure-security-best-practices-and-strategies/) - General infra security practices applicable here. - [@guide@Network Infrastructure Security - Best Practices and Strategies](https://www.dataguard.com/blog/network-infrastructure-security-best-practices-and-strategies/)
- [@guide@Secure Deployment of ML Systems (NIST)](https://csrc.nist.gov/publications/detail/sp/800-218/final) - Guidelines including infrastructure security for ML. - [@guide@Secure Deployment of ML Systems (NIST)](https://csrc.nist.gov/publications/detail/sp/800-218/final)

@ -4,7 +4,7 @@ AI Red Teamers investigate if serialized objects used by the AI system (e.g., fo
Learn more from the following resources: Learn more from the following resources:
- [@article@Lightboard Lessons: OWASP Top 10 - Insecure Deserialization - DevCentral](https://community.f5.com/kb/technicalarticles/lightboard-lessons-owasp-top-10---insecure-deserialization/281509) - Video explanation. - [@article@Lightboard Lessons: OWASP Top 10 - Insecure Deserialization](https://community.f5.com/kb/technicalarticles/lightboard-lessons-owasp-top-10---insecure-deserialization/281509)
- [@article@How Hugging Face Was Ethically Hacked](https://www.aiblade.net/p/how-hugging-face-was-ethically-hacked) - Hugging Face deserialization case study. - [@article@How Hugging Face Was Ethically Hacked](https://www.aiblade.net/p/how-hugging-face-was-ethically-hacked)
- [@article@OWASP TOP 10: Insecure Deserialization - Detectify Blog](https://blog.detectify.com/best-practices/owasp-top-10-insecure-deserialization/) - Overview within OWASP Top 10 context. - [@article@OWASP TOP 10: Insecure Deserialization](https://blog.detectify.com/best-practices/owasp-top-10-insecure-deserialization/)
- [@guide@Insecure Deserialization - OWASP Foundation](https://owasp.org/www-community/vulnerabilities/Insecure_Deserialization) - Core explanation of the vulnerability. - [@guide@Insecure Deserialization](https://owasp.org/www-community/vulnerabilities/Insecure_Deserialization)

@ -4,7 +4,7 @@ AI Red Teaming is the practice of simulating adversarial attacks against AI syst
Learn more from the following resources: Learn more from the following resources:
- [@article@A Guide to AI Red Teaming - HiddenLayer](https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/) - Discusses AI red teaming concepts and contrasts with traditional methods. - [@article@A Guide to AI Red Teaming](https://hiddenlayer.com/innovation-hub/a-guide-to-ai-red-teaming/)
- [@article@What is AI Red Teaming? (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming) - Overview of AI red teaming, its history, and key challenges. - [@article@What is AI Red Teaming? (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming)
- [@article@What is AI Red Teaming? The Complete Guide - Mindgard](https://mindgard.ai/blog/what-is-ai-red-teaming) - Guide covering AI red teaming processes, use cases, and benefits. - [@article@What is AI Red Teaming? The Complete Guide](https://mindgard.ai/blog/what-is-ai-red-teaming)
- [@podcast@Red Team Podcast | AI Red Teaming Insights & Defense Strategies - Mindgard](https://mindgard.ai/podcast/red-team) - Podcast series covering AI red teaming trends and strategies. - [@podcast@Red Team Podcast - AI Red Teaming Insights & Defense Strategies](https://mindgard.ai/podcast/red-team)

@ -4,6 +4,6 @@ Jailbreaking is a specific category of prompt hacking where the AI Red Teamer ai
Learn more from the following resources: Learn more from the following resources:
- [@article@InjectPrompt (David Willis-Owen)](https://injectprompt.com) - Discusses jailbreaks for several LLMs - [@article@InjectPrompt (David Willis-Owen)](https://injectprompt.com)
- [@guide@Prompt Hacking Guide - Learn Prompting](https://learnprompting.org/docs/category/prompt-hacking) - Covers jailbreaking strategies. - [@guide@Prompt Hacking Guide - Learn Prompting](https://learnprompting.org/docs/category/prompt-hacking)
- [@paper@Jailbroken: How Does LLM Safety Training Fail? (arXiv)](https://arxiv.org/abs/2307.02483) - Research analyzing jailbreak failures. - [@paper@Jailbroken: How Does LLM Safety Training Fail? (arXiv)](https://arxiv.org/abs/2307.02483)

@ -4,7 +4,7 @@ AI Red Teamers need environments to practice attacking vulnerable systems safely
Learn more from the following resources: Learn more from the following resources:
- [@platform@Gandalf AI Prompt Injection Lab](https://gandalf.lakera.ai/) - A popular web-based lab for prompt injection practice. - [@platform@Gandalf AI Prompt Injection Lab](https://gandalf.lakera.ai/)
- [@platform@Hack The Box: Hacking Labs](https://www.hackthebox.com/hacker/hacking-labs) - General pentesting labs. - [@platform@Hack The Box: Hacking Labs](https://www.hackthebox.com/hacker/hacking-labs)
- [@platform@TryHackMe: Learn Cyber Security](https://tryhackme.com/) - Gamified cybersecurity training labs. - [@platform@TryHackMe: Learn Cyber Security](https://tryhackme.com/)
- [@platform@VulnHub](https://www.vulnhub.com/) - Provides vulnerable VM images for practice. - [@platform@VulnHub](https://www.vulnhub.com/)

@ -4,6 +4,6 @@ LLMs are a primary target for AI Red Teaming. Understanding their architecture (
Learn more from the following resources: Learn more from the following resources:
- [@article@What is an LLM (large language model)? - Cloudflare](https://www.cloudflare.com/learning/ai/what-is-large-language-model/) - Concise explanation from Cloudflare. - [@article@What is an LLM (large language model)?](https://www.cloudflare.com/learning/ai/what-is-large-language-model/)
- [@guide@Introduction to Large Language Models - Learn Prompting](https://learnprompting.org/docs/intro_to_llms) - Learn Prompting's introduction. - [@guide@Introduction to LLMs - Learn Prompting](https://learnprompting.org/docs/intro_to_llms)
- [@guide@What Are Large Language Models? A Beginner's Guide for 2025 - KDnuggets](https://www.kdnuggets.com/large-language-models-beginners-guide-2025) - Overview of LLMs, how they work, strengths, and limitations. - [@guide@What Are Large Language Models? A Beginner's Guide for 2025](https://www.kdnuggets.com/large-language-models-beginners-guide-2025)

@ -4,6 +4,6 @@ The core application area for many AI Red Teamers today involves specifically te
Learn more from the following resources: Learn more from the following resources:
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses) - Courses focused on testing LLMs. - [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses)
- [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity - arXiv](https://arxiv.org/abs/2412.20787) - Dataset for evaluating LLMs on security tasks. - [@dataset@SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity](https://arxiv.org/abs/2412.20787)
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts) - Guide specifically on red teaming LLMs. - [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts)

@ -4,7 +4,7 @@ AI Red Teamers perform model inversion tests to assess if an attacker can recons
Learn more from the following resources: Learn more from the following resources:
- [@article@Model Inversion Attacks for ML (Medium)](https://medium.com/@ODSC/model-inversion-attacks-for-machine-learning-ff407a1b10d1) - Explanation with examples (e.g., face reconstruction). - [@article@Model Inversion Attacks for ML (Medium)](https://medium.com/@ODSC/model-inversion-attacks-for-machine-learning-ff407a1b10d1)
- [@article@Model inversion and membership inference: Understanding new AI security risks - Hogan Lovells](https://www.hoganlovells.com/en/publications/model-inversion-and-membership-inference-understanding-new-ai-security-risks-and-mitigating-vulnerabilities) - Discusses risks and mitigation. - [@article@Model inversion and membership inference: Understanding new AI security risks](https://www.hoganlovells.com/en/publications/model-inversion-and-membership-inference-understanding-new-ai-security-risks-and-mitigating-vulnerabilities)
- [@paper@Extracting Training Data from LLMs (arXiv)](https://arxiv.org/abs/2012.07805) - Research demonstrating feasibility on LLMs. - [@paper@Extracting Training Data from LLMs (arXiv)](https://arxiv.org/abs/2012.07805)
- [@paper@Model Inversion Attacks: A Survey of Approaches and Countermeasures - arXiv](https://arxiv.org/html/2411.10023v1) - Comprehensive survey of model inversion attacks and defenses. - [@paper@Model Inversion Attacks: A Survey of Approaches and Countermeasures](https://arxiv.org/html/2411.10023v1)

@ -4,6 +4,6 @@ This category covers attacks and tests targeting the AI model itself, beyond the
Learn more from the following resources: Learn more from the following resources:
- [@article@AI Security Risks Uncovered: What You Must Know in 2025 - TTMS](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/) - Discusses adversarial attacks, data poisoning, and prototype theft. - [@article@AI Security Risks Uncovered: What You Must Know in 2025](https://ttms.com/uk/ai-security-risks-explained-what-you-need-to-know-in-2025/)
- [@article@Attacking AI Models (Trail of Bits Blog Series)](https://blog.trailofbits.com/category/ai-security/) - Series discussing model-focused attacks. - [@article@Attacking AI Models (Trail of Bits Blog Series)](https://blog.trailofbits.com/category/ai-security/)
- [@report@AI and ML Vulnerabilities (CNAS Report)](https://www.cnas.org/publications/reports/understanding-and-mitigating-ai-vulnerabilities) - Overview of known machine learning vulnerabilities. - [@report@AI and ML Vulnerabilities (CNAS Report)](https://www.cnas.org/publications/reports/understanding-and-mitigating-ai-vulnerabilities)

@ -4,7 +4,7 @@ AI Red Teamers assess the risk of attackers reconstructing or stealing the propr
Learn more from the following resources: Learn more from the following resources:
- [@article@A Playbook for Securing AI Model Weights - RAND](https://www.rand.org/pubs/research_briefs/RBA2849-1.html) - Discusses attack vectors and security levels for protecting model weights. - [@article@A Playbook for Securing AI Model Weights](https://www.rand.org/pubs/research_briefs/RBA2849-1.html)
- [@article@How to Steal a Machine Learning Model (SkyCryptor)](https://skycryptor.com/blog/how-to-steal-a-machine-learning-model) - Explains model weight extraction via query attacks. - [@article@How to Steal a Machine Learning Model (SkyCryptor)](https://skycryptor.com/blog/how-to-steal-a-machine-learning-model)
- [@paper@Defense Against Model Stealing (Microsoft Research)](https://www.microsoft.com/en-us/research/publication/defense-against-model-stealing-attacks/) - Research on detecting and defending against model stealing. - [@paper@Defense Against Model Stealing (Microsoft Research)](https://www.microsoft.com/en-us/research/publication/defense-against-model-stealing-attacks/)
- [@paper@On the Limitations of Model Stealing with Uncertainty Quantification Models - OpenReview](https://openreview.net/pdf?id=ONRFHoUzNk) - Research exploring model stealing techniques. - [@paper@On the Limitations of Model Stealing with Uncertainty Quantification Models](https://openreview.net/pdf?id=ONRFHoUzNk)

@ -4,8 +4,8 @@ AI Red Teamers interact with monitoring tools primarily to test their effectiven
Learn more from the following resources: Learn more from the following resources:
- [@article@Open Source IDS Tools: Comparing Suricata, Snort, Bro (Zeek), Linux - LevelBlue](https://levelblue.com/blogs/security-essentials/open-source-intrusion-detection-tools-a-quick-overview) - Comparison of common open source monitoring tools. - [@article@Open Source IDS Tools: Comparing Suricata, Snort, Bro (Zeek), Linux](https://levelblue.com/blogs/security-essentials/open-source-intrusion-detection-tools-a-quick-overview)
- [@tool@Snort](https://www.snort.org/) - Open source IDS/IPS. - [@tool@Snort](https://www.snort.org/)
- [@tool@Suricata](https://suricata.io/) - Open source IDS/IPS/NSM. - [@tool@Suricata](https://suricata.io/)
- [@tool@Wireshark](https://www.wireshark.org/) - Network protocol analyzer. - [@tool@Wireshark](https://www.wireshark.org/)
- [@tool@Zeek (formerly Bro)](https://zeek.org/) - Network security monitoring framework. - [@tool@Zeek (formerly Bro)](https://zeek.org/)

@ -4,6 +4,6 @@ Understanding neural network architectures (layers, nodes, activation functions)
Learn more from the following resources: Learn more from the following resources:
- [@guide@Neural Networks Explained: A Beginner's Guide - SkillCamper](https://www.skillcamper.com/blog/neural-networks-explained-a-beginners-guide) - Foundational guide. - [@guide@Neural Networks Explained: A Beginner's Guide](https://www.skillcamper.com/blog/neural-networks-explained-a-beginners-guide)
- [@guide@Neural networks | Machine Learning - Google for Developers](https://developers.google.com/machine-learning/crash-course/neural-networks) - Google's explanation within their ML crash course. - [@guide@Neural networks | Machine Learning](https://developers.google.com/machine-learning/crash-course/neural-networks)
- [@paper@Red Teaming with Artificial Intelligence-Driven Cyberattacks: A Scoping Review - arXiv](https://arxiv.org/html/2503.19626) - Review discussing AI methods like neural networks used in red teaming simulations. - [@paper@Red Teaming with Artificial Intelligence-Driven Cyberattacks: A Scoping Review](https://arxiv.org/html/2503.19626)

@ -4,8 +4,8 @@ For AI Red Teamers, prompt engineering is both a tool and a target. It's a tool
Learn more from the following resources: Learn more from the following resources:
- [@article@Introduction to Prompt Engineering - Datacamp](https://www.datacamp.com/tutorial/introduction-prompt-engineering) - Tutorial covering basics. - [@article@Introduction to Prompt Engineering](https://www.datacamp.com/tutorial/introduction-prompt-engineering)
- [@article@System Prompts - InjectPrompt](https://www.injectprompt.com/t/system-prompts) - Look at the system prompts of flagship LLMs. - [@article@System Prompts - InjectPrompt](https://www.injectprompt.com/t/system-prompts)
- [@course@Introduction to Prompt Engineering - Learn Prompting](https://learnprompting.org/courses/intro-to-prompt-engineering) - Foundational course from Learn Prompting. - [@course@Introduction to Prompt Engineering](https://learnprompting.org/courses/intro-to-prompt-engineering)
- [@guide@Prompt Engineering Guide - Learn Prompting](https://learnprompting.org/docs/prompt-engineering) - Comprehensive guide from Learn Prompting. - [@guide@Prompt Engineering Guide](https://learnprompting.org/docs/prompt-engineering)
- [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts) - Connects prompt engineering directly to LLM red teaming concepts. - [@guide@The Ultimate Guide to Red Teaming LLMs and Adversarial Prompts (Kili Technology)](https://kili-technology.com/large-language-models-llms/red-teaming-llms-and-adversarial-prompts)

@ -4,6 +4,6 @@ Prompt hacking is a core technique for AI Red Teamers targeting LLMs. It involve
Learn more from the following resources: Learn more from the following resources:
- [@course@Introduction to Prompt Hacking - Learn Prompting](https://learnprompting.org/courses/intro-to-prompt-hacking) - Free introductory course. - [@course@Introduction to Prompt Hacking](https://learnprompting.org/courses/intro-to-prompt-hacking)
- [@guide@Prompt Hacking Guide - Learn Prompting](https://learnprompting.org/docs/category/prompt-hacking) - Detailed guide covering techniques. - [@guide@Prompt Hacking Guide](https://learnprompting.org/docs/category/prompt-hacking)
- [@paper@SoK: Prompt Hacking of LLMs (arXiv 2023)](https://arxiv.org/abs/2311.05544) - Comprehensive research overview of prompt hacking types and techniques. - [@paper@SoK: Prompt Hacking of LLMs (arXiv 2023)](https://arxiv.org/abs/2311.05544)

@ -4,8 +4,8 @@ Prompt injection is a critical vulnerability tested by AI Red Teamers. They atte
Learn more from the following resources: Learn more from the following resources:
- [@article@Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera](https://www.lakera.ai/blog/guide-to-prompt-injection) - Guide covering different types of prompt attacks. - [@article@Prompt Injection & the Rise of Prompt Attacks](https://www.lakera.ai/blog/guide-to-prompt-injection)
- [@article@Prompt Injection (Learn Prompting)](https://learnprompting.org/docs/prompt_hacking/injection) - Learn Prompting article describing prompt injection with examples and mitigation strategies. - [@article@Prompt Injection (Learn Prompting)](https://learnprompting.org/docs/prompt_hacking/injection)
- [@article@Prompt Injection Attack Explanation (IBM)](https://research.ibm.com/blog/prompt-injection-attacks-against-llms) - Explains what prompt injections are and how they work. - [@article@Prompt Injection Attack Explanation (IBM)](https://research.ibm.com/blog/prompt-injection-attacks-against-llms)
- [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures - Tigera](https://www.tigera.io/learn/guides/llm-security/prompt-injection/) - Overview of impact and defenses. - [@article@Prompt Injection: Impact, How It Works & 4 Defense Measures](https://www.tigera.io/learn/guides/llm-security/prompt-injection/)
- [@course@Advanced Prompt Hacking - Learn Prompting](https://learnprompting.org/courses/advanced-prompt-hacking) - Covers advanced injection techniques. - [@course@Advanced Prompt Hacking - Learn Prompting](https://learnprompting.org/courses/advanced-prompt-hacking)

@ -4,6 +4,6 @@ Participating in or conducting structured red team simulations against AI system
Learn more from the following resources: Learn more from the following resources:
- [@guide@A Simple Guide to Successful Red Teaming - Cobalt Strike](https://www.cobaltstrike.com/resources/guides/a-simple-guide-to-successful-red-teaming) - General guide adaptable to AI context. - [@guide@A Simple Guide to Successful Red Teaming](https://www.cobaltstrike.com/resources/guides/a-simple-guide-to-successful-red-teaming)
- [@guide@The Complete Guide to Red Teaming: Process, Benefits & More - Mindgard AI](https://mindgard.ai/blog/red-teaming) - Overview of red teaming process. - [@guide@The Complete Guide to Red Teaming: Process, Benefits & More](https://mindgard.ai/blog/red-teaming)
- [@guide@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist) - Checklist for planning engagements. - [@guide@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist)

@ -4,7 +4,7 @@ Red teaming RL-based AI systems involves testing for vulnerabilities such as rew
Learn more from the following resources: Learn more from the following resources:
- [@article@Best Resources to Learn Reinforcement Learning - Towards Data Science](https://towardsdatascience.com/best-free-courses-and-resources-to-learn-reinforcement-learning-ed6633608cb2/) - Curated list of RL learning resources. - [@article@Resources to Learn Reinforcement Learning](https://towardsdatascience.com/best-free-courses-and-resources-to-learn-reinforcement-learning-ed6633608cb2/)
- [@article@What is reinforcement learning? - Blog - York Online Masters degrees](https://online.york.ac.uk/resources/what-is-reinforcement-learning/) - Foundational explanation. - [@article@What is reinforcement learning?](https://online.york.ac.uk/resources/what-is-reinforcement-learning/)
- [@course@Deep Reinforcement Learning Course by HuggingFace](https://huggingface.co/learn/deep-rl-course/unit0/introduction) - Comprehensive free course on Deep RL. - [@course@Deep Reinforcement Learning Course by HuggingFace](https://huggingface.co/learn/deep-rl-course/unit0/introduction)
- [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning - arXiv](https://arxiv.org/html/2412.18693v1) - Research on using RL for red teaming and generating attacks. - [@paper@Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning](https://arxiv.org/html/2412.18693v1)

@ -4,6 +4,6 @@ AI Red Teamers attempt to achieve RCE on systems hosting or interacting with AI
Learn more from the following resources: Learn more from the following resources:
- [@article@Exploiting LLMs with Code Execution (GitHub Gist)](https://gist.github.com/coolaj86/6f4f7b30129b0251f61fa7baaa881516) - Example of achieving code execution via LLM manipulation. - [@article@Exploiting LLMs with Code Execution (GitHub Gist)](https://gist.github.com/coolaj86/6f4f7b30129b0251f61fa7baaa881516)
- [@article@What is remote code execution? - Cloudflare](https://www.cloudflare.com/learning/security/what-is-remote-code-execution/) - Definition and explanation of RCE. - [@article@What is remote code execution?](https://www.cloudflare.com/learning/security/what-is-remote-code-execution/)
- [@video@DEFCON 31 - AI Village - Hacking an LLM embedded system (agent) - Johann Rehberger](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3D6u04C1N69ks?v=1FfYnF2GXVU) - Demonstrates RCE risks with LLM agents. - [@video@DEFCON 31 - AI Village - Hacking an LLM embedded system (agent) - Johann Rehberger](https://www.google.com/search?q=https://www.youtube.com/watch%3Fv%3D6u04C1N69ks?v=1FfYnF2GXVU)

@ -4,6 +4,6 @@ AI Red Teamers use reporting techniques and potentially tools to clearly documen
Learn more from the following resources: Learn more from the following resources:
- [@article@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist) (Mentions reporting and templates) - [@article@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist)
- [@guide@Penetration Testing Report: 6 Key Sections and 4 Best Practices - Bright Security](https://brightsec.com/blog/penetration-testing-report/) - General best practices for reporting security findings. - [@guide@Penetration Testing Report: 6 Key Sections and 4 Best Practices](https://brightsec.com/blog/penetration-testing-report/)
- [@guide@Penetration testing best practices: Strategies for all test types - Strike Graph](https://www.strikegraph.com/blog/pen-testing-best-practices) - Includes tips on documentation. - [@guide@Penetration testing best practices: Strategies for all test types](https://www.strikegraph.com/blog/pen-testing-best-practices)

@ -4,7 +4,7 @@ Following and potentially contributing to research groups at universities (like
Learn more from the following resources: Learn more from the following resources:
- [@group@AI Cybersecurity | Global Cyber Security Capacity Centre (Oxford)](https://gcscc.ox.ac.uk/ai-security) - Academic research center. - [@group@AI Cybersecurity | Global Cyber Security Capacity Centre (Oxford)](https://gcscc.ox.ac.uk/ai-security)
- [@group@Anthropic Research](https://www.anthropic.com/research) - AI safety research lab. - [@group@Anthropic Research](https://www.anthropic.com/research)
- [@group@Center for AI Safety](https://www.safe.ai/) - Non-profit research organization. - [@group@Center for AI Safety](https://www.safe.ai/)
- [@group@The AI Security Institute (AISI)](https://www.aisi.gov.uk/) - UK government institute focused on AI safety/security research. - [@group@The AI Security Institute (AISI)](https://www.aisi.gov.uk/)

@ -4,6 +4,6 @@ AI Red Teaming relies on ongoing research. Key areas needing further investigati
Learn more from the following resources: Learn more from the following resources:
- [@article@Cutting-Edge Research on AI Security bolstered with new Challenge Fund - GOV.UK](https://www.gov.uk/government/news/cutting-edge-research-on-ai-security-bolstered-with-new-challenge-fund-to-ramp-up-public-trust-and-adoption) - Highlights government funding for AI security research priorities. - [@article@Cutting-Edge Research on AI Security bolstered with new Challenge Fund](https://www.gov.uk/government/news/cutting-edge-research-on-ai-security-bolstered-with-new-challenge-fund-to-ramp-up-public-trust-and-adoption)
- [@research@Careers | The AI Security Institute (AISI)](https://www.aisi.gov.uk/careers) - Outlines research focus areas for the UK's AISI. - [@research@Careers | The AI Security Institute (AISI)](https://www.aisi.gov.uk/careers)
- [@research@Research - Anthropic](https://www.anthropic.com/research) - Example of research areas at a leading AI safety lab. - [@research@Research - Anthropic](https://www.anthropic.com/research)

@ -4,6 +4,6 @@ A critical practice for AI Red Teamers is responsible disclosure: privately repo
Learn more from the following resources: Learn more from the following resources:
- [@guide@Responsible Disclosure of AI Vulnerabilities - Preamble AI](https://www.preamble.com/blog/responsible-disclosure-of-ai-vulnerabilities) - Discusses the process specifically for AI vulnerabilities. - [@guide@Responsible Disclosure of AI Vulnerabilities](https://www.preamble.com/blog/responsible-disclosure-of-ai-vulnerabilities)
- [@guide@Vulnerability Disclosure Program | CISA](https://www.cisa.gov/resources-tools/programs/vulnerability-disclosure-program-vdp) - Government VDP example. - [@guide@Vulnerability Disclosure Program](https://www.cisa.gov/resources-tools/programs/vulnerability-disclosure-program-vdp)
- [@policy@Google Vulnerability Reward Program (VRP)](https://bughunters.google.com/) - Example of a major tech company's VDP/bug bounty program. - [@policy@Google Vulnerability Reward Program (VRP)](https://bughunters.google.com/)

@ -4,6 +4,6 @@ AI Red Teamers contribute to the AI risk management process by identifying and d
Learn more from the following resources: Learn more from the following resources:
- [@framework@NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework) - Key framework for managing AI-specific risks. - [@framework@NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework)
- [@guide@A Beginner's Guide to Cybersecurity Risks and Vulnerabilities - Champlain College Online](https://online.champlain.edu/blog/beginners-guide-cybersecurity-risk-management) - Foundational understanding of risk. - [@guide@A Beginner's Guide to Cybersecurity Risks and Vulnerabilities](https://online.champlain.edu/blog/beginners-guide-cybersecurity-risk-management)
- [@guide@Cybersecurity Risk Management: Frameworks, Plans, and Best Practices - Hyperproof](https://hyperproof.io/resource/cybersecurity-risk-management-process/) - General guide applicable to AI system context. - [@guide@Cybersecurity Risk Management: Frameworks, Plans, and Best Practices](https://hyperproof.io/resource/cybersecurity-risk-management-process/)

@ -4,6 +4,6 @@ AI Red Teamers assess whether choices made during model design (architecture sel
Learn more from the following resources: Learn more from the following resources:
- [@article@Model Robustness: Building Reliable AI Models - Encord](https://encord.com/blog/model-robustness-machine-learning-strategies/) - Discusses strategies for building robust models. - [@article@Model Robustness: Building Reliable AI Models](https://encord.com/blog/model-robustness-machine-learning-strategies/)
- [@article@Understanding Robustness in Machine Learning - Alooba](https://www.alooba.com/skills/concepts/machine-learning/robustness/) - Explains the concept of ML robustness. - [@article@Understanding Robustness in Machine Learning](https://www.alooba.com/skills/concepts/machine-learning/robustness/)
- [@paper@Towards Evaluating the Robustness of Neural Networks (arXiv by Goodfellow et al.)](https://arxiv.org/abs/1608.04644) - Foundational paper on evaluating robustness. - [@paper@Towards Evaluating the Robustness of Neural Networks (arXiv by Goodfellow et al.)](https://arxiv.org/abs/1608.04644)

@ -4,6 +4,6 @@ The role of an AI Red Team is to rigorously challenge AI systems from an adversa
Learn more from the following resources: Learn more from the following resources:
- [@article@The Complete Guide to Red Teaming: Process, Benefits & More - Mindgard AI](https://mindgard.ai/blog/red-teaming) - Discusses the purpose and process of red teaming. - [@article@The Complete Guide to Red Teaming: Process, Benefits & More](https://mindgard.ai/blog/red-teaming)
- [@article@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist) - Outlines typical red team roles and responsibilities. - [@article@The Complete Red Teaming Checklist [PDF]: 5 Key Steps - Mindgard AI](https://mindgard.ai/blog/red-teaming-checklist)
- [@article@What is AI Red Teaming? - Learn Prompting](https://learnprompting.org/docs/category/ai-red-teaming) - Defines the role and activities. - [@article@What is AI Red Teaming? - Learn Prompting](https://learnprompting.org/docs/category/ai-red-teaming)

@ -4,6 +4,6 @@ AI Red Teamers specifically target the safety mechanisms (filters, guardrails) i
Learn more from the following resources: Learn more from the following resources:
- [@article@Bypassing AI Content Filters | Restackio](https://www.restack.io/p/ai-driven-content-moderation-answer-bypass-filters-cat-ai) - Discusses techniques for evasion. - [@article@Bypassing AI Content Filters](https://www.restack.io/p/ai-driven-content-moderation-answer-bypass-filters-cat-ai)
- [@article@How to Bypass Azure AI Content Safety Guardrails - Mindgard](https://mindgard.ai/blog/bypassing-azure-ai-content-safety-guardrails) - Case study on bypassing specific safety mechanisms. - [@article@How to Bypass Azure AI Content Safety Guardrails](https://mindgard.ai/blog/bypassing-azure-ai-content-safety-guardrails)
- [@article@The Best Methods to Bypass AI Detection: Tips and Techniques - PopAi](https://www.popai.pro/resources/the-best-methods-to-bypass-ai-detection-tips-and-techniques/) - Focuses on evasion, relevant for filter bypass testing. - [@article@The Best Methods to Bypass AI Detection: Tips and Techniques](https://www.popai.pro/resources/the-best-methods-to-bypass-ai-detection-tips-and-techniques/)

@ -4,7 +4,7 @@ Targeted training is crucial for mastering AI Red Teaming. Look for courses cove
Learn more from the following resources: Learn more from the following resources:
- [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses) - Curated list including free and paid options. - [@course@AI Red Teaming Courses - Learn Prompting](https://learnprompting.org/blog/ai-red-teaming-courses)
- [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security) - Covers AI security risks and governance. - [@course@AI Security | Coursera](https://www.coursera.org/learn/ai-security)
- [@course@Exploring Adversarial Machine Learning - NVIDIA](https://www.nvidia.com/en-us/training/instructor-led-workshops/exploring-adversarial-machine-learning/) - Focused training on adversarial ML (paid). - [@course@Exploring Adversarial Machine Learning](https://www.nvidia.com/en-us/training/instructor-led-workshops/exploring-adversarial-machine-learning/)
- [@course@Free Online Cyber Security Courses with Certificates in 2025 - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/cyber-novice/free-cybersecurity-courses-beginners/) - Offers foundational cybersecurity courses. - [@course@Free Online Cyber Security Courses with Certificates in 2025](https://www.eccouncil.org/cybersecurity-exchange/cyber-novice/free-cybersecurity-courses-beginners/)

@ -4,6 +4,6 @@ AI Red Teamers analyze systems built using supervised learning to probe for vuln
Learn more from the following resources: Learn more from the following resources:
- [@article@AI and cybersecurity: a love-hate revolution - Alter Solutions](https://www.alter-solutions.com/en-us/articles/ai-cybersecurity-love-hate-revolution) - Discusses supervised learning use in vulnerability scanning and potential exploits. - [@article@AI and cybersecurity: a love-hate revolution](https://www.alter-solutions.com/en-us/articles/ai-cybersecurity-love-hate-revolution)
- [@article@What Is Supervised Learning? | IBM](https://www.ibm.com/think/topics/supervised-learning) - Foundational explanation. - [@article@What Is Supervised Learning?](https://www.ibm.com/think/topics/supervised-learning)
- [@article@What is Supervised Learning? | Google Cloud](https://cloud.google.com/discover/what-is-supervised-learning) - Foundational explanation. - [@article@What is Supervised Learning?](https://cloud.google.com/discover/what-is-supervised-learning)

@ -4,8 +4,8 @@ Platforms used by AI Red Teamers range from general penetration testing OS distr
Learn more from the following resources: Learn more from the following resources:
- [@tool@AI Red Teaming Agent - Azure AI Foundry | Microsoft Learn](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/ai-red-teaming-agent) - Microsoft's tool leveraging PyRIT. - [@tool@AI Red Teaming Agent - Azure AI Foundry | Microsoft Learn](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/ai-red-teaming-agent)
- [@tool@Kali Linux](https://www.kali.org/) - Standard pentesting distribution. - [@tool@Kali Linux](https://www.kali.org/)
- [@tool@OWASP Zed Attack Proxy (ZAP)](https://owasp.org/www-project-zap/) - Widely used for web/API security testing. - [@tool@OWASP Zed Attack Proxy (ZAP)](https://owasp.org/www-project-zap/)
- [@tool@Promptfoo](https://www.promptfoo.dev/) - Open-source tool for testing and evaluating LLMs, includes red teaming features. - [@tool@Promptfoo](https://www.promptfoo.dev/)
- [@tool@PyRIT (Python Risk Identification Tool for generative AI) - GitHub](https://github.com/Azure/PyRIT) - Open-source framework from Microsoft. - [@tool@PyRIT (Python Risk Identification Tool for generative AI)](https://github.com/Azure/PyRIT)

@ -4,7 +4,7 @@ AI Red Teams apply threat modeling to identify unique attack surfaces in AI syst
Learn more from the following resources: Learn more from the following resources:
- [@article@Core Components of AI Red Team Exercises (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming) - Describes threat modeling as the first phase of an AI red team engagement. - [@article@Core Components of AI Red Team Exercises (Learn Prompting)](https://learnprompting.org/blog/what-is-ai-red-teaming)
- [@guide@Threat Modeling Process | OWASP Foundation](https://owasp.org/www-community/Threat_Modeling_Process) - More detailed process steps. - [@guide@Threat Modeling Process](https://owasp.org/www-community/Threat_Modeling_Process)
- [@guide@Threat Modeling | OWASP Foundation](https://owasp.org/www-community/Threat_Modeling) - General threat modeling process applicable to AI context. - [@guide@Threat Modeling](https://owasp.org/www-community/Threat_Modeling)
- [@video@How Microsoft Approaches AI Red Teaming (MS Build)](https://learn.microsoft.com/en-us/events/build-may-2023/breakout-responsible-ai-red-teaming/) - Video on Microsoft’s AI red team process, including threat modeling specific to AI. - [@video@How Microsoft Approaches AI Red Teaming (MS Build)](https://learn.microsoft.com/en-us/events/build-may-2023/breakout-responsible-ai-red-teaming/)

@ -4,6 +4,6 @@ AI Red Teamers test if vulnerabilities in the AI system or its interfaces allow
Learn more from the following resources: Learn more from the following resources:
- [@article@Unauthorized Data Access via LLMs (Security Boulevard)](https://securityboulevard.com/2023/11/unauthorized-data-access-via-llms/) - Discusses risks of LLMs accessing unauthorized data. - [@article@Unauthorized Data Access via LLMs (Security Boulevard)](https://securityboulevard.com/2023/11/unauthorized-data-access-via-llms/)
- [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/) - Covers API risks like broken access control relevant to AI systems. - [@guide@OWASP API Security Project](https://owasp.org/www-project-api-security/)
- [@paper@AI System Abuse Cases (Harvard Belfer Center)](https://www.belfercenter.org/publication/ai-system-abuse-cases) - Covers various ways AI systems can be abused, including access violations. - [@paper@AI System Abuse Cases (Harvard Belfer Center)](https://www.belfercenter.org/publication/ai-system-abuse-cases)

@ -4,5 +4,5 @@ When red teaming AI systems using unsupervised learning (e.g., clustering algori
Learn more from the following resources: Learn more from the following resources:
- [@article@How Unsupervised Learning Works with Examples - Coursera](https://www.coursera.org/articles/unsupervised-learning) - Foundational explanation with examples. - [@article@How Unsupervised Learning Works with Examples](https://www.coursera.org/articles/unsupervised-learning)
- [@article@Supervised vs. Unsupervised Learning: Which Approach is Best? - DigitalOcean](https://www.digitalocean.com/resources/articles/supervised-vs-unsupervised-learning) - Contrasts learning types, relevant for understanding different attack surfaces. - [@article@Supervised vs. Unsupervised Learning: Which Approach is Best?](https://www.digitalocean.com/resources/articles/supervised-vs-unsupervised-learning)

@ -4,6 +4,6 @@ While general vulnerability assessment scans infrastructure, AI Red Teaming exte
Learn more from the following resources: Learn more from the following resources:
- [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems - DNV](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/) - Discusses vulnerability assessment within AI red teaming for critical systems. - [@article@AI red-teaming in critical infrastructure: Boosting security and trust in AI systems](https://www.dnv.com/article/ai-red-teaming-for-critical-infrastructure-industries/)
- [@guide@The Ultimate Guide to Vulnerability Assessment - Strobes Security](https://strobes.co/blog/guide-vulnerability-assessment/) - Comprehensive guide on VA process (apply concepts to AI). - [@guide@The Ultimate Guide to Vulnerability Assessment](https://strobes.co/blog/guide-vulnerability-assessment/)
- [@guide@Vulnerability Scanning Tools | OWASP Foundation](https://owasp.org/www-community/Vulnerability_Scanning_Tools) - List of tools useful in broader system assessment around AI. - [@guide@Vulnerability Scanning Tools](https://owasp.org/www-community/Vulnerability_Scanning_Tools)

@ -4,6 +4,6 @@ White-box testing in AI Red Teaming grants the tester full access to the model's
Learn more from the following resources: Learn more from the following resources:
- [@article@Black-Box, Gray Box, and White-Box Penetration Testing - EC-Council](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/) - Comparison of testing types. - [@article@Black-Box, Gray Box, and White-Box Penetration Testing](https://www.eccouncil.org/cybersecurity-exchange/penetration-testing/black-box-gray-box-and-white-box-penetration-testing-importance-and-uses/)
- [@article@White-Box Adversarial Examples (OpenAI Blog)](https://openai.com/research/adversarial-robustness-toolbox) - Discusses generating attacks with full model knowledge. - [@article@White-Box Adversarial Examples (OpenAI Blog)](https://openai.com/research/adversarial-robustness-toolbox)
- [@guide@LLM red teaming guide (open source) - Promptfoo](https://www.promptfoo.dev/docs/red-team/) - Mentions white-box testing benefits for LLMs. - [@guide@LLM red teaming guide (open source)](https://www.promptfoo.dev/docs/red-team/)

Loading…
Cancel
Save