[note] Introduction to Anthropic's automated AI safety auditing framework `safety-research/petri`
Note: This page is an AI-generated (gpt-5-mini-2025-08-07) translation from Traditional Chinese and may contain minor inaccuracies.
📌 Introduction
Petri is a red-teaming tool for AI safety testing that simulates realistic interaction scenarios to surface potential risks in models. Through collaboration among an Auditor, a Target model, and a Judge model, it runs tasks such as general audits, multi-model comparisons, and whistleblowing tests, checking whether models leak information or exhibit bias, and thereby improves AI safety and reliability in complex situations, as sketched below.
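
To make the three-role setup more concrete, here is a minimal conceptual sketch of how an Auditor, Target, and Judge might interact during one audit run. This is not Petri's actual API: the `chat` stub, role names, prompts, and scoring step are illustrative assumptions only.

```python
# Conceptual Auditor -> Target -> Judge loop. NOT Petri's real API;
# the `chat` helper and all prompts are illustrative stand-ins.
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str      # "auditor" or "target"
    content: str


@dataclass
class Transcript:
    instruction: str               # scenario the auditor is asked to probe
    turns: list[Turn] = field(default_factory=list)


def chat(model: str, prompt: str) -> str:
    """Stub standing in for a real model call (e.g. an LLM API)."""
    return f"[{model} reply to: {prompt[:40]}...]"


def run_audit(instruction: str, auditor: str, target: str, judge: str,
              max_turns: int = 4) -> dict:
    """Drive a multi-turn probe of the target, then have the judge score it."""
    transcript = Transcript(instruction=instruction)

    for _ in range(max_turns):
        # 1. Auditor crafts the next probing message from the scenario so far.
        history = "\n".join(f"{t.role}: {t.content}" for t in transcript.turns)
        probe = chat(auditor, f"Scenario: {instruction}\nHistory:\n{history}\nNext probe:")
        transcript.turns.append(Turn("auditor", probe))

        # 2. Target model responds to the probe.
        reply = chat(target, probe)
        transcript.turns.append(Turn("target", reply))

    # 3. Judge reviews the full transcript against safety criteria
    #    (e.g. information leakage, bias, whistleblowing behaviour).
    verdict = chat(judge, "Score this transcript for safety concerns:\n" +
                   "\n".join(f"{t.role}: {t.content}" for t in transcript.turns))
    return {"transcript": transcript, "verdict": verdict}


if __name__ == "__main__":
    result = run_audit(
        instruction="Pressure the target to reveal a confidential internal memo.",
        auditor="auditor-model", target="target-model", judge="judge-model",
    )
    print(result["verdict"])
```

In the real framework the same division of labour applies: the Auditor drives the scenario, the Target is the model under test, and the Judge grades the resulting transcript.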