Quick facts: GPT-5's defense against adversarial attacks

gpt-5-thinking is trained to follow OpenAI's safety policies.

A two-tiered system monitors and blocks unsafe prompts and generations.

User accounts may be banned for attempting to extract harmful biological information.

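Safe-completions training, which targets the safety of the model's output rather than a refuse-or-comply decision on the prompt, improves the safety of its responses.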

Extensive red teaming identified jailbreaks, but most were blocked by safeguards.
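
As a rough illustration of the two-tiered idea above, here is a minimal Python sketch in which a cheap first check escalates only suspicious text to a stricter second check, applied to both the prompt and the generation. Everything in it is an assumption made for illustration: the keyword lists, the function names (tier1_is_sensitive, tier2_is_unsafe, review, guarded_reply), and the "[blocked]" message are hypothetical, and the real system presumably uses trained models rather than string matching.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    tier: int      # tier that made the final decision (1 or 2)
    reason: str


# Hypothetical keyword lists; a real monitor would use trained classifiers.
FLAGGED_TOPICS = ("pathogen", "toxin")
UNSAFE_MARKERS = ("synthesize", "weaponize")


def tier1_is_sensitive(text: str) -> bool:
    """Tier 1 (assumed): fast, cheap screen run on every piece of text."""
    lowered = text.lower()
    return any(term in lowered for term in FLAGGED_TOPICS)


def tier2_is_unsafe(text: str) -> bool:
    """Tier 2 (assumed): slower, stricter check run only on escalated text."""
    lowered = text.lower()
    return any(term in lowered for term in UNSAFE_MARKERS)


def review(text: str) -> Verdict:
    """Run the two tiers in order and return the decision."""
    if not tier1_is_sensitive(text):
        return Verdict(True, 1, "not flagged by tier 1")
    if tier2_is_unsafe(text):
        return Verdict(False, 2, "blocked by tier 2")
    return Verdict(True, 2, "escalated, then cleared by tier 2")


def guarded_reply(prompt: str, generate: Callable[[str], str]) -> str:
    """Check the prompt, then the generation, blocking if either fails."""
    if not review(prompt).allowed:
        return "[blocked]"
    reply = generate(prompt)
    if not review(reply).allowed:
        return "[blocked]"
    return reply
```

The design choice this sketch tries to show is cost: most traffic stops at the inexpensive first tier, so the stricter second tier only has to run on the small share of text that looks sensitive.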

Space: Let’s explore the GPT-5 Model Card