Safety is foundational to our approach to open models (OpenAI [1]). Once open models are released, determined attackers could fine-tune them to bypass safety refusals or directly optimize for harm (OpenAI [1]).
We also investigated two additional questions (OpenAI [1]).
Adversarial actors fine-tuning gpt-oss-120b did not reach High capability in the Biological and Chemical or Cyber risk categories (OpenAI [1]).
The gpt-oss models are trained to follow OpenAI’s safety policies by default (OpenAI [1]).