What is safe-completions training?

Safe-completions training is a safety-training approach used in GPT-5 that teaches the model to maximize the helpfulness of its output while adhering to safety-policy constraints. It is designed to overcome a limitation of traditional safety training, which often relied on a binary choice: comply with a request or refuse it outright. Because safe-completions judge safety by the model's output rather than by the perceived intent of the prompt, the model can respond to prompts with ambiguous or potentially malicious intent by providing a useful, policy-compliant answer instead of refusing a potentially benign inquiry. This is particularly valuable for dual-use requests, where a response can be informative while remaining non-harmful under the safety guidelines. In GPT-5, safe-completions training has been observed to both reduce the severity of safety failures and increase overall helpfulness compared to prior refusal-trained models such as OpenAI o3[1].
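The contrast with binary refusal training can be illustrated with a toy reward function. This is only a sketch of the general idea described above; the function and score names below are hypothetical, not drawn from any published OpenAI implementation, and the assumed scheme is simply "reward helpfulness when the output is safe, penalize in proportion to severity when it is not":

```python
# Toy sketch of a safe-completions-style reward. All names here
# (safe_completion_reward, helpfulness, severity) are illustrative
# assumptions, not an actual training implementation.

def safe_completion_reward(helpfulness: float, severity: float) -> float:
    """Score a completion by its helpfulness, discounted by the
    severity of any safety-policy violation.

    helpfulness: 0.0 (useless) .. 1.0 (fully helpful)
    severity:    0.0 (fully safe) .. 1.0 (severe policy violation)
    """
    if severity == 0.0:
        # Safe output: the reward is simply its helpfulness.
        return helpfulness
    # Unsafe output: penalize in proportion to severity, so a mild
    # policy slip scores above a severe one. Binary refusal training,
    # by contrast, would treat every non-refusal of a risky prompt alike.
    return -severity

# A hard refusal of a dual-use prompt is safe but unhelpful (score 0),
# while a safe high-level answer scores higher:
assert safe_completion_reward(0.0, 0.0) == 0.0   # outright refusal
assert safe_completion_reward(0.7, 0.0) == 0.7   # safe, partially helpful
assert safe_completion_reward(0.9, 0.8) == -0.8  # helpful but unsafe
```

Under this kind of objective, the model is pushed toward the most helpful response that stays within policy, rather than toward refusal as the default safe action.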