Key metrics: GPT-5 versus OpenAI o3 and GPT-4o

gpt-5-thinking has a hallucination rate 65% lower than that of OpenAI o3.

gpt-5-main outperforms GPT-4o in the illicit/non-violent and illicit/violent categories.

gpt-5-main achieved a 44% reduction in responses with major factual errors compared to GPT-4o.

Overall safety scores improved for gpt-5-thinking compared to OpenAI o3.

gpt-5-main underperformed GPT-4o in the non-violent hate and harassment/threatening categories.
