In the GPT-5 evaluation on HealthBench Hard, gpt-5-thinking scores 46.2%, a 14.6-percentage-point improvement over the 31.6% scored by OpenAI o3. The smaller gpt-5-thinking-mini model scores 40.3% on the same benchmark, outperforming all previous models, including OpenAI’s gpt-oss open-weight models[1].