Multilingual capabilities were evaluated on the MMMLU benchmark.
At high reasoning, gpt-oss-120b performs nearly as well as OpenAI o4-mini.
The MMMLU evaluation consists of professionally human-translated versions of MMLU in 14 languages.
At high reasoning, gpt-oss-120b's average accuracy on MMMLU is 81.3%.
At high reasoning, gpt-oss-20b's average accuracy on MMMLU is 75.7%.