Artificial intelligence has advanced significantly, enhancing our abilities in scientific discovery and decision-making, but it also brings challenges like misinformation and privacy concerns. One fascinating aspect is the difference in how humans and machines generalize knowledge. While humans exce...
Q1. What are the names of the two open-weight reasoning models introduced by OpenAI? 🤖 - gpt-oss-120b and gpt-oss-20b - openai-120 and openai-20 - gpt-x and gpt-y - ai-120b and ai-20b Answer: gpt-oss-120b and gpt-oss-20b Q2. What technique is used by gpt-oss models to reduce their memory footprint?...
The most interesting takeaways from the model card on gpt-oss-120b and gpt-oss-20b are their robust reasoning capabilities and safety measures. These open-weight models are designed for strong instruction following and advanced reasoning while remaining customizable for various applications. ...
The primary focus of open-weight models, such as gpt-oss-120b and gpt-oss-20b, is to enhance safety and provide customizable performance within agentic workflows. These models are designed for strong instruction following, tool use, and reasoning, allowing them to be integrated in...
The model card for gpt-oss-120b and gpt-oss-20b outlines their capabilities and safety measures, emphasizing that they are designed for instruction following, tool use, and reasoning. These models utilize a mixture-of-experts architecture with quantization techniques to operate efficiently. Evaluati...
The benchmark tests for health performance mentioned in the source are 'HealthBench,' 'HealthBench Hard,' and 'HealthBench Consensus.' These benchmarks evaluate the performance and safety of the models in health-related scenarios, including realistic conversations with individuals and health profess...
The largest parameter count is **116.8 billion** for the gpt-oss-120b model, while the gpt-oss-20b model contains **20.9 billion** parameters....
Quantization reduces the memory footprint of the models. Models are post-trained with quantization of the Mixture-of-Experts weights. Weights are quantized to 4.25 bits per parameter. Quantizing MoE weights enables the larger model to fit on a single 80GB GPU. The smaller model can run on systems wi...
Two model sizes: gpt-oss-120b and gpt-oss-20b. gpt-oss-120b has 116.8 billion total parameters. Both models use autoregressive Mixture-of-Experts (MoE) transformer architecture. gpt-oss-20b consists of 20.9 billion total parameters. Attention blocks in the models alternate between banded window and ...
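The arithmetic behind the single-GPU claim can be sketched as follows. This is a rough estimate under a loud assumption: it treats every parameter as stored at 4.25 bits, whereas the card only says the MoE weights are quantized, so the real footprint is somewhat higher, and activations and the KV cache need memory on top. The helper name `quantized_size_gb` is hypothetical.

```python
def quantized_size_gb(num_params, bits_per_param=4.25):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: all parameters are stored at `bits_per_param` bits,
    which understates the size when only some weights are quantized.
    """
    total_bits = num_params * bits_per_param
    return total_bits / 8 / 1e9


# Parameter counts as quoted in the notes above.
big = quantized_size_gb(116.8e9)
small = quantized_size_gb(20.9e9)

print(f"gpt-oss-120b: ~{big:.2f} GB of weights")
print(f"gpt-oss-20b:  ~{small:.2f} GB of weights")
```

Under this assumption the larger model needs roughly 62 GB for weights, which is why it can fit within an 80GB device, and the smaller one needs roughly 11 GB.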
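As a generic illustration of what a Mixture-of-Experts layer does, the sketch below routes one input to the top-k experts and mixes their outputs. It is not the gpt-oss implementation: the expert functions, the gate scores, and the choice to apply softmax over only the selected experts are all assumptions made for the example, and in a real model the gate scores come from a learned router.

```python
import math


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only (one common variant).
    """
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    peak = max(gate_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_logits, k):
    """Weighted sum of the outputs of the k selected experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


# Hypothetical toy experts and gate scores, for illustration only.
experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: 4 * x]
gate_logits = [0.1, 2.0, -1.0, 2.0]
y = moe_forward(3.0, experts, gate_logits, k=2)
```

Only k experts run per input, so the compute per token depends on the active experts and not on the total parameter count.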
Multilingual capabilities were evaluated using MMMLU. At high reasoning, gpt-oss-120b performs nearly as well as OpenAI o4-mini. The MMMLU evaluation included professionally human-translated versions in 14 languages. gpt-oss-120b's average accuracy on MMMLU at high reasoning is 81.3%....
The harmony chat format is important because it provides special tokens to delineate message boundaries and uses keyword arguments to indicate message authors and recipients. This structure helps the models follow a role-based information hierarchy that resolves instruction conflicts, prioritizing s...
Agentic tool use in the gpt-oss models includes employing various tools to enhance their capabilities. Specifically, the models are trained to use a browsing tool, which allows them to call search functions and interact with the web to fetch information beyond their knowledge cutoff. Additionally, t...
"Safety is foundational to our approach to open models." - OpenAI "Rigorously assessing an open-weights release’s risks should include testing for a reasonable range of ways a malicious party could feasibly modify the model." - OpenAI "We confirmed that the default model does not reach our indicativ...
Quantization helps deployment by reducing the memory footprint of models, enabling them to be run on hardware with lower resource requirements. In the gpt-oss models, quantization of the Mixture-of-Experts (MoE) weights to MXFP4 format allows the larger model to fit on a single 80GB GPU and the smal...
gpt-oss models do not reach indicative thresholds for High capability. The models are trained to refuse on a wide range of content. Jailbreak evaluations show general performance against adversarial prompts. Disallowed Content Evaluations ensure adherence to OpenAI's safety policies. Models are test...
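The 4.25 bits per parameter quoted for these models is consistent with how a block-scaled 4-bit format is laid out. The sketch below assumes the MXFP4 layout from the OCP Microscaling specification (4-bit elements in blocks of 32 that share one 8-bit scale); those block parameters are not stated in the notes above, and the function name is hypothetical.

```python
def mx_bits_per_element(element_bits=4, scale_bits=8, block_size=32):
    """Effective storage cost per element for a block-scaled format.

    Each block stores `block_size` elements plus one shared scale,
    so the scale's cost is amortized across the block.
    Defaults are an assumption based on the OCP MX layout for MXFP4.
    """
    return element_bits + scale_bits / block_size


effective_bits = mx_bits_per_element()
print(f"Effective bits per parameter: {effective_bits}")
```

The shared scale adds a quarter of a bit per element, which accounts for the gap between a nominal 4-bit format and the 4.25-bit figure.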