Reflections on safety challenges with Gemini 2.5

In Gemini 2.5, we have focused on improving helpfulness and instruction following (IF), specifically to reduce refusals on benign requests.
The 2.0 models are substantially safer. However, they over-refused on a wide variety of benign user requests.
Our primary safety evaluations assess the extent to which our models follow our content safety policies.
That is, we find it possible that subsequent revisions in the next few months could lead to a model that reaches the Critical Capability Level (CCL). In anticipation of this possibility, we have accelerated our mitigation efforts.
We are continuing to evolve our adversarial evaluations, as well as our adversarial training techniques, to accurately measure and monitor the resilience of increasingly capable Gemini models.