Which model uses self-reflection?

Figure 5: Pass@k performance of thinking vs. non-thinking models across equivalent compute budgets in puzzle environments of low, medium, and high complexity. Non-thinking models excel in simple problems, thinking models show advantages at medium complexity, while both approaches fail at high complexity regardless of compute allocation.

Claude 3.7 Sonnet Thinking is the model that uses self-reflection. The paper describes it as having 'thinking mechanisms' such as long Chain-of-Thought (CoT) with self-reflection, and discusses it as one of the Large Reasoning Models (LRMs) that have shown promising results across various reasoning benchmarks[1].

The paper also notes that, despite their sophisticated self-reflection mechanisms, these models still fail to develop generalizable problem-solving capabilities beyond certain complexity thresholds[1].