Which model uses self-reflection?

Figure 5: Pass@k performance of thinking vs. non-thinking models across equivalent compute budgets in puzzle environments of low, medium, and high complexity. Non-thinking models excel in simple problems, thinking models show advantages at medium complexity, while both approaches fail at high complexity regardless of compute allocation.

Claude 3.7 Sonnet Thinking is the model that uses self-reflection. The paper describes it as having 'thinking mechanisms' such as a long Chain-of-Thought (CoT) with self-reflection, and discusses it among the Large Reasoning Models (LRMs) that have shown promising results across various reasoning benchmarks^[1].

The paper also notes that, despite these sophisticated self-reflection mechanisms, such models still fail to develop generalizable problem-solving capabilities beyond certain complexity thresholds^[1].

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
