Where do thinking models waste computation?

Figure 11: First failure move versus problem complexity (N) for thinking and non-thinking models across puzzle environments. Top: Claude-3.7-Sonnet (thinking vs. non-thinking); bottom: DeepSeek-R1 vs. DeepSeek-V3.

Thinking models, or Large Reasoning Models (LRMs), waste computation primarily through a phenomenon described as "overthinking." On simpler problems, these models often identify the correct solution early in the reasoning trace but then continue exploring incorrect alternatives, producing verbose, redundant output after the answer has already been found and adding significant inference-time compute overhead.
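One way to quantify overthinking is to locate where in the reasoning trace the first correct candidate solution appears: if it appears early but the trace keeps going, the remaining tokens are largely wasted. Below is a minimal sketch of that measurement, assuming candidate solutions have already been extracted from the trace and that a puzzle-specific checker `is_correct` is available (both are hypothetical helpers, not part of any published codebase).

```python
from typing import Callable, List, Optional


def first_correct_position(candidates: List[str],
                           is_correct: Callable[[str], bool]) -> Optional[float]:
    """Relative position (0..1) of the first correct candidate solution
    within a reasoning trace, or None if no candidate is correct."""
    for idx, candidate in enumerate(candidates):
        if is_correct(candidate):
            return idx / max(len(candidates) - 1, 1)
    return None


# Toy example: on an easy problem the correct answer shows up early,
# yet the model keeps emitting (incorrect) alternatives afterwards.
trace = ["move A", "correct plan", "variant B", "variant C"]
pos = first_correct_position(trace, lambda s: s == "correct plan")
print(pos)  # ~0.33 -> early discovery; everything after it is overthinking
```

A low value of `first_correct_position` on simple problems, paired with a long remaining trace, is exactly the overthinking signature described above.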

As problem complexity increases, this pattern reverses: reasoning models first explore incorrect solutions and reach correct ones only later in the thought process. Beyond a certain complexity, both thinking models and their non-thinking counterparts suffer a complete accuracy collapse and fail to produce correct solutions at all, underscoring the inefficiencies inherent in their reasoning process [1].
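The figure above tracks this collapse via the "first failure move": how far into a proposed move sequence the model makes its first invalid or wrong move, as a function of problem size N. A rough sketch of that metric is shown below; the per-puzzle move validators (e.g., Tower of Hanoi rules) and the collected model outputs are assumed inputs, not real APIs.

```python
from typing import Callable, Dict, Sequence


def first_failure_move(moves: Sequence[str],
                       move_is_valid: Callable[[int, str], bool]) -> int:
    """Index of the first invalid move in a proposed solution.
    Returning len(moves) means the whole sequence was valid."""
    for i, move in enumerate(moves):
        if not move_is_valid(i, move):
            return i
    return len(moves)


def failure_curve(solutions_by_n: Dict[int, Sequence[str]],
                  validators_by_n: Dict[int, Callable[[int, str], bool]]
                  ) -> Dict[int, int]:
    """First-failure index for each problem size N, mirroring the axes
    of the figure above (one curve per model)."""
    return {n: first_failure_move(moves, validators_by_n[n])
            for n, moves in sorted(solutions_by_n.items())}
```

Plotting `failure_curve` for a thinking model against its non-thinking counterpart reproduces the comparison in Figure 11: the curves diverge at moderate complexity and both drop toward zero once the collapse threshold is reached.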