Which puzzles break current LRM abilities?

Figure 13: Detailed results on reasoning effort (measured in inference thinking tokens) versus problem complexity (N) for three LRMs (DeepSeek-R1, Claude-3.7-Sonnet with thinking, and o3-mini) across four puzzle environments.

Current Large Reasoning Models (LRMs) degrade sharply once puzzle difficulty passes a threshold. Across the study's four puzzle environments (Tower of Hanoi, Checker Jumping, River Crossing, and Blocks World), the authors report 'complete accuracy collapse beyond certain complexities' and conclude that the models fail to develop 'generalizable problem-solving capabilities for planning tasks' as problem complexity (N) increases[1].
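
To make "problem complexity (N)" concrete, the sketch below generates and verifies Tower of Hanoi move sequences as the disk count N grows. It is a minimal illustration of the kind of rule-based puzzle simulator such evaluations depend on, not the study's actual harness; the function names and the N range are hypothetical.

```python
# Minimal sketch of a Tower of Hanoi generator/verifier parameterized by problem size N.
# Illustrative only; not the evaluation code used in the cited study [1].

def solve_hanoi(n, src=0, aux=1, dst=2):
    """Return the optimal move list (2^n - 1 moves) for n disks."""
    if n == 0:
        return []
    return (solve_hanoi(n - 1, src, dst, aux)   # move n-1 disks out of the way
            + [(src, dst)]                      # move the largest disk
            + solve_hanoi(n - 1, aux, src, dst))  # stack the n-1 disks back on top

def is_valid_solution(n, moves):
    """Simulate a proposed move sequence and check that it solves the puzzle."""
    pegs = [list(range(n, 0, -1)), [], []]      # peg 0 holds disks n..1, largest at bottom
    for src, dst in moves:
        if not pegs[src]:
            return False                        # illegal: moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                        # illegal: larger disk on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n, 0, -1))     # solved: all disks on the target peg

if __name__ == "__main__":
    for n in range(1, 11):                      # sweep problem complexity N
        moves = solve_hanoi(n)
        print(f"N={n:2d}  optimal moves={len(moves):4d}  valid={is_valid_solution(n, moves)}")
```

Because the optimal solution length grows as 2^N - 1, even modest values of N demand long, fully correct move sequences, which is what makes this family of puzzles a sharp probe of exact multi-step execution.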

LRMs also reason inefficiently. On simpler instances they exhibit an 'overthinking phenomenon,' continuing to explore incorrect alternatives after a correct solution has already appeared in the reasoning trace, while near the collapse point their thinking-token usage actually declines even as the problems get harder[1]. Together, these behaviors underscore how limited LRMs remain at exact computation and consistent multi-step reasoning as scenarios grow more complex.
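
The "reasoning effort versus complexity" view in the figure can be read as a simple aggregation over per-instance results. The sketch below assumes a record format of (problem size N, solved correctly, thinking tokens used) and reduces it to mean accuracy and mean thinking tokens per N, the two quantities such a plot tracks; the records are made-up placeholders, not data from the study.

```python
# Sketch: aggregate per-instance results into accuracy and mean thinking tokens per N.
# The records are illustrative placeholders, not results from the cited study [1].
from collections import defaultdict
from statistics import mean

records = [
    # (problem size N, solved correctly?, thinking tokens spent)
    (3, True, 1_200), (3, True, 1_500),
    (6, True, 4_800), (6, False, 9_300),
    (9, False, 11_000), (9, False, 10_200),
]

by_n = defaultdict(list)
for n, correct, tokens in records:
    by_n[n].append((correct, tokens))

for n in sorted(by_n):
    outcomes = by_n[n]
    accuracy = mean(1.0 if ok else 0.0 for ok, _ in outcomes)
    effort = mean(tok for _, tok in outcomes)
    print(f"N={n}: accuracy={accuracy:.2f}, mean thinking tokens={effort:.0f}")
```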