How do token limits affect reasoning traces?

[Figure 3: The four puzzle environments, each shown from initial state (top) through an intermediate state (middle) to the target state (bottom): Tower of Hanoi (disk transfer across pegs), Checkers Jumping (position swapping of colored tokens), River Crossing (transporting entities across a river), and Blocks World (stack reconfiguration).]

Token limits significantly influence reasoning traces in Large Reasoning Models (LRMs). As problem complexity increases, LRMs initially spend more tokens on reasoning, but beyond a certain complexity they counterintuitively reduce their reasoning effort, even while remaining well below their generation limits. This pattern points to a fundamental scaling limitation in their reasoning capabilities relative to problem complexity, culminating in performance collapse at high complexities[1].
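This effort pattern can be made concrete by averaging reasoning-token counts per complexity level. The sketch below is hypothetical: `count_reasoning_tokens` uses a whitespace split as a crude stand-in for a real tokenizer, and the toy traces only mimic the reported rise-then-collapse shape; none of these names come from the paper.

```python
def count_reasoning_tokens(trace: str) -> int:
    """Crude proxy for token count: whitespace-split the reasoning trace."""
    return len(trace.split())


def effort_curve(traces_by_complexity: dict[int, list[str]]) -> dict[int, float]:
    """Average reasoning tokens spent at each complexity level
    (e.g. number of Tower of Hanoi disks)."""
    return {
        n: sum(count_reasoning_tokens(t) for t in traces) / len(traces)
        for n, traces in traces_by_complexity.items()
    }


# Toy traces mimicking the reported pattern: effort rises with complexity,
# then drops at high complexity despite the generation limit not being hit.
traces = {
    3: ["move disk " * 50],
    7: ["move disk " * 400],
    10: ["move disk " * 120],  # counterintuitive reduction in effort
}
curve = effort_curve(traces)
print(curve)  # effort peaks at complexity 7, then falls at 10
```

In a real measurement one would replace the whitespace split with the model's own tokenizer and extract only the reasoning segment (e.g. the model's "thinking" span) from each response.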

Moreover, the reasoning traces themselves reveal inefficiencies: models often explore incorrect solutions first, wasting token budget, and their ability to self-correct diminishes as complexity grows[1]. These token dynamics ultimately limit the overall effectiveness of reasoning in these models.