What challenges do LLMs face with generalisation?

Large language models (LLMs) face significant challenges with generalisation, particularly in out-of-distribution (OOD) scenarios. Generalisation can only be expected in regions covered by the training observations, so LLMs often struggle to apply learned patterns to contexts that do not resemble their training data. As noted in [1], 'the generalisation behaviour does not match human generalisation well, lacking the ability to generalise to OOD samples and exhibit compositionality'.
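
The dependence of generalisation on coverage by the training data can be illustrated outside the LLM setting with a toy regression sketch (an illustrative assumption, not an experiment from [1]): a small model is fitted on inputs drawn from one interval and then evaluated both inside and outside that interval, where its error typically grows sharply.

```python
# Toy sketch: a regressor trained on x in [0, 1] is evaluated in-distribution
# and out-of-distribution. Data and model choices are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Training data covers only x in [0, 1]; the target is a simple nonlinearity.
x_train = rng.uniform(0.0, 1.0, size=(500, 1))
y_train = np.sin(2 * np.pi * x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(x_train, y_train)

# In-distribution test: same interval as training.
x_in = rng.uniform(0.0, 1.0, size=(200, 1))
# Out-of-distribution test: inputs the model has never seen.
x_ood = rng.uniform(2.0, 3.0, size=(200, 1))

mse_in = mean_squared_error(np.sin(2 * np.pi * x_in).ravel(), model.predict(x_in))
mse_ood = mean_squared_error(np.sin(2 * np.pi * x_ood).ravel(), model.predict(x_ood))

print(f"in-distribution MSE:     {mse_in:.4f}")   # typically small
print(f"out-of-distribution MSE: {mse_ood:.4f}")  # typically much larger
```

The same intuition carries over to LLMs: predictions are most reliable for inputs that resemble the training distribution, and become less dependable as inputs move away from it.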

Moreover, the phenomenon of 'hallucination', in which a model confidently makes incorrect predictions, is a notable overgeneralisation challenge for LLMs: the model ignores critical differences between the current input and the patterns it has learned, and extends those patterns to cases where they do not apply[1].