Fractal Patterns May Unravel the Intelligence in Next-Token Prediction

Published: 2 February 2024

Abstract

We study the fractal structure of language, aiming to provide a precise formalism for quantifying several properties that may have been previously suspected but not formally shown. We establish that language is: (1) self-similar, exhibiting complexities at all levels of granularity, with no particular characteristic granularity level or context length, and (2) long-range dependent (LRD), with tokens at any instant typically correlated with all subsequent tokens. Based on these findings, we argue that short-term patterns in language, such as in paragraphs, mirror the patterns seen in larger scopes, like entire documents. This may shed some light on how next-token prediction can lead to a comprehension of the structure of text at multiple levels of granularity, from words and clauses to broader contexts and intents. In addition, we demonstrate a connection between fractal parameters, such as the Hurst exponent, and scaling laws when varying the context length at inference time. We hope that these findings offer a fresh perspective on the nature of language and the mechanisms underlying the success of LLMs.

Authors

Ibrahim Alabdulmohsin, Vinh Q. Tran, Mostafa Dehghani

Venue

arXiv
