AI inference costs have fallen sharply since 2022: between 2022 and 2024, the cost per token to run language models dropped by an estimated 99.7%[1]. This decline was driven by improvements in both hardware and algorithmic efficiency[1].
As inference becomes cheaper and more efficient, competitive pressure among LLM providers intensifies[1]. What once cost dollars now costs pennies, and what costs pennies may soon cost fractions of a cent[1].
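To make the cited 99.7% figure concrete, here is a small sketch of the implied arithmetic. The $10-per-million-token starting price is a hypothetical illustration, not a figure from the article:

```python
# Worked arithmetic for a 99.7% cost-per-token decline over 2022-2024.
# The starting price below is hypothetical, chosen only for illustration.

start_cost = 10.00   # hypothetical $ per million tokens in 2022
decline = 0.997      # cited two-year decline

end_cost = start_cost * (1 - decline)
print(f"Implied 2024 cost: ${end_cost:.3f} per million tokens")

# The equivalent compound annual decline over the two years:
annual_factor = (1 - decline) ** 0.5
print(f"Implied annual decline: {1 - annual_factor:.1%}")
```

A 99.7% drop over two years leaves 0.3% of the original price, which corresponds to roughly a 94.5% decline each year when compounded.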
Let's look at alternatives: