Do Large Language Models learn world models or just surface statistics?

Kenneth Li describes evidence suggesting world models.

Jan 22, 2023

Large Language Models (LLMs) are on fire, capturing public attention with their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simplistic algorithm with massive amounts of data and computing power. They are trained by playing a guess-the-next-word game with themselves over and over again. Each time, the model looks at a partial sentence and guesses the following word. If it guesses correctly, it updates its parameters to reinforce its confidence; otherwise, it learns from the error and gives a better guess next time.
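To make the guess-the-next-word game concrete, here is a minimal sketch in Python. It is a toy, not how real LLMs are built: it assumes the "partial sentence" is just the single previous word, splits on whitespace instead of using a tokenizer, and stores one row of scores per word instead of a neural network. The names `train_next_word_model` and `predict_next`, the example corpus, and the learning rate are all made up for illustration. What it does share with the description above is the update rule: each guess is compared with the word that actually came next, and the scores are nudged toward it.

```python
import math


def softmax(row):
    """Turn a row of raw scores into probabilities that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def train_next_word_model(text, epochs=200, lr=0.5):
    """Toy next-word predictor (hypothetical helper, previous word only)."""
    words = text.split()
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    # logits[prev][nxt]: raw score that word `nxt` follows word `prev`
    logits = [[0.0] * n for _ in range(n)]

    for _ in range(epochs):
        # Play the game once for every adjacent pair of words in the text
        for prev, nxt in zip(words, words[1:]):
            row = logits[index[prev]]
            probs = softmax(row)          # the model's current guess
            target = index[nxt]           # the word that actually followed
            # Cross-entropy gradient w.r.t. the scores: probs - one_hot(target).
            # The true next word's score goes up, all others go down.
            for j in range(n):
                grad = probs[j] - (1.0 if j == target else 0.0)
                row[j] -= lr * grad
    return vocab, index, logits


def predict_next(word, vocab, index, logits):
    """Return the most likely next word and its probability."""
    probs = softmax(logits[index[word]])
    best = max(range(len(vocab)), key=lambda j: probs[j])
    return vocab[best], probs[best]


# Made-up corpus for illustration
corpus = "the cat sat on the mat and the cat slept"
vocab, index, logits = train_next_word_model(corpus)
guess, confidence = predict_next("sat", vocab, index, logits)
```

In this toy corpus "sat" is always followed by "on", so the model's confidence in that guess grows with every pass. After "the", which is followed by both "cat" and "mat", the probability stays split between the two, which is the kind of surface statistic the title question is asking about.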
