October 28, 2023
Function Vectors in Large Language Models
In 1936, Alonzo Church made an amazing discovery: if a function can treat other functions as data, then it becomes so powerful that it can even express unsolvable problems.
We know that deep neural networks learn to represent many concepts as data. Do they also learn to treat functions as data?
In a new preprint, my student Eric Todd finds evidence that deep networks do contain function references. Inside large transformer language models (like GPT) trained on ordinary text, he discovers internal vectors that behave like functions. These Function Vectors (FVs) can be created from examples, invoked in different contexts, and even composed using vector algebra. But they are different from regular word-embedding vector arithmetic because they trigger complex calculations, rather than just making linear steps in representation space.
|Copyright 2023 © David Bau. All Rights Reserved.|