October 28, 2023

Function Vectors in Large Language Models

In 1936, Alonzo Church made an amazing discovery: if functions can treat other functions as data, the resulting system becomes so powerful that it can even express unsolvable problems.

We know that deep neural networks learn to represent many concepts as data. Do they also learn to treat functions as data?

In a new preprint, my student Eric Todd finds evidence that deep networks do contain function references. Inside large transformer language models (like GPT) trained on ordinary text, he discovers internal vectors that behave like functions. These Function Vectors (FVs) can be created from examples, invoked in different contexts, and even composed using vector algebra. But they differ from ordinary word-embedding vector arithmetic: adding an FV triggers a complex computation rather than just taking a linear step in representation space.
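
For readers who want the gist in code, here is a minimal sketch of the recipe. The caveat: this is not Eric's actual method. The paper extracts FVs from the outputs of particular attention heads identified by causal mediation, and demonstrates the effect on larger models than GPT-2; this toy version just averages a residual-stream state over a few demonstrations and patches it back in with a forward hook. The model, layer choice, and prompts below are illustrative assumptions.

```python
# Toy sketch of a "function vector": average a hidden state over in-context
# demonstrations of a task, then add it back in on an unrelated prompt.
# Model ("gpt2"), LAYER, and the prompts are illustrative assumptions, not
# the paper's settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

LAYER = 8  # hypothetical middle layer

def last_token_state(prompt):
    """Residual-stream state of the final token after block LAYER."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        hs = model(ids, output_hidden_states=True).hidden_states
    return hs[LAYER][0, -1]

# "Created from examples": average the state over a few demonstrations
# of one task (here, a toy antonym task).
demos = [
    "hot -> cold\nbig -> small\nfast ->",
    "up -> down\nwet -> dry\nold ->",
]
fv = torch.stack([last_token_state(p) for p in demos]).mean(0)

# "Invoked in a different context": add the vector back into the residual
# stream at the same layer while the model reads a zero-shot prompt.
def add_fv(module, inputs, output):
    hidden = output[0]
    hidden[:, -1, :] += fv
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER - 1].register_forward_hook(add_fv)
ids = tok("tall ->", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits
handle.remove()
print(tok.decode(logits[0, -1].argmax().item()))
```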
It is a very cool finding. Help Eric spread the word!

Read and retweet the Twitter thread
Share the Facebook post
The project website: functions.baulab.info

Posted by David at 11:17 AM | Comments (0)