Induction Mosaic

Key: The following is a mosaic heatmap of the induction heads in ~40 open-source models. Each score was calculated by giving the model a sequence of repeated random tokens and measuring the average attention each head paid to the token after the previous copy of the current token. Each heatmap is a different model, and each cell is a different head (y-axis is the head index, x-axis is the layer index).

(Each model is labelled with its name in my EasyTransformer library; you can play around yourself with `model=EasyTransformer.from_pretrained`. The attn-only-nl, solu-nl, gelu-nl and solu-nl-old models are not-yet-documented models I’ve trained for interpretability.)
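
To make the score in the key concrete, here is a minimal NumPy sketch of the calculation for a single layer. It is an illustration under assumptions, not the exact code used to produce the mosaic: it assumes the input is one block of random tokens repeated exactly twice with no prepended BOS token, and that you already have the layer's attention pattern as an array of shape `[n_heads, seq, seq]` (query position, then key position). The names `induction_scores` and `block_len` are hypothetical, and the attention pattern here is synthetic rather than taken from a real model.

```python
import numpy as np


def induction_scores(pattern: np.ndarray, block_len: int) -> np.ndarray:
    """Average attention each head pays to the token after the previous
    copy of the current token.

    pattern:   attention probabilities, shape [n_heads, seq, seq]
               (axis 1 = query position, axis 2 = key position).
    block_len: length of the random block; assumes seq == 2 * block_len.
    """
    n_heads, seq, seq_k = pattern.shape
    assert seq == seq_k == 2 * block_len, "expected one block repeated twice"

    # Only queries in the second copy have a previous copy to look back at.
    queries = np.arange(block_len, seq)
    # Previous copy of token q sits at q - block_len; we want the token after it.
    keys = queries - block_len + 1

    # Pick out pattern[h, q, k] for each (q, k) pair, then average over queries.
    return pattern[:, queries, keys].mean(axis=-1)


# Synthetic example: head 0 is an idealised induction head,
# head 1 spreads attention uniformly over all earlier positions.
block_len = 4
seq = 2 * block_len
pattern = np.zeros((2, seq, seq))
for q in range(seq):
    k = q - block_len + 1 if q >= block_len else 0
    pattern[0, q, k] = 1.0
    pattern[1, q, : q + 1] = 1.0 / (q + 1)

scores = induction_scores(pattern, block_len)
print(scores)  # head 0 scores far higher than head 1
```

Running this per layer and stacking the results gives one column of a heatmap per layer, with one cell per head, which matches the layout described in the key.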