What stood out to me is how the MLP’s expansion–contraction process mirrors how we sometimes need to stretch ideas into bigger spaces before distilling them back down.
Do you think this mechanism also shapes the kinds of “world knowledge” associations LLMs surface beyond the immediate text?
Interesting connection, Frankline! There does seem to be something universal about spaciousness leading to better connections.
As for world knowledge: yes, one of the ideas behind the MLP layers is that they learn facts about the world, which get incorporated as adjustments to the token embeddings.
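The expand-then-contract shape of an MLP block can be sketched in a few lines. This is a minimal illustration, not any specific model's implementation: the dimensions (`d_model=8`, a 4x expansion) and the ReLU nonlinearity are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes: a small token embedding expanded 4x.
d_model, d_hidden = 8, 32

rng = np.random.default_rng(0)
W_up = rng.standard_normal((d_model, d_hidden)) * 0.1    # expansion weights
W_down = rng.standard_normal((d_hidden, d_model)) * 0.1  # contraction weights

def mlp(x):
    h = np.maximum(0.0, x @ W_up)  # stretch into the wider hidden space (ReLU)
    return h @ W_down              # distill back down to the token dimension

x = rng.standard_normal(d_model)   # one token embedding
y = mlp(x)
print(x.shape, "->", (x @ W_up).shape, "->", y.shape)
```

The output has the same shape as the input, so the block's adjustment can simply be added back onto the token embedding via the residual connection.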
Nicely put
Thanks for the clarity, Mike.