Discussion about this post

Julien

This is an absolutely fantastic series of posts. Really. I will recommend it to anyone curious about how LLMs work. Congratulations, Mike, on this outstanding work! This was delightful to read.

Data Frank

What stood out to me is how the MLP’s expansion–contraction process mirrors how we sometimes need to stretch ideas into bigger spaces before distilling them back down.

Do you think this mechanism also shapes the kinds of “world knowledge” associations LLMs surface beyond the immediate text?
