I read again this morning, in Ursula Franklin Speaks, about her emphasis on constructing understanding from the study of elements. I can see her teachings in this post. I'm an English teacher, but I see the transfer pedagogically; the value of this slow, hands-on, collaborative, bottom-up philosophy in helping students to feel both free and humble is immeasurable.
Yes, I love this. Well said!
This was useful - thanks! I appreciate the thoughts and the analogy to processors. I think this approach will be useful in helping students understand what LLMs are and are not.
Thanks Craig!
Indeed. I'd go one step further and suggest that making LLMs, diffusion models, and their ilk into serious tools, versus the slop generators and oracles for the lonely they are now, will hinge on finding more ways to make what's under the hood legible and accessible. LLMs could run a parallel search on their training corpus alongside their output to show you whether they have adequate training to possibly give a good answer, and suggest conceptual attribution for their results (though that might reveal how often an LLM is a forgetful plagiarism machine and not a wizard). You could show the user the other nearby output tokens in the latent space and let them choose. Now that circuit tracing is a thing, expose the circuits through some kind of interface so they can be amplified and muted, and the training data leading to desirable and undesirable functions exposed. Etc., etc. Like, prompts are fundamentally such a goofy thing: I'm going to 'ask' the robot to function a certain way, and if I fuss around enough I can coax it to occasionally do what I want? That vision has been sold as the end state of sci-fi dreams of helpful factotums, but man, I want some sliders and number fields and access to the guts. Much of that is still a tall technical order; okay, well, get to work!
But I suspect a lot of that is not very interesting to the powers that be because it does damage to what I think is actually their major product right now, even for programmers and the like: the entertainment value of talking to a robot in a box. It's sold as the future because you can pretend you're talking to the computer on the Enterprise, but really you're just playing Zork.
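(A quick aside on the "nearby output tokens" idea above: at its core, that's just exposing the model's next-token probability distribution instead of silently sampling from it. A minimal sketch, in pure Python with made-up logits standing in for a real model's output, of what such an interface would surface:)

```python
import math

def top_k_alternatives(logits, vocab, k=3):
    """Return the k most likely next tokens with their probabilities.

    logits: raw (unnormalized) scores the model assigns to each token
    vocab:  the token strings, in the same order as logits
    """
    # Softmax over the raw logits, shifted by the max for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank candidates by probability, highest first
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits for four candidate next tokens (not from a real model)
vocab = ["cat", "dog", "car", "cap"]
logits = [2.0, 1.5, 0.3, 0.1]
for token, p in top_k_alternatives(logits, vocab):
    print(f"{token}: {p:.2f}")
```

A UI with "sliders and access to the guts" would render this ranked list at each generation step and let the user pick, rather than hiding the choice behind a single sampled token.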
Hi Kalen, thanks for this thoughtful comment. I agree with you. I too would love this sort of dashboard for playing with the outputs of the models, and having these sorts of knobs would be very helpful both for understanding what the models are doing and for figuring out how best to use them for specific applications. Unfortunately, I also suspect you're right that this is not interesting to the big players right now, for the reasons you mentioned, but I am hopeful that others will see this as a valuable use case and innovate there.