We Must Help Our Students Engage with AI
If for no other reason than to help them sniff it out and understand how to proceed wisely
Thank you for being here. As always, these essays are free and publicly available without a paywall. If you can, please consider supporting my writing by becoming a patron via a paid subscription.
The smoke is beginning to clear. As the hype around LLMs begins to dissipate, it seems that we're finally coming to some more realistic visions of the impact they might have. It is becoming increasingly hard to ignore that LLMs are not likely to be the path to artificial general intelligence that some hoped. OpenAI is trying to supplement its GPT models with additional prompting strategies to encode "reasoning" capabilities, but this is itself a bellwether of the diminishing returns of throwing more training data and computational power at the GPT architecture.
I suspect there are still performance improvements to be had, but that they will be incremental rather than transformational. We saw some significant jumps in capability from GPT-3 to GPT-3.5 to GPT-4, but we are reaching a plateau. Performance leaps are increasingly difficult to come by. A logistic function looks exponential in the short term but ultimately approaches an asymptote.
LLMs are not intelligent, but they are useful nonetheless
All this being said, it would be a mistake to dismiss LLMs altogether. While LLMs are not intelligent and we're seeing diminishing returns on their performance as time goes on, this does not mean that they are not useful. It just means they are not useful for the types of artificial general intelligence tasks that Silicon Valley, in its quest to attract billions of dollars of venture capital funding to buy GPUs, wanted us to believe they might be suited for.
Even if our state-of-the-art LLMs get no better than they currently are, they'll continue to be increasingly useful as we fine-tune them for specific use cases. I suspect there are still many useful applications that can be developed by taking a general foundation model and fine-tuning it on a set of proprietary data.
There are also new, unexpected uses waiting to be discovered. The capability of LLMs to interact with images and video is still largely untapped. While the ability to generate media often gets the most press (e.g., Midjourney, Stable Diffusion, DALL-E, Sora), the ability to process media will be much more useful in practical applications. Models like ChatGPT are very good at transcribing and processing media: we've seen this with Whisper, the speech-recognition model under the hood of ChatGPT's excellent audio transcription. GPT-4 is also very good at optical character recognition, easily transcribing text from images.
How should educators respond?
With all this in mind, I've been returning to the question that educators have been wrestling with ever since ChatGPT launched two years ago: how should I handle generative AI in my classes? As we see the hype cycle cool, we're going to see lots of opportunities to explore AI together with our students. Today I want to lay out some of the ways I'm doing that with my students and argue that my approach provides a blueprint worth adopting.
The core of my argument remains the same: we must engage with generative AI, but we need not integrate it. It's a subtle point, but engaging with AI allows us to experiment without giving in to a rhetoric of inevitability. While LLMs may be useful in many areas, we should not adopt them in our classrooms unless we do so with a clear plan and a pedagogically-motivated rationale. Similarly, as we approach and explore these tools, we must find ways to partner with our students and engage these tools from a foundation of trust, helping them understand that our goal is to find ways to approach these tools wisely.
The calculator analogy revisited
Once we get away from the Silicon Valley quest for artificial general intelligence (AGI), the way forward is clearer. I've seen many arguments over the calculator analogy. In general, folks seem to be caught up in the operational difference between calculators and LLMs. They focus on the fact that LLMs and calculators function in different ways, noting the fundamental distinction between the deterministic behavior of a calculator (the same inputs always produce the same outputs) and the stochastic behavior of LLMs (the same input can yield different outputs because of the randomness built into how they generate text).
This is the wrong way to draw the battle lines. Focusing on the operation of these tools misses the larger point: both of these technologies are tools. Every tool has unique strengths and weaknesses based on how it operates and the task to which it is applied. For some applications (e.g., mathematical computation), it is critical that you get a deterministic, consistent result. For others, like ideation, the ability to generate different answers in response to the same prompt can be a clear advantage. When we focus on the operational differences, we miss the bigger picture: a tool's weaknesses at tasks it was never designed for tell us little about its value. We don't judge a hammer on its ability to turn a screw.
Where LLMs really shine is the natural language interface. Once you burst the illusion that there is an actual intelligence on the other side of the line, you can make some progress on interesting use cases. While LLMs can't reason, they are shockingly good at synthesizing answers from their training data. This is one place where LLMs can be significantly more useful than a traditional search engine. I find myself increasingly turning to LLMs whenever I need help with a question about how to configure some obscure application in Linux or to explain how something works. Given its training data, the LLM can use the prompt to synthesize information that helps me make progress on my question. It is also useful that it can adapt its answers to the particular error messages or logs that I'm seeing. If you were to search for those error messages in a traditional search engine, you would often fail to find any results.
This view of LLMs also helps guard against the pitfalls of the stochastic nature of the generated answers. Whenever I query ChatGPT or Claude, I know that the answer is just a stochastic interpolation of the training data: the model is stitching together pieces of what it has seen to generate a plausible response. This naturally produces hallucinations, but if you understand that these tools generate text by predicting probable tokens in a sequence, you can guard against the deception.
This is why understanding how LLMs work is so important. We need knowledge, but even more importantly we need wisdom. Wisdom goes beyond knowing how something works and helps us understand when and how we should use it.
How it’s going on campus this fall
These questions are not hypothetical for me. I'm continually thinking about how to engage with AI in my classes.
Before the semester started, I shared my approach to AI in my classes. I started with a clear AI policy in my syllabus. Then, on the first day of class I was forthright with my students: from some experimentation, I was convinced that GPT-4 could solve all of the design projects in the first half of the class. I tried putting a number of them into GPT-4 and was very impressed with the output. With this in mind, I made a case for why it was important to do the labs without the aid of an LLM. Even though the LLM could create the product, it would short-circuit the process of learning. Sure, my students could use an LLM to generate the code, but in doing so they would be robbing themselves of the chance to build the expertise needed to become an accomplished digital designer.
This approach can map across a range of different fields and classes. LLMs have been said to democratize expertise. In fact, they democratize the appearance of expertise. It's one thing for me to use an LLM to generate the solutions to the design projects. I know what the solutions are supposed to look like, so I can look at what the LLM generates and evaluate whether it's any good or not. That expertise is hard-earned. If you've never built a system from scratch and learned what a good design looks like, you'll never be able to adequately judge the LLM's output.
However, once you develop a foundation in the fundamentals, an LLM can be very useful. In many engineering contexts, there is a great deal of work to be done on relatively mundane, repetitive tasks. Normally, if you need to do these sorts of things over and over, you automate them with a program or another tool. Once you're in this mode, the LLM is just another tool in your toolbox.
This is where the fundamental skills and judgment come into play. When you're designing a digital system, much of the design process comes down to adapting specific templates and code structures to a particular application. For example, most complex digital systems—including the processor in your computer—can be understood through the framework of a finite-state machine. If you've got a particular problem to solve, you simply need to map it onto the template of a finite state machine and then it's just a matter of typing the code. I don't see any reason why LLMs shouldn't be used to do this sort of work.
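To make that concrete, here is a minimal sketch of the kind of finite-state machine template I have in mind, written in SystemVerilog. The module, state names, and signals below are hypothetical placeholders rather than anything from the course projects; the point is the recurring structure (a state register, next-state logic, and output logic) onto which a specific problem gets mapped.

```systemverilog
// A minimal Moore-style finite-state machine template in SystemVerilog.
// The module, states, and signals here are hypothetical placeholders;
// a real design swaps in its own states, transitions, and outputs.
module fsm_template (
    input  logic clk,
    input  logic reset,
    input  logic go,    // example input: request to start
    input  logic done,  // example input: the work has finished
    output logic busy   // example output: asserted while running
);

    // 1. Enumerate the states of the problem.
    typedef enum logic [1:0] {IDLE, RUN, FINISH} state_t;
    state_t state, next_state;

    // 2. State register: advance on each clock edge, reset to IDLE.
    always_ff @(posedge clk) begin
        if (reset) state <= IDLE;
        else       state <= next_state;
    end

    // 3. Next-state logic: encode the problem's transitions.
    always_comb begin
        next_state = state;
        case (state)
            IDLE:    if (go)   next_state = RUN;
            RUN:     if (done) next_state = FINISH;
            FINISH:            next_state = IDLE;
            default:           next_state = IDLE;
        endcase
    end

    // 4. Output logic: a Moore output depends only on the current state.
    assign busy = (state == RUN);

endmodule
```

Filling in the states and transitions for a new problem is exactly the kind of mechanical adaptation an LLM can help with, provided you know enough to tell whether what it produces is right.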
Not only that, but the multimodal input of the LLM creates new opportunities to engage students in the design process. Now, I can have students try things like drawing diagrams of the hardware they want to implement and then writing a description of each module and what it should do. Then, they can simply take a photo of their diagram and description, upload it to their favorite LLM, and ask the LLM to generate the SystemVerilog from their drawing. This type of workflow would have been next to impossible to imagine before the advent of generative AI.
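As a purely illustrative example (the debouncer below and its signal names are hypothetical, not an actual course assignment), a student's written description might say something like "sample a noisy button input and assert pressed only once it has been held stable for about a thousand clock cycles," and the LLM would be asked to turn that description and the accompanying diagram into something like the following:

```systemverilog
// Hypothetical module an LLM might generate from a student's sketch and
// written description of a button debouncer. The student's job is to
// verify that this actually matches the diagram and intended behavior.
module debouncer (
    input  logic clk,
    input  logic raw_in,   // noisy button input from the diagram
    output logic pressed   // debounced output
);

    logic [9:0] count;     // saturating counter: ~1000 cycles of stability
    logic       last_in;

    always_ff @(posedge clk) begin
        last_in <= raw_in;
        if (raw_in != last_in)
            count <= '0;               // input changed: restart the count
        else if (count != 10'h3FF)
            count <= count + 10'd1;    // input stable: keep counting
    end

    // Report a press only after the input has stayed high long enough.
    assign pressed = (count == 10'h3FF) && raw_in;

endmodule
```

The heavy lifting, deciding whether that code is trustworthy, still rests on the fundamentals built in the first half of the course.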
We’re now at the part of the semester where the design projects from the first half are wrapped up, and we're opening up the throttle. In the second half of the course, I am encouraging students to experiment extensively with AI and use it however and whenever they want. My only caveat is that I want them to leverage the foundation they've built in the first half of the class to interrogate the outputs they get back and then report on their experience in their final reports. Along the way, I want them to ask questions:
Is the code they get back any good?
Does it make sense?
Can they understand what it's doing?
How is it impacting their ability to design and debug complicated systems?
A generalized framework for engaging AI
As we think more generally about AI in education, this framework feels like a fruitful one. First, begin without AI. Then, at some point in the class, encourage students to experiment with AI and report on their experiences. In any class, we need to isolate the most important skills that students need to build without the use of AI. These skills are important even when you're using AI because they provide the foundation from which you can critique and wisely apply AI.
Here are a few principles to consider:
Understand that LLMs are interpolation engines and are not intelligent. They can synthesize information from their training data, but are fundamentally powered by a stochastic process. This means that they generate plausible answers, but will always be prone, at least on their own, to interpolating factually incorrect statements.
Think of LLMs as natural language text processors. In this mode, LLMs can parse textual input and analyze it in ways that were previously not possible. You can use them in the vein of an advanced spellchecker. Just like spellcheck enables you to critique a particular aspect of your writing, LLMs can enable you to analyze aspects of your work in ways that were impossible with previous tools. For example, you can ask an LLM to look for repeated phrases or for weak points in your argument where you may need to bolster your case with more evidence.
Retain your agency. Just like any other feedback you receive on your work, remember that you are ultimately the one who decides what to do with it.
Double down on metacognitive reflection. LLMs provide an excellent opportunity to encourage ourselves and our students to consider how and why we are learning. If an LLM can reproduce the work we are doing, does this mean that work is meaningless? If the only value of the work is the final product, then perhaps. But in many situations, especially in education, the process is the most important part.
Try weird things. The fact of the matter is that LLMs remain mysterious. We have some idea of how they work, but we still don't fully understand them. As we engage AI, we should continue to try crazy ideas and experiment with it in various contexts. This remains the best way to pique your interest in AI: just play around with it and see if it's useful.
LLMs are here to stay. I’m increasingly confident that there aren’t any big breakthroughs coming from just throwing more data and computational power at LLMs, but in the process of trying, we’re going to continue to see LLMs become cheaper and more accessible. This commoditization, combined with the tempering of the hype cycle, opens opportunities for us to find practical uses for these technologies.
As LLMs get cheaper to use, we’ll also see them increasingly integrated into the tools we use every day, tools that up until now have not had or needed generative AI capabilities. This is what one writer calls “The Button.” If we want to be able to detect the button and evaluate whether we want to use it, we need to understand what’s going on under the hood. We need to develop a feel for the contours of generative AI and how these models behave. Only then will we be able to use generative AI wisely.
As we approach this brave new world of generative AI, we must do it together with our students. In the world they are entering as professionals, generative AI will be all around them. As educators, we have a responsibility to help them learn how to see through the hype and understand where these tools will be helpful and where they should be avoided. As we consider how this impacts our courses, we have an opportunity to partner with our students to explore generative AI together. I hope we don’t miss it.
Got a comment? Share it below.
Reading Recommendations
This week’s recommendations include two excellent pieces from The New Atlantis.
The first is a piece from Brian Boyd titled “B.S. Jobs and the Coming Crisis of Meaning.” In it, he examines AI, and the move to develop AI agents in particular, through the lens of the meaning of work.
If we want to retain any semblance of self-governance, we will need to find means for subsidiary, localized, even personal ownership and management of these AI agents. Otherwise, the triumph of postmodernity will entail a return to premodern hierarchies of social organization, where only a small minority has sufficient access, ownership, and status to exercise their agency and attain the full range of human excellence. The creation of perfect AI servants, if embedded in social structures with roles designed to maximize profit or sustain oligarchy, may bring about not a broad social empowerment but a “servile state,” formalizing the subjugation of an underclass to those who control the means of production — or of perception, to borrow from James Poulos. To answer “Who, whom?,” we need only discover who designs the virtual- and augmented-reality headsets, and who wears them; who instructs the AI agents, and who is impacted by their actions.
The second is an essay by Levin from the most recent fall edition titled “For Whom Shall We Build?” (pdf). Written as a critique of and response to Marc Andreessen’s Techno-Optimist Manifesto, Levin’s response is one of the most thoughtful that I’ve read. At the core of his thesis, he looks inward to see the problems rather than always looking outward.

We fail to build in our time because we are in something like that vain great man’s position. We direct our political energies to terrifying ourselves with imaginary visions of catastrophe rather than to building for the future, because we cannot relate to the future. We do not view ourselves and the denizens of that future as links in a chain. We do not naturally see ourselves as the beneficiaries of our fathers’ and mothers’ generosity, and so as owing the same to our children. We fail to take the long view.
He closes with a call to build, but not for ourselves. We ought to build for those who come after us, realizing the limitations of our own frailty and the meaning of what we can pass on to our descendants.
Ultimately, the kind of humility we lack most is the willingness to understand ourselves as working for those who will come after us. Such humility, indeed any humility, is very rare among today’s technologists, and so Andreessen’s call for it is a welcome sign of seriousness and of health. He seems to grasp that building the future means building for others.
If Andreessen’s warning against despair and paralysis is going to resonate, we will need to orient ourselves toward those others, and to see that our labors here, under the sun, can really only matter if the prospect of those who will come after us inheriting our handiwork can fill us with joy, hope, and gratitude. It’s time to build for them.
The Book Nook
I’m desperately behind in our latest book club book, We Solve Murders, by Richard Osman. Wish me luck as I try to catch up by the time we meet on Thursday!
The Professor Is In
Yesterday I ran a lunch workshop at Mudd in partnership with our Office of Career Services to help students build their own portfolio websites using Quarto. We all had a good time and managed to get almost the whole group through from a blank slate to a basic template for their site that they can adapt.
If you’re interested, you can check out the source code here. I also published a copy of the demo website and the slides from the workshop.
Leisure Line
I realize this makes me old, but I love our hummingbirds. This week I managed to get a shot of one of them perched on one of the wires coming into our house off of the utility pole. Such fun little birds.
Still Life
Stumbled on this little guy in the backyard this weekend. My iPhone’s photo app says it is an Ammopelmatus muwu. At any rate, it was a pretty cool-looking bug, and a big one: probably almost an inch long!