The Education Endowment Foundation published a report in 2021 on some cognitive neuroscience strategies that should be integrated into the education system.
The techniques are not magic - the prefix ‘neuro’ inevitably conjures up people looking like Peter Capaldi in The Suicide Squad - but are, for the most part, fairly intuitive.
Here they are, from the EEF, in scintillating bullet-point format:
- Spaced Learning: Distributing learning and retrieval opportunities over a longer period of time rather than concentrating them in ‘massed’ practice.
- Interleaving: switching between different types of problem or different ideas within the same lesson or study session.
- Retrieval Practice: Using a variety of strategies to recall information from memory, for example flash cards, practice tests or quizzing, or mind-mapping.
- Dual Coding: Using both verbal and non-verbal information (such as words and pictures) to teach concepts; dual coding forms one part of a wider theory known as the cognitive theory of multimedia learning (CTML).
- Strategies To Manage Cognitive Load: focusing students on key information without overloading them, for example, by breaking down or ‘chunking’ subject content or using worked examples, exemplars, or ‘scaffolds’.
I’m going to look at the first three points here. Pitching “strategies to manage cognitive load” as a new educational strategy feels a bit like I’ve rocked up to a bakery to pitch my idea of making dough rise using yeast. Dual coding involves showing people learning stimuli that are multi-sensory. Radical. I think this is in there to counter the idea that people have specific learning styles; that there are ‘visual learners’, or ‘auditory learners’, or ‘kinaesthetic learners’, which hasn’t held up in the literature. This Veritasium video is a good riposte to this idea.
Understanding how information actually accrues in the ol’ cranium has been studied since Hermann ‘The Germann’ Ebbinghaus sketched out some ‘Forgetting Curves’ in the 1880s. These were a way of showing how we lose information that we learn over time. The drop-off in retention is typically pretty aggressive.
The decline also continues beyond what this graph shows, usually down to around 30% of the learned information, though this varies massively depending on context.
Ebbinghaus also proposed the part of the graph where people retrieve things in order to retain them. Note how the drop-off in retention gets shallower the more that items are retrieved.
And hey, look! It replicates!
The Forgetting Curve shows the benefits of spaced over massed practice. Briefly, massed practice is akin to cramming - studying a topic really intensely for a short period of time. Spaced practice involves leaving gaps between learning sessions, during which you forget things. In this sense, it’s actually an example of the retrieval effect, where you remember things better when you’re forced to recall them or produce them yourself.
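For the numerically inclined, here is a minimal sketch of that idea in Python. It assumes the common textbook simplification of the forgetting curve, R = exp(-t / S), where t is time since the last review and S is how 'stable' the memory is; this is not Ebbinghaus's own fitted formula. The rule that a review strengthens a memory more when more has been forgotten is an assumption made up purely for illustration, and the names retention, final_stability and growth are hypothetical.

```python
import math


def retention(days_elapsed: float, stability: float) -> float:
    """Fraction retained after days_elapsed, using R = exp(-t / S).

    Assumption: a simplified exponential curve, not Ebbinghaus's own fit.
    """
    return math.exp(-days_elapsed / stability)


def final_stability(review_days: list[float],
                    initial_stability: float = 1.0,
                    growth: float = 2.0) -> float:
    """Memory stability (in days) after studying on each of review_days.

    Assumption (illustrative only): each review multiplies stability by
    1 + growth * (fraction forgotten since the previous review), so a
    review made after some forgetting strengthens the memory more.
    """
    stability = initial_stability
    for previous, current in zip(review_days, review_days[1:]):
        forgotten = 1.0 - retention(current - previous, stability)
        stability *= 1.0 + growth * forgotten
    return stability


# Four study sessions each: crammed into one day vs spread over a week.
massed = final_stability([0.0, 0.1, 0.2, 0.3])
spaced = final_stability([0.0, 1.0, 3.0, 7.0])

# Compare retention one week after the final session of each schedule.
print(f"massed: {retention(7, massed):.2f}")
print(f"spaced: {retention(7, spaced):.2f}")
```

Under these toy assumptions the spaced schedule ends up with a much flatter curve than the crammed one, despite the same number of sessions. The exact numbers mean nothing; the shape is the point.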
Also worth a shout out is the generation effect, where you remember things better if you have to do some cognitive work to generate them. For instance, in Make It Stick, they give the example of learning classic phrases like “foot-shoe”. Learning was more effective if people were asked to learn the phrases and fill in parts of the phrase themselves - for instance, “foot-s__e”. Bear this in mind next time you set yourself a list of nonsense phrases to learn.
The authors of Make It Stick and Gwern really emphasise the central problem here, namely, that massed practice feels like you’re learning really well. If you go over and over a topic, you get a false sense of confidence about how effective your learning has been.
And this is a rule that generalises to all of these techniques. Retaining information seems to require cognitive effort, so any strategy which makes you feel like you’re learning lots of stuff easily probably isn’t very effective. If you feel like you’re struggling, you’re probably learning more than someone who isn’t, or, concerningly, you might just be struggling. See the problem?
What’s needed is what the educational psychologists Elizabeth and Robert Bjork call “desirable difficulties”. These are difficulties that are surmountable with enough effort.
Here’s Alexander Grothendieck, one of the most influential mathematicians of the 20th century, talking about how he had to really struggle through ideas:
Since then I’ve had the chance in the world of mathematics that bid me welcome, to meet quite a number of people, both among my “elders” and among young people in my general age group who were more brilliant, much more ‘gifted’ than I was. I admired the facility with which they picked up, as if at play, new ideas, juggling them as if familiar with them from the cradle–while for myself I felt clumsy, even oafish, wandering painfully up an arduous track, like a dumb ox faced with an amorphous mountain of things I had to learn (so I was assured) things I felt incapable of understanding the essentials or following through to the end. Indeed, there was little about me that identified the kind of bright student who wins at prestigious competitions or assimilates almost by sleight of hand, the most forbidding subjects.
Nonetheless, his suffering through ambiguity produced some of the most important work on algebraic geometry, homological algebra, and K-Theory. Please don’t ask me to explain why it is important. It’s important, okay?
I’ve been trying to determine how many of these techniques are actually taught in schools. The EEF suggests a figure:
Our survey of teachers found that over 85% of respondents said that cognitive science strategies were central to their own approach to teaching.
But if you were a teacher and you were asked whether you were using cognitive science strategies, you probably would say yes, right? Presumably the EEF are aware of this, and tried to design the survey in such a way that this didn’t happen. But the tone of the question in the image below suggests that they just went ahead with it anyway, so, who knows.
A psychology teacher friend points out that OFSTED have recently changed some of their guidance, and the inclusion of interleaving now forms part of their assessment criteria, so we will probably see more of that in schools. But she also points out that schools are often fairly insular, and what happens in one school means little for what goes on in another school.
It seems really unlikely spaced practice or retrieval practice is being utilised en masse. This is because teachers are inherently lazy and need to be whipped before they incorporate new things into their teaching style - oh no, wait, it’s because asking teachers to repeat parts of an already overloaded curriculum is insane.
Here are some survey answers about why it doesn’t get used:
The curriculum is crowded, with too much to be ‘covered within a specific time frame’, so taking lesson time away to revisit previous content ‘is a luxury that is rare’.
‘There is no time to recap or interweave previously taught material.’
‘Exam syllabus restrictions can limit opportunities to develop spaced practice and interleaving.’
‘[Spaced practice is] easy over a few years but in a GCSE or a level course it’s tricky to have sufficient teaching time.’
‘Spaced practic[e], and to a degree retrieval practice, both need time, the first for planning and the second in terms of lesson time.’
‘Talking to other teachers, one concern is common: that the schemes people are using doesn’t allow for [spaced learning].’
‘School timetabling tends to block subjects, making spacing more difficult.’
‘[It’s] more an organisational challenge than a reflection of implicit difficulty in implementation.’
‘Spaced practice [is] more difficult to implement, as not a whole-school policy [so] not well-recognised by leadership [and] seen as an unknown.’
‘Spaced practice has been tricky to plan whole school through a two-year rolling programme.’
‘Spacing the scheme of work makes it difficult—[I] need [the] whole department to adjust.’
Some of these are about the way departments are structured, which could be changed. Most of them are about time. The problem is that these two particular techniques are the most effective for learning, as John Dunlosky and Katherine Rawson point out:
Two techniques—practice testing and spaced practice—received the highest marks because the evidence consistently demonstrated benefits to students’ learning across a wide variety of materials and for learners of different ages and abilities. These techniques really work, and they work particularly well when combined in a technique called successive relearning. Successive relearning … is based on combining practice testing with spacing across multiple sessions.
Reducing curriculum size to integrate spaced repetition and retrieval testing makes some sense, because there’s no point overfilling a curriculum that then doesn’t actually get learned.
There’s a Dunlosky et al. review here comparing how effective different learning methods are, which brutalises the most popular learning techniques:
Five techniques received a low utility assessment: summarization, highlighting, the keyword mnemonic, imagery use for text learning, and rereading. Summarization and imagery use for text learning have been shown to help some students on some criterion tasks, yet the conditions under which these techniques produce benefits are limited, and much research is still needed to fully explore their overall effectiveness. The keyword mnemonic is difficult to implement in some contexts, and it appears to benefit students for a limited number of materials and for short retention intervals. Most students report rereading and highlighting, yet these techniques do not consistently boost students’ performance, so other techniques should be used in their place (e.g., practice testing instead of rereading).
Highlighting and rereading, despite being two of the most common methods to learn, do very little for students. The benefits they do confer are more likely a result of spacing of practice (i.e. rereading over multiple days) than of benefits inherent to the activity itself. Bizarrely, it doesn’t even seem to matter which colour Stabilo marker you use to highlight things.
What are some good strategies to deploy, then, if you’re the sort of person who wants to learn things? Testing yourself after reading things is a big one. This offers much better retention (in the long run) across a vast range of studies, as compared with simply rereading the text.
Another is developing a habit of using Spaced Repetition Software (SRS). One of the key ideas of spaced repetition is that the best time to learn something is when you’re about to forget it. On its own this is not very helpful, because when you’re about to forget something is also a pretty hard time to learn it.
SRS solves this problem for you. You are tested on a flashcard, you put in how well you did, and the system interprets your confidence estimate. If you think you did well, it pushes that flashcard a long way into the future. If you think you did badly, it shows you that flashcard again tomorrow.
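As a rough sketch of that loop in Python: the Card class, the review and due_cards functions, and the 2.5x multiplier are all hypothetical choices for illustration. This is a deliberately crude scheduler and not Anki's actual algorithm, which tracks more state per card and offers more than two answer buttons.

```python
import math
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Card:
    front: str
    back: str
    interval_days: int = 1                        # current gap between reviews
    due: date = field(default_factory=date.today)  # next day the card is shown


def review(card: Card, did_well: bool, today: date,
           multiplier: float = 2.5) -> Card:
    """Reschedule a card based on a self-reported answer.

    Assumption: a good answer stretches the gap by `multiplier` (rounded up),
    a bad answer resets the gap so the card comes back tomorrow.
    """
    if did_well:
        card.interval_days = math.ceil(card.interval_days * multiplier)
    else:
        card.interval_days = 1
    card.due = today + timedelta(days=card.interval_days)
    return card


def due_cards(deck: list[Card], today: date) -> list[Card]:
    """Cards whose due date has arrived (or already passed)."""
    return [card for card in deck if card.due <= today]


# Usage: one good answer pushes the card out, one bad answer pulls it back.
today = date(2024, 1, 1)
card = Card("der Hund", "the dog", due=today)
review(card, did_well=True, today=today)
print(card.due)
review(card, did_well=False, today=card.due)
print(card.due)
```

Each good answer multiplies the gap, so a card you keep getting right quickly drifts weeks and then months away, while anything you fumble snaps straight back to tomorrow.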
Here’s a fun chart of how long I have until I’m shown various words again from a German vocabulary Anki deck:
Spaced repetition works very, very well for anything that is a simple factual representation of something else. There are two categories that feature prominently on the list of popular Anki decks; the first is medical and anatomical knowledge, and the second is foreign language vocabularies.
If you’re learning a language, or what the human body looks like on the inside, or anything else that is not complex or abstract, but a simple list of many, many facts, then Anki, or SuperMemo, or Mnemosyne is currently the best way to do so. Given UK law exams seem to operate on similar principles - cram a load of stuff into your head, regurgitate for an exam - it should also work well for lawyers.
People have used it to facilitate deeper understanding of more complex topics. Andy Matuschak and Michael Nielsen are the most prominent advocates of this method. The basic principle is to utilise Anki to embed the key concepts of whichever more complicated thing you’re trying to understand into your head, and then you can chunk these more effectively when you’re thinking about more complicated ideas. Adults can generally hold 3-4 chunks of information in their heads at any one time, and children can usually hold 3 chunks, so this is relatively useful if you’re trying to understand a topic area. The difference between adults and children is that adults have more developed chunking strategies.
If you’ve never used SRS before, it’s best to start with simple learning in order to get to grips with how it functions and how to use it effectively.
SRS in this regard is obscenely effective versus its counterpart - creating a big list of stuff you want to learn, and then, uh, working through it? Before I used SRS I used to dread exams. Earlier this year I breezed through some exams which required a fairly complex understanding of the physics of MRI machines because I just Anki’ed different parts of the process - T1 versus T2, spin echoes, etc. As Nielsen puts it:
SRS makes memory a choice.
The main problem with SRS is that it requires self-motivation. You have to use your SRS every day. This makes it a difficult educational tool; weekends, half-terms and holidays will all mess with the SRS. Every day means every day. If you miss days, it doubles your workload the next day. I think most people would describe me as a self-motivated individual and I find this very hard work. It’s very easy to overload yourself with decks, and it’s very easy to read things and then not really bother to test yourself.
And that’s okay. We don’t need to remember every detail of every thing we learn, and if we did we’d go insane. Here’s Kim et al. on the benefits of forgetting:
Forgetting is often considered to be bad, but selective forgetting of unreliable information can have the positive side effect of reducing mental clutter, thereby making it easier to access our most important memories.
And this is important because your memories are not always particularly accurate:
Specifically, we show that the brain automatically generates predictions about which items should appear in familiar contexts; if these items fail to appear, their memories are weakened. This process is adaptive, because such memories may have been encoded incorrectly or may represent unstable aspects of the world.
One brave teacher has tried to utilise SRS ideas in the classroom, but it doesn’t work that well, and it seems fairly hacky:
SRS is best when used by a self-motivated individual, and my classroom antics are an ugly hack around the fact that self-motivation is a rare element this deep in the mines.
Anyone who can show us a way out will have my attention.
The problem of self-motivation is haunting, because one feels that if these strategies do become widespread, they will create further differentiation. It’s pretty easy to imagine anyone from private-schooled children with pushy parents to Chinese ‘chicken babies’ being rammed through the meatgrinder of extensive spaced repetition and testing. It’s harder to imagine other groups (cockerel babies? cow babies?) with access to fewer resources getting the same experience.
One of the most interesting parts of tanagrabeast’s attempt to teach children how to remember things effectively was that it changed their mind on how effective this strategy is. Forgetting things is generally okay, as I mentioned above. But if children are going to forget things, why are we teaching them? For me, this was the best part of tanagrabeast’s essay, so I’ll quote at length:
With a few mostly upper-level exceptions, though — math, physics, chemistry — most of what we teach in school is more conceptual than technical. We make you take history so you have a better model of how civilizations and governments work, not so you remember who shot Alexander Hamilton. We make you take English to improve your word-based input and output abilities, not so you remember the difference between simile and metaphor. At least, I hope we do.
Besides, even in the technical classes, forgetting is the near-universal outcome, and the long-term benefits are mostly conceptual — for if you don’t use these skills continuously for the rest of your life, you’re almost certainly going to lose them. Maybe more than once.
I’ve forgotten algebra twice. I’ve forgotten how to write code at least three times. I can’t do either one at the moment. But I’m still changed by having known them. I have an intuition for what sorts of problems ought to be mathematically solvable. I can think in terms of algorithms. And I could relearn either skill more easily than on the first or second occasions. Also, relearning has an anecdotal tendency to deepen understanding in a way that continuous retention may not, especially when approached from a different direction.
Still, as long as I’m defending retention, I think it’s valid to ask whether we should force kids (and often, by extension, their parents) to relearn math every frickin’ year. Consider: The conventional wisdom is that technical companies begrudgingly expect to have to (re)train most new workers in the very specific areas they need. They look to your resume and transcripts mostly for evidence that you have learned technical skills before and can presumably learn them again. I don’t think they care if you’ve re-learned them three times already instead of six. So, if we’re going to force kids to demonstrate intermediate math chops to graduate (a dubious demand), perhaps we could at least wait until the last practical moment, and then do it in bigger continuous lumps — like two-hour daily block classes starting in grade 9 or 10 — so they would have fewer opportunities to forget as they climb the dependency pyramid. Think of the tears we could save (or at least postpone).
Knowing how to do something changes your perception of that activity, and leads to transfer of similar ideas to different contexts. I didn’t know how the physics of sound worked when I worked as a podcast editor, but working with and editing frequency-time graphs every day for two years meant that when I came to the physics of sound, it was much more straightforward for me than it was for other people on that same course.
Transfer of skills also facilitates new ideas in different contexts. I’m planning to write next week or the week after about Friston and free energy, currently the dominant paradigm in neuroscience, which is essentially an idea borrowed from statistical physics and used to explain the brain. (Okay, it’s a little more complicated than that, but the basic principles are similar.) I loved Range by David Epstein for this reason. He researched how people with different backgrounds are often better at finding solutions for problems, especially when coupled with specialists.
The best way to learn? Learn concepts, learn principles, and learn the fundamentals of things. I think most tech CEOs have to tweet some version of that every year or the Illuminati revoke their contract.
Things of Interest:
On AI-generated images, and NFTs.
“These images are not simulacra because they don’t represent or imitate anything. The new modes of figuration don’t refer to anything at all. They are pictures from somewhere else. They are garbled whispers of code in the fall. Containing no meaning, more empty than a black square.”
Amazon is burning through its workers. I am mainly curious as to how it is possible to exhaust a supply of labour. This Recode article constantly emphasises ways Amazon can extend the lifespan of its workforce, but why wouldn't there be a relatively fixed pool to employ from? Potentially once people work for Amazon, they don't go back? Will do some more digging.
This is lovely:
I think these kinds of remixes are one of my favourite types of music:
I was reminded of The Holy Mountain by Jodorowsky this week, which goes down as one of the strangest yet funniest movies ever:
Later on they simulate the Spanish colonialists arriving in South America with lizards; it is delightful.