Teaching Ladders

by radimentary

I’ve been teaching math to people one or two levels below me my entire life. Although this seems like a limitation, I think it’s the natural state of affairs.

On the Kiseido Go Server (KGS), there’s a room called the KGS Teaching Ladder where players can find teaching games with players just a few stones stronger than them. The few times I participated, it was extraordinarily positive. Because of the relative linearity of progression in Go, losing to a slightly stronger player is legible: they will usually play a move you considered but just barely don’t understand, or find the simplest good moves that you don’t know yet. Losing to a much stronger player, however, is completely illegible. Much stronger players will often play completely incorrect moves (“overplays”) just to test your instincts, or play otherwise incomprehensibly complicated variations and traps that you immediately fall into.

I distinguish between two models of teaching:

  1. (Traditional) The master teaches everyone.
  2. (Teaching Ladder) The students one or two stages up from you teach you.

Previously, I noted that many progressions come in three stages: “naive, cynical, naive but wise,” where the third stage bears more resemblance to the first than the second. The value of Teaching Ladders is that they naturally mesh with the three stages: Stage 3’s have a difficult time teaching Stage 1’s, and Stage 2’s are needed to fill that gap.

Scaffolding and Assimilation

The history of every major galactic civilization tends to pass through three distinct and recognizable phases, those of Survival, Inquiry and Sophistication, otherwise known as the How, Why, and Where phases. For instance, the first phase is characterized by the question ‘How can we eat?’, the second by the question ‘Why do we eat?’ and the third by the question, ‘Where shall we have lunch?’ (Douglas Adams, “The Hitchhiker’s Guide to the Galaxy”)

In Singularity Mindset, I articulated the following model of development without explaining its origins:

Oftentimes, progress curves look like “naive, cynical, naive but wise”:

  1. For mathematicians, the curve is pre-rigor, rigor, post-rigor.
  2. Picasso said, “It took me four years to paint like Raphael, but a lifetime to paint like a child.”
  3. Scott Alexander foretold that idealism is the new cynicism.
  4. Knowing about biases can hurt you.

This is a general phenomenon which applies not just at the level of an entire field, but also at the level of individual skills. With Terence Tao’s example of mathematics (pre-rigor, rigor, post-rigor) in mind, the stages look like:

  1. (naive) The student has bad instincts. He thinks that proof by example is a valid argument. Saying “trust your instincts” doesn’t help and deeply frustrates him. Progress is achieved by dropping the instincts, acquiring explicit knowledge, and following fixed and deliberate rules.
  2. (cynical) The student has the knowledge and understands the fixed and deliberate rules. He starts every proof with a cookie-cutter template for proof by induction or proof by contradiction and fills in the logic line by line. Unfortunately, doing everything by System 2 is slow and clunky. Progress is achieved by pushing acquired knowledge back down to System 1 via practice, metaphor, and exploration.
  3. (naive but wise) The student has successfully integrated skills into System 1. He produces intuitive arguments that only mention the salient details. A completely rigorous proof can be reconstructed on demand, but requires effort. At this point the explicit structures originally built to progress to Stage 2 are unnecessary, and are slowly taken down.

This model resembles – and perhaps generalizes – the interaction of Babble and Prune, where conscious Prune filters are slowly pushed down into the subconscious Babble. Learning occurs as superior algorithms are constructed in System 2 and then pushed back down to System 1, the instinctual level. After the algorithm is constructed, however, the remaining machinery in System 2 is outdated scaffolding. The farther along a student is past Stage 2, the more of this scaffolding is forgotten.

Let’s call the transition between Stage 1 and Stage 2 Scaffolding and the transition between Stage 2 and Stage 3 Assimilation. Every progression in every domain looks roughly like a ladder built out of alternating Scaffolding and Assimilation rungs. During Scaffolding, bad instincts are explicitly corrected with procedure and hard-and-fast rules. During Assimilation, that explicit Scaffolding is practiced and stretched until it becomes instinct. Afterwards, although there is rarely an explicit call to remove the Scaffolding, it is no longer in use and slowly crumbles, leaving only the pure instinct behind.

Teach Scaffolding

In a traditional teaching model, the master teaches students at all levels of development, from precalculus to (infinity, 1)-categories. The basic pitfall of this model is usually described as Expecting Short Inferential Distances: the master has a hard time reaching back down the tall tower of inferences to meet her students. She may even be a great speaker, throwing down her instincts in the form of quirky metaphors in an attempt to boost her students up. But she is no Rapunzel and the students are left staring up longingly from the bottom of the tower. Every so often, one of them tries to hop up and make progress by asking what that symbol means, but the tower is too damn high.

Long inferential distances are certainly part of the problem, but even if the master is sufficiently humble to back down a hundred steps, she may lack key pieces of Scaffolding that are required to convey ideas to the students. A master looks down the tower of inference and sees only the transition between Stage 1 and Stage 3, as if Assimilation can be achieved without the Scaffolding. She would never dream of teaching proof by contradiction with a cookie-cutter mad-lib proof template, but that seems to be an effective starting point for students who’ve never handled proofs.

Teaching Ladders, on the other hand, not only reduce inferential distance but introduce teachers who still have their Stage 2 Scaffolding mostly intact. That’s why (in my experience) the TA in an undergraduate-level math course is usually more effective than the lecturer. Unless they are explicitly trained in teaching, even lecturers who understand and correct for the inferential gulfs involved lack the mental machinery to convey Scaffolding.

Often when I’m asked to teach a concept I recently learned, I have the urge to punt it to a known master’s writing or lectures. Today I’m learning to fight that urge. However incomplete my knowledge, I convey it with Scaffolding intact, and that will do more good than harm.