"Everything can be made radically elementary." ~Steven Rudich

Two Explorations

Much has been written about the fundamental opposition between explore and exploit, chaos and order, yin and yang. In this post I make two observations about the psychology of this opposition.

In the first part, I challenge the metaphor of the comfort zone: a slowly-changing region in activity-space where everything inside is comfortable and everything outside induces anxiety. The point is that anxiety depends not only on the spookiness of the activity itself, but on one’s proximity to safety. I am afraid because it is dark; I am terrified because the light switch is all the way down the hall. In particular, it is possible to reduce anxiety by bringing your comfort zone with you in the form of a safety behavior or a trusted companion. This serves as an alternative to Comfort Zone Expansion.

In the second part, I note that explore and exploit are often embodied in the human personality as two competing subagents. In almost everyone I’ve met, one of these subagents dominates the other. I tell four typical stories of this imbalance, and then suggest that something better is possible. This is perhaps the central example of integrating disagreeing subagents.

Part 1: Distance to Safety

1. Mother and Child

(The following is my retelling of an old story, dating back probably to at least Jean Piaget. I first encountered a variant of this story in The Monkey Wars, where identical behavior was observed in primates.)

A mother brings her child, a girl of perhaps three or four years of age, to an empty playground at the park. Autumn has progressed to the stage where leaves fall in twos and threes. The girl peeks out of the folds of her mother’s coat, staring now at the swing set, now at the neon tube slide. With a nudge, the mother pushes her daughter onto the mulch and gives her an encouraging nod. After glancing back to her mother several times, the girl slowly approaches the playground and begins to play.

Soon, she is clambering about, but periodically looks back at the bench where her mother is sitting. Each time their eyes meet, the girl waves, pig-tails dancing, and the mother waves back. The girl then returns to free play.

At some point, the mother briefly leaves her post to chat with an acquaintance. The girl, pepping herself up to go down the slide for the first time, looks back, finds her mother gone, and panics. She curls up into a ball at the top of the playground and fights back tears, trying to make herself as small as possible. The thought of going down the slide vanishes from her mind.

2. The Difficulty of Dark Souls

Dark Souls (III) is one of those difficult video games that spawn endless internet debates about whether every game needs an easy mode. (“I have enough challenge in my day job, I just want to relax when I play a game,” says the one. “Git gud scrub,” says the other.) Dark Souls is not in fact mechanically difficult, or at least not exceptionally so. Just off the top of my head, Celeste, XCOM 2, and FTL were all significantly more difficult mechanically for me than Dark Souls. And yet I do believe that Dark Souls was the hardest game I ever played.

Call me a coward, but the main difficulty of Dark Souls for me is the psychological difficulty of its dominant aesthetic: loneliness and nihilism. You, the protagonist, are born “Nameless, accursed Undead, unfit even to be cinders. And so it is… that ash seeketh embers.” All the friendlies in the world have been sitting in the same positions for eons before you died, and will be sitting there long after you return to ash. The brief encounters with friendly NPCs in the wild are ephemeral: each person passes in and out of your life, and you are as likely to meet them again as to stumble upon their ashes. The entire rest of the world is Out to Get You: treasure chests turn out to be mimics, halberdiers hide between crates, face-eaters hang from ceilings. Your job is to save a world that doesn’t care to be saved.

I could never play Dark Souls for more than a couple hours at a time, and found myself constantly teleporting back to base. I told myself I came home to level up, to upgrade gear, and to purchase items, but the truth of the matter is: I kept coming home just to hear a friendly human voice again.

Code Vein is one of many Dark Souls knockoffs, notorious mainly for the addition of anime waifus to the familiar formula. For all its abysmal enemy variety and boring level design, I loved playing Code Vein for one simple reason: I could bring a companion. Bringing a companion on my journey solved all my anxiety in-game. It didn’t matter so much that my digital companion got stuck in corners and committed suicide in boss fights and repeated the same half-dozen canned lines. The feeling that someone had my back let me enjoy a Souls-like world for entire afternoons at a time, and I almost never felt the need to teleport back home.


Human beings explore too little. One common solution to this problem is Comfort Zone Expansion, gently straying beyond the boundary of your comfort zone to notice things out there are not as scary as you thought. This can be a fine solution, but is inadequate for situations where you are required to immediately leave your comfort zone far behind for long periods of time.

Perhaps you travel internationally for the first time. Perhaps you attend a self-improvement workshop with cult-ish vibes in a remote location. Perhaps you try to prove the Riemann hypothesis, and are beset on all sides by the diabolical malice inherent in the primes.

If so, remember that exploration anxiety is a function of how far (you feel like) you are from safety.

The little girl comfortably clambers around the playground when her mother is nearby. When the mother disappears, the girl curls up in fear and is incapable of sliding down the exact same slide she was ready to go down a minute ago. The slide didn’t change; the girl’s distance to safety did. Playing Dark Souls, I teleport to base after finding each new bonfire (checkpoint). Playing Code Vein, I have a companion following me around, and the psychological need to return home disappears. Every time a married person gives an acceptance speech, they thank their spouse for being “their rock.” This is a rather unflattering term for “unwavering center of my comfort zone.”

What are concrete applications of this principle?

  1. One purpose of collaboration is for each collaborator to serve as a mobile comfort zone for the others. This might explain why most successful startups are founded by a small group and not an individual or a larger group. The comfort zone effect hits rapidly diminishing returns past one trusted collaborator. In this lens, the purpose of open communication in collaboration is simply to feel psychologically close to the other person.
  2. The optimal solution to comfort zone expansion may be planting a small number of well-spaced “bases of operation.” Instead of continuously expanding one connected chunk of activity-space, plant comfort flags on the points of an ε-net. Comfort zones, like lighthouses and highway truck stops, cover more space if you place them far apart. If anxiety is a major limiting factor for you, consider focusing your energy on a small number of extremely different activities so that the comfort zones that radiate out of each one together cover as much space as possible.
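To make the ε-net point concrete, here is a toy sketch in Python. It assumes activity-space is a one-dimensional line and that each comfort zone is an interval of fixed radius around a base. Both are simplifications invented for illustration, and the name covered_length is hypothetical. The sketch only shows that the same three bases cover more ground when their zones do not overlap.

```python
def covered_length(centers, radius):
    """Total length of the union of intervals [c - radius, c + radius].

    Toy model (an assumption, not from the post): activity-space is a line,
    and each base of operation makes everything within `radius` comfortable.
    """
    if not centers:
        return 0.0
    # Sort the intervals by left endpoint, then merge any that overlap or touch.
    intervals = sorted((c - radius, c + radius) for c in centers)
    total = 0.0
    cur_lo, cur_hi = intervals[0]
    for lo, hi in intervals[1:]:
        if lo <= cur_hi:
            # Overlapping zones: extend the current merged stretch.
            cur_hi = max(cur_hi, hi)
        else:
            # Gap between zones: bank the finished stretch and start a new one.
            total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
    total += cur_hi - cur_lo
    return total


radius = 1.0
clustered = [0.0, 0.5, 1.0]   # three similar activities, zones mostly overlap
spread = [0.0, 2.0, 4.0]      # three very different activities, zones just touch

print(covered_length(clustered, radius))  # 3.0
print(covered_length(spread, radius))     # 6.0
```

In this toy model the spread-out bases cover twice as much of the line as the clustered ones, with the same number of bases and the same radius. Spacing them even farther apart would not add coverage; it would only open uncomfortable gaps between zones.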

Part 2: Explore Versus Exploit

The previous part upgrades the usual model for comfort zone expansion, taking for granted the value of exploration.

This part turns to a new topic altogether: the internal conflict between the drive to explore and the drive to exploit.

Hereafter I assume a multi-agent model of the mind, and refer to two common subagents: the “explore” subagent which tends towards freedom, creativity, contrarianism, and chaos, and the “exploit” subagent which tends towards structure, discipline, lawfulness, and order. Whether to interpret these subagents as full-blown independent subpersonalities or merely conflicting desires within a single mind is entirely up to you, and should not affect the meaning of this post.

I begin with four archetypal stories to illustrate varying levels of imbalance between “explore” and “exploit.” These are amalgams of real stories from my own life and others’.

1. The Unmoored

The Unmoored was stifled as a girl, surveilled at every quarter-hour by parents, teachers, tutors, and coaches. Every day from dawn to dusk was packed with activity that would gentle her mind and ennoble her condition. As her fingers marched from piano to textbook to tennis racket, her mind danced farther and farther away into Wonderland.

But even her dreams are turned against her. After she shows a passing aptitude for story-telling, her parents sign her up for creative writing classes and poetry jams. The beloved characters in her flights of fancy are clamped into straitjackets and paraded onto stage to be judged by panels of condescending curmudgeons.

When she finally escapes her shackles, there is no temperance to her wildness, no second-guessing, no backward glances. She drops out of pre-med, then out of college, then out of polite society altogether. The world is joy and light and one-way airfares.

Many years later, she snaps awake in the lap of a street urchin in the outskirts of Ulaanbaatar. She’d had a strange dream: she was back in front of the piano playing Mendelssohn, and she liked that feeling of rote and mindless obedience to the sheet music. She shakes off that absurd notion and takes another hit.

2. The Dreamer

Like the Unmoored, the Dreamer yearns to be free, to one day live fully unconstrained to pursue his creative vision. He has not quite decided whether he wants to write screenplays, musicals, or novels; his creative side finds even the idea of making such a decision fettering. In the meantime, he works a programming job that makes good money.

Unlike the Unmoored, the Dreamer knows the instrumental value of discipline and constraint. Under the flickering lamplight, he writes short stories with a tomato timer by his side, following a classic book of writing prompts. The more exotic the prompt, the more alive he feels bending around it.

At times, the Dreamer worries that his day job is changing him. He cannot help but take joy in turning on his monitors in the morning, in passing code review on the first try, in following procedures to the letter to build something that comes alive before his very eyes.

When he notices himself feeling this way, the Dreamer clamps down on this joy and reminds himself, “I hate this boring, technical job. I’m only working it to get my art off the ground. One day, I will finally be free of this drudgery.”

The tomato timer goes off and he goes back to writing. He never considers that part of his soul might not want to be free.

3. The Magpie

Unlike the Unmoored and the Dreamer, the Magpie is primarily motivated by the joy and comfort of the known. She delights in decorating and redecorating her cozy little apartment, in organizing her books and plants in tidy rows, and in folding origami animals that she can leave around all the rooms, each a permanent addition to her little family.

One day, she hopes to add a few extra bedrooms and a full bathtub to that apartment, and a partner and children to that family. She is the Dreamer’s coworker, but unlike him she hopes to stay at this programming job for the rest of her career. She imagines inviting direct reports into her well-lit corner office to admire her cactus collection and ask for her advice on the operating system she helped architect.

The Magpie understands the instrumental value of exploration and creativity, but fears it. Every year, she takes a vacation to a scary new place to challenge herself, but more importantly to bring back souvenirs, pictures, and memories, to better decorate her place. At work, she pushes herself to learn new technologies and programming languages, but she yearns for the day she won’t have to do this any more.

The Magpie believes that at the end of the day, one should only explore until one finds the best place to nest.

4. The Recluse

Like the Magpie, the Recluse is primarily motivated by comfort and familiarity. But his world was constantly in flux for too long, so he does everything in his power to hide away in his comfort zone and wall off the rest of the world.

His parents divorced before he entered middle school, and he ping-ponged between their two new families, never truly belonging to either. Traveling, learning new things, meeting strangers all terrify him, and yet he’s forced to do more and more of all of these to survive.

When he finally finds a place to settle down, he never leaves it again. In every relationship, he becomes deeply codependent. He is a consistent member of a few local clubs; new faces join and leave, but he is one of the few who remain through the years. If he had his way, the clubs would meet in his living room and he’d never have to go outside.

Occasionally, he reconnects with an old friend who is a Dreamer or an Unmoored, and becomes deeply fascinated with their alien way of being. It is hard for him to understand how comfort and familiarity can be claustrophobic, but he’s glad he has these friends. They bring him new knowledge and experiences in a safe and digested way, or if not, at least they send him postcards.


In each of the above four stories, there is an imbalance between the “explore” subagent and the “exploit” subagent.

The Unmoored lives to “explore.” The “exploit” subagent is greatly suppressed or externalized.

The Dreamer also lives to “explore,” but understands the instrumental value of “exploit.” He views “exploit” as an unsightly means to an end, and suppresses the needs of the personality (comfort, safety, order) associated with it.

The Magpie lives to “exploit,” but understands the instrumental value of “explore.” She views “explore” as a dangerous means to an end, and suppresses the needs of the personality (freedom, creativity, chaos) associated with it.

The Recluse also lives to “exploit.” The “explore” subagent is greatly suppressed or externalized.

In all four cases, one subagent or the other dominates the personality, and holds the other subagent and its needs in contempt. Internal alignment of the two subagents can only occur if the whole person not only recognizes the instrumental value of each subagent, but also respects its needs as ends in themselves. Here is my loose prescription for alignment, which might be attempted with an exercise like Internal Double Crux:

  1. If you are near the extremes (the Unmoored or the Recluse), learn to recognize at least the instrumental value of the suppressed subagent. If you lean heavily towards exploring, recognize that more systematic exploiting can often make you better at exploring in the long run. Similarly, if you lean heavily towards exploiting, recognize that more systematic exploring can often make you better at exploiting in the long run. Hopefully, you will level up into the Dreamer or the Magpie.
  2. If you are near the middle (the Dreamer or the Magpie), learn to respect the needs of the weaker subagent as ends in themselves. If you lean towards exploring, realize that it’s genuinely ok for you to enjoy checking boxes, following rules, and tidying things up. If you lean towards exploiting, realize that it’s genuinely ok for you to enjoy trying crazy things, breaking rules, and making a mess.

Pain is not the unit of Effort

(Content warning: self-harm. Parts of this post may be actively counterproductive for readers with certain mental illnesses or idiosyncrasies.)

What doesn’t kill you makes you stronger. ~ Kelly Clarkson.

No pain, no gain. ~ Exercise motto.

The more bitterness you swallow, the higher you’ll go. ~ Chinese proverb.

I noticed recently that, at least in my social bubble, pain is the unit of effort. In other words, how hard you are trying is explicitly measured by how much suffering you put yourself through. In this post, I will share some anecdotes of how damaging and pervasive this belief is, and propose some counterbalancing ideas that might help rectify this problem.

I. Anecdotes

1. As a child, I spent most of my evenings studying mathematics under some amount of supervision from my mother. While studying, if I expressed discomfort or fatigue, my mother would bring me a snack or drink and tell me to stretch or take a break. I think she took it as a sign that I was trying my best. If on the other hand I was smiling or joyful for extended periods of time, she took that as a sign that I had effort to spare and increased the hours I was supposed to study each day. To this day there’s a gremlin on my shoulder that whispers, “If you’re happy, you’re not trying your best.”

2. A close friend who played sports in school reports that training can be harrowing. He told me that players who fell behind the pack during daily jogs would be singled out and publicly humiliated. One time the coach screamed at my friend for falling behind the asthmatic boy who was alternating between running and using his inhaler. Another time, my friend internalized “no pain, no gain” to the point of losing his toenails.

3. In high school and college, I was surrounded by overachievers constantly making (what seemed to me) incomprehensibly bad life choices. My classmates would sign up for eight classes per semester when the recommended number is five, jigsaw extracurricular activities into their calendar like a dynamic programming knapsack-solver, and then proceed to have loud public complaining contests about which libraries are most comfortable to study at past 2am and how many pages they have left to write for the essay due in three hours. Only later did I learn to ask: what incentives were they responding to?

4. A while ago I became a connoisseur of Chinese webnovels. Among those written for a male audience, there is a surprisingly diverse set of character traits represented among the main characters. Doubtless many are womanizing murderhobos with no redeeming qualities, but others are classical heroes with big hearts, or sarcastic antiheroes who actually grow up a little, or ambitious empire-builders with grand plans to pave the universe with Confucian order, or down-on-their-luck starving artists who just want to bring happiness to the world through song.

If there is a single common virtue shared by all these protagonists, it is their superhuman pain tolerance. Protagonists routinely and often voluntarily dunk themselves in vats of lava, have all their bones broken, shattered, and reforged, get trapped inside alternate dimensions of freezing cold for millennia (which conveniently only takes a day in the outside world), and overdose on level-up pills right up to the brink of death, all in the name of becoming stronger. Oftentimes the defining difference between the protagonist and the antagonist is that the antagonist did not have enough pain tolerance and allowed the (unbearable physical) suffering in his life to drive him mad.

5. I have a close friend who often asks for my perspective on personal problems. A pattern arose in a couple of our conversations:

alkjash: I feel like you’re not actually trying. [Meaning: using all the tools at your disposal, getting creative, throwing money at the problem to make it go away.]

alkjash’s friend: What do you mean I’m not trying? I think I’m trying my best, can’t you tell how hard I’m trying? [Meaning: piling on time, energy, and willpower to the point of burnout.]

After several of these conversations went nowhere, I learned that asking this friend to try harder directly translated in his mind to accusing him of low pain tolerance and asking him to hurt himself more.

II. Antidotes

I often hear on the internet laments like “Why is nobody actually trying?” Once upon a time, I was honestly and genuinely confused by this question. It seemed to me that “actually trying” – aiming the full force of your being at the solution of a problem you care about – is self-evidently motivating and requires zero extra justification if you care about the problem.

I think I finally understand why so few people are “actually trying.” The reason is this pervasive and damaging belief that pain is the unit of effort. With this belief, the injunction “actually try” means “put yourself in as much pain as you can handle.” Similarly, “she’s trying her best” translates to “she’s really hurting right now.” Even worse, people with this belief optimize for the appearance of suffering. Answering emails at midnight and appearing fatigued at meetings are somehow taken to be more credible signals of effort than actual results. And if you think that’s pathological, wait until you meet someone for whom telling them about opportunities actively hurts them, because you’ve just created another knife they feel pressured to cut themselves with.

I see a mob of people walking up to houses and throwing themselves bodily at the closed front doors. I walk up to block one man and ask, “Stop it! Why don’t you try the doorknob first? Have you rung the doorbell?” The man responds in tears, nursing his bloody right shoulder, “I’m trying as hard as I can!” With his one good arm, he shoves me aside and takes a running start to lunge at the door again. Finally, the timber shatters and the man breaks through. The surrounding mob cheers him on, “Look how hard he’s trying!”

Once you understand that pain is how people define effort, the answer to the question “why is nobody actually trying?” becomes astoundingly obvious. I’d like to propose two beliefs to counterbalance this awful state of affairs.

1. If it hurts, you’re probably doing it wrong.

If your wrists ache on the bench press, you’re probably using bad form and/or too much weight. If your feet ache from running, you might need sneakers with better arch support. If you’re consistently sore for days after exercising, you should learn to stretch properly and check your nutrition.

Such rules are well-established in the setting of physical exercise, but their analogs in intellectual work seem to be completely lost on people. If reading a math paper is actively unpleasant, you should find a better-written paper or learn some background material first (most likely both). If you study or work late into the night and it disrupts your circadian rhythm, you’re trading off long-term productivity and well-being for low-quality work. That’s just bad form.

If it hurts, you’re probably doing it wrong.

2. You’re not trying your best if you’re not happy.

Happiness is really, really instrumentally useful. Being happy gives you more energy, increases your physical health and lifespan, makes you more creative and risk-tolerant, and (even if all the previous effects are unreplicated pseudoscience) causes other people to like you more. Whether you are tackling the Riemann hypothesis, climate change, or your personal weight loss, one of the first steps should be to acquire as much happiness as you can get your hands on. And the good news is: at least anecdotally, it is possible to substantially raise your happiness set-point through jedi mind tricks.

Becoming happy is a fully general problem-solving strategy. And although one can in principle trade off happiness for short bursts of productivity, in practice this is never worth it.

Culturally, we’ve been led to believe that over-stressed and tired people are the ones trying their best. It is right and proper to be kind to such people, but let’s not go so far as to support the delusion that they are inputting as much effort as their joyful, boisterous peers bouncing off the walls.

You’re not trying your best if you’re not happy.

Is Success the Enemy of Freedom? (Full)

I. Parables

A. Anna is a graduate student studying p-adic quasicoherent topology. It’s a niche subfield of mathematics where Anna feels comfortable working on neat little problems with the small handful of researchers interested in this topic. Last year, Anna stumbled upon a connection between her pet problem and algebraic matroid theory, solving a big open conjecture in the matroid Langlands program. Initially, she was over the moon about the awards and the Quanta articles, but now that things have returned to normal, her advisor is pressuring her to continue working with the matroid theorists with their massive NSF grants and real-world applications. Anna hasn’t had time to think about p-adic quasicoherent topology in months.

B. Ben is one of the top Tetris players in the world, infamous for his signature move: the reverse double T-spin. Ben spent years perfecting this move, which requires lightning fast reflexes and nerves of steel, and has won dozens of tournaments on its back. Recently, Ben felt like his other Tetris skills needed work and tried to play online without using his signature move, but was greeted by a long string of losses: the Tetris servers kept matching him with the other top players in the world, who absolutely stomped him. Discouraged, Ben gave up on the endeavor and went back to practicing the reverse double T-spin.

C. Clara was just promoted to be the youngest Engineering Director at a mid-sized software startup. She quickly climbed the ranks, thanks to her amazing knowledge of all things object-oriented and her excellent communication skills. These days, she finds her schedule packed with what the company needs: back-to-back high-level strategy meetings preparing for the optics of the next product launch, instead of what she loves: rewriting whole codebases in Haskell++.

D. Deborah started her writing career as a small-time crime novelist, who split her time between a colorful cast of sleuthy protagonists. One day, her spunky children’s character Detective Dolly blew up in popularity due to a Fruit Loops advertising campaign. At the beginning of every month, Deborah tells herself she’s going to finally kill off Dolly and get to work on that grand historical romance she’s been dreaming about. At the end of every month, Deborah’s husband comes home with the mortgage bills for their expensive bayside mansion, paid for with “Dolly money,” and Deborah starts yet another Elementary School Enigma.

E. While checking his email in the wee hours of the morning, Professor Evan Evanson notices an appealing seminar announcement: “A Gentle Introduction to P-adic Quasicoherent Topology (Part the First).” Ever since being exposed to the topic in his undergraduate matroid theory class, Evan has always wanted to learn more. He arrives bright and early on the day of the seminar and finds a prime seat, but as others file into the lecture hall, he’s greeted by a mortifying realization: it’s a graduate student learning seminar, and he’s the only faculty member present. Squeezing in his embarrassment, Evan sits through the talk and learns quite a bit of fascinating new mathematics. For some reason, even though he enjoyed the experience, Evan never comes back for Part the Second.

F. Whenever Frank looks back to his college years, he remembers most fondly the day he was kicked out of the conservative school newspaper for penning a provocative piece about jailing all billionaires. Although he was a mediocre student with a medium-sized drinking problem, on that day Frank felt like a man with principles. A real American patriot in the ranks of Patrick Henry or Thomas Jefferson. After college, Frank met a girl who helped him sort himself out and get sober, and now he’s the proud owner of a small accounting firm and has two beautiful daughters, Jenny and Taylor. Yesterday, arsonists set fire to the Planned Parenthood clinic across the street, and his employees have been clamoring for Frank to make a political statement. Frank almost threw caution to the wind and tweeted #bodilyautonomy from the company account right there, but then the picture on his desk caught his eye: his wife and daughters at Taylor’s elementary school graduation. It’s hard to be a man of principles when you have something to lose.

G. Garrett is a popular radio psychologist who has been pressured by his sponsors into being the face of the yearly Breast Cancer Bike-a-thon. Unfortunately, Garrett has a dark secret: he’s never ridden a bicycle. Too embarrassed to ask anyone for help or even be seen practicing – he is a respected public figure, for god’s sake – Garrett buys a bike and sneaks to an abandoned lot to practice by himself after sunset. He thinks to himself, “how hard can it be?” Garrett shatters his ankle ten minutes into his covert practice session and has to pull out of the event. Fortunately, Garrett’s sponsors find an actual celebrity to fill in for him and breast cancer donations reach record highs.

II. Motivation

What is personal success for?

We say success opens doors. Broadens horizons. Pushes the envelope. Shatters glass ceilings.

Success sets you free.

But what if it doesn’t?

Take a good hard look at the successful people around you. Doctors too busy to see their children on weekdays. Mathematicians too brilliant in one field to switch to another. Businessmen too wealthy to avoid nightly wining and dining. Professional gamers too specialized to learn a new hero. Public figures too popular to change their minds.

Remember that time Michael Jordan took a break from basketball and played professional baseball? They said he would have made an excellent professional player given time. Jordan said baseball was his childhood dream. Even so, in just over a year Jordan was back in basketball. It is hard not to imagine what a baseball player Michael Jordan could have been, had he been less successful going in.

I think it was in college that I first noticed something wasn’t right about this picture. I spent my first semester studying and playing Go for about eight hours a day. I remember setting out a goban on the carpet of my dorm room and studying patterns in the morning as my roommate left for classes; when he returned to the room in the evening, he was surprised to see me still sitting there contemplating the flow of the stones. Because this was not the first or tenth time this had happened, he commented something like, “You must be really smart to not need to study.”

I remember being dumbstruck by that statement. It suggested that my freedom to play board games for eight hours a day was gated by my personal success, and other Harvard students would be able to live like me if only they were smarter. But you know who else can play board games for eight hours a day? Basement-dwelling high school dropouts, who are – for all their unsung virtues – definitely not smarter than Harvard students.

When I entered college, they told me a Harvard education would empower me to do anything I want. The world would be my oyster. I took that message to heart in those four years – I fell in love, played every PC game that money could buy, studied programming languages and systems programming, and read more than one Russian novel. When I talked to my peers, however, I was constantly surprised at the overwhelming sameness of their ambitions. Four years later, twenty out of thirty-odd graduating seniors at our House planned to work in finance or consulting.

(Now, it could be that college really empowers these bright young scholars to realize their childhood dreams of arbitraging the yen against the kroner. But this is, as they say in the natural sciences, definitely not the null hypothesis.)

All of this would have made a teenager hate the idea of success altogether. I was not a teenager anymore, so I formulated a slightly more sophisticated answer: Regardless of how successful I become, I resolve to live like a failure.

This is a post about all the forces, real and imagined, that can make success the enemy of personal freedom. As long as these forces exist, and as long as the human heart yearns for liberty, few people will ever want wholeheartedly to succeed. Were it not already reality, that is a state of affairs too depressing to contemplate.

(Just to be clear, people are plenty motivated to succeed when basic needs are at stake – to put food on the table, to get laid, to pay for the mortgage. But after those needs get met, success just doesn’t look all that great and only certain sorts of delightful weirdos keep striving. The rest of us mostly just lie back and enjoy the fruits of their labor.)

III. Factorization

I think all of the experiences in Section I can be summed up by the umbrella-term “Sunk Cost Fallacy,” but that theory is a little too low-resolution for my tastes. In this section I identify three main psychological factors of the phenomenon.

1. You rose to meet the challenge. Your peer group rose to meet you.

We are constantly sorted together with people of the same age group, at similar levels of competence, at similar stages in our careers. To keep up with the group, you have to run as fast as you can just to stay in place, as the saying goes. And if you run twice as fast as that, you just end up in a new, even harder-to-impress peer group. When your friends are all level 80, it’s dreadfully difficult to restart at level 1.

Your friends may even be sympathetic, but it rarely helps matters.

Maybe you want to try something totally new, and your friends are too invested in their pet genre to emigrate with you.

Maybe you’re excited to learn a new skill one of your hyper-competent friends is specialized in, and you ask them to coach you. Unfortunately, this turns out to be a massive mistake, because your friend only remembers how she got from level 75 to 80, and sort of assumes everything below is trivial. It’s technically possible to learn area formulas as a special case of integral calculus, but only technically.

Maybe you transition to a new role within the team, you struggle to learn a new set of tricks, and you start hating yourself for not pulling as much weight as you’re used to. You start to see a mix of pity and frustration in your teammates’ eyes as you drag the whole team down.

2. Yesterday, you were bad at everything, and that really sucked. Today you’re good at one thing, and you’re hanging on for dear life.

It’s hard to move out of your comfort zone when your comfort zone is one hundred square feet on top of Mount Olympus and every cardinal direction points straight off a cliff. Seems like just yesterday you stood at the base of this mountain among the rest of the mortals, craning your neck to get a peek at what it’s like up here.

Kindly god-uncle Zeus calls a special thunderstorm for your arrival. Dionysus pours you a frothy drink and shares a bawdy tale. Hephaestus personally fashions you a blade as a symbol of your newfound status. Aphrodite invites you to her parlor for a night of good old-fashioned philosophy. They all act so welcoming, so natural, so in their element, and you know you’re only up here by a stroke of pure luck.

When Hermes returns the next morning and invites you to fly with him on his winged boots to see the world, you decline graciously. Not because you don’t want to – they’re winged boots! – but because the moment you try anything out of the ordinary you’ll be found out for the impostor that you are and god-uncle Zeus will show you his not-so-kindly side and chain you to a liver-eating eagle or a boulder that only obeys the laws of gravity intermittently.

3. Success gave you something to lose.

They say beware the man with nothing to lose.

I say envy him, because he alone is free.

You fondly recall the good old days of two thousand and two when you could go online and post diatribes against religion as a “militant atheist.” In those days, you had nothing, and you were free. You were unattached. You were intellectually wealthy but financially insolvent. You could see one end of the place you call home from the other.

Now that you’ve made it big, you’d have to carefully position mirrors at the ends of three hallways to see that far. You’re attached to wonderful person(s) of amenable sexual orientation(s). You have a reputation to maintain in the ever-smaller circles that you walk. Children in your community look up to you, or so you tell yourself. And so, even though deep in your heart you still believe that only idiots believe in an old man in the sky, your Twitter profile identifies you as “spiritual, yearning, exploring.”

IV. Resolution?

It seems to me we have a problem.

We are not a species known for risk-taking, so human flourishing really depends on the explicit emphasis of exploration and openness to new experience. And yet it seems that the game is set up so that the most successful people are least incentivized to explore further. That all the trying new things and pushing boundaries and calling for revolution is likely to come from those with neither the power to get it done nor the competence to do it correctly.

But it’s not a hopeless case by any means. Many of the most successful people got there precisely by valuing freedom, creativity, and exploration, and still practice these values – so far as they can – within the confines of their walled gardens. We live in an information age where getting good at things is as easy as it’s ever been. And at the very least we pay lip service to healthy adages like “Stay hungry, stay foolish.”

But what does one do personally to maintain one’s freedom?

I don’t claim to have a fully general solution to this problem, but here is a rule that’s helped me in the past.

When learning something new, treat yourself like a five-year-old.

If you’ve never spoken a word of Korean in your life, it doesn’t matter if you’re a professor of English Literature. As far as learning Korean goes, you’re a five-year-old. Treat yourself like one. Make yourself a snack for memorizing the vertical vowels. Take a break after reading your first sentence and come back tomorrow. When you’re done for the day, suck your thumb while staring at the first Korean word you’ve ever learned and feel the honest pride well up in your heart.

If you’ve never washed a dish in your life, it doesn’t matter if you’re a professional chef. As far as washing dishes is concerned, you’re a five-year-old. Treat yourself like one. Make yourself a snack for figuring out how to dispense dish soap without getting it everywhere. Take a break after finishing the bowls and come back tomorrow. When you’re all done, take a moment to take in that beautiful empty sink and feel the honest pride well up in your toddler heart.

Do you see how profoundly counterproductive it would be for the Korean learner to beat herself up for not being able to converse fluently with her Asian friends after two weeks? Do you see how completely unkind it would be for the novice dishwasher to call himself a useless piece of shit for not being able to execute the most basic of adult tasks?

Be kind to yourself and adjust your expectations to reality. When learning something new, treat yourself like a five-year-old.

Of Math and Memory, Part 3 (Final)

In Part 1, we noted how extraordinarily taxing mathematics can be on short-term working memory. Being able to hold one extra Greek letter in your head can make the difference between following a lecture and getting completely lost. Having background and mathematical maturity means many standard techniques need not be forcibly remembered, freeing up space for the few genuinely novel ideas.

In Part 2, we gave a simple conceptual model for short-term memory, based on a fundamental principle of information theory: compression is equivalent to prediction. The more predictable data is (or the better we get at predicting it), the less new information you have to store.

What do these ideas mean concretely for mathematicians? In this concluding post, we give a practical algorithm for making the most of your short-term memory in mathematics, which I call dyadic scanning.

We will start with the concrete problem of how to read a paper, and later generalize to how to write papers, how to listen and give talks, and how to have mathematical conversations, all while making the most of our short-term memory.

Dyadic Scanning

Consider the markings on a standard ’Murican ruler:


The two longest vertical lines mark the beginning and end of an inch. It is then divided dyadically into half-inches, quarter-inches, eighth-inches, and sixteenth-inches by progressively smaller “teeth.”
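The tooth pattern is easy to reproduce in a few lines of Python, which may help make "dyadic" concrete. This is only a sketch: the name tick_level and the convention that level 0 is the longest tooth are assumptions made for illustration.

```python
def tick_level(k, depth=4):
    """Level of the tooth at position k / 2**depth of an inch.

    Level 0 is the longest tooth (a whole inch), level 1 the half-inch
    tooth, and so on down to level `depth` (sixteenths when depth is 4).
    """
    if k % (2 ** depth) == 0:
        return 0
    level = depth
    # Each factor of 2 in k promotes the tooth to the next coarser level.
    while k % 2 == 0:
        k //= 2
        level -= 1
    return level


# Levels of the seventeen teeth from 0 to 1 inch, in sixteenths.
print([tick_level(k) for k in range(17)])
```

Half of the teeth sit at the finest level, a quarter at the next, and so on, so four levels are enough to organize sixteen subdivisions.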

A mathematical paper is organized much like the markings on a ruler: first it is divided into a few main theorems, each of which is divided into several major lemmas, which are then interspersed with minor or technical lemmas and definitions, themselves pieced together from many tedious details. The obvious and standard way of reading a paper is a sequential scan:


Sequential Scan (BAD)

Up until a couple years ago, this was how I attempted to read any given paper: read it through once from beginning to end, pausing on each detail and tracing it down to the lowest level until I could follow it line-by-line. The sequential scan is a fairly useful way to build foundations and mathematical maturity: one spends a lot of time piecing together details and developing a taste for rigor. It is, however, a misguided and inefficient approach to reading mathematics in general.

“I could follow line by line, but I have no idea what’s going on” is a common complaint that comes out of reading mathematics like this.

What are the downsides of the sequential scan?

You get easily lost in details, missing the forest for the trees.

You sacrifice agency, accepting the order in which ideas are presented as if from on high.

Most importantly, you don’t know where you’re going.

You can’t ask “Where was condition (a) of Lemma 2 used in the proof of Lemma 3? Can we weaken it?” if you don’t even know that Lemma 2 only exists to prove Lemma 3 later on. This is the kind of question that senior mathematicians are asking all the time to shore up their understanding.

Unless the paper is very cleverly and thoughtfully written, reading it sequentially is going in blind. You will have the hardest possible time predicting each next step, and therefore have to bear the heaviest possible burden remembering every detail. Without knowing what will be used where or how, you will have to default to remembering everything.

Here is a more enlightened way to read a paper, which I call dyadic scanning.


Dyadic Scan (GOOD)

Instead of reading the paper in a single pass, you split your reading into logarithmically many passes of progressively higher resolution. In the first pass, you figure out the overarching organization and the main results. In the second, you locate the main lemmas, how they fit together, and where the genuine innovations are in the paper. In the third, you piece together how all the minor technical lemmas are involved in the proof, and the locale where each one is relevant. Only in the fourth pass, or later, do you dig into the details of the rigorous proofs. More passes are added if the paper is especially dense, or if you’re especially unfamiliar with the field. The longer and more technical the paper is, the longer you should wait before diving into details.
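To make the contrast with the sequential scan concrete, here is a minimal sketch in Python. The toy paper, the Item class, and the numeric levels (0 for a main theorem down to 3 for proof details) are all hypothetical. The sketch also assumes that each pass revisits the coarser material from earlier passes, which is one reasonable reading of "progressively higher resolution" rather than the only one.

```python
from dataclasses import dataclass


@dataclass
class Item:
    label: str
    level: int  # 0 = main theorem, 1 = main lemma, 2 = minor lemma, 3 = detail


def sequential_scan(paper):
    """One pass, in the order the paper presents things."""
    return [item.label for item in paper]


def dyadic_scan(paper):
    """One pass per level of resolution, coarsest first.

    Assumption: pass number d revisits everything at level <= d, so each
    pass re-reads the skeleton found earlier and adds one finer layer.
    """
    deepest = max((item.level for item in paper), default=-1)
    return [
        [item.label for item in paper if item.level <= depth]
        for depth in range(deepest + 1)
    ]


# A hypothetical toy paper, listed in the order it appears on the page.
paper = [
    Item("Lemma 1 (technical)", 2),
    Item("proof of Lemma 1", 3),
    Item("Lemma 2 (main)", 1),
    Item("proof of Lemma 2", 3),
    Item("Theorem A", 0),
    Item("proof of Theorem A", 3),
]

for number, reading in enumerate(dyadic_scan(paper), start=1):
    print(f"pass {number}: {reading}")
```

The first pass sees only Theorem A, and only the last pass coincides with the sequential scan. The number of passes equals the depth of the hierarchy, which is logarithmic in the length of the paper when the structure is as balanced as the ruler's.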

Why make dyadic scans?

You know where you’re going. If you read mathematics this way, you will know what each line of mathematics is for before digging into why it’s true. Knowing the purpose of Lemma 2 lets you figure out which terms are important and which terms are negligible error terms. Knowing when you’re done using that functional equation means you can free up that memory for something new.

You’re forced to develop an eye for what matters. Not every paper is written so that the main lemmas stand out from the minor lemmas, and the minor lemmas from the boring technical details. Doing dyadic scans forces you to develop a taste for what matters, and to discriminate between the innovative and the boilerplate.

You read in an order closer to how mathematics is actually done. Very few proofs are devised in the order they’re presented. There may be an important and difficult technical lemma that requires a seven-page calculation using the calculus of variations, but I can guarantee you that the author did not work out the details of this argument before fleshing out the main arc of the proof. Reading via dyadic scans reinforces the habit of keeping the big picture in your head at all times. The rigorous correctness of each detail matters less than you think – often any given technical argument can be done in several other ways.


For clarity, I’ve focused on dyadic scanning for reading papers. The method applies equally well to other settings. I will not bore you with the details, but here is a sketch of how.

On paper-writing

A well-written paper should make it easy for the reader to pick out its dyadic structure. To some extent, this is already standard practice: the abstract is an outline for the introduction, which is an outline for the entire paper. A great example where such a structure is pushed further is this paper of Tao, which proves an “almost all” version of the infamous Collatz conjecture. It is a fairly dense 49-page paper, so in addition to the standard abstract and introduction, there is a ten-page extended outline detailing the main arc of the proof and highlighting the important ideas.

When arguments get even longer than this, it is not uncommon to see a proof divided among multiple papers, with the first one serving as an extended introduction for the whole series.

This does not mean that a paper should necessarily be written in the explicit order of the dyadic scans: all the main theorems first, then the main lemmas, then the minor lemmas, and then the technical arguments should be bunched up at the very end. Often this will result in a very unnatural structure obfuscating the dependencies between ideas. It may be better to split the paper into subsections which are as functionally independent as possible, and carefully point out the dependencies and relative importance of various parts when they appear.

On giving and receiving talks

Attending talks is even more taxing on working memory than reading papers; the audience will generally have a wider variation in background, they will rarely have the luxury of pen and paper, and regardless of whether the talk is given via slides or a blackboard only a fraction of the total content is visible at any given time.

It is therefore even more essential when giving talks that the audience knows exactly where you’re going. Be sure to have multiple levels of signposting and continuous reminders on how each argument fits into the big picture. Most of the details in the lower levels should be omitted altogether unless they are absolutely essential.

On the receiving end, my general advice is that trying to follow every word in a talk is a mistake akin to making a single sequential pass in reading a paper. Instead, treat going to a talk as taking a single dyadic pass into a topic, out of potentially many. Which pass to treat it as depends on your current level of exposure. If you are a beginner, watch the talk as if taking the first dyadic pass, noting the key words and main ideas and how they fit together. If you have some exposure to the field, you can treat the talk like a second or third pass, paying attention to the major details and innovations. If you are an expert in the field, almost nothing in the talk will be new to you, and you can really dig into the key details or open problems. Be realistic about how much you expect to get out of the talk, and plan accordingly for what to focus on.

On mathematical conversations

Much of the advice in this post applies just as well to the more informal setting of a mathematical conversation, where often one person must convey a fairly complicated argument verbally to one or more others. The key difference in this setting is that the listener(s) can play a much more active role.

The simplest level of active listening is asking for clarifications and more details when things are not clear. A more sophisticated level of active listening is asking directly for the pieces of the dyadic structure that one is missing: if the speaker dives into the details of a lemma before its purpose is made clear, it is often correct for you to ask for that purpose instead.

In Conclusion

In all of these activities, the fundamental resource is your very limited working memory, and the more you can predict, the less you have to remember. By looking ahead, by asking for clarification, by making multiple passes, we can “cheat” and see the future, freeing up our memory for what really matters.

Of Math and Memory, Part 2

Last time, I wrote that having a good memory is essential in mathematics.

Today I will describe my model for working memory.

Compression and Prediction

Data compression is the science of storing information in as few bits as possible. I claim that optimizing your working memory is mainly a problem of data compression: there’s a bounded amount of data you can store over a short period of time, and the problem is to compress the information you need so that this storage is as efficient as possible.

One of the fundamental notions in data compression is that compression is equivalent to prediction. Another way of saying this is: the more you can predict, the less you have to remember.

Here are three examples.

I. Text compression

Cnsdr ths prgrph. ‘v rmvd ll th vwls nd t rmns bsclly rdbl, bcs wth jst th cnsnnts n cn prdct wht th mssng vwls wr. Th vwls wr rdndnt nd cld b cmprssd wy.

All text compression algorithms work basically the same way: they store a smaller amount of data from which the rest of the information can be predicted. The better you are at predicting the future, the less arbitrary data you have to carry around.
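Both halves of this claim can be tried out in Python. The sketch below is illustrative only: strip_vowels mimics the paragraph above (and, like it, relies on the reader to predict the vowels back), while the second half uses the standard library's zlib as a stand-in for a real compressor to compare a predictable input with an unpredictable one. The helper names and the sample inputs are assumptions made for the example.

```python
import random
import zlib


def strip_vowels(text):
    """Drop every vowel, keeping consonants, spaces, and punctuation."""
    return "".join(ch for ch in text if ch.lower() not in "aeiou")


def compressed_size(data):
    """Number of bytes zlib needs to store `data`."""
    return len(zlib.compress(data))


print(strip_vowels("Consider this paragraph."))

# Highly predictable input: the same short phrase over and over.
predictable = b"the more you can predict, the less you have to remember. " * 20

# Unpredictable input of the same length, from a seeded generator.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(predictable)))

print(len(predictable), compressed_size(predictable), compressed_size(noise))
```

The repeated phrase shrinks to a small fraction of its length, while the noise barely shrinks at all: there is nothing in it to predict, so every byte has to be stored.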

II. Memory for Go

Every strong amateur Go player can, after a slow-paced game, reproduce the entire game from memory. An average game consists of between one and two hundred moves, each of which can be placed on any of the 19×19 grid points.


A typical amateur game, midway through.

Anyone who practices playing Go for a year or two will gain this amazing ability. It is not because their general memory improved either: if you showed them a sequence of nonsensical, randomly generated Go moves, they would have almost as hard of a time remembering them as an absolute novice.

The reason it’s so easy to remember your own games is because your own moves are so predictable. Given a game state, you don’t have to actually remember the coordinates where the stone landed. You just have to think “what would I do in this position?” and reproduce the train of thought.

The only moves in the game you really need to explicitly store in memory are the “surprising” moves that you didn’t expect. Surprise, of course, is just another word for entropy. The better you are at prediction, the less surprise (entropy) you’ll meet, and the less you have to remember.
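For a rough sense of scale, here is a small sketch in Python. It takes "surprise" to mean the information-theoretic surprisal of a move, -log2(p) bits for a move you would have predicted with probability p; entropy is the average of this quantity. The probabilities below are invented for illustration, and the no-model baseline ignores the fact that occupied points are unavailable.

```python
import math


def surprisal_bits(probability):
    """Bits needed to record an event the predictor gave this probability."""
    return -math.log2(probability)


BOARD_POINTS = 19 * 19  # 361 grid points

# With no model of the game, every point is treated as equally likely.
bits_per_move_novice = surprisal_bits(1 / BOARD_POINTS)

# Hypothetical probabilities a strong player's intuition assigned to the
# moves that were actually played; only the last one was a surprise.
predicted = [0.9, 0.8, 0.95, 0.9, 0.02]
bits_per_move_expert = [surprisal_bits(p) for p in predicted]

print(f"no model: {bits_per_move_novice:.2f} bits per move")
print(f"with a model: {sum(bits_per_move_expert):.2f} bits for {len(predicted)} moves")
```

Under these made-up numbers, almost all of the expert's storage cost comes from the single surprising move, which is the point of the paragraph above.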

III. Mathematical theorems

A general feature of learning things well is that you get better at predicting. Fill in the blank:

If a and b are both the sum of two squares, then so is ___.

A beginning student looks at this statement and recalls the answer is ab, simply by retrieving this answer directly from memory.

A practiced number theorist doesn’t need to store this exact statement directly in memory; instead, they know that any of an infinite variety of such statements can be reconstructed from a small number of core insights. Here, the two core insights are that a sum of two squares is the norm of a Gaussian integer, and that norms are multiplicative.
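To see the reconstruction in action, here is a short Python sketch of those two insights (the resulting formula is classically known as the Brahmagupta-Fibonacci identity). The function name compose and the sample numbers are assumptions chosen for illustration.

```python
def compose(first, second):
    """Combine a = x^2 + y^2 and b = u^2 + v^2 into ab = p^2 + q^2.

    Multiplying the Gaussian integers x + iy and u + iv gives
    (xu - yv) + i(xv + yu); since norms are multiplicative, the norm of
    the product is ab.
    """
    x, y = first
    u, v = second
    return (x * u - y * v, x * v + y * u)


a_parts, b_parts = (1, 2), (2, 3)  # 5 = 1 + 4 and 13 = 4 + 9
p, q = compose(a_parts, b_parts)
print(p, q, p * p + q * q)  # prints: -4 7 65
```

Nothing here needs to be memorized: the two lines inside compose are just the real and imaginary parts of a product of complex numbers.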

Getting better at prediction in mathematics often follows the same general pattern: identifying the small number of core truths from which everything else follows.

We reduced the problem of improving your working memory to the problem of predicting the future. At face value, this reduction seems less than useless, because predicting the future is harder than memorizing flash cards. Thankfully, human beings are embodied agents who can interact with our world. In particular, we can cheat by instead making the world easier to predict.

More on this next time.

Of Math and Memory, Part 1

Memory is not sexy in mathematics.

“Rote memorization” is the most degrading slur you can fling at a math class. “Reciter of digits of pi” is the most awful caricature of mathematicians in the public eye. In grad school, the cardinal sin is to read a paper with a focus on memorizing names and results: we are bombarded with exhortations like “if you learned the Arzelà-Ascoli theorem deeply, it would be impossible to forget.” Apparently, if you really understand mathematics, everything (down to the accents on the names of 19th century Italian mathematicians) would be so natural as to render rote memorization completely unnecessary.

All these attitudes can be quite detrimental to the young mathematician who, at the end of the day, needs to memorize an enormous amount of arbitrary data in order to get up to speed in their field. In this post, I will tell some archetypal stories about how memory, especially short-term working memory, is perhaps the scarcest resource in mathematical work.

In a future post, I will attempt to provide some solutions to address this scarcity.


Have you ever tried to copy a phone or bank account number from one place to another, without the benefit of Ctrl-C? You stare at the number for ten seconds, repeating it back to yourself in a rap-like rhythm. That sick beat, you hope, will help you remember an extra digit or two.

Conjure up that feeling of impending doom as you repeat those numbers back to yourself, knowing full well you can’t move 10 digits in one go. That’s the feeling of not having enough working memory. It’s the same feeling in each of the following scenarios.

These are not technically true stories, but they are all pieced together from literally true events.

A brilliant analytic number theorist is half-way through a riveting talk on the distribution of low-lying zeroes of L-functions. About three-quarters of the way through the blackboard space, the speaker finally switches gears from giving motivation and carefully treads into a long, technical calculation. Every Cauchy-Schwarz application and Fourier transform is clearly explained and surprisingly simple, until –

Uh oh!

The speaker reaches the bottom of the blackboard and begins erasing. You can almost hear the collective sigh of despair as most of the listeners think the same thought.

We’ve reached the end of the line.

After that half of the calculations are erased, only a handful of senior mathematicians who know the subject inside and out follow the rest of the talk.

Three mathematicians are throwing around ideas in a meeting. One is suddenly struck by inspiration, and starts explaining how to carry out a tricky change-of-variables. Another joins in with excitement, quickly catching on and offering a crude approximation which simplifies things significantly. All of this is happening in the air, so to speak. Writing things down would severely hamper their progress.

The third person, a younger graduate student, has a number of questions about the equations everyone’s keeping in their heads. The first time they ask for clarification, they are reminded gently that all calculations are in characteristic p. The second time, they are informed of a standard fact about eigenvalues of random matrices, and given a minute to catch up.

The third time, they can’t remember whether x was defined in the Fourier domain. They don’t ask.

In the next meeting, there are only two mathematicians.

I’m out for lunch, and need to attend a seminar talk afterwards. My weekly meeting with my PhD adviser is two hours away, and I haven’t made any progress this week. Away from pen and paper, I rack my brain and scrape the bottom of the proverbial barrel for any stray thought that might be worth presenting to him.

By some miracle, a casual remark during lunch sets off a series of revelations. I begin methodically working out the details in my head, getting more and more excited that I’m onto something. I completely ignore the seminar talk, running back and forth over the calculations in my mind. I get more and more confident that it works.

I walk into my adviser’s office and try to explain the idea to him, only to realize that I’d forgotten an essential intermediate step and mixed up two important variables. I get up to the board and attempt to work things out from the beginning, but I’m so flustered by this point that I keep forgetting what I’m doing.

We spend the hour going back and forth on minor technicalities, trying to see if there’s anything to my idea. In the end, my adviser becomes pessimistic that there’s anything at all and gently shoos me out for his next meeting.

When I get back to my office afterwards, I pull out pen and paper to try to salvage the idea.

I figure out all the details in fifteen minutes.

Problem Statement

It is difficult to collaborate with someone with significantly more or less short-term memory. Someone with more will appear to skip ahead three steps at a time, and you will continually feel in their debt for asking them to explain details. Conversely, someone with less will often ask you to rewind and write ideas down that you find inessential.

It’s difficult to read a mathematical paper without a good short-term memory. A reader who needs to keep referring back to the statement of Lemma 4.3(a) does not have the mental capacity to think about the big picture. If the paper is improperly structured, introduces clumsy notation, or is liberally sprinkled with abstruse citations, trying to follow it can feel like taking a forgetful random walk. How many times will I flip back to the conventions section before I remember the difference between S_k and \mathcal{S}_k?

It’s difficult to either follow or give a mathematical talk without a good short-term memory. An audience member can get lost by zoning out briefly and losing track of an important definition or theorem statement. A speaker who doesn’t remember the contents of their slides constantly reads off them and has no attention to pay to the audience. Next, an audience question about a previous slide breaks the artificial flow of the talk and causes a minor catastrophe.

People often worry that they cannot do mathematics because they are not clever enough. This is a very serious worry, because as far as we know everyone is born with a certain amount of clever and nobody really knows how to get more.

I think people should instead worry they cannot do mathematics because their memories are too poor. And I think this is very good news, because memory can be trained, and deficiencies in memory can be optimized around.

To be continued…




The Pit


If you brought a man with keen ears to the edge of the pit and dropped a quarter over exactly the right spot, you could count to eleven before he heard it hit the ground. If you next told the man that a sliver of sunlight was visible from the very bottom of said pit, he might have squinted at you skeptically. If you proceeded to say that the bowels of this same pit were inhabited by twenty-odd live human beings, he would certainly have slapped you across the side of the head and called you a shameless liar. But you wouldn’t have lied once.

The inhabitants of the pit – the pitfolk – were frail people, bone-pale from the perennial lack of sunlight, all taut skin wrapped about wan elbows. However they shifted their bodies to and fro, they were bound – as if cowed by that measly sliver of sunlight – to walk hunched over, keeping their faces downcast.

Through the decades, the pitfolk developed an extraordinary black and white vision, as all they could see were the meager shadows which shifted around their ankles. From these faded images the pitfolk deduced the whole of their reality. This made for a rather miserable experience, but it did not stop the pitfolk from building an entire way of life about the dance of dim shadows – black silhouettes against grey stone – that was their everything.

Each day at noon, when the dim light against their backs shone brightest, the pitfolk gathered in a large circle with one tribe member or another in the center, and that chosen member would play out a long and complex dance with the long and dextrous fingers attached to his long and sinewy limbs. The whole circle would watch intently as the shadows cast from the dance leaped across the ground, swaying to a silent rhythm. The full performance, which lasted nearly two hours, had been passed down from parent to child as long as any living tribe member could remember, and probably longer. And although each generation brought into it their own unique flicks of the finger and twirls of the elbow, the dance remained remarkably unchanged through the years. As everyone knew, the shadow dance was the story of their tribe.

Of course, each tribe member was free to form their own opinion about what exactly that story was. By some miracle of memory the pitfolk retained the faintest inkling of the great goings-on in the outside world, and through this hint of a memory they interpreted the shadow dance. 

Some saw in the flapping of hand-shadows the wings of the great Father Bird, while others saw the flapping capes of the first men. Some thought the writhing finger-shadows on the ground represented a plague of snakes that would bring about the end of the world, while others saw them as the tongues of a purifying fire that would bring its redemption. Some thought it strange that the great savior had five heads, one shorter and bulkier than the others, while others believed the five heads actually represented five spirits in one body, one of them a pudgy child. All such disagreements existed, and many more, but as the pitfolk had no means of communication other than the shadow dance, it appeared to each of them as if everyone else agreed with their understanding of things. At least on one thing they did agree: that the shadow dance was the story of their tribe, and that story must be passed down.

The youngest of the tribe was a boy, not yet seven, whose name was Two-Crossed-Fingers – that was the way they made his shadow sign. Unlike the older tribe members, Two-Crossed-Fingers was still learning the moves of the shadow dance. His elders had mostly calcified in their interpretations of the dance, and had perhaps even begun to tire of it after so many years of dialectic, but the boy was still wide-eyed with excitement, playing through each piece of the story in delighted confusion. It was his greatest dream to complete the shadow dance and take his place within the tribe, so he studied very hard and very long.

There were many points when Two-Crossed-Fingers became stuck on a motion that seemed impossible. To draw different shapes simultaneously with his two hands challenged his mind. To stretch his arms wide apart and swing round and round challenged his body. To watch the tendrils of darkness consume each other in terrifying, awful motions challenged his heart. And as Two-Crossed-Fingers was especially young and frail, these challenges were especially hard on him. But nevertheless he persisted, practicing deep into the dark of night, and to his amazement he discovered a certain way of interpreting the shadow dance that made what seemed to be impossible motions easy.

When he saw the two different shapes he had to draw as the two sexes, two sides of the same humanity, that motion merged into one unified story.

When he saw the swinging of his arms as the collapsing of the great bond between people, its frantic energy became natural and effortless.

When he accepted the throwing of one hand into the other as the final sacrifice that saved the last remnants of humanity, his terror subsided and was replaced by inner peace.

Two-Crossed-Fingers basked in the feeling of these revelations, and there was no doubt in his mind that this was the one true interpretation of the shadow dance. He was a great deal impressed by the genius who had woven together a dance so that the truth itself would shine through its execution, and thus carry forever the story of his people through the ages. Each step he took brought him closer to this story, and closer to his place in the tribe. He had no reason to suspect that he alone had felt the meaning of the dance. Two-Crossed-Fingers had no reason to suspect that, unlike him, all twenty-two other tribespeople were just going through the motions, and they each had their own clumsily patched-together notion of things.

The day finally came when Two-Crossed-Fingers was deemed ready to perform. He twirled into the center of the ring just before noon, flush with excitement yet oddly composed. Many years had already been spent on this journey, but it was just the beginning. Two-Crossed-Fingers planned to spend many yet sharing the joy of the dance with his people. They had so few joys.

Just as the boy threw his hands out in opposite shapes to mime the courtship dance of man and woman, all of the other pitfolk heard a strange and shocking sound.

You must know that no sounds had been heard or uttered down here for a very long time.

It was a very weak scratching sound that seemed to carry down the pit from a great distance above, and it slowly grew more and more insistent. Bits of dirt and rubble began to tumble down into the pit, scaring what little color there was out of the downcast faces of the pitfolk. Nobody moved, nor blinked, nor paid any attention to anything except the growing noise.

Nobody except for the dancing boy in the center, who was so entranced by his moment that he didn’t notice the new stimuli.

The scratching sounds and falling rubble built into a crescendo, until even Two-Crossed-Fingers couldn’t ignore them, but the boy, despite his shock, continued to dance. Who could guess what went through his mind at that time? Whatever it was, he knew that the shadow dance, once initiated, must continue to completion.

A great Crack! was heard, reverberating around the cavern, and suddenly existence itself shattered – or so it seemed to the pitfolk. A great boulder had been dislodged from the opening of the pit, and sunlight poured in at an intensity they had never before experienced. The light was blinding.

Their eyes watered, their knees buckled, and they all knew that the end was nigh. But Two-Crossed-Fingers continued to dance the bond between good and evil unfurling into chaos. If his eyes began to bleed as he sped up his frantic motions, he did not seem to notice. The boy was possessed by a singular purpose – to retell the story of his tribe one final time.

Faster and faster he danced until the shadows made only a blur on the ground, telling no story at all except in the boy’s mind’s eye. One by one the muscles in his body gave way, but still he whirled. If anyone had stopped to ask him why he bothered dancing as the world fell apart, that single note of confusion might have broken his trance. But no one paid him any mind, so no stray thoughts entered the boy’s head, so Two-Crossed-Fingers danced to the very end. 

As the final sacrifice was thrown into the purifying fire to save humanity from its ultimate doom, he let out a long breath of relief. 

Then everything went white.

When we opened up the pit, we couldn’t believe our eyes. The records clearly showed that a mining accident had sealed the shaft nearly two hundred years ago, and yet when we dug down to its depths, we discovered twenty-three living human beings, cringing and frightened in a circle, thin as sticks. I’m proud to say that the team took action rapidly and without hesitation, climbing back up with the pit people strapped to their backs. We herded them like frightened sheep into our trucks, and soon had them in the local hospital.

What I remember most from that day was one little boy who latched onto me with a vice-like grip. He shook up and down, clearly frightened out of his wits, and a smear of blood ran down the side of his cheek. Yet still there was a line of defiance in his brow.

The mission itself had to continue, but a few of us volunteered to stay behind with the pit people and take care of their rehabilitation, as the hospital was understaffed for this kind of work. It took many months of intensive care to nurse the pit people back to health. None of them knew any spoken language, but they were surprisingly quick studies, and the programs worked as well as could be expected. Eventually they were able to tell us in simple words their unbelievable tale –  that all of them, down to the oldest man, had lived in the pit since birth, and never known any other life.

We tried our best to help the pit people, but it was difficult, for they had been down there so very long. What kept us going was how grateful they were, and they were very expressive of their gratitude with their long, bony limbs. The pit people all agreed that it had been a living nightmare down there in the pit, surviving in some demi-state between life and unlife. The plainest things – the green of grass, the ripples on pondwater, the crunch of tires against gravel – brought tears of ecstasy to their eyes.

There was one exception: the youngest among them, a boy who couldn’t have been more than eight years old. The same boy who had left such an impression on me that very first day.

When I was tasked to observe him, I found the boy was grateful and agreeable, but not extraordinarily so. He smiled a distant smile and made odd motions with his arms, often waving his crossed fingers at me as if he was about to lie. But he refused to learn to speak.

One day, he pulled me away from my lunch break and into his tent. The boy turned the lamp off so that only a thin sliver of light made it through the flap, and proceeded to do the strangest little dance. Even in the darkness, it was such a grotesque and unnatural series of motions to my eyes, joints bent in all the wrong angles, that I reflexively cast my eyes away. 

I think this offended him, and he stopped dead in his tracks. 

Ever since that day, he ignored me and all the other staff entirely, despite my best efforts to help him open up. The boy had no interest in the toys and games that fascinated other children.

Even so, I held out hope that something would change.

One chilly winter morning some weeks later, I was woken up by a nurse to learn that the boy had gone missing. There was no trace of him around the camp, and a long rope ladder had disappeared with him.

Propelled by a sinking feeling in my stomach, I jumped into my truck and quickly drove my way back to the opening of the pit where we first found them. Just as I suspected, a rope ladder fell into the darkness, its ends amateurishly looped around a tree stump nearby.

After refastening the rope properly, I descended the ladder, too anxious to go back for proper protective gear. I knew that the boy was at the bottom of the pit, and part of me was already rehearsing what I would do when I found him. Would I comfort him, or be cross? Would I have to take him back by force?

As I climbed down, the sounds of life retreated into the distance and the light above faded to a tiny point, but still the bottom was nowhere in sight. I felt as if I was descending into another plane, where time and space and smell and taste all faded into metaphor.

Finally, my boots hit solid ground. Using my smartphone as a makeshift flashlight, I discovered the same large cavern that we initially found the pit people in. It was about sixty feet across, and almost perfectly circular. The cold stone stretched out flat and empty, devoid of any sign that dozens of people had ever spent their lives here. 

To my surprise, there was not even a single trace of the boy.

I stood still for a moment, drinking in the emptiness. In the corner of my eye, the shadows on the ground seemed to dance and cavort gracefully, but when I turned to stare at them they steadied. 

It must have been the unsteadiness of my hands.

Another silhouette flitted across my peripheral vision, but when I turned, there was again nothing.

I was unsettled.

I am not a claustrophobic man, but the emptiness, the silence, and the chill in the air made it unbearable to stay too long. All thoughts of the boy disappeared from my mind.

As soon as my limbs recovered I clambered back up the rope ladder as fast as they would take me, out of this plane of demi-life. The trip upwards seemed to take twice as long as the trip down, and my limbs almost gave way before I made it.

After returning to the land of the living, I thought for a moment to pull up the rope ladder and take it with me.

In the end, I decided against it.

Objectives vs Constraints

I was thinking the other day about how strange linear programming duality is, and how great it would be if something like it applied in real life. This led me to thinking about how human beings optimize in practice. 

I think a huge number of optimization problems at every level from public policy to personal decision-making can be framed as “Maximize A and B” where A and B are two values. Conflict arises when A and B compete and need to be traded off against each other. 

The first key insight is:

People almost always implement “maximize A and B” as either “maximize A given B” or “maximize B given A,” and these are NOT the same strategy.

If someone is implementing “maximize A given B,” I’ll say they’re treating A as the objective and B as the constraint. It is important to note that even though the objective A may seem like the thing you’re working hardest on and care the most about because you’re trying to maximize it, the constraint B is actually the value you’re putting more weight on. That’s the second insight:

When you think you’re prioritizing A you might actually be putting most of your energy into guaranteeing a different value B, and optimizing A with only the residual energy that remains.
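
Here is a toy sketch of why the two strategies come apart. Everything in it is made up for illustration: the three options, their scores, and the threshold of 8. It also assumes that “given B” means a hard floor on B, which is only one way to read the phrase.

```python
# Hypothetical options, each scored on two competing values A and B.
# All numbers are invented purely for illustration.
options = {
    "x": {"A": 9, "B": 2},
    "y": {"A": 6, "B": 6},
    "z": {"A": 2, "B": 9},
}

def maximize_given(objective, constraint, floor):
    """Pick the option with the highest `objective` among those whose
    `constraint` value is at least `floor`. The constraint is satisfied
    first; the objective only gets to choose among what is left."""
    feasible = {name: v for name, v in options.items() if v[constraint] >= floor}
    return max(feasible, key=lambda name: feasible[name][objective])

a_given_b = maximize_given("A", "B", floor=8)  # "maximize A given B"
b_given_a = maximize_given("B", "A", floor=8)  # "maximize B given A"
print(a_given_b, b_given_a)
```

With these numbers, “maximize A given B” ends up choosing the option with the lowest A of the three, because the floor on B has already ruled out everything else. The nominal objective gets only the leftovers.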


I have taken a good number of college math classes, and I would roughly divide the pedagogy into three categories, based on what the lectures seem to be optimizing for out of (A) student understanding and (B) material covered.

Classes in category 1 (common among large introductory courses like linear algebra or real analysis) feel as if they’re designed to make sure the median student understands all the material. Examples are copious, homework exercises are comprehensive, and each important argument or tool is practiced deliberately and with spaced repetition. The revealed preference of the lecturer is “Maximize material covered conditioned on student understanding.”

Classes in category 2 (common among upper-level graduate courses) feel as if they’re designed to cram as much of the instructor’s pet topic into a semester as humanly possible. Homework is sparse if it exists, while details, proofs, and entire months of intermediate background material are skipped or brushed under the rug. By the end of the course, the number of students not completely lost is between 0 and the number of instructor’s doctoral students taking the class, inclusive. Usually, the lecturer is both blissfully unaware that nobody is following and perfectly happy to slow down for questions and fill in details when prodded. So they clearly care about student understanding at some level. The problem is that they skip five steps for every one covered, and even generously filling in one or two of those steps helps almost nobody. The revealed preference of the lecturer is “Maximize student understanding conditioned on covering all the material.”

The third – and possibly largest – category is an uncanny middle ground between these two extremes.

Take a Data Structures class I sat through in some previous life. Before the first two midterms, we met all the usual suspects – BSTs, hashtables, suffix arrays – the stuff techbros memorize to pass Google interviews and never touch again. Once or twice the instructor gets a bit of color in his cheeks and does something a little risqué like put a BST inside a hashtable, but on the whole you can follow along by watching the lecture videos at 3x speed with Katy Perry playing in the background.

Well I’m zooming along happily and then right after the second midterm, a switch flips. The instructor has covered all the “Data Structures 101” and has six lectures left to introduce us to the bountiful fruits of modern research. You can almost see him giddily preparing lecture notes the night before and bashfully remarking, “oops, this part needs a whole two lectures on circuit complexity to make sense, teehee.” The fraction of students who are nodding along excitedly in lecture drops from 1-o(1) to o(1).

This kind of sharp phase transition has happened to me enough times that I’m kind of numb to the process. I almost know from day one that at some point lectures will suddenly stop making sense, even if I loved the lecturer’s style at the beginning. Classes in category 3 (which tend to be upper-level undergraduate courses or introductory graduate classes) start out “maximize material covered conditioned on student understanding”, and then BAM! experience a sharp transition around the two-thirds mark into “maximize student understanding conditioned on material covered.” In Algebra 1, the lecturer covers rings, fields, and a smattering of Galois theory, and then runs out of patience and suddenly starts preaching the mAgIcAl LaNgUaGe Of ScHeMeS. An exquisite course on Riemann surfaces runs adrift after the second midterm into the dynamics on moduli spaces of nonorientable genus something somethings.

And the sad thing is, I really understand where these lecturers are coming from. After all, a human being can only optimize for one thing at once.


The criminal justice system primarily cares about two things: (A) doing bad things to guilty people, and (B) not doing bad things to innocent people. For almost all of human history, the default optimization protocol was “maximize B given A,” in other words, “guilty until proven innocent.” This kind of thinking is built into us: we would rather wipe out villages of extra innocents than let dangerous criminals or enemies go free. Almost every culture has ancient concepts of original sin or guilt by association. In Chinese literature, the bad guys’ catch phrase is “斩草除根” (when cutting grass, pull out the roots), which is usually used to justify killing the good guy’s children to prevent them from retaliating when they grow up. Murder some ten-year-olds just to be safe. After all, it’s the humble thing to do.

At some momentous inflection point in history, the fundamental legal axiom flipped to “innocent until proven guilty,” that is, “maximize A given B.” The switch between these two optimization protocols, which are superficially doing the same thing, “maximize A and B,” was possibly the most important and unlikely step ever made in the advance of human civilization. “Innocent until proven guilty” affirms the principle that an individual human being has intrinsic value, and that we cannot murder someone just to be safe. What it means, unfortunately, is that we let scumbags and criminals go all the time and this is by design. If you think this was an easy principle for human beings to agree upon, you have not met human beings.

A diagnostic cancer test primarily cares about two things: (A) telling cancer patients they have cancer, and (B) not telling healthy people they have cancer. In a world where technology is not perfect and we have to trade off some amount of A against some amount of B, the medical profession uses the protocol “maximize B conditioned on A.”

This is not as trivial a choice as it might seem – remember that one probability problem about false positive rates they ask on every standardized test? Even if the false positive rate is only 1%, most positive results will be false positives, because very few people have cancer and very many do not. But it’s still worth it – it’s much, much more important that every early cancer patient is diagnosed correctly than that healthy people don’t get scared and inconvenienced, even if we scare a huge number of such people.
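
For anyone who wants to redo that standardized-test problem, here is a quick sketch of the arithmetic. Only the 1% false positive rate comes from the paragraph above. The 0.5% prevalence and the assumption that the test never misses a real case are placeholders I picked for illustration, not medical facts.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Fraction of positive test results that are true positives (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Assumed numbers: 0.5% of those tested have cancer, and the test catches
# every real case. The 1% false positive rate is the figure from the text.
ppv = positive_predictive_value(
    prevalence=0.005,
    sensitivity=1.0,
    false_positive_rate=0.01,
)
print(f"Chance that a positive result is a real case: {ppv:.1%}")
```

Under these assumptions only about a third of positive results are real cases, so roughly two out of three people who get bad news are healthy.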

An immigration policy cares about two things: (A) letting good people in, and (B) keeping bad people out. There was a point in the history of the North American continent where the immigration policy was entirely open, ignoring B altogether. This was an unmitigated disaster for the Americans of the time, as European immigrants came in with their guns, germs, and steel and wiped out 90% of the native population. In recent history, it seems like the opposite policy is the case, “maximize A conditioned on B,” but it is a huge source of controversy because we cannot agree on which of A or B should be the objective and which should be the constraint. Merely saying both sides care about A and B does nothing to solve the problem.


Here’s a parable about the kind of person I am. A psychologist once gave five-year-old me an infinite marshmallow test: “For each 15 minutes you wait, you get one more marshmallow at the end!” Legend says I’m still waiting in that room.

Of course, the marshmallow test is not mostly about impulse control or delayed gratification, as it’s usually sold. It’s about being willing to sacrifice (A) your own comfort to (B) pass other people’s tests and get their approval. I was always very much willing to play the game “maximize A conditioned on B” – when I could laze out and be comfortable I would, but only after guaranteeing I’d pass the test.

I spent a lot of time as a child being alternately confused about or contemptuous of other kids who didn’t do as well at tests, especially when they claimed to be “doing their best.” It seemed to me that “doing your best” means passing the test at all costs, and it was glaringly obvious to me that every single other student could do that, especially given how easy the tests were. It took a long time for me to realize that “do your best” actually meant “maximize B conditioned on A” – don’t mutilate yourself to get other people’s approval – and even longer to understand that this might actually be right.


I wanted to conclude this essay by making sweeping generalizations about human psychology, but then I realized that I’m still not confident the phenomenon I’m describing is real. Here are the claims I’d like to make:

  • All hard decisions involve tradeoffs between (at least) two competing values.
  • Instead of treating competing values as roughly equal in weight, usually human beings will weigh one WAY more than the other, so in practice “maximize A and B” rounds off to “maximize A conditioned on B.”
  • Often this is the correct behavior, even if it is surprising. Usually one of A or B will actually be several orders of magnitude more impactful than the other.
  • Sometimes this is the incorrect behavior but people still do it because human beings can only optimize one function at a time.
  • Many interpersonal conflicts occur because one person is trying to solve “maximize A given B” and the other is trying to solve “maximize B given A” and each thinks they’re solving the same problem as the other person, just in a better way.
  • We need to learn to maximize functions like A+B.

Thoughts, examples, counterexamples?

The Arrogance of Vision

Humility is almost uniformly lauded in our culture, to the point that many people have forgotten the appeal of pride. I’m bullish on pride because I can’t help noticing how inherently appealing arrogance can be. On TV there’s an endless litany of charming characters like Gregory House, Sherlock Holmes, and Tony Stark whose defining characteristic is their bullheaded ability to plow through social and cultural norms with the sheer force of intellect.

I want to pinpoint a taste for a particular type of arrogance today. To be clear, I think it’s not truly about arrogance at all, but it certainly comes off that way. I associate this character trait with the eye: people who have it are gifted with a particular strength of vision, and the confidence to rely on it. This trait is personified by Eliezer Yudkowsky. Consider:

Look, you don’t understand human nature. People wouldn’t try for five minutes before giving up if the fate of humanity were at stake.

Use the Try Harder, Luke

Sometimes I think the main difference between people who like Yudkowsky and people who hate him is whether they respond well to this kind of arrogance. I imagine a simple litmus test is whether you identify with the writer who uses a lot of italics to drill into your thick skull how very important this point right here is.

But I digress.

Culture Simplifies

Culture is a powerful and simplifying thing – it lets us label a messy world with a discrete set of approved categories and interact with this world through a set of approved actions. The world is too complicated, and its action space too large, for a single human to search by himself. And as part of the social contract, because we need the simplifying power of culture, we pretend to be much simpler, much cleaner, much more inoffensive than we are. Think of culture as a GUI for social reality, an enormous simplifying force that lets you assume the stranger on the street will not pounce on you and tickle you, that your relationship is acceptable because a number is greater than eighteen, that this cap makes you a brooding artist but that one with the different brim makes you a creepy neckbeard.

I derive endless pleasure from reading the subreddit r/relationships, and one of the most interesting patterns I’ve noticed over the years is that people tend to ask “is the way my husband screamed at me normal?” as often or more often than “is the way my husband screamed at me right?” And of course the former is seen as a proxy for the latter, because, you see, to answer  “is this normal?” requires only a simple verification against the rules of culture, whereas to answer “is this right?” requires a detailed analysis of context and a complete moral philosophy, something very few people have access to.

The Role of the Eye

Every so often, a highly disagreeable individual with a good eye comes along and says: the overlay is kind of broken. It’s oversimplifying here. It’s mis-categorizing there. Stop using it. You have built into you the faculty to see the world as it is, to interact with reality on its own, in all of its wonderful complexity, without recourse to this child’s gadget called culture. Every six months I have to stare at the Kandinsky print on my wall and remind myself that not only is this a symbol of my rarefied taste, it’s a painting I actually enjoy looking at.

Next time you make an important decision, notice how what you’re trying to calculate is what you’re supposed to do. Notice how you can also calculate what is right instead. I’m not saying you need to be a hero and go do that instead – maybe the answers to both questions are the same, even. But notice how different the processes by which you answer these questions feel. That to figure out what is right requires so much more work, and such a richer interaction with reality.

And maybe if you can live like this for a period of years, taking the time to independently verify the answers that culture drip-feeds you, there will come a day when you too have the confidence to take off the overlay and see the world as it is.

At least that’s what I tell myself.

Four WEIRD Theorems I Learned from Wikipedia

You Won’t Believe How SIMPLE Their Proofs Are!

1. Balinski’s Theorem. If P is a convex d-dimensional polytope, then its skeleton is a d-connected graph. (The skeleton of a polytope is the graph you get from just taking the vertices and edges. A d-connected graph is a graph which is still connected if you delete any d-1 vertices.)

Proof. Let S be any set of at most d-1 vertices of P, and pick a vertex v of P outside S. Since S \cup \{v\} contains at most d points, it is contained in a hyperplane H in \mathbb{R}^d, and there is an affine functional f which is zero on H but nonzero outside it. Call a point positive if f takes a positive value on it, and negative if f takes a negative value on it.

Now, apply the simplex method from linear programming; it tells us that there is a path from every vertex to the f-maximizing vertex v_+, and each move along this path strictly increases the value of f. Thus v and every positive vertex are connected to v_+ by a path which, after its starting point, passes only through positive vertices, and so avoids S. Similarly, v and every negative vertex are connected to the f-minimizing vertex v_- through negative vertices only. A vertex outside S on which f is zero is handled exactly like v. Thus every vertex outside S is connected to v.

2. The De Bruijn–Erdős Theorem. If all finite subgraphs of an infinite graph G are k-colorable, then so is G. (A graph is k-colorable if it has a proper k-coloring, i.e. an assignment of integers from 1 through k to vertices so that adjacent vertices have different colors.)

Proof. Proper colorings of G=(V,E) are functions V \rightarrow [k], which we can think of as points in X = [k]^V. Under the product topology, X is compact by Tychonoff’s theorem. For any finite subgraph H of G, denote by X_H the (closed) subset of functions in X which properly color H. By the assumption, X_H is nonempty. Furthermore, any finite intersection of X_H’s is itself an X_H (take the union of the finitely many subgraphs involved) and is therefore nonempty. Thus, by compactness the mutual intersection of all the X_H’s is nonempty. Any point in this common intersection is a proper coloring of G.

3. Frucht’s Theorem. Every finite group is the automorphism group of a finite simple graph.

Proof. For a finite group G, pick a set of generators S and construct the (directed) Cayley graph H = \Gamma(G, S), the graph whose vertex set is G and edges are drawn between elements differing by an element of S. Color each edge according to which generator it represents multiplication by. Then, it is easy to check that the only automorphisms of the (colored, directed) H are given by multiplication by elements of G, and so Aut(H) = G.

It remains to convert H into an honest uncolored, undirected graph. This can be done by choosing |S| different “weird gadget” graphs to replace the edges in each color class with. For example, I might replace the edges of the first color with paths of length seventeen with a Petersen graph attached to the fourth internal vertex. As long as these gadgets are weird enough, no additional automorphisms are introduced.

4. The Gallai–Hasse–Roy–Vitaver Theorem. The chromatic number of an undirected graph G is equal to the minimum, over all orientations of G, of the number of vertices in the longest oriented path of that orientation. (An orientation of an undirected graph G is a directed graph obtained by giving each edge in G a direction.)

Proof. If G has a proper k-coloring, orient each edge to go from the smaller color to the larger color. Then, every oriented path in this orientation contains at most k vertices.

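In the opposite direction, suppose we have an orientation H of G for which every oriented path contains at most k vertices. Pick a maximal acyclic subgraph H_0 \subseteq H, i.e. keep adding edges that don’t form an oriented cycle until you can’t. Color each vertex of G by the number of vertices in the longest path of H_0 ending in that vertex. This will be a proper k-coloring of G: along an edge of H_0 the color strictly increases, and any edge of H left out of H_0 closes an oriented cycle with a path of H_0, so the color strictly increases along that path from its head to its tail.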
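
Both directions of the proof are easy to play with in code. The sketch below is my own illustration, with made-up function names and a made-up example graph (a 5-cycle). It only covers the case where the orientation is already acyclic, so that H_0 is all of H; the general case would first need the step of picking a maximal acyclic subgraph.

```python
from functools import lru_cache

def orient_by_coloring(edges, color):
    """Orient each undirected edge from the smaller color to the larger one."""
    return [(u, v) if color[u] < color[v] else (v, u) for u, v in edges]

def coloring_from_dag(vertices, arcs):
    """Color each vertex by the number of vertices on the longest directed
    path ending at it. Assumes `arcs` contain no directed cycle."""
    preds = {v: [] for v in vertices}
    for u, v in arcs:
        preds[v].append(u)

    @lru_cache(maxsize=None)
    def longest_ending_at(v):
        # A vertex with no incoming arcs is a path of one vertex.
        return 1 + max((longest_ending_at(u) for u in preds[v]), default=0)

    return {v: longest_ending_at(v) for v in vertices}

# Hypothetical example: a 5-cycle with a proper 3-coloring.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
color = {0: 1, 1: 2, 2: 1, 3: 2, 4: 3}

arcs = orient_by_coloring(edges, color)         # first direction of the proof
recolored = coloring_from_dag(vertices, arcs)   # second direction
print(recolored)
```

Orienting by color always gives an acyclic orientation, which is why the second function can be applied directly to the output of the first. The recovered coloring is proper and uses no more colors than the one we started with.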