The Human Mind

I’ve recently noticed some possibly important confusion about machine learning (ML)/deep learning. I’m quite uncertain how much harm the confusion will cause.

On MIRI’s Intelligent Agent Foundations Forum:

If you don’t do cognitive reductions, you will put your confusion in boxes and hide the actual problem. … E.g. if neural networks are used to predict math, then the confusion about how to do logical uncertainty is placed in the black box of “what this neural net learns to do”

On SlateStarCodex:

Imagine a future inmate asking why he was denied parole, and the answer being “nobody knows and it’s impossible to find out even in principle” … (DeepMind employs a Go master to help explain AlphaGo’s decisions back to its own programmers, which is probably a metaphor for something)

A possibly related confusion, from a conversation that I observed recently: philosophers have tried to understand how concepts work for centuries, but have made little progress; therefore deep learning isn’t very close to human-level AGI.

I’m unsure whether any of the claims I’m criticizing reflect actually mistaken beliefs, or whether they’re just communicated carelessly. I’m confident that at least some people at MIRI are wise enough to avoid this confusion [1]. I’ve omitted some ensuing clarifications from my description of the deep learning conversation – maybe if I remembered those sufficiently well, I’d see that I was reacting to a straw man of that discussion. But it seems likely that some people were misled by at least the SlateStarCodex comment.

There’s an important truth that people refer to when they say that neural nets (and machine learning techniques in general) are opaque. But that truth gets seriously obscured when rephrased as “black box” or “impossible to find out even in principle”.

Book review: The Rationality Quotient: Toward a Test of Rational Thinking, by Keith E. Stanovich, Richard F. West and Maggie E. Toplak.

This book describes an important approach to measuring individual rationality: an RQ test that loosely resembles an IQ test. But it pays inadequate attention to the most important problems with tests of rationality.


My biggest concern about rationality testing is what happens when people anticipate the test and are motivated to maximize their scores (as is the case with IQ tests). Do they:

  • learn to score high by “cheating” (i.e. learn what answers the test wants, without learning to apply that knowledge outside of the test)?
  • learn to score high by becoming more rational?
  • not change their score much, because they’re already motivated to do as well as their aptitudes allow (as is mostly the case with IQ tests)?

Alas, the book treats these issues as an afterthought. The authors’ test knowingly uses questions for which cheating would be straightforward, such as asking whether the test subject believes in science, and whether they prefer to get $85 now rather than $100 in three months. (If they could use real money, that would drastically reduce my concerns about cheating. I’m almost tempted to advocate doing that, but doing so would hinder widespread adoption of the test, even if using real money added enough value to pay for itself.)
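To see why the $85-versus-$100 question has an answer the test "wants", it helps to work out the return implied by waiting. The sketch below is only an illustration: the function name is hypothetical, and it assumes the three-month gain compounds four times a year.

```python
def implied_annual_rate(now: float, later: float, months: float) -> float:
    """Annualized return from taking `later` after `months` instead of `now` today.

    Assumes the per-period gain compounds (12 / months) times per year.
    """
    periods_per_year = 12.0 / months
    return (later / now) ** periods_per_year - 1.0


# Declining $85 now in favor of $100 in three months
rate = implied_annual_rate(85.0, 100.0, 3.0)
print(f"{rate:.1%}")  # roughly 91.6% per year
```

A return that high is far above what ordinary investments offer, so a subject who knows what the test rewards can pick the patient answer without actually being patient when real money is involved.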


[Caveat: this post involves abstract theorizing whose relevance to practical advice is unclear.]

What we call willpower mostly derives from conflicts between parts of our minds, often over what discount rate to use.

An additional source of willpower-like conflicts comes from social desirability biases.

I model the mind as having many mental sub-agents, each focused on a fairly narrow goal. Different goals produce different preferences for caring about the distant future versus caring only about the near future.

The sub-agents typically are as smart and sophisticated as a three year old (probably with lots of variation). E.g. my hunger-minimizing sub-agent is willing to accept calorie restriction days with few complaints now that I have a reliable pattern of respecting the hunger-minimizing sub-agent the next day, but complained impatiently when calorie restriction days seemed abnormal.

We have beliefs about how safe we are from near-term dangers, often reflected in changes to the autonomic nervous system (causing relaxation or the fight or flight reflex). Those changes cause quick, crude shifts in something resembling a global discount rate. In addition, each sub-agent has some ability to demand that its goals be treated fairly.

We neglect sub-agents whose goals are most long-term when many sub-agents say their goals have been neglected, and/or when the autonomic nervous system says immediate problems deserve attention.

Our willpower is high when we feel safe and are satisfied with our progress at short-term goals.

Social status

The time-discounting effects are sometimes obscured by social signaling.

Writing a will hints at health problems, whereas doing something about global warming can signal wealth. We have sub-agents that steer us to signal health and wealth, but without doing so in a deliberate enough way that people see that we are signaling. That leads us to exaggerate how much of our failure to write a will is due to the time-discounting type of low willpower.

Video games convince parts of our minds that we’re gaining status (in a virtual society) and/or training to win status-related games in real life. That satisfies some sub-agents who care about status. (Video games deceive us about status effects, but that has limited relevance to this post.) Yet as with most play, we suppress awareness of the zero-sum competitions we’re aiming to win. So we get confused about whether we’re being short-sighted here, because we’re pursuing somewhat long-term benefits, probably deceiving ourselves somewhat about them, and pretending not to care about them.

Time asymmetry?

Why do we feel an asymmetry in effects of neglecting distant goals versus neglecting immediate goals?

The fairness to sub-agents metaphor suggests that neglecting the distant future ought to produce emotional reactions comparable to what happens when we neglect the near future.

Neglecting the distant future does produce some discomfort that somewhat resembles willpower problems. If I spend lots of time watching TV, I end up feeling declining life-satisfaction, which tends to eventually cause me to pay more attention to long-term goals.

But the relevant emotions still don’t seem symmetrical.

One reason for asymmetry is that different goals imply different things for what constitutes neglecting a goal: neglecting sleep or food for a day implies something more unfair to the relevant sub-agents than does neglecting one’s career skills.

Another reason is that for both time-preference and social desirability conflicts, we have instincts that aren’t optimized for our current environment.

Our hunter-gatherer ancestors needed to devote most of their time to tasks that paid off within days, and didn’t know how to devote more than a few percent of their time to usefully preparing for events that were several years in the future. Our farmer ancestors needed to devote more time to 3-12 month planning horizons, but not much more than hunter-gatherers did. Today many of us can productively spend large fractions of our time on tasks (such as getting a college degree) that take more than 5 years to pay off. Social desirability biases show (less clear) versions of that same pattern.

That means we need to override our system 1 level heuristics with system 2 level analysis. That requires overriding the instinctive beliefs of some sub-agents about how much attention their goals deserve. By contrast, the long-term goals we override to deal with hunger have less firmly established “rights” to fairness.

Also, there may be some fairness rules about how often system 2 can override system 1 agents – doing that too often may cause coalitions within system 1 to treat system 2 as a politician who has grabbed too much power. [Does this explain decision fatigue? I’m unsure.]

Other Models of Willpower

The depletion model

Willpower depletion captures a nontrivial effect of key sub-agents rebelling when their goals have been overlooked for too long.

But I’m confused – the depletion model doesn’t seem like it’s trying to be a complete model of willpower. In particular, it either isn’t trying to explain evolutionary sources of willpower problems, or is trying to explain them via the clearly inadequate claim that willpower is a simple function of current blood glucose levels.

It would be fine if the depletion model were just a heuristic that helped us develop more willpower. But if anything it seems more likely to reduce willpower.

Kurzban’s opportunity costs model

Kurzban et al. have a model involving the opportunity costs of using cognitive resources for a given task.

It seems more realistic than most models I’ve seen. It describes some important mental phenomena more clearly than I can, but doesn’t quite seem to be about willpower. In particular, it seems uninformative about differing time horizons. Also, it focuses on cognitive resource constraints, whereas I’d expect some non-cognitive resource constraints to be equally important.

Ainslie’s Breakdown of Will

George Ainslie wrote a lot about willpower, describing it as intertemporal bargaining, with hyperbolic discounting. I read that book 6 years ago, but don’t remember it very clearly, and I don’t recall how much it influenced my current beliefs. I think my model looks a good deal like what I’d get if I had set out to combine the best parts of Ainslie’s ideas and Kurzban’s ideas, but I wrote 90% of this post before remembering that Ainslie’s book was relevant.

Ainslie apparently wrote his book before it became popular to generate simple models of willpower, so he didn’t put much thought into comparing his views to others.

Hyperbolic discounting seems to be a real phenomenon that would be sufficient to cause willpower-like conflicts. But I’m unclear on why it should be a prominent part of a willpower model.
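The claim that hyperbolic discounting is sufficient to cause willpower-like conflicts can be illustrated with a small sketch. It assumes the standard textbook form V = A / (1 + kD) for hyperbolic discounting and an exponential curve for comparison. The reward sizes, delays, and the parameters `k` and `r` are illustrative assumptions, not values taken from Ainslie’s book.

```python
def hyperbolic(amount: float, delay: float, k: float = 1.0) -> float:
    """Present value under hyperbolic discounting: A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential(amount: float, delay: float, r: float = 0.5) -> float:
    """Present value under exponential discounting: A / (1 + r) ** D."""
    return amount * (1.0 + r) ** (-delay)


def prefers_larger_later(discount, lead_time: float) -> bool:
    """Compare a small reward at `lead_time` with a large one 2 units later."""
    small_soon = discount(50.0, lead_time)
    large_late = discount(100.0, lead_time + 2.0)
    return large_late > small_soon


for name, discount in [("hyperbolic", hyperbolic), ("exponential", exponential)]:
    far = prefers_larger_later(discount, lead_time=10.0)   # both rewards distant
    near = prefers_larger_later(discount, lead_time=0.0)   # small reward available now
    print(f"{name}: far={far}, near={near}")
```

With these assumed numbers, the hyperbolic discounter prefers the larger reward when both are far off, then switches to the smaller one once it becomes immediately available. The exponential discounter ranks the two the same way at both distances. That reversal is the kind of conflict between an earlier and a later self that intertemporal bargaining is meant to describe.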


This “model” isn’t designed to say much beyond pointing out that willpower doesn’t reliably get depleted.


A Hot/cool-system model sounds like an attempt to generalize the effects of the autonomic nervous system to explain all of willpower. I haven’t found it to be very informative.


Some say that willpower works like a muscle, in that using it strengthens it.

My model implies that we should expect this result when preparing for the longer-term future causes our future self to be safer and/or to more easily satisfy near-term goals.

I expect this effect to be somewhat observable with using willpower to save money, because having more money makes us feel safer and better able to satisfy our goals.

I expect this effect to be mostly absent after using willpower to lose weight or to write a will, since those produce benefits which are less intuitive and less observable.

Why do drugs affect willpower?

Scott at SlateStarCodex asks why drugs have important effects on willpower.

Many drugs affect the autonomic nervous system, thereby influencing our time preferences. I’d certainly expect that drugs which reduce anxiety will enable us to give higher priority to far future goals.

I expect stimulants make us feel less concern about depleting our available calories, and less concern about our need for sleep, thereby satisfying a few short-term sub-agents. I expect this to cause small increases in willpower.

But this is probably incomplete. I suspect the effect of SSRIs on willpower varies quite widely between people. I suspect that’s due to an anti-anxiety effect which increases willpower, plus an anti-obsession effect which reduces willpower in a way that my model doesn’t explain.

And Scott implies that some drugs have larger effects on willpower than I can explain.

My model implies that placebos can be mildly effective at increasing willpower, by convincing some short-sighted sub-agents that resources are being applied toward their goals. A quick search suggests this prediction has been poorly studied so far, with one low-quality study confirming this.


I’m more puzzled than usual about whether these ideas are valuable. Is this model profound, or too obvious to matter?

I presume part of the answer is that people who care about improving willpower care less about theory, and focus on creating heuristics that are easy to apply.

CFAR does a decent job of helping people develop more willpower, not by explaining a clear theory of what willpower is, but by focusing more on how to resolve conflicts between sub-agents.

And I recommend that most people start with practical advice, such as the advice in The Willpower Instinct, and worry about theory later.

I’ve substantially reduced my anxiety over the past 5-10 years.

Many of the important steps along that path look easy in hindsight, yet the overall goal looked sufficiently hard prospectively that I usually assumed it wasn’t possible. I only ended up making progress by focusing on related goals.

In this post, I’ll mainly focus on problems related to general social anxiety among introverted nerds. It will probably be much less useful to others.

In particular, I expect it doesn’t apply very well to ADHD-related problems, and I have little idea how well it applies to the results of specific PTSD-type trauma.

It should be slightly useful for anxiety over politicians who are making America grate again. But you’re probably fooling yourself if you blame many of your problems on distant strangers.

Trump: Make America Grate Again!


I started writing morning pages a few months ago. That means writing three pages, on paper, before doing anything else [1].

I’ve only been doing this on weekends and holidays, because on weekdays I feel a need to do some stock market work close to when the market opens.

It typically takes me one hour to write three pages. At first, it felt like I needed 75 minutes but wanted to finish faster. After a few weeks, it felt like I could finish in about 50 minutes when I was in a hurry, but often preferred to take more than an hour.

That suggests I’m doing much less stream-of-consciousness writing than is typical for morning pages. It’s unclear whether that matters.

It feels like devoting an hour per day to morning pages ought to be costly. Yet I never observed it crowding out anything I valued (except maybe once or twice when I woke up before getting an optimal amount of sleep in order to get to a hike on time – that was due to scheduling problems, not due to morning pages reducing the amount of time available per day).

Why do people knowingly follow bad investment strategies?

I won’t ask (in this post) about why people hold foolish beliefs about investment strategies. I’ll focus on people who intend to follow a decent strategy, and fail. I’ll illustrate this with a stereotype from a behavioral economist (Procrastination in Preparing for Retirement):[1]

For instance, one of the authors has kept an average of over $20,000 in his checking account over the last 10 years, despite earning an average of less than 1% interest on this account and having easy access to very liquid alternative investments earning much more.

A more mundane example is a person who holds most of their wealth in stock of a single company, for reasons of historical accident (they acquired it via employee stock options or inheritance), but admits to preferring a more diversified portfolio.

An example from my life is that, until this year, I often borrowed money from Schwab to buy stock, when I could have borrowed at lower rates in my Interactive Brokers account to do the same thing. (This was partly due to habits that I developed while carelessly unaware of the difference in rates, and partly due to a number of trivial inconveniences.)

Behavioral economists are somewhat correct to attribute such mistakes to questionable time discounting. But I see more patterns than such a model can explain (e.g. people procrastinate more over some decisions (whether to make a “boring” trade) than others (whether to read news about investments)).[2]

Instead, I use CFAR-style models that focus on conflicting motives of different agents within our minds.


Book review: Are We Smart Enough to Know How Smart Animals Are?, by Frans de Waal.

This book is primarily about discrediting false claims of human uniqueness, and showing how easy it is to screw up evaluations of a species’ cognitive abilities. It is best summarized by the cognitive ripple rule:

Every cognitive capacity that we discover is going to be older and more widespread than initially thought.

De Waal provides many anecdotes of carefully designed experiments detecting abilities that previously appeared to be absent. E.g. Asian elephants failed mirror tests with small, distant mirrors. When experimenters dared to put large mirrors close enough for the elephants to touch, some of them passed the test.

Likewise, initial observations of behaviorist humans suggested they were rigidly fixated on explaining all behavior via operant conditioning. Yet one experimenter managed to trick a behaviorist into demonstrating more creativity, by harnessing the one motive that behaviorists prefer over their habit of advocating operant conditioning: their desire to accuse people of recklessly inferring complex cognition.

De Waal seems moderately biased toward overstating cognitive abilities of most species (with humans being one clear exception to that pattern).

At one point he gave me the impression that he was claiming elephants could predict where a thunderstorm would hit days in advance. I checked the reference, and what the elephants actually did was predict the arrival of the wet season, and respond with changes such as longer steps (but probably not with indications that they knew where thunderstorms would hit). After rereading de Waal’s wording, I decided it was ambiguous. But his claim that elephants “hear thunder and rainfall hundreds of miles away” exaggerates the original paper’s “detected … at distances greater than 100 km … perhaps as much as 300 km”.

But in the context of language, de Waal switches to downplaying reports of impressive abilities. I wonder how much of that is due to his desire to downplay claims that human minds are better, and how much of that is because his research isn’t well suited to studying language.

I agree with the book’s general claims. The book provides evidence that human brains embody only small, somewhat specialized improvements on the cognitive abilities of other species. But I found the book less convincing on that subject than some other books I’ve read recently. I suspect that’s mainly due to de Waal’s focus on anecdotes that emphasize what’s special about each species or individual. By contrast, The Human Advantage rigorously quantifies important ways in which human brains are just a bigger primate brain (but primate brains are special!). And The Secret of Our Success (which doesn’t use particularly rigorous methods) provides a better perspective, by describing a model in which ape minds evolve to human minds via ordinary, gradual adaptations to mildly new environments.

In sum, this book is good at explaining the problems associated with research into animal cognition. It is merely ok at providing insights about how smart various species are.

Book review: Made-Up Minds: A Constructivist Approach to Artificial Intelligence, by Gary L. Drescher.

It’s odd to call a book boring when it uses the pun “ontology recapitulates phylogeny”[1] to describe a surprising feature of its model. About 80% of the book is dull enough that I barely forced myself to read it, yet the occasional good idea persuaded me not to give up.

Drescher gives a detailed model of how Piaget-style learning in infants could enable them to learn complex concepts starting with minimal innate knowledge.

Book review: The Human Advantage: A New Understanding of How Our Brain Became Remarkable, by Suzana Herculano-Houzel.

I used to be uneasy about claims that the human brain was special because it is large for our body size: relative size just didn’t seem like it could be the best measure of whatever enabled intelligence.

At last, Herculano-Houzel has invented a replacement for that measure. Her impressive technique for measuring the number of neurons in a brain has revolutionized this area of science.

We can now see an important connection between the number of cortical neurons and cognitive ability. I’m glad that the book reports on research that compares the cognitive abilities of enough species to enable moderately objective tests of the relevant hypotheses (although the research still has much room for improvement).

We can also see that the primate brain is special, in a way that enables large primates to be smarter than similarly sized nonprimates. And that humans are not very special for a primate of our size, although energy constraints make it tricky for primates to reach our size.

I was able to read the book quite quickly. Much of it is arranged in an occasionally suspenseful story about how the research was done. It doesn’t have lots of information, but the information it does have seems very new (except for the last two chapters, where Herculano-Houzel gets farther from her area of expertise).

Added 2016-08-25:
Wikipedia has a List of animals by number of neurons which lists the long-finned pilot whale as having 37.2 billion cortical neurons, versus 21 billion for humans.

The paper reporting that result disagrees somewhat with Herculano-Houzel:

Our results underscore that correlations between cognitive performance and absolute neocortical neuron numbers across animal orders or classes are of limited value, and attempts to quantify the mental capacity of a dolphin for cross-species comparisons are bound to be controversial.

But I don’t see much of an argument against the correlation between intelligence and cortical neuron numbers. The lack of good evidence about long-finned pilot whale intelligence mainly implies we ought to be uncertain.

Book review: The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter, by Joseph Henrich.

This book provides a clear explanation of how an ability to learn cultural knowledge made humans evolve into something unique over the past few million years. It’s by far the best book I’ve read on human evolution.

Before reading this book, I thought human uniqueness depended on something somewhat arbitrary and mysterious which made sexual selection important for human evolution, and wondered whether human language abilities depended on some lucky mutation. Now I believe that the causes of human uniqueness were firmly in place 2-3 million years ago, and the remaining arbitrary events seem much farther back on the causal pathway (e.g. what was unique about apes? why did our ancestors descend from trees 4.4 million years ago? why did the climate become less stable 3 million years ago?)

Human language now seems like a natural byproduct of previous changes, and probably started sooner (and developed more gradually) than many researchers think.

I used to doubt that anyone could find good evidence of cultures that existed millions of years ago. But Henrich provides clear explanations of how features such as right-handedness and endurance running demonstrate important milestones in human abilities to generate culture.

Henrich’s most surprising claim is that there’s an important sense in which individual humans are no smarter than other apes. Our intellectual advantage over apes is mostly due to a somewhat special-purpose ability to combine our individual brains into a collective intelligence. His evidence on this point is weak, but it’s plausible enough to be interesting.

Henrich occasionally exaggerates a bit. The only place where that bothered me was where he claimed that heart attack patients who carefully adhered to taking placebos were half as likely to die as patients who failed to reliably take placebos. The author wants to believe that demonstrates the power of placebos. I say the patients’ failure to take placebos was just a symptom of an underlying health problem (dementia?).

I’m a bit surprised at how little Robin Hanson says about Henrich’s main points. Henrich suggests that there’s cultural pressure to respect high-status people, for reasons that are somewhat at odds with Robin’s ally/coalition based reasons. Henrich argues that knowledge coming from high-status people, at least in hunter-gatherer societies, tended to be safer than knowledge from more directly measurable evidence. The cultural knowledge that accumulates over many generations aggregates information that could not be empirically acquired in a short time.

So Henrich implies it’s reasonable for people to be confused about whether evidence-based medicine embodies more wisdom than eminence-based medicine. Traditional culture has become less valuable recently due to the rapid changes in our environment (particularly the technology component of our environment), but cultures that abandoned traditions too readily were often hurt by consequences which take decades to observe.

I got more out of this book than a short review can describe (such as “How Altruism is like a Chili Pepper”). Here’s a good closing quote:

we are smart, but not because we stand on the shoulders of giants or are giants ourselves. We stand on the shoulders of a very large pyramid of hobbits.