A new paper titled “When Will AI Exceed Human Performance? Evidence from AI Experts” reports some bizarre results. From the abstract:

Researchers believe there is a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years, with Asian respondents expecting these dates much sooner than North Americans.

So we should expect a 75-year period in which machines can perform all tasks better and more cheaply than humans, but can’t automate all occupations. Huh?

I suppose there are occupations that consist mostly of having status rather than doing tasks (queen of England, or waiter at a classy restaurant that won’t automate service due to the high status of serving food the expensive way). Or occupations protected by law, such as gas station attendants who pump gas in New Jersey, decades after most drivers switched to pumping for themselves.

But I’d be rather surprised if machine learning researchers would think of those points when answering a survey in connection with a machine learning conference.

Maybe the actual wording of the survey questions caused a difference that got lost in the abstract? Hmmm …

“High-level machine intelligence” (HLMI) is achieved when unaided machines can accomplish every task better and more cheaply than human workers

versus

when all occupations are fully automatable. That is, when for any occupation, machines could be built to carry out the task better and more cheaply than human workers.

I tried to convince myself that the second version got interpreted as referring to actually replacing humans, while the first version referred to merely being qualified to replace humans. But the more I compared the two, the more that felt like wishful thinking. If anything, the “unaided” in the first version should make that version look farther in the future.

Can I find any other discrepancies between the abstract and the details? The 120 years in the abstract turns into 122 years in the body of the paper. So the authors seem to be downplaying the weirdness of the results.

There’s even a prediction of a 50% chance that the occupation “AI researcher” will be automated in about 88 years (I’m reading that from figure 2; I don’t see an explicit number for it). I suspect some respondents said this would take longer than for machines to “accomplish every task better and more cheaply”, but I don’t see data in the paper to confirm that [1].

A more likely hypothesis is that researchers alter their answers based on what they think people want to hear. Researchers might want to convince their funders that AI deals with problems that can be solved within the career of the researcher [2], while also wanting to reassure voters that AI won’t create massive unemployment until the current generation of workers has retired.

That would explain the general pattern of results, although the magnitude of the effect still seems strange. And it would imply that most machine learning researchers are liars, or have so little understanding of when HLMI will arrive that they don’t notice a 50% shift in their time estimates.

The ambiguity in terms such as “tasks” and “better” could conceivably explain confusion over the meaning of HLMI. I keep intending to write a blog post that would clarify concepts such as human-level AI and superintelligence, but then procrastinating because my thoughts on those topics are unclear.

It’s hard to avoid the conclusion that I should reduce my confidence in any prediction of when AI will reach human-level competence. My prior 90% confidence interval was something like 10 to 300 years. I guess I’ll broaden it to maybe 8 to 400 years [3].

P.S. – See also Katja’s comments on prior surveys.

[1] – the paper says most participants were asked the question that produced the estimate of 45 years to HLMI, the rest got the question that produced the 122 year estimate. So the median for all participants ought to be less than about 84 years, unless there are some unusual quirks in the data.

[2] – but then why do experienced researchers say human-level AI is farther in the future than new researchers, who presumably will be around longer? Maybe the new researchers are chasing fads or get-rich-quick schemes, and will mostly quit before becoming senior researchers?

[3] – years of subjective time as experienced by the fastest ems. So probably nowhere near 400 calendar years.
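The arithmetic behind footnote [1] can be sketched as follows. This is a rough illustration, not the paper's method: it assumes the pooled figure behaves like a weighted average of the two reported estimates, which is a simplification (a true pooled median would need the raw responses). The function name `pooled_estimate` and the 0.6 share are hypothetical; the paper only says "most" participants got the HLMI question.

```python
# Reported estimates from the paper, in years.
HLMI_YEARS = 45               # "every task" framing
FULL_AUTOMATION_YEARS = 122   # "all occupations" framing


def pooled_estimate(share_hlmi: float) -> float:
    """Weighted average of the two estimates.

    share_hlmi is the (assumed) fraction of participants who were
    asked the HLMI question; the rest got the occupations question.
    """
    if not 0.0 <= share_hlmi <= 1.0:
        raise ValueError("share_hlmi must be between 0 and 1")
    return share_hlmi * HLMI_YEARS + (1.0 - share_hlmi) * FULL_AUTOMATION_YEARS


if __name__ == "__main__":
    # An even split gives the midpoint, the "about 84 years" ceiling.
    print(pooled_estimate(0.5))   # 83.5
    # Any share above one half (0.6 is a made-up example) lands below it.
    print(pooled_estimate(0.6))
    # Gap between the two framings, using the body's 122 rather than 120.
    print(FULL_AUTOMATION_YEARS - HLMI_YEARS)   # 77
```

Since more than half of the participants got the 45-year question, any share above 0.5 pulls the pooled number below the 83.5-year midpoint, which is all the footnote claims.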

[Another underwhelming book; I promise to get out of the habit of posting only book reviews Real Soon Now.]

Book review: Seeing like a State: How Certain Schemes to Improve the Human Condition Have Failed, by James C. Scott.

Scott begins with a history of the tension between the desire for legibility and the desire for local control. E.g. central governments wanted to know how much they could tax peasants without causing famine or revolt. Yet even in the optimistic case where they got an honest tax collector to report how many bushels of grain John produced, they had problems due to John’s village having an idiosyncratic meaning of “bushel” that the tax collector couldn’t easily translate to something the central government knew. And it was hard to keep track of whether John had paid the tax, since the central government didn’t understand how the villagers distinguished that John from the John who lived a mile away.

So governments that wanted to grow imposed lots of standards on people. That sometimes helped peasants by making their taxes fairer and more predictable, but often trampled over local arrangements that had worked well (especially complex land use agreements).

I found that part of the book to be a fairly nice explanation of why an important set of conflicts was nearly inevitable. Scott gives a relatively balanced view of how increased legibility had both good and bad effects (more efficient taxation, diseases tracked better, Nazis found more Jews, etc.).

Then Scott becomes more repetitive and one-sided when describing high modernism, which carried the desire for legibility to a revolutionary, authoritarian extreme (especially between 1920 and 1960). I didn’t want 250 pages of evidence that Soviet style central planning was often destructive. Maybe that conclusion wasn’t obvious to enough people when Scott started writing the book, but it was painfully obvious by the time the book was published.

Scott’s complaints resemble the Hayekian side of the socialist calculation debate, except that Scott frames them in terms that minimize associations with socialism and capitalism. E.g. he manages to include Taylorist factory management in his cluster of bad ideas.

It’s interesting to compare Fukuyama’s description of Tanzania with Scott’s description. They both agree that villagization (Scott’s focus) was a disaster. Scott leaves readers with the impression that villagization was the most important policy, whereas Fukuyama only devotes one paragraph to it, and gives the impression that the overall effects of Tanzania’s legibility-increasing moves were beneficial (mainly via a common language causing more cooperation). Neither author provides a balanced view (but then they were both drawing attention to neglected aspects of history, not trying to provide a complete picture).

My advice: read the SlateStarCodex review, don’t read the whole book.

[An unimportant book that I read for ARC; feel free to skip this.]

Book review: Be Yourself, Everyone Else is Already Taken: Transform Your Life with the Power of Authenticity, by Mike Robbins.

This book’s advice mostly feels half-right, and mostly directed at people who have somewhat different problems than I have.

The book’s exercises range from things I’ve already done enough of, to things I ought to practice more but which feel hard (such as the self-love exercise).

Book review: State, Economy, and the Great Divergence: Great Britain and China, 1680s – 1850s, by Peer Vries.

Yet another book on why Britain and China diverged dramatically starting around 1800. This one focuses on documenting the differences between the regions, with relatively little theorizing.

Some interesting differences of possible relevance to the divergence:

  • British per capita tax collections were 15 times China’s [1]; Vries emphasizes the underlying British bureaucratic competence.
  • Britain changed its tax rules often; China treated tax rules as if set in stone.
  • British tax policy caused it to promote standardization of a wide variety of weights and measures, which helped long-distance trades; China had nothing similar.
  • Britain’s taxation was more egalitarian than China’s (but still much less egalitarian than today).
  • British government debt looked recklessly high; China consistently had a surplus.
  • British elites wanted to keep the masses poor (to make them industrious); China’s elites seemed neutral or had slight preferences for the poor to prosper.
  • Most British workers were nearly slaves – laws restricted their mobility due to the expectation that most who left their area of work were beggars/thieves; China was less restrictive.
  • Britain condoned or supported powerful monopolies; China broke up concentrations of merchant power / capital under the assumption that they came at the expense of ordinary people.
  • Britain had three times as much farm land per capita as China.
  • Britain was more urban, so it had more commercial / monetary activity.
  • China denied that anything outside its borders mattered. Britain had a fairly global worldview.


Book review: The Measure of All Minds: Evaluating Natural and Artificial Intelligence, by José Hernández-Orallo.

Much of this book consists of surveys of the psychometric literature. But the best parts of the book involve original results that bring more rigor and generality to the field. Those parts approach the quality that I saw in Judea Pearl’s Causality, and E.T. Jaynes’ Probability Theory, but Measure of All Minds achieves a smaller fraction of its author’s ambitions, and is sometimes poorly focused.

Hernández-Orallo has an impressive ambition: measure intelligence for any agent. The book mentions a wide variety of agents, such as normal humans, infants, deaf-blind humans, human teams, dogs, bacteria, Q-learning algorithms, etc.

The book is aimed at a narrow and fairly unusual target audience. Much of it reads like it’s directed at psychology researchers, but the more original parts of the book require thinking like a mathematician.

The survey part seems pretty comprehensive, but I wasn’t satisfied with his ability to distinguish the valuable parts (although he did a good job of ignoring the politicized rants that plague many discussions of this subject).

For nearly the first 200 pages of the book, I was mostly wondering whether the book would address anything important enough for me to want to read to the end. Then I reached an impressive part: a description of an objective IQ-like measure. Hernández-Orallo offers a test (called the C-test) which:

  • measures a well-defined concept: sequential inductive inference,
  • defines the correct responses using an objective rule (based on Kolmogorov complexity),
  • with essentially no arbitrary cultural bias (the main feature that looks like an arbitrary cultural bias is the choice of alphabet and its order)[1],
  • and gives results in objective units (based on Levin’s Kt).

Yet just when I got my hopes up for a major improvement in real-world IQ testing, he points out that what the C-test measures is too narrow to be called intelligence: there’s a 960 line Perl program that exhibits human-level performance on this kind of test, without resembling a breakthrough in AI.

Book review: Political Order and Political Decay, by Francis Fukuyama.

This book describes the rise of modern nation-states, from the French revolution to the present.

Fukuyama focuses on three features that influence national success: state (effective bureaucracy), rule of law, and autonomy (democratic accountability).

Much of the book argues against libertarian ideas from a fairly centrist perspective, although he mostly avoids directly discussing libertarian beliefs. Instead, he implies that we should de-emphasize debates over big government versus small government, and look more at effectiveness versus corruption (i.e. we should pull sideways).

Many of these ideas build on what Fukuyama wrote in Trust – I suggest reading that book first.


Compare Ian Morris’s War! What Is It Good For? Fukuyama believes that war sometimes causes states to make their bureaucracy more efficient. Fukuyama is more credible here than Morris because Fukuyama is more cautious about the effects he claims to see.

The book suggests that young nations have some key stage where threat of conquest can create the right incentives for developing an efficient bureaucracy (i.e. without efficient support for the military, including effective taxation, they get absorbed into a state that does better at those tasks). Without such a threat, states can get stuck in an equilibrium where the bureaucracy simply serves a small number of powerful people. But with such a threat, politicians need to delegate enough authority that the bureaucracy develops some independence, which enables it to care about broader notions of national welfare. (Fukuyama talks as if the bureaucracies are somewhat altruistic. I think of it more as the bureaucracies caring about their long-term revenue source, when individual politicians don’t hold power long enough to care about the long term).

It seems plausible that China would have helped to lead the industrial revolution if it had faced a serious risk of being conquered in the 17th and 18th centuries. China’s relative safety back then seems to have left it complacent and stagnant.


Fukuyama hints that the three pillars of modern nation-states (state, law, autonomy) have roughly equal importance.

Yet I don’t buy that. I expect that whatever virtues are responsible for the rule of law are a good deal more important than effective bureaucracies or democratic accountability.

Fukuyama doesn’t make a strong case for the value of democracy for national success, presumably in part because he expects most readers to already agree with him about that. I’ll conjecture that democracy is mostly a byproduct of success at the other features that Fukuyama considers important.

It’s likely that democracy is somewhat valuable for generating fairness, but that has limited relevance to what Fukuyama tries to explain (i.e. mainly power and wealth).


Full-fledged rule of law might be needed to get all the benefits of the best modern societies. But the differences between good and bad nations seem to have originated well before those nations had more than a rudimentary version of rule of law.

That suggests some underlying factor that matters – maybe just the basic notion of law as something separate from individual leaders or ethnic groups (Fukuyama’s previous book says Christianity played an important role here); or maybe the kind of cultural advance suggested by Greg Clark.

Fukuyama argues that it’s risky to adopt democracy before creating effective states and the rule of law. He’s probably right to expect that such democracies will be dominated by people who fight to get the spoils of politics for their family / clan / ethnic group, with little thought to national wellbeing.


National identity is important for producing the kind of government that Fukuyama likes. It’s hard for government employees to focus on the welfare of the nation if they identify mainly as members of a non-majority ethnic group.

He mentions that the printing press helped create national identities out of more fragmented cultures. This seems important enough to Europe’s success that it deserves more emphasis than the two paragraphs he devotes to it.

He describes several countries that started out as a patchwork of ethnic groups, and had differing degrees of success at developing a unified national identity: Tanzania, Kenya, Nigeria, and Indonesia. I was a bit disappointed that the differences there seemed to be mostly accidents of the personalities of leading politicians.

He talks as if the only two options for such regions were to develop a clear national identity or be crippled by ethnic conflict. Why not also consider the option of splitting into smaller political units that can aim to become city-states such as Singapore and Dubai?


He makes many minor claims that sound suspicious enough for me to have moderate doubts about trusting his scholarship.

For example, he tries to refute claims that “industrial policy never works”, mainly by using the example of the government developing the internet. (His use of the word “never” suggests that he’s not exactly attacking the most sophisticated version of the belief in question). How familiar is he with the history of the internet? The entities in charge of the internet tried to restrict commercial use until 1995. Actual commercial use of the internet started before the government made a clear decision to tolerate such use, much less endorse it. So Fukuyama either has a faulty understanding of internet history, or is using the phrase industrial policy in a way that puzzles me.

Then there’s the claim that the Spanish conquered important parts of the New World before the native nations had declined due to European diseases. Fukuyama seems unfamiliar with the contrary evidence reported by Charles C. Mann in 1491 and 1493. Mann may not be an ideal source, but he appears at least as reliable as the sources that Fukuyama cites.


That leads into more general doubts about history books, especially ambitiously broad books aimed at popular audiences.

Tetlock’s research into the accuracy of political pundits has led me to assume that a broad range of “expert” commentary is roughly equivalent to random guessing. Much of what historians do [1] seems quite similar to the opinions of the experts that Tetlock studies. Neither historians nor political pundits get adequate feedback about mistaken beliefs, or get significant rewards for insights that are later confirmed by new evidence. That leads me to worry that the study of history is little better than voodoo.


In sum, I can’t quite decide whether to recommend that you read this book.

[1] – I.e. drawing inferences from aggregations of data. That’s not to say that historians don’t devote lots of time to reporting observed facts. But most of those facts don’t have value to me unless I can generalize from them in ways that help me understand the future. Historians’ choices of what facts to emphasize will unavoidably influence any generalizations I draw.

Book review: Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness, by Peter Godfrey-Smith.

This book describes some interesting mysteries, but provides little help at solving them.

It provides some pieces of a long-term perspective on the evolution of intelligence.

Cephalopods’ most recent common ancestor with vertebrates lived way back before the Cambrian explosion. Nervous systems back then were primitive enough that minds didn’t need to react to other minds, and predation was a rare accident, not something animals prepared carefully to cause and avoid.

So cephalopod intelligence evolved rather independently from most of the minds we observe. We could learn something about alien minds by understanding them.

Intelligence may even have evolved more than once in cephalopods – nobody seems to know whether octopuses evolved intelligence separately from squids/cuttlefish.

An octopus has a much less centralized mind than vertebrates do. Does an octopus have a concept of self? The book presents evidence that octopuses sometimes seem to think of their arms as parts of their self, yet hints that their concept of self is a good deal weaker than in humans, and maybe the octopus treats its arms as semi-autonomous entities.


Does an octopus have color vision? Not via its photoreceptors the way many vertebrates do. Simple tests of octopuses’ ability to discriminate color also say no.

Yet octopuses clearly change color to camouflage themselves. They also change color in ways that suggest they’re communicating via a visual language. But to whom?

One speculative guess is that the color-producing parts act as color filters, with monochrome photoreceptors in the skin evaluating the color of the incoming light by how much the light is attenuated by the filters. So they “see” color with their skin, but not their eyes.

That would still leave plenty of mystery about what they’re communicating.


The author’s understanding of aging implies that few organisms die of aging in the wild. He sees evidence in octopuses that conflicts with this prediction, yet that doesn’t alert him to the growing evidence of problems with the standard theories of aging.

He says octopuses are subject to much predation. Why doesn’t this cause them to be scared of humans? He has surprising anecdotes of octopuses treating humans as friends, e.g. grabbing one and leading him on a ten-minute “tour”.

He mentions possible REM sleep in cuttlefish. That would almost certainly have evolved independently from vertebrate REM sleep, which must indicate something important.

I found the book moderately entertaining, but I was underwhelmed by the author’s expertise. The subtitle’s reference to “the Deep Origins of Consciousness” led me to expect more than I got.

I’ve recently noticed some possibly important confusion about machine learning (ML)/deep learning. I’m quite uncertain how much harm the confusion will cause.

On MIRI’s Intelligent Agent Foundations Forum:

If you don’t do cognitive reductions, you will put your confusion in boxes and hide the actual problem. … E.g. if neural networks are used to predict math, then the confusion about how to do logical uncertainty is placed in the black box of “what this neural net learns to do”

On SlateStarCodex:

Imagine a future inmate asking why he was denied parole, and the answer being “nobody knows and it’s impossible to find out even in principle” … (DeepMind employs a Go master to help explain AlphaGo’s decisions back to its own programmers, which is probably a metaphor for something)

A possibly related confusion, from a conversation that I observed recently: philosophers have tried to understand how concepts work for centuries, but have made little progress; therefore deep learning isn’t very close to human-level AGI.

I’m unsure whether any of the claims I’m criticizing reflect actually mistaken beliefs, or whether they’re just communicated carelessly. I’m confident that at least some people at MIRI are wise enough to avoid this confusion [1]. I’ve omitted some ensuing clarifications from my description of the deep learning conversation – maybe if I remembered those sufficiently well, I’d see that I was reacting to a straw man of that discussion. But it seems likely that some people were misled by at least the SlateStarCodex comment.

There’s an important truth that people refer to when they say that neural nets (and machine learning techniques in general) are opaque. But that truth gets seriously obscured when rephrased as “black box” or “impossible to find out even in principle”.

Book review: Aging is a Group-Selected Adaptation: Theory, Evidence, and Medical Implications, by Joshua Mitteldorf.

This provocative book argues that our genes program us to age because aging provided important benefits.

I’ll refer here to antagonistic pleiotropy (AP) and programmed aging (PA) as the two serious contending hypotheses of aging. (Mutation accumulation used to be a leading hypothesis, but it seems discredited now, due to the number of age-related deaths seen in a typical species, and due to evidence that aging is promoted by some ancient genes).

Here’s a dumbed down version of the debate:
<theorist>: Hamilton proved that all conceivable organisms age due to AP and/or mutation accumulation.
<critic>: But the PA theories better predict how many die from aging, the effects of telomeres, calorie restriction, etc. Also, here are some organisms with zero or negative aging …
<theorist>: A few anomalies aren’t enough to overturn a well-established theory. The well-known PA theories are obviously wrong because selfish genes would outbreed the PA genes.
<critic>: Here are some new versions which might explain how aging could enhance a species’ fitness …
<theorist>: I’ve read enough bad group-selection theories that I’m not going to waste my time with more of them.

That kind of reaction from theorists might make sense if AP was well established. But AP seems to have been well established only in the Darwinian sense of being firmly entrenched in scientists’ minds. It got entrenched mainly by being the least wrong of a flawed set of theories, combined with some poor communication between theorists and naturalists. Wikipedia has a surprisingly good[1] page on the evolution of aging that says:

Antagonistic pleiotropy is a prevailing theory today, but this is largely by default, and not because the theory has been well verified.


Book review: Superforecasting: The Art and Science of Prediction, by Philip E. Tetlock and Dan Gardner.

This book reports on the Good Judgment Project (GJP).

Much of the book recycles old ideas: 40% of the book is a rerun of Thinking, Fast and Slow, 15% of the book repeats Wisdom of Crowds, and 15% of the book rehashes How to Measure Anything. Those three books were good enough that it’s very hard to improve on them. Superforecasting nearly matches their quality, but most people ought to read those three books instead. (Anyone who still wants more after reading them will get decent value out of reading the last 4 or 5 chapters of Superforecasting).

The book is very readable, with an almost Gladwell-like style (a large contrast to Tetlock’s previous, more scholarly book), at a moderate cost in substance. It contains memorable phrases, such as “a fox with the bulging eyes of a dragonfly” (to describe looking at the world through many perspectives).
