Artificial Intelligence

Book review: Warnings: Finding Cassandras to Stop Catastrophes, by Richard A. Clarke and R.P. Eddy.

This book is moderately addictive softcore version of outrage porn. Only small portions of the book attempt to describe how to recognize valuable warnings and ignore the rest. Large parts of the book seem written mainly to tell us which of the people portrayed in the book we should be outraged at, and which we should praise.

Normally I wouldn’t get around to finishing and reviewing a book containing this little information value, but this one was entertaining enough that I couldn’t stop.

The authors show above-average competence at selecting which warnings to investigate, but don’t convince me that they articulated how they accomplished that.

I’ll start with warnings on which I have the most expertise. I’ll focus a majority of my review on their advice for deciding which warnings matter, even though that may give the false impression that much of the book is about such advice.
Continue Reading

[Warning: long post, of uncertain value, with annoyingly uncertain conclusions.]

This post will focus on how hardware (cpu power) will affect AGI timelines. I will undoubtedly overlook some important considerations; this is just a model of some important effects that I understand how to analyze.

I’ll make some effort to approach this as if I were thinking about AGI timelines for the first time, and focusing on strategies that I use in other domains.

I’m something like 60% confident that the most important factor in the speed of AI takeoff will be the availability of computing power.

I’ll focus here on the time to human-level AGI, but I suspect this reasoning implies getting from there to superintelligence at speeds that Bostrom would classify as slow or moderate.
Continue Reading

The paper When Will AI Exceed Human Performance? Evidence from AI Experts reports ML researchers expect AI will create a 5% chance of “Extremely bad (e.g. human extinction)” consequences, yet they’re quite divided over whether that implies it’s an important problem to work on.

Slate Star Codex expresses confusion about and/or disapproval of (a slightly different manifestation of) this apparent paradox. It’s a pretty clear sign that something is suboptimal.

Here are some conjectures (not designed to be at all mutually exclusive).
Continue Reading

A new paper titled When Will AI Exceed Human Performance? Evidence from AI Experts reports some bizarre results. From the abstract:

Researchers believe there is a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years, with Asian respondents expecting these dates much sooner than North Americans.

So we should expect a 75 year period in which machines can perform all tasks better and more cheaply than humans, but can’t automate all occupations. Huh?

I suppose there are occupations that consist mostly of having status rather than doing tasks (queen of England, or waiter at a classy restaurant that won’t automate service due to the high status of serving food the expensive way). Or occupations protected by law, such as gas station attendants who pump gas in New Jersey, decades after most drivers switched to pumping for themselves.

But I’d be rather surprised if machine learning researchers would think of those points when answering a survey in connection with a machine learning conference.

Maybe the actual wording of the survey questions caused a difference that got lost in the abstract? Hmmm …

“High-level machine intelligence” (HLMI) is achieved when unaided machines can accomplish every task better and more cheaply than human workers


when all occupations are fully automatable. That is, when for any occupation, machines could be built to carry out the task better and more cheaply than human workers.

I tried to convince myself that the second version got interpreted as referring to actually replacing humans, while the first version referred to merely being qualified to replace humans. But the more I compared the two, the more that felt like wishful thinking. If anything, the “unaided” in the first version should make that version look farther in the future.

Can I find any other discrepancies between the abstract and the details? The 120 years in the abstract turns into 122 years in the body of the paper. So the authors seem to be downplaying the weirdness of the results.

There’s even a prediction of a 50% chance that the occupation “AI researcher” will be automated in about 88 years (I’m reading that from figure 2; I don’t see an explicit number for it). I suspect some respondents said this would take longer than for machines to “accomplish every task better and more cheaply”, but I don’t see data in the paper to confirm that [1].

A more likely hypothesis is that researchers alter their answers based on what they think people want to hear. Researchers might want to convince their funders that AI deals with problems that can be solved within the career of the researcher [2], while also wanting to reassure voters that AI won’t create massive unemployment until the current generation of workers has retired.

That would explain the general pattern of results, although the magnitude of the effect still seems strange. And it would imply that most machine learning researchers are liars, or have so little understanding of when HLMI will arrive that they don’t notice a 50% shift in their time estimates.

The ambiguity in terms such as “tasks” and “better” could conceivably explain confusion over the meaning of HLMI. I keep intending to write a blog post that would clarify concepts such as human-level AI and superintelligence, but then procrastinating because my thoughts on those topics are unclear.

It’s hard to avoid the conclusion that I should reduce my confidence in any prediction of when AI will reach human-level competence. My prior 90% confidence interval was something like 10 to 300 years. I guess I’ll broaden it to maybe 8 to 400 years [3].

P.S. – See also Katja’s comments on prior surveys.

[1] – the paper says most participants were asked the question that produced the estimate of 45 years to HLMI, the rest got the question that produced the 122 year estimate. So the median for all participants ought to be less than about 84 years, unless there are some unusual quirks in the data.

[2] – but then why do experienced researchers say human-level AI is farther in the future than new researchers, who presumably will be around longer? Maybe the new researchers are chasing fads or get-rich-quick schemes, and will mostly quit before becoming senior researchers?

[3] – years of subjective time as experienced by the fastest ems. So probably nowhere near 400 calendar years.

Book review: The Measure of All Minds: Evaluating Natural and Artificial Intelligence, by José Hernández-Orallo.

Much of this book consists of surveys of the psychometric literature. But the best parts of the book involve original results that bring more rigor and generality to the field. The best parts of the book approach the quality that I saw in Judea Pearl’s Causality, and E.T. Jaynes’ Probability Theory, but Measure of All Minds achieves a smaller fraction of its author’s ambitions, and is sometimes poorly focused.

Hernández-Orallo has an impressive ambition: measure intelligence for any agent. The book mentions a wide variety of agents, such as normal humans, infants, deaf-blind humans, human teams, dogs, bacteria, Q-learning algorithms, etc.

The book is aimed at a narrow and fairly unusual target audience. Much of it reads like it’s directed at psychology researchers, but the more original parts of the book require thinking like a mathematician.

The survey part seems pretty comprehensive, but I wasn’t satisfied with his ability to distinguish the valuable parts (although he did a good job of ignoring the politicized rants that plague many discussions of this subject).

For nearly the first 200 pages of the book, I was mostly wondering whether the book would address anything important enough for me to want to read to the end. Then I reached an impressive part: a description of an objective IQ-like measure. Hernández-Orallo offers a test (called the C-test) which:

  • measures a well-defined concept: sequential inductive inference,
  • defines the correct responses using an objective rule (based on Kolmogorov complexity),
  • with essentially no arbitrary cultural bias (the main feature that looks like an arbitrary cultural bias is the choice of alphabet and its order)[1],
  • and gives results in objective units (based on Levin’s Kt).

Yet just when I got my hopes up for a major improvement in real-world IQ testing, he points out that what the C-test measures is too narrow to be called intelligence: there’s a 960 line Perl program that exhibits human-level performance on this kind of test, without resembling a breakthrough in AI.
Continue Reading

I’ve recently noticed some possibly important confusion about machine learning (ML)/deep learning. I’m quite uncertain how much harm the confusion will cause.

On MIRI’s Intelligent Agent Foundations Forum:

If you don’t do cognitive reductions, you will put your confusion in boxes and hide the actual problem. … E.g. if neural networks are used to predict math, then the confusion about how to do logical uncertainty is placed in the black box of “what this neural net learns to do”

On SlateStarCodex:

Imagine a future inmate asking why he was denied parole, and the answer being “nobody knows and it’s impossible to find out even in principle” … (DeepMind employs a Go master to help explain AlphaGo’s decisions back to its own programmers, which is probably a metaphor for something)

A possibly related confusion, from a conversation that I observed recently: philosophers have tried to understand how concepts work for centuries, but have made little progress; therefore deep learning isn’t very close to human-level AGI.

I’m unsure whether any of the claims I’m criticizing reflect actually mistaken beliefs, or whether they’re just communicated carelessly. I’m confident that at least some people at MIRI are wise enough to avoid this confusion [1]. I’ve omitted some ensuing clarifications from my description of the deep learning conversation – maybe if I remembered those sufficiently well, I’d see that I was reacting to a straw man of that discussion. But it seems likely that some people were misled by at least the SlateStarCodex comment.

There’s an important truth that people refer to when they say that neural nets (and machine learning techniques in general) are opaque. But that truth gets seriously obscured when rephrased as “black box” or “impossible to find out even in principle”.
Continue Reading

Two and a half years ago, Eliezer was (somewhat plausibly) complaining that virtually nobody outside of MIRI was working on AI-related existential risks.

This year (at EAGlobal) one of MIRI’s talks was a bit hard to distinguish from an AI safety talk given by someone with pretty mainstream AI affiliations.

What happened in that time to cause that shift?

A large change was catalyzed by the publication of Superintelligence. I’ve been mildly disappointed about how little it affected discussions among people who were already interested in the topic. But Superintelligence caused a large change in how many people are willing to express concern over AI risks. That’s presumably because Superintelligence looks sufficiently academic and neutral to make many people comfortable about citing it, whereas similar arguments by Eliezer/MIRI didn’t look sufficiently prestigious within academia.

A smaller part of the change was MIRI shifting its focus somewhat to be more in line with how mainstream machine learning (ML) researchers expect AI to reach human levels.

Also, OpenAI has been quietly shifting in a more MIRI-like direction (I’m very unclear on how big a change this is). (Paul Christiano seems to deserve some credit for both the MIRI and OpenAI shifts in strategies.)

Given those changes, it seems like MIRI ought to be able to attract more donations than before. Especially since it has demonstrated evidence of increasing competence, and also because HPMoR seemed to draw significantly more people into the community of people who are interested in MIRI.

MIRI has gotten one big grant from OpenPhilanthropy that it probably couldn’t have gotten when mainstream AI researchers were treating MIRI’s concerns as too far-fetched to be worth commenting on. But donations from MIRI’s usual sources have stagnated.

That pattern suggests that MIRI was previously benefiting from a polarization effect, where the perception of two distinct “tribes” (those who care about AI risks versus those who promote AI) energized people to care about “their tribe”.

Whereas now there’s no clear dividing line between MIRI and mainstream researchers. Also, there’s lots of money going into other organizations that plan to do something about AI safety. (Most of those haven’t yet articulated enough of a strategy to make me optimistic that that money is well spent. I still endorse the ideas I mentioned last year in How much Diversity of AGI-Risk Organizations is Optimal?. I’m unclear on how much diversity of approaches we’re getting from the recent proliferation of AI safety organizations.)

That kind of pattern of donations creates perverse incentives to charities to at least market themselves as fighting a powerful group of people, rather than (as the ideal charity should be) addressing a neglected problem. Even if that marketing doesn’t distort a charity’s operations, the charity will be tempted to use counterproductive alarmism. AI risk organizations have resisted those temptations (at least recently), but it seems risky to tempt them.

That’s part of why I recently made a modest donation to MIRI, in spite of the uncertainty over the value of their efforts (I had last donated to them in 2009).

Book review: Notes on a New Philosophy of Empirical Science (Draft Version), by Daniel Burfoot.

Standard views of science focus on comparing theories by finding examples where they make differing predictions, and rejecting the theory that made worse predictions.

Burfoot describes a better view of science, called the Compression Rate Method (CRM), which replaces the “make prediction” step with “make a compression program”, and compares theories by how much they compress a standard (large) database.

These views of science produce mostly equivalent results(!), but CRM provides a better perspective.

Machine Learning (ML) is potentially science, and this book focuses on how ML will be improved by viewing its problems through the lens of CRM. Burfoot complains about the toolkit mentality of traditional ML research, arguing that the CRM approach will turn ML into an empirical science.

This should generate a Kuhnian paradigm shift in ML, with more objective measures of the research quality than any branch of science has achieved so far.

Burfoot focuses on compression as encoding empirical knowledge of specific databases / domains. He rejects the standard goal of a general-purpose compression tool. Instead, he proposes creating compression algorithms that are specialized for each type of database, to reflect what we know about topics (such as images of cars) that are important to us.
Continue Reading

MIRI has produced a potentially important result (called Garrabrant induction) for dealing with uncertainty about logical facts.

The paper is somewhat hard for non-mathematicians to read. This video provides an easier overview, and more context.

It uses prediction markets! “It’s a financial solution to the computer science problem of metamathematics”.

It shows that we can evade disturbing conclusions such as Godel incompleteness and the paradox of the liar, by expecting to only be very confident about logically deducible facts (as opposed to being mathematically certain). That’s similar to the difference between treating beliefs about empirical facts as probabilities, as opposed to boolean values.

I’m somewhat skeptical that it will have an important effect on AI safety, but my intuition says it will produce enough benefits somewhere that it will become at least as famous as Pearl’s work on causality.