MIRI

All posts tagged MIRI

The paper When Will AI Exceed Human Performance? Evidence from AI Experts reports ML researchers expect AI will create a 5% chance of “Extremely bad (e.g. human extinction)” consequences, yet they’re quite divided over whether that implies it’s an important problem to work on.

Slate Star Codex expresses confusion about and/or disapproval of (a slightly different manifestation of) this apparent paradox. It’s a pretty clear sign that something is suboptimal.

Here are some conjectures (not designed to be at all mutually exclusive).
Continue Reading

I’ve recently noticed some possibly important confusion about machine learning (ML)/deep learning. I’m quite uncertain how much harm the confusion will cause.

On MIRI’s Intelligent Agent Foundations Forum:

If you don’t do cognitive reductions, you will put your confusion in boxes and hide the actual problem. … E.g. if neural networks are used to predict math, then the confusion about how to do logical uncertainty is placed in the black box of “what this neural net learns to do”

On SlateStarCodex:

Imagine a future inmate asking why he was denied parole, and the answer being “nobody knows and it’s impossible to find out even in principle” … (DeepMind employs a Go master to help explain AlphaGo’s decisions back to its own programmers, which is probably a metaphor for something)

A possibly related confusion, from a conversation that I observed recently: philosophers have tried to understand how concepts work for centuries, but have made little progress; therefore deep learning isn’t very close to human-level AGI.

I’m unsure whether any of the claims I’m criticizing reflect actually mistaken beliefs, or whether they’re just communicated carelessly. I’m confident that at least some people at MIRI are wise enough to avoid this confusion [1]. I’ve omitted some ensuing clarifications from my description of the deep learning conversation – maybe if I remembered those sufficiently well, I’d see that I was reacting to a straw man of that discussion. But it seems likely that some people were misled by at least the SlateStarCodex comment.

There’s an important truth that people refer to when they say that neural nets (and machine learning techniques in general) are opaque. But that truth gets seriously obscured when rephrased as “black box” or “impossible to find out even in principle”.
Continue Reading

Two and a half years ago, Eliezer was (somewhat plausibly) complaining that virtually nobody outside of MIRI was working on AI-related existential risks.

This year (at EAGlobal) one of MIRI’s talks was a bit hard to distinguish from an AI safety talk given by someone with pretty mainstream AI affiliations.

What happened in that time to cause that shift?

A large change was catalyzed by the publication of Superintelligence. I’ve been mildly disappointed about how little it affected discussions among people who were already interested in the topic. But Superintelligence caused a large change in how many people are willing to express concern over AI risks. That’s presumably because Superintelligence looks sufficiently academic and neutral to make many people comfortable about citing it, whereas similar arguments by Eliezer/MIRI didn’t look sufficiently prestigious within academia.

A smaller part of the change was MIRI shifting its focus somewhat to be more in line with how mainstream machine learning (ML) researchers expect AI to reach human levels.

Also, OpenAI has been quietly shifting in a more MIRI-like direction (I’m very unclear on how big a change this is). (Paul Christiano seems to deserve some credit for both the MIRI and OpenAI shifts in strategies.)

Given those changes, it seems like MIRI ought to be able to attract more donations than before. Especially since it has demonstrated evidence of increasing competence, and also because HPMoR seemed to draw significantly more people into the community of people who are interested in MIRI.

MIRI has gotten one big grant from OpenPhilanthropy that it probably couldn’t have gotten when mainstream AI researchers were treating MIRI’s concerns as too far-fetched to be worth commenting on. But donations from MIRI’s usual sources have stagnated.

That pattern suggests that MIRI was previously benefiting from a polarization effect, where the perception of two distinct “tribes” (those who care about AI risks versus those who promote AI) energized people to care about “their tribe”.

Whereas now there’s no clear dividing line between MIRI and mainstream researchers. Also, there’s lots of money going into other organizations that plan to do something about AI safety. (Most of those haven’t yet articulated enough of a strategy to make me optimistic that that money is well spent. I still endorse the ideas I mentioned last year in How much Diversity of AGI-Risk Organizations is Optimal?. I’m unclear on how much diversity of approaches we’re getting from the recent proliferation of AI safety organizations.)

That kind of pattern of donations creates perverse incentives to charities to at least market themselves as fighting a powerful group of people, rather than (as the ideal charity should be) addressing a neglected problem. Even if that marketing doesn’t distort a charity’s operations, the charity will be tempted to use counterproductive alarmism. AI risk organizations have resisted those temptations (at least recently), but it seems risky to tempt them.

That’s part of why I recently made a modest donation to MIRI, in spite of the uncertainty over the value of their efforts (I had last donated to them in 2009).

Book review: Notes on a New Philosophy of Empirical Science (Draft Version), by Daniel Burfoot.

Standard views of science focus on comparing theories by finding examples where they make differing predictions, and rejecting the theory that made worse predictions.

Burfoot describes a better view of science, called the Compression Rate Method (CRM), which replaces the “make prediction” step with “make a compression program”, and compares theories by how much they compress a standard (large) database.

These views of science produce mostly equivalent results(!), but CRM provides a better perspective.

Machine Learning (ML) is potentially science, and this book focuses on how ML will be improved by viewing its problems through the lens of CRM. Burfoot complains about the toolkit mentality of traditional ML research, arguing that the CRM approach will turn ML into an empirical science.

This should generate a Kuhnian paradigm shift in ML, with more objective measures of the research quality than any branch of science has achieved so far.

Burfoot focuses on compression as encoding empirical knowledge of specific databases / domains. He rejects the standard goal of a general-purpose compression tool. Instead, he proposes creating compression algorithms that are specialized for each type of database, to reflect what we know about topics (such as images of cars) that are important to us.
Continue Reading

MIRI has produced a potentially important result (called Garrabrant induction) for dealing with uncertainty about logical facts.

The paper is somewhat hard for non-mathematicians to read. This video provides an easier overview, and more context.

It uses prediction markets! “It’s a financial solution to the computer science problem of metamathematics”.

It shows that we can evade disturbing conclusions such as Godel incompleteness and the paradox of the liar, by expecting to only be very confident about logically deducible facts (as opposed to being mathematically certain). That’s similar to the difference between treating beliefs about empirical facts as probabilities, as opposed to boolean values.

I’m somewhat skeptical that it will have an important effect on AI safety, but my intuition says it will produce enough benefits somewhere that it will become at least as famous as Pearl’s work on causality.