existential risks

Book review: Human Compatible, by Stuart Russell.

Human Compatible provides an analysis of the long-term risks from artificial intelligence, by someone with a good deal more of the relevant prestige than any prior author on this subject.

What should I make of Russell? I skimmed his best-known book, Artificial Intelligence: A Modern Approach, and got the impression that it taught a bunch of ideas that were popular among academics, but which weren’t the focus of the people who were getting interesting AI results. So I guessed that people would be better off reading Deep Learning by Goodfellow, Bengio, and Courville instead. Human Compatible neither confirms nor dispels the impression that Russell is a bit too academic.

However, I now see that he was one of the pioneers of inverse reinforcement learning, which looks like a fairly significant advance that will likely become important someday (if it hasn’t already). So I’m inclined to treat him as a moderately good authority on AI.

The first half of the book is a somewhat historical view of AI, intended for readers who don’t know much about AI. It’s ok.

Book review: The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World, by Tom Chivers.

This book is a sympathetic portrayal of the rationalist movement by a quasi-outsider. It includes a well-organized explanation of why some people expect that AI will create large risks sometime this century, written in simple language that is suitable for a broad audience.

Caveat: I know many of the people who are described in the book. I’ve had some sort of connection with the rationalist movement since before it became distinct from transhumanism, and I’ve been mostly an insider since 2012. I read this book mainly because I was interested in how the rationalist movement looks to outsiders.

Chivers is a science writer. I normally avoid books by science writers, due to an impression that they mostly focus on telling interesting stories, without developing a deep understanding of the topics they write about.

Chivers’ understanding of the rationalist movement doesn’t quite qualify as deep, but he was surprisingly careful to read a lot about the subject, and to write only things he did understand.

Many times I reacted to something he wrote with “that’s close, but not quite right”. Usually when I reacted that way, Chivers did a good job of describing the rationalist message in question, and the main problem was either that rationalists haven’t figured out how to explain their ideas in a way that a broad audience can understand, or that rationalists are confused. So the complaints I make in the rest of this review are at most weakly directed in Chivers’ direction.

I saw two areas where Chivers overlooked something important.

Rationality

One involves CFAR.

Chivers wrote seven chapters on biases, and how rationalists view them, ending with “the most important bias”: knowing about biases can make you more biased. (italics his).

I get the impression that Chivers is sweeping this problem under the rug (Do we fight that bias by being aware of it? Didn’t we just read that that doesn’t work?). That is roughly what happened with many people who learned rationalism solely via written descriptions.

Then much later, when describing how he handled his conflicting attitudes toward the risks from AI, he gives a really great description of maybe 3% of what CFAR teaches (internal double crux), much like a blind man giving a really clear description of the upper half of an elephant’s trunk. He prefaces this narrative with the apt warning: “I am aware that this all sounds a bit mystical and self-helpy. It’s not.”

Chivers doesn’t seem to connect this exercise with the goal of overcoming biases. Maybe he was too busy applying the technique on an important problem to notice the connection with his prior discussions of Bayes, biases, and sanity. It would be reasonable for him to argue that CFAR’s ideas have diverged enough to belong in a separate category, but he seems to put them in a different category by accident, without realizing that many of us consider CFAR to be an important continuation of rationalists’ interest in biases.

World conquest

Chivers comes very close to covering all of the layman-accessible claims that Yudkowsky and Bostrom make. My one complaint here is that he only gives vague hints about why one bad AI can’t be stopped by other AIs.

A key claim of many leading rationalists is that AI will have some winner-take-all dynamics that will lead to one AI having a decisive strategic advantage after it crosses some key threshold, such as human-level intelligence.

This is a controversial position that is somewhat connected to foom (fast takeoff), but which might be correct even without foom.

Utility functions

“If I stop caring about chess, that won’t help me win any chess games, now will it?” – That chapter title provides a good explanation of why a simple AI would continue caring about its most fundamental goals.

Is that also true of an AI with more complex, human-like goals? Chivers is partly successful at explaining how to apply the concept of a utility function to a human-like intelligence. Rationalists (or at least those who actively research AI safety) have a clear meaning here, at least as applied to agents that can be modeled mathematically. But when laymen try to apply that to humans, confusion abounds, due to the ease of conflating subgoals with ultimate goals.

Chivers tries to clarify, using the story of Odysseus and the Sirens, and claims that the Sirens would rewrite Odysseus’ utility function. I’m not sure how we can verify that the Sirens work that way, or whether they would merely persuade Odysseus to make false predictions about his expected utility. Chivers at least states clearly that the Sirens try to prevent Odysseus (by making him run aground) from doing what his pre-Siren utility function advises. Chivers’ point could be a bit clearer if he specified that in his (nonstandard?) version of the story, the Sirens make Odysseus want to run aground.

Philosophy

“Essentially, he [Yudkowsky] (and the Rationalists) are thoroughgoing utilitarians.” – That’s a bit misleading. Leading rationalists are predominantly consequentialists, but mostly avoid committing to a moral system as specific as utilitarianism. Leading rationalists also mostly endorse moral uncertainty. Rationalists mostly endorse utilitarian-style calculation (which entails some of the controversial features of utilitarianism), but are careful to combine that with worry about whether we’re optimizing the quantity that we want to optimize.

I also recommend Utilitarianism and its discontents as an example of one rationalist’s nuanced partial endorsement of utilitarianism.

Political solutions to AI risk?

Chivers describes Holden Karnofsky as wanting “to get governments and tech companies to sign treaties saying they’ll submit any AGI designs to outside scrutiny before switching them on. It wouldn’t be iron-clad, because firms might simply lie”.

Most rationalists seem pessimistic about treaties such as this.

Lying is hardly the only problem. This idea assumes that there will be a tiny number of attempts, each with a very small number of launches that look like the real thing, as happened with the first moon landing and the first atomic bomb. Yet the history of software development suggests it will be something more like hundreds of attempts that look like they might succeed. I wouldn’t be surprised if there are millions of times when an AI is turned on, and the developer has some hope that this time it will grow into a human-level AGI. There’s no way that a large number of designs will get sufficient outside scrutiny to be of much use.

And if a developer is trying new versions of their system once a day (e.g. making small changes to a number that controls, say, openness to new experience), any requirement to submit all new versions for outside scrutiny would cause large delays, creating large incentives to subvert the requirement.

So any realistic treaty would need provisions that identify a relatively small set of design choices that need to be scrutinized.

I see few signs that any experts are close to developing a consensus about what criteria would be appropriate here, and I expect that doing so would require a significant fraction of the total wisdom needed for AI safety. I discussed my hope for one such criterion in my review of Drexler’s Reframing Superintelligence paper.
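
A rough back-of-the-envelope sketch of the throughput problem with such a treaty follows. Every number in it (team count, reviewer count, review time, the fraction of versions flagged for scrutiny) is an invented assumption, chosen only to show the shape of the arithmetic rather than to estimate real conditions.

```python
def review_backlog(teams, versions_per_team_per_day, reviewers,
                   days_per_review, horizon_days, fraction_flagged=1.0):
    """Return how many submitted versions are still unreviewed after horizon_days.

    All parameters are hypothetical; fraction_flagged is the share of new
    versions that the treaty's criteria would require to be scrutinized.
    """
    submitted = teams * versions_per_team_per_day * horizon_days * fraction_flagged
    capacity = reviewers * horizon_days / days_per_review
    reviewed = min(submitted, capacity)
    return submitted - reviewed


# Assumed scenario: 100 teams each trying one new version per day,
# 50 outside reviewers who each need 5 days per design, over one year.
everything = review_backlog(100, 1, 50, 5, 365)
# Same scenario, but only 1 version in 100 meets the criteria for scrutiny.
flagged_only = review_backlog(100, 1, 50, 5, 365, fraction_flagged=0.01)

print(f"unreviewed if every version is submitted: {everything:.0f}")
print(f"unreviewed if only flagged versions are submitted: {flagged_only:.0f}")
```

Under these made-up numbers, reviewing everything leaves most versions unexamined, while a narrow filter keeps reviewers ahead of submissions. The sketch ignores the hard part, which is choosing the filter.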

Rationalist personalities

Chivers mentions several plausible explanations for what he labels the “semi-death of LessWrong”, the most obvious being that Eliezer Yudkowsky finished most of the blogging that he had wanted to do there. But I’m puzzled by one explanation that Chivers reports: “the attitude … of thinking they can rebuild everything”. Quoting Robin Hanson:

At Xanadu they had to do everything different: they had to organize their meetings differently and orient their screens differently and hire a different kind of manager, everything had to be different because they were creative types and full of themselves. And that’s the kind of people who started the Rationalists.

That seems like a partly apt explanation for the demise of the rationalist startups MetaMed and Arbital. But LessWrong mostly copied existing sites, such as Reddit, and was only ambitious in the sense that Eliezer was ambitious about what ideas to communicate.

Culture

I guess a book about rationalists can’t resist mentioning polyamory. “For instance, for a lot of people it would be difficult not to be jealous.” Yes, when I lived in a mostly monogamous culture, jealousy seemed pretty standard. That attitude melted away when the Bay Area cultures that I associated with started adopting polyamory or something similar (shortly before the rationalists became a culture). Jealousy has much more purpose if my partner is flirting with monogamous people than if he’s flirting with polyamorists.

Less dramatically, “We all know people who are afraid of visiting their city centres because of terrorist attacks, but don’t think twice about driving to work.”

This suggests some weird filter bubbles somewhere. I thought that fear of cities got forgotten within a month or so after 9/11. Is this a difference between London and the US? Am I out of touch with popular concerns? Does Chivers associate more with paranoid people than I do? I don’t see any obvious answer.

Conclusion

It would be really nice if Chivers and Yudkowsky could team up to write a book, but this book is a close substitute for such a collaboration.

See also Scott Aaronson’s review.

Eric Drexler has published a book-length paper on AI risk, describing an approach that he calls Comprehensive AI Services (CAIS).

His primary goal seems to be reframing AI risk discussions to use a rather different paradigm than the one that Nick Bostrom and Eliezer Yudkowsky have been promoting. (There isn’t yet any paradigm that’s widely accepted, so this isn’t a Kuhnian paradigm shift; it’s better characterized as an amorphous field that is struggling to establish its first paradigm.) Dueling paradigms seem to be the best that the AI safety field can manage to achieve for now.

I’ll start by mentioning some important claims that Drexler doesn’t dispute:

  • an intelligence explosion might happen somewhat suddenly, in the fairly near future;
  • it’s hard to reliably align an AI’s values with human values;
  • recursive self-improvement, as imagined by Bostrom / Yudkowsky, would pose significant dangers.

Drexler likely disagrees about some of the claims made by Bostrom / Yudkowsky on those points, but he shares enough of their concerns about them that those disagreements don’t explain why Drexler approaches AI safety differently. (Drexler is more cautious than most writers about making any predictions concerning these three claims).

CAIS isn’t a full solution to AI risks. Instead, it’s better thought of as an attempt to reduce the risk of world conquest by the first AGI that reaches some threshold, preserve existing corrigibility somewhat past human-level AI, and postpone the need for a permanent solution until we have more intelligence.

Book review: Artificial Intelligence Safety and Security, by Roman V. Yampolskiy.

This is a collection of papers, with highly varying topics, quality, and importance.

Many of the papers focus on risks that are specific to superintelligence, some assuming that a single AI will take over the world, and some assuming that there will be many AIs of roughly equal power. Others focus on problems that are associated with current AI programs.

I’ve tried to arrange my comments on individual papers in roughly descending order of how important the papers look for addressing the largest AI-related risks, while also sometimes putting similar topics in one group. The result feels a little more organized than the book, but I worry that the papers are too dissimilar to be usefully grouped. I’ve ignored some of the less important papers.

The book’s attempt at organizing the papers consists of dividing them into “Concerns of Luminaries” and “Responses of Scholars”. Alas, I see few signs that many of the authors are even aware of what the other authors have written, much less that the later papers are attempts at responding to the earlier papers. It looks like the papers are mainly arranged in order of when they were written. There’s a modest cluster of authors who agree enough with Bostrom to constitute a single scientific paradigm, but half the papers demonstrate about as much of a consensus on what topic they’re discussing as I would expect to get from asking medieval peasants about airplane safety.

[Warning: long post, of uncertain value, with annoyingly uncertain conclusions.]

This post will focus on how hardware (CPU power) will affect AGI timelines. I will undoubtedly overlook some important considerations; this is just a model of some important effects that I understand how to analyze.

I’ll make some effort to approach this as if I were thinking about AGI timelines for the first time, and to focus on strategies that I use in other domains.

I’m something like 60% confident that the most important factor in the speed of AI takeoff will be the availability of computing power.

I’ll focus here on the time to human-level AGI, but I suspect this reasoning implies getting from there to superintelligence at speeds that Bostrom would classify as slow or moderate.

The paper When Will AI Exceed Human Performance? Evidence from AI Experts reports that ML researchers expect AI will create a 5% chance of “Extremely bad (e.g. human extinction)” consequences, yet they’re quite divided over whether that implies it’s an important problem to work on.

Slate Star Codex expresses confusion about and/or disapproval of (a slightly different manifestation of) this apparent paradox. It’s a pretty clear sign that something is suboptimal.

Here are some conjectures (not designed to be at all mutually exclusive).

Two and a half years ago, Eliezer was (somewhat plausibly) complaining that virtually nobody outside of MIRI was working on AI-related existential risks.

This year (at EAGlobal) one of MIRI’s talks was a bit hard to distinguish from an AI safety talk given by someone with pretty mainstream AI affiliations.

What happened in that time to cause that shift?

A large change was catalyzed by the publication of Superintelligence. I’ve been mildly disappointed about how little it affected discussions among people who were already interested in the topic. But Superintelligence caused a large change in how many people are willing to express concern over AI risks. That’s presumably because Superintelligence looks sufficiently academic and neutral to make many people comfortable about citing it, whereas similar arguments by Eliezer/MIRI didn’t look sufficiently prestigious within academia.

A smaller part of the change was MIRI shifting its focus somewhat to be more in line with how mainstream machine learning (ML) researchers expect AI to reach human levels.

Also, OpenAI has been quietly shifting in a more MIRI-like direction (I’m very unclear on how big a change this is). (Paul Christiano seems to deserve some credit for both the MIRI and OpenAI shifts in strategies.)

Given those changes, it seems like MIRI ought to be able to attract more donations than before. Especially since it has demonstrated evidence of increasing competence, and also because HPMoR seemed to draw significantly more people into the community of people who are interested in MIRI.

MIRI has gotten one big grant from OpenPhilanthropy that it probably couldn’t have gotten when mainstream AI researchers were treating MIRI’s concerns as too far-fetched to be worth commenting on. But donations from MIRI’s usual sources have stagnated.

That pattern suggests that MIRI was previously benefiting from a polarization effect, where the perception of two distinct “tribes” (those who care about AI risks versus those who promote AI) energized people to care about “their tribe”.

Whereas now there’s no clear dividing line between MIRI and mainstream researchers. Also, there’s lots of money going into other organizations that plan to do something about AI safety. (Most of those haven’t yet articulated enough of a strategy to make me optimistic that that money is well spent. I still endorse the ideas I mentioned last year in How much Diversity of AGI-Risk Organizations is Optimal? I’m unclear on how much diversity of approaches we’re getting from the recent proliferation of AI safety organizations.)

That kind of pattern of donations creates perverse incentives for charities to at least market themselves as fighting a powerful group of people, rather than (as the ideal charity should be) addressing a neglected problem. Even if that marketing doesn’t distort a charity’s operations, the charity will be tempted to use counterproductive alarmism. AI risk organizations have resisted those temptations (at least recently), but it seems risky to tempt them.

That’s part of why I recently made a modest donation to MIRI, in spite of the uncertainty over the value of their efforts (I had last donated to them in 2009).

This post is partly a response to arguments for only donating to one charity and to an 80,000 Hours post arguing against diminishing returns. But I’ll focus mostly on AGI-risk charities.

Diversifying Donations?

The rule that I should only donate to one charity is a good presumption to start with. Most objections to it are due to motivations that diverge from pure utilitarian altruism. I don’t pretend that altruism is my only motive for donating, so I’m not too concerned that I only do a rough approximation of following that rule.

Still, I want to follow the rule more closely than most people do. So when I direct less than 90% of my donations to tax-deductible nonprofits, I feel a need to point to diminishing returns [1] to donations to justify that.

With AGI risk organizations, I expect the value of diversity to sometimes override the normal presumption even for purely altruistic utilitarians (with caveats about having the time needed to evaluate multiple organizations, and having more than a few thousand dollars to donate; those caveats will exclude many people from this advice, so this post is mainly oriented toward EAs who are earning to give or wealthier people).
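
A toy sketch may make the role of diminishing returns concrete. The two charities, their value functions, and the budget below are all hypothetical, and the greedy allocation rule is used only for simplicity; nothing here measures any real organization.

```python
import math


def allocate(budget, step, value_fns):
    """Greedily give each `step` dollars to the charity where it adds the most value.

    value_fns maps a (hypothetical) charity name to a function from total
    dollars received to total value produced.
    """
    totals = {name: 0 for name in value_fns}
    for _ in range(budget // step):
        def gain(name):
            fn = value_fns[name]
            return fn(totals[name] + step) - fn(totals[name])
        best = max(totals, key=gain)
        totals[best] += step
    return totals


# Constant returns: every dollar to A is worth more than any dollar to B.
linear = {"A": lambda x: 1.0 * x, "B": lambda x: 0.8 * x}
# Diminishing returns: each extra dollar is worth less than the one before.
concave = {"A": lambda x: 3 * math.sqrt(x), "B": lambda x: 2 * math.sqrt(x)}

linear_split = allocate(10_000, 1_000, linear)
concave_split = allocate(10_000, 1_000, concave)
print(linear_split)
print(concave_split)
```

With constant returns the whole budget goes to the better charity, which is the one-charity rule. With diminishing returns the marginal value at the favored charity eventually drops below that of the alternative, so a purely utilitarian donor ends up splitting.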

Diminishing Returns?

Before explaining that, I’ll reply to the 80,000 Hours post about diminishing returns.

The 80,000 Hours post focuses on charities that mostly market causes to a wide audience. The economies of scale associated with brand recognition and social proof seem more plausible than any economies of scale available to research organizations.

The shortage of existential risk research seems more dangerous than any shortage of charities which are devoted to marketing causes, so I’m focusing on the most important existential risk.

I expect diminishing returns to be common after an organization grows beyond two or three people. One reason is that the founders of most organizations exert more influence than subsequent employees over important policy decisions [2], so at productive organizations founders are more valuable.

For research organizations that need the smartest people, the limited number of such people implies that only small organizations can have a large fraction of employees be highly qualified.

I expect donations to very young organizations to be more valuable than other donations (which implies diminishing returns to size on average):

  • It takes time to produce evidence that the organization is accomplishing something valuable, and donors quite sensibly prefer organizations that have provided such evidence.
  • Even when donors try to compensate for that by evaluating the charity’s mission statement or leader’s competence, it takes some time to adequately communicate those features (e.g. it’s rare for a charity to set up an impressive web site on day one).
  • It’s common for a charity to have suboptimal competence at fundraising until it grows large enough to hire someone with fundraising expertise.
  • Some charities are mainly funded by a few grants in the millions of dollars, and I’ve heard reports that those often take many months between being awarded and reaching the charities’ bank (not to mention delays in awarding the grants). This sometimes means months when a charity has trouble hiring anyone who demands an immediate salary.
  • Donors could in principle overcome these causes of bias, but as far as I can tell, few care about doing so. EAs come a little closer to doing this than others, but my observations suggest that EAs are almost as lazy about analyzing new charities as non-EAs.
  • Therefore, I expect young charities to be underfunded.

Why AGI risk research needs diversity

I see more danger of researchers pursuing useless approaches for existential risks in general, and AGI risks in particular (due partly to the inherent lack of feedback), than with other causes.

The most obvious way to reduce that danger is to encourage a wide variety of people and organizations to independently research risk mitigation strategies.

I worry about AGI-risk researchers focusing all their effort on a class of scenarios which rely on a false assumption.

The AI foom debate seems superficially like the main area where a false assumption might cause AGI research to end up mostly wasted. But there are enough influential people on both sides of this issue that I expect research to not ignore one side of that debate for long.

I worry more about assumptions that no prominent people question.

I’ll describe how such an assumption might look in hindsight via an analogy to some leading developers of software intended to accomplish what the web ended up accomplishing [3].

Xanadu stood out as the leading developer of global hypertext software in the 1980s to about the same extent that MIRI stands out as the leading AGI-risk research organization. One reason [4] that Xanadu accomplished little was the assumption that they needed to make money. Part of why that seemed obvious in the 1980s was that there were no ISPs delivering an internet-like platform to ordinary people, and hardware costs were a big obstacle to anyone who wanted to provide that functionality. The hardware costs declined at a predictable enough rate that Drexler was able to predict in Engines of Creation (published in 1986) that ordinary people would get web-like functionality within a decade.

A more disturbing reason for assuming that web functionality needed to make a profit was the ideology surrounding private property. People who opposed private ownership of homes, farms, factories, etc. were causing major problems. Most of us automatically treated ownership of software as working the same way as physical property.

People who are too young to remember attitudes toward free / open source software before about 1997 will have some trouble believing how reluctant people were to imagine valuable software being free. [5] Attitudes changed unusually fast due to the demise of communism and the availability of affordable internet access.

A few people (such as RMS) overcame the focus on cold war issues, but were too eccentric to convert many followers. We should pay attention to people with similarly eccentric AGI-risk views.

If I had to guess what faulty assumption AGI-risk researchers are making, I’d say something like faulty guesses about the nature of intelligence or the architecture of feasible AGIs. But the assumptions that look suspicious to me are ones that some moderately prominent people have questioned.

Vague intuitions along these lines have led me to delay some of my potential existential-risk donations in hopes that I’ll discover (or help create?) some newly created existential-risk projects which produce more value per dollar.

Conclusions

How does this affect my current giving pattern?

My favorite charity is CFAR (around 75 or 80% of my donations), which improves the effectiveness of people who might start new AGI-risk organizations or AGI-development organizations. I’ve had varied impressions about whether additional donations to CFAR have had diminishing returns. They seem to have been getting just barely enough money to hire employees they consider important.

FLI is a decent example of a possibly valuable organization that CFAR played some hard-to-quantify role in starting. It bears a superficial resemblance to an optimal incubator for additional AGI-risk research groups. But FLI seems too focused on mainstream researchers to have much hope of finding the eccentric ideas that I’m most concerned about AGI-risk researchers overlooking.

Ideally I’d be donating to one or two new AGI-risk startups per year. Conditions seem almost right for this. New AGI-risk organizations are being created at a good rate, mostly getting a few large grants that are probably encouraging them to focus on relatively mainstream views [6].

CSER and FLI sort of fit this category briefly last year before getting large grants, and I donated moderate amounts to them. I presume I didn’t give enough to them for diminishing returns to be important, but their windows of unusual need were short enough that I might well have come close to that.

I’m a little surprised that the increasing interest in this area doesn’t seem to be catalyzing the formation of more low-budget groups pursuing more unusual strategies. Please let me know of any that I’m overlooking.

See my favorite charities web page (recently updated) for more thoughts about specific charities.

[1] – Diminishing returns are the main way that donating to multiple charities at one time can be reconciled with utilitarian altruism.

[2] – I don’t know whether it ought to work this way, but I expect this pattern to continue.

[3] – They intended to accomplish a much more ambitious set of goals.

[4] – Probably not the main reason.

[5] – Presumably the people who were sympathetic to communism weren’t attracted to small software projects (too busy with politics?) or rejected working on software due to the expectation that it required working for evil capitalists.

[6] – The short-term effects are probably good, increasing the diversity of approaches compared to what would be the case if MIRI were the only AGI-risk organization, and reducing the risk that AGI researchers would become polarized into tribes that disagree about whether AGI is dangerous. But a field dominated by a few funders tends to focus on fewer ideas than one with many funders.

I’d like to see more discussion of uploaded ape risks.

There is substantial disagreement over how fast an uploaded mind (em) would improve its abilities or the abilities of its progeny. I’d like to start by analyzing a scenario where it takes between one and ten years for an uploaded bonobo to achieve human-level cognitive abilities. This scenario seems plausible, although I’ve selected it more to illustrate a risk that can be mitigated than because of arguments about how likely it is.

I claim we should anticipate at least a 20% chance that a human-level bonobo-derived em would improve at least as quickly as a human that uploaded later.

Considerations that weigh in favor of this are: that bonobo minds seem to be about as general-purpose as human minds, including near-human language ability; and that the likely ease of ems interfacing with other software will enable them to learn new skills faster than biological minds will.

The most concrete evidence that weighs against this is the modest correlation between IQ and brain size. It’s somewhat plausible that it’s hard to usefully add many neurons to an existing mind, and that bonobo brain size represents an important cognitive constraint.

I’m not happy about analyzing what happens when another species develops more powerful cognitive abilities than humans, so I’d prefer to have some humans upload before the bonobos become superhuman.

A few people worry that uploading a mouse brain will generate enough understanding of intelligence to quickly produce human-level AGI. I doubt that biological intelligence is simple / intelligible enough for that to work. So I focus more on small tweaks: the kind of social pressures which caused the Flynn Effect in humans, selective breeding (in the sense of making many copies of the smartest ems, with small changes to some copies), and faster software/hardware.

The risks seem dependent on the environment in which the ems live and on the incentives that might drive their owners to improve em abilities. The most obvious motives for uploading bonobos (research into problems affecting humans, and into human uploading) create only weak incentives to improve the ems. But there are many other possibilities: military use, interesting NPCs, or financial companies looking for interesting patterns in large databases. No single one of those looks especially likely, but with many ways for things to go wrong, the risks add up.

What could cause a long window between bonobo uploading and human uploading? Ethical and legal barriers to human uploading, motivated by risks to the humans being uploaded and by concerns about human ems driving human wages down.

What could we do about this risk?

Political activism may mitigate the risks of hostility to human uploading, but if done carelessly it could create a backlash which worsens the problem.

Conceivably safety regulations could restrict em ownership/use to people with little incentive to improve the ems, but rules that looked promising would still leave me worried about risks such as irresponsible people hacking into computers that run ems and stealing copies.

A more sophisticated approach is to improve the incentives to upload humans. I expect the timing of the first human uploads to be fairly sensitive to whether we have legal rules which enable us to predict who will own em labor. But just writing clear rules isn’t enough – how can we ensure political support for them at a time when we should expect disputes over whether they’re people?

We could also find ways to delay ape uploading. But most ways of doing that would also delay human uploading, which creates tradeoffs that I’m not too happy with (partly due to my desire to upload before aging damages me too much).

If a delay between bonobo and human uploading is dangerous, then we should also ask about dangers from other uploaded species. My intuition says the risks are much lower, since it seems like there are few technical obstacles to uploading a bonobo brain shortly after uploading mice or other small vertebrates.

But I get the impression that many people associated with MIRI worry about risks of uploaded mice, and I don’t have strong evidence that I’m wiser than they are. I encourage people to develop better analyses of this issue.

Book review: Artificial Superintelligence: A Futuristic Approach, by Roman V. Yampolskiy.

This strange book has some entertainment value, and might even enlighten you a bit about the risks of AI. It presents many ideas, with occasional attempts to distinguish the important ones from the jokes.

I had hoped for an analysis that reflected a strong understanding of which software approaches were most likely to work. Yampolskiy knows something about computer science, but doesn’t strike me as someone with experience at writing useful code. His claim that “to increase their speed [AIs] will attempt to minimize the size of their source code” sounds like a misconception that wouldn’t occur to an experienced programmer. And his chapter “How to Prove You Invented Superintelligence So No One Else Can Steal It” seems like a cute game that someone might play with if he cared more about passing a theoretical computer science class than about, say, making money on the stock market, or making sure the superintelligence didn’t destroy the world.

I’m still puzzling over some of his novel suggestions for reducing AI risks. How would “convincing robots to worship humans as gods” differ from the proposed Friendly AI? Would such robots notice (and resolve in possibly undesirable ways) contradictions in their models of human nature?

Other suggestions are easy to reject, such as hoping AIs will need us for our psychokinetic abilities (abilities that Yampolskiy says are shown by peer-reviewed experiments associated with the Global Consciousness Project).

The style is also weird. Some chapters were previously published as separate papers, and weren’t adapted to fit together. It was annoying to occasionally see sentences that seemed identical to ones in a prior chapter.

The author even has strange ideas about what needs footnoting. E.g. when discussing the physical limits to intelligence, he cites (Einstein 1905).

Only read this if you’ve read other authors on this subject first.