Book review: The Cult of Smart, by Fredrik deBoer.
The Cult of Smart is a sporadically thoughtful book about education politics, sometimes rising above tribal politics, and sometimes repeating tired old tribal rants.
I said in my review of WEIRDest People that the Flynn effect seems like a natural consequence of thinking styles that became more analytical, abstract, reductionist, and numerical.
I’ll expand here on some questions which I swept under the rug, so that I could keep that review focused on the book’s most important aspects.
After reading WEIRDest People, I find that the goal of a culture-neutral IQ test looks strange (and, of course, WEIRD). At least as strange as trying to fix basketball to stop favoring tall people.
Book review: Human Compatible, by Stuart Russell.
Human Compatible provides an analysis of the long-term risks from artificial intelligence, by someone with a good deal more of the relevant prestige than any prior author on this subject.
What should I make of Russell? I skimmed his best-known book, Artificial Intelligence: A Modern Approach, and got the impression that it taught a bunch of ideas that were popular among academics, but which weren’t the focus of the people who were getting interesting AI results. So I guessed that people would be better off reading Deep Learning by Goodfellow, Bengio, and Courville instead. Human Compatible neither confirms nor dispels the impression that Russell is a bit too academic.
However, I now see that he was one of the pioneers of inverse reinforcement learning, which looks like a fairly significant advance that will likely become important someday (if it hasn’t already). So I’m inclined to treat him as a moderately good authority on AI.
The first half of the book is a somewhat historical view of AI, intended for readers who don’t know much about AI. It’s ok.
[Warning: long post, of uncertain value, with annoyingly uncertain conclusions.]
This post will focus on how hardware (CPU power) will affect AGI timelines. I will undoubtedly overlook some important considerations; this is just a model of some important effects that I understand how to analyze.
I’ll make some effort to approach this as if I were thinking about AGI timelines for the first time, focusing on strategies that I use in other domains.
I’m something like 60% confident that the most important factor in the speed of AI takeoff will be the availability of computing power.
I’ll focus here on the time to human-level AGI, but I suspect this reasoning implies getting from there to superintelligence at speeds that Bostrom would classify as slow or moderate.
Book review: The Measure of All Minds: Evaluating Natural and Artificial Intelligence, by José Hernández-Orallo.
Much of this book consists of surveys of the psychometric literature. But the best parts of the book involve original results that bring more rigor and generality to the field. Those parts approach the quality that I saw in Judea Pearl’s Causality and E.T. Jaynes’ Probability Theory, but Measure of All Minds achieves a smaller fraction of its author’s ambitions, and is sometimes poorly focused.
Hernández-Orallo has an impressive ambition: measure intelligence for any agent. The book mentions a wide variety of agents, such as normal humans, infants, deaf-blind humans, human teams, dogs, bacteria, Q-learning algorithms, etc.
The book is aimed at a narrow and fairly unusual target audience. Much of it reads like it’s directed at psychology researchers, but the more original parts of the book require thinking like a mathematician.
The survey part seems pretty comprehensive, but I wasn’t satisfied with his ability to distinguish the valuable parts from the rest (although he did a good job of ignoring the politicized rants that plague many discussions of this subject).
For nearly the first 200 pages of the book, I was mostly wondering whether the book would address anything important enough for me to want to read to the end. Then I reached an impressive part: a description of an objective IQ-like measure. Hernández-Orallo offers a test (called the C-test) which:
Yet just when I got my hopes up for a major improvement in real-world IQ testing, he points out that what the C-test measures is too narrow to be called intelligence: there’s a 960 line Perl program that exhibits human-level performance on this kind of test, without resembling a breakthrough in AI.
One of most important assumptions in The Age of Ems is that non-em AGI will take a long time to develop.
Scott Alexander at SlateStarCodex complains that Robin rejects survey data that uses validated techniques, and instead uses informal surveys whose results better fit Robin’s biases. Robin clearly explains one reason why he does that: to get the outside view of experts.
Whose approach to avoiding bias is better?
Tetlock and Silver mostly focus on attitudes that are better captured by the advice to be a fox, not a hedgehog.
Tetlock’s commandment number 3 says “Strike the right balance between inside and outside views”. Neither Tetlock nor Silver offers hope that either more rigorous sampling of experts or dogmatically choosing the outside view over the inside view helps us win a forecasting contest.
So instead of asking who is right, we should be glad to have two approaches to ponder, and should want more. (Robin only uses one approach for quantifying the time to non-em AGI, but is more fox-like when giving qualitative arguments against fast AGI progress).
What Robin downplays is that there’s no consensus of the experts on whom he relies, not even about whether progress is steady, accelerating, or decelerating.
Robin uses the median expert estimate of progress in various AI subfields. This makes sense if AI progress depends on success in many subfields. It makes less sense if success in one subfield can make the other subfields obsolete. If “subfield” means a guess about what strategy best leads to intelligence, then I expect the median subfield to be rendered obsolete by a small number of good subfields. If “subfield” refers to a subset of tasks that AI needs to solve (e.g. vision, or natural language processing), then it seems reasonable to look at the median (and I can imagine that slower subfields matter more). Robin appears to use both meanings of “subfield”, with fairly similar results for each, so it’s somewhat plausible that the median is informative.
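To illustrate why the choice of aggregation matters, here is a minimal sketch with made-up numbers (not Robin’s survey data). If AGI requires progress in every subfield, the median rate is a sensible summary; if one subfield’s success renders the others obsolete, the fastest subfield drives the timeline, and the two summaries can imply very different dates.

```python
import statistics

# Hypothetical annual progress rates (percent of the remaining gap to
# human-level performance closed per year) for several AI subfields.
# These numbers are invented for illustration only.
subfield_progress = {
    "vision": 1.0,
    "natural language": 0.8,
    "planning": 0.3,
    "robotics": 0.5,
    "knowledge representation": 0.2,
}

rates = list(subfield_progress.values())

# If AGI needs success in every subfield, the median (or even the
# minimum) rate is the informative summary.
median_rate = statistics.median(rates)

# If one subfield can make the others obsolete, the maximum rate
# dominates the timeline.
max_rate = max(rates)

print(f"median: {median_rate}%/year -> ~{100 / median_rate:.0f} years")
print(f"best:   {max_rate}%/year -> ~{100 / max_rate:.0f} years")
```

With these toy numbers the median implies a timeline twice as long as the best subfield does, which is why the two meanings of “subfield” need to be kept separate.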
Scott also complains that Robin downplays the importance of research spending while citing only a paper dealing with government funding of agricultural research. But Robin also cites another paper (Ulku 2004), which covers total R&D expenditures in 30 countries (versus 16 countries in the paper that Scott cites).
Robin claims that AI progress will slow (relative to economic growth) due to slowing hardware progress and reduced dependence on innovation. Even if I accept Robin’s claims about these factors, I have trouble believing that AI progress will slow.
I expect higher em IQ will be one factor that speeds up AI progress. Garrett Jones suggests that a 40 IQ point increase in intelligence causes a 50% increase in a country’s productivity. I presume that AI researcher productivity is more sensitive to IQ than is, say, truck driver productivity. So it seems fairly plausible to imagine that increased em IQ will cause more than a factor of two increase in the rate of AI progress. (Robin downplays the effects of IQ in contexts where a factor of two wouldn’t much affect his analysis; he appears to ignore them in this context).
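The arithmetic behind that “factor of two” can be made explicit. The sketch below encodes Jones’s estimate (40 IQ points → 50% more national productivity) as a multiplier, and then applies my own assumption (not Jones’s) that AI-researcher productivity is more IQ-sensitive, modeled here, purely for illustration, as the square of the country-level effect.

```python
def country_productivity_multiplier(iq_gain: float) -> float:
    # Jones's estimate: each 40 IQ points multiply a country's
    # productivity by 1.5.
    return 1.5 ** (iq_gain / 40.0)

def researcher_productivity_multiplier(iq_gain: float) -> float:
    # Assumption (mine, for illustration): researcher productivity is
    # more IQ-sensitive than average productivity; model it as the
    # square of the country-level multiplier.
    return country_productivity_multiplier(iq_gain) ** 2

print(country_productivity_multiplier(40))     # 1.5
print(researcher_productivity_multiplier(40))  # 2.25
```

Under that (admittedly arbitrary) sensitivity assumption, a 40-point gain in em IQ yields a 2.25x speedup in AI research, i.e. “more than a factor of two”. A different sensitivity exponent would give a different number, but any exponent above one pushes past Jones’s country-level figure.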
I expect that other advantages of ems will contribute additional speedups – maybe ems who work on AI will run relatively fast, maybe good training/testing data will be relatively cheap to create, or maybe knowledge from experimenting on ems will better guide AI research.
Robin’s arguments against an intelligence explosion are weaker than they appear. I mostly agree with those arguments, but I want to discourage people from having strong confidence in them.
The most suspicious of those arguments is that gains in software algorithmic efficiency “remain surprisingly close to the rate at which hardware costs have fallen. This suggests that algorithmic gains have been enabled by hardware gains”. He cites only (Grace 2013) in support of this. That paper doesn’t comment on whether hardware changes enable software changes. The evidence seems equally consistent with that or with the hypothesis that both are independently caused by some underlying factor. I’d say there’s less than a 50% chance that Robin is correct about this claim.
Robin lists 14 other reasons for doubting there will be an intelligence explosion: two claims about AI history (no citations), eight claims about human intelligence (one citation), and four about what causes progress in research (with the two citations mentioned earlier). Most of those 14 claims are probably true, but it’s tricky to evaluate their relevance.
I’d say there’s maybe a 15% chance that Robin is basically right about the timing of non-em AI given his assumptions about ems. His book is still pretty valuable if an em-dominated world lasts for even one subjective decade before something stranger happens. And “something stranger happens” doesn’t necessarily mean his analysis becomes obsolete.
 – I can’t find any SlateStarCodex complaint about Bostrom doing something similar in Superintelligence: Bostrom’s survey of experts shows an expected time of decades for human-level AI to become superintelligent, yet Bostrom focuses on a much faster takeoff scenario, disagreeing with the experts without identifying reasons for thinking his approach reduces biases.
 – One example is that genetic algorithms are looking fairly obsolete compared to neural nets, now that they’re being compared on bigger problems than when genetic algorithms were trendy.
Robin wants to avoid biases from recent AI fads by looking at subfields as they were defined 20 years ago. Some recent changes in AI are fads, but some are increased wisdom. I expect many subfields to be dead ends, given how immature AI was 20 years ago (and may still be today).
 – Scott quotes from one of three places that Robin mentions this subject (an example of redundancy that is quite rare in the book), and that’s the one place out of three where Robin neglects to cite (Ulku 2004). Age of Em is the kind of book where it’s easy to overlook something important like that if you don’t read it more carefully than you’d read a normal book.
I tried comparing (Ulku 2004) to the OECD paper that Scott cites, and failed to figure out whether they disagree. The OECD paper is probably consistent with Robin’s “less than proportionate increases” claim that Scott quotes. But Scott’s doubts are partly about Robin’s bolder prediction that AI progress will slow down, and academic papers don’t help much in evaluating that prediction.
If you’re tempted to evaluate how well the Ulku paper supports Robin’s views, beware that this quote is one of its easier-to-understand parts:
In addition, while our analysis lends support for endogenous growth theories in that it confirms a significant relationship between R&D stock and innovation, and between innovation and per capita GDP, it lacks the evidence for constant returns to innovation in terms of R&D stock. This implies that R&D models are not able to explain sustainable economic growth, i.e. they are not fully endogenous.
Book review: The Age of Em: Work, Love and Life when Robots Rule the Earth, by Robin Hanson.
This book analyzes a possible future era when software emulations of humans (ems) dominate the world economy. It is too conservative to tackle longer-term prospects for eras when more unusual intelligent beings may dominate the world.
Hanson repeatedly tackles questions that scare away mainstream academics, and gives relatively ordinary answers (guided as much as possible by relatively standard, but often obscure, parts of the academic literature).
Hanson’s scenario relies on a few moderately controversial assumptions. The assumptions which I find most uncertain are related to human-level intelligence being hard to understand (because it requires complex systems), enough so that ems will experience many subjective centuries before artificial intelligence is built from scratch. For similar reasons, ems are opaque enough that it will be quite a while before they can be re-engineered to be dramatically different.
Hanson is willing to allow that ems can be tweaked somewhat quickly to produce moderate enhancements (at most doubling IQ) before reaching diminishing returns. He gives somewhat plausible reasons for believing this will only have small effects on his analysis. But few skeptics will be convinced.
Some will focus on the potential trillions of dollars’ worth of benefits that higher IQs might produce, but that wealth would not much change Hanson’s analysis.
Others will prefer an inside view analysis which focuses on the chance that higher IQs will better enable us to handle risks of superintelligent software. Hanson’s analysis implies we should treat that as an unlikely scenario, but doesn’t say what we should do about modest probabilities of huge risks.
Another way that Hanson’s assumptions could be partly wrong is if tweaking the intelligence of emulated Bonobos produces super-human entities. That seems to only require small changes to his assumptions about how tweakable human-like brains are. But such a scenario is likely harder to analyze than Hanson’s scenario, and it probably makes more sense to understand Hanson’s scenario first.
Wages in this scenario are somewhat close to subsistence levels. Ems have some ability to restrain wage competition, but less than they want. Does that mean wages are 50% above subsistence levels, or 1%? Hanson hints at the former. The difference feels important to me. I’m concerned that sound-bite versions of the book will obscure the difference.
Hanson claims that “wealth per em will fall greatly”. It would be possible to construct a measure by which ems are less wealthy than humans are today. But I expect it will be at least as plausible to use a measure under which ems are rich compared to humans of today, but have high living expenses. I don’t believe there’s any objective unit of value that will falsify one of those perspectives.
The style is more like a reference book than a story or an attempt to persuade us of one big conclusion. Most chapters (except for a few at the start and end) can be read in any order. If the section on physics causes you to doubt whether the book matters, skip to chapter 12 (labor), and return to the physics section later.
The style is very concise. Hanson rarely repeats a point, so understanding him requires more careful attention than with most authors.
It’s odd that the future of democracy gets less than twice as much space as the future of swearing. I’d have preferred that Hanson cut out a few of his less important predictions, to make room for occasional restatements of important ideas.
Many little-known results that are mentioned in the book are relevant to the present, such as: how the pitch of our voice affects how people perceive us, how vacations affect productivity, and how bacteria can affect fluid viscosity.
I was often tempted to say that Hanson sounds overconfident, but he is clearly better than most authors at admitting appropriate degrees of uncertainty. If he devoted much more space to caveats, I’d probably get annoyed at the repetition. So it’s hard to say whether he could have done any better.
Even if we should expect a much less than 50% chance of Hanson’s scenario becoming real, it seems quite valuable to think about how comfortable we should be with it and how we could improve on it.
 – The difference matters only in one paragraph, where Hanson discusses whether ems deserve charity more than do humans living today. Hanson sounds like he’s claiming ems deserve our charity because they’re poor. Most ems in this scenario are comfortable enough for this to seem wrong.
Hanson might also be hinting that our charity would be effective at increasing the number of happy ems, and that basic utilitarianism says that’s preferable to what we can do by donating to today’s poor. That argument deserves more respect and more detailed analysis.
Book review: The Human Advantage: A New Understanding of How Our Brain Became Remarkable, by Suzana Herculano-Houzel.
I used to be uneasy about claims that the human brain was special because it is large for our body size: relative size just didn’t seem like it could be the best measure of whatever enabled intelligence.
At last, Herculano-Houzel has invented a replacement for that measure. Her impressive technique for measuring the number of neurons in a brain has revolutionized this area of science.
We can now see an important connection between the number of cortical neurons and cognitive ability. I’m glad that the book reports on research that compares the cognitive abilities of enough species to enable moderately objective tests of the relevant hypotheses (although the research still has much room for improvement).
We can also see that the primate brain is special, in a way that enables large primates to be smarter than similarly sized nonprimates. And that humans are not very special for a primate of our size, although energy constraints make it tricky for primates to reach our size.
I was able to read the book quite quickly. Much of it is arranged in an occasionally suspenseful story about how the research was done. It doesn’t have lots of information, but the information it does have seems very new (except for the last two chapters, where Herculano-Houzel gets farther from her area of expertise).
The paper reporting that result disagrees somewhat with Herculano-Houzel:
Our results underscore that correlations between cognitive performance and absolute neocortical neuron numbers across animal orders or classes are of limited value, and attempts to quantify the mental capacity of a dolphin for cross-species comparisons are bound to be controversial.
But I don’t see much of an argument against the correlation between intelligence and cortical neuron numbers. The lack of good evidence about long-finned pilot whale intelligence mainly implies we ought to be uncertain.
Book review: Hive Mind: How your nation’s IQ matters so much more than your own, by Garett Jones.
Hive Mind is a solid, easy-to-read discussion of why high IQ nations are more successful than low IQ nations.
There’s a pretty clear correlation between national IQ and important results such as income. It’s harder to tell how much of the correlation is caused by IQ differences. The Flynn Effect hints that high IQ could instead be a symptom of increased wealth.
The best evidence for IQ causing wealth (more than being caused by wealth) is that Hong Kong and Taiwan had high IQs back in the 1960s, before becoming rich.
Another piece of similar evidence (which Hive Mind doesn’t point to) is that Saudi Arabia is the most conspicuous case of a country that became wealthy via luck. Its IQ is lower than countries of comparable wealth, and lower than neighbors of similar culture/genes.
Much of the book is devoted to speculations about how IQ could affect a nation’s success.
High IQ is associated with more patience, probably due to better ability to imagine the future:
Imagine two societies: one in which the future feels like a dim shadow, the other in which the future seems as real as now. Which society will have more restaurants that care about repeat customers? Which society will have more politicians who turn down bribes because they worry about eventually getting caught?
Hive Mind describes many possible causes of the Flynn Effect, without expressing much of a preference between them. Flynn’s explanation still seems strongest to me. The most plausible alternative that Hive Mind mentions is anxiety and stress from poverty-related problems distracting people during tests (and possibly also from developing abstract cognitive skills). But anxiety / stress explanations seem less likely to produce the Hong Kong/Taiwan/Saudi Arabia results.
Hive Mind talks about the importance of raising national IQ, especially in less-developed countries. That goal would be feasible if differences in IQ were mainly caused by stress or nutrition. Flynn’s cultural explanation points to causes that are harder for governments or charities to influence (how do you legislate an increased desire to think abstractly?).
What about the genetic differences that contribute to IQ differences? The technology needed to fix that contributing factor to low IQs is not ready today, but looks near enough that we should pay attention. Hive Mind implies [but avoids saying] that potentially large harm from leaving IQ unchanged could outweigh the risks of genetic engineering. Fears about genetic engineering of IQ often involve fears of competition, but Hive Mind shows that higher IQ means more cooperation. More cooperation suggests less war, less risk of dangerous nanotech arms races, etc.
It shouldn’t sound paradoxical to say that aggregate IQ matters more than individual IQ. It should start to seem ordinary if more people follow the example of Hive Mind and focus more attention on group success than on individual success as they relate to IQ.
Book review: The Intelligence Paradox: Why the Intelligent Choice Isn’t Always the Smart One, by Satoshi Kanazawa.
This book is entertaining and occasionally thought-provoking, but not very well thought out.
The main idea is that intelligence (what IQ tests measure) is an adaptation for evolutionarily novel situations, and shouldn’t be positively correlated with cognitive abilities that are specialized for evolutionarily familiar problems. He defines “smart” so that it’s very different from intelligence. His notion of smart includes a good deal of common sense that is unconnected with IQ.
He only provides one example of an evolutionarily familiar skill which I assumed would be correlated with IQ but which isn’t: finding your way in situations such as woods where there’s some risk of getting lost.
He does make and test many odd predictions about high IQ people being more likely to engage in evolutionarily novel behavior, such as high IQ people going to bed later than low IQ people. But I’m a bit concerned at the large number of factors he controls for before showing associations (e.g. 19 factors for alcohol use). How hard would it be to try many combinations and only report results when he got conclusions that fit his prediction? On the other hand, he can’t be trying too hard to reject all evidence that conflicts with his predictions, since he occasionally reports evidence that conflicts with his predictions (e.g. tobacco use).
He reports that fertility is heritable, and finds that puzzling. He gives a kin selection based argument saying that someone with many siblings ought to put more effort into their siblings’ reproductive success and less into personally reproducing. But I see no puzzle – I expect people to have varying intuitions about whether the current abundance of food will last, and to pursue different strategies, some of which will be better if food remains abundant, and others better if overpopulation produces a famine.
He’s eager to sound controversial, and his chapter titles will certainly offend some people. Sometimes those are backed up by genuinely unpopular claims, sometimes the substance is less interesting. E.g. the chapter title “Why Homosexuals Are More Intelligent than Heterosexuals” says there’s probably no connection between intelligence and homosexual desires, but there’s a connection between intelligence and how willing people are to act on those desires (yawn).
Here is some evidence against his main hypothesis.