Science and Technology

I’ve substantially reduced my anxiety over the past 5-10 years.

Many of the important steps along that path look easy in hindsight, yet the overall goal looked sufficiently hard prospectively that I usually assumed it wasn’t possible. I only ended up making progress by focusing on related goals.

In this post, I’ll mainly focus on problems related to general social anxiety among introverted nerds. It will probably be much less useful to others.

In particular, I expect it doesn’t apply very well to ADHD-related problems, and I have little idea how well it applies to the results of specific PTSD-type trauma.

It should be slightly useful for anxiety over politicians who are making America grate again. But you’re probably fooling yourself if you blame many of your problems on distant strangers.



Book review: Notes on a New Philosophy of Empirical Science (Draft Version), by Daniel Burfoot.

Standard views of science focus on comparing theories by finding examples where they make differing predictions, and rejecting the theory that made worse predictions.

Burfoot describes a better view of science, called the Compression Rate Method (CRM), which replaces the “make prediction” step with “make a compression program”, and compares theories by how much they compress a standard (large) database.

These views of science produce mostly equivalent results(!), but CRM provides a better perspective.
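To make that concrete, here is a minimal toy sketch (my own illustration, not Burfoot's tooling; the dataset, models, and model-description sizes are made up) of comparing two theories by total codelength over a shared dataset:

```python
import math
import random

# Toy sketch of the Compression Rate Method: score each "theory" by the
# bits needed to encode a shared dataset, counting both the bits to
# describe the model and the bits to encode the data under that model.
# (Hypothetical models and model sizes; continuous-data discretization
# is ignored for simplicity.)

random.seed(0)
data = [random.gauss(5.0, 1.0) for _ in range(1000)]  # stand-in "database"

def codelength_bits(data, pdf, model_bits):
    # Source-coding theorem: an optimal code for density p spends about
    # -log2 p(x) bits per outcome x.
    return model_bits + sum(-math.log2(pdf(x)) for x in data)

def gauss_pdf(mu, sigma):
    return lambda x: math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi))

# Theory A: Gaussian with well-chosen parameters (longer model description).
# Theory B: uniform on [0, 10] (shorter description, but a worse fit).
bits_a = codelength_bits(data, gauss_pdf(5.0, 1.0), model_bits=200)
bits_b = codelength_bits(data, lambda x: 0.1, model_bits=50)

print(f"theory A: {bits_a:.0f} bits")  # ~2250: wins despite the bigger model
print(f"theory B: {bits_b:.0f} bits")  # ~3370: the data cost dominates
```

The point of the exercise: a theory that captures real structure in the database pays for its own description length many times over in data-encoding savings, which is exactly the comparison CRM substitutes for prediction-checking.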

Machine Learning (ML) is potentially science, and this book focuses on how ML will be improved by viewing its problems through the lens of CRM. Burfoot complains about the toolkit mentality of traditional ML research, arguing that the CRM approach will turn ML into an empirical science.

This should generate a Kuhnian paradigm shift in ML, with more objective measures of the research quality than any branch of science has achieved so far.

Burfoot focuses on compression as encoding empirical knowledge of specific databases / domains. He rejects the standard goal of a general-purpose compression tool. Instead, he proposes creating compression algorithms that are specialized for each type of database, to reflect what we know about topics (such as images of cars) that are important to us.

MIRI has produced a potentially important result (called Garrabrant induction) for dealing with uncertainty about logical facts.

The paper is somewhat hard for non-mathematicians to read. This video provides an easier overview, and more context.

It uses prediction markets! “It’s a financial solution to the computer science problem of metamathematics”.

It shows that we can evade disturbing conclusions such as Gödel incompleteness and the liar paradox, by expecting to be merely very confident about logically deducible facts (as opposed to mathematically certain of them). That’s similar to the difference between treating beliefs about empirical facts as probabilities rather than as boolean values.
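As a toy illustration of that shift (my example, not the paper's construction, which uses a market of traders): before running a computation, a bounded reasoner can assign a sensible prior to a logically determined fact, then update toward certainty once the computation is done:

```python
import math
import random

# Toy example: "is 2**127 - 1 prime?" is logically determined, but before
# testing it, a bounded reasoner can assign it a prior from the prime
# number theorem, just as one assigns probabilities to empirical facts.
# (My illustration; Garrabrant induction itself is a different construction.)

n = 2**127 - 1
prior = 2 / math.log(n)  # prime density near n, doubled because n is odd
print(f"prior P(n is prime) ~ {prior:.3f}")

def is_probable_prime(n, rounds=20):
    """Standard Miller-Rabin probabilistic primality test."""
    if n < 4:
        return n in (2, 3)
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses compositeness
    return True

# Running the test collapses the belief to (near) certainty.
posterior = 1.0 if is_probable_prime(n) else 0.0
print(f"posterior P(n is prime) = {posterior}")
```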

I’m somewhat skeptical that it will have an important effect on AI safety, but my intuition says it will produce enough benefits somewhere that it will become at least as famous as Pearl’s work on causality.

One of the weakest claims in The Age of Em was that AI progress has not been accelerating.

J Storrs Hall (aka Josh) has a hypothesis that AI progress accelerated about a decade ago due to a shift from academia to industry. (I’m puzzled why the title describes it as a coming change, when it appears to have already happened).

I find it quite likely that something important happened then, including an acceleration in the rate at which AI affects people.

I find it less clear whether that indicates a change in how fast AI is approaching human intelligence levels.

Josh points to airplanes as an example of a phase change being important.

I tried to compare AI progress to other industries which might have experienced a similar phase change, driven by hardware progress. But I was deterred by the difficulty of estimating progress in industries when they were driven by academia.

One industry I tried to compare to was photovoltaics, which seemed to be hyped for a long time before becoming commercially important (10-20 years ago?). But I see only weak signs of a phase change around 2007, from looking at Swanson’s Law. It’s unclear whether photovoltaic progress was ever dominated by academia enough for a phase change to be important.
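For reference, Swanson's Law is an empirical learning curve: photovoltaic module prices fall roughly 20% for each doubling of cumulative shipped volume. A back-of-envelope sketch of what that implies (standard learning-curve formula; the specific numbers are illustrative):

```python
import math

# Learning-curve form of Swanson's law: price falls ~20% per doubling of
# cumulative shipped volume, i.e. price ~ volume ** log2(0.8).
# (Illustrative numbers; 20% is the commonly quoted estimate.)

LEARNING_RATE = 0.20
EXPONENT = math.log2(1 - LEARNING_RATE)  # ~ -0.322

def predicted_price(initial_price, volume_ratio):
    """Price after cumulative volume grows by volume_ratio."""
    return initial_price * volume_ratio ** EXPONENT

# A 1000x growth in cumulative volume is ~10 doublings, so price falls ~9x:
print(predicted_price(100.0, 1000))  # ~10.8
```

A phase change should show up as a kink in that curve; the smoothness of the actual price data is why I see only weak signs of one around 2007.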

Hypertext is a domain where a clear phase change happened in the early 1990s. It experienced a nearly foom-like rate of adoption when internet availability altered the problem, from one that required a big company to finance the hardware and marketing, to one that could be solved by simply giving away a small amount of code. But this change in adoption was not accompanied by a change in the power of hypertext software (beyond changes due to network effects). So this seems like weak evidence against accelerating progress toward human-level AI.

What other industries should I look at?

I started writing morning pages a few months ago. That means writing three pages, on paper, before doing anything else [1].

I’ve only been doing this on weekends and holidays, because on weekdays I feel a need to do some stock market work close to when the market opens.

It typically takes me one hour to write three pages. At first, it felt like I needed 75 minutes but wanted to finish faster. After a few weeks, it felt like I could finish in about 50 minutes when I was in a hurry, but often preferred to take more than an hour.

That suggests I’m doing much less stream-of-consciousness writing than is typical for morning pages. It’s unclear whether that matters.

It feels like devoting an hour per day to morning pages ought to be costly. Yet I never observed it crowding out anything I valued (except maybe once or twice when I woke up before getting an optimal amount of sleep in order to get to a hike on time – that was due to scheduling problems, not due to morning pages reducing the available time per day).

Why do people knowingly follow bad investment strategies?

I won’t ask (in this post) about why people hold foolish beliefs about investment strategies. I’ll focus on people who intend to follow a decent strategy, and fail. I’ll illustrate this with a stereotype from a behavioral economist (Procrastination in Preparing for Retirement):[1]

For instance, one of the authors has kept an average of over $20,000 in his checking account over the last 10 years, despite earning an average of less than 1% interest on this account and having easy access to very liquid alternative investments earning much more.

A more mundane example is a person who holds most of their wealth in stock of a single company, for reasons of historical accident (they acquired it via employee stock options or inheritance), but admits to preferring a more diversified portfolio.

An example from my life is that, until this year, I often borrowed money from Schwab to buy stock, when I could have borrowed at lower rates in my Interactive Brokers account to do the same thing. (Partly due to habits that I developed while carelessly unaware of the difference in rates; partly due to a number of trivial inconveniences).

Behavioral economists are somewhat correct to attribute such mistakes to questionable time discounting. But I see more patterns than such a model can explain (e.g. people procrastinate more over some decisions (whether to make a “boring” trade) than others (whether to read news about investments)).[2]

Instead, I use CFAR-style models that focus on conflicting motives of different agents within our minds.


Book review: Are We Smart Enough to Know How Smart Animals Are?, by Frans de Waal.

This book is primarily about discrediting false claims of human uniqueness, and showing how easy it is to screw up evaluations of a species’ cognitive abilities. It is best summarized by the cognitive ripple rule:

Every cognitive capacity that we discover is going to be older and more widespread than initially thought.

De Waal provides many anecdotes of carefully designed experiments detecting abilities that previously appeared to be absent. E.g. Asian elephants failed mirror tests with small, distant mirrors. When experimenters dared to put large mirrors close enough for the elephants to touch, some of them passed the test.

Likewise, initial observations of behaviorist humans suggested they were rigidly fixated on explaining all behavior via operant conditioning. Yet one experimenter managed to trick a behaviorist into demonstrating more creativity, by harnessing the one motive that behaviorists prefer over their habit of advocating operant conditioning: their desire to accuse people of recklessly inferring complex cognition.

De Waal seems moderately biased toward overstating cognitive abilities of most species (with humans being one clear exception to that pattern).

At one point he gave me the impression that he was claiming elephants could predict where a thunderstorm would hit days in advance. I checked the reference, and what the elephants actually did was predict the arrival of the wet season, and respond with changes such as longer steps (but probably not with indications that they knew where thunderstorms would hit). After rereading de Waal’s wording, I decided it was ambiguous. But his claim that elephants “hear thunder and rainfall hundreds of miles away” exaggerates the original paper’s “detected … at distances greater than 100 km … perhaps as much as 300 km”.

But in the context of language, de Waal switches to downplaying reports of impressive abilities. I wonder how much of that is due to his desire to downplay claims that human minds are better, and how much of that is because his research isn’t well suited to studying language.

I agree with the book’s general claims. The book provides evidence that human brains embody only small, somewhat specialized improvements on the cognitive abilities of other species. But I found the book less convincing on that subject than some other books I’ve read recently. I suspect that’s mainly due to de Waal’s focus on anecdotes that emphasize what’s special about each species or individual. By contrast, The Human Advantage rigorously quantifies important ways in which human brains are just a bigger primate brain (but primate brains are special!), and The Secret of Our Success (which doesn’t use particularly rigorous methods) provides a better perspective by describing a model in which ape minds evolve into human minds via ordinary, gradual adaptations to mildly new environments.

In sum, this book is good at explaining the problems associated with research into animal cognition. It is merely ok at providing insights about how smart various species are.

Book review: Made-Up Minds: A Constructivist Approach to Artificial Intelligence, by Gary L. Drescher.

It’s odd to call a book boring when it uses the pun “ontology recapitulates phylogeny”[1] to describe a surprising feature of its model. About 80% of the book is dull enough that I barely forced myself to read it, yet the occasional good idea persuaded me not to give up.

Drescher gives a detailed model of how Piaget-style learning in infants could enable them to learn complex concepts starting with minimal innate knowledge.

One of the most important assumptions in The Age of Em is that non-em AGI will take a long time to develop.

1.

Scott Alexander at SlateStarCodex complains that Robin rejects survey data that uses validated techniques, and instead uses informal surveys whose results better fit Robin’s biases [1]. Robin clearly explains one reason why he does that: to get the outside view of experts.

Whose approach to avoiding bias is better?

  • Minimizing sampling error and carefully documenting one’s sampling technique are two of the most widely used criteria to distinguish science from wishful thinking.
  • Errors due to ignoring the outside view have been documented to be large, yet forecasters are reluctant to use the outside view.

So I rechecked advice from forecasting experts such as Philip Tetlock and Nate Silver, and the clear answer I got was … that was the wrong question.

Tetlock and Silver mostly focus on attitudes that are better captured by the advice to be a fox, not a hedgehog.

The strongest predictor of rising into the ranks of superforecasters is perpetual beta, the degree to which one is committed to belief updating and self-improvement.

Tetlock’s commandment number 3 says “Strike the right balance between inside and outside views”. Neither Tetlock nor Silver offers hope that either more rigorous sampling of experts or dogmatically choosing the outside view over the inside view will help us win a forecasting contest.

So instead of asking who is right, we should be glad to have two approaches to ponder, and should want more. (Robin only uses one approach for quantifying the time to non-em AGI, but is more fox-like when giving qualitative arguments against fast AGI progress).

2.

What Robin downplays is that there’s no consensus of the experts on whom he relies, not even about whether progress is steady, accelerating, or decelerating.

Robin uses the median expert estimate of progress in various AI subfields. This makes sense if AI progress depends on success in many subfields. It makes less sense if success in one subfield can make the other subfields obsolete. If “subfield” means a guess about what strategy best leads to intelligence, then I expect the median subfield to be rendered obsolete by a small number of good subfields [2]. If “subfield” refers to a subset of tasks that AI needs to solve (e.g. vision, or natural language processing), then it seems reasonable to look at the median (and I can imagine that slower subfields matter more). Robin appears to use both meanings of “subfield”, with fairly similar results for each, so it’s somewhat plausible that the median is informative.
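A toy contrast (my illustration, with made-up numbers) of the aggregation rules implied by those two readings of “subfield”:

```python
import statistics

# Hypothetical fractions of the way to human-level ability in five subfields.
subfield_progress = [0.05, 0.10, 0.15, 0.60, 0.90]

# Reading 1: AI needs success in every subfield (task), so the middle of the
# distribution is informative (slower subfields may matter even more).
print(statistics.median(subfield_progress))  # 0.15

# Reading 2: one winning strategy obsoletes the rest, so only the fastest
# subfield matters, and the median understates progress.
print(max(subfield_progress))  # 0.90
```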

3.

Scott also complains that Robin downplays the importance of research spending while citing only a paper dealing with government funding of agricultural research. But Robin also cites another paper (Ulku 2004), which covers total R&D expenditures in 30 countries (versus 16 countries in the paper that Scott cites) [3].

4.

Robin claims that AI progress will slow (relative to economic growth) due to slowing hardware progress and reduced dependence on innovation. Even if I accept Robin’s claims about these factors, I have trouble believing that AI progress will slow.

I expect higher em IQ will be one factor that speeds up AI progress. Garrett Jones suggests that a 40 IQ point increase in intelligence causes a 50% increase in a country’s productivity. I presume that AI researcher productivity is more sensitive to IQ than is, say, truck driver productivity. So it seems fairly plausible to imagine that increased em IQ will cause more than a factor of two increase in the rate of AI progress. (Robin downplays the effects of IQ in contexts where a factor of two wouldn’t much affect his analysis; he appears to ignore them in this context).
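A rough way to see the arithmetic (my extrapolation, not a model from Hanson or Jones): if the 50%-per-40-points effect compounds, and AI-researcher productivity is more IQ-sensitive than the average job, a factor-of-two speedup comes easily:

```python
# My back-of-envelope extrapolation of Jones's estimate: +40 IQ points ->
# ~1.5x national productivity. Assume the effect compounds, and let
# `sensitivity` scale how much more IQ-sensitive AI research is than the
# average job (both assumptions are mine, for illustration only).

def productivity_multiplier(iq_gain, sensitivity=1.0):
    return 1.5 ** (sensitivity * iq_gain / 40)

print(productivity_multiplier(40))       # 1.5x: the baseline claim
print(productivity_multiplier(80))       # 2.25x: compounding alone
print(productivity_multiplier(40, 2.0))  # 2.25x: if research is 2x as sensitive
```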

I expect that other advantages of ems will contribute additional speedups – maybe ems who work on AI will run relatively fast, maybe good training/testing data will be relatively cheap to create, or maybe knowledge from experimenting on ems will better guide AI research.

5.

Robin’s arguments against an intelligence explosion are weaker than they appear. I mostly agree with those arguments, but I want to discourage people from having strong confidence in them.

The most suspicious of those arguments is that gains in software algorithmic efficiency “remain surprisingly close to the rate at which hardware costs have fallen. This suggests that algorithmic gains have been enabled by hardware gains”. He cites only (Grace 2013) in support of this. That paper doesn’t comment on whether hardware changes enable software changes. The evidence seems equally consistent with that or with the hypothesis that both are independently caused by some underlying factor. I’d say there’s less than a 50% chance that Robin is correct about this claim.

Robin lists 14 other reasons for doubting there will be an intelligence explosion: two claims about AI history (no citations), eight claims about human intelligence (one citation), and four about what causes progress in research (with the two citations mentioned earlier). Most of those 14 claims are probably true, but it’s tricky to evaluate their relevance.

Conclusion

I’d say there’s maybe a 15% chance that Robin is basically right about the timing of non-em AI given his assumptions about ems. His book is still pretty valuable if an em-dominated world lasts for even one subjective decade before something stranger happens. And “something stranger happens” doesn’t necessarily mean his analysis becomes obsolete.

Footnotes

[1] – I can’t find any SlateStarCodex complaint about Bostrom doing something similar in Superintelligence: Bostrom’s survey of experts shows an expected time of decades for human-level AI to become superintelligent, yet Bostrom wants to focus on a much faster takeoff scenario, and disagrees with the experts without identifying reasons for thinking his approach reduces biases.

[2] – One example is that genetic algorithms are looking fairly obsolete compared to neural nets, now that they’re being compared on bigger problems than when genetic algorithms were trendy.

Robin wants to avoid biases from recent AI fads by looking at subfields as they were defined 20 years ago. Some recent changes in AI are fads, but some are increased wisdom. I expect many subfields to be dead ends, given how immature AI was 20 years ago (and may still be today).

[3] – Scott quotes from one of three places that Robin mentions this subject (an example of redundancy that is quite rare in the book), and that’s the one place out of three where Robin neglects to cite (Ulku 2004). Age of Em is the kind of book where it’s easy to overlook something important like that if you don’t read it more carefully than you’d read a normal book.

I tried comparing (Ulku 2004) to the OECD paper that Scott cites, and failed to figure out whether they disagree. The OECD paper is probably consistent with Robin’s “less than proportionate increases” claim that Scott quotes. But Scott’s doubts are partly about Robin’s bolder prediction that AI progress will slow down, and academic papers don’t help much in evaluating that prediction.

If you’re tempted to evaluate how well the Ulku paper supports Robin’s views, beware that this quote is one of its easier-to-understand parts:

In addition, while our analysis lends support for endogenous growth theories in that it confirms a significant relationship between R&D stock and innovation, and between innovation and per capita GDP, it lacks the evidence for constant returns to innovation in terms of R&D stock. This implies that R&D models are not able to explain sustainable economic growth, i.e. they are not fully endogenous.

Book review: The Age of Em: Work, Love and Life when Robots Rule the Earth, by Robin Hanson.

This book analyzes a possible future era when software emulations of humans (ems) dominate the world economy. It is too conservative to tackle longer-term prospects for eras when more unusual intelligent beings may dominate the world.

Hanson repeatedly tackles questions that scare away mainstream academics, and gives relatively ordinary answers (guided as much as possible by relatively standard, but often obscure, parts of the academic literature).

Assumptions

Hanson’s scenario relies on a few moderately controversial assumptions. The assumptions which I find most uncertain are related to human-level intelligence being hard to understand (because it requires complex systems), enough so that ems will experience many subjective centuries before artificial intelligence is built from scratch. For similar reasons, ems are opaque enough that it will be quite a while before they can be re-engineered to be dramatically different.

Hanson is willing to allow that ems can be tweaked somewhat quickly to produce moderate enhancements (at most doubling IQ) before reaching diminishing returns. He gives somewhat plausible reasons for believing this will only have small effects on his analysis. But few skeptics will be convinced.

Some will focus on potential trillions of dollars worth of benefits that higher IQs might produce, but that wealth would not much change Hanson’s analysis.

Others will prefer an inside view analysis which focuses on the chance that higher IQs will better enable us to handle risks of superintelligent software. Hanson’s analysis implies we should treat that as an unlikely scenario, but doesn’t say what we should do about modest probabilities of huge risks.

Another way that Hanson’s assumptions could be partly wrong is if tweaking the intelligence of emulated Bonobos produces super-human entities. That seems to only require small changes to his assumptions about how tweakable human-like brains are. But such a scenario is likely harder to analyze than Hanson’s scenario, and it probably makes more sense to understand Hanson’s scenario first.

Wealth

Wages in this scenario are somewhat close to subsistence levels. Ems have some ability to restrain wage competition, but less than they want. Does that mean wages are 50% above subsistence levels, or 1%? Hanson hints at the former. The difference feels important to me. I’m concerned that sound-bite versions of the book will obscure the difference.

Hanson claims that “wealth per em will fall greatly”. It would be possible to construct a measure by which ems are less wealthy than humans are today. But I expect it will be at least as plausible to use a measure under which ems are rich compared to humans of today, but have high living expenses. I don’t believe there’s any objective unit of value that will falsify one of those perspectives [1].

Style / Organization

The style is more like a reference book than a story or an attempt to persuade us of one big conclusion. Most chapters (except for a few at the start and end) can be read in any order. If the section on physics causes you to doubt whether the book matters, skip to chapter 12 (labor), and return to the physics section later.

The style is very concise. Hanson rarely repeats a point, so understanding him requires more careful attention than with most authors.

It’s odd that the future of democracy gets less than twice as much space as the future of swearing. I’d have preferred that Hanson cut out a few of his less important predictions, to make room for occasional restatements of important ideas.

Many little-known results that are mentioned in the book are relevant to the present, such as: how the pitch of our voice affects how people perceive us, how vacations affect productivity, and how bacteria can affect fluid viscosity.

I was often tempted to say that Hanson sounds overconfident, but he is clearly better than most authors at admitting appropriate degrees of uncertainty. If he devoted much more space to caveats, I’d probably get annoyed at the repetition. So it’s hard to say whether he could have done any better.

Conclusion

Even if we should expect a much less than 50% chance of Hanson’s scenario becoming real, it seems quite valuable to think about how comfortable we should be with it and how we could improve on it.

Footnote

[1] – The difference matters only in one paragraph, where Hanson discusses whether ems deserve charity more than do humans living today. Hanson sounds like he’s claiming ems deserve our charity because they’re poor. Most ems in this scenario are comfortable enough for this to seem wrong.

Hanson might also be hinting that our charity would be effective at increasing the number of happy ems, and that basic utilitarianism says that’s preferable to what we can do by donating to today’s poor. That argument deserves more respect and more detailed analysis.