kelvinism


No, this isn’t about cutlery.

I’m proposing to fork science in the sense that Bitcoin was forked, into an adversarial science and a crowdsourced science.

As with Bitcoin, I have no expectation that the two branches will be equal.

These ideas could apply to most fields of science, but some fields need change more than others. Controversy over p-values and p-hacking is a sign that a field needs change; fields that don’t care much about p-values, such as physics and computer science, need less of it. I’ll focus mainly on medicine and psychology, and leave aside the harder-to-improve social sciences.

What do we mean by the word Science?

The term “science” has a range of meanings.

One extreme focuses on “perform experiments in order to test hypotheses”, as in The Scientist In The Crib. I’ll call this the personal knowledge version of science.

A different extreme includes formal institutions such as peer review, RCTs, etc. I’ll call this the authoritative knowledge version of science.

Both of these meanings of the word science are floating around, with little effort to distinguish them [1]. I suspect that promotes confusion about what standards to apply to scientific claims. And I’m concerned that people will use the high status of authoritative science to encourage us to ignore knowledge that doesn’t fit within its paradigm.


I got interested in trying ashwagandha due to The End of Alzheimer’s. That book also caused me to wonder whether I should optimize my thyroid hormone levels. And one of the many features of ashwagandha is that it improves thyroid levels, at least in hypothyroid people – I found conflicting reports about what it does to hyperthyroid people.

I had plenty of evidence that my thyroid levels were lower than optimal, e.g. TSH levels measured at 2.58 in 2012, 4.69 in 2013, and 4.09 this fall [1] (higher TSH generally indicates an underactive thyroid). And since starting alternate-day calorie restriction, I had seen increasing hypothyroid symptoms: on calorie restriction days my feet felt much colder around bedtime, my pulse probably slowed a bit, my body burned fewer calories, and I got vague impressions of having less energy. Presumably my body was lowering my thyroid levels to keep my weight from dropping.

I researched the standard treatments for hypothyroidism, but was discouraged by the extent of disagreement among doctors about the wisdom of treating hypothyroidism as mild as mine. Mainstream medical opinion seems to say the risks slightly outweigh the rewards, while a sizable minority of doctors, relying on more subjective evidence, say the rewards are large and say little about the risks. Plus, the evidence that optimal thyroid levels protect against Alzheimer’s seems to come mainly from correlations that are seen only in women.

Also, the standard treatments for hypothyroidism require a prescription (probably for somewhat good reasons), which may have deterred me by more than a rational amount.

So I decided to postpone any attempt to optimize my thyroid hormones. Since I planned to try ashwagandha and DHEA for other reasons anyway, I hoped to get some evidence from the small increases in thyroid hormones that I expected from those two supplements.

I decided to try ashwagandha first, due mainly to the large number of problems it may improve – anxiety, inflammation, stress, telomeres, cholesterol, etc.

[Warning: long post, of uncertain value, with annoyingly uncertain conclusions.]

This post will focus on how hardware (CPU power) will affect AGI timelines. I will undoubtedly overlook some important considerations; this is just a model of some important effects that I understand how to analyze.

I’ll make some effort to approach this as if I were thinking about AGI timelines for the first time, focusing on strategies that I use in other domains.

I’m something like 60% confident that the most important factor in the speed of AI takeoff will be the availability of computing power.

I’ll focus here on the time to human-level AGI, but I suspect this reasoning implies getting from there to superintelligence at speeds that Bostrom would classify as slow or moderate.

In this post, I’ll describe features of the moral system that I use. I expect it’s similar enough to Robin Hanson’s views that I’ll use his name for it, dealism, but I haven’t seen a well-organized description of dealism. (See a partial description here.)

It’s also pretty similar to the system that Drescher described in Good and Real, combined with Anna Salamon’s description of causal models for Newcomb’s problem (which describes how to replace Drescher’s confused notion of “subjunctive relations” with a causal model). Good and Real eloquently describes why people should want to follow a dealist-like moral system; my post will be easier to understand if you understand Good and Real.

The most similar mainstream system is contractarianism. Dealism applies to a broader set of agents, and depends less on the initial conditions. I haven’t read enough about contractarianism to decide whether dealism is a special type of contractarianism or whether it should be classified as something separate. Gauthier’s writings look possibly relevant, but I haven’t found time to read them.

Scott Aaronson’s eigenmorality also overlaps a good deal with dealism, and is maybe a bit easier to understand.

Under dealism, morality consists of rules / agreements / deals, especially those that can be universalized. We become more civilized as we coordinate better to produce more cooperative deals. I’m being somewhat ambiguous about what “deal” and “universalized” mean, but those ambiguities don’t seem important to the major disagreements over moral systems, and I want to focus in this post on high-level disagreements.

[Another underwhelming book; I promise to get out of the habit of posting only book reviews Real Soon Now.]

Book review: Seeing like a State: How Certain Schemes to Improve the Human Condition Have Failed, by James C. Scott.

Scott begins with a history of the tension between the desire for legibility versus the desire for local control. E.g. central governments wanted to know how much they could tax peasants without causing famine or revolt. Yet even in the optimistic case where they got an honest tax collector to report how many bushels of grain John produced, they had problems due to John’s village having an idiosyncratic meaning of “bushel” that the tax collector couldn’t easily translate to something the central government knew. And it was hard to keep track of whether John had paid the tax, since the central government didn’t understand how the villagers distinguished that John from the John who lived a mile away.

So governments that wanted to grow imposed lots of standards on people. That sometimes helped peasants by making their taxes fairer and more predictable, but often trampled over local arrangements that had worked well (especially complex land use agreements).

I found that part of the book to be a fairly nice explanation of why an important set of conflicts was nearly inevitable. Scott gives a relatively balanced view of how increased legibility had both good and bad effects (more efficient taxation, diseases tracked better, Nazis found more Jews, etc.).

Then Scott becomes more repetitive and one-sided when describing high modernism, which carried the desire for legibility to a revolutionary, authoritarian extreme (especially between 1920 and 1960). I didn’t want 250 pages of evidence that Soviet style central planning was often destructive. Maybe that conclusion wasn’t obvious to enough people when Scott started writing the book, but it was painfully obvious by the time the book was published.

Scott’s complaints resemble the Hayekian side of the socialist calculation debate, except that Scott frames the issues in terms that minimize associations with socialism and capitalism. E.g. he manages to include Taylorist factory management in his cluster of bad ideas.

It’s interesting to compare Fukuyama’s description of Tanzania with Scott’s description. They both agree that villagization (Scott’s focus) was a disaster. Scott leaves readers with the impression that villagization was the most important policy, whereas Fukuyama only devotes one paragraph to it, and gives the impression that the overall effects of Tanzania’s legibility-increasing moves were beneficial (mainly via a common language causing more cooperation). Neither author provides a balanced view (but then they were both drawing attention to neglected aspects of history, not trying to provide a complete picture).

My advice: read the SlateStarCodex review, don’t read the whole book.

Book review: The Measure of All Minds: Evaluating Natural and Artificial Intelligence, by José Hernández-Orallo.

Much of this book consists of surveys of the psychometric literature. But the best parts involve original results that bring more rigor and generality to the field; they approach the quality that I saw in Judea Pearl’s Causality and E.T. Jaynes’ Probability Theory, though Measure of All Minds achieves a smaller fraction of its author’s ambitions and is sometimes poorly focused.

Hernández-Orallo has an impressive ambition: measure intelligence for any agent. The book mentions a wide variety of agents, such as normal humans, infants, deaf-blind humans, human teams, dogs, bacteria, Q-learning algorithms, etc.

The book is aimed at a narrow and fairly unusual target audience. Much of it reads like it’s directed at psychology researchers, but the more original parts of the book require thinking like a mathematician.

The survey part seems pretty comprehensive, but I wasn’t satisfied with his ability to distinguish the valuable parts (although he did a good job of ignoring the politicized rants that plague many discussions of this subject).

For nearly the first 200 pages of the book, I was mostly wondering whether the book would address anything important enough for me to want to read to the end. Then I reached an impressive part: a description of an objective IQ-like measure. Hernández-Orallo offers a test (called the C-test) which:

  • measures a well-defined concept: sequential inductive inference,
  • defines the correct responses using an objective rule (based on Kolmogorov complexity),
  • with essentially no arbitrary cultural bias (the main feature that looks like an arbitrary cultural bias is the choice of alphabet and its order)[1],
  • and gives results in objective units (based on Levin’s Kt).

Yet just when I got my hopes up for a major improvement in real-world IQ testing, he points out that what the C-test measures is too narrow to be called intelligence: there’s a 960-line Perl program that exhibits human-level performance on this kind of test, without resembling a breakthrough in AI.
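
To make the flavor of such a test concrete, here is a toy sketch (my own illustration, not Hernández-Orallo’s actual C-test): items are letter sequences generated by tiny “programs”, and the objectively correct continuation is the one produced by the lowest-complexity program consistent with the visible prefix, where complexity is a crude stand-in for Levin’s Kt (description length plus the log of running time). The alphabet, the shift-based program language, and the bit charges are all my assumptions.

```python
import math
from itertools import product

ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # ordered alphabet: the one quasi-arbitrary cultural choice

def run_program(start, deltas, length):
    """Generate `length` letters: begin at `start`, repeatedly shift by the cycle of deltas."""
    seq, pos = [], ALPHABET.index(start)
    for i in range(length):
        seq.append(ALPHABET[pos % len(ALPHABET)])
        pos += deltas[i % len(deltas)]
    return "".join(seq)

def kt_proxy(deltas, length):
    """Crude stand-in for Levin's Kt: program description bits plus log2 of running time."""
    description_bits = math.log2(len(ALPHABET)) + 4 * len(deltas)   # start letter + ~4 bits per delta
    return description_bits + math.log2(length)

def best_continuation(prefix, horizon=1, max_deltas=3):
    """The 'correct' answer: the continuation from the simplest program that reproduces the prefix."""
    best = None
    for n in range(1, max_deltas + 1):
        for start in ALPHABET:
            for deltas in product(range(-4, 5), repeat=n):
                full = run_program(start, deltas, len(prefix) + horizon)
                if full.startswith(prefix):
                    score = kt_proxy(deltas, len(full))
                    if best is None or score < best[0]:
                        best = (score, full[len(prefix):])
    return best   # (complexity score, objectively defined continuation)

print(best_continuation("acegi"))   # -> continuation 'k', from the simplest matching rule (+2 each step)
```

The real C-test uses a more carefully specified generation language and exact Kt, but the structure is the same: the answer key and the item difficulties are defined by a complexity measure, not by the test designer’s intuition.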

Book review: Superforecasting: The Art and Science of Prediction, by Philip E. Tetlock and Dan Gardner.

This book reports on the Good Judgment Project (GJP).

Much of the book recycles old ideas: 40% of the book is a rerun of Thinking Fast and Slow, 15% of the book repeats Wisdom of Crowds, and 15% of the book rehashes How to Measure Anything. Those three books were good enough that it’s very hard to improve on them. Superforecasting nearly matches their quality, but most people ought to read those three books instead. (Anyone who still wants more after reading them will get decent value out of reading the last 4 or 5 chapters of Superforecasting).

The book is very readable, written in an almost Gladwell-like style (a large contrast to Tetlock’s previous, more scholarly book), at a moderate cost in substance. It contains memorable phrases, such as “a fox with the bulging eyes of a dragonfly” (to describe looking at the world through many perspectives).


Book review: Notes on a New Philosophy of Empirical Science (Draft Version), by Daniel Burfoot.

Standard views of science focus on comparing theories by finding examples where they make differing predictions, and rejecting the theory that made worse predictions.

Burfoot describes a better view of science, called the Compression Rate Method (CRM), which replaces the “make prediction” step with “make a compression program”, and compares theories by how much they compress a standard (large) database.

These views of science produce mostly equivalent results(!), but CRM provides a better perspective.
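
To make the comparison concrete, here is a minimal sketch of the idea (my own toy example, not code from the book): two “theories” of a small English text are each turned into a codelength for the same data, and the theory that yields the shorter total encoding, counting the cost of its own parameters, counts as the better one. The specific text, the independence assumption, and the crude table charge are all my simplifications.

```python
import math
from collections import Counter

data = ("the quick brown fox jumps over the lazy dog " * 200).encode()

# "Theory" 1: no structure at all; every byte costs log2(256) = 8 bits.
bits_uniform = 8 * len(data)

# "Theory" 2: bytes are drawn independently from a frequency table.
# Codelength = -sum log2 p(byte), plus a rough charge for shipping the table
# itself -- the compression program has to be counted, or cheating is easy.
counts = Counter(data)
total = len(data)
bits_model = -sum(c * math.log2(c / total) for c in counts.values())
bits_table = 8 * 2 * len(counts)   # crude: one (symbol, count) pair per distinct byte

print(f"uniform theory  : {bits_uniform:9.0f} bits")
print(f"frequency theory: {bits_model + bits_table:9.0f} bits")
```

Burfoot’s proposal is this move at scale: a theory that genuinely captures structure in a large standard database should translate into a program that compresses it better than a theory-free baseline does.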

Machine Learning (ML) is potentially science, and this book focuses on how ML will be improved by viewing its problems through the lens of CRM. Burfoot complains about the toolkit mentality of traditional ML research, arguing that the CRM approach will turn ML into an empirical science.

This should generate a Kuhnian paradigm shift in ML, with more objective measures of research quality than any branch of science has achieved so far.

Burfoot focuses on compression as encoding empirical knowledge of specific databases / domains. He rejects the standard goal of a general-purpose compression tool. Instead, he proposes creating compression algorithms that are specialized for each type of database, to reflect what we know about topics (such as images of cars) that are important to us.

Book review: The Human Advantage: A New Understanding of How Our Brain Became Remarkable, by Suzana Herculano-Houzel.

I used to be uneasy about claims that the human brain was special because it is large for our body size: relative size just didn’t seem like it could be the best measure of whatever enabled intelligence.

At last, Herculano-Houzel has invented a replacement for that measure. Her impressive technique for measuring the number of neurons in a brain has revolutionized this area of science.

We can now see an important connection between the number of cortical neurons and cognitive ability. I’m glad that the book reports on research that compares the cognitive abilities of enough species to enable moderately objective tests of the relevant hypotheses (although the research still has much room for improvement).

We can also see that the primate brain is special, in a way that enables large primates to be smarter than similarly sized nonprimates. And that humans are not very special for a primate of our size, although energy constraints make it tricky for primates to reach our size.

I was able to read the book quite quickly. Much of it is arranged as an occasionally suspenseful story about how the research was done. It doesn’t have lots of information, but the information it does have seems very new (except for the last two chapters, where Herculano-Houzel gets farther from her area of expertise).

Added 2016-08-25:
Wikipedia has a List of animals by number of neurons which lists the long-finned pilot whale as having 37.2 billion cortical neurons, versus 21 billion for humans.

The paper reporting that result disagrees somewhat with Herculano-Houzel:

Our results underscore that correlations between cognitive performance and absolute neocortical neuron numbers across animal orders or classes are of limited value, and attempts to quantify the mental capacity of a dolphin for cross-species comparisons are bound to be controversial.

But I don’t see much of an argument against the correlation between intelligence and cortical neuron numbers. The lack of good evidence about long-finned pilot whale intelligence mainly implies we ought to be uncertain.