prediction markets

All posts tagged prediction markets

Book review: Superforecasting: The Art and Science of Prediction, by Philip E. Tetlock and Dan Gardner.

This book reports on the Good Judgment Project (GJP).

Much of the book recycles old ideas: 40% of the book is a rerun of Thinking, Fast and Slow, 15% of the book repeats Wisdom of Crowds, and 15% of the book rehashes How to Measure Anything. Those three books were good enough that it’s very hard to improve on them. Superforecasting nearly matches their quality, but most people ought to read those three books instead. (Anyone who still wants more after reading them will get decent value out of reading the last 4 or 5 chapters of Superforecasting.)

The book is very readable, written in an almost Gladwell-like style (in sharp contrast to Tetlock’s previous, more scholarly book), at a moderate cost in substance. It contains memorable phrases, such as “a fox with the bulging eyes of a dragonfly” (to describe looking at the world through many perspectives).

The stock market reaction to the election was quite strange.

From the first debate through Tuesday, S&P 500 futures showed modest signs of believing that Trump was worse for the market than Clinton. This Wolfers and Zitzewitz study shows some of the relevant evidence.

On Tuesday evening, I followed the futures market and the prediction markets moderately closely, and it looked like there was a very clear correlation between those two markets, strongly suggesting the S&P 500 would be 6 to 8 percent lower under Trump than under Clinton. This correlation did not surprise me.
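
The kind of inference described above can be sketched as a simple regression of futures prices on the prediction-market probability of a Trump win. This is only a rough illustration: the `observations` below are made-up numbers, not actual election-night quotes, and `fit_line` is a hypothetical helper. Extrapolating the fitted line to probabilities of 0 and 1 gives the index level the market seems to imply under each outcome.

```python
# Illustrative only: (P(Trump wins), S&P 500 futures price) pairs.
# These values are invented for the sketch, not real quotes.
observations = [
    (0.20, 2140.0),
    (0.35, 2118.0),
    (0.50, 2090.0),
    (0.65, 2066.0),
    (0.80, 2041.0),
]

def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(observations)
value_if_clinton = intercept        # line extrapolated to P(Trump) = 0
value_if_trump = intercept + slope  # line extrapolated to P(Trump) = 1
implied_gap = (value_if_trump - value_if_clinton) / value_if_clinton
print("implied S&P 500 change under Trump: %.1f%%" % (100 * implied_gap))
```

With real data, the fit would only be as trustworthy as the assumption that nothing else was moving both markets during those hours.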

This morning, the S&P 500 prices said the market had been just kidding last night, and that Trump is neutral or slightly good for the market.

Part of this discrepancy is presumably due to the difference between regular trading hours and after-hours trading. The clearest evidence for market dislike of Trump came from after-hours trading, when the most sophisticated traders are off-duty. I’ve been vaguely aware that after-hours markets are less efficiently priced. But this appears to involve at least a few hundred million dollars of potential profit, which somewhat stretches the limit of how inefficient the markets could plausibly be.

I see one report of Carl Icahn claiming:

“I thought it was absurd that the market, the S&P was down 100 points on Trump getting elected … but I couldn’t put more than about a billion dollars to work”

I’m unclear what constrained him, but it sure looked like the market could have absorbed plenty more buying while I was watching (up to 10pm PST), so I’ll guess he was more constrained by something related to him being at a party.

But even if the best U.S. traders were too distracted to make the markets efficient, that leaves me puzzled about Asian markets, which were down almost as much as the U.S. market during the middle of the Asian day.

So it’s hard to avoid the conclusion that the market either made a big irrational move, or was reacting to news whose importance I can’t recognize.

I don’t have a strong opinion on which of the market reactions was correct. My intuition says that a market decline of anywhere from 1% to 5% would have been sensible, and I’ve made a few trades reflecting that opinion. I expect that market reactions to news tend to get more rational over time, so I’m now giving a fair amount of weight to the possibility that Trump won’t affect stocks much.

MIRI has produced a potentially important result (called Garrabrant induction) for dealing with uncertainty about logical facts.

The paper is somewhat hard for non-mathematicians to read. This video provides an easier overview, and more context.

It uses prediction markets! “It’s a financial solution to the computer science problem of metamathematics”.

It shows that we can evade disturbing conclusions such as Gödel incompleteness and the paradox of the liar, by expecting to only be very confident about logically deducible facts (as opposed to being mathematically certain). That’s similar to the difference between treating beliefs about empirical facts as probabilities, as opposed to boolean values.

I’m somewhat skeptical that it will have an important effect on AI safety, but my intuition says it will produce enough benefits somewhere that it will become at least as famous as Pearl’s work on causality.

Automated market-making software agents have been used in many prediction markets to deal with problems of low liquidity.

The simplest versions provide a fixed amount of liquidity. This either causes excessive liquidity when trading starts, or too little later.

For instance, in the first year that I participated in the Good Judgment Project, the market maker provided enough liquidity that there was lots of money to be made pushing the market maker price from its initial setting in a somewhat obvious direction toward the market consensus. That meant much of the reward provided by the market maker went to low-value information.

The next year, the market maker provided less liquidity, so the prices moved more readily to a crude estimate of the traders’ beliefs. But then there wasn’t enough liquidity for traders to have an incentive to refine that estimate.
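
The fixed-liquidity trade-off described above can be illustrated with a minimal sketch of a logarithmic market scoring rule (LMSR) market maker. That choice is an assumption made for illustration; nothing above says which rule the Good Judgment Project markets used. The parameter `b` and the function names are likewise hypothetical. A larger `b` means more liquidity: the same purchase moves the price less, and the market maker's worst-case subsidy is larger.

```python
import math

def lmsr_cost(quantities, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b))."""
    scaled = [q / b for q in quantities]
    m = max(scaled)  # subtract the max for numerical stability
    return b * (m + math.log(sum(math.exp(s - m) for s in scaled)))

def lmsr_prices(quantities, b):
    """Instantaneous prices (probabilities) for each outcome."""
    scaled = [q / b for q in quantities]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def buy(quantities, b, outcome, shares):
    """Return (cost paid by the trader, quantities after the purchase)."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b), after

# Same 50-share purchase against a low-liquidity and a high-liquidity maker.
for b in (20.0, 200.0):
    cost, after = buy([0.0, 0.0], b, 0, 50.0)
    print("b=%5.0f  price after buy: %.3f  cost: %.2f  max subsidy: %.2f"
          % (b, lmsr_prices(after, b)[0], cost, b * math.log(2)))
```

Because `b` never changes, the market maker is either too generous while the price is still far from consensus or too stingy once it is close, which is the problem both years ran into.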

One suggested improvement is to have liquidity increase with increasing trading volume.

I present some sample Python code below (inspired by equation 18.44 in E.T. Jaynes’ Probability Theory) which uses the prices at which traders have traded against the market maker to generate probability-like estimates of how likely a price is to reflect the current consensus of traders.

This works more like human market makers, in that it provides the most liquidity near prices where there’s been the most trading. If the market settles near one price, liquidity rises. When the market is not trading near prices of prior trades (due to lack of trading or news that causes a significant price change), liquidity is low and prices can change more easily.

I assume that the possible prices a market maker can trade at are integers from 1 through 99 (percent).

When traders are pushing the price in one direction, this is taken as evidence that increases the weight assigned to the most recent price and all others farther in that direction. When traders reverse the direction, that is taken as evidence that increases the weight of the two most recent trade prices.

The resulting weights (p_px in the code) are fractions which should be multiplied by the maximum number of contracts the market maker is willing to offer when liquidity ought to be highest. There is one weight for each price at which the market maker might position itself. (Yes, there will actually be two prices; maybe the two weights ought to be averaged.)

There is still room for improvement in this approach, such as giving less weight to old trades after the market acts like it has responded to news. But implementers should test simple improvements before worrying about finding the optimal rules.

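# Made-up test data: (direction, price) pairs, where direction is
# +1 when traders push the price up and -1 when they push it down.
trades = [(1, 51), (1, 52), (1, 53), (-1, 52), (1, 53), (-1, 52), (1, 53), (-1, 52), (1, 53), (-1, 52)]
p_px = {}       # weight for each possible price
num_agree = {}  # number of trades counted as evidence for each price

probability_list = range(1, 100)  # possible prices: 1 through 99 percent
num_probabilities = len(probability_list)

for i in probability_list:
    p_px[i] = 1.0/num_probabilities  # start from uniform weights
    num_agree[i] = 0

num_trades = 0
last_trade = 0
for (buy, price) in trades:
    num_trades += 1
    for i in probability_list:
        if last_trade * buy < 0: # change of direction
            # evidence for the two most recent trade prices
            if buy < 0 and (i == price or i == price+1):
                num_agree[i] += 1
            if buy > 0 and (i == price or i == price-1):
                num_agree[i] += 1
        else: # same direction as the previous trade, or the first trade
            # evidence for this price and all prices farther in that direction
            if buy < 0 and i <= price:
                num_agree[i] += 1
            if buy > 0 and i >= price:
                num_agree[i] += 1
        p_px[i] = (num_agree[i] + 1.0)/(num_trades + num_probabilities)
    last_trade = buy

for i in probability_list:
    print(i, num_agree[i], '%.3f' % p_px[i])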
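
The step of turning weights into order sizes might look roughly like the sketch below. It assumes the market maker quotes two adjacent prices (called `bid` and `ask` here) and averages their weights, which is only one of the options mentioned above. The function name, `MAX_CONTRACTS`, and the example weights are all hypothetical and chosen just to make the arithmetic visible.

```python
def offered_contracts(p_px, bid, ask, max_contracts):
    """Contracts to offer, given weights for the two quoted prices.

    Assumption: the two weights are simply averaged and then scaled by
    the most contracts the market maker is ever willing to offer.
    """
    weight = (p_px[bid] + p_px[ask]) / 2.0
    return int(round(weight * max_contracts))

# Illustrative weights near the prices where most trading happened.
example_weights = {51: 0.02, 52: 0.09, 53: 0.10, 54: 0.04}
MAX_CONTRACTS = 1000  # hypothetical ceiling on offered size

print(offered_contracts(example_weights, 52, 53, MAX_CONTRACTS))
print(offered_contracts(example_weights, 53, 54, MAX_CONTRACTS))
```

Quoting around 52/53, where trading has clustered, yields a larger offer than quoting around 53/54, which is the behavior the weights are meant to produce.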

I’ve made a change to the software which should fix the bug uncovered last weekend.
I’ve restored about half of the liquidity I was providing before last weekend. I believe I can continue to provide the current level of liquidity for at least a few more months unless prices change more than I currently anticipate. I may readjust the amount of liquidity provided in a month or two to increase the chances that I can continue to provide a moderate amount of liquidity until all contracts expire without adding more money to the account.
I’m not making new software public now. I anticipate doing so before the end of November.

To deter any suspicion that the comparisons I plan to make between Intrade’s predictions and polls are comparisons I selected to make Intrade look good, I’m announcing now that I intend to use as the primary poll aggregator. I intend to pay attention to predictions that are more long-term than those I focused on in 2004, so the comparison I’ll attach the most importance to will be based on the first snapshot I took of’s state by state projections, which was on July 24.

Also, as of last week, one of the Presidential Decision Markets that I’m subsidizing, DEM.PRES-OIL.FUTURES, has attracted enough trading (I suspect from one large trader) to make me reasonably confident that it’s showing the effects of trader opinion rather than the effects of my automated market maker (saying that oil futures will drop if the Democratic candidate wins, and rise if he loses).

This post is a response to a challenge on Overcoming Bias to spend $10 trillion sensibly.
Here’s my proposed allocation (spending to be spread out over 10-20 years):

  • $5 trillion on drug patent buyouts and prizes for new drugs put in the public domain, with the prizes mostly allocated in proportion to the quality adjusted life years attributable to the drug.
  • $1 trillion on establishing a few dozen separate clusters of seasteads and on facilitating migration of people from poor/oppressive countries by rewarding jurisdictions in proportion to the number of immigrants they accept from poorer / less free regions. (I’m guessing that most of those rewards will go to seasteads, many of which will be created by other people partly in hopes of getting some of these rewards).

    This would also have a side effect of significantly reducing the harm that humans might experience due to global warming or an ice age, since ocean climates have less extreme temperatures, seasteads will probably not depend on rainfall to grow food, and can move somewhat to locations with better temperatures.
  • $1 trillion on improving political systems, mostly through prizes that bear some resemblance to The Mo Ibrahim Prize for Achievement in African Leadership (but not limited to democratically elected leaders and not limited to Africa). If the top 100 or so politicians in about 100 countries are eligible, I could set the average reward at about $100 million per person. Of course, nowhere near all of them will qualify, so a fair amount will be left over for those not yet in office.
  • $0.5 trillion on subsidizing trading on prediction markets that are designed to enable futarchy. This level of subsidy is far enough from anything that has been tried that there’s no way to guess whether this is a wasteful level.
  • $1 trillion on existential risks
    Some unknown fraction of this would go to persuading people not to work on AGI without providing arguments that they will produce a safe goal system for any AI they create. Once I’m satisfied that the risks associated with AI are under control, much of the remaining money will go toward establishing societies in the asteroid belt and then outside the solar system.
  • $0.5 trillion on communications / computing hardware for everyone who can’t currently afford that.
  • $1 trillion I’d save for ideas I think of later.
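
As a quick sanity check on the arithmetic in the list above, the sketch below adds up the allocations and works out the average political prize. The dictionary keys are shorthand labels invented for this check; the dollar figures and the "100 politicians in about 100 countries" estimate come from the list itself.

```python
# Allocations in trillions of dollars, copied from the list above.
allocation = {
    "drug patent buyouts and prizes": 5.0,
    "seasteads and migration": 1.0,
    "improving political systems": 1.0,
    "futarchy prediction market subsidies": 0.5,
    "existential risks": 1.0,
    "communications / computing hardware": 0.5,
    "reserve for later ideas": 1.0,
}
total = sum(allocation.values())

# Roughly 100 eligible politicians in each of roughly 100 countries.
eligible = 100 * 100
avg_reward = allocation["improving political systems"] * 1e12 / eligible

print("total: $%.1f trillion" % total)
print("average reward: $%.0f million" % (avg_reward / 1e6))
```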

I’m not counting a bunch of other projects that would use up less than $100 billion since they’re small enough to fit in the rounding errors of the ones I’ve counted (the Methuselah Mouse prize, desalinization and other water purification technologies, developing nanotech, preparing for the risks of nanotech, uploading, cryonics, nature preserves, etc).

Book review: Infotopia: How Many Minds Produce Knowledge by Cass R. Sunstein.
There’s a lot of overlap between James Surowiecki’s The Wisdom of Crowds and Infotopia, but Infotopia is a good deal more balanced and careful to avoid exaggeration. This makes Infotopia less exciting but more likely to convince a thoughtful reader. It devotes a good deal of attention to conditions which make groups less wise than individuals as well as conditions where groups outperform the best individuals.
Infotopia is directed at people who know little about this subject. I found hardly any new insights in it, and few ideas that I disagreed with. Some of its comments will seem too obvious to be worth mentioning to anyone who uses the web much. It’s slightly better than Wisdom of Crowds, but if you’ve already read Wisdom of Crowds you’ll get little out of Infotopia.

Predictocracy (part 2)
Book review: Predictocracy: Market Mechanisms for Public and Private Decision Making by Michael Abramowicz (continued from prior post).
I’m puzzled by his claim that it’s easier to determine a good subsidy for a PM that predicts what subsidy we should use for a basic PM than it is to determine a good subsidy for the basic PM. My intuition tells me that at least until traders become experienced with predicting effects of subsidies, the markets that are farther removed from familiar questions will be less predictable. Even with experience, for many of the book’s PMs it’s hard to see what measurable criteria could tell us whether one subsidy level is better than another. There will be some criteria that indicate severely mistaken subsidy levels (zero trading, or enough trading to produce bubbles). But if we try something more sophisticated, such as measuring how accurately PMs with various subsidy levels predict the results of court cases, I predict that we will find some range of subsidies above which increased subsidy produces tiny increases in correlations between PMs and actual trials. Even if we knew that the increased subsidy was producing a more just result, how would we evaluate the tradeoff between justice and the cost of the subsidy? And how would we tell whether the increased subsidy is producing a more just result, or whether the PMs were predicting the actual court cases more accurately by observing effects of factors irrelevant to justice (e.g. the weather on the day the verdict is decided)?
His proposal for self-resolving prediction markets (i.e. markets that predict markets recursively with no grounding in observed results) is bizarre. His arguments about why some of the obvious problems aren’t serious would be fascinating if they didn’t seem pointless due to his failure to address the probably fatal flaw of susceptibility to manipulation.
His description of why short-term PMs may be more resistant to bubbles than stock markets was discredited just as it was being printed. His example of deluded Green Party voters pushing their candidate’s price too high is a near-perfect match for what happened with Ron Paul contracts on Intrade. What Abramowicz missed is that traders betting against Paul needed to tie up a lot more money than traders betting for Paul. High volume futures markets have sophisticated margin rules which mostly eliminate this problem. I expect that low-volume PMs can do the same, but it isn’t easy and companies such as Intrade have only weak motivation to do this.
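
The asymmetry in money tied up can be made concrete with a small sketch. It assumes a contract that pays 100 if the event happens and 0 otherwise, and an exchange that makes each side post its full worst-case loss; that is a simplification for illustration, not a description of Intrade's actual margin rules. The price of 8 and the function names are hypothetical.

```python
def capital_per_contract(price, payout=100.0):
    """Worst-case loss each side must post, assuming no margin offsets."""
    long_capital = price            # buyer can lose what was paid
    short_capital = payout - price  # seller can lose the rest of the payout
    return long_capital, short_capital

def return_if_right(price, payout=100.0):
    """Profit as a fraction of capital tied up, for whichever side wins."""
    long_capital, short_capital = capital_per_contract(price, payout)
    long_return = (payout - price) / long_capital
    short_return = price / short_capital
    return long_return, short_return

price = 8.0  # hypothetical long-shot contract trading at 8 percent
long_capital, short_capital = capital_per_contract(price)
long_return, short_return = return_if_right(price)
print("capital tied up: long %.0f, short %.0f" % (long_capital, short_capital))
print("return if right: long %.0f%%, short %.1f%%"
      % (100 * long_return, 100 * short_return))
```

Under these assumptions a trader betting against the long shot ties up more than eleven times as much money per contract as a trader betting for it, and earns under 9% on that capital even when right, which weakens the incentive to correct an inflated price.
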
He suggests that PMs be used to minimize the harm resulting from legislative budget deadlocks by providing tentative funding to projects that PMs predict will receive funding. But if the existence of funding biases legislatures to continue that funding (which appears to be a strong bias, judging by how rare it is for a legislature to stop funding projects), then this proposal would fund many projects that wouldn’t otherwise be funded.
His proposals to use PMs to respond to disasters such as Katrina are poorly thought out. He claims “not much advanced planning of the particular subjects that the markets should cover would be needed”. This appears to underestimate the difficulty of writing unambiguous claims, the time required for traders to understand them, the risks that the agencies creating the PMs will bias the claim wording to the agencies’ advantage, etc. I’d have a lot more confidence in a few preplanned PM claims such as the expected travel times on key sections of roads used in evacuations.
I expect to have additional comments on Predictocracy later this month; they may be technical enough that I will only post them on the futarchy_discuss mailing list.