Building A Better Forecast
At the beginning of 2011, the United States government’s vast and sophisticated intelligence agencies thoroughly analyzed the situation in Egypt. “Our assessment,” said Secretary of State Hillary Clinton, “is that the Egyptian government is stable.”
That was Jan. 25. The same day, a massive demonstration blossomed in Tahrir Square. More followed. A week later, longtime Egyptian dictator Hosni Mubarak announced he would not seek re-election in September. On Feb. 11, Mubarak resigned.
Of course it’s easy to snigger at poor Hillary Clinton and the many smart people who told her Mubarak wasn’t going anywhere. We saw that movie. We know how it ended. But the truth is no one truly forecast the fall of the Mubarak regime. Not the Israelis. Not the Egyptian opposition. Not Hosni Mubarak.
And that’s true of most of the big stories of 2011. The most recent edition of The Economist’s annual look at the world in the coming year – “The World in 2012” – is on newsstands now but it’s far more interesting to read “The World in 2011,” which bears only a modest resemblance to the world in 2011.
So does this mean forecasting is futile? No. The truth is much more nuanced and interesting. And you can see it in a row of charts taped neatly to a book shelf in a tidy office at the Wharton School of the University of Pennsylvania.
The office belongs to Phil Tetlock, a psychologist famous for a mammoth experiment that began in the late 1980s. After recruiting some 280 experts – political scientists, economists, intelligence analysts, journalists – Tetlock asked them to make predictions. Who will win the next presidential election? Will the inflation rate rise or fall? Will war break out in the Persian Gulf? Will the Soviet Union collapse? Questions were carefully crafted so there would be no ambiguity about the right answer. Time frames were varied, from a few months to many years.
In all, Tetlock collected 28,000 predictions and generated a mountain of data. His brilliant analysis, published in the 2005 book Expert Political Judgment, won a basket full of academic awards. But one conclusion became famous: The average expert was about as accurate as random guessing. Or to put that more bluntly, the average expert had no real predictive insight at all.
But Tetlock drew another conclusion that got far less attention. There were actually two identifiable groups of experts in his experiment. One did significantly worse than random guessing, which is a remarkable feat. The other did modestly better. They were still miles from perfect but they did have real predictive insight.
Tetlock was able to identify the factor that made the difference. It was the style of thinking. The experts who were a disaster tended to value one big analytical idea, simplicity, clarity, and certainty. Experts with insight used many analytical ideas, gathered information promiscuously, were comfortable with uncertainty and ambiguity, and were much less sure of themselves.
The charts in Tetlock’s office represent the next stage of his research.
One reads: “Will a foreign or multinational force fire on, invade, or enter Iran before Sept. 1, 2012?” A series of coloured squiggles stretches across the paper. One squiggle consistently says the likelihood of an attack on Iran is less than 10 per cent. Others are more alarming. One says it’s more than 50 per cent.
It’s not a coincidence that the question is one American intelligence analysts are keenly interested in. It, and all the other questions on Tetlock’s charts, were dreamed up by the “Intelligence Advanced Research Projects Activity,” a branch of the U.S. intelligence community. At considerable expense, IARPA is funding an unprecedented “forecasting tournament,” in which five teams will use markedly different methods to see which best forecasts economic and political events.
Tetlock is heading one of the five competing teams. It’s a huge undertaking. His advisory group consists of a long list of leading scholars, including Nobel laureate Daniel Kahneman, and his roster of volunteer forecasters includes experts drawn from various fields along with thousands of educated laypeople who regularly go online and try to predict the future. It’s a considerable understatement to say that nothing like it has ever been attempted before.
To improve foresight, Tetlock is essentially trying to answer two questions.
One, if people engage in exercises intended to make them think along the lines of the experts who were more successful in Tetlock’s earlier experiment, will they improve their forecasting accuracy? Two, if these enhanced predictions and other ingredients are combined in various ways, can the famous “wisdom of crowds” – the well-documented tendency for the average of many judgments to be more accurate than any one judgment – be boosted to produce superior forecasts?
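For readers who want to see the arithmetic behind the "wisdom of crowds," here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the five probabilities, the outcome, the function names, and the use of squared error as the accuracy measure are all hypothetical, and nothing here describes how Tetlock's team actually combines or scores forecasts.

```python
from statistics import mean


def crowd_forecast(probabilities):
    """Combine individual probability forecasts by simple averaging."""
    return mean(probabilities)


def squared_error(forecast, outcome):
    """Squared gap between a probability forecast and what happened (1 or 0)."""
    return (forecast - outcome) ** 2


# Hypothetical forecasts from five people for one yes/no question.
forecasts = [0.05, 0.10, 0.30, 0.55, 0.20]

# Hypothetical outcome: 0 means the event did not happen.
outcome = 0

crowd = crowd_forecast(forecasts)
crowd_error = squared_error(crowd, outcome)

# Error of the typical forecaster: the mean of the individual errors.
mean_individual_error = mean(squared_error(p, outcome) for p in forecasts)

# How many individuals still did better than the averaged forecast.
beat_the_crowd = sum(1 for p in forecasts if squared_error(p, outcome) < crowd_error)
```

With squared error, the averaged forecast can never score worse than the average of the individual scores, which is the sense in which the crowd is reliably "wise." It is not guaranteed to beat every individual, though: in these made-up numbers some forecasters still finish ahead of the average. The catch is that no one knows in advance who they will be.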
The tournament only began recently and Tetlock thinks it may take four or five years for him to complete his work. But already the squiggles look promising.
That may surprise some. Tetlock’s work has so often been cited by people who dismiss all forecasting as tarted-up astrology – people like Nassim Taleb, author of The Black Swan – that Tetlock’s name has become associated with that extreme position. But it’s not what he believes.
Tetlock sums up his view with the ungainly but accurate phrase “skeptical meliorism.” The “skeptical” part is obvious. Look at what Hillary Clinton said on Jan. 25. Read “The World in 2011.” The litany of failed predictions is endless. And as Tetlock proved in his original experiment, it is precisely the experts who are most famous – the ones with the best-selling books and giant speaking fees – who are most likely to be wrong. There’s no question that our ability to peer into the future is dwarfed by our desire to do so and our ability to fool ourselves into thinking we have insight when we do not. This is dangerous terrain. Approach with caution.
But that doesn’t mean we should avoid forecasting altogether.
Science has made lots of things perfectly predictable. Tides. Eclipses. Tomorrow’s sunrise. Other things are inherently less predictable but thanks to the diligent work of researchers we can now forecast them with reasonable accuracy. If the weather report says it will rain tomorrow morning, you should bring an umbrella. You may not need it. But you probably will.
So we can improve foresight – that’s the “meliorism” part of “skeptical meliorism.” But how much can it be improved?
Weather forecasts five days out have only modest accuracy. Ten days out they have none. Can we push that to seven days? Eight? Nine? Insights from the natural sciences tell us these advances will get steadily more difficult. The same insights suggest there are some things that can never be predicted. The precise timing of earthquakes, for example. Or revolutions.
You would think corporations and governments would be clamouring to fund researchers like Tetlock. They’re not. They spend huge sums of money on forecasts but, bizarrely, they spend little or nothing to determine the accuracy of those forecasts and even less on improving them.
Which helps explain why embarrassing predictions are a staple of year-end roundups. And why IARPA’s unprecedented forecasting tournament is so important.