Nassim Taleb famously wrote two books — Fooled by Randomness and The Black Swan — about how incredibly complex prediction models can spectacularly fail if just a few underlying assumptions are incorrect. One of Taleb’s targets was a financial model referred to as “Value at Risk,” or VaR. This model attempted to quantify, using historical measures of volatility as a proxy for risk, the maximum amount of money a firm or portfolio could lose over a certain period of time. Many commentators and analysts now believe that a foolish over-reliance on risk-management models like VaR was partly responsible for the 2008 financial crisis.
One of Taleb’s main points is that humans are desperate to view the world as far more rational and predictable than it actually is. If you doubt that assertion, spend a few minutes talking to an insurance actuary. Or take the sub-prime mortgage crash, for example. Bond traders, investment banks, and credit ratings agencies swore up and down that a security filled with sub-prime mortgages (that is, home loans made to individuals with less-than-stellar credit) was somehow worthy of a AAA rating because there was no possible way all of the loans would go bad at once. And why did Wall Street believe such an assumption was warranted? Because sub-prime home loans had never before all gone bad at the same time. In short: it wouldn’t happen because it hadn’t ever happened.
So what does this have to do with Nate Silver?
Silver stormed onto the scene in 2008 when, according to his acolytes, he correctly predicted how 49 of 50 states would vote in the presidential election (he missed Indiana). Do not remind his disciples that of the four close states, those with margins of 2.5% or less, Silver forecast only three correctly. And definitely do not remind them that the swing-state polls by themselves correctly called all but two states (Indiana and North Carolina).
Silver’s key insight was that if you used a simple simulation method known as Monte Carlo, you could take a poll’s topline numbers and its margin of error and come up with a probability forecast based on the poll. The effect of this method was to show that a 50-49 lead in a poll with 1,000 respondents wasn’t really a dead heat at all — in fact, the candidate with 50% would be expected to win two-thirds of the time if the poll’s sample accurately reflected the true voting population.
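For readers who want to see the mechanics, here is a minimal sketch of that kind of single-poll simulation in Python. It is not Silver’s code. It assumes the poll is a simple random sample, that sampling error is the only source of error, and that the true lead can be drawn from a normal distribution centered on the polled lead. The function name `win_probability` and its parameters are made up for illustration.

```python
import math
import random

def win_probability(lead_share, other_share, n_respondents, n_sims=100_000, seed=0):
    """Estimate how often the poll leader is truly ahead, by Monte Carlo.

    Assumption (not from the article): sampling error is the only error,
    and the lead is approximately normally distributed.
    """
    rng = random.Random(seed)
    lead = lead_share - other_share
    # Standard error of the lead: the difference of two shares
    # measured in the same sample of n_respondents people.
    se = math.sqrt((lead_share + other_share - lead ** 2) / n_respondents)
    # Draw a simulated "true" lead many times and count how often it is positive.
    wins = sum(1 for _ in range(n_sims) if rng.gauss(lead, se) > 0)
    return wins / n_sims

# The 50-49 poll with 1,000 respondents described above.
p = win_probability(0.50, 0.49, 1000)
print(f"Leader wins in {p:.1%} of simulations")
```

Under these assumptions the estimate lands somewhere between 60% and 65%, the same ballpark as the two-thirds figure cited above. Different assumptions about the error distribution would move the number.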
To a political world unfamiliar with mathematical methods that are normally taught in an introductory statistics course, Silver’s prophecy was nothing short of miraculous.
But was it? To find out, I spent a few hours re-building Nate Silver’s basic Monte Carlo poll simulation model from the ground up. It is a simplified version, lacking fancy pollster weights and economic assumptions and state-by-state covariance factors, but it contains the same foundation of state poll data that supports Nate Silver’s famous FiveThirtyEight model. That is, they are both built upon the same assumption that state polls, on average, are correct.
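The structure of such a simplified model can be sketched in a few lines of Python. This is an illustration, not the actual spreadsheet: every state name, electoral vote count, poll lead, and standard error below is a hypothetical placeholder, and each state is simulated independently, which is exactly the missing state-by-state covariance mentioned above.

```python
import random

# HYPOTHETICAL inputs for illustration only, not real poll data:
# state -> (electoral votes, candidate's average poll lead in points,
#           standard error of that lead in points)
STATES = {
    "State A": (29, 1.0, 3.0),
    "State B": (18, 2.5, 3.0),
    "State C": (13, -0.5, 3.0),
    "State D": (10, 4.0, 3.0),
}
SAFE_VOTES = 30    # hypothetical votes treated as already decided for the candidate
VOTES_TO_WIN = 66  # hypothetical majority threshold on this toy map

def simulate_election(states, safe_votes, votes_to_win, n_sims=20_000, seed=0):
    """Return the share of simulated elections the candidate wins."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        total = safe_votes
        # Each state is drawn independently: no covariance between states.
        for votes, lead, se in states.values():
            if rng.gauss(lead, se) > 0:
                total += votes
        if total >= votes_to_win:
            wins += 1
    return wins / n_sims

prob = simulate_election(STATES, SAFE_VOTES, VOTES_TO_WIN)
print(f"Candidate wins the toy Electoral College in {prob:.1%} of simulations")
```

Swapping the placeholder table for real state polling averages and real electoral vote counts is all it takes to turn the sketch into a working forecast, for whatever that forecast is worth.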
After running the simulation every day for several weeks, I noticed something odd: the winning probabilities it produced for Obama and Romney were nearly identical to those reported by FiveThirtyEight. Day after day, night after night. For example, based on the polls included in RealClearPolitics’ various state averages as of Tuesday night, the Sean Davis model suggested that Obama had a 73.0% chance of winning the Electoral College. By comparison, Silver’s FiveThirtyEight model as of Tuesday night gave Obama a 77.4% chance of winning the Electoral College.
So what gives? If it’s possible to recreate Silver’s model using just Microsoft Excel, a cheap Monte Carlo plug-in, and poll results that are widely available, then what real predictive value does Silver’s model have?