Darn it if Hillary Clinton isn't going to go out like that dude careering off the side of the high-rise at the end of Die Hard. Today brought another parting shot: a 68-32 thumpin' in Puerto Rico.
Once again, the polls proved relatively useless at forecasting this one-sided result. The two pre-primary polls showed Hillary ahead by an average of 16 points, less than half the eventual victory margin.
One of the real stories of this primary has been how limited polling has been as a tool, and how Democratic electorates in state after state have defaulted to demographics. At critical moments in the primary season, momentum has shifted based on movement relative to the final polls, when that movement was simply an expression of the underlying demographic trends in that state.
- Final polling in New Hampshire was so cosmically wrong that a special committee of the American Association for Public Opinion Research was formed to look into the discrepancy. We now know that Obama has struggled in blue-collar, suburban New England -- losing MA and RI handily, and winning the Maine caucuses by about half his usual caucus victory margin. The demographics slightly favored Clinton, the polling favored Obama, and demographics won.
- South Carolina is the linchpin of the argument I'm about to make. That was the state that disrupted the narrative, transforming a likely loss into a Super Tuesday stalemate that created the situation we are in today. The RCP average showed Obama up by 11.3. He won by 28.9, thanks to the black vote. This sent shockwaves through the political system. But it was perfectly consistent with nearby states, with Obama winning Georgia by 35 points and North Carolina by a greater than expected 14.7 points. It should be noted that the gap between results and final polling was especially large in South Carolina (as well as Georgia and Alabama). Polling's low-balling of the nearly unanimous black vote -- normally not a factor with the usual all-white field -- created the conditions for Obama to beat expectations and to finally tie Hillary in national polling ahead of Super Tuesday.
- The first post-Super Tuesday stop was the Potomac primary. Here, Obama took Virginia by 28.8 points when the final RCP average predicted 17.7. The outsized influence of Northern Virginia, the black vote in the south side of the state, and a small Appalachian rural influence conspired to make it so. This was the first confirmation of Obama's post-Super Tuesday momentum, leading to his run of 10 straight wins.
- Virginia's influence paled in comparison to Wisconsin's. This was expected to be tight, with Obama narrowly ahead in the RCP average by 4.3. Instead, he blew the doors off with a 17.4-point win, fueled by proximity to Illinois, the state's progressive heritage, and lax voter registration requirements. This primary solidified his role as the favorite and was a key morale booster as the final big primary before "Super Tuesday 2" on March 4th.
- North Carolina crowned Obama the presumptive nominee. Here, Obama outperformed the final polls by nearly 7 points. By now, the map was filled out enough to know that with 25-point-plus wins directly to the north and south, the predicted 8-point margin was just too low.
- The final states that solidified the Appalachian + Hispanic narrative -- West Virginia, Kentucky, and Puerto Rico -- saw greater than expected Clinton margins, by 7, 7, and 20 points respectively. These primaries were immune to national polling showing Obama pulling away as the likely nominee. It's difficult to see how Clinton would have done better had these contests been held earlier in the process.
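For anyone who wants to check the arithmetic in the list above, here is a minimal Python sketch that recomputes the poll-versus-result gap for the contests where this post quotes both numbers. The names `contests` and `poll_gap` are mine, chosen for illustration, and the Puerto Rico result is taken as the 36-point margin implied by the 68-32 split.

```python
# Winner's margin in the final polls vs. the actual result, in points.
# Every figure is one quoted in this post; nothing here comes from
# the underlying spreadsheet.
contests = {
    "South Carolina": (11.3, 28.9),  # Obama
    "Virginia": (17.7, 28.8),        # Obama
    "Wisconsin": (4.3, 17.4),        # Obama
    "North Carolina": (8.0, 14.7),   # Obama
    "Puerto Rico": (16.0, 36.0),     # Clinton, 68-32
}


def poll_gap(polled, actual):
    """Points by which the winner beat the final polls."""
    return round(actual - polled, 1)


gaps = {state: poll_gap(p, a) for state, (p, a) in contests.items()}

# Print the biggest misses first.
for state, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state:15s} {gap:+.1f}")
```

In every one of these contests the winner ran ahead of the final polls, and Puerto Rico and South Carolina come out as the largest misses of the group.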
I've posted a spreadsheet with the calculations underlying this analysis. In a total of five primaries -- Puerto Rico, South Carolina, Georgia, Alabama, and Wisconsin -- the absolute gap between the final polls and the results was greater than in New Hampshire, which is seen as one of polling's greatest historical failures. Of these, the momentum shifts in South Carolina and Wisconsin were crucial to determining the nominee.
In 25 of the 31 primaries analyzed, the trend between the polls and the final results benefited the winner. This is a bit tautological -- until you consider there was a pro-winner shift of 5.76 points across all contests. This is inside the margin of error of most polls -- but barely.
Overall, the polls showed an average error of 6.89 points compared to the results.
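The spreadsheet itself isn't reproduced here, so the following is only a sketch of how summary numbers like these could be computed, not the actual calculation. It assumes margins are recorded as Obama minus Clinton, that "error" means the absolute difference between the final poll margin and the result, and that "pro-winner shift" means that difference signed in the winner's favor. The function names `pro_winner_shift` and `summarize` are hypothetical.

```python
from statistics import mean

# (final poll margin, actual margin) as Obama minus Clinton, in points.
# Only contests whose figures are quoted in this post, so the averages
# will NOT match the 31-contest numbers above.
contests = {
    "South Carolina": (11.3, 28.9),
    "Virginia": (17.7, 28.8),
    "Wisconsin": (4.3, 17.4),
    "North Carolina": (8.0, 14.7),
    "Puerto Rico": (-16.0, -36.0),  # negative = Clinton ahead
}


def pro_winner_shift(poll, result):
    """Positive when the winner beat the final polls, negative otherwise."""
    direction = 1 if result > 0 else -1  # assumption: no exact ties
    return (result - poll) * direction


def summarize(contests):
    """Aggregate poll-vs-result statistics over a dict of contests."""
    shifts = [pro_winner_shift(p, r) for p, r in contests.values()]
    return {
        "contests": len(shifts),
        "winner_benefited": sum(s > 0 for s in shifts),
        "mean_abs_error": mean(abs(r - p) for p, r in contests.values()),
        "mean_pro_winner_shift": mean(shifts),
    }


print(summarize(contests))
```

On these five contests both averages work out to about 13.7 points, far above the overall figures, which is what you'd expect from a sample made up of the biggest misses. The two averages coincide only because the winner beat the polls in all five; across a full set of contests, the misses that went against the winner pull the pro-winner shift below the average error.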
Numbers like these would seem to indicate primary results that were unpredictable and all over the map. Except they weren't. Now that we are approaching a perfect 50-state electoral breakdown of Clinton-Obama -- marred only by the presence of small-state caucuses -- we see that the results are totally internally coherent based on demographics. In fact, I'm prepared to argue that we would have gotten almost the same margin in each state in a national primary held February 5th. Because the candidates themselves polarized the electorate demographically, momentum and events (and polls) made little difference.
The polls' vulnerability was exposed in part because of the unique nature of the Democratic contest. Usually, we only pay attention to polls in close races in diverse states with lots of countervailing trends. And in fact, the polls did relatively well in large states like Ohio, Pennsylvania, Texas, Illinois, and New York -- the glaring exception being California. But in a sequential process where even the most far-gone states matter because of the delegate split, getting it wrong in the Deep South or Appalachia really impacted the narrative of the race.
For whatever reason, the polls seemed to do worse in states with large monolithic communities favoring either Clinton or Obama. Perhaps this is the "Wilder effect" -- professing to be open to the white woman or the black man to an interviewer but in the end voting for "our guy" -- or gal.
Either way, I think this means that polling in the future will have to be supplemented with a heavy dose of Poblano-style Moneyball analysis.