Opinion

So, what happened with the polling for the 2020 election?

While the next president of the United States remains unknown as of Thursday evening, there is clearly one big loser: The pollsters, most of whom were touting the high likelihood of a Joe Biden blowout. So how did they get it so wrong? In 2016, I predicted that against all odds Donald Trump would win the presidential election. A month ago, I predicted that his reelection race would be won by a razor-thin margin on either side. Whether you are happy or not about the results, there was a very real chance of Trump winning a second term.

So, back to the main question: How were all the pollsters so wrong, again, even after the soul searching and methodological recalibrating that followed 2016?

The short answer is that public opinion researchers haven’t learned from their past mistakes.

First, pollsters almost surely underestimated what is called “nonresponse bias,” which is a fancy term for saying that the voters who participated in the surveys had different opinions than those who did not participate. That is, Trump voters either weren’t contacted, refused to respond or chose not to answer truthfully. A Washington Post investigation from 2016 showed that a large number of voters were reluctant to answer truthfully, specifically if they planned to vote for Trump. This shouldn’t be surprising given the vitriol in our current political climate. To anyone who doesn’t think that shy voters, or scared voters, or voters who don’t want to play it straight with pollsters for whatever reason, exist: You are living in a bubble.

Second, there is one polling prediction device that is regularly overlooked and misunderstood: approval ratings. A quick glance would show you that Trump has the lowest average approval rating in history at only 41%. The average for reelected presidents since 1980 is 54.5% — almost 14 percentage points higher! Since recording approval ratings began, every reelected president has had over a 50% approval rating when reelected or an upward trend over 30 days before the election.

This would clearly suggest that Trump didn’t have a chance of winning — but there was a clear asterisk.

As I discussed in 2019 and last month, we should care less about the raw numbers than about the trend. We can’t compare Trump’s approval ratings to those of past presidents; we have to compare the highs and lows over time. In that case, Trump has the smallest spread between high and low ever recorded. In other words, Trump had what is arguably the strongest base in recent history. Furthermore, a basic statistical analysis shows that Trump trended toward higher and higher approval ratings since he took office, only gaining in his base.

Third, pollsters have not gotten any better at estimating the margin of error in their polling, according to a piece published in the Harvard Data Science Review last week. Pollsters don’t ask every American for their vote decision; instead, they ask a smaller portion of the population and infer from that what the entire population is going to do. That means there is inevitably a plus-or-minus error in their predictions.
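For readers who want to see where that plus or minus comes from, here is a minimal sketch of the textbook calculation. It assumes a simple random sample and a 95% confidence level, and the poll size of 1,000 respondents is a hypothetical figure chosen for illustration, not one taken from any actual 2020 poll. The function name `margin_of_error` is likewise my own.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion.

    Assumes simple random sampling (an idealization; real polls
    use weighting and face nonresponse, which this ignores).
    p_hat: share of respondents backing a candidate (0 to 1)
    n:     number of respondents
    z:     critical value; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical poll: 1,000 respondents, candidate at 50%.
moe = margin_of_error(0.5, 1000)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
# prints "Margin of error: +/- 3.1 percentage points"
```

Note that this figure captures only the error from sampling itself. It does not account for problems like nonresponse bias, so the true uncertainty in a real poll is likely larger than the number this formula reports.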

Overall, not a single one of these three issues was enough to push the election to Trump, but combined, they threw off pollsters’ models. Again.

Liberty Vittert is a professor of the practice of data science at the Olin Business School at Washington University in St. Louis and the feature editor of the Harvard Data Science Review.