Time for Truth in Polling?
Pollsters ought to be transparent about their methods and their limitations.
By John J. DiIulio, Jr.
Writing in early July for Public Discourse, I summarized the case against trusting this year’s presidential election polls. I cited a major report by leading academic experts documenting that pre-election polling was bad in 2016 and worse in 2020. In fact, in 2020, the national polls, whether conducted by phone or online, and for all types of samples, were the worst in 40 years.
Among the reasons for polling’s plight is how easy it has become for people to screen out unwanted phone calls and texts. Pollsters may now need to make well over a hundred calls to land a single randomly selected respondent. For example, in late July, a New York Times (NYT)/Siena presidential horserace poll was based on a telephone survey of 1,142 likely voters, conducted in both English and Spanish. It took about 140,000 calls to 54,000 voters to get that sample.
Bravo, but making that many contacts is quite expensive and labor intensive. So, many pollsters now economize by using smaller-than-ideal sample sizes; under-sampling hard-to-contact people and subpopulations; experimenting with novel ways of concocting a sample that’s representative of its target population; and juggling multiple methods for weighting data to adjust for a sample’s under-representation or over-representation of given groups.
Since July, more formidable voices than mine have echoed the case against trusting the polls. For instance, writing in early September for the Brookings Institution, William Galston explained why “not just individual polls but polling averages have turned out to be misleading.” And writing in mid-October for The New York Times, Ezra Klein advised that “over the next few weeks until Election Day: Just ignore the polls.”
Fair enough, but here’s a different pitch: Just question the polls in ways that might help to reform and improve polling. Most pollsters are well-meaning. Polling, done right, can provide a worthwhile window on citizens’ opinions. But we’ll only have more polling done right when more people can easily spot when polling is done wrong and its results are being oversimplified or oversold.
The first thing to understand is that, even when done to perfection, a poll is like a compass, not a GPS system. For example, let’s say a well-conducted national poll shows Harris with 51% versus Trump with 49%. That’s a spread of Harris +2 percentage points, and so the race is tight, right?
Wrong. Every poll has a “margin of error” or MOE. The MOE is an estimate of the gap between the results from the sample and the results the pollster would have gotten by polling the entire population from which the sample was drawn. So, let’s say the MOE for our Harris 51% versus Trump 49% poll is 3.0%, notated as +/- 3.0. What, exactly, does “+/- 3.0” mean?
It means that if the votes were cast when the poll was taken, the vote probably would have fallen somewhere between Harris 54% (add 3.0 to 51) versus Trump 46% (subtract 3.0 from 49) and Harris 48% (subtract 3.0 from 51) versus Trump 52% (add 3.0 to 49). Strictly speaking, a poll says nothing (nothing!) about any time before or after it was taken. So, the poll’s 2-point spread translates into a range of Harris +8 to Trump +4.
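To make that arithmetic concrete, here is a minimal sketch in Python; the poll_range helper is invented for illustration, and the inputs are the hypothetical Harris 51% versus Trump 49% poll just described.

```python
def poll_range(a_share, b_share, moe):
    """Best- and worst-case spreads for candidate A, per the
    arithmetic above: shift each candidate's share by the MOE
    in opposite directions and recompute the spread."""
    best = (a_share + moe) - (b_share - moe)   # A's share high, B's low
    worst = (a_share - moe) - (b_share + moe)  # A's share low, B's high
    return best, worst

# The hypothetical Harris 51% vs. Trump 49% poll, MOE +/- 3.0:
print(poll_range(51, 49, 3.0))  # (8.0, -4.0): Harris +8 to Trump +4
```

Notice that the two endpoints are simply the spread plus or minus twice the MOE, which is why even a modest MOE turns a narrow spread into a wide range.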
Was Biden a Sure Loser?
With MOEs always in view, let’s revisit the Biden versus Trump polls taken after their debate on June 27, 2024. These polls were widely interpreted to show that Biden was behind and could not beat Trump.
On July 11, 2024, two full weeks after Biden’s debate debacle, fourteen Biden versus Trump polls were summarized in a table featured on the Real Clear Politics (RCP) website. Eleven polls showed Trump ahead of Biden by a spread of 2 points to 6 points; two showed them in a tie; and one showed Biden ahead of Trump by a spread of 2 points.
But let’s now compute not just their spreads but their ranges. For instance, the latest of that lot was an NPR/Marist poll conducted July 9-10. It reported Trump 48% versus Biden 50%, for a spread of 2 points favoring Biden. With its MOE of 3.1, the poll showed a range of Biden up by 8.2 points to Trump up by 4.2 points.
The second latest poll, an Emerson poll conducted July 7-8, reported Trump 46% versus Biden 43%, for a spread of 3 points favoring Trump. With its MOE of 2.6, the poll showed a range of Trump up by 8.2 points to Biden up by 2.2 points.
And the third latest, an ABC News/Washington Post poll conducted July 5-9, reported Trump 46% versus Biden 46%. That’s zero spread, a “tie,” a “dead heat,” yes?
No. With its MOE of 2.0, the poll showed a range of Trump up by 4 points to Biden up by 4 points. The spread implies a neck-and-neck horserace, but the range unmasks two 4-point spreads, with each horse up to 4 points ahead of the other. “Ties” in polls indicate not a photo finish but a double-exposure photo finish.
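Fed into the poll_range sketch from earlier, those three polls reproduce exactly the ranges just described (treating each reported MOE, as the arithmetic above does, as applying to both candidates’ shares):

```python
# NPR/Marist, July 9-10: Biden 50%, Trump 48%, MOE 3.1
print(poll_range(50, 48, 3.1))  # (8.2, -4.2): Biden +8.2 to Trump +4.2

# Emerson, July 7-8: Trump 46%, Biden 43%, MOE 2.6
print(poll_range(46, 43, 2.6))  # (8.2, -2.2): Trump +8.2 to Biden +2.2

# ABC News/Washington Post, July 5-9: 46%-46% "tie", MOE 2.0
print(poll_range(46, 46, 2.0))  # (4.0, -4.0): either candidate up by 4
```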
Likewise, the post-debate Biden versus Trump polls in the key battleground states were a far cry from suggesting that Biden was sure to lose. For example, an Emerson post-debate poll of registered Pennsylvania voters, conducted June 30-July 2, reported Trump at 48% versus Biden at 43%, with a MOE of 3.0, a spread of Trump up by 5 points, and a range of Trump up by 11 points to Biden up by 1 point.
But the very next Emerson post-debate poll of registered Pennsylvania voters, conducted July 7-8, had Trump at 46% versus Biden at 43%, with a MOE of 2.6, a spread of Trump up by 3 points, and a range of Trump up by 8.2 points to Biden up by 2.2 points. So, if you believe the two polls, between July 2 and July 8, Biden gained, not lost, ground in the Keystone State.
Harris-Trump Whack-a-MOE
Now let’s probe the Harris versus Trump polls, starting with one that was widely interpreted to indicate that Trump was not only gaining ground among Latino voters but way ahead of Harris with Latino voters.
A USA Today/Suffolk University poll of likely Latino voters conducted October 14 to October 18 had Trump at 49% versus Harris at 38%, an 11-point spread. USA Today duly reported that the poll’s MOE was +/- 9.0. The poll’s ranges stretched from Trump 58% versus Harris 29% to Harris 47% versus Trump 40%.
But in other news reports and commentary on that poll, the 11-point spread was presented without reference to its MOE or its ranges, unless you count the tiny-font “MOE: +/-9%” at the bottom of much larger graphics featuring the 11-point spread.
What about the “polling averages”? On October 16, 2024, RCP featured a table that summarized the eleven latest Harris versus Trump national polls, variously conducted between September 30 and October 14. They showed Harris up by an average spread of +1.7.
That “RCP Average”—Harris +1.7—was calculated by adding the spreads of the eleven polls and dividing by eleven. To wit, in two of the polls the spread was Harris +4; in another two the spread was Harris +3; in one poll each the spread was Harris +1 and Harris +5; in one the spread was Harris +2; in three polls Harris and Trump were tied; and in one poll Trump was +3.
Here’s the arithmetic: three zeroes for the ties; Harris spreads totaling +22; minus Trump’s +3; for a net of Harris +19. Divide +19 by eleven and (eureka!) you get about 1.73, which rounds to the reported Harris +1.7.
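For readers who want to check that arithmetic, a minimal sketch; the eleven spreads below are transcribed from the tallies just described, with positive numbers favoring Harris and negative numbers favoring Trump.

```python
# Spreads of the eleven polls: positive = Harris ahead, negative = Trump ahead
spreads = [4, 4, 3, 3, 1, 5, 2, 0, 0, 0, -3]

average = sum(spreads) / len(spreads)  # 19 / 11 = 1.7272...
print(round(average, 1))               # 1.7, the reported "RCP Average"
```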
Now, however, take a closer look. The eleven polls were variously conducted over two or more days within that 15-day window. Nine of the polls were based on samples of likely voters (LV) ranging from 699 people to 8,647 people. The other two were based on samples of registered voters (RV), one with a sample of 1,033 people and the other with a sample of 4,025 people. Some of the eleven polls were conducted entirely online; others involved live interviews conducted by telephone; and still others’ samples were cobbled together from both online contacts and telephone interviews.
The MOEs of the eleven Harris versus Trump polls varied widely: two had a MOE between +/- 1.0 and +/- 1.9; three had a MOE between +/- 2.0 and +/- 2.8; two had a MOE between +/- 3.1 and +/- 3.9; two others had a MOE between +/- 4.0 and +/- 4.7; and two reported no MOE at all (not a zero MOE, which is impossible, but simply none reported).
The eleven polls’ respective ranges ranged all over the map. For example, the Marquette poll of 699 likely voters, the smallest sample of the eleven, reported a 50%-50% zero-spread “tie”; but with its MOE of +/- 4.7, it ranged from Harris 54.7% versus Trump 45.3% (Harris up by as much as 9.4 points) to Trump 54.7% versus Harris 45.3% (Trump up by as much as 9.4 points).
By contrast, the Morning Consult poll of 8,647 likely voters, with the biggest sample of the eleven, reported 50% for Harris versus 46% for Trump. With a MOE of +/- 1.0, it ranged from Harris 51% to Trump 45% (Harris up by 6 points), to Harris 49% to Trump 47% (Harris up by 2 points).
Report Ranges, Rethink Averages
“Poll aggregator” numbers like the “RCP Average” have their defenders, and a shift in polling averages over time can signal the direction in which opinion may be trending.
Still, how, exactly, does averaging spreads from polls conducted at different times, with different sample populations (likely voters, registered voters, all adults), different sample sizes, different weighting protocols, different interview methods, different MOEs, and different question-wording make good sense?
Isn’t that like expecting a tornado blowing through a junkyard stocked with spare parts from all sorts of makes and models of cars to assemble, and leave in its wake, a well-running all-terrain vehicle (ATV)?
The New York Times also averages polls, but it adjusts how much weight each poll is given in the average by “a variety of factors,” including “recency and sample size,” whether it “represents likely voters,” whether “other polls have shifted” since it was conducted, and whether “select pollsters” produced it.
But isn’t that tantamount to thinking that the twister can fashion a road-ready ATV, but only if it hits just those sections of the junkyard that contain bumpers that aren’t rusted, batteries that aren’t busted, and other parts that are functional, fixable, and fashioned by your favorite manufacturers?
By or before the 2028 presidential election season, pollsters, pundits, political consultants, news channels, and other media outlets might do well to consider these truth-in-polling practices:
Rethink averaging spreads and flacking polling averages.
Be explicit regarding sample weighting methods (“raking,” “matching,” “propensity weighting,” and others) and how even a slight change in weighting could spell a big shift in a poll’s reported results.
Commit to highlighting and explaining, not obscuring or omitting, MOEs.
Spotlight the poll’s ranges right alongside its spread, sample size, sample type (registered, likely, etc.), and dates conducted (for one way such a note might look, see the sketch after this list).
Publicize no poll’s results without a straightforward and easy-to-find source note describing its interview method (telephone, online, hybrid, or other) and expressing its sample size as a percentage of all individuals contacted.
Preface every statement regarding a presidential horserace poll’s results with words to the effect of “If the election were held on the days that this poll was conducted…”
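As a hypothetical illustration of several of these practices combined, here is a minimal sketch of a truth-in-polling source note. The Poll record, its field names, and the report format are invented for this example; the numbers echo the imaginary Harris 51% versus Trump 49% poll from earlier, and the dates are made up.

```python
from dataclasses import dataclass

@dataclass
class Poll:
    a_name: str
    b_name: str
    a_share: float    # candidate A's share, in percent
    b_share: float    # candidate B's share, in percent
    moe: float        # margin of error, +/- percentage points
    n: int            # sample size
    sample_type: str  # "likely voters", "registered voters", ...
    dates: str        # field dates
    method: str       # "telephone", "online", "hybrid", ...

def truth_in_polling_note(p: Poll) -> str:
    """Format a source note that keeps the spread, MOE, range,
    sample, dates, and interview method together in one place."""
    spread = p.a_share - p.b_share
    best = spread + 2 * p.moe   # a_name's best case
    worst = spread - 2 * p.moe  # a_name's worst case
    return (
        f"If the election were held on {p.dates}: "
        f"{p.a_name} {p.a_share}% vs. {p.b_name} {p.b_share}% "
        f"(spread {spread:+.1f}; MOE +/- {p.moe}; "
        f"range {p.a_name} {best:+.1f} to {worst:+.1f}; "
        f"{p.n:,} {p.sample_type}; {p.method})."
    )

# Invented numbers, echoing the hypothetical poll discussed above:
print(truth_in_polling_note(Poll(
    "Harris", "Trump", 51, 49, 3.0, 1142,
    "likely voters", "Oct. 1-5", "telephone")))
```

Nothing about such a note is technically demanding; the point is simply that the range, the MOE, the sample, the dates, and the method travel with the headline number.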
John J. DiIulio, Jr. taught American government for 35 years across three different universities and co-authored a leading American government textbook.