Explorations in Polling

Primary election season is here, and news reports are filled with sound bites from candidates, their supporters, and pundits, all trying to get the edge by being the first with breaking news. It’s also polling season, as every news organization seems to have its own poll, each designed to project the winners. This provides a great opportunity to talk about some statistics concepts that often get buried in the high school curriculum: sampling, surveys, margin of error and confidence intervals.

One nice resource I have used in my classes before is the site pollingreport.com. This site collects polls from many sources: news agencies, university organizations and polling companies. Students can search from a long menu of topics and examine the careful wording of survey questions, time-progression data and information on sample size and margin of error.
Having students select their own survey and interpret the results can lead to interesting class discussions. One problem with polls is that the results are often taken as absolute, rather than as estimates of a population value. An interval plot can help remedy this and get students thinking about that pesky margin of error, which is often buried, italicized, or shown in a smaller font than the rest of a poll’s results. Here’s an example of an interval plot, using the results of a poll from pollingreport.com:

Quinnipiac University Poll. Feb. 14-20, 2012. N=1,124 Republican and Republican-leaning registered voters nationwide. Margin of error ± 2.9.
“If the Republican primary for president were being held today, and the candidates were Newt Gingrich, Mitt Romney, Rick Santorum, and Ron Paul, for whom would you vote?”

[Figure: interval plot of the Quinnipiac poll results]

Some questions for discussion can then include:

  • How can these results be used?
  • What do you think would happen if we asked more people? Or if the election were held today?
  • What would it mean if intervals overlapped each other?
  • How likely is it that nationwide support for Rick Santorum is within the interval?
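
To make the overlap question concrete, the intervals can be computed directly. The sketch below is in Python. The margin of error is the ± 2.9 points reported for the Quinnipiac poll, but the candidate labels and percentages are made-up placeholders for illustration, not the poll's actual results.

```python
from itertools import combinations

MARGIN = 2.9  # reported margin of error, in percentage points

# Hypothetical shares for illustration only (NOT the real poll results).
shares = {"Candidate A": 35.0, "Candidate B": 26.0, "Candidate C": 24.0}


def interval(share, margin=MARGIN):
    """Return the (low, high) interval for one reported percentage."""
    return (share - margin, share + margin)


def overlaps(a, b):
    """True when two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: interval(pct) for name, pct in shares.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: {low:.1f}% to {high:.1f}%")

# Compare every pair of candidates and flag the ones that are too close.
for first, second in combinations(intervals, 2):
    if overlaps(intervals[first], intervals[second]):
        print(f"{first} and {second} overlap: too close to call from this poll alone.")
```

With these placeholder numbers, only Candidates B and C overlap. Students can swap in the percentages from the poll they selected and see which pairs the poll can and cannot separate.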

While confidence intervals don’t need to be defined formally, the concept of these intervals indicating plausible values for the population parameter can be discussed. The New York Times, in particular, does an excellent job of providing an accessible explanation for margin of error, such as this excerpt from a telephone poll summary:

In theory, in 19 cases out of 20, overall results based on such samples will differ by no more than three percentage points in either direction from what would have been obtained by seeking to interview all American adults. For smaller subgroups, the margin of sampling error is larger. Shifts in results between polls over time also have a larger sampling error.
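
The "19 cases out of 20" claim can be checked with a small simulation. The Python sketch below assumes a true population share of 50% and polls of 1,000 people. Both numbers are hypothetical choices for illustration and are not taken from the Times summary. It runs many simulated polls and counts how often the sample result lands within the margin of error.

```python
import math
import random

random.seed(2012)  # fixed seed so the run is repeatable

TRUE_SHARE = 0.50   # assumed true population proportion (hypothetical)
SAMPLE_SIZE = 1000  # assumed poll size (hypothetical)
NUM_POLLS = 2000    # number of simulated polls

# 95% margin of error for a simple random sample of this size.
margin = 1.96 * math.sqrt(TRUE_SHARE * (1 - TRUE_SHARE) / SAMPLE_SIZE)


def run_poll(true_share, sample_size):
    """Simulate one poll and return the sample proportion."""
    yes = sum(random.random() < true_share for _ in range(sample_size))
    return yes / sample_size


# Count the simulated polls whose result is within the margin of the truth.
hits = sum(
    abs(run_poll(TRUE_SHARE, SAMPLE_SIZE) - TRUE_SHARE) <= margin
    for _ in range(NUM_POLLS)
)
coverage = hits / NUM_POLLS

print(f"Margin of error: {100 * margin:.1f} points")
print(f"Polls within the margin: {coverage:.1%}")
```

The share of polls inside the margin should come out close to 95%, which is the "19 out of 20" in the excerpt. Lowering SAMPLE_SIZE widens the margin, which mirrors the excerpt's point about smaller subgroups.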

Next, we can take a look at formulas for margin of error. One convenient formula found in some textbooks links margin of error directly to the sample size:
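
A widely taught rule of thumb of this kind is that the 95% margin of error is roughly 1/√n, where n is the sample size. Treating it as the formula meant here is an assumption. It follows from the standard expression 1.96·√(p(1−p)/n) by taking p = 0.5 (the worst case) and rounding 1.96 up to 2. The Python sketch below compares the two for the Quinnipiac poll quoted earlier (N = 1,124, reported ± 2.9). The function names are illustrative.

```python
import math


def rule_of_thumb(n):
    """Approximate 95% margin of error, in percentage points: 100 / sqrt(n)."""
    return 100 / math.sqrt(n)


def worst_case_margin(n, z=1.96):
    """95% margin for a simple random sample with p = 0.5, in percentage points."""
    return 100 * z * math.sqrt(0.25 / n)


n = 1124  # sample size of the Quinnipiac poll quoted earlier
print(f"Rule of thumb: ±{rule_of_thumb(n):.1f}")      # ±3.0
print(f"Worst case:    ±{worst_case_margin(n):.1f}")  # ±2.9
```

Both versions shrink with the square root of the sample size, so quadrupling the number of people polled only cuts the margin of error in half.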

By going back to pollingreport.com and pulling a sample of polls with different sample sizes, we can examine the accuracy of this short and snappy formula. The scatterplot below uses sample size as the independent variable, and reported margin of error as the dependent variable.
[Figure: scatterplot of reported margin of error against sample size]
The formula seems to be a nice guide, though some polls clearly use more sophisticated formulas that generate more conservative margins of error.

Classes that wish to explore polling further can check out the New York Times polling blog, FiveThirtyEight, which provides more detailed analyses of polls and their historical accuracy.
