Monthly Archives: February 2012

Explorations in Polling

Primary election season is here, and news reports are filled with sound bites from candidates, their supporters, and pundits, all trying to get the edge by being the first with breaking news.  It’s also polling season, as every news organization seems to have its own poll, all designed to project the winners.  This provides a great opportunity to talk about some statistics concepts which often get buried in the high school curriculum: sampling, surveys, margin of error and confidence intervals.


One nice resource I have used in my classes before is the site pollingreport.com. This site collects polls from many sources: news agencies, university organizations and polling companies. Students can search from a long menu of topics and examine the careful wording of survey questions, time-progression data and information on sample size and margin of error.
Having students select their own survey, and interpret the results, can lead to interesting class discussions. One problem with polls is that the results are often taken as absolute, rather than as estimates of a population value. An interval plot can help remedy this, and get students thinking about that pesky margin of error, which is often buried, italicized, or shown in a smaller font than the rest of a poll’s results. Here’s an example of an interval plot, using the results of a poll from pollingreport.com:

Quinnipiac University Poll. Feb. 14-20, 2012. N=1,124 Republican and Republican-leaning registered voters nationwide. Margin of error ± 2.9.
“If the Republican primary for president were being held today, and the candidates were Newt Gingrich, Mitt Romney, Rick Santorum, and Ron Paul, for whom would you vote?”

[Interval plot of the poll results: each candidate’s percentage with a ±2.9-point interval]
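
If you’d like students to build a plot like this themselves, a few lines of Python (with matplotlib) will do it. This is just a sketch: the percentages below are placeholders, since the poll’s exact numbers aren’t reproduced here; swap in the values from pollingreport.com.

```python
import matplotlib.pyplot as plt

# Placeholder percentages; replace with the actual Quinnipiac results.
candidates = ["Gingrich", "Romney", "Santorum", "Paul"]
support = [14, 29, 35, 11]   # hypothetical percent support
moe = 2.9                    # reported margin of error (percentage points)

fig, ax = plt.subplots()
ax.errorbar(support, range(len(candidates)), xerr=moe, fmt="o", capsize=5)
ax.set_yticks(range(len(candidates)))
ax.set_yticklabels(candidates)
ax.set_xlabel("Percent support (interval = estimate ± margin of error)")
plt.show()
```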

Some questions for discussion can then include:

  • How can these results be used?
  • What do you think would happen if we asked more people? Or if the election were held today?
  • What would it mean if intervals overlapped each other?
  • How likely is it that nationwide support for Rick Santorum is within the interval?

While confidence intervals don’t need to be defined formally, the concept of these intervals indicating plausible values for the population parameter can be discussed. The New York Times, in particular, does an excellent job of providing an accessible explanation for margin of error, such as this excerpt from a telephone poll summary:

In theory, in 19 cases out of 20, overall results based on such samples will differ by no more than three percentage points in either direction from what would have been obtained by seeking to interview all American adults. For smaller subgroups, the margin of sampling error is larger. Shifts in results between polls over time also have a larger sampling error.
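
That “19 cases out of 20” sentence can also be demonstrated with a quick simulation. Here is a rough sketch in Python, assuming a true support level of 50% and samples of 1,000 adults; roughly 19 in 20 of the sample percentages should land within three points of the truth.

```python
import random

p_true, n, trials = 0.50, 1000, 10_000
within = 0
for _ in range(trials):
    heads = sum(random.random() < p_true for _ in range(n))
    if abs(heads / n - p_true) <= 0.03:
        within += 1

print(f"{within / trials:.1%} of samples fell within 3 points of the truth")
```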

Next, we can take a look at formulas for margin of error. One convenient formula found in some textbooks links margin of error directly to the sample size:

margin of error ≈ 1/√n

(Here the margin of error is expressed as a proportion; multiply by 100 to get percentage points.)

By going back to pollingreport.com and pulling a sample of polls with different sample sizes, we can examine the accuracy of this short and snappy formula.  The scatterplot below uses sample size as the independent variable and reported margin of error as the dependent variable.
[Scatterplot of reported margin of error versus sample size]
The formula seems to be a nice guide, and some polls clearly use more sophisticated formulas which generate more conservative margins of error.
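
For a quick sense of what the shortcut predicts, here is a small sketch that evaluates 1/√n for a handful of sample sizes (the 1,124 matches the Quinnipiac poll above; the other values are arbitrary). The output can be compared with the margins of error polls actually report.

```python
from math import sqrt

# Margin of error predicted by the 1/sqrt(n) shortcut, in percentage points.
for n in (400, 600, 1124, 1500, 2500):
    print(f"n = {n:>5}: margin of error ≈ {100 / sqrt(n):.1f} percentage points")
```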

Classes who wish to explore polling further can check out the New York Times polling blog, FiveThirtyEight, which provides more detailed analyses of polls and their historical accuracy.

Slope and the ADA

The middle school in the district where I work is quite old.  Dedicated in 1959, and once serving as the district high school, the building is a Frankenstein of aging classrooms, newer additions, and inconsistent heat. One feature of the building is the network of ramps used to shuttle students from wing to wing, and supplies in and out.
[Photos of several of the building’s ramps]

Recently, I worked with a team of 8th grade algebra I teachers to develop an activity which would utilize the many ramps, get kids moving and measuring, and reinforce slope as a measure of steepness. The teachers had great ideas for leading students through measurement activities. My initial idea of having students choose points along the ramp, then measuring the rise and run between points, was discussed and improved. The teachers used blue painter’s tape to create guiding triangles along the bricks on the walls along two of the ramps. Another teacher noted that railings could be used to connect parallel lines to slopes, and triangles utilizing the railings were also provided.

Students measured the slope of two ramps using the provided triangles, then were led outside, where both a pedestrian ramp and a custodian’s ramp were measured.  The outside ramps posed additional challenges, as no guiding tape marks were provided.  Watching student reactions and approaches to these ramps was intriguing.  Some students attempted to use the bricks on the building to trace their own triangles, while another group discovered that the level ground along the freight ramp could be used as the “run”.

After the activity, the class discussed and compared their results.  In one class, the unusual steepness of one ramp in our building was questioned, and related to the legal limits of handicapped ramps.  The class agreed that the ramp seemed to be an original part of the building, and that an elevator had been installed alongside the ramp for our disabled friends.  Further discussion could include the requirements of the Americans with Disabilities Act, which contains the following requirements for ramps:

The least possible slope shall be used for any ramp. The maximum slope of a ramp in new construction shall be 1:12. The maximum rise for any run shall be 30 in (760 mm) . Curb ramps and ramps to be constructed on existing sites or in existing buildings or facilities may have slopes and rises as allowed in 4.1.6(3)(a)  if space limitations prohibit the use of a 1:12 slope or less.
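
If a class wants to check its measurements against these rules, a small helper like the sketch below works; the function name and the sample measurements are made up for illustration.

```python
def check_ramp(rise_in: float, run_in: float) -> None:
    """Report a ramp's slope and whether it meets the ADA limits quoted above."""
    slope = rise_in / run_in
    print(f"slope = {rise_in}/{run_in} = {slope:.3f} (about 1:{run_in / rise_in:.1f})")
    print("meets the 1:12 maximum slope:", slope <= 1 / 12)
    print("meets the 30-inch maximum rise:", rise_in <= 30)

# Hypothetical measurements for one ramp, in inches:
check_ramp(rise_in=24, run_in=300)
```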

As a follow-up, students found pictures of objects or places which they felt represented interesting slopes.  Geometer’s Sketchpad was then used to measure and compare the slopes in their pictures:
[Student photos with slopes measured in Geometer’s Sketchpad]

Who Takes 5 Hours to Mow a Lawn?

Some units and chapters in algebra lend themselves naturally to interesting openers. Engaging scenarios for discussing slope, systems of equations, or quadratic functions are abundant. Finding examples for topics like radicals, complex numbers, or rational expressions can be a bit more of a challenge. Addition and subtraction of rational expressions means that shared-work problems can’t be far behind, like this nugget from algebra.com:

One good use for rational equations is the shared work problem. This solution would be of great help in scheduling employees. For example, If Bob can mow a lawn in 3 hours and Joe can do it in 5 hours, how long would it take them together?

A few thoughts come to mind:

  • I’m doubting that the personnel schedulers at WalMart or Jiffy Lube are using rational expressions to schedule their employees.
  • How many of our kids would give 8 hours, or even 4 hours, as their initial guess?
  • Joe needs to stop lollygagging on the job.

I set out to make a video to encourage discussion of these problems. In a first attempt, my sister and nephew were recruited to each build a Lego tower separately, then together.
[Photo of the Lego tower build]
Working together had little effect on the overall time, as the partnership tripped over each other digging into the bucket for Legos and had trouble coordinating the overall tower construction. This led to a nice discussion of the assumed independence of the two workers in these problems, but made for a pretty bad video.

In the video below, teachers Christine and John were recruited to staple index cards to a stack of 50 “top secret” papers. A shared-work ending was also produced, but in a version that was later eliminated, Christine passed papers to John, who then stapled them. In order to maintain independence, a new ending was shot in which they worked separately, yet simultaneously.

Christine’s final time was 4:50, while John’s final time was 4:29.
To find the ideal shared time, we let x = the number of seconds required to complete the job together.

  • Christine’s rate is 1 / 290 of the job completed per second
  • John’s rate is 1 / 269 of the job completed per second

Since we want one job to be completed, this leads to the equation:

x/290 + x/269 = 1

Solving for x yields an ideal solution of about 2:19, so the partnership’s time of 2:10 is not too surprising.  The subjects admitted that they were a bit more competitive working together than when they were separated.  Also, my quick appearance during the shared portion of the video is due to the team needing more index cards, and not any funny business!  What would happen if Christine showed up a minute late?  How long would it take them to complete 2000 cards?
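
Before those extensions, a few lines of Python confirm the basic arithmetic (times converted to seconds); this is just a check of the equation above, not part of the classroom activity.

```python
christine, john = 290, 269   # 4:50 and 4:29, in seconds

# x/290 + x/269 = 1  =>  x = 1 / (1/290 + 1/269)
x = 1 / (1 / christine + 1 / john)
print(f"ideal shared time: {x:.1f} seconds, or about {int(x // 60)}:{x % 60:04.1f}")
```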

Hopefully, we can encourage some discussion and debate, and move away from Joe and his 5-hour lawns.

Factoring – Sending Out the Bat Signal!

One of the joys of my job is having mathematically interesting chats with my colleagues about how they approach specific problems with their classes.  These conversations often begin as one-on-one discussions, but usually evolve into calling multiple people into the fray to give their two cents.  This semester, a teacher in my department is tackling an Accelerated Algebra II class for the first time.  Having taught academically talented kids for many years, my advice to him was to constantly challenge his students, perhaps using problems like those from the American Mathematics Competitions as openers.  But while offering up academic challenges can keep a teacher’s mind sharp, there is the risk of having that “hmmmm…” moment… that uncomfortable feeling where you’re not quite sure what the correct response to a student question is.

The discussion today came from a review of factoring, and a problem which seems innocent enough:

Factor x⁶ – 64

Take a moment and think about how you would factor this….show all work for full credit…

Enjoy a few lines of free space as you consider your work….

And…time….pencils down….

The interesting aspect of x⁶ – 64 is that it is both a difference of cubes and a difference of squares.  I used the neat algebraic interface on purplemath.com to do some screen captures and make the algebra look pretty here.  In this case, the calculator factors the expression as a difference of squares, (x³ – 8)(x³ + 8), which gives a difference and a sum of cubes, each of which factors further:

[Screen capture: x⁶ – 64 = (x³ – 8)(x³ + 8) = (x – 2)(x² + 2x + 4)(x + 2)(x² – 2x + 4)]

But, the initial expression is also a difference of cubes, and can be factored as such.  It is verified below:

[Screen capture: as a difference of cubes, x⁶ – 64 = (x² – 4)(x⁴ + 4x² + 16), with the check that this expands back to x⁶ – 64]

The plot thickens as the discussion then centers on the “remnants” we get when we factor a difference of cubes.  We can verify that the two “remnants” from the first factorization (x² + 2x + 4 and x² – 2x + 4) are factors of the remnant from the second method (x⁴ + 4x² + 16):

[Screen capture: (x² + 2x + 4)(x² – 2x + 4) = x⁴ + 4x² + 16]
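
For anyone who wants a quick sanity check of both routes, here is a short sketch using sympy (assuming it is installed); it isn’t how the class worked the problem, just a verification.

```python
from sympy import symbols, factor, expand

x = symbols("x")

# Fully factored form, whichever special pattern you start with:
print(factor(x**6 - 64))
# (x - 2)*(x + 2)*(x**2 - 2*x + 4)*(x**2 + 2*x + 4)

# The cube-route remnant is the product of the two square-route remnants:
print(expand((x**2 + 2*x + 4) * (x**2 - 2*x + 4)))
# x**4 + 4*x**2 + 16
```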

So, what’s happening here?

The extra, messy factor we get when we factor a sum or difference of cubes is up for discussion here.
According to Purplemath:

The quadratic part of each cube formula does not factor, so don’t attempt it.

But we don’t have a quadratic here (though we could perform a quick substitution and treat it as one); we have a 4th-degree polynomial.  Even the algebra calculator on the site doesn’t care for this quirky 4th-power expression:

[Screen capture of the calculator’s output for x⁴ + 4x² + 16]

So, I am looking to my math peeps for some thoughts:

  1. Is there an order to consider when a polynomial fits two special cases?  Should we look at cubes or squares first?
  2. Does anyone have any insight on x⁴ + 4x² + 16?

Good night, and good factoring…

NFL Replays and the Chi-Squared Distribution

OK, I’ll admit the blog has been sports-heavy lately.  Now that the Super Bowl is over, hopefully I can diversify some.  But for now, one last football example…

This week, the sports blog Deadspin featured an article titled: “Does The Success Of An NFL Replay Challenge Depend On Which TV Network Is Broadcasting The Game?”   From the title, I was immediately hooked, since this is exactly the type of question we ask in AP Stats when discussing chi-squared distributions.  (Web note: while this particular article is fairly vanilla, linking to this site at school is not recommended, as Deadspin often contains not-safe-for-school content.)

The article nicely summarizes the two resolution types used in NFL broadcasts, and the overturn/confirmation rates for replay  challenges in both groups.  For us stat folks, the only omission here is the disaggregated data.  I contacted the author a few days ago with a request for the data, and have yet to receive a response.  But playing around with Excel some, and assuming the “p-value” later quoted in the article, we can narrow in on the possibilities.  The graph below summarizes a data set which fit the conditions and conclusions set forth in the article.

[Bar graph of a reconstructed data set consistent with the article’s reported results]
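
For classes that want to see the mechanics, here is a sketch of the chi-squared test the article implies, using scipy and a hypothetical 2×2 table; the counts are invented for illustration, since the disaggregated data weren’t published.

```python
from scipy.stats import chi2_contingency

#        overturned, upheld
table = [[115, 180],   # 720p broadcasts (Fox, ESPN): hypothetical counts
         [ 80, 150]]   # 1080i broadcasts (CBS, NBC): hypothetical counts

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p-value = {p:.3f}")
```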

By the time chi-squared distributions are covered in AP Stats, students have been exposed to all four of the broad conceptual themes in detail.  We can explore each of them in this article:

  • Exploring data:  What are the explanatory and response variables?  What graphical display is appropriate for summarizing the data?
  • Sampling and Experimentation:  How was this data collected?  What sampling techniques were used?  What conclusions will we be able to reach using complete data from just one year?
  • Anticipating Patterns:  Could the difference between the replay  overturn rates have plausibly occurred by chance?  Can we conduct a simulation for both types of  replay systems?
  • Statistical Inference:  What hypothesis test is appropriate here?  Are conditions for a chi-squared test met?

The author’s conclusions present an opportunity to have a  class discussion on communicating results clearly.  First, consider this statement about the  chi-squared test:

“A chi-square analysis of the results suggested those  differences had an 87 percent chance of being related to the video format, and a 13 percent chance of being random. Science prefers results that clear a 95 percent cutoff.”

Having students dissect those sentences, and working in groups to re-write them would be a worthwhile exercise.  Do these results allow us to conclude that  broadcast resolution is a factor in replay challenge success?  Has the author communicated the concept of p-value correctly?  What would we need to do differently in order  to “prove” a cause-effect relationship here?

One final thought.  While I can’t be sure my raw data are correct, the data seem to suggest that broadcasts in 720p (Fox and ESPN) have more challenges overall than those in 1080i (CBS, NBC), and it seems to be quite a difference.  Can anyone provide a plausible reason for this?  I am struggling with it.

Thoughts on Coin Flipping

It’s Super Bowl weekend, otherwise known here as the weekend I lose 5 bucks to my friend Mattbo. Matt and I have a standing wager every year on the Super Bowl coin flip, and I seem to have an uncanny, almost scary, ability to lose money on the flip. I also lose money to Matt on Thanksgiving annually, when my public school alma mater is routinely thrashed by Matt’s Catholic school, but that’s a story for another time.

Coin flipping seems vanilla enough. Its 50-50 probabilities make it seem uninteresting to study. But beneath the surface are lots of puzzling nuggets worth sharing with your students.

The NFC has won the opening coin toss in the Super Bowl for 14 consecutive years.  Go back and read that again slowly for maximum wow factor.  This is the sort of fascinating result which seems borderline impossible to many and brings on rumors of fixes and trends, but just how impressed should we be by this historical result?  Try simulating 45 coin flips (to represent the 45 Super Bowls) using a trusty graphing calculator.  What “runs” do we see?  Does having 14 in a row seem so implausible after simulation?  A number of sites (mostly gambling sites) have examined this piece of Super Bowl history, where some attach a probability of 1/16384 (that is, 1/2¹⁴), or about .00006, to this event.  But what exactly is this the probability OF?  In this case, it is the probability, starting with a given toss, that the NFC will win the next 14 in a row.  But it is also the probability that the AFC will win the next 14.  Or that the next 14 will be heads.  Or tails.  The blog The Book of Odds provides more information about the coin toss, specifically how it relates to winning the big game:
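
Here is a sketch of that simulation in Python rather than on the calculator: each trial flips 45 coins (one per Super Bowl) and records whether either side ever has a run of 14 or more.

```python
import random

trials, hits = 100_000, 0
for _ in range(trials):
    run, last, saw_long_run = 0, None, False
    for _ in range(45):
        flip = random.randint(0, 1)
        run = run + 1 if flip == last else 1
        last = flip
        if run >= 14:
            saw_long_run = True
    hits += saw_long_run

print(f"P(a run of 14+ somewhere in 45 tosses) ≈ {hits / trials:.4f}")
```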

The odds a team that wins the coin toss will win the Super Bowl are 1 in 2.15 (47%).

The odds a team that loses the coin toss will win the Super Bowl are 1 in 1.87 (53%).

The odds a team that calls the coin toss will win the Super Bowl are 1 in 1.79 (56%).

UPDATE:  In 2007, NPR ran a short piece during “All Things Considered” about the coin-flipping run, which was then in year 10.  I finally found it here.   It’s a quick 4 minutes and great to share with classes.

UPDATE #2:  The AFC just broke its dry spell.  Thanks to NBC for the nice stat line:

[NBC graphic summarizing the coin-toss streak]

Exploring runs in coin tossing through simulation allows us to make sense of unusual phenomena.  On the TI-84, the randInt function allows for quick simulations (for example, the command randInt(1,2,100) will produce a “random” string of 100 1’s and 2’s).  Deborah Nolan, a professor and author from UC Berkeley, has developed an activity which challenges students to act randomly.  A class is split in half and given a blackboard for recording coin-flipping results, and the professor leaves the room.  One group is charged with flipping a coin 100 times and recording their results accurately.  The second group is given the task of fabricating a list of 100 coin-flip results.  After both are finished, the professor returns and is able to quickly identify the falsified list.  The lack of long runs gives the fabricators away.
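
Here is a sketch of the statistic that does the detecting: simulate many honest sequences of 100 flips and look at the longest run in each. Fabricated lists rarely contain the long runs that genuine ones almost always do.

```python
import random
from collections import Counter

def longest_run(flips):
    """Length of the longest streak of identical results in a sequence."""
    best = run = 1
    for prev, cur in zip(flips, flips[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

counts = Counter(longest_run([random.randint(0, 1) for _ in range(100)])
                 for _ in range(10_000))
for length in sorted(counts):
    print(f"longest run of {length:>2}: {counts[length] / 10_000:.1%}")
```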

Does the manner in which a coin is tossed make the outcome more or less predictable?  Engineers at Harvard built a mechanical flipper to examine the relationship between a coin’s initial and final positions.  The assertion that much of the randomness in coin flipping  is the result of “sloppy humans” is tasty; we humans have trouble being random when needed.  Along the same lines of innovations in coin tossing, the 2009 and 2010 Liberty Bowl football games used something called the eCoin Toss to make the toss more accessible to the crowd.

[Photo of the eCoin Toss]

Finally, if you are into old-school, bare-knuckles, coin flipping, you can mention these scientists, who each took coin flipping to the extreme:

Comte de Buffon: 4,040 tosses, 2,048 heads

Karl Pearson: 24,000 tosses, 12,012 heads

John Kerrich: 10,000 tosses, 5,067 heads