NFL Replays and the Chi-Squared Distribution

OK, I’ll admit the blog has been sports-heavy lately.  Now that the Super Bowl is over, hopefully I can diversify some.  But for now, one last football example…

This week, the sports blog Deadspin featured an article titled: “Does The Success Of An NFL Replay Challenge Depend On Which TV Network Is Broadcasting The Game?”  From the title, I was immediately hooked, since this is exactly the type of question we ask in AP Stats when discussing chi-squared distributions.  (Web note: while this particular article is fairly vanilla, linking to this site at school is not recommended, as Deadspin often contains not-safe-for-school content.)

The article nicely summarizes the two resolution types used in NFL broadcasts, and the overturn/confirmation rates for replay challenges in both groups.  For us stat folks, the only omission here is the disaggregated data.  I contacted the author a few days ago with a request for the data, and have yet to receive a response.  But by playing around with Excel some, and assuming the “p-value” quoted later in the article, we can narrow in on the possibilities.  The graph below summarizes one data set that fits the conditions and conclusions set forth in the article.


By the time Chi-Squared distributions are covered in AP Stats, students have been exposed to all 4 of the broad conceptual themes in detail.  We can explore each of them in this article:

  • Exploring data:  What are the explanatory and response variables?  What graphical display is appropriate for summarizing the data?
  • Sampling and Experimentation:  How was this data collected?  What sampling techniques were used?  What conclusions will we be able to reach using complete data from just one year?
  • Anticipating Patterns:  Could the difference between the replay  overturn rates have plausibly occurred by chance?  Can we conduct a simulation for both types of  replay systems?
  • Statistical Inference:  What hypothesis test is appropriate here?  Are conditions for a chi-squared test met?
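For the Statistical Inference questions, here is a minimal Python sketch of a chi-squared test on a 2x2 table, along with the usual "all expected counts at least 5" condition check. The counts are hypothetical placeholders that I made up for illustration (they are not the article's data, which I don't have), and the function name chi_squared_2x2 is my own. The p-value shortcut used here only works for a 2x2 table, which has 1 degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Chi-squared test (no continuity correction) for a 2x2 table of counts.

    Returns (statistic, p_value, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for each cell = (row total * column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all four cells
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

    # With 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, expected


# HYPOTHETICAL counts, invented for illustration only.
# Rows: 720p, 1080i.  Columns: overturned, upheld.
counts = [[30, 70], [20, 30]]

stat, p_value, expected = chi_squared_2x2(counts)
conditions_met = all(e >= 5 for row in expected for e in row)

print("chi-squared statistic:", round(stat, 3))
print("p-value:", round(p_value, 3))
print("all expected counts >= 5:", conditions_met)
```

Swapping in the real counts, if they ever turn up, only requires changing the counts table.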

The author’s conclusions present an opportunity to have a  class discussion on communicating results clearly.  First, consider this statement about the  chi-squared test:

“A chi-square analysis of the results suggested those  differences had an 87 percent chance of being related to the video format, and a 13 percent chance of being random. Science prefers results that clear a 95 percent cutoff.”

Having students dissect those sentences and work in groups to rewrite them would be a worthwhile exercise.  Do these results allow us to conclude that broadcast resolution is a factor in replay challenge success?  Has the author communicated the concept of p-value correctly?  What would we need to do differently in order to “prove” a cause-effect relationship here?
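One way to ground that discussion is a simulation. A p-value is the chance of seeing a difference at least as large as the observed one if resolution made no difference at all; it is not the chance that the result is "random." Here is a rough Python sketch of a shuffle (permutation) simulation. The counts are hypothetical placeholders that I invented, not the article's data, and the function name, number of repetitions, and seed are my own arbitrary choices.

```python
import random


def simulate_p_value(overturned_a, total_a, overturned_b, total_b,
                     reps=5000, seed=1):
    """Estimate a two-sided p-value for a difference in overturn rates.

    Assumes the null hypothesis: group labels (720p vs. 1080i) have no
    connection to whether a challenge is overturned.
    """
    rng = random.Random(seed)
    observed = abs(overturned_a / total_a - overturned_b / total_b)

    total_overturned = overturned_a + overturned_b
    total_upheld = total_a + total_b - total_overturned
    outcomes = [1] * total_overturned + [0] * total_upheld  # 1 = overturned

    extreme = 0
    for _ in range(reps):
        # Deal the outcomes out to the two groups at random
        rng.shuffle(outcomes)
        sim_a = sum(outcomes[:total_a])
        sim_b = total_overturned - sim_a
        if abs(sim_a / total_a - sim_b / total_b) >= observed:
            extreme += 1

    # Fraction of shuffles with a difference at least as large as observed
    return extreme / reps


# HYPOTHETICAL counts, invented for illustration only:
# 720p: 30 of 100 challenges overturned; 1080i: 20 of 50 overturned.
estimate = simulate_p_value(30, 100, 20, 50)
print("simulated p-value:", estimate)
```

Students can do the same thing by hand with index cards; the code just repeats the shuffle a few thousand times.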

One final thought.  While I can’t be sure my raw data are correct, the data seem to suggest that broadcasts in 720p (Fox and ESPN) have more challenges overall than those in 1080i (CBS and NBC).  And it seems to be quite a difference.  Can anyone provide plausible reasons for this?  I am struggling with it.


May The Best Team Win?

Driving home today, there was an interesting discussion on sports-talk radio about championship teams in various sports. The genesis of the discussion was the lingering anger/disappointment/jealousy we Phillies fans harbor over the Saint Louis Cardinals winning the World Series this year (the stereotype is true….we are generally angry people). Despite having the best regular-season record, and the best record in team history, the Phillies were out in the first round.

Part of the discussion centered on the wild-card in baseball, and how the introduction of the wild-card (with more coming next year) makes it far more difficult for the “best” team to win. This stands in contrast to the NBA, where the best team is not often upset early, and the NFL, where the byes give a large advantage to top teams.

So, what does the data suggest? Coming home, I looked up the champions for the past 25 years in all 4 major (yes, hockey counts….so shut it!) sports. I also did a quick check and found the team’s regular season ranks, according to wins (or points, in hockey). Here’s what we get:


Some interesting trends here. The host on my local sports-radio channel was making a compelling argument that it is easier to win if you are a top team in the NBA, and the numbers bear that out.  Also, note how poorly the team with the best regular-season record in major league baseball fares.

Math-wise, what can we do with this data? The chart has some nice talking points for conditional probability:

  • What is the probability you win the NBA title, given that you are the top seed?
  • What is the probability you were the top team, given that you won the World Series?
  • What is the probability you were the #2-4 seed, if you won the Stanley Cup?
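Questions like these can be answered straight from a two-way table of counts. Below is a small Python sketch. The numbers are hypothetical placeholders I made up so the code runs (each row sums to 25 seasons); they are not the counts from my chart. The sketch also assumes exactly one top-ranked team per season, so that "probability of winning, given you are the top seed" is just the fraction of seasons in which the top seed won.

```python
from fractions import Fraction

# HYPOTHETICAL counts of champions by regular-season rank, 25 seasons each.
# These numbers are invented for illustration only.
champions = {
    "NBA": {"top": 12, "2-4": 11, "5+": 2},
    "MLB": {"top": 3, "2-4": 10, "5+": 12},
    "NFL": {"top": 8, "2-4": 12, "5+": 5},
    "NHL": {"top": 6, "2-4": 11, "5+": 8},
}


def p_rank_given_sport(sport, rank):
    """P(champion had this rank | sport): a row-conditional probability."""
    row = champions[sport]
    return Fraction(row[rank], sum(row.values()))


def p_sport_given_rank(sport, rank):
    """P(sport | champion had this rank): a column-conditional probability."""
    column_total = sum(row[rank] for row in champions.values())
    return Fraction(champions[sport][rank], column_total)


# With one top seed per season, this also equals P(win title | top seed).
p_top_given_nba = p_rank_given_sport("NBA", "top")
p_2to4_given_nhl = p_rank_given_sport("NHL", "2-4")

# A different question: pick a top-ranked champion at random; was it NBA?
p_nba_given_top = p_sport_given_rank("NBA", "top")

print("P(top rank | NBA champion):", p_top_given_nba)
print("P(#2-4 rank | Stanley Cup champion):", p_2to4_given_nhl)
print("P(NBA | champion was top-ranked):", p_nba_given_top)
```

Having students explain why the row version and the column version give different answers makes a nice talking point on its own.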

What else can you do with this?