Category Archives: Statistics


Class Opener – Day 60 – Herding the Cats

After a wonderful Thanksgiving break (made more wonderful by the Eagles’ win over the Cowboys!), it’s the 3-week sprint to the holidays, followed by 2 full weeks before final exams.  There’s a lot of stopping and starting going on, which doesn’t help continuity when thinking about class content.  In my AP Stats class, we are deep into our unit on experimental design, which is filled with ideas, terms and arguments much different than those in a traditional math class. Groups are working through their “Old Wives’ Tales” project, and after grading some student responses this weekend I need an opener which brings the whole class back into the Stats circle. My friend Glenn Waddell has some awesome resources for statistics on his website, which provided inspiration for today’s opener – a short video from ABC News featuring the placebo effect.

For today’s opener, I asked students to design an experiment which could prove (or disprove) the efficacy of the WYFFT “energy drink”.  This gave groups much to talk about, and led to a thorough discussion of elements of experimental design, including:

  • Treatments: WYFFT is not a “real” drink, it’s just soda. Students conjectured that the labeling and associated signage were the actual treatment. We can compare this vs a plain bottle, or against no drink at all.
  • Matched-Pairs: could subjects plausibly participate in both treatments? Is this reasonable?
  • Blocking: could the implied reaction be different in men than in women? Perhaps we should have two different experiments?
  • Response: what exactly are we measuring? What would be a suitable activity to measure a change in energy?
  • Randomization: how will subjects be selected for the treatments?
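
For the randomization piece, here is a minimal Python sketch of one way subjects could be assigned to treatments. The roster, the three treatment names and the `assign_treatments` helper are all made up for illustration; this is not the design any group actually settled on.

```python
import random

def assign_treatments(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to treatments like cards."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # Dealing in rotation keeps the group sizes as equal as possible.
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical roster and treatments (assumptions, not class data).
subjects = [f"Student {n}" for n in range(1, 13)]
treatments = ["WYFFT label", "plain bottle", "no drink"]

groups = assign_treatments(subjects, treatments, seed=60)
for name, members in groups.items():
    print(name, members)
```

Shuffling first and dealing second means chance alone decides who lands in which group, which is the whole point of randomization.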

And we are off and running after a long turkey-induced rest!

Class Opener – Day 54 – Matched-Pairs in AP Stats

The unit on experimental design is one of my favorites in the AP Stats year, but the structure of a matched pairs experiment – where every subject participates in both treatments – often confuses students. For the past few years, I have been introducing students to matched pairs design through a sport which is sweeping America…


In hallway bocce, students place two poker chips 5 meters apart in the hallway. Then, standing behind one of the chips, they roll a golf ball towards the opposite chip, trying to get as close as possible. With our carpeted hallways, the golf balls really take off, so some practice is needed to get the right touch. During this practice session, the students don’t know where this is all heading in terms of experimental design.

Next, the students are given a direction sheet for recording results. Each “stat-lete” is asked to play bocce 4 times, twice with their right hand, twice with their left, alternating hands. A coin is used to determine which hand to start with. Partners then measure their attempts and record results.  Note that today was “fashion disaster” day as part of our school’s spirit week.


Back in class, we then think about what could be conjectured before this experiment.  Sure, we could compare the attempts by right hands and by left hands, but what does this tell us?  We then settled on looking at players’ dominant versus their non-dominant hands, and made a dotplot of the results (note – my pre-made scale really was not sufficient here…those golf balls really fly!)


But this only allows us to compare hands in general. What we’d like to be able to do is determine if players are better with their dominant, rather than their non-dominant, hands. Subtracting these results, since all players participated in both treatments, allows for this comparison.
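
The subtraction step can be sketched in a few lines of Python. The players and distances below are invented for illustration (they are not our hallway results), and each value stands in for a player's average miss distance with that hand.

```python
from statistics import mean

# Hypothetical average distances from the target chip, in cm.
# These numbers are made up; they are NOT the class data.
results = {
    "Player A": {"dominant": 42.0, "non_dominant": 61.5},
    "Player B": {"dominant": 88.0, "non_dominant": 70.0},
    "Player C": {"dominant": 55.5, "non_dominant": 57.5},
    "Player D": {"dominant": 30.0, "non_dominant": 46.5},
}

# Matched pairs: one difference per player, since each player tried both hands.
# A positive difference means the dominant hand landed closer to the chip.
differences = {
    name: r["non_dominant"] - r["dominant"] for name, r in results.items()
}

mean_diff = mean(list(differences.values()))

for name, diff in differences.items():
    print(f"{name}: {diff:+.1f} cm")
print(f"Mean difference: {mean_diff:+.1f} cm")
```

Working with one difference per player removes the player-to-player variation (some people are just better at bocce), which two separate dotplots cannot do.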

In the end, those results seem quite inconclusive, but that’s okay! Not all experiments prove conjectures, and we learn about the process.

Class Opener – Day 46 – Correlation Does NOT Mean Causation!

Today’s class opener comes from my Advanced Placement Statistics class, but provides an important lesson for stats students of all ages.  A timeplot showing two interesting data sets and their changes over time greets students as they enter:


That’s quite a high r value we have for two variables, autism diagnoses and organic food sales, which would not seem so closely related. In conversation with the class we discussed the importance of clear communication, and how this article could easily be summarized and misinterpreted by our local newspaper:


Uh oh….we have a problem.  And not an uncommon problem, as scientific studies which find correlations between variables are often misinterpreted as cause-effect studies.  The fun site Spurious Correlations by Tyler Vigen provides some wild examples of variables with strong (sometimes eerily strong) correlations to help frame discussions.  Some fun examples –

  • Divorce rate in Maine correlates with Per capita consumption of margarine (US)
  • Worldwide non-commercial space launches correlates with Sociology doctorates awarded (US)
  • Per capita consumption of chicken (US) correlates with Total US crude oil imports
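
To see how easily a high r value can appear, here is a small Python sketch that computes the correlation coefficient for two made-up yearly series which both happen to rise over time. The numbers are invented for illustration and are not taken from any of the sources above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Two unrelated quantities measured over the same six years (made-up values).
series_a = [10, 12, 15, 19, 24, 30]
series_b = [200, 230, 270, 300, 360, 410]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")
```

Any two variables that both trend upward over the same period will tend to produce an r close to 1, whether or not one has anything to do with the other.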

Later, my students will be asked to read and respond to a “newspaper article” about a California school which analyzed their student data and found that student achievement correlates strongly to student height.  The school’s reaction to this correlation seems dubious at best, and with good reason….it’s a fictitious article I wrote to symbolize the danger of seeking cause/effect from casual relationships.

Class Opener – Day 44 – Statistics Clue Boxes

A problem I gave as review for our statistics test today became not only a source of conversation regarding vocabulary, but provided me some insight into the problem solving approaches of my students.

Here’s the problem. A list of numbers is given, listed in order, with some numbers removed:


The list has the following characteristics:

  • A mean of 76
  • A range of 32
  • An inter-quartile range of 21

Many students quickly understood the last blank must be 92, due to the range, but then became stuck.  As we’ve never explicitly seen a problem like this before, the reactions from students were fascinating.  Some pockets of students had no fear in drawing circles and arrows to break down the data set. Others preferred to talk ideas out, but without putting pen to paper this doesn’t lead to solutions right away. I was thrilled to see a few students step up and take the lead, and explain their ideas to others, which then led to breakthroughs.  Identifying the positions of median and quartiles here lets us fill in one of the missing numbers:


But a subset of my class was content to watch from afar, waiting for hints which they assumed would come. Or worse, tuning out until I presented an explanation to the class….which never came.

And that last blank caused more trouble than I would have expected, as some students had trouble making the connection between the mean of a data set and the sum of its elements.  To help with this, I asked struggling students to provide me with any 4 numbers which had a mean of 10 (making them different numbers).  I asked students what I should be looking for to check accuracy besides computing the mean….and then, the light bulb!  All lists need to add up to 40!  So without explicitly doing the empty blank problem in front of us, I sent students back to the board to think about this fact.  And the results were satisfying, as many of my fringe students could now complete the task and explain their procedure to their peers.
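
The link between a mean and a sum can be sketched in Python. The candidate lists and the five-number example below are my own illustrations (not the puzzle from class), and `missing_value` is a hypothetical helper name.

```python
def missing_value(known_values, total_count, target_mean):
    """Find the one missing number, using: sum = mean * count."""
    required_sum = target_mean * total_count
    return required_sum - sum(known_values)

# Any 4 different numbers with a mean of 10 must add up to 40.
candidates = [[7, 9, 11, 13], [1, 2, 3, 34], [4, 8, 12, 16]]
for nums in candidates:
    print(nums, "sum =", sum(nums), "mean =", sum(nums) / len(nums))

# Made-up list of 5 values with a mean of 76 and one blank:
# the total must be 76 * 5 = 380, and the known values add to 302.
known = [60, 70, 80, 92]
blank = missing_value(known, total_count=5, target_mean=76)
print("Missing value:", blank)
```

Checking the sum is the shortcut here: once the required total is known, a single blank is just a subtraction.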

Students need to understand math ideas in many forms, and the concept of mean here demonstrates this need.  If you ask a student how to compute a mean, they most likely have little difficulty, and have had much practice:

Mean = sum of “scores” / count of “scores”

But in the missing numbers puzzle, the concept “felt” different and thus “new” to many students.  For me, this is where many students struggle in math classrooms.  Are we showing students how ideas and problems connect to big ideas?  Or does each combination of an existing problem become treated like a new experience?  It’s hard to break the pattern of students wanting specific rules for each type of math problem, when this is often the math conditioning they receive. But it’s worth the hard-fought battle.

And if you had fun with the challenge at the start of this post, try the similar problem I give later as an assessment:


Class Opener – Day 43 – Statistics as Art

Big Stats test tomorrow – students are getting antsy, lots of movement happening with review and reflection.  Today was a good day to step back, think about the role of numbers in society and appreciate some intriguing artwork.

Chris Jordan is a photographic artist whose series “Running the Numbers – an American Self-Portrait” causes you to think of the largeness of our world, and the amount of waste we create. His website contains a number of fascinating pieces which zoom to reveal a statistic about our society’s wastefulness.  It’s an awesome experience, and we started class today by discussing a number of the pieces and the large numbers they represent.  There were a number of “whoa” moments as the composition of each picture was revealed, and I read the helpful statistic attached to each work.  Based on the size of each piece, there are some great estimation discussions to be had here as well.  It’s statistics – it’s art – worlds are colliding in a cool way!

Chris’s TED Talk “Turning Powerful Statistics Into Art” can also be shared with classes to learn more about the message of these pieces.

Class Opener – Day 42 – A Sampler of Sampling Methods

After a day off for election day, it’s back to the world of random sampling, margin of error and plausible intervals.  These tend to be tricky ideas for students, as we move from the “absolute” world of algebra and into the slightly more wishy-washy world of sampling and plausibility.  My board scribblings were intended to remind students that we draw samples to represent populations, and that random sampling is king!

But random sampling is messy business, and there are other sampling techniques I want students to consider, and think about their effectiveness.  Rather than lecture on each type (caution – excessive vocabulary lectures may cause drowsiness), I gave students a list of terms I expected them to research using suitable resources.

  • Cluster Sampling
  • Stratified Sampling
  • Systematic Sampling
  • Convenience Sampling

After a few rounds of walking around the room to discourage random copying of definitions which they didn’t understand anyway, many groups began to ask the “right” questions, relating the ideas to hypothetical surveys we could do of high school students.  Towards the end of our time, each group was assigned one term to “explain” on a poster through a visual representation. And now, we have a great crowd-sourced wall of survey vocabulary to refer to during discussions!
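
The four terms can also be compared side by side in a short Python sketch. The school below is hypothetical: 100 student IDs, 25 per grade and 10 per homeroom, all invented for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical school (an assumption): IDs 0-99, grades 9-12, homerooms 0-9.
students = list(range(100))
grade_of = {s: 9 + s // 25 for s in students}
homeroom_of = {s: s // 10 for s in students}

# Group the students by grade (strata) and by homeroom (clusters).
strata, clusters = {}, {}
for s in students:
    strata.setdefault(grade_of[s], []).append(s)
    clusters.setdefault(homeroom_of[s], []).append(s)

# Stratified: a random sample of 5 from EVERY grade.
stratified = [s for members in strata.values() for s in rng.sample(members, 5)]

# Cluster: randomly pick 2 homerooms, then survey EVERYONE in them.
chosen_rooms = rng.sample(list(clusters), 2)
cluster = [s for room in chosen_rooms for s in clusters[room]]

# Systematic: random starting point, then every 10th student on the list.
start = rng.randrange(10)
systematic = students[start::10]

# Convenience: whoever is easiest to reach, here the first 10 on the list.
convenience = students[:10]

print("Stratified: ", sorted(stratified))
print("Cluster:    ", cluster)
print("Systematic: ", systematic)
print("Convenience:", convenience)
```

Notice that only the convenience sample involves no randomness at all, which is a good starting point for talking about why it is the least trustworthy of the four.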


Class Opener – Day 39 – It’s a Heat (Map) Wave!

Finishing up discussions with scatterplots – today’s visual when students entered presented a new idea in scatterplots (from the awesome site) – a scatterplot representing the score of every NFL game ever played!


What’s the story here? So many great features of this plot to discuss including:

  • Its apparent symmetry
  • The vertical and horizontal avoidance lines
  • The colors – many students have never seen a heat map before
  • The clustering in the center of the graph

This was a quick warm up as I wanted to get to the main event – scatterplot stations!  Students worked in teams to complete activities (in 15-minute intervals) designed to strengthen their understanding of many ideas surrounding scatterplots.

Station 1 – using graphing calculators to assess data sets, and writing clear summaries of the trends.

Station 2 – estimating best-fit lines given a scatterplot, and using their algebra skills to make good estimates.

Station 3 – netbooks! Play with the Rossman-Chance “Guess the Correlation Applet” and develop an understanding of “least squares” with this GeoGebra applet.
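
For anyone curious what “least squares” means in numbers, here is a minimal Python sketch. The data points and the eyeballed guess line are made up for illustration; this is not what the applets do internally, just the same idea worked by hand.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Add up the squared vertical gaps between the points and a line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up scatterplot data (an assumption, not from the stations).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
best_ssr = sum_squared_residuals(xs, ys, slope, intercept)

# An eyeballed guess, like the ones students make at Station 2: y = 2x + 0.
guess_ssr = sum_squared_residuals(xs, ys, 2.0, 0.0)

print(f"Least-squares line: y = {slope:.2f}x + {intercept:.2f}")
print(f"Sum of squared residuals: best = {best_ssr:.4f}, guess = {guess_ssr:.4f}")
```

No other line can beat the least-squares line on the sum of squared residuals, so a student's estimate is “good” when its total comes close to that minimum.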

Fun day today…..moving on to sampling tomorrow!