Category Archives: Statistics

Class Opener – Day 54 – Matched-Pairs in AP Stats

The unit on experimental design is one of my favorites in the AP Stats year, but the structure of a matched pairs experiment – where every subject participates in both treatments – often confuses students. For the past few years, I have been introducing students to matched pairs design through a sport which is sweeping America…

HALLWAY BOCCE!

In hallway bocce, students place two poker chips 5 meters apart in the hallway. Then, standing behind one of the chips, they roll a golf ball towards the opposite chip, trying to get as close as possible. With our carpeted hallways, the golf balls really take off, so some practice is needed to get the right touch. During this practice session, the students don’t know where this is all heading in terms of experimental design.

Next, the students are given a direction sheet for recording results. Each “stat-lete” is asked to play bocce 4 times, twice with their right hand, twice with their left, alternating hands. A coin is used to determine which hand to start with. Partners then measure their attempts and record results.  Note that today was “fashion disaster” day as part of our school’s spirit week.


Back in class, we then think about what could be conjectured before this experiment.  Sure, we could compare the attempts by right hands and by left hands, but what does this tell us?  We then settled on looking at players’ dominant versus their non-dominant hands, and made a dotplot of the results (note – my pre-made scale really was not sufficient here…those golf balls really fly!)

But this only allows us to compare hands in general. What we’d like to be able to do is determine if players are better with their dominant, rather than their non-dominant, hands. Subtracting these results, since all players participated in both treatments, allows for this comparison.
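
Here is a minimal sketch of that subtraction. The distances are made up for illustration and are not our class data, and using one number per hand for each player is a simplifying assumption (each stat-lete actually rolled twice with each hand).

```python
from statistics import mean, stdev

# Hypothetical distances (cm) from the target chip; NOT the class's real data.
# Each tuple is one player: (dominant-hand distance, non-dominant-hand distance).
players = [(42, 55), (130, 98), (77, 81), (60, 150), (95, 90), (38, 71)]

# Matched pairs: subtract within each player, so every player serves as
# their own comparison. Positive values mean the dominant hand was closer.
differences = [non_dominant - dominant for dominant, non_dominant in players]

mean_diff = mean(differences)
sd_diff = stdev(differences)

print("differences:", differences)
print("mean difference:", round(mean_diff, 1))
print("std dev of differences:", round(sd_diff, 1))
```

The analysis then looks at one list of differences rather than two separate lists of distances, which is what separates matched pairs from a comparison of two independent groups.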

In the end, those results seem quite inconclusive, but that’s okay! Not all experiments prove conjectures, and we learn about the process.

Class Opener – Day 46 – Correlation Does NOT Mean Causation!

Today’s class opener comes from my Advanced Placement Statistics class, but provides an important lesson for stats students of all ages.  A timeplot showing two interesting data sets and their changes over time greets students as they enter:

[Image: timeplot of autism diagnoses and organic food sales over time]

That’s quite a high r value we have for two variables, autism diagnoses and organic food sales, which would not seem so closely related. In conversation with the class we discussed the importance of clear communication, and how this article could easily be summarized and misinterpreted by our local newspaper:

ORGANIC FOOD CAUSES AUTISM, RESEARCH SHOWS

Uh oh….we have a problem.  And not an uncommon problem, as scientific studies which find correlations between variables are often misinterpreted as cause-effect studies.  The fun site Spurious Correlations by Tyler Vigen provides some wild examples of variables with strong (sometimes eerily strong) correlations to help frame discussions.  Some fun examples –

  • Divorce rate in Maine correlates with Per capita consumption of margarine (US)
  • Worldwide non-commercial space launches correlates with Sociology doctorates awarded (US)
  • Per capita consumption of chicken (US) correlates with Total US crude oil imports
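
To see how easily a high r shows up, here is a small sketch with two made-up series that both simply rise over time. The numbers and the helper name `pearson_r` are assumptions for illustration, not data from Spurious Correlations.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up yearly values: two quantities that each grow for unrelated reasons.
series_a = [10, 12, 15, 19, 22, 27, 31, 36, 40, 46]
series_b = [3.1, 3.6, 4.4, 5.0, 6.1, 6.9, 8.2, 9.0, 10.3, 11.5]

r = pearson_r(series_a, series_b)
print("r =", round(r, 3))
```

Any two quantities that trend steadily in the same direction will produce an r near 1, with time acting as the lurking variable behind both.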

Later, my students will be asked to read and respond to a “newspaper article” about a California school which analyzed its student data and found that student achievement correlates strongly with student height.  The school’s reaction to this correlation seems dubious at best, and with good reason….it’s a fictitious article I wrote to symbolize the danger of seeking cause/effect from casual relationships.

Class Opener – Day 44 – Statistics Clue Boxes

A problem I gave as review for our statistics test today became not only a source of conversation regarding vocabulary, but provided me some insight into the problem solving approaches of my students.

Here’s the problem. A list of numbers is given, listed in order, with some numbers removed:

[Image: the ordered list of numbers, with blanks in place of the removed values]

The list has the following characteristics:

  • A mean of 76
  • A range of 32
  • An inter-quartile range of 21

Many students quickly understood the last blank must be 92, due to the range, but then became stuck.  As we’ve never explicitly seen a problem like this before, the reactions from students were fascinating.  Some pockets of students had no fear in drawing circles and arrows to break down the data set. Others preferred to talk ideas out, but without putting pen to paper this doesn’t lead to solutions right away. I was thrilled to see a few students step up and take the lead, and explain their ideas to others, which then led to breakthroughs.  Identifying the positions of median and quartiles here lets us fill in one of the missing numbers:

[Image: the list with the median and quartile positions marked and one blank filled in]

But a subset of my class was content to watch from afar, waiting for hints which they assumed would come. Or worse, tuning out until I presented an explanation to the class….which never came.

And that last blank caused more trouble than I would have expected, as some students had trouble making the connection between the mean of a data set and the sum of its elements.  To help with this, I asked struggling students to provide me with any 4 numbers which had a mean of 10 (making them different numbers).  I asked students what I should be looking for to check accuracy besides computing the mean….and then, the light bulb!  All lists need to add up to 40!  So without explicitly doing the empty blank problem in front of us, I sent students back to the board to think about this fact.  And the results were satisfying, as many of my fringe students could now complete the task and explain their procedure to their peers.

Students need to understand math ideas in many forms, and the concept of mean here demonstrates this need.  If you ask a student how to compute a mean, they most likely have little difficulty, and have had much practice:

Mean = sum of “scores” / count of “scores”
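
Turning that formula around (sum = mean × count) is what unlocks a missing blank. A small sketch using the warm-up from class; the function name `missing_value` and the three known numbers are just illustrations:

```python
def missing_value(known_values, total_count, target_mean):
    """Return the one missing value that gives the full list the target mean."""
    required_sum = target_mean * total_count  # e.g. 10 * 4 = 40
    return required_sum - sum(known_values)

# Four numbers with a mean of 10 must add up to 40.
# If three of them are 7, 9 and 11, the fourth is forced.
fourth = missing_value([7, 9, 11], total_count=4, target_mean=10)
print(fourth)  # 13
```

The same idea applies to the puzzle: once every other blank is known, the mean of 76 pins down the final number.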

But in the missing numbers puzzle, the concept “felt” different and thus “new” to many students.  For me, this is where many students struggle in math classrooms.  Are we showing students how ideas and problems connect to big ideas?  Or does each variation of an existing problem get treated like a new experience?  It’s hard to break the pattern of students wanting specific rules for each type of math problem, when this is often the math conditioning they receive. But it’s worth the hard-fought battle.

And if you had fun with the challenge at the start of this post, try the similar problem I give later as an assessment:

[Image: a similar list of numbers with missing values]

Class Opener – Day 43 – Statistics as Art

Big Stats test tomorrow – students are getting antsy, lots of movement happening with review and reflection.  Today was a good day to step back, think about the role of numbers in society and appreciate some intriguing artwork.

Chris Jordan is a photographic artist whose work “Running the Numbers – an American Self-Portrait” causes you to think of the largeness of our world, and the amount of waste we create. His website contains a number of fascinating pieces which zoom to reveal a statistic about our society’s wastefulness.  It’s an awesome experience, and we started class today by discussing a number of the pieces and the large numbers they represent.  There were a number of “whoa” moments as the composition of each picture was revealed, and I read the helpful statistic attached to each work.  Based on the size of each piece, there are some great estimation discussions to be had here as well.  It’s statistics – it’s art – worlds are colliding in a cool way!

Chris’s TED Talk “Turning Powerful Statistics Into Art” can also be shared with classes to learn more about the message of these pieces.

Class Opener – Day 42 – A Sampler of Sampling Methods

After a day off for election day, it’s back to the world of random sampling, margin of error and plausible intervals.  These tend to be tricky ideas for students, as we move from the “absolute” world of algebra and into the slightly more wishy-washy world of sampling and plausibility.  My board scribblings were intended to remind students that we draw samples to represent populations, and that random sampling is king!

But random sampling is messy business, and there are other sampling techniques I want students to consider, and think about their effectiveness.  Rather than lecture each type (caution – excessive vocabulary lectures may cause drowsiness), I gave students a list of terms to research, expecting them to find suitable resources on their own.

  • Cluster Sampling
  • Stratified Sampling
  • Systematic Sampling
  • Convenience Sampling

After a few rounds of walking around the room to discourage random copying of definitions which they didn’t understand anyway, many groups began to ask the “right” questions, relating the ideas to hypothetical surveys we could do of high school students.  Towards the end of our time, each group was assigned one term to “explain” on a poster through a visual representation. And now, we have a great crowd-sourced wall of survey vocabulary to refer to during discussions!
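
For readers who like to see the ideas side by side, here is a rough sketch of the sampling methods applied to a hypothetical survey of high school students. The roster, grade and homeroom assignments, and sample sizes are all invented for illustration.

```python
import random

random.seed(54)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 40 students, 10 per grade, spread over 4 homerooms.
students = [{"id": i, "grade": 9 + i // 10, "homeroom": i % 4} for i in range(40)]

# Simple random sample: every student has the same chance of being chosen.
simple = random.sample(students, 8)

# Stratified: split by grade (the strata), then sample randomly within each.
stratified = []
for grade in (9, 10, 11, 12):
    stratum = [s for s in students if s["grade"] == grade]
    stratified.extend(random.sample(stratum, 2))

# Cluster: randomly choose a whole homeroom and survey everyone in it.
chosen_rooms = random.sample(range(4), 1)
cluster = [s for s in students if s["homeroom"] in chosen_rooms]

# Systematic: random starting point, then every 5th student on the roster.
start = random.randrange(5)
systematic = students[start::5]

# Convenience: whoever is easiest to reach, here the first 8 on the list.
convenience = students[:8]

for name, sample in [("simple", simple), ("stratified", stratified),
                     ("cluster", cluster), ("systematic", systematic),
                     ("convenience", convenience)]:
    print(name, sorted(s["id"] for s in sample))
```

Notice that only the convenience sample involves no randomness at all, which is exactly why it is the one most likely to misrepresent the population.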

Class Opener – Day 39 – It’s a Heat (Map) Wave!

Finishing up discussions with scatterplots – today’s visual when students entered presented a new idea in scatterplots (from the awesome Plot.ly site) – a scatterplot representing the score of every NFL game ever played!

What’s the story here? So many great features of this plot to discuss including:

  • Its apparent symmetry
  • The vertical and horizontal avoidance lines
  • The colors – many students have never seen a heat map before
  • The clustering in the center of the graph

This was a quick warm up as I wanted to get to the main event – scatterplot stations!  Students worked in teams to complete activities (in 15-minute intervals) designed to strengthen their understanding of many ideas surrounding scatterplots.

Station 1 – using graphing calculators to assess data sets, and writing clear summaries of the trends.

Station 2 – estimating best-fit lines given a scatterplot, and using their algebra skills to make good estimates.

Station 3 – netbooks! Play with the Rossman-Chance “Guess the Correlation Applet” and develop an understanding of “least squares” with this GeoGebra applet.
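
The “least squares” idea behind Stations 2 and 3 can be sketched in a few lines: the best-fit line is the one with the smallest sum of squared residuals. The five points and the eyeballed guess below are made up, and the function names are only illustrative; this is not a description of how either applet works internally.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Add up the squared vertical gaps between each point and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up scatterplot points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
fitted_ssr = sum_squared_residuals(xs, ys, slope, intercept)

# An eyeballed guess, like the ones students make at Station 2: y = 1x + 1.
guess_ssr = sum_squared_residuals(xs, ys, 1, 1)

print("least-squares line: y =", round(slope, 2), "x +", round(intercept, 2))
print("squared-residual total, fitted vs. guess:",
      round(fitted_ssr, 2), round(guess_ssr, 2))
```

No eyeballed line can beat the fitted line on this total, which gives students a concrete way to score their Station 2 estimates.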

Fun day today…..moving on to sampling tomorrow!

Class Opener – Day 38 – Are Any of My Students Compatible?

Today’s opener was inspired by a movie correlations activity I have used in AP Statistics, and Cathy Yenca’s awesome activity which brings this idea down to the Algebra level.

For my freshman class, I wanted students to “discover” the role of the correlation coefficient r – how it acts as a measure of the strength of the relationship between two quantitative variables.  To begin, 10 potential vacation / off-day activities were listed on the board:

  • Ski
  • Go to Beach
  • Amusement Park
  • Baseball Game
  • Broadway Show
  • Camping
  • Washington DC Tour
  • Shopping Day
  • Big Concert
  • Cruise

Students were each asked to rank these activities from 1 to 10 (10 being most desirable) and using each number only once. The class then moved into partnerships with my suggestion that they work with someone they maybe did not know so well in class, and compared results.  With an odd number of students, I worked with a student to share interests.  Results for each activity were plotted as ordered pairs, with each partner contributing their number score.  Students plotted their points on graph paper, while my student partner and I used Desmos – and quickly discovered that we have little in common.

From there, students learned how to use graphing calculators to analyze the data – making the scatterplot and finding the best-fit line.  The partnerships also wrote this mysterious new statistic – r – on the bottom of the graph and shared their graph on the board.  Through a gallery walk, the class examined the graphs and tried to conjecture the meaning of r.

This worked better than planned, as the class quickly made some key observations:

  • Pairs with stronger relationships have “higher” r values.
  • There are no r-values greater than 1.
  • r can be negative if people answer opposite each other.
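
These observations can be checked with a short sketch. Because each partner uses the numbers 1 to 10 exactly once, r can be found from the differences between the two rankings with the rank-correlation shortcut r = 1 − 6Σd² / (n(n² − 1)). In this no-ties situation that matches the r a graphing calculator reports. The rankings below are hypothetical, not my students’ answers.

```python
def ranking_r(ranks_a, ranks_b):
    """r for two rankings that each use 1..n exactly once (no ties)."""
    n = len(ranks_a)
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_sq_diff / (n * (n ** 2 - 1))

# Hypothetical rankings of the 10 activities (10 = most desirable).
partner_1 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
same_taste = list(partner_1)           # identical answers
opposite_taste = partner_1[::-1]       # completely reversed answers
mixed_taste = [9, 10, 6, 8, 7, 3, 5, 1, 4, 2]

r_same = ranking_r(partner_1, same_taste)
r_opposite = ranking_r(partner_1, opposite_taste)
r_mixed = ranking_r(partner_1, mixed_taste)

print("same:", r_same)
print("opposite:", r_opposite)
print("mixed:", round(r_mixed, 3))
```

Identical rankings give the largest possible r, fully reversed rankings give the smallest, and everything else lands in between, which lines up with what the class noticed on the gallery walk.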

Definitely will add this activity to my arsenal every year!


If you are interested in the activity for AP Stats, you can check out the Google Form we use, then some instructions for processing the data in this video: