Category Archives: Statistics

What’s Going On in This Graph

Today the New York Times Learning Network dropped the first “What’s Going On in This Graph?” (WGOITG) of the new school year. This feature started last year as a monthly piece, but now expands to a weekly release. In WGOITG, an infographic from a previous NYT article is shown with the title, and perhaps some other salient details, stripped away – like this week’s graph…


Challenge your students to list some things they notice and wonder about the graph, and visit the NYT August post to discover how teachers use WGOITG in their classrooms. Here are some ideas I have used before with my 9th graders:

  • Have students work in pairs to write a title and lede (brief introduction) to accompany the graph.
  • Ask table groups to develop a short list of bullet-point facts that are supported by the graph, and share them out on note cards.
  • Have students consider how color, sizing, and scaling are used in effective ways to support the story (note how the size of the arrows plays a role in the graph shown here). This is a wonderful opportunity to think about statistics beyond traditional graphs and measures.

Invite your students to join in the moderated conversation, which drops on Thursday. Have your own favorite way to use WGOITG? Share it in the comments!



Seeing Stars with Random Sampling

Adapted from Introduction to Statistical Investigations, AP Version, by Tintle, Chance, Cobb, Rossman, Roy, Swanson and VanderStoep

Before the Thanksgiving break, I started the sampling chapter in AP Statistics.  This is a unit filled with new vocabulary and many, many class activities.  To get students thinking about random sampling, I have used the “famous” Random Rectangles activity (Google it…you’ll find it) and its cousin – Jelly Blubbers. These activities are effective in getting students to think about the importance of choosing a random sample from a population, and about communicating their procedures. But a new activity I first heard about at a summer session on simulation-based inference, and later explained by Ruth Carver at a recent PASTA meeting, has added some welcome wrinkles to this unit.  The activity uses the one-variable sampling applet from the Rossman-Chance applet collection, and is ideal for 1:1 classrooms, or even students working in tech teams.  Also, Beth Chance is wonderful…and you should all know that!

In my classroom notes, students first encounter the “sky”, which has been broken into 100 squares. To start, teams work to define procedures for selecting a random sample of 10 squares, using both the “hat” (non-technology) method and a method using technology (usually a graphing calculator). Before we draw the samples, however, I want students to think about the population – specifically, will a random sample do a “good job” of providing estimates? Groups were asked to discuss what they noticed about the sky.  My classes immediately sensed something worth noting:

There are some squares where there are many stars (we end up calling these “dense” squares) and some where there are not so many.

Before we even drew our first sample, we were talking about the need to consider both dense and non-dense areas in our sample, and the possibility that our sample would overestimate or underestimate the population, even with random sampling.  There’s a lot of stats goodness in all of this, and the conversation felt natural and accessible to the students.
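The technology-based selection procedure the teams write up can be sketched in a few lines of Python. This is a minimal sketch, assuming the squares are numbered 1 through 100 (the numbering scheme and the seed are my assumptions for illustration; the class actually used graphing calculators):

```python
import random

# Assumed numbering: the 100 sky squares are labeled 1 through 100.
random.seed(2024)  # fixed seed so the drawn sample is reproducible

# Simple random sample of 10 distinct squares, without replacement
sample_squares = random.sample(range(1, 101), k=10)
print(sorted(sample_squares))
```

Sampling without replacement matters here: the same square should not be counted twice, which is also why `random.sample` is a better fit than repeated calls to `random.randint`.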

Students then used their technology-based procedure to actually draw a random sample of 10 squares, marking off the squares.  But counting the actual stars is not reasonable, given their quantity – so it’s Beth Chance to the rescue!  Make sure you click the “stars” population to get started.  Beth has provided the number of stars in each square, along with information on density, row, and column to think about later.

But before we start clicking blindly, let’s describe that population.   The class quickly agreed that we have a skewed-right distribution, and took note of the population mean – we’ll need it to discuss bias later.

Click “show sampling options” on the top of the screen and we can now simulate random samples.  First, students each drew a sample of size 10 – the bottom of the screen shows the sample, summary statistics, and a visual of the 10 squares chosen from the population.


Groups were asked to look at their sample means, share them with neighbors, and think about how close these samples generally come to hitting their target.  Compare with a neighbor whose sample included few “dense” squares, or one where many “dense” squares made the cut: how much confidence do we have in using this procedure to estimate the population mean?

Eventually I unleashed the sampling power of the applet and let students draw more and more samples.  And while a formal discussion of sampling distributions is a few chapters away, we can make observations about the distributions of these sample means.
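The repeated-sampling step the applet performs can be mimicked with a short simulation. This sketch uses an invented skewed-right population standing in for the 100 star counts (the real counts live in Beth's applet), and shows that the sample means center near the population mean:

```python
import random
import statistics

random.seed(1)

# Hypothetical skewed-right population standing in for the 100 star counts
# (values made up for illustration; the applet holds the real data).
population = [random.expovariate(1 / 8) for _ in range(100)]

# Draw 1000 random samples of size 10 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 10)) for _ in range(1000)
]

# The distribution of sample means centers near the population mean,
# even though the population itself is skewed.
print(round(statistics.mean(population), 2))
print(round(statistics.mean(sample_means), 2))
```

Plotting a histogram of `sample_means` next to the population would show exactly what the students noticed below: a skewed population, but an approximately normal pile of means.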


And I knew the discussion was heading in the right direction when a student observed:

Hey, the population is definitely skewed, but the means are approximately normal.  That’s odd…

Yep, it sure is…and more seeds have been planted for later sampling distribution discussions. But what about those dense and non-dense areas the students noticed earlier?  Sure, our random samples seem to provide an unbiased estimator of the population mean, but can we do better?  This is where Beth’s applet is so wonderful, and where this activity separates itself from Random Rectangles.  On the top of the applet, we can stratify our sample by density, ensuring that an appropriate ratio of dense / non-dense areas (here, 20%) is maintained in the sample.  The applet then uses color to make this distinction clear: here, green dots represent dense-area squares.


Finally, note the reduced variability in the distribution of means from stratified samples, as opposed to simple random samples. The payoff is here!
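A rough simulation shows why stratifying pays off. Here the star counts are invented (20 "dense" squares with many stars, 80 "non-dense" squares with few), and the stratified draw keeps the 20% dense ratio by taking 2 dense and 8 non-dense squares per sample:

```python
import random
import statistics

random.seed(7)

# Invented star counts, standing in for the applet's real data:
# 20 "dense" squares with many stars, 80 "non-dense" with few.
dense = [random.randint(20, 40) for _ in range(20)]
sparse = [random.randint(0, 5) for _ in range(80)]
population = dense + sparse

def srs_mean():
    """Mean of a simple random sample of 10 squares."""
    return statistics.mean(random.sample(population, 10))

def stratified_mean():
    """Mean of a sample keeping the 20% dense ratio: 2 dense + 8 non-dense."""
    picks = random.sample(dense, 2) + random.sample(sparse, 8)
    return statistics.mean(picks)

srs_means = [srs_mean() for _ in range(1000)]
strat_means = [stratified_mean() for _ in range(1000)]

# Stratified sample means vary less, because the number of dense
# squares no longer fluctuates from sample to sample.
print(round(statistics.stdev(srs_means), 2))
print(round(statistics.stdev(strat_means), 2))
```

The simple random samples sometimes catch zero dense squares and sometimes several, which whipsaws the sample mean; fixing the dense count at 2 removes that source of variability.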

Later, we will look at samples stratified by row and/or column.  And cluster samples by row or column will also make an appearance.  There’s so much to talk about with this one activity, and I appreciate Ruth and Beth for sharing!

Compute Expected Value, Pass GO, Collect $200

Expected Value – such a great time to talk about games, probability, and decision making!  Today’s lesson started with a Monopoly board in the center of the room. I had populated the “high end” and brown properties with houses and hotels.  Here’s the challenge:

When I play Monopoly, my strategy is often to buy and build on the cheaper properties.  This leaves me somewhat scared when I head towards the “high rent” area if my opponents built there.  It is now my turn to roll the dice.  Taking a look at the board, and assuming that my opponents own all of the houses and hotels you see, what would be the WORST square for me to be on right now?  What would be the BEST square?

For this question, we assumed that my current location is between the B&O and the Short Line Railroads.  The conversation quickly went into overdrive – students debating their ideas, talking about strategy, and also helping explain the scenario to students not as familiar with the game (thankfully, it seems our tech-savvy kids still play Monopoly!).  Many students noted not only the awfulness of landing on Park Place or Boardwalk, but also how some common sums with two dice would make landing on undesirable squares more likely.


After our initial debates, I led students through an analysis, which eventually led to the introduction of Expected Value as a useful statistic to summarize the game.  Students could start on any square they wanted, and I challenged groups to each select a different square to analyze.  Here are the steps we followed.


First, we listed all the possible sums with 2 dice, from 2 to 12.

Next, we listed the Monopoly board space each roll would cause us to land on (abbreviated to make it easier).

Next, we looked at the dollar “value” of each space.  For example, landing on Boardwalk with a hotel has a value of -$2,000.  For convenience, we made squares like Chance worth $0.  Luxury Tax is worth -$100.  We agreed to make Railroads worth -$100 as an average.  Landing on Go was our only profitable outcome, worth +$200. Finally, “Go to Jail” was deemed worth $0, mostly out of convenience.

Finally, we listed the probability of each roll from 2 to 12.

Now for the tricky computations.  I moved away from Monopoly for a moment to introduce a basic example to support the computation of expected value.

I roll a die – if it comes out “6” you get 10 Jolly Ranchers, otherwise, you get 1.  What’s the average number of candies I give out each roll?

This was sufficient to develop the need for multiplying in our Monopoly table – multiply each value by its probability, find the sum of these products, and we have something called Expected Value.  For each initial square, students verified their solutions and we shared them on a class Monopoly board.
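The multiply-and-sum computation can be sketched in Python. The Jolly Rancher example and the two-dice probabilities come straight from the lesson; the helper function name is my own:

```python
from collections import Counter
from fractions import Fraction

def expected_value(outcomes):
    """E(X) = sum of value * probability over all outcomes."""
    return sum(value * prob for value, prob in outcomes)

# Jolly Rancher example: a "6" pays 10 candies, any other roll pays 1.
candies = expected_value([
    (10, Fraction(1, 6)),
    (1, Fraction(5, 6)),
])
print(candies)  # 5/2, i.e. 2.5 candies per roll on average

# Probabilities of each two-dice sum, used in the Monopoly table.
sums = Counter(a + b for a in range(1, 7) for b in range(1, 7))
probs = {s: Fraction(n, 36) for s, n in sums.items()}
print(probs[7])  # 1/6 – seven is the most likely roll
```

For the Monopoly analysis, each group's table pairs the dollar value of the landed-on square with the probability of the roll that reaches it, and `expected_value` of those pairs gives the average gain or loss per turn from that starting square.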


These numbers then took on meaning in the context of the problem – “I may land on Park Place, I may roll and hit nothing, but on average I will lose $588 from this position.”

HOMEWORK CHALLENGE: since this lesson went so well today, I kept with the theme and provided an additional assignment:

Imagine my opponent starts on Free Parking.  I own all 3 yellow properties, but can only afford to purchase 8 houses total.  How should I arrange the houses in order to inflict the highest potential damage to my opponent?


I’m looking forward to interesting work when we get back to school!

Note: I discussed my ideas about this topic in a previous post.  Enjoy!