The Common Core places increased importance on statistics in middle school, beyond the tasks of creating simple data displays often encountered in middle-school texts. The new standards require that students be able to describe distributions, compare samples to populations, and design simulations:

- CCSS.Math.Content.6.SP.B.5c Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data were gathered
- CCSS.Math.Content.7.SP.A.1 Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences.

It’s all great table-setting for AP Statistics down the road, and for working with authentic, interesting data. In this activity, students use an online resource to perform a simulation in order to find the proportion of the earth’s surface covered with land (as opposed to water). This is not a new activity, as a number of teachers suggest using an inflatable globe and some classroom tossing to reach estimates. But I think the method below uses some web tools in a novel way, and encourages some authentic geography discussions.

**GATHERING SOME INITIAL IDEAS**

Before diving into any simulation, I like to gather ideas from my students, to see if they have any initial estimates or background about the problem. Sites like Padlet or TodaysMeet are great for encouraging and archiving discussions and participation, or you can go old-school and just record initial estimates on the board. Ask an initial question like “Do you know the percentage of Earth which is covered by water?” to start the discussion. Then start a discussion of sample size: “Would it be better to sample 20 points on earth, or 40 points, or 100 points? What factors would be part of your decision?”

**COLLECTING DATA**

In this simulation, students will sample a random point on the earth’s surface, record whether the point is covered by land or water, and repeat for a given sample size. For this, we will use the site Random.org, which generates random events, mostly things like numbers and dice, and their Random Coordinate Generator, which chooses a random location on earth and displays it using Google Earth.
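If you want to show students what “a random location on earth” means mathematically, here is a minimal Python sketch. I don't know how Random.org actually picks its points; this is just one standard way to get points spread evenly over a sphere, and the function name `random_point_on_earth` is my own invention for illustration.

```python
import math
import random

def random_point_on_earth(rng=random):
    """Return (latitude, longitude) in degrees, uniform over a sphere's surface.

    Hypothetical helper for illustration; not Random.org's actual method.
    """
    # Longitude can be drawn uniformly: every meridian has the same length.
    lon = rng.uniform(-180.0, 180.0)
    # Latitude cannot: drawing it uniformly would over-sample the poles.
    # Drawing sin(latitude) uniformly on [-1, 1] gives equal-area coverage.
    lat = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
    return lat, lon

# One student's sample of size 10 (seeded only so the run is repeatable).
rng = random.Random(2024)
points = [random_point_on_earth(rng) for _ in range(10)]
for lat, lon in points:
    print(f"{lat:8.3f}, {lon:9.3f}")
```

The code only produces coordinates. Deciding whether each point is land or water is still a job for the map and the student.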

The site works quickly, and the water/land data can often be determined without issue. It is also easy to zoom in and out to take care of those “close calls”. This gives a considerably more accurate measure than the thumb-check data from the inflatable globes. Very few issues arose in my trials with this method, but the biggest snag is Antarctica, which is land, but often appears light blue on Google Earth. Also, a few rare occasions produced a data point above the North Pole on the map, which I discarded. For your class, have each student generate a sample of size 10, and look at the proportion of land hits. Below, I did 10 trials for 4 different sample sizes:

The next steps depend on the sophistication and grade-level of your class. But in general, we want to know which sample size provides the best estimates. How do you know? Have students write explanations which defend a particular sample size.

Multiple sources (Circle graph from ChartsBin, NOAA Information) verify that about 29% of the Earth’s surface is land. Do our trials verify this? How often were our trials within 10% of this 29% mark?
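To give students a feel for how sample size affects that “within 10%” question, here is a rough Python sketch that fakes the activity many times over. It rests on a few assumptions of mine: the true land proportion is taken as 0.29, “within 10%” is read as within 10 percentage points (19% to 39%), and the sample sizes 10, 20, 40 and 100 are chosen purely for illustration.

```python
import random

TRUE_LAND = 0.29                  # land proportion quoted in the post
SAMPLE_SIZES = [10, 20, 40, 100]  # assumed sizes, for illustration only
TRIALS = 1000                     # simulated "students" per sample size

def land_proportion(n, rng):
    """Simulate n random points and return the proportion that hit land."""
    hits = sum(1 for _ in range(n) if rng.random() < TRUE_LAND)
    return hits / n

def share_within(n, rng, tolerance=0.10, trials=TRIALS):
    """Fraction of simulated trials whose estimate lands within the tolerance."""
    close = sum(
        1 for _ in range(trials)
        if abs(land_proportion(n, rng) - TRUE_LAND) <= tolerance
    )
    return close / trials

rng = random.Random(29)  # seeded only so the run is repeatable
results = {n: share_within(n, rng) for n in SAMPLE_SIZES}
for n, share in results.items():
    print(f"n = {n:3d}: {share:.0%} of simulated trials within 10 points of 29%")
```

Students can compare the printed shares with what actually happened in class, then argue about which sample size is “enough”.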

As more data is collected, a free site like StatKey can be used to generate appropriate graphs and statistics.

**FOR AP STATISTICS**

I see this as an improvement on an existing AP Stats exploration, an opening activity for confidence intervals for proportions, and an extension into the behaviors of CIs. For those playing along at home, here are the calculations for 2 standard deviations (Margin of Error) for my given samples:

And the corresponding intervals, showing how often my sample proportions were captured within each interval:
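For anyone who wants to redo this with their own class data, here is a minimal Python sketch of the two-standard-deviation calculation. It assumes the intervals are centered on the 29% land figure and uses the usual standard deviation of a sample proportion, sqrt(p(1 − p)/n). The sample sizes and the `my_proportions` list are made-up placeholders, not my actual trial results.

```python
import math

P_LAND = 0.29  # land proportion quoted in the post

def margin_of_error(n, p=P_LAND):
    """Two standard deviations of a sample proportion for sample size n."""
    return 2 * math.sqrt(p * (1 - p) / n)

def interval(n, p=P_LAND):
    """Interval p +/- margin of error, returned as (low, high)."""
    moe = margin_of_error(n, p)
    return p - moe, p + moe

def captured(sample_proportions, n, p=P_LAND):
    """Count how many sample proportions fall inside the interval."""
    low, high = interval(n, p)
    return sum(1 for p_hat in sample_proportions if low <= p_hat <= high)

# Assumed sample sizes, for illustration only.
for n in [10, 20, 40, 100]:
    low, high = interval(n)
    print(f"n = {n:3d}: margin of error = {margin_of_error(n):.3f}, "
          f"interval = ({low:.3f}, {high:.3f})")

# Hypothetical class results for samples of size 100.
my_proportions = [0.20, 0.30, 0.50]
print(captured(my_proportions, 100), "of", len(my_proportions), "captured")
```

Swap in your own sample sizes and proportions to see how the capture rate changes as n grows.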