Bob, they printed teacher salaries in the newspaper this week, and it said that the median salary for your school district is $65,000. That’s great. How long until you make $130,000?
Yep, that’s my mother. A good cook, but a bit misguided statistically. A recent CNN interview of Diane Ravitch reminded me of the need for statistical literacy, and the danger of inappropriate statistical conclusions.
It has been my intent to keep this blog non-political. Of course, I have my opinions on educational issues at both the national and local levels, but my mission here has been to provide ideas and resources for my math teacher friends. So now I will try to tip-toe through a recent CNN interview between Diane Ravitch, the former Assistant Secretary of Education, and CNN reporter Randi Kaye. Diane expressed her concerns over statistical misuse in the interview through a recent blog post:
Randi Kaye asked me about NAEP scale scores, which was technically a very dumb question, and I was stunned. She thinks that a scale score of 250 on a 500 point scale is a failing grade, but a scale score is not a grade at all. It’s a trend line. She asserted that the scale scores are a failing grade for the nation.
Fortunately, CNN chose to cut this portion from the aired interview. But the statistical arguments are still worth exploring and learning from. Consider how many parents would feel if their child were to score 600 on the math portion of the SAT. What does this number represent, and how should it be applied? On Diane Ravitch’s blog, this SAT example is used to relate to the misuse of the NAEP statistic:
That is like saying that someone who scores a 600 on the SAT is a C student, because it is only 75% of 800. But that’s wrong. The scale is a technical measure. It is not a grade, period.
And while the SAT analogy helps explain the misuse of scale scores, I feel it tells only part of the stats-abuse story. Using a group average to reach a conclusion about every member of a population is simply inappropriate. Finland is often cited as having top mean scores on international assessments (the NAEP itself is given only in the United States); does this imply that all of its schools are passing? Of course not, and it would be foolish for its schools not to work to improve. If, instead of country mean scores, we plotted individual student scores, what would we see? I suspect that each country would show a sprinkling of students at the top, many students in the middle, and a sprinkling at the bottom. Perhaps we would find that the differences so often cited as evidence become nearly indistinguishable.
Let’s move away from the debate and politics of education and towards a classroom lesson. Consider Major League Baseball home runs. The dot plot below shows the mean number of home runs by players on National League team rosters at the end of last year, made using TinkerPlots, the middle-school cousin of Fathom.
The dot at the far right represents the Milwaukee Brewers, with the LA Dodgers just behind at 11.2. What conclusions can be made about the Brewers, based on this data point? How many of the conclusions below are appropriate, based on the means?
- The Brewers are the best home-run hitting team in the National League.
- The Brewers have all of the best home run hitters.
- The Brewers score the most runs.
- The Brewers are the best team in baseball.
What do we know about the individual home run hitters on the Brewers, or any other team? Not much. If we plotted all of their individual home runs, what should we expect to see? Will half be above 12 and half below 12, or can something else occur? The plot below shows all hitters’ home runs from last year, with the Dodgers and Braves pulled out to demonstrate individual team distributions.
The Dodgers’ team average of 11.2 gives us a nice summary of the team performance, but how well does it describe individual performance? Only 3 of the Dodger players hit near 11 home runs, while the others are distributed away from that number. We also gain an appreciation for the right-skewness of the distribution, which the means did not reveal. This provides a vital lesson that while the mean provides one overall summary of a distribution, it is not the only summary we should consider, and the mean tells us little about individuals.
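For students who want to check the "half above, half below" question by hand, here is a minimal sketch in Python. The roster is invented for illustration: it is not the real Dodgers data behind the plots, and the ten values were simply chosen so that the mean works out to 11.2 and the shape is right-skewed.

```python
from statistics import mean, median

# Hypothetical home run totals for a 10-player roster.
# These numbers are made up for the example; they are NOT the actual
# Dodgers data. They were picked so the mean equals 11.2.
home_runs = [0, 1, 2, 3, 5, 8, 11, 12, 30, 40]

team_mean = mean(home_runs)      # the single summary number from the first plot
team_median = median(home_runs)  # the value that splits the roster in half

# Count how many players fall below the team mean.
below_mean = sum(1 for hr in home_runs if hr < team_mean)

print(f"mean   = {team_mean}")
print(f"median = {team_median}")
print(f"{below_mean} of {len(home_runs)} players are below the mean")
```

In this made-up roster, two big hitters pull the mean well above the median, so most players sit below the "average." That is the pattern to look for when a distribution is right-skewed.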
Finally, we may also become impressed by high means in our first plot, as the scale of that plot shows clear “leaders”. But what happens when we use similar scales for both means and individuals?
Are we as impressed with our means leaders as we once were? There are certainly teams that are ahead, but by how much? What are the similarities and differences between the distributions of means and individuals?
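The same comparison can be sketched numerically. The three teams below are hypothetical (the names and home run totals are invented, not taken from the plots); the point is only to compare how far apart the team means are with how far apart the individual players are.

```python
from statistics import mean

# Hypothetical rosters: team names and values are invented for illustration.
teams = {
    "Team A": [0, 2, 4, 6, 9, 12, 15, 33],
    "Team B": [1, 2, 3, 5, 8, 10, 14, 25],
    "Team C": [0, 1, 3, 4, 7, 9, 11, 21],
}

# One mean per team, as in the first dot plot.
team_means = {name: mean(hrs) for name, hrs in teams.items()}

# Spread of the team means versus spread of all individual players.
means_range = max(team_means.values()) - min(team_means.values())
all_players = [hr for hrs in teams.values() for hr in hrs]
individual_range = max(all_players) - min(all_players)

# How many players on the lowest-mean team out-hit the highest team mean?
top_mean = max(team_means.values())
lowest_team = min(team_means, key=team_means.get)
out_hitters = sum(1 for hr in teams[lowest_team] if hr > top_mean)

print(f"team means: {team_means}")
print(f"range of team means: {means_range}")
print(f"range of individual totals: {individual_range}")
print(f"{out_hitters} players on {lowest_team} beat the top team mean")
```

With these invented numbers, the team means differ by only a few home runs while individual totals span more than thirty, and even the "last place" team has players who beat the leader's mean. Students can swap in real roster data and see whether the same thing happens.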
I hope this demonstrates the rich discussions you can have with your students regarding the application of statistics. Get your students writing about statistics, sharing conclusions, and discussing ideas. The statistical future of our CNN anchors, government officials, and mothers depends upon it!