Drinking the Statistical Power Kool-Aid

For my colleagues who teach AP Stats, there are few phrases more terrifying:

Today I am teaching Power.

Power: a deep statistical concept, but one which often gets moved towards the back of the AP Stats junk drawer.  The only mention of power in the AP Stats course description comes under Tests of Significance:

Logic of significance testing, null and alternative hypotheses; p-values; one- and two-sided tests; concepts of Type I and Type II errors; concept of power

So, students need to understand the concept of power, but not actually compute it (which is itself not an easy task).  Floyd Bullard’s article “On Power” from the AP Central website provides solid starting points for teachers struggling with this concept; specifically, I appreciate his many ways of considering power:

  • Power is the probability of rejecting the null hypothesis when in fact it is false.
  • Power is the probability of making a correct decision (to reject the null hypothesis) when the null hypothesis is false.
  • Power is the probability that a test of significance will pick up on an effect that is present.
  • Power is the probability that a test of significance will detect a deviation from the null hypothesis, should such a deviation exist.
  • Power is the probability of avoiding a Type II error.

This year, I tried an activity which used the third bullet above, picking up on effects, as a basis for making decisions.

HEY KOOL-AID MAN!

Arriving at school early, I got to work making 3 batches of Kool Aid.  During class, all students would receive samples of the 3 juices to try.  Students were not told about the task beforehand, or where this was headed. Up to now, we had discussed Type I and Type II errors, so this served as a transition to the next idea.

THE BASELINE SAMPLE:

All students received cups and as they worked on a practice problem I circulated, serving tasty Kool Aid – don’t forget to tip your server!  I told students to savor the juice, but to pay attention: I promised them that this first batch was made using strict Kool Aid instructions.  Think about the taste of the juice.

SAMPLE A:

Next, students received a drink from “Sample A”.  Their job – to assess if this new sample was made using LESS drink mix than the baseline batch.  Also, I varied the amounts of juice students received: while some students were poured full cups, some received just a few dribbles.  To collect responses, all students approached the board to contribute a point to a Sample A scatterplot, using the following criteria:

Sample size: how much juice you were given

Evidence: how much evidence do you feel you have to support our alternate hypothesis – that Sample A was made with LESS mix than the baseline?

As you can see, the responses were all over the place – a mixture of “we’re not quite sure” to “these are strange directions” to “I just don’t trust Lochel – something’s up”.  But the table has been set for the next sample.

Sample A: it was made with just a smidge less mix than the baseline.  So I wasn’t totally surprised to see dots all over.

SAMPLE B:

I poured drinks again from this new sample, and again varied the sample sizes.  I asked all students to think about their evidence in favor of the alternate, and wait until everyone tasted their juice before submitting a dot.

And check out those results!  Except for a few kids (who admitted they stink at telling apart tastes), we have universal support in favor of the alternate hypothesis.

Sample B: this was made with 1/2 the suggested amount of drink mix.  Much weaker!

FOLLOW-UP DISCUSSION:

This activity made the discussion of power much more natural.  In particular, what could occur during a study which would make it more likely to reject the null hypothesis, if it deserves rejecting?

Larger sample size: smaller samples make it tough to detect differences

Effect size: how far away from the null is the “truth”?  If the “truth” is just a bit less than the null, it could be difficult to detect this effect.
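Both ideas can be checked with a quick simulation. The sketch below is not part of the original lesson: it assumes a one-sided z-test with a known standard deviation, and the null mean of 100, the standard deviation of 15, and the function name `estimate_power` are all made up for illustration.

```python
import random
from statistics import NormalDist, mean

def estimate_power(true_mean, n, null_mean=100.0, sd=15.0, alpha=0.05,
                   trials=2000, seed=1):
    """Estimate by simulation the power of a one-sided z-test (Ha: mean < null_mean)."""
    rng = random.Random(seed)
    # Reject the null when the z statistic falls below this critical value.
    z_crit = NormalDist().inv_cdf(alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        z = (mean(sample) - null_mean) / (sd / n ** 0.5)
        if z < z_crit:
            rejections += 1
    # Fraction of simulated studies that rejected the null.
    return rejections / trials

if __name__ == "__main__":
    for true_mean in (97.0, 85.0):   # hypothetical small effect vs. large effect
        for n in (5, 50):            # a few dribbles vs. a full cup
            print(true_mean, n, estimate_power(true_mean, n))
```

Under these assumptions, the estimated power should rise when either the sample size or the distance between the truth and the null grows, which mirrors what the Sample A and Sample B scatterplots suggested.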

In terms of AP Stats “concepts of power”, this covers much of what we need.  Next, I used an applet to walk students through examples and show power as a probability.  And like most years, this was met with googly eyes by many, but the foundation of conditions which would be ripe for rejecting the null was built, and I was happy with this day!

Suggested reading: Statistics Done Wrong by Alex Reinhart contains compelling, clear examples for teachers who look to lead discussions regarding P-value and Power.  I recommend it highly!

4 Activity Builder Formative Assessment Ideas

The creative team at Desmos continues to develop engaging lessons using their Activity Builder interface, found at teacher.desmos.com. While teachers I encounter have their own favorite activity, many desire to dive in and create their own. But building your own activity, testing it, and hoping it works with your class can be an intimidating task (pro-tip: making your own activity is really hard!).  But there are a few simple ways teachers can use Activity Builder as a mechanism for formative assessment.  Here, I share 4 quick and easy ideas – you can check them out and observe their structure at this link.

SELF-CHECKING GRAPH MATCHING

I used this often with my Pre-Calculus class in the fall, and the concept works equally well with younger students.  Simply start a new Activity Builder screen, and enter the equation you’d like students to provide.  Place the equation in a folder, which you can hide so students won’t see it when they encounter the screen.  Finally, by making the graph with dashed lines, students can easily see if their submission matches the requested graph, and can adjust accordingly.

GALLERY WALK

Here’s a neat Activity Builder hack you may not know about.  If you have an existing Desmos graph, copy the URL from your graph to the clipboard.  Then, in an Activity Builder screen click the “Graph” button and paste the URL into the first expression line – and PRESTO, the graph is imported into an Activity Builder screen.  I often collect student work by simply having them submit a Desmos URL.  Consider taking samples of student work and creating a virtual gallery walk.  Let students view each other’s ideas, comment and make suggestions. Thanks to my colleague DJ for providing neat student graphs!

SELF-ASSESSMENT SLIDERS…AND OVERLAY

Have students assess their own learning with a moveable point. Provide an “I can…” prompt and let students consider where they fall in the learning progression.  Hold a class-wide discussion of unit skills by anonymizing student names and using the overlay feature to take the class pulse on skills.

MY FAVORITE DISTRACTOR

Activity Builder allows teachers to build their own multiple-choice questions, with the option of having students provide an explanation for the choice they make.  In “My Favorite Distractor”, students select an answer they KNOW is wrong, and explain how they know.  This may not work for many multiple-choice type questions, but consider using this idea in situations where the distractors have clear, interesting rationales for elimination.

Have your own quick formative assessment ideas?  Share them here!

 

Last Week I Refused to Teach Factoring

The students in my Freshman Honors class have certain expectations for how a math class works – a teacher lectures, there’s lots of drill practice, and then a test. Breaking this mold, and causing them to think of themselves as reflective learners, is one of my many missions. So this past week, when confronted with factoring, I simply refused to lecture.

My 9th graders have seen factoring before, but it was back in 7th grade, and it was only a surface treatment. So after a brief opener where we discussed what a “factor” means (both numerically and algebraically), I dropped the bomb –

  • I’ve posted your learning targets online
  • I’ve posted videos, resources and practice problems if you need them
  • I’ve set up online practice if you need it
  • You have a timed quiz on Friday (we started on Tuesday)

And….scene!

Panic….apprehension….incredulous looks….

So, you’re not going to teach us?

Nope.  Now get to work.

Here are some details of what I posted:

LEARNING TARGETS

  • F1: I can identify and factor expressions which involve greatest common factors.
  • F2: I can efficiently factor trinomials of the form ax² + bx + c, where a = 1.
  • F3: I can factor trinomials of the form ax² + bx + c, where a does not equal 1 (or zero).
  • F4: I can identify and factor perfect square trinomials.
  • F5: I can identify and factor “difference of squares” expressions.
  • F6: I can factor expressions which may represent a combination of F1 to F5.
  • F7: I can factor expressions “by parts” (or “by grouping”) when necessary.
  • F8: I can factor expressions which are the sum (or difference) of two cubes.

RESOURCES

Each learning target featured a video – some from Khan Academy, and some from other sources I searched for – but I attempted to provide a variety of methods. Some featured grouping as a primary means, others demonstrated the box method or the diamond.  This was the most important aspect of this learning experience: I wanted students to experience a variety of approaches, evaluate them, and make a personal decision about what worked best for them.  The students did not disappoint.

I also posted other online resources, such as worked examples and flowcharts.  One of my favorite resources – Finding Factors from nrich, was also included. Finally, I created an assignment on DeltaMath for each learning target, and a final jumbled assignment. The end of each day featured an exit ticket quiz and recap, to assess progress and provide “next steps” during the week.

SO WHAT HAPPENED?

Some students latched onto factoring by grouping for every quadratic, and explained their reasoning to their peers.  Many of these same students later in the week found more confidence in their number sense and chose to group only for “tricky” problems. One student was particularly insistent that the box method was the best way to go for all problems. Others found the diamond method helpful – which led to deep conversations about number sense and how to make searches more efficient. And in one fascinating conversation, a student discovered a “trick” he had found online. The group debated the merits of the method, tried some practice…but as nobody in the group could figure out why the method worked, they quickly dismissed it.  Good boys!!!
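For readers who have not met the diamond method, the search at its heart can be sketched in a few lines. This is only an illustration of the usual idea (find two integers whose product is a·c and whose sum is b); the function name `diamond_pair` is made up, and the brute-force loop is not meant to represent how any student actually searched.

```python
def diamond_pair(a, b, c):
    """Return integers (p, q) with p * q == a * c and p + q == b, or None if none exist."""
    product = a * c
    if product == 0:
        # With c == 0 the pair (0, b) always works: 0 * b == 0 and 0 + b == b.
        return (0, b)
    # Try every positive divisor of the product, with both signs.
    for p in range(1, abs(product) + 1):
        if product % p == 0:
            for candidate in (p, -p):
                q = product // candidate
                if candidate + q == b:
                    return (candidate, q)
    return None

print(diamond_pair(1, 5, 6))     # x² + 5x + 6: the pair 2 and 3
print(diamond_pair(6, 11, -10))  # 6x² + 11x - 10: the pair -4 and 15
print(diamond_pair(1, 1, 1))     # x² + x + 1 has no integer pair
```

The pair found for 6x² + 11x - 10 splits the middle term as -4x + 15x, which is exactly the setup for factoring by grouping.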

In the end, the quiz scores were great.  But beyond the scores, I feel confident that the students have made choices about their learning, assessed and revised their thinking, and can move forward using their new tools.

WHAT DID THE STUDENTS THINK?

Today I asked students to reflect upon their learning experience, and provide me feedback.

What was your overall feeling about last week’s learning method?  (1 = “Please never do that again”, 5 = “I loved it – do it more”.)

Describe something you LIKED about last week’s classes, and why you liked it.

  • I liked being able to choose what i wanted to do. I could focus on my weaknesses and do less problems on what i was good at. I also appreciated the practice problems.
  • I liked that if you knew a topic you could move on and didn’t have to wait for someone else or the next day of class.
  • I liked that I could learn and do problems at my own speed.

Describe something you DIDN’T LIKE about last week’s classes, and why you didn’t like it.

  • I did not like that you did not explain how to factor
  • I didn’t have as much instruction from the master of factoring. {note – I suppose this is me?}
  • the teacher wasn’t involved

This last comment intrigues me…and I’m not sure if I should be bothered by it…I don’t think I should be.  In many respects, I feel I worked harder during the classes, as students were all over the place.  But I also realize students don’t see all of this going on around them.  I’ve become intrigued by how I can be less of a teacher and more of a facilitator in my classes, and this was a solid step forward I feel.

Now, off to plan to not lecture tomorrow….

How I Stumbled Into Math Modeling Without Even Realizing It.

We started a unit on counting principles this week in my 9th grade honors class – permutations, combinations – eventually leading to the binomial theorem.  Since my  classes had used Desmos Activity Builder a few times and were familiar with the need to enter a 5-character code to start an activity, I planned to ask the following question as a class opener:

How many different 5-character Desmos Activity Builder codes exist?

[Image: Desmos activity codes]

This problem would have likely met my intended goal of having kids think about the fundamental counting principle in a real-world context.  It also would have taken about 10 minutes of class time, and have been forgotten about by the next day.  It felt like I was missing an opportunity to develop a deeper discussion.  A slight tweak to the question added just the right layer:

Activity codes for Desmos Activity Builder currently have 5 characters, as shown here.  When will Activity Codes need to expand to 6 characters?

And now we have a problem which requires a bit more than a quick calculation.  To start, I asked students to work in their teams to make a list of information they would need to help solve this problem.  This was not easy or comfortable for them – but a preliminary list of questions emerged from group discussions:

  • How many 5-character codes are there?
  • Are codes used less on weekends and summers?
  • Can letters repeat in codes?
  • How many codes a day are used?

This was a good start to set kids in motion to think about how to solve the problem.  I’m hoping they will think about new questions or revise their questions as we go along…the class did not disappoint!

HOW MANY CODES ARE THERE?

As kids worked, clarifying questions came up – some of which I just didn’t know the answer to, and hadn’t really thought about:

Mr. L, are there any zeroes in codes? Kids might confuse them with the letter O.

Mr. L, I don’t see any L’s in the codes?

Excellent observations, and restrictions we need to think about in our calculation. A tweet to the Desmos crew lent some clarity, and added more restrictions!
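Here is a rough sketch of how restrictions change the count under the fundamental counting principle. The actual Desmos rules are not reproduced here: the excluded characters below are hypothetical, and the sketch assumes characters may repeat within a code.

```python
import string

def count_codes(alphabet, length):
    """Fundamental counting principle: one independent choice per position, repeats allowed."""
    return len(set(alphabet)) ** length

# All uppercase letters and digits: 36 symbols.
full = string.ascii_uppercase + string.digits

# Hypothetical exclusions for easily confused characters (illustration only).
excluded = set("0O1IL")
restricted = [ch for ch in full if ch not in excluded]

print(count_codes(full, 5))        # 36**5 = 60466176
print(count_codes(restricted, 5))  # 31**5 = 28629151
```

Dropping just five characters cuts the number of available codes by more than half, which is why the students' questions about zeroes and L's matter so much for the estimate.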

Thanks for the intel, Eli!

HOW MANY CODES PER DAY ARE USED?

This was tricky for my class. To help, I reminded students that when we started the semester, codes were 4 characters.  When did the Desmos 5-character era begin?  A quick scroll through my history (shown here) provides some info. After further interrogation from my class, I shared that Activity Builder started around July of last year with 4-character codes.  Add this to our bucket of helpful info.

[Image: my activity history, showing codes]

SHARING IS CARING

Writing a draft solution was the next task for students.  But instead of turning it in to me immediately, I formed class teams of 3 where students shared their drafts and ideas.  I used this opportunity to build teams of students who I observe don’t often interact or chat.  From here, I gave students another day to think about their explanation – keeping in mind that there are no right answers to this question, only answers we can defend. But it still feels like we are missing a key piece in this problem……

DID WE MISS ANYTHING?

The next morning as students were mingling before the bell, I looked across the room at the laptop of Jacob – one of my more insightful, but also introverted, students:

[Image: Google Trends graph on Jacob’s laptop]

It’s the mother lode!

The google trends graph for student.desmos.  Yes! Yes! Yes!  Stop everything kids, we need to talk!  Jacob – tell us all about this graph. How does this new info factor into our estimates?  What should we do with it?  Is this going to continue?  And with this, I gave the class an extra day to think about their responses, share, and dig deeper.  And while many students simply estimated a growth rate by doubling or tripling their computed rate (this is fine with me), I am getting some responses which far exceed my expectations – like Jacob, who developed a growth function and evaluated integrals (did I mention this is a 9th grade class????)
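One way to fold growth into the estimate is sketched below. This is not Jacob's actual work: it assumes the daily usage rate grows exponentially, r(t) = r0·e^(kt), so that cumulative usage over T days is the integral (r0/k)(e^(kT) - 1), and it solves for the T at which that total reaches the number of available codes. Every number in the usage lines is hypothetical.

```python
import math

def days_until_exhausted(total_codes, rate_today, growth_per_day):
    """Days until cumulative usage reaches total_codes when the daily rate is
    rate_today * exp(growth_per_day * t)."""
    if growth_per_day == 0:
        # Constant rate: simple division.
        return total_codes / rate_today
    # Solve (rate_today / k) * (exp(k * T) - 1) = total_codes for T.
    return math.log(1 + growth_per_day * total_codes / rate_today) / growth_per_day

# Hypothetical inputs: 28 million codes left, 5,000 codes per day today.
print(days_until_exhausted(28_000_000, 5_000, 0))     # no growth
print(days_until_exhausted(28_000_000, 5_000, 0.01))  # rate grows about 1% per day
```

Even a modest daily growth rate shortens the timeline dramatically compared with the constant-rate estimate, which is why the trends graph changed the conversation.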

Yep, this was definitely better than my originally intended problem!

 

 

Seeing Stars with Random Sampling

Adapted from Introduction to Statistical Investigations, AP Version, by Tintle, Chance, Cobb, Rossman, Roy, Swanson and VanderStoep

Before the Thanksgiving break, I started the sampling chapter in AP Statistics.  This is a unit filled with new vocabulary and many, many class activities.  To get students thinking about random sampling, I have used the “famous” Random Rectangles activity (Google it…you’ll find it) and its cousin – Jelly Blubbers. These activities are effective in causing students to think about the importance of choosing a random sample from a population, and considering communication of procedures. But a new activity I first heard about at a summer session on simulation-based inference, and later explained by Ruth Carver at a recent PASTA meeting, has added some welcome wrinkles to this unit.  The unit uses the one-variable sampling applet from the Rossman-Chance applet collection, and is ideal for 1-1 classrooms, or even students working in tech teams.  Also, Beth Chance is wonderful…and you should all know that!

In my classroom notes, students first encounter the “sky”, which has been broken into 100 squares. To start, teams work to define procedures for selecting a random sample of 10 squares, using both the “hat” (non-technology) method, and a method using technology (usually a graphing calculator). Before we draw the samples however, I want students to think about the population – specifically, will a random sample do a “good job” with providing estimates? Groups were asked to discuss what they notice about the sky.  My classes immediately sensed something worth noting:

There are some squares where there are many stars (we end up calling these “dense” squares) and some where there are not so many.

Before we even drew our first sample, we were talking about the need to consider both dense and non-dense areas in our sample, and the possibility that our sample will overestimate or underestimate the population, even in random sampling.  There’s a lot of stats goodness in all of this, and the conversation felt natural and accessible to the students.

Students then used their technology-based procedure to actually draw a random sample of 10 squares, marking off the squares.  But counting the actual stars is not reasonable, given their quantity – so it’s Beth Chance to the rescue!  Make sure you click the “stars” population to get started.  Beth has provided the number of stars in each square, and information regarding density, row and column to think about later.

But before we start clicking blindly, let’s describe that population.   The class quickly agrees that we have a skewed-right distribution, and take note of the population mean – we’ll need it to discuss bias later.

Click “show sampling options” on the top of the screen and we can now simulate random samples.  First, students each drew a sample of size 10 – the bottom of the screen shows the sample, summary statistics, and a visual of the 10 squares chosen from the population.

Groups were asked to look at their sample means, share them with neighbors, and think about how close these samples generally come to hitting their target.  Find a neighbor where few “dense” areas were selected, or where many “dense” squares made the cut: how much confidence do we have in using this procedure to estimate the population mean?

Eventually I unleashed the sampling power of the applet and let students draw more and more samples.  And while a formal discussion of sampling distributions is a few chapters away, we can make observations about the distributions of these sample means.

And I knew the discussion was heading in the right direction when a student observed:

Hey, the population is definitely skewed, but the means are approximately normal.  That’s odd…

Yep, it sure is…and more seeds have been planted for later sampling distribution discussions. But what about those dense and non-dense areas the students noticed earlier?  Sure, our random samples seem to provide an unbiased estimator of the population mean, but can we do better?  This is where Beth’s applet is so wonderful, and where this activity separates itself from Random Rectangles.  On the top of the applet, we can stratify our sample by density, ensuring that an appropriate ratio of dense / non-dense areas (here, 20%) is maintained in the sample.  The applet then uses color to make this distinction clear: here, green dots represent dense-area squares.

[Image: applet output for samples stratified by density]

Finally, note the reduced variability in the distribution from stratified samples, as opposed to random samples. The payoff is here!
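The same comparison can be roughed out away from the applet. The sketch below does not use the applet's star counts: the sky of 20 dense and 80 sparse squares, the star counts in each, and all of the function names are made up for illustration.

```python
import random
from statistics import mean, pstdev

rng = random.Random(42)

# Hypothetical sky: 20 dense squares and 80 sparse squares (not the applet's data).
dense = [rng.randint(40, 80) for _ in range(20)]
sparse = [rng.randint(0, 15) for _ in range(80)]
population = dense + sparse

def srs_mean(n=10):
    """Mean star count from a simple random sample of n squares."""
    return mean(rng.sample(population, n))

def stratified_mean(n_dense=2, n_sparse=8):
    """Mean from a sample that keeps the 20% dense / 80% sparse ratio."""
    return mean(rng.sample(dense, n_dense) + rng.sample(sparse, n_sparse))

def spread(sampler, reps=5000):
    """Center and standard deviation of many simulated sample means."""
    means = [sampler() for _ in range(reps)]
    return mean(means), pstdev(means)

print("population mean:", mean(population))
print("simple random:  ", spread(srs_mean))
print("stratified:     ", spread(stratified_mean))
```

With these made-up counts, both methods should center near the population mean, while the stratified means cluster much more tightly, because every stratified sample is forced to contain exactly two dense squares.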

Later, we will look at samples stratified by row and/or column.  And cluster samples by row or column will also make an appearance.  There’s so much to talk about with this one activity, and I appreciate Ruth and Beth for sharing!

Compute Expected Value, Pass GO, Collect $200

Expected Value – such a great time to talk about games, probability, and decision making!  Today’s lesson started with a Monopoly board in the center of the room. I had populated the “high end” and brown properties with houses and hotels.  Here’s the challenge:

When I play Monopoly, my strategy is often to buy and build on the cheaper properties.  This leaves me somewhat scared when I head towards the “high rent” area if my opponents built there.  It is now my turn to roll the dice.  Taking a look at the board, and assuming that my opponents own all of the houses and hotels you see, what would be the WORST square for me to be on right now?  What would be the BEST square?

For this question, we assumed that my current location is between the B&O and the Short Line Railroads.  The conversation quickly went into overdrive – students debating their ideas, talking about strategy, and also helping explain the scenario to students not as familiar with the game (thankfully, it seems our tech-savvy kids still play Monopoly!).  Many students noted not only the awfulness of landing on Park Place or Boardwalk, but also how some common sums with two dice would make landing on undesirable squares more likely.

ANALYZING THE GAME

After our initial debates, I led students through an analysis, which eventually led to the introduction of Expected Value as a useful statistic to summarize the game.  Students could start on any square they wanted, and I challenged groups to each select a different square to analyze.  Here are the steps we followed.

First, we listed all the possible sums with 2 dice, from 2 to 12.

Next, we listed the Monopoly Board space each die roll would cause us to land on (abbreviated to make it easier).

Next, we looked at the dollar “value” of each space.  For example, landing on Boardwalk with a hotel has a value of -$2,000.  For convenience, we made squares like Chance worth $0.  Luxury Tax is worth -$100.  We agreed to make Railroads worth -$100 as an average.  Landing on Go was our only profitable outcome, worth +$200. Finally, “Go to Jail” was deemed worth $0, mostly out of convenience.

Finally, we listed the probability of each roll from 2 to 12.

Now for the tricky computations.  I moved away from Monopoly for a moment to introduce a basic example to support the computation of expected value.

I roll a die – if it comes out “6” you get 10 Jolly Ranchers, otherwise, you get 1.  What’s the average number of candies I give out each roll?

This was sufficient to develop need for multiplying in our Monopoly table – multiply each value by its probability, find the sum of these and we’ll have something called Expected Value.  For each initial square, students verified their solutions and we shared them on a class Monopoly board.
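The multiply-and-add recipe can be sketched in code. The candy example uses the numbers from the question above; the Monopoly table that follows it is hypothetical (the dollar values per roll are invented for illustration and are not any group's actual table), and the function names are my own.

```python
from fractions import Fraction
from itertools import product

def two_dice_probabilities():
    """Probability of each sum from 2 to 12 with two fair six-sided dice."""
    probs = {}
    for d1, d2 in product(range(1, 7), repeat=2):
        probs[d1 + d2] = probs.get(d1 + d2, 0) + Fraction(1, 36)
    return probs

def expected_value(values, probs):
    """Multiply each outcome's value by its probability, then add the products."""
    return sum(values[outcome] * probs[outcome] for outcome in probs)

# Jolly Rancher warm-up: 10 candies on a 6, otherwise 1.
candy_values = {face: (10 if face == 6 else 1) for face in range(1, 7)}
candy_probs = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(candy_values, candy_probs))  # 5/2, or 2.5 candies per roll

# Hypothetical dollar value of the space reached by each roll.
hypothetical_values = {2: -1500, 3: -100, 4: -2000, 5: 200, 6: 0, 7: 0,
                       8: 0, 9: 0, 10: -100, 11: 0, 12: 0}
print(expected_value(hypothetical_values, two_dice_probabilities()))  # -200
```

Using exact fractions rather than decimals keeps the arithmetic easy to check by hand against a table of probabilities out of 36.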

The meaning of these numbers then held importance in the context of the problem – “I may land on Park Place, I may roll and hit nothing, but on average I will lose $588 from this position”.

HOMEWORK CHALLENGE: since this went so well as a lesson today, I held to the theme in providing an additional assignment:

Imagine my opponent starts on Free Parking.  I own all 3 yellow properties, but can only afford to purchase 8 houses total.  How should I arrange the houses in order to inflict the highest potential damage to my opponent?

I’m looking forward to interesting work when we get back to school!

Note: I discussed my ideas about this topic in a previous post.  Enjoy!

Pulling In To the Station

My school isn’t 1-1 with technology yet, though there are rumblings we will get there next year….or the year after….or 2031…anyway, it’s time to get techy!  My new classroom features 4 computer stations in the back – nice to have, but not super-helpful with classes of about 24 each. Station-model classroom structure has been super-helpful in my pre-calculus class in the first month. Besides the chance for all students to participate in rich technology-based activities, I’ve had the opportunity to carve out valuable small-group time with students.  Here’s an example:

In our first pre-calc unit, we review functions and their shifts, folding in new ideas like the step function, piecewise and even/odd functions.  My objective for the class was for students to consider functions in varied forms.  As students entered class, playing cards were drawn to establish their groupings, so there were 3 groups of 7 or 8.  With 15 minutes on the classroom clock, students started on their first station:

  1. Group 1 gathered in a small group with me in a circle of desks, where we worked through proving functions even or odd, and sketching their graphs.
  2. Group 2 worked at the computer stations on a Desmos Marbleslides featuring quadratic functions, with many students pairing up to work together. If you have never tried a Marbleslides, run and play now – we’ll wait for you to come back…
  3. Group 3 worked out in the courtyard (hey, my new classroom leads outside – which is nice) on a group task involving a piecewise function.

After groups had rotated through all 3 activities, we had time to recap / share and assess our learning over the hour.  Here’s why I need to do this more:

  • The small group station let me touch base with every student, assess strengths, find out what we need to work on, and provide feedback to everyone.
  • Marbleslides is sneaky awesome! When students begin to obsess over function shifts and how to restrict domains and don’t want to peel away from their computer, you know something is going right.
  • Class went fast! It felt like the mixed practice from Make It Stick was now becoming part of my classroom culture.
  • My pre-calc is mostly 11th and 12th graders, who have had a pretty traditional classroom experience in their math lives.  I can sense they appreciate that something different is happening.
  • All students are responsible for their learning.  Even the least-active task, the piecewise function, was used the next class for sharing out and a jumping-off point.