The Normal Approximation to the Binomial Distribution

Math Medic
Dec 2, 2022

Updated: Feb 16

Our students love Normal distribution calculations. Well...not at first. But after some practice, they get really good at them. And because Normal distribution calculations show up over and over again in AP Stats, this process eventually becomes automatic for students. Which is good because we know that most of our significance tests are built upon these calculations, requiring a Normal distribution (or a t-distribution) to arrive at the holy grail of the P-value.

So generally speaking, students jump at the opportunity to use a Normal distribution. Heck, they seem to try to turn every random variable into a Normal distribution, even when nothing in the problem says it is one. So it seems quite reasonable that when we are studying binomial distributions, students see something like this:

And they immediately announce that "It's a Normal distribution!!!"

Well...no, not exactly. It's a binomial distribution. But maybe we could use a Normal distribution to approximate the binomial distribution if that was somehow useful.

So how do we know when it's safe to use the Normal distribution as an approximation for the binomial distribution?

An Example to Illustrate

In this Math Medic lesson, students are told that Mr. Wilcox will be grabbing a handful of 100 Skittles from a large bin in which 20% of them are green (thank goodness they fixed the flavor).

They are asked to do a binomial calculation, P(X ≤ 11), with the help of the calculator.

Next, the students are presented with the graph of the binomial distribution. Hopefully they realize that the calculation of P(X ≤ 11) is really just adding up the areas of the rectangles at X = 11 and below.

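When seeing the graph, students immediately want to go to what they know: Normal distributions. They use a Normal distribution with the same mean and standard deviation as the binomial distribution: mean np = 100(0.2) = 20 and standard deviation √(np(1 − p)) = √(100(0.2)(0.8)) = 4.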

When they compare the answer they got using the Normal distribution (0.0122) to the actual answer from the binomial distribution (0.0126), they are quite satisfied that the Normal distribution calculation is "good enough".
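If you want to check those two numbers without a graphing calculator, here is a minimal Python sketch. It assumes the Normal curve uses the binomial's own mean and standard deviation and applies no continuity correction, which is what reproduces the 0.0122 above. The helper name `binom_cdf` is ours, not something from the lesson.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.20  # 100 Skittles, 20% green


def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial: add up the bars at 0, 1, ..., k."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


mu = n * p                      # mean of the binomial: 20
sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial: 4

exact = binom_cdf(11, n, p)               # binomial answer
approx = NormalDist(mu, sigma).cdf(11)    # Normal answer, no continuity correction

print(f"binomial: {exact:.4f}   Normal: {approx:.4f}")
```

The two printed values should land close to the 0.0126 and 0.0122 quoted above.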

So how do we know when the Normal approximation to the binomial is "good enough"? Well, it depends on how closely the Normal distribution matches the binomial distribution.

Make it Visual

To explain this to students, we use the Normal Approximation to the Binomial applet. The binomial distribution shows as yellow bars and the Normal distribution is shown in red. We move the sliders for n and p to try to find the optimal conditions where the binomial distribution looks most "Normal". Watch this video for an explanation:

In the video, we see that the Normal approximation for the binomial distribution works best when:

The sample size (n) is larger.
The probability of success (p) is closer to 0.5.
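The same two patterns can be seen numerically. The sketch below is one possible way to score the match, not what the applet does: it takes the largest gap between the exact binomial P(X ≤ k) and the Normal P(X ≤ k) over all k, again with no continuity correction. The function name `max_cdf_gap` and the particular n and p values are our own choices for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_cdf_gap(n, p):
    """Largest |binomial P(X <= k) - Normal P(X <= k)| over k = 0..n."""
    normal = NormalDist(n * p, sqrt(n * p * (1 - p)))
    running = 0.0  # running total of binomial probabilities, i.e. P(X <= k)
    gap = 0.0
    for k in range(n + 1):
        running += comb(n, k) * p**k * (1 - p) ** (n - k)
        gap = max(gap, abs(running - normal.cdf(k)))
    return gap


# Same p, growing n: the gap should shrink.
for n in (10, 50, 100):
    print(f"n = {n:3d}, p = 0.50  gap = {max_cdf_gap(n, 0.5):.3f}")

# Same n, p moving away from 0.5: the gap should grow.
for p in (0.5, 0.2, 0.05):
    print(f"n =  20, p = {p:.2f}  gap = {max_cdf_gap(20, p):.3f}")
```

A smaller gap means the red curve is doing a better job of standing in for the yellow bars.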

We combine these two ideas to come up with the Large Counts condition, which says that the expected number of successes and the expected number of failures are both at least 10. In symbols: np ≥ 10 and n(1 − p) ≥ 10.
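Written as code, the condition is just two comparisons. This is a small sketch; the function name `large_counts` and the `threshold` parameter are hypothetical names we made up for the example.

```python
def large_counts(n, p, threshold=10):
    """True when expected successes and expected failures are both at least the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold


# The Skittles example: 20 expected green, 80 expected not green.
print(large_counts(100, 0.20))

# A handful of only 20 Skittles: 4 expected green, so the condition fails.
print(large_counts(20, 0.20))
```

Notice that both n and p matter: a large n can rescue a p far from 0.5, and a p near 0.5 lets a smaller n pass.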

You Will Need This Later!

Later in the course, the binomial distribution will be transformed into the sampling distribution of a sample proportion and the Large Counts condition will stay as the gate for being able to use a Normal distribution. When we get to formal inference (confidence intervals and significance tests), the Large Counts condition is one that we will check before doing any calculations. The reason, of course, is that the calculations require that we have an approximately Normal distribution.