The Normal Distribution
Continuous Random Variables
Recall that in AS Maths you learned about discrete random variables and their probability distributions. For example, we saw that the number of heads resulting from ten coin tosses was a binomially distributed variable where each toss had a probability of success of $\frac{1}{2}$. That is to say, if we let $X$ be the number of heads in ten coin tosses then $X \sim B(10, \frac{1}{2})$. $X$ is a discrete random variable because it can only take on certain values and nothing in between. On the other hand, a continuous random variable can take on any value within a range. Examples include weight and height, although these are often rounded when measured.
Unlike a discrete random variable, the probability of a continuous random variable taking any exact value is 0. We saw that the probability of tossing 4 heads out of 10, i.e. $P(X = 4)$, was 0.2051. However, the probability that a person has a weight of exactly 70 kg is 0. Instead, for continuous variables, we specify a range of values that contains the weight of interest, which makes the probability non-zero.
For any given population, it is possible to collect the weights of everyone, record the frequency of weights occurring in each of a set of intervals, and plot a histogram of relative frequency density. The total area of the bars of such a histogram is 1. Narrowing the interval widths smooths out the histogram until, in the limit of infinitely small interval widths, a probability density function is obtained. The area beneath a probability density function is equal to 1.
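The coin-toss probability quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `binomial_pmf` is our own choice for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 4 heads in 10 tosses of a fair coin.
p_four_heads = binomial_pmf(4, 10, 0.5)
print(round(p_four_heads, 4))  # 0.2051

# A discrete variable only takes the values 0, 1, ..., 10,
# and the probabilities of those values sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(total)
```

For a continuous variable there is no analogue of `binomial_pmf` for a single value: probabilities come from areas over a range instead.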
The Normal Distribution
Weight and height are examples of random variables that are normally distributed. This means that their probability density functions are bell-shaped: they have a smooth peak, they tail off in both directions and they are symmetrical. It can be shown that the probability density function is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ is the mean and $\sigma^2$ is the variance. Recall that the variance is a measure of spread away from the mean, and if $\sigma^2$ is the variance, then $\sigma$ is the standard deviation. If a random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then we write $X \sim N(\mu, \sigma^2)$. For a normally distributed variable, around 68% of the data lies within 1 standard deviation of the mean, around 95% within 2 standard deviations and almost 100% (about 99.7%) within 3 standard deviations.
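The density formula and the 68%, 95%, 99.7% figures can be explored numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the mean of 170 and standard deviation of 8 are hypothetical values chosen purely for illustration, and `normal_pdf` and `within_k_sd` are our own helper names.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal probability density function written out from the formula."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


# Hypothetical example: heights in cm with mean 170 and standard deviation 8.
mu, sigma = 170.0, 8.0
heights = NormalDist(mu, sigma)  # second argument is the standard deviation

# The hand-written formula agrees with the library's pdf.
pdf_difference = abs(normal_pdf(175.0, mu, sigma) - heights.pdf(175.0))
print(pdf_difference < 1e-12)


def within_k_sd(k: float) -> float:
    """Area under the curve within k standard deviations of the mean."""
    return heights.cdf(mu + k * sigma) - heights.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))  # 0.6827, 0.9545, 0.9973
```

The three areas do not depend on the particular mean and standard deviation chosen, which is why the rule can be quoted for every normal distribution.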
Facts about the probability density function for a normally distributed variable:
- For a normal distribution, mean = median = mode and they all occur where the probability density function has its peak. The mode (or infinitely small interval that contains it) occurs with highest probability, i.e. at the peak. Since the probability density function is symmetrical, the mean will be in the middle, as will the median, and so they all occur in the same place.
- The standard deviation determines how wide the probability density function is. A narrow probability density function has a small standard deviation – values remain close to the mean, whereas a wide probability density function has a large standard deviation and the spread of the data is vast.
- Recall that the points of inflection of a curve are the points where the curve changes from convex to concave or vice versa. The probability density function for a normal distribution has inflection points at $x = \mu \pm \sigma$, as can be seen from the first figure.
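The location of the inflection points, one standard deviation either side of the mean, can be verified by differentiating the normal probability density function twice. The working below is a sketch of that calculation, writing $f$ for the density with mean $\mu$ and standard deviation $\sigma$:

```latex
\begin{align*}
f(x)   &= \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \\
f'(x)  &= -\frac{x-\mu}{\sigma^2}\, f(x) \\
f''(x) &= \frac{1}{\sigma^2}\left(\frac{(x-\mu)^2}{\sigma^2} - 1\right) f(x) \\
f''(x) = 0 &\iff (x-\mu)^2 = \sigma^2 \iff x = \mu \pm \sigma
\end{align*}
```

Since $f(x) > 0$ everywhere, the sign of $f''(x)$ changes as $x$ passes through $\mu - \sigma$ and $\mu + \sigma$, so the curve is concave between these two points and convex outside them.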
We mentioned earlier that for $X \sim N(\mu, \sigma^2)$, the probability density function cannot be used to calculate a probability for a single value of $X$, but it can be used to find probabilities of certain ranges, i.e. the area beneath the curve between two $X$-values. There is a function on your calculator that calculates these probabilities using the normal cumulative distribution function – see Example 1. Conversely, given a probability, there is also a function on your calculator to find the $X$-values that correspond to those probabilities using the inverse normal distribution – see Example 2.
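The two calculator functions described here have direct counterparts in Python's `statistics.NormalDist`: `cdf` for the normal cumulative distribution and `inv_cdf` for the inverse normal. The following sketch is not one of the numbered Examples; it assumes a hypothetical weight distribution with mean 70 kg and standard deviation 10 kg.

```python
from statistics import NormalDist

# Hypothetical example: weights in kg with mean 70 and standard deviation 10.
weights = NormalDist(mu=70, sigma=10)

# Normal cumulative distribution: P(60 < X < 80) is the area between 60 and 80.
p_range = weights.cdf(80) - weights.cdf(60)
print(round(p_range, 4))

# A "range" of zero width has zero area, so P(X = 70) = 0.
p_exact = weights.cdf(70) - weights.cdf(70)
print(p_exact)

# Inverse normal: the weight x for which P(X < x) = 0.9.
x_90 = weights.inv_cdf(0.9)
print(round(x_90, 2))
```

Note that `cdf(x)` always gives the area to the left of `x`, so a probability between two values is found by subtracting one cumulative probability from another.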
The Standard Normal Distribution
The standard normal distribution is the normal distribution that has mean 0 and variance 1. We often call a random variable with a standard normal distribution $Z$, and so $Z \sim N(0, 1)$. Probabilities for the standard normal distribution only are included in the Edexcel Formula Booklet (where $P(Z < z)$ is written as $\Phi(z)$), but they can also be calculated on your calculator as in Examples 1 and 2.
Any normal distribution can be converted to a standard normal distribution using the coding formula $Z = \frac{X - \mu}{\sigma}$. This is useful for finding missing statistical values such as $\mu$, $\sigma$ or both, where the table in the Edexcel Formula Booklet must be used – see Example 3. Recall that when we learned about coding in AS Maths, the mean of the coded variable would be $\frac{\mu - \mu}{\sigma} = 0$ and the standard deviation would be $\frac{\sigma}{\sigma} = 1$.
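To illustrate how standardising helps recover an unknown mean and standard deviation, the sketch below solves the pair of equations $x_1 = \mu + z_1\sigma$ and $x_2 = \mu + z_2\sigma$, where $z_1$ and $z_2$ are the standard normal values matching two given probabilities. The function name `find_mu_sigma` and the numbers used are hypothetical, and `inv_cdf` stands in for reading the table in the formula booklet.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, variance 1


def find_mu_sigma(x1: float, p1: float, x2: float, p2: float):
    """Solve P(X < x1) = p1 and P(X < x2) = p2 for mu and sigma."""
    z1 = Z.inv_cdf(p1)  # z-value with P(Z < z1) = p1
    z2 = Z.inv_cdf(p2)  # z-value with P(Z < z2) = p2
    # Subtracting x1 = mu + z1*sigma from x2 = mu + z2*sigma eliminates mu.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma


# Hypothetical data: P(X < 42) = 0.1587 and P(X < 62) = 0.9332.
mu, sigma = find_mu_sigma(42, 0.1587, 62, 0.9332)
print(round(mu, 1), round(sigma, 1))

# Check by standardising: P(X < 62) = P(Z < (62 - mu) / sigma).
check = Z.cdf((62 - mu) / sigma)
print(round(check, 4))
```

The given probabilities are rounded to four decimal places, so the recovered values are close to, but not exactly, whole numbers.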
Approximating Binomial Distributions
We met the discrete binomial distribution in AS Maths and we saw above that histograms plotted against relative frequency density become probability density functions when the interval widths become infinitely small. In a similar way, the probabilities of a binomially distributed random variable $X \sim B(n, p)$ begin to resemble a normal probability density function for large $n$ and $p \approx 0.5$: $n$ must be large enough to smooth out the probabilities and $p$ must be approximately a half so that the distribution is symmetrical.
It can be difficult to calculate binomial probabilities when $n$ is large. For instance, the Edexcel Formula Booklet doesn’t give binomial probabilities for $n > 50$. In these cases, it is desirable to use a normal approximation where we can set $\mu = np$ and $\sigma^2 = np(1-p)$. Note that since a binomial distribution is discrete and a normal distribution is continuous, a continuity correction is required – see Example 4.
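The effect of the continuity correction can be seen by comparing an exact binomial probability with its normal approximation. The sketch below is not Example 4; it assumes a hypothetical $X \sim B(100, 0.5)$ and approximates $P(X \le 55)$ by $P(Y < 55.5)$ where $Y \sim N(np, np(1-p))$. The helper `binomial_cdf` is our own.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical binomial parameters


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Approximating normal distribution: mean np, variance np(1 - p).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

exact = binomial_cdf(55, n, p)
with_correction = approx_dist.cdf(55.5)    # P(X <= 55) becomes P(Y < 55.5)
without_correction = approx_dist.cdf(55)   # ignores the half-unit adjustment

print(round(exact, 4))
print(round(with_correction, 4))
print(round(without_correction, 4))
```

The corrected value lies much closer to the exact probability, because each whole-number outcome $k$ of the binomial corresponds to the interval from $k - 0.5$ to $k + 0.5$ under the normal curve.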
Hypothesis Testing
In AS Maths we learned how to apply hypothesis testing for binomial probabilities. These concepts are extended in A2 Maths where we conduct hypothesis tests for correlation between two variables and we also learn how to conduct hypothesis tests on a sample from a normal distribution.
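If a random variable $X$ is normally distributed such that $X \sim N(\mu, \sigma^2)$, then the mean of any sample taken from the population is also normally distributed with the same mean but with a scaled standard deviation, that is $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ given the observed values $X_1, \dots, X_n$ in the sample and $n$ is the size of the sample. This is a result that follows from finding the variance of a transformed variable, assuming the observations are independent: $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$.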
It is possible to perform hypothesis tests on the mean of a sample to make inferences on the population mean or find critical regions for a sample mean – see Example 5.
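A test on a sample mean can be sketched in a few lines. The numbers below are hypothetical and are not taken from Example 5: we assume a known population standard deviation of 10, a sample of size 25 with mean 73.5, and test $H_0: \mu = 70$ against $H_1: \mu > 70$ at the 5% significance level.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical set-up (all values are assumptions for illustration).
mu_0 = 70.0         # population mean under H0
sigma = 10.0        # known population standard deviation
n = 25              # sample size
sample_mean = 73.5  # observed sample mean
alpha = 0.05        # significance level for a one-tailed test, H1: mu > 70

# Under H0 the sample mean is normally distributed with mean mu_0
# and standard deviation sigma / sqrt(n).
sample_mean_dist = NormalDist(mu_0, sigma / sqrt(n))

# p-value: probability of a sample mean at least as large as the one observed.
p_value = 1 - sample_mean_dist.cdf(sample_mean)
print(round(p_value, 4))

# Critical region: sample means above this value lead to rejecting H0.
critical_value = sample_mean_dist.inv_cdf(1 - alpha)
print(round(critical_value, 2))

reject_h0 = p_value < alpha
print(reject_h0)
```

Comparing the p-value with the significance level and checking whether the sample mean falls in the critical region are two ways of reaching the same decision.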