An average, or measure of central tendency, is a single value that approximates the central value of the numbers in a univariate dataset (a list of single numbers as opposed to a list of pairs, for example). The three most common measures of central tendency (informally called averages) are the mean, median and mode.
The best way to illustrate the differences between the mean, median and mode for non-grouped data is to work through a specific set of numbers. For example, consider the following dataset of five values: 1, 3, 4, 5, 5.
The mean is found by adding all of the data values together and dividing by how many there are. For the set of numbers given above the mean is given by:
$$\frac{1 + 3 + 4 + 5 + 5}{5} = \frac{18}{5} = 3.6$$
The mean of the set is 3.6.
The mean is often denoted $\bar{x}$ and is given by $\bar{x} = \frac{\sum x}{n}$, i.e. the sum of the data points ($x$) divided by how many there are ($n$).
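As a quick check of the arithmetic, the mean can be computed directly in Python. This is only a minimal sketch: the variable names are chosen here for illustration, and the list holds the five values of the example set in ascending order.

```python
from statistics import mean

# The five values of the example set (listed in ascending order).
data = [1, 3, 4, 5, 5]

# Sum of the data points divided by how many there are.
manual_mean = sum(data) / len(data)

# The standard library's mean() should agree with the manual calculation.
library_mean = mean(data)

print(manual_mean)   # 3.6
print(library_mean)  # 3.6
```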
The median is the middle data entry once all of the data values have been put into ascending order:
1, 3, 4, 5, 5
The middle data point is 4 and so the median of the set is 4.
When there is an even number of data entries, the median is the mean of the two middle values; see the example below.
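The procedure for the median can be sketched in Python for both the odd and the even case. The function name `median_of` and the six-value list are hypothetical: the sixth value is added here purely to illustrate the even case and is not part of the example set.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # put the data into ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of entries: take the single middle value.
        return ordered[mid]
    # Even number of entries: take the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = [5, 1, 4, 3, 5]        # the example set, in an arbitrary order
even_example = [1, 3, 4, 5, 5, 6]    # hypothetical: one extra value added

print(median_of(odd_example))   # 4
print(median_of(even_example))  # 4.5
```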
The mode is the most common number in the given set of numbers. For the set above, the most common number is 5, hence, the mode is 5.
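Counting how often each value occurs makes the mode easy to read off. The sketch below uses Python's `collections.Counter`; returning a list of modes is a design choice made here so that ties (several values sharing the highest frequency) are handled as well.

```python
from collections import Counter

data = [1, 3, 4, 5, 5]

# Count how many times each value appears.
counts = Counter(data)

# The highest frequency, and every value that attains it.
highest = max(counts.values())
modes = [value for value, count in counts.items() if count == highest]

print(modes)  # [5]
```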
These averages may also be found in a similar way if the data is given in a frequency table. See the example directly below.
The following grouped dataset is used to demonstrate how to estimate the mean and how to identify the median class and the modal class. The dataset shows the number of people in each age range who work at a particular supermarket:
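An estimate for the mean can be found by approximating the combined age of all employees and dividing by how many there are. Since we don’t know the exact ages, we estimate each person’s age by the midpoint of their interval (see note below). The combined age is then approximately $\sum fm$, where $f$ is the frequency of each interval and $m$ is its midpoint. Hence the mean is estimated as $\frac{\sum fm}{60}$, since there are 60 people in total.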
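The midpoint method can be sketched in Python as follows. Note that the individual frequencies and the number of classes below are invented for illustration and are not the article's table; they have only been chosen to agree with the totals quoted in the text (60 people overall, 25 in the first three intervals and 40 in the first four).

```python
# Hypothetical frequency table (NOT the article's data): inclusive age limits.
classes = [(16, 25), (26, 35), (36, 45), (46, 55), (56, 65), (66, 75)]
frequencies = [5, 8, 12, 15, 12, 8]

def midpoint(lower, upper):
    # Ages in "16-25" run from 16 up to (but not including) 26,
    # so the midpoint is halfway between lower and upper + 1.
    return (lower + (upper + 1)) / 2

total_people = sum(frequencies)

# Approximate combined age: frequency times midpoint, summed over all classes.
combined_age = sum(f * midpoint(lo, hi) for (lo, hi), f in zip(classes, frequencies))

estimated_mean = combined_age / total_people

print(total_people)    # 60
print(combined_age)    # 2910.0
print(estimated_mean)  # 48.5
```

The figure of 48.5 is a consequence of the invented frequencies only; it should not be read as the estimate for the supermarket data.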
The total number of people in the dataset is 60, half of which is 30, and the ages are already in order because the intervals are listed in ascending order. There are 25 people in the first three age intervals and 40 in the first four, so the 30th person falls in the fourth interval. It follows that the median class is 46-55 years.
Interpolation may be used to find a more precise estimate of the median within this class.
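One common form of this is linear interpolation, which assumes that ages are spread evenly across the median class. The sketch below uses only figures from the text: 25 people lie below the median class, the class itself contains 40 - 25 = 15 people, and there are 60 people in total. Taking the class to run from 46 up to 56 (a width of 10) follows the convention in the note on midpoints; the function and parameter names are hypothetical.

```python
def interpolated_median(lower_boundary, class_width,
                        cumulative_before, class_frequency, total):
    """Estimate the median by linear interpolation within the median class."""
    position = total / 2                                   # 30th person of 60
    fraction = (position - cumulative_before) / class_frequency
    return lower_boundary + fraction * class_width

# Median class 46-55: assumed to cover ages from 46 up to (not including) 56.
estimate = interpolated_median(
    lower_boundary=46,
    class_width=10,
    cumulative_before=25,   # people in the first three intervals
    class_frequency=15,     # 40 - 25 people in the median class
    total=60,
)

print(round(estimate, 1))  # 49.3
```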
For grouped data, the mode is no longer a single number; instead, we look for the most common interval or class. This is known as the modal class. In this example, the modal class is also 46-55 years, since the largest number of people fall within this age bracket.
NOTE: the midpoint of the interval 16-25 is 21, for example, since the interval includes all ages right up until the day before someone’s 26th birthday. It therefore runs from 16 up to 26, and its midpoint is (16 + 26) / 2 = 21.
Each of the measures of central tendency has its own advantages and disadvantages. The mean uses all of the data but can be skewed by extreme values. The median avoids this skewing but doesn’t necessarily represent the data well, as it depends only on the middle value (or the two middle values). The mode can be useful if many data points take the same value, but it is not particularly useful when frequencies are low or nearly all the same.
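The effect of an extreme value can be seen with a small experiment. In the sketch below, the value 100 is a hypothetical outlier appended to the example set for illustration only.

```python
from statistics import mean, median

data = [1, 3, 4, 5, 5]
with_outlier = data + [100]   # hypothetical extreme value

# Without the outlier the mean and median are close together.
print(mean(data), median(data))  # 3.6 4

# With the outlier the mean jumps, while the median barely moves.
print(round(mean(with_outlier), 2), median(with_outlier))  # 19.67 4.5
```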
In addition to measures of central tendency that indicate the central value of a dataset, there are also measures of variation that indicate how spread out a dataset is. Standard deviation is the measure of spread that pairs naturally with the mean. The interquartile range, on the other hand, pairs with the median. There is no measure of spread that naturally pairs with the mode.
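Both measures of spread are available in Python's `statistics` module, as the sketch below shows for the example set. Two assumptions are worth flagging: `pstdev` treats the data as a whole population (use `stdev` for a sample), and `quantiles` uses its default "exclusive" method, so the quartiles may differ from those given by the quartile convention taught in a particular course.

```python
from statistics import pstdev, quantiles

data = [1, 3, 4, 5, 5]

# Population standard deviation: the spread measure that pairs with the mean.
std_dev = pstdev(data)

# Quartiles (default "exclusive" method); the interquartile range is Q3 - Q1.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(round(std_dev, 2))  # 1.5
print(iqr)
```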