Maths beginners often find it hard to distinguish between mean, median and mode. There are some simple ways of remembering the differences, but first, let's understand what they are. All three terms are ways of measuring the "central tendency" of a dataset. They each give you a single number that represents the whole dataset. Why do we want to do this?
Why do we need a measure of central tendency?
We often want to have a single figure to help us make sense of large amounts of data. For example, in the health world it is often helpful to use a representative figure for the weight of children at different ages. By using this figure, it is easy to compare a single child's weight against the general weight of children of the same age. This gives you an indication of whether an individual child is putting on weight too quickly or too slowly, each of which might be a cause for concern.
This analysis often needs to be refined further. For instance, the weights of boys and girls differ throughout their development. In this case, it is more useful to divide the group into boys and girls and get a representative figure for each. Similarly, you might want to focus your attention not just on those children who are different from the representative figure, but on those who are very different, because these are the children most at risk. In this case, you could break the dataset down further to get a measure for boys and girls in the lowest and highest weight groups.
Mean, median and mode are simply different ways of measuring this central tendency. Each has advantages and disadvantages depending on the dataset.
The mean
The word "mean" is often used interchangeably with the word "average" but this is misleading, as there are different types of average. Here we use "average" as the umbrella word covering all ways of representing datasets with a single figure. Mean, median and mode are therefore all different forms of "average".
"Mean" is calculated by totalling the sum of all numbers in a dataset and dividing it by the observations (the amount of numbers in a dataset). For example, lets say we have a dataset of 15, 19, 23, 17, 23, 41 and 29. The total of all of these numbers if 167 and the observations are 7. 167 divided by 7 equals 23.9, which is the mean for this dataset.
There are more complex ways of calculating the mean when the data has different properties. Data is often grouped into class intervals and frequencies. For example, the following data measures the time it takes school children to get to school:
- 0 - 10 minutes - 5 children
- 10 - 20 minutes - 17 children
- 20 - 30 minutes - 15 children
- 30 - 40 minutes - 6 children
To calculate the mean in the above instance, you need to work out the mid-point of each interval and then multiply it by the frequency. The mid-points are 5, 15, 25 and 35 minutes, which gives:
- 5 * 5 = 25
- 15 * 17 = 255
- 25 * 15 = 375
- 35 * 6 = 210
You then add the results together (25 + 255 + 375 + 210 = 865). 865 is the estimated total number of minutes taken by all the children. It is an estimate because each child is assumed to take the mid-point time of their interval. The total of the frequencies tells you how many children there are (5 + 17 + 15 + 6 = 43). Therefore, the mean is 865 (total time taken) divided by 43 (total children), which equals approximately 20.1 minutes.
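As a rough sketch, the grouped calculation can be written as a short loop in Python. The way the intervals are stored here (lower bound, upper bound, frequency) is our own choice for illustration.

```python
# (lower bound, upper bound, number of children) for each class interval
intervals = [(0, 10, 5), (10, 20, 17), (20, 30, 15), (30, 40, 6)]

total_minutes = 0
total_children = 0
for lower, upper, frequency in intervals:
    midpoint = (lower + upper) / 2            # 5, 15, 25, 35
    total_minutes += midpoint * frequency     # mid-point times frequency
    total_children += frequency

grouped_mean = total_minutes / total_children
print(round(grouped_mean, 1))  # 20.1
```

The result is about 20.1 minutes, or 20 minutes to the nearest whole minute.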
We may want to attribute more significance to particular data points inside a group. This is called a "weighted mean". In a weighted mean, instead of adding up all the observations and dividing by the number of observations, you first multiply each observation by its weight. You then add up these results and divide by the total of the weights.
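Here is a minimal sketch of a weighted mean in Python. The scores and weights are made-up values for illustration only and do not come from any real dataset. The weighted total is divided by the sum of the weights rather than by the number of observations.

```python
# Hypothetical example: three test scores, where the last test counts double
scores = [60, 70, 80]
weights = [1, 1, 2]

# Multiply each observation by its weight and add the results together
weighted_total = sum(score * weight for score, weight in zip(scores, weights))

# Divide by the total of the weights
weighted_mean = weighted_total / sum(weights)

print(weighted_mean)  # 72.5
```

For comparison, the ordinary mean of these three scores is 70, so giving the highest score extra weight pulls the result upwards.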
There are more complicated ways of calculating the mean. These have not been discussed as they aren't relevant to a beginner's guide.
Problems with the mean
The mean may not be the best way of measuring the central tendency. Consider the following set of numbers: 12, 15, 22, 31, 27, 16, 29, 22, 462. The 9 numbers add up to 636, which gives a mean of approximately 71 when divided by 9. However, 71 is not a good measure of central tendency because the final number of 462 is an outlier: it skews the analysis because it is so much higher than all the other numbers. In this situation, it's better to use another measure of central tendency called the "median".
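To see how much a single outlier can move the mean, the short Python sketch below compares the mean of this dataset with and without 462. Removing the outlier is done here purely for illustration, not as a recommended way of handling the data.

```python
from statistics import mean

data = [12, 15, 22, 31, 27, 16, 29, 22, 462]
without_outlier = [x for x in data if x != 462]

mean_with = mean(data)
mean_without = mean(without_outlier)

print(round(mean_with, 1))  # 70.7
print(mean_without)         # 21.75
```

One unusually large value more than triples the mean, even though the other eight numbers all lie between 12 and 31.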
The median
The "median" is simply the middle point in a dataset. Using the dataset above we have: 12, 15, 22, 31, 27, 16, 29, 22, 462. To find the median, we first need to organise the dataset into ascending order: 12, 15, 16, 22, 22, 27, 29, 31, 462. There are 9 numbers in the dataset, so to find the middle one we ignore the top 4 and the bottom 4. This leaves us with the number 22. For this dataset, 22 is a much better measure of central tendency than the mean.
There is a complication with the median. To show this, let's change some numbers in our dataset: 12, 15, 23, 22, 31, 27, 16, 29, 24, 462. Converting this into ascending order we have the following: 12, 15, 16, 22, 23, 24, 27, 29, 31, 462. Now there are 10 numbers in the dataset, which is an even number. The middle point is now between the fifth number and the sixth number: 23 and 24. We resolve this by adding the two middle numbers together and dividing the result by 2. In this instance, 23 plus 24 equals 47, and 47 divided by 2 equals 23.5. This is the median.
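Both cases, an odd and an even number of observations, can be sketched in Python as follows. The function name `find_median` is our own; Python's standard library offers `statistics.median`, which gives the same results for these datasets.

```python
from statistics import median

def find_median(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)   # put the data into ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: one middle value
        return ordered[middle]
    # Even number of observations: halfway between the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_data = [12, 15, 22, 31, 27, 16, 29, 22, 462]
even_data = [12, 15, 23, 22, 31, 27, 16, 29, 24, 462]

print(find_median(odd_data))   # 22
print(find_median(even_data))  # 23.5
```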
The mode
The "mode" is the most commonly occurring data point within a group of data. Let's take the same set of numbers that we looked at above: 12, 15, 22, 31, 27, 16, 29, 22, 462. Only one number occurs more than once: 22. This is therefore the most commonly occurring number, which makes it the mode. In this case, 22 is a better measure of central tendency than the mean, as it represents the group of numbers much more effectively. It is not distorted by the outlier of 462.
Let's add another number to the dataset: 12, 15, 22, 31, 27, 16, 29, 31, 22, 462. Now we have two numbers occurring twice: 22 and 31. We therefore have two modes. Now we'll change the dataset again: 12, 15, 22, 31, 16, 29, 32, 462. Now all the numbers in the dataset appear only once. They are all equally common (or uncommon). In this case, there is no mode. Obviously, when there is no mode, it can't be used to measure central tendency.
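The three situations above (one mode, two modes and no mode) can be sketched in Python by counting how often each value appears. The function name `find_modes` is our own, and returning an empty list when every value appears only once is a choice made to match the convention used in this article.

```python
from collections import Counter

def find_modes(values):
    """Return a sorted list of the most common values, or [] if there is no mode."""
    if not values:
        return []
    counts = Counter(values)          # how many times each value occurs
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value appears once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(find_modes([12, 15, 22, 31, 27, 16, 29, 22, 462]))      # [22]
print(find_modes([12, 15, 22, 31, 27, 16, 29, 31, 22, 462]))  # [22, 31]
print(find_modes([12, 15, 22, 31, 16, 29, 32, 462]))          # []
```

Note that some tools follow a different convention. For example, Python's `statistics.multimode` returns every value when they are all equally common, rather than an empty list.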
Find out more
If you want to know more about the subject, check out our A-level science tutors at TeachTutti.
This post was updated on 01 Aug, 2023.