Find averages from grouped data
Let’s say you want to find out how many TVs each person in your class has in their house. This is something you can count: they might have 0 TVs, 1 TV, 2 TVs etc.
If you want to find out how many hours they spend watching TV each week, though, that’s not something you can really count in discrete (separate) chunks. What you tend to do instead is give people the option of belonging to a few different groups, e.g.:
Watch 0 hours of TV per week
Watch up to 5 hours of TV per week
Watch more than 5 hours, and up to 10 hours of TV per week
Watch more than 10 hours, and up to 15 hours of TV per week
Etc.
With grouped data like this, you can’t calculate the mean, median and mode directly. This is because, say someone ticked the "more than 10 hours, and up to 15 hours" group, you’ll never know if they watch 11 hours of TV, or 15 hours of TV; you’ll never know exactly how much TV they watch.
There are still some methods for estimating the mean, median and mode, though, and they’re very similar to methods you already know.
The table below shows the amount of TV watched per week by 20 people:
|Hours of TV watched per week (X)||Number of people (f)|
|0 ≤ X < 5||8|
|5 ≤ X < 10||6|
|10 ≤ X < 15||3|
|15 ≤ X < 20||2|
|20 ≤ X < 25||1|
As the data is grouped, we can't find the mean like we did before (i.e. adding up all the numbers then dividing by how many numbers there are) because we don't know the exact values.
In order to get around this problem, we're going to assume the hours of TV watched is equal to the midpoint of that group. So for people in the 0 ≤ X < 5 group, we're going to assume they all watch 2.5 hours, since (0 + 5) / 2 = 2.5.
We then multiply each midpoint by the number of people in that group (i.e. the frequency), which produces the table below:
|Hours of TV watched per week (X)||Frequency (f)||Midpoint (x)||Frequency × Midpoint (fx)|
|0 ≤ X < 5||8||2.5||20|
|5 ≤ X < 10||6||7.5||45|
|10 ≤ X < 15||3||12.5||37.5|
|15 ≤ X < 20||2||17.5||35|
|20 ≤ X < 25||1||22.5||22.5|
To get the estimate of the mean, we divide the sum of (Frequency × Midpoint) by the sum of the frequencies:
Estimate of the mean = ∑fx / ∑f
Here ∑fx = 20 + 45 + 37.5 + 35 + 22.5 = 160 and ∑f = 8 + 6 + 3 + 2 + 1 = 20.
So the estimate of the mean is 160/20 = 8 hours.
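As a quick check of the arithmetic above, here is a minimal Python sketch of the midpoint method. The `classes` list and the `estimate_mean` function are names made up for this illustration; the numbers are copied from the table.

```python
# Each class is (lower bound, upper bound, frequency), copied from the table.
# The layout of this list is an assumption made for the example.
classes = [(0, 5, 8), (5, 10, 6), (10, 15, 3), (15, 20, 2), (20, 25, 1)]


def estimate_mean(classes):
    """Estimate the mean of grouped data as sum(f * midpoint) / sum(f)."""
    # Assume everyone in a class sits at the class midpoint.
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f


print(estimate_mean(classes))  # 8.0
```

Remember this is only an estimate: it depends on the assumption that each person watches the midpoint number of hours for their group.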
The mode, or in the case of grouped data the modal class, is just the group that has the highest frequency. Here that is 0 ≤ X < 5, which has a frequency of 8.
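Picking out the modal class can be sketched in Python as below. The `classes` list and the `modal_class` function are hypothetical names used only for this example, with the frequencies taken from the table.

```python
# Each class is (lower bound, upper bound, frequency), copied from the table.
classes = [(0, 5, 8), (5, 10, 6), (10, 15, 3), (15, 20, 2), (20, 25, 1)]


def modal_class(classes):
    """Return the class with the highest frequency."""
    # If two classes tie, max() returns the first one it meets.
    return max(classes, key=lambda c: c[2])


lo, hi, f = modal_class(classes)
print(f"{lo} <= X < {hi} (frequency {f})")  # 0 <= X < 5 (frequency 8)
```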
The median is the middle value, but since we don't know the exact values when the data is grouped, we will have to settle for the median group. We know the total frequency is 20, so we are looking for the (20+1)/2 = 10.5th value. The first 8 values fall in the 0 ≤ X < 5 group and the 9th to 14th values fall in the 5 ≤ X < 10 group, so the median is in the 5 ≤ X < 10 group.
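One way to find the median group is to keep a running total of the frequencies and stop at the first group where that total reaches the position we want. The sketch below assumes the same (lower bound, upper bound, frequency) layout for the table; `median_class` is a name invented for this example.

```python
# Each class is (lower bound, upper bound, frequency), copied from the table.
classes = [(0, 5, 8), (5, 10, 6), (10, 15, 3), (15, 20, 2), (20, 25, 1)]


def median_class(classes):
    """Return the (lower, upper) bounds of the class containing the median."""
    total = sum(f for _, _, f in classes)
    position = (total + 1) / 2  # the 10.5th value when total is 20
    running = 0  # cumulative frequency so far
    for lo, hi, f in classes:
        running += f
        if running >= position:
            return lo, hi
    raise ValueError("no data in the table")


print(median_class(classes))  # (5, 10)
```

The running total goes 8, then 14, so the 10.5th value is reached in the second group, matching the answer above.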