Let’s say I’m being paid by a company to find out what type of chocolate everyone in the UK prefers. Well, I can’t ask everyone, so I’ll ask a much smaller group of people. That group is called a sample of the population.

Thing is, people are all different. Some are young, some are old, some are men, some women, some are rich, most are not, some have lived here all their lives, some moved here from other countries with different cultures. If I ask only one type of person, 12-year-old white boys for example, and then pretend that the chocolate they prefer is what everyone prefers, my sample would be biased. This means it doesn’t represent the whole population.

To make sure my sample isn’t biased, there are two things I need to do. First, I need to choose a sample size that’s big enough. The UK population is about 60-70 million people living all kinds of different lives, so ten people, for example, will hardly be enough. For a serious professional, even a hundred people might not be enough; you may need to ask as many as a thousand.

The second thing I need is a ‘sampling strategy’, i.e. how am I going to choose the people to ask?

## Stratified sampling

Stratified sampling is often considered the most representative strategy. It involves splitting the population into categories you choose (e.g. age brackets: 0-10, 11-20, 21-30, etc.), finding the proportion of the whole population that falls into each category, and then making sure each category makes up the same proportion of your sample.

For example, if 10% of the 60 million people are aged 11-20, make sure 10% of your 1000-person sample (i.e. 100 people) are also aged 11-20.

Example:

The owner of a health club wants to find out how often members use the club. He collects data about the ages of the 1000 members.

| Age group | Under 25 | 25-40 | 41-50 | Over 50 |
|---|---|---|---|---|
| Population size | 340 | 380 | 200 | 80 |

The owner decides to take a stratified sample of size 50.

To calculate how many members of each age group he should choose, first divide the size of the sample (50) by the total number of members in the health club (1000), which gives 50/1000 = 1/20. Then multiply this fraction by the number of members in each age group. For the Under 25s, for instance, 340 × 1/20 = 17. Doing this for every group gives the table below.

| Age group | Under 25 | 25-40 | 41-50 | Over 50 |
|---|---|---|---|---|
| Population size | 340 | 380 | 200 | 80 |
| Sample size | 17 | 19 | 10 | 4 |
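The calculation above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`population`, `sample_size`, `allocation`) are made up for illustration, and the numbers are the ones from the health club example.

```python
# Members in each age group, taken from the health club example.
population = {"Under 25": 340, "25-40": 380, "41-50": 200, "Over 50": 80}
sample_size = 50

total = sum(population.values())  # 1000 members altogether

# Each group's share of the sample is sample_size * group_size / total,
# which is the same as multiplying the group size by 50/1000.
allocation = {
    group: sample_size * count / total
    for group, count in population.items()
}

for group, n in allocation.items():
    print(f"{group}: {n:g}")
```

The four results add up to 50, the size of the sample the owner wanted.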

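However, a group of only 4 Over 50s (out of 80) might give a less accurate picture than the 19 members aged 25-40 (out of 380), so he might choose 5 Over 50s and 18 members aged 25-40 instead, which keeps the total at 50. The same kind of adjustment is useful when the calculation produces fractions, which happens whenever the number in an age group is not a multiple of 20: the figures have to be rounded to whole numbers that still add up to the sample size.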
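When the calculation does produce fractions, one common way to round them is the "largest remainder" rule: round every group down, then hand the leftover places to the groups whose fractions were largest. This is not the method described in the example above, just one possible approach, sketched here in Python. The function name `stratified_allocation` and the group sizes in `hypothetical_club` are invented for illustration.

```python
def stratified_allocation(population, sample_size):
    """Share out sample_size across groups in proportion to their size.

    Uses the largest remainder rule (an assumption of this sketch):
    round each share down, then give the leftover places to the
    groups with the biggest fractional parts.
    """
    total = sum(population.values())
    allocation = {}
    remainders = {}
    for group, count in population.items():
        # divmod keeps everything in whole numbers: the quotient is the
        # rounded-down share, the remainder measures the fractional part.
        allocation[group], remainders[group] = divmod(sample_size * count, total)

    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for group in by_remainder[:leftover]:
        allocation[group] += 1
    return allocation


# Hypothetical group sizes (not from the example) that do not divide evenly.
hypothetical_club = {"Under 25": 330, "25-40": 385, "41-50": 205, "Over 50": 80}
print(stratified_allocation(hypothetical_club, 50))
```

Here the exact shares would be 16.5, 19.25, 10.25 and 4, so rounding down gives 49 people and the one spare place goes to the Under 25s, who had the largest fraction.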

## Follow the links below to see how this topic has appeared in past exam papers

Edexcel June 2010 (H) - Page 19, Question 24

Edexcel November 2012 (H) - Page 24, Question 25
