
# How can you select a sample in such a way that each member of the population has an equal probability of being included?

The short answer is "random sample," but that, unfortunately, is neither specific nor complete. It is not specific because there are forms of random sampling where selection probabilities are not constant. It is not complete because there are many different ways to conduct random sampling with equal selection probabilities.

"Simple random sampling" occurs when you can perform a process that, for all practical purposes, behaves like writing down the identifier of each population member on a piece of paper, putting all the pieces into a box, mixing them thoroughly, and pulling out a few of them one by one (without replacing them in the box). Nowadays we use a computer to do this job, because it's faster and more reliable (it is notoriously difficult to mix pieces of paper perfectly randomly). The computer needs a complete list of all the population members: this is called a <i>sampling frame</i>.
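
As a rough illustration of the computer version of the box of paper slips, here is a minimal Python sketch. The frame of 30 student identifiers, the function name `simple_random_sample`, and the seed are all made up for the example; `random.sample` from the standard library draws without replacement.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame, without replacement."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: one identifier per population member.
frame = [f"student_{i:02d}" for i in range(1, 31)]

sample = simple_random_sample(frame, 5, seed=42)
print(sample)
```

Passing a seed is only a convenience for reproducing a draw; leaving it as `None` gives a fresh sample each run.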

Here is an example of random sampling that is not simple but still selects every population member with equal probability. Suppose you want to sample half the students in a classroom of 30. Ask them to line up. Flip a fair coin: if it's heads, pick the first, third, ..., 29th in line. If tails, pick the second, fourth, ..., 30th. Any individual student has a 50% chance of being part of the sample, so each student has an equal probability of being included. However, if you lined up the students boy-girl-boy-girl, etc., the samples themselves wouldn't look very random: each one would be either all boys or all girls. The sample is still random, though, because it's determined by the flip of a coin.
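
The classroom example can be checked by listing both outcomes of the coin flip. The sketch below is only an illustration: the function name `alternating_sample` and the boy-girl line-up are assumptions made for the example.

```python
import random
from fractions import Fraction

def alternating_sample(line, heads):
    """Heads: take the 1st, 3rd, ... in line; tails: take the 2nd, 4th, ..."""
    start = 0 if heads else 1
    return line[start::2]

# Hypothetical line-up of 30 students, alternating boy-girl by position.
students = [(pos, "boy" if pos % 2 == 1 else "girl") for pos in range(1, 31)]

# The only two samples that can ever occur, each with probability 1/2.
outcomes = {
    True: alternating_sample(students, heads=True),
    False: alternating_sample(students, heads=False),
}

# Inclusion probability of each student: total probability of the
# outcomes in which that student is chosen.
inclusion = {
    s: sum(Fraction(1, 2) for chosen in outcomes.values() if s in chosen)
    for s in students
}

# One actual draw, decided by a simulated fair coin.
sample = alternating_sample(students, heads=random.random() < 0.5)
```

Every entry of `inclusion` comes out as 1/2, yet each of the two possible samples contains students of only one gender.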

The example highlights a subtle but important property of a random sample: in many cases, you want the selection of population members to be <b>independent</b>. This means the probability of selecting one member is not affected by which other members are selected. In simple random sampling, independence holds to a close approximation; in the second example (a form of <i>gridded sampling</i>, more commonly called systematic sampling), there is complete dependence: no student can be chosen along with either of their neighbors in line, for instance.
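
To put numbers on this for the classroom of 30, here is a short worked calculation. The notation is introduced only for this illustration: $S$ is the sample of 15 and $i$, $j$ are two different students. Under simple random sampling without replacement:

```latex
\[
P(i \in S) = \frac{15}{30} = \frac{1}{2},
\qquad
P(i \in S \text{ and } j \in S) = \frac{15}{30} \cdot \frac{14}{29} = \frac{7}{29} \approx 0.241
\]
```

That is close to the $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$ that exact independence would give. Under the coin-flip scheme the individual probability is still $\frac{1}{2}$, but the joint probability depends entirely on where the two students stand:

```latex
\[
P(i \in S \text{ and } j \in S) =
\begin{cases}
0 & \text{if one stands in an odd position and the other in an even position} \\
\frac{1}{2} & \text{if both stand in odd positions or both in even positions}
\end{cases}
\]
```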

Simple random sampling is ideal for many purposes but often cannot be carried out in practice because it is not feasible (you might not be able to construct a sampling frame) or costs too much. Often, more complicated procedures, such as <i>hierarchical sampling</i>, are carried out to overcome these limitations. (An example of hierarchical sampling is when an epidemiologist selects a city at random, then selects households at random within the city, then selects children at random within each household to study. Doing it this way can require much less travel than selecting children at random from all over the state.) These procedures might or might not select population members with equal probability. Usually the selection is not independent, either. When the probabilities are unequal, they can be worked out and used (typically through their reciprocals) as <i>weights</i> in statistical analysis of the data. Results can also be adjusted for lack of independence.
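
The following sketch shows one way the selection probabilities in a scheme like the epidemiologist's could be tracked and turned into weights. Everything in it is hypothetical: the city, household, and child names, the function `multistage_sample`, and the choice of one city, two households, and one child per household. It uses the common convention that a sampled member's weight is the reciprocal of its inclusion probability.

```python
import random
from fractions import Fraction

def multistage_sample(cities, n_households, n_children, rng):
    """Pick one city, then households in it, then children in each household.

    Returns a list of (child, inclusion_probability) pairs.
    """
    city = rng.choice(sorted(cities))
    p_city = Fraction(1, len(cities))

    households = cities[city]
    k = min(n_households, len(households))
    p_household = Fraction(k, len(households))

    result = []
    for h in rng.sample(sorted(households), k):
        children = households[h]
        m = min(n_children, len(children))
        p_child = Fraction(m, len(children))
        for child in rng.sample(children, m):
            # A child is included only if every stage selects them.
            result.append((child, p_city * p_household * p_child))
    return result

# Hypothetical population: city -> household -> children.
cities = {
    "Northtown": {"h1": ["Ann", "Bo", "Cy"], "h2": ["Di"]},
    "Southport": {"h3": ["Ed", "Flo"], "h4": ["Gus"],
                  "h5": ["Hal", "Ivy", "Jo", "Kim"]},
}

sample = multistage_sample(cities, n_households=2, n_children=1,
                           rng=random.Random(0))

# Weight each sampled child by the reciprocal of their inclusion probability.
weights = {child: 1 / p for child, p in sample}
```

In this made-up population an only child such as Di is included with probability 1/2, while Ann, who has two siblings, is included with probability 1/6, so the selection probabilities are clearly unequal and the weights differ accordingly.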

A good, readable, non-technical introduction to sampling and simple random samples is the textbook <i>Statistics</i> by Freedman, Pisani, and Purves. Any edition is fine. Steven Thompson's book <i>Sampling</i> discusses dozens of different sampling procedures and explains the theory behind each one.

Copyright © 2020 Multiply Media, LLC. All Rights Reserved. The material on this site can not be reproduced, distributed, transmitted, cached or otherwise used, except with prior written permission of Multiply.