How do we go about selecting elements (be they individuals, organizations, etc.) for a study, once we have decided on a population? In short, how do we go about sampling?
You know by now (if only because of the title of this section) that the two broad types of sampling are non-random and random. Statistics (specifically, inferential statistics) is based on random sampling, so in what follows I focus disproportionately on it. This is not because non-random sampling is not used or isn't useful, not at all! Non-random sampling comprises several perfectly valid and valuable techniques, typically used in qualitative studies. However, these fall outside the scope of this book. As such, I give only a passing overview of non-random sampling, so that you can spot it and differentiate it from random sampling. [1]
With that in mind, I start my lopsided mini-presentation on the topic: non-random sampling first, and random sampling in the next section.
Professors in social science classes sometimes ask students to conduct interviews or administer surveys as part of class assignments. You might have had to do that, or you can just imagine such an assignment: how did (or would) you select your subjects? Most likely you would go with what's most convenient: fellow students in your class, students who happen to be in, say, the cafeteria when you had time to do the assignment, or your closest relatives or friends if you were instructed to choose non-fellow students. All of these ways of sampling are generally classified as non-random (a.k.a. non-probability) sampling.
Non-random sampling techniques typically include convenience sampling (selecting whichever elements are closest or most convenient to you), purposive sampling (sampling with a purpose: selecting only the most useful cases, e.g., the most knowledgeable or information-rich, as judged by the researcher; also called judgment, selective, or subjective sampling), snowball sampling (where a few initially selected participants contact, invite, or recruit others in their respective circles to become participants in the research), and quota sampling (sampling on a specific desired characteristic, e.g., specifically selecting a certain number of men and a certain number of women for a study).
As well, any time the subjects of a study are self-selected (i.e., the study relies on people volunteering to participate), the sampling is considered non-random.
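To make two of these techniques concrete, here is a minimal, purely illustrative Python sketch of convenience and quota sampling. The population, the "cafeteria" subgroup, and the quota sizes are all made-up assumptions for illustration, not data from any real study.

```python
import random

# Hypothetical population of 1,000 students, each with a gender label
# (all numbers here are made up for illustration only).
population = [{"id": i, "gender": random.choice(["man", "woman"])}
              for i in range(1000)]

# Convenience sampling: take whoever happens to be nearby, e.g. the first
# 50 students you run into in the cafeteria. Everyone who is not in the
# cafeteria has zero chance of being selected.
cafeteria_crowd = population[:120]      # assume only these were around
convenience_sample = cafeteria_crowd[:50]

# Quota sampling: keep recruiting until a desired quota is filled,
# e.g. exactly 25 men and 25 women, regardless of how they were reached.
quota = {"man": 25, "woman": 25}
quota_sample = []
for person in population:               # recruit in whatever order is handy
    if quota[person["gender"]] > 0:
        quota_sample.append(person)
        quota[person["gender"]] -= 1
    if all(count == 0 for count in quota.values()):
        break

print(len(convenience_sample), len(quota_sample))
```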
The one defining feature common to all non-random sampling methods relates to the probability of elements being selected for, and included in, the study. If the elements of the population have unequal probabilities of being included in the study, i.e., if some elements are more likely to end up in the study than others, the sampling is called non-random. Non-random samples are in this sense biased: they focus on, and collect information from, some elements more than others.
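One way to see this defining feature is by simulation. The toy sketch below (made-up numbers throughout) repeatedly draws a "cafeteria" convenience sample and a simple random sample from the same small population and counts how often each element ends up selected: under the convenience scheme the inclusion probabilities are wildly unequal (zero for anyone never in the cafeteria), while under random sampling every element's probability comes out roughly equal.

```python
import random
from collections import Counter

population = list(range(100))     # a toy population of 100 elements
n, trials = 10, 10_000            # sample size and number of simulated draws

convenience_counts = Counter()
random_counts = Counter()

for _ in range(trials):
    # Convenience: only elements 0-29 ever show up "in the cafeteria",
    # and we grab 10 of those at hand.
    convenience_counts.update(random.sample(population[:30], n))
    # Simple random sampling: every element has the same chance, n/N.
    random_counts.update(random.sample(population, n))

# Estimated inclusion probabilities for a few elements
for i in (5, 50, 95):
    print(i,
          round(convenience_counts[i] / trials, 3),  # unequal; 0 for 50 and 95
          round(random_counts[i] / trials, 3))       # all close to 10/100 = 0.1
```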
The information about these specific elements might be very useful, but it reflects only the elements from which it was collected. In other words, such information (and studies based on it) is said to have limited generalizability. To the extent that there is a claim to generalizability, that generalizability is assumed (perhaps by assuming the population is so uniform that any sub-group would reflect it).
A word of caution, however: the limited generalizability of non-random sampling techniques should never be taken as somehow detracting from, or invalidating, research that legitimately uses them. To take a prime example, ethnographies usually rely on non-random sampling methods, yet they typically provide a wealth of information and a level of detail that could never be achieved through quantitative survey research alone. Thus, non-random sampling techniques should never be considered inferior to random ones, just different, and serving different purposes.
The purpose of random sampling, then, is to find a way for a sample to truthfully reflect, i.e., to stand in for, the population from which it is taken. This truthful reflection, i.e., generalizability, is no longer assumed (as it is in non-random sampling); rather, it is demonstrated through mathematical means based on probability theory.
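As a preview of why probability theory can back up that claim, here is a small simulation sketch, again with an entirely made-up population rather than any formal treatment: because every element has the same chance of inclusion, the means of repeated simple random samples cluster tightly around the true population mean, which is what allows a sample statistic to stand in for a population parameter.

```python
import random
import statistics

# Toy population: 10,000 made-up values (say, hours studied per week).
random.seed(1)
population = [random.gauss(15, 4) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# Draw many simple random samples of size 100; each element has equal
# probability n/N of being included in any given draw.
sample_means = [statistics.mean(random.sample(population, 100))
                for _ in range(1_000)]

print(round(pop_mean, 2))                        # true population mean
print(round(statistics.mean(sample_means), 2))   # very close to the population mean
print(round(statistics.stdev(sample_means), 2))  # small spread, shrinking as n grows
```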