
Earl Babbie, The Practice of Social Research

Published by dinakan, 2021-08-12 20:20:06

Description: This e-book is for reading purposes only and is not for profit.


Typologies

We conclude this chapter with a short discussion of typology construction and analysis. Recall that indexes and scales are constructed to provide ordinal measures of given variables. We attempt to assign index or scale scores to cases in such a way as to indicate a rising degree of prejudice, religiosity, conservatism, and so forth. In such cases, we're dealing with single dimensions.

Often, however, the researcher wishes to summarize the intersection of two or more variables, thereby creating a set of categories or types (a nominal variable) called a typology. You may, for example, wish to examine the political orientations of newspapers separately in terms of domestic issues and foreign policy. The fourfold presentation in Table 6-4 describes such a typology.

TABLE 6-4
A Political Typology of Newspapers

                        Foreign Policy
Domestic Policy     Conservative    Liberal
Conservative             A             B
Liberal                  C             D

typology: The classification (typically nominal) of observations in terms of their attributes on two or more variables. The classification of newspapers as liberal-urban, liberal-rural, conservative-urban, or conservative-rural would be an example.

Newspapers in cell A of the table are conservative on both foreign policy and domestic policy; those in cell D are liberal on both. Those in cells B and C are conservative on one and liberal on the other.

Frequently, you arrive at a typology in the course of an attempt to construct an index or scale. The items that you felt represented a single variable appear to represent two. We might have been attempting to construct a single index of political orientations for newspapers but discovered, empirically, that foreign and domestic politics had to be kept separate.

In any event, you should be warned against a difficulty inherent in typological analysis. Whenever the typology is used as the independent variable, there will probably be no problem. In the preceding example, you might compute the percentages of newspapers in each cell that normally endorse Democratic candidates; you could then easily examine the effects of both foreign and domestic policies on political endorsements.

It's extremely difficult, however, to analyze a typology as a dependent variable. If you want to discover why newspapers fall into the different cells of the typology, you're in trouble. That becomes apparent when we consider the ways you might construct and read your tables. Assume, for example, that you want to examine the effects of community size on political policies. With a single dimension, you could easily determine the percentages of rural and urban newspapers that were scored conservative and liberal on your index or scale.

With a typology, however, you would have to present the distribution of the urban newspapers in your sample among types A, B, C, and D. Then you would repeat the procedure for the rural ones in the sample and compare the two distributions. Let's suppose that 80 percent of the rural newspapers are scored as type A (conservative on both dimensions), compared with 30 percent of the urban ones. Moreover, suppose that only 5 percent of the rural newspapers are scored as type B (conservative only on domestic issues), compared with 40 percent of the urban ones. It would be incorrect to conclude from an examination of type B that urban newspapers are more conservative on domestic issues than rural ones are, because 85 percent of the rural newspapers, compared with 70 percent of the urban ones, have this characteristic. The relative sparsity of rural newspapers in type B is due to their concentration in type A. It should be apparent that an interpretation of such data would be very difficult for anything other than description.

In reality, you'd probably examine two such dimensions separately, especially if the dependent variable has more categories of responses than the given example does.

Don't think that typologies should always be avoided in social research; often they provide the
most appropriate device for understanding the data. To examine the pro-life orientation in depth, for example, you might create a typology involving both abortion and capital punishment. Libertarianism could be seen in terms of both economic and social permissiveness. You've now been warned, however, against the special difficulties involved in using typologies as dependent variables.

Chapter 6: Indexes, Scales, and Typologies

MAIN POINTS

Introduction
• Single indicators of variables seldom capture all the dimensions of a concept, have sufficiently clear validity to warrant their use, or permit the desired range of variation to allow ordinal rankings. Composite measures, such as scales and indexes, solve these problems by including several indicators of a variable in one summary measure.

Indexes versus Scales
• Although both indexes and scales are intended as ordinal measures of variables, scales typically satisfy this intention better than indexes do.
• Whereas indexes are based on the simple cumulation of indicators of a variable, scales take advantage of any logical or empirical intensity structures that exist among a variable's indicators.

Index Construction
• The principal steps in constructing an index include selecting possible items, examining their empirical relationships, scoring the index, and validating it.
• Criteria of item selection include face validity, unidimensionality, the degree of specificity with which a dimension is to be measured, and the amount of variance provided by the items.
• If different items are indeed indicators of the same variable, then they should be related empirically to one another. In constructing an index, the researcher needs to examine bivariate and multivariate relationships among the items.
• Index scoring involves deciding the desirable range of scores and determining whether items will have equal or different weights.
• There are various techniques that allow items to be used in an index in spite of missing data.
• Item analysis is a type of internal validation, based on the relationship between individual items in the composite measure and the measure itself. External validation refers to the relationships between the composite measure and other indicators of the variable, that is, indicators not included in the measure.

Scale Construction
• Four types of scaling techniques are represented by the Bogardus social distance scale, a device for measuring the varying degrees to which a person would be willing to associate with a given class of people; Thurstone scaling, a technique that uses judges to determine the intensities of different indicators; Likert scaling, a measurement technique based on the use of standardized response categories; and Guttman scaling, a method of discovering and using the empirical intensity structure among several indicators of a given variable. Guttman scaling is probably the most popular scaling technique in social research today.
• The semantic differential is a question format that asks respondents to make ratings that lie between two extremes, such as "very positive" and "very negative."

Typologies
• A typology is a nominal composite measure often used in social research. Typologies may be used effectively as independent variables, but interpretation is difficult when they are used as dependent variables.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

Bogardus social distance scale
external validation
Guttman scale
index
item analysis
Likert scale
scale
semantic differential
Thurstone scale
typology

REVIEW QUESTIONS AND EXERCISES

1. In your own words, describe the difference between an index and a scale.
2. Suppose you wanted to create an index for rating the quality of colleges and universities. Name three data items that might be included in such an index.
3. Make up three questionnaire items that measure attitudes toward nuclear power and that would probably form a Guttman scale.
4. Construct a typology of pro-life attitudes as discussed in the chapter.
5. Economists often use indexes to measure economic variables, such as the cost of living. Go to the Bureau of Labor Statistics (http://www.bls.gov) and find the Consumer Price Index survey. What are some of the dimensions of living costs included in this measure?

ADDITIONAL READINGS

Anderson, Andy B., Alexander Basilevsky, and Derek P. J. Hum. 1983. "Measurement: Theory and Techniques." Pp. 231-87 in Handbook of Survey Research, edited by Peter H. Rossi, James D. Wright, and Andy B. Anderson. New York: Academic Press. The logic of measurement is analyzed in the context of composite measures.

Bobo, Lawrence, and Frederick C. Licari. 1989. "Education and Political Tolerance: Testing the Effects of Cognitive Sophistication and Target Group Affect." Public Opinion Quarterly 53: 285-308. The authors use a variety of techniques for determining how best to measure tolerance toward different groups in society.

Indrayan, A., M. J. Wysocki, A. Chawla, R. Kumar, and N. Singh. 1999. "Three-Decade Trend in Human Development Index in India and Its Major States." Social Indicators Research 46 (1): 91-120. The authors use several human development indexes to compare the status of different states in India.

Lazarsfeld, Paul, Ann Pasanella, and Morris Rosenberg, eds. 1972. Continuities in the Language of Social Research. New York: Free Press. See especially Section 1. An excellent collection of conceptual discussions and concrete illustrations. The construction of composite measures is presented within the more general area of conceptualization and measurement.

McIver, John P., and Edward G. Carmines. 1981. Unidimensional Scaling. Newbury Park, CA: Sage. Here's an excellent way to pursue Thurstone, Likert, and Guttman scaling in further depth.

Miller, Delbert. 1991. Handbook of Research Design and Social Measurement. Newbury Park, CA: Sage. An excellent compilation of frequently used and semistandardized scales. The many illustrations reported in Part 4 of the Miller book may be directly adaptable to studies or at least suggestive of modified measures. Studying the several illustrations, moreover, may also give you a better understanding of the logic of composite measures in general.

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you'll also find a detailed primer on using SPSS.

Online Study Resources

SociologyNow: Research Methods

1. Before you do your final review of the chapter, take the SociologyNow: Research Methods diagnostic quiz to help identify the areas on which you should concentrate. You'll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.
2. As you review, take advantage of the SociologyNow: Research Methods customized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.
3. When you're finished with your review, take the posttest to confirm that you're ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 11TH EDITION

Go to your book's website at http://sociology.wadsworth.com/babbie_practice11e for tools to aid you in studying for your exams. You'll find Tutorial Quizzes with feedback, Internet Exercises, Flashcards, and Chapter Tutorials, as well as Extended Projects, InfoTrac College Edition search terms, Social Research in Cyberspace, GSS Data, Web Links, and primers for using various data-analysis software such as SPSS and NVivo.

WEB LINKS FOR THIS CHAPTER

Please realize that the Internet is an evolving entity, subject to change. Nevertheless, these few websites should be fairly stable. Also, check your book's website for even more Web Links. These websites, current at the time of this book's publication, provide opportunities to learn about indexes, scales, and typologies.

Bureau of Labor Statistics, Measurement Issues in the Consumer Price Index
http://www.bls.gov/cpi/cpigm697.htm
The federal government's Consumer Price Index (CPI) is one of those composite measures that affects many people's lives, determining cost-of-living increases, for example. This site discusses some aspects of the measure.

Arizona State University, Reliability and Validity
http://seamonkey.ed.asu.edu/~alex/teaching/assessment/reliability.html
Here you'll find an extensive discussion of these two aspects of measurement quality.

Thomas O'Connor, "Scales and Indexes"
http://faculty.ncwc.edu/toconnor/308/308lect05.htm
This web page has an excellent discussion of scales and indexes in general, provides illustrative examples, and also gives hot links useful for pursuing the topic.
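
As a supplement to this chapter's typology discussion (Table 6-4), the short Python sketch below tabulates the hypothetical rural and urban newspaper distributions across types A through D and then collapses them onto the domestic-policy dimension. Only the type A and type B percentages come from the chapter's example; the type C and type D percentages are assumptions chosen so that each distribution sums to 100, and the names TYPE_DIMENSIONS, distributions, and percent_conservative are hypothetical, used purely for illustration.

```python
# Cells follow Table 6-4: rows are domestic policy, columns are foreign policy.
# A: conservative on both; B: conservative on domestic issues only;
# C: conservative on foreign policy only; D: liberal on both.
TYPE_DIMENSIONS = {
    "A": {"domestic": "conservative", "foreign": "conservative"},
    "B": {"domestic": "conservative", "foreign": "liberal"},
    "C": {"domestic": "liberal", "foreign": "conservative"},
    "D": {"domestic": "liberal", "foreign": "liberal"},
}

# Percent of newspapers falling in each type.
# A and B are taken from the chapter's example; C and D are ASSUMED
# values, chosen only so that each distribution sums to 100.
distributions = {
    "rural": {"A": 80, "B": 5, "C": 10, "D": 5},
    "urban": {"A": 30, "B": 40, "C": 10, "D": 20},
}


def percent_conservative(distribution, dimension):
    """Collapse the typology onto one dimension by summing matching cells."""
    return sum(
        pct
        for cell, pct in distribution.items()
        if TYPE_DIMENSIONS[cell][dimension] == "conservative"
    )


for community, dist in distributions.items():
    print(
        community,
        "| type B:", dist["B"],
        "| conservative on domestic issues:",
        percent_conservative(dist, "domestic"),
    )
```

Read alone, the type B column suggests that urban papers are the more conservative on domestic issues (40 percent versus 5 percent). The collapsed figures show the reverse, 85 percent of rural papers versus 70 percent of urban ones, which is the interpretive trap the chapter describes when a typology serves as the dependent variable.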

The Logic of Sampling

Introduction
A Brief History of Sampling
    President Alf Landon
    President Thomas E. Dewey
    Two Types of Sampling Methods
Nonprobability Sampling
    Reliance on Available Subjects
    Purposive or Judgmental Sampling
    Snowball Sampling
    Quota Sampling
    Selecting Informants
The Theory and Logic of Probability Sampling
    Conscious and Unconscious Sampling Bias
    Representativeness and Probability of Selection
    Random Selection
    Probability Theory, Sampling Distributions, and Estimates of Sampling Error
Populations and Sampling Frames
    Review of Populations and Sampling Frames
Types of Sampling Designs
    Simple Random Sampling
    Systematic Sampling
    Stratified Sampling
    Implicit Stratification in Systematic Sampling
    Illustration: Sampling University Students
Multistage Cluster Sampling
    Multistage Designs and Sampling Error
    Stratification in Multistage Cluster Sampling
    Probability Proportionate to Size (PPS) Sampling
    Disproportionate Sampling and Weighting
Probability Sampling in Review

SociologyNow: Research Methods
Use this online tool to help you make the grade on your next exam. After reading this chapter, go to the "Online Study Resources" at the end of the chapter for instructions on how to benefit from SociologyNow: Research Methods.

Chapter 7: The Logic of Sampling

Introduction

One of the most visible uses of survey sampling lies in the political polling that is subsequently tested by election results. Whereas some people doubt the accuracy of sample surveys, others complain that political polls take all the suspense out of campaigns by foretelling the result. In recent presidential elections, however, the polls have not removed the suspense.

Going into the 2004 presidential elections, pollsters generally agreed that the election was "too close to call," a repeat of their experience four years earlier. The Roper Center has compiled a list of polls conducted throughout the campaign; Table 7-1 reports those conducted during the few days preceding the election. Despite some variations, the overall picture they present is amazingly consistent and was played out in the election results.

TABLE 7-1
Election Eve Polls Reporting Percent of Population Voting for U.S. Presidential Candidates, 2004

Poll                      Date Begun    Bush    Kerry
Fox/OpinDynamics          Oct 28         50      50
TIPP                      Oct 28         53      47
CBS/NYT                   Oct 28         52      48
ARG                       Oct 28         50      50
ABC                       Oct 28         51      49
Fox/OpinDynamics          Oct 29         49      51
Gallup/CNN/USA            Oct 29         49      51
NBC/WSJ                   Oct 29         51      49
TIPP                      Oct 29         51      49
Harris                    Oct 29         52      48
Democracy Corps           Oct 29         49      51
Harris                    Oct 29         51      49
CBS                       Oct 29         51      49
Fox/OpinDynamics          Oct 30         49      52
TIPP                      Oct 30         51      49
Marist                    Oct 31         50      50
GWU Battleground 2004     Oct 31         52      48
Actual vote               Nov 2          52      48

Source: Poll data adapted from the Roper Center, Election 2004 (http://www.ropercenter.uconn.edu/elect_2004/pres_trial_heats.html). Accessed November 16, 2004. I've apportioned the undecided and other votes according to the percentages saying they were voting for Bush or Kerry.

Now, how many interviews do you suppose it took each of these pollsters to come within a couple of percentage points in estimating the behavior of more than 115 million voters? Often fewer than 2,000! In this chapter, we're going to find out how social researchers can pull off such wizardry.

For another powerful illustration of the potency of sampling, look at this graphic portrayal of President George W. Bush's approval ratings prior to and following the September 11, 2001, terrorist attack on the U.S. (see Figure 7-1). The data reported by several different polling agencies describe the same pattern.

Political polling, like other forms of social research, rests on observations. But neither pollsters nor other social researchers can observe everything that might be relevant to their interests. A critical part of social research, then, is deciding what to observe and what not. If you want to study voters, for example, which voters should you study?

The process of selecting observations is called sampling. Although sampling can mean any procedure for selecting units of observation (for example, interviewing every tenth passerby on a busy street), the key to generalizing from a sample to a larger population is probability sampling, which involves the important idea of random selection.

Much of this chapter is devoted to the logic and skills of probability sampling. This topic is more rigorous and precise than some of the other topics in this book. Whereas social research as a whole is both art and science, sampling leans toward science. Although this subject is somewhat technical, the basic logic of sampling is not difficult to understand. In fact, the logical neatness of this topic can make it easier to comprehend than, say, conceptualization.

Although probability sampling is central to social research today, we'll take some time to examine a variety of nonprobability methods as
well. These methods have their own logic and can provide useful samples for social inquiry.

Before we discuss the two major types of sampling, I'll introduce you to some basic ideas by way of a brief history of sampling. As you'll see, the pollsters who correctly predicted the election cliffhanger of 2000 did so in part because researchers had learned to avoid some pitfalls that earlier pollsters had fallen into.

FIGURE 7-1
Bush Approval: Raw Poll Data. This graph demonstrates how independent polls produce the same picture of reality. This also shows the impact of a national crisis on the president's popularity: in this case, the September 11 terrorist attack and President George W. Bush's popularity.
[Figure: scatterplot of poll-by-poll approval ratings (percent), with points before and after the September 11th attack. Key: ABC/Post, CBS, Harris, Ipsos-Reid, Pew, Fox, NBC/WSJ, AmResGp, Bloomberg, Gallup, IBD/CSM, Newsweek, CNN/Time, Zogby.]
Source: Copyright © 2001, 2002 by drlimerick.com (http://www.pollkatz.homestead.com/files/MyHTML2.gif). All rights reserved.

A Brief History of Sampling

Sampling in social research has developed hand in hand with political polling. This is the case, no doubt, because political polling is one of the few opportunities social researchers have to discover the accuracy of their estimates. On election day, they find out how well or how poorly they did.

President Alf Landon

President Alf Landon? Who's he? Did you sleep through an entire presidency in your U.S. history class? No, but Alf Landon would have been president if a famous poll conducted by the Literary Digest had proved to be accurate. The Literary Digest was a popular newsmagazine published between 1890 and 1938. In 1920, Digest editors mailed postcards to people in six states, asking them whom they were planning to vote for in the presidential campaign between Warren Harding and James
Cox. Names were selected for the poll from telephone directories and automobile registration lists. Based on the postcards sent back, the Digest correctly predicted that Harding would be elected. In the elections that followed, the Literary Digest expanded the size of its poll and made correct predictions in 1924, 1928, and 1932.

In 1936, the Digest conducted its most ambitious poll: Ten million ballots were sent to people listed in telephone directories and on lists of automobile owners. Over two million people responded, giving the Republican contender, Alf Landon, a stunning 57 to 43 percent landslide over the incumbent, President Franklin Roosevelt. The editors modestly cautioned,

    We make no claim to infallibility. We did not coin the phrase "uncanny accuracy" which has been so freely applied to our Polls. We know only too well the limitations of every straw vote, however enormous the sample gathered, however scientific the method. It would be a miracle if every State of the forty-eight behaved on Election Day exactly as forecast by the Poll.
    (Literary Digest 1936a: 6)

Two weeks later, the Digest editors knew the limitations of straw polls even better: The voters gave Roosevelt a second term in office by the largest landslide in history, with 61 percent of the vote. Landon won only 8 electoral votes to Roosevelt's 523.

The editors were puzzled by their unfortunate turn of luck. A part of the problem surely lay in the 22 percent return rate garnered by the poll. The editors asked,

    Why did only one in five voters in Chicago to whom the Digest sent ballots take the trouble to reply? And why was there a preponderance of Republicans in the one-fifth that did reply? ... We were getting better cooperation in what we have always regarded as a public service from Republicans than we were getting from Democrats. Do Republicans live nearer to mailboxes? Do Democrats generally disapprove of straw polls?
    (Literary Digest 1936b: 7)

Actually, there was a better explanation: what is technically called the sampling frame used by the Digest. In this case the sampling frame consisted of telephone subscribers and automobile owners. In the context of 1936, this design selected a disproportionately wealthy sample of the voting population, especially coming on the tail end of the worst economic depression in the nation's history. The sample effectively excluded poor people, and the poor voted predominantly for Roosevelt's New Deal recovery program. The Digest's poll may or may not have correctly represented the voting intentions of telephone subscribers and automobile owners. Unfortunately for the editors, it decidedly did not represent the voting intentions of the population as a whole.

President Thomas E. Dewey

The 1936 election also saw the emergence of a young pollster whose name would become synonymous with public opinion. In contrast to the Literary Digest, George Gallup correctly predicted that Roosevelt would beat Landon. Gallup's success in 1936 hinged on his use of something called quota sampling, which we'll look at more closely later in the chapter. For now, it's enough to know that quota sampling is based on a knowledge of the characteristics of the population being sampled: what proportion are men, what proportion are women, what proportions are of various incomes, ages, and so on. Quota sampling selects people to match a set of these characteristics: the right number of poor, white, rural men; the right number of rich, African American, urban women; and so on. The quotas are based on those variables most relevant to the study. In the case of Gallup's poll, the sample selection was based on levels of income; the selection procedure ensured the right proportion of respondents at each income level.

Gallup and his American Institute of Public Opinion used quota sampling to good effect in 1936, 1940, and 1944, correctly picking the presidential winner each of those years. Then, in 1948, Gallup and most political pollsters suffered the embarrassment of picking Governor Thomas Dewey of New York over the incumbent, President Harry
Truman. The pollsters' embarrassing miscue continued right up to election night. A famous photograph shows a jubilant Truman (whose followers' battle cry was "Give 'em hell, Harry!") holding aloft a newspaper with the banner headline "Dewey Defeats Truman."

Several factors accounted for the pollsters' failure in 1948. First, most pollsters stopped polling in early October despite a steady trend toward Truman during the campaign. In addition, many voters were undecided throughout the campaign, and these went disproportionately for Truman when they stepped into the voting booth.

More important, Gallup's failure rested on the unrepresentativeness of his samples. Quota sampling, which had been effective in earlier years, was Gallup's undoing in 1948. This technique requires that the researcher know something about the total population (of voters in this instance). For national political polls, such information came primarily from census data. By 1948, however, World War II had produced a massive movement from the country to cities, radically changing the character of the U.S. population from what the 1940 census showed, and Gallup relied on 1940 census data. City dwellers, moreover, tended to vote Democratic; hence, the overrepresentation of rural voters in his poll had the effect of underestimating the number of Democratic votes.

Two Types of Sampling Methods

By 1948, some academic researchers had already been experimenting with a form of sampling based on probability theory. This technique involves the selection of a "random sample" from a list containing the names of everyone in the population being sampled. By and large, the probability sampling methods used in 1948 were far more accurate than quota sampling techniques.

Today, probability sampling remains the primary method of selecting large, representative samples for social research, including national political polls. At the same time, probability sampling can be impossible or inappropriate in many research situations. Accordingly, before turning to the logic and techniques of probability sampling, we'll first take a look at techniques for nonprobability sampling and how they're used in social research.

Nonprobability Sampling

Social research is often conducted in situations that do not permit the kinds of probability samples used in large-scale social surveys. Suppose you wanted to study homelessness: There is no list of all homeless individuals, nor are you likely to create such a list. Moreover, as you'll see, there are times when probability sampling wouldn't be appropriate even if it were possible. Many such situations call for nonprobability sampling.

nonprobability sampling: Any technique in which samples are selected in some way not suggested by probability theory. Examples include reliance on available subjects as well as purposive (judgmental), quota, and snowball sampling.

In this section, we'll examine four types of nonprobability sampling: reliance on available subjects, purposive or judgmental sampling, snowball sampling, and quota sampling. We'll conclude with a brief discussion of techniques for obtaining information about social groups through the use of informants.

Reliance on Available Subjects

Relying on available subjects, such as stopping people at a street corner or some other location, is an extremely risky sampling method; even so, it's used all too frequently. Clearly, this method does not permit any control over the representativeness of a sample. It's justified only if the researcher wants to study the characteristics of people passing the sampling point at specified times or if less risky sampling methods are not feasible. Even when this method is justified on grounds of feasibility, researchers must exercise great caution in generalizing from their data. Also, they should alert readers to the risks associated with this method.

University researchers frequently conduct surveys among the students enrolled in large lecture
classes. The ease and frugality of such a method explains its popularity, but it seldom produces data of any general value. It may be useful for pretesting a questionnaire, but such a sampling method should not be used for a study purportedly describing students as a whole.

Consider this report on the sampling design in an examination of knowledge and opinions about nutrition and cancer among medical students and family physicians:

    The fourth-year medical students of the University of Minnesota Medical School in Minneapolis comprised the student population in this study. The physician population consisted of all physicians attending a "Family Practice Review and Update" course sponsored by the University of Minnesota Department of Continuing Medical Education.
    (Cooper-Stephenson and Theologides 1981: 472)

After all is said and done, what will the results of this study represent? They do not provide a meaningful comparison of medical students and family physicians in the United States or even in Minnesota. Who were the physicians who attended the course? We can guess that they were probably more concerned about their continuing education than other physicians were, but we can't say for sure. Although such studies can be the source of useful insights, we must take care not to overgeneralize from them.

Purposive or Judgmental Sampling

Sometimes it's appropriate to select a sample on the basis of knowledge of a population, its elements, and the purpose of the study. This type of sampling is called purposive or judgmental sampling. In the initial design of a questionnaire, for example, you might wish to select the widest variety of respondents to test the broad applicability of questions. Although the study findings would not represent any meaningful population, the test run might effectively uncover any peculiar defects in your questionnaire. This situation would be considered a pretest, however, rather than a final study.

purposive (judgmental) sampling: A type of nonprobability sampling in which the units to be observed are selected on the basis of the researcher's judgment about which ones will be the most useful or representative.

In some instances, you may wish to study a small subset of a larger population in which many members of the subset are easily identified, but the enumeration of them all would be nearly impossible. For example, you might want to study the leadership of a student protest movement; many of the leaders are easily visible, but it would not be feasible to define and sample all the leaders. In studying all or a sample of the most visible leaders, you may collect data sufficient for your purposes.

Or let's say you want to compare left-wing and right-wing students. Because you may not be able to enumerate and sample from all such students, you might decide to sample the memberships of left- and right-leaning groups, such as the Green Party and the Young Americans for Freedom. Although such a sample design would not provide a good description of either left-wing or right-wing students as a whole, it might suffice for general comparative purposes.

Field researchers are often particularly interested in studying deviant cases (cases that don't fit into fairly regular patterns of attitudes and behaviors) in order to improve their understanding of the more regular pattern. For example, you might gain important insights into the nature of school spirit, as exhibited at a pep rally, by interviewing people who did not appear to be caught up in the emotions of the crowd or by interviewing students who did not attend the rally at all. Selecting deviant cases for study is another example of purposive study.

Snowball Sampling

snowball sampling: A nonprobability sampling method often employed in field research whereby each person interviewed may be asked to suggest additional people for interviewing.

Another nonprobability sampling technique, which some consider to be a form of accidental sampling, is called snowball sampling. This procedure is appropriate when the members of a special
Nonprobability Sampling 185

population are difficult to locate, such as homeless individuals, migrant workers, or undocumented immigrants. In snowball sampling, the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide the information needed to locate other members of that population whom they happen to know. "Snowball" refers to the process of accumulation as each located subject suggests other subjects. Because this procedure also results in samples with questionable representativeness, it's used primarily for exploratory purposes.

Suppose you wish to learn a community organization's pattern of recruitment over time. You might begin by interviewing fairly recent recruits, asking them who introduced them to the group. You might then interview the people named, asking them who introduced them to the group. You might then interview those people named, asking, in part, who introduced them. Or, in studying a loosely structured political group, you might ask one of the participants who he or she believes to be the most influential members of the group. You might interview those people and, in the course of the interviews, ask who they believe to be the most influential. In each of these examples, your sample would "snowball" as each of your interviewees suggested other people to interview.

Quota Sampling

Quota sampling is the method that helped George Gallup avoid disaster in 1936 and set up the disaster of 1948. Like probability sampling, quota sampling addresses the issue of representativeness, although the two methods approach the issue quite differently.

quota sampling: A type of nonprobability sampling in which units are selected into a sample on the basis of prespecified characteristics, so that the total sample will have the same distribution of characteristics assumed to exist in the population being studied.

Quota sampling begins with a matrix, or table, describing the characteristics of the target population. Depending on your research purposes, you may need to know what proportion of the population is male and what proportion female as well as what proportions of each gender fall into various age categories, educational levels, ethnic groups, and so forth. In establishing a national quota sample, you might need to know what proportion of the national population is urban, eastern, male, under 25, white, working class, and the like, and all the possible combinations of these attributes.

Once you've created such a matrix and assigned a relative proportion to each cell in the matrix, you proceed to collect data from people having all the characteristics of a given cell. You then assign to all the people in a given cell a weight appropriate to their portion of the total population. When all the sample elements are so weighted, the overall data should provide a reasonable representation of the total population.

Although quota sampling resembles probability sampling, it has several inherent problems. First, the quota frame (the proportions that different cells represent) must be accurate, and it's often difficult to get up-to-date information for this purpose. The Gallup failure to predict Truman as the presidential victor in 1948 was due partly to this problem. Second, the selection of sample elements within a given cell may be biased even though its proportion of the population is accurately estimated. Instructed to interview five people who meet a given, complex set of characteristics, an interviewer may still avoid people living at the top of seven-story walkups, having particularly run-down homes, or owning vicious dogs.

In recent years, attempts have been made to combine probability- and quota-sampling methods, but the effectiveness of this effort remains to be seen. At present, you would be advised to treat quota sampling warily if your purpose is statistical description.

At the same time, the logic of quota sampling can sometimes be applied usefully to a field research project. In the study of a formal group, for example, you might wish to interview both leaders and nonleaders. In studying a student political organization, you might want to interview radical,

moderate, and conservative members of that group. You may be able to achieve sufficient representativeness in such cases by using quota sampling to ensure that you interview both men and women, both younger and older people, and so forth.

Selecting Informants

When field research involves the researcher's attempt to understand some social setting (a juvenile gang or local neighborhood, for example), much of that understanding will come from a collaboration with some members of the group being studied. Whereas social researchers speak of respondents as people who provide information about themselves, allowing the researcher to construct a composite picture of the group those respondents represent, an informant is a member of the group who can talk directly about the group per se.

informant: Someone who is well versed in the social phenomenon that you wish to study and who is willing to tell you what he or she knows about it. Not to be confused with a respondent.

Especially important to anthropologists, informants are important to other social researchers as well. If you wanted to learn about informal social networks in a local public housing project, for example, you would do well to locate individuals who could understand what you were looking for and help you find it.

When Jeffrey Johnson (1990) set out to study a salmon-fishing community in North Carolina, he used several criteria to evaluate potential informants. Did their positions allow them to interact regularly with other members of the camp, for example, or were they isolated? (He found that the carpenter had a wider range of interactions than the boat captain did.) Was their information about the camp pretty much limited to their specific jobs, or did it cover many aspects of the operation? These and other criteria helped determine how useful the potential informants might be.

Usually, you'll want to select informants somewhat typical of the groups you're studying. Otherwise, their observations and opinions may be misleading. Interviewing only physicians will not give you a well-rounded view of how a community medical clinic is working, for example. Along the same lines, an anthropologist who interviews only men in a society where women are sheltered from outsiders will get a biased view. Similarly, although informants fluent in English are convenient for English-speaking researchers from the United States, they do not typify the members of many societies and even many subgroups within English-speaking countries.

Simply because they're the ones willing to work with outside investigators, informants will almost always be somewhat "marginal" or atypical within their group. Sometimes this is obvious. Other times, however, you'll learn about their marginality only in the course of your research.

In Jeffrey Johnson's study, the county agent identified one fisherman who seemed squarely in the mainstream of the community. Moreover, he was cooperative and helpful to Johnson's research. The more Johnson worked with the fisherman, however, the more he found the man to be a marginal member of the fishing community.

First, he was a Yankee in a southern town. Second, he had a pension from the Navy [so he was not seen as a "serious fisherman" by others in the community] .... Third, he was a major Republican activist in a mostly Democratic village. Finally, he kept his boat in an isolated anchorage, far from the community harbor. (1990: 56)

Informants' marginality may not only bias the view you get, but their marginal status may also limit their access (and hence yours) to the different sectors of the community you wish to study.

These comments should give you some sense of the concerns involved in nonprobability sampling, typically used in qualitative research projects. I conclude with the following injunction:

Your overall goal is to collect the richest possible data. Rich data mean, ideally, a wide and diverse range of information collected over a relatively prolonged period of time. Again, ideally, you achieve this through direct, face-to-face

The Theory and Logic of Probability Sampling 187

contact with, and prolonged immersion in, some social location or circumstance. (Lofland and Lofland 1995: 16)

In other words, nonprobability sampling does have its uses, particularly in qualitative research projects. But researchers must take care to acknowledge the limitations of nonprobability sampling, especially regarding accurate and precise representations of populations. This point will become clearer as we discuss the logic and techniques of probability sampling.

As you can see, choosing and using informants can be a tricky business. To see some practical implications of doing so, you can visit the website of Canada's Community Adaptation and Sustainable Livelihoods (CASL) Program: http://iisd.ca/casl/CASLGuide/KeyInformEx.htm.

The Theory and Logic of Probability Sampling

However appropriate to some research purposes, nonprobability sampling methods cannot guarantee that the sample we observed is representative of the whole population. When researchers want precise, statistical descriptions of large populations (for example, the percentage of the population who are unemployed, plan to vote for Candidate X, or feel a rape victim should have the right to an abortion), they turn to probability sampling. All large-scale surveys use probability-sampling methods.

probability sampling: The general term for samples selected in accord with probability theory, typically involving some random-selection mechanism. Specific types of probability sampling include EPSEM, PPS, simple random sampling, and systematic sampling.

Although the application of probability sampling involves some sophisticated use of statistics, the basic logic of probability sampling is not difficult to understand. If all members of a population were identical in all respects (all demographic characteristics, attitudes, experiences, behaviors, and so on), there would be no need for careful sampling procedures. In this extreme case of perfect homogeneity, in fact, any single case would suffice as a sample to study characteristics of the whole population.

In fact, of course, the human beings who compose any real population are quite heterogeneous, varying in many ways. Figure 7-2 offers a simplified illustration of a heterogeneous population: The 100 members of this small population differ by gender and race. We'll use this hypothetical micropopulation to illustrate various aspects of probability sampling.

FIGURE 7-2
A Population of 100 Folks. Typically, sampling aims to reflect the characteristics and dynamics of large populations. For the purpose of some simple illustrations, let's assume our total population only has 100 members.
[Bar chart of the number of people in four groups: white women, African American women, white men, African American men.]

The fundamental idea behind probability sampling is this: To provide useful descriptions of the total population, a sample of individuals from a population must contain essentially the same variations that exist in the population. This isn't as simple as it might seem, however. Let's take a minute to look at some of the ways researchers might go astray. Then, we'll see how probability sampling provides an efficient method for selecting a sample that should adequately reflect variations that exist in the population.

Conscious and Unconscious Sampling Bias

At first glance, it may look as though sampling is pretty straightforward. To select a sample of 100 university students, you might simply interview the first 100 students you find walking around campus. This kind of sampling method is often used by untrained researchers, but it runs a high risk of introducing biases into the samples.

In connection with sampling, bias simply means that those selected are not typical or representative of the larger populations they have been chosen from. This kind of bias does not have to be intentional. In fact, it is virtually inevitable when you pick people by the seat of your pants.

Figure 7-3 illustrates what can happen when researchers simply select people who are convenient for study. Although women are only 50 percent of our micropopulation, those closest to the researcher (in the lower right corner) happen to be 70 percent women, and although the population is 12 percent black, none was selected into the sample.

FIGURE 7-3
A Sample of Convenience: Easy, but Not Representative. Simply selecting and observing those people who are most readily at hand is the simplest method, perhaps, but it's unlikely to provide a sample that accurately reflects the total population.

Beyond the risks inherent in simply studying people who are convenient, other problems can arise. To begin with, the researcher's personal leanings may affect the sample to the point where it does not truly represent the student population. Suppose you're a little intimidated by students who look particularly "cool," feeling they might ridicule your research effort. You might consciously or unconsciously avoid interviewing such people. Or, you might feel that the attitudes of "super-straight-looking" students would be irrelevant to your research purposes and so avoid interviewing them.

Even if you sought to interview a "balanced" group of students, you wouldn't know the exact proportions of different types of students making up such a balance, and you wouldn't always be able to identify the different types just by watching them walk by.

Even if you made a conscientious effort to interview, say, every tenth student entering the university library, you could not be sure of a representative sample, because different types of students visit the library with different frequencies.
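The library example can be made concrete with a small simulation. This is only a sketch under invented assumptions: a student body of 1,000 in which 200 "frequent" visitors enter the library 20 times a term and 800 "occasional" visitors enter twice. None of these numbers come from the text, and the same student may be counted more than once.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical student body: (type of visitor, library visits per term).
students = [("frequent", 20)] * 200 + [("occasional", 2)] * 800

# One entry per library visit during the term, in random arrival order.
entries = [kind for kind, visits in students for _ in range(visits)]
rng.shuffle(entries)

# Interview every tenth person who comes through the door.
interviewed = entries[9::10]

population_share = sum(1 for kind, _ in students if kind == "frequent") / len(students)
sample_share = interviewed.count("frequent") / len(interviewed)

print(f"frequent visitors in the student body: {population_share:.0%}")
print(f"frequent visitors among interviewees:  {sample_share:.0%}")
```

Because frequent visitors account for most of the traffic through the door, they also account for most of the interviews, even though the every-tenth rule itself is applied without favoritism.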

Your sample would overrepresent students who visit the library more often than others do.

Similarly, the "public opinion" call-in polls, in which radio stations or newspapers ask people to call specified telephone numbers to register their opinions, cannot be trusted to represent general populations. At the very least, not everyone in the population will even be aware of the poll. This problem also invalidates polls by magazines and newspapers that publish coupons for readers to complete and mail in. Even among those who are aware of such polls, not all will express an opinion, especially if doing so will cost them a stamp, an envelope, or a telephone charge. Similar considerations apply to polls taken over the Internet.

Ironically, the failure of such polls to represent all opinions equally was inadvertently acknowledged by Phillip Perinelli (1986), a staff manager of AT&T Communications' DIAL-IT 900 Service, which offers a call-in poll facility to organizations. Perinelli attempted to counter criticisms by saying, "The 50-cent charge assures that only interested parties respond and helps assure also that no individual 'stuffs' the ballot box." We cannot determine general public opinion while considering "only interested parties." This excludes those who don't care 50-cents' worth, as well as those who recognize that such polls are not valid. Both types of people may have opinions and may even vote on election day. Perinelli's assertion that the 50-cent charge will prevent ballot stuffing actually means that only those who can afford it will engage in ballot stuffing.

The possibilities for inadvertent sampling bias are endless and not always obvious. Fortunately, many techniques can help us avoid bias.

Representativeness and Probability of Selection

Although the term representativeness has no precise, scientific meaning, it carries a commonsense meaning that makes it useful here. For our purpose, a sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate those same aggregate characteristics in the population. If, for example, the population contains 50 percent women, then a sample must contain "close to" 50 percent women to be representative. Later, we'll discuss "how close" in detail.

representativeness: That quality of a sample of having the same distribution of characteristics as the population from which it was selected. By implication, descriptions and explanations derived from an analysis of the sample may be assumed to represent similar ones in the population. Representativeness is enhanced by probability sampling and provides for generalizability and the use of inferential statistics.

Note that samples need not be representative in all respects; representativeness is limited to those characteristics that are relevant to the substantive interests of the study. However, you may not know in advance which characteristics are relevant.

A basic principle of probability sampling is that a sample will be representative of the population from which it is selected if all members of the population have an equal chance of being selected in the sample. (We'll see shortly that the size of the sample selected also affects the degree of representativeness.) Samples that have this quality are often labeled EPSEM samples (EPSEM stands for "equal probability of selection method"). Later, we'll discuss variations of this principle, which forms the basis of probability sampling.

EPSEM (equal probability of selection method): A sample design in which each member of a population has the same chance of being selected into the sample.

Moving beyond this basic principle, we must realize that samples, even carefully selected EPSEM samples, seldom if ever perfectly represent the populations from which they are drawn. Nevertheless, probability sampling offers two special advantages.

First, probability samples, although never perfectly representative, are typically more representative than other types of samples, because the biases previously discussed are avoided. In practice, a probability sample is more likely than a nonprobability sample to be representative of the population from which it is drawn.

Second, and more important, probability theory permits us to estimate the accuracy or

representativeness of the sample. Conceivably, an uninformed researcher might, through wholly haphazard means, select a sample that nearly perfectly represents the larger population. The odds are against doing so, however, and we would be unable to estimate the likelihood that he or she has achieved representativeness. The probability sampler, on the other hand, can provide an accurate estimate of success or failure. We'll shortly see exactly how this estimate can be achieved.

I've said that probability sampling ensures that samples are representative of the population we wish to study. As we'll see in a moment, probability sampling rests on the use of a random-selection procedure. To develop this idea, though, we need to give more-precise meaning to two important terms: element and population.*

*I would like to acknowledge a debt to Leslie Kish and his excellent textbook Survey Sampling. Although I've modified some of the conventions used by Kish, his presentation is easily the most important source of this discussion.

An element is that unit about which information is collected and that provides the basis of analysis. Typically, in survey research, elements are people or certain types of people. However, other kinds of units can constitute the elements for social research: Families, social clubs, or corporations might be the elements of a study. In a given study, elements are often the same as units of analysis, though the former are used in sample selection and the latter in data analysis.

element: That unit of which a population is comprised and which is selected in a sample. Distinguished from units of analysis, which are used in data analysis.

Up to now we've used the term population to mean the group or collection that we're interested in generalizing about. More formally, a population is the theoretically specified aggregation of study elements. Whereas the vague term Americans might be the target for a study, the delineation of the population would include the definition of the element Americans (for example, citizenship, residence) and the time referent for the study (Americans as of when?). Translating the abstract "adult New Yorkers" into a workable population would require a specification of the age defining adult and the boundaries of New York. Specifying the term college student would include a consideration of full- and part-time students, degree candidates and nondegree candidates, undergraduate and graduate students, and so forth.

population: The theoretically specified aggregation of the elements in a study.

A study population is that aggregation of elements from which the sample is actually selected. As a practical matter, researchers are seldom in a position to guarantee that every element meeting the theoretical definitions laid down actually has a chance of being selected in the sample. Even where lists of elements exist for sampling purposes, the lists are usually somewhat incomplete. Some students are always inadvertently omitted from student rosters. Some telephone subscribers request that their names and numbers be unlisted.

study population: That aggregation of elements from which a sample is actually selected.

Often, researchers decide to limit their study populations more severely than indicated in the preceding examples. National polling firms may limit their national samples to the 48 adjacent states, omitting Alaska and Hawaii for practical reasons. A researcher wishing to sample psychology professors may limit the study population to those in psychology departments, omitting those in other departments. Whenever the population under examination is altered in such fashions, you must make the revisions clear to your readers.

Random Selection

With these definitions in hand, we can define the ultimate purpose of sampling: to select a set of elements from a population in such a way that descriptions of those elements accurately portray the total population from which the elements are selected. Probability sampling enhances the likelihood of accomplishing this aim and also provides

methods for estimating the degree of probable success.

Random selection is the key to this process. In random selection, each element has an equal chance of selection independent of any other event in the selection process. Flipping a coin is the most frequently cited example: Provided that the coin is perfect (that is, not biased in terms of coming up heads or tails), the "selection" of a head or a tail is independent of previous selections of heads or tails. No matter how many heads turn up in a row, the chance that the next flip will produce "heads" is exactly 50-50. Rolling a perfect set of dice is another example.

random selection: A sampling method in which each element has an equal chance of selection independent of any other event in the selection process.

Such images of random selection, although useful, seldom apply directly to sampling methods in social research. More typically, social researchers use tables of random numbers or computer programs that provide a random selection of sampling units. A sampling unit is that element or set of elements considered for selection in some stage of sampling. In Chapter 9, on survey research, we'll see how computers are used to select random telephone numbers for interviewing, a technique called random-digit dialing.

sampling unit: That element or set of elements considered for selection in some stage of sampling.

The reasons for using random selection methods are twofold. First, this procedure serves as a check on conscious or unconscious bias on the part of the researcher. The researcher who selects cases on an intuitive basis might very well select cases that would support his or her research expectations or hypotheses. Random selection erases this danger. More important, random selection offers access to the body of probability theory, which provides the basis for estimating the characteristics of the population as well as estimating the accuracy of samples. Let's now examine probability theory in greater detail.

Probability Theory, Sampling Distributions, and Estimates of Sampling Error

Probability theory is a branch of mathematics that provides the tools researchers need to devise sampling techniques that produce representative samples and to analyze the results of their sampling statistically. More formally, probability theory provides the basis for estimating the parameters of a population. A parameter is the summary description of a given variable in a population. The mean income of all families in a city is a parameter; so is the age distribution of the city's population. When researchers generalize from a sample, they're using sample observations to estimate population parameters. Probability theory enables them to both make these estimates and arrive at a judgment of how likely the estimates will accurately represent the actual parameters in the population. For example, probability theory allows pollsters to infer from a sample of 2,000 voters how a population of 100 million voters is likely to vote, and to specify exactly the probable margin of error of the estimates.

parameter: The summary description of a given variable in a population.

Probability theory accomplishes these seemingly magical feats by way of the concept of sampling distributions. A single sample selected from a population will give an estimate of the population parameter. Other samples would give the same or slightly different estimates. Probability theory tells us about the distribution of estimates that would be produced by a large number of such samples. To see how this works, we'll look at two examples of sampling distributions, beginning with a simple example in which our population consists of just ten cases.

The Sampling Distribution of Ten Cases

Suppose there are ten people in a group, and each has a certain amount of money in his or her pocket. To simplify, let's assume that one person has no money, another has one dollar, another has two dollars, and so forth up to the person with

nine dollars. Figure 7-4 presents the population of ten people.*

*I want to thank Hanan Selvin for suggesting this method of introducing probability sampling.

FIGURE 7-4
A Population of 10 People with $0-$9. Let's simplify matters even more now by imagining a population of only 10 people with differing amounts of money in their pockets, ranging from $0 to $9.

Our task is to determine the average amount of money one person has: specifically, the mean number of dollars. If you simply add up the money shown in Figure 7-4, you'll find that the total is $45, so the mean is $4.50. Our purpose in the rest of this exercise is to estimate that mean without actually observing all ten individuals. We'll do that by selecting random samples from the population and using the means of those samples to estimate the mean of the whole population.

To start, suppose we were to select, at random, a sample of only one person from the ten. Our ten possible samples thus consist of the ten cases shown in Figure 7-4.

The ten dots shown on the graph in Figure 7-5 represent these ten samples. Because we're taking samples of only one, they also represent the "means" we would get as estimates of the population. The distribution of the dots on the graph is called the sampling distribution. Obviously, it wouldn't be a very good idea to select a sample of only one, because the chances are great that we'll miss the true mean of $4.50 by quite a bit.

Now suppose we take a sample of two. As shown in Figure 7-6, increasing the sample size improves our estimations. There are now 45 possible samples: [$0 $1], [$0 $2], ... [$7 $8], [$8 $9]. Moreover, some of those samples produce the same means. For example, [$0 $6], [$1 $5], and [$2 $4] all produce means of $3. In Figure 7-6, the three dots shown above the $3 mean represent those three samples.

Moreover, the 45 samples are not evenly distributed, as they were when the sample size was only one. Rather, they're somewhat clustered around the true value of $4.50. Only two possible samples deviate by as much as $4 from the true value ([$0 $1] and [$8 $9]), whereas five of the samples would give the true estimate of $4.50; another eight samples miss the mark by only 50 cents (plus or minus).
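The counts quoted for samples of two can be checked by listing every possible sample. The following Python sketch does that by brute force; the variable names are hypothetical, and exact fractions are used only so that comparisons with $4.50 are not disturbed by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Pocket money of the ten people: $0, $1, ..., $9.
population = list(range(10))
true_mean = Fraction(sum(population), len(population))  # 45/10, i.e. $4.50

# Every possible sample of two different people, and the mean of each.
samples = list(combinations(population, 2))
means = [Fraction(sum(sample), 2) for sample in samples]

# The sampling distribution: how many samples yield each estimate.
distribution = Counter(means)

exact = distribution[true_mean]
off_by_half = sum(1 for m in means if abs(m - true_mean) == Fraction(1, 2))
off_by_four = [s for s, m in zip(samples, means) if abs(m - true_mean) == 4]

print("possible samples of two:", len(samples))
print("samples with a mean of $3:", distribution[Fraction(3)])
print("samples hitting $4.50 exactly:", exact)
print("samples off by 50 cents:", off_by_half)
print("samples off by $4:", off_by_four)
```

Printing `sorted(distribution.items())` gives the height of each column of dots in the sampling distribution for samples of two.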

FIGURE 7-5
The Sampling Distribution of Samples of 1. In this simple example, the mean amount of money these people have is $4.50 ($45/10). If we picked 10 different samples of 1 person each, our "estimates" of the mean would range all across the board.
[Dot plot: estimate of mean, $0 to $9, sample size = 1; true mean = $4.50.]

FIGURE 7-6
The Sampling Distribution of Samples of 2. By merely increasing our sample size to 2, notice that the possible samples provide somewhat better estimates of the mean. We couldn't get either $0 or $9, and the estimates are beginning to cluster around the true value of the mean: $4.50.
[Dot plot: estimate of mean, $0 to $9, sample size = 2; true mean = $4.50.]

Now suppose we select even larger samples. What do you think that will do to our estimates of the mean? Figure 7-7 presents the sampling distributions of samples of 3, 4, 5, and 6.

The progression of sampling distributions is clear. Every increase in sample size improves the distribution of estimates of the mean. The limiting case in this procedure, of course, is to select a sample of ten. There would be only one possible sample (everyone) and it would give us the true mean of $4.50. As we'll see shortly, this principle applies to actual sampling of meaningful populations. The larger the sample selected, the more accurate it is as an estimation of the population from which it was drawn.

Sampling Distribution and Estimates of Sampling Error

Let's turn now to a more realistic sampling situation involving a much larger population and see how the notion of sampling distribution applies. Assume that we wish to study the student population of State University (SU) to determine the percentage of students who approve or disapprove of a student conduct code proposed by the administration. The study population will be the aggregation of, say, 20,000 students contained in a student roster: the sampling frame. The elements will be the individual students at SU. We'll select a random sample of, say, 100 students for the purposes of estimating the entire student body. The variable under consideration will be attitudes toward the code, a binomial variable: approve and disapprove. (The logic of probability sampling applies to the examination of other types of variables, such as mean income, but the computations are somewhat more complicated. Consequently, this introduction focuses on binomials.)

The horizontal axis of Figure 7-8 presents all possible values of this parameter in the population, from 0 percent to 100 percent approval. The midpoint of the axis (50 percent) represents half the students approving of the code and the other half disapproving.

To choose our sample, we give each student on the student roster a number and select 100 random numbers from a table of random numbers. Then

FIGURE 7-7 The Sampling Distributions of Samples of 3, 4, 5, and 6. As we increase the sample size, the possible samples cluster ever more tightly around the true value of the mean. The chance of extremely inaccurate estimates is reduced at the two ends of the distribution, and the percentage of the samples near the true value keeps increasing.
[Four panels: a. Samples of 3; b. Samples of 4; c. Samples of 5; d. Samples of 6. Each plots the number of samples (vertical axis) against the estimate of the mean, $0 to $9 (horizontal axis), with the true mean of $4.50 marked.]
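The progression shown in Figures 7-5 through 7-7 can be reproduced by brute-force enumeration. What follows is a minimal sketch, assuming (as the $45/10 arithmetic in Figure 7-5 implies) that the ten people hold $0 through $9. The names `sampling_distribution` and `share_within` are illustrative and do not come from the text.

```python
from itertools import combinations
from statistics import mean

# Assumed population: ten people holding $0, $1, ..., $9 (total $45, mean $4.50).
POPULATION = list(range(10))
TRUE_MEAN = mean(POPULATION)  # 4.5


def sampling_distribution(sample_size):
    """Return the mean of every possible sample of the given size."""
    return [mean(sample) for sample in combinations(POPULATION, sample_size)]


def share_within(estimates, tolerance=1.0):
    """Fraction of estimates lying within `tolerance` dollars of the true mean."""
    close = sum(1 for e in estimates if abs(e - TRUE_MEAN) <= tolerance)
    return close / len(estimates)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 6, 10):
        estimates = sampling_distribution(n)
        print(
            f"n={n:2d}  samples={len(estimates):3d}  "
            f"range=${min(estimates):.2f}-${max(estimates):.2f}  "
            f"within $1 of true mean: {share_within(estimates):.0%}"
        )
```

Running the loop shows the range of possible estimates narrowing, and the share of estimates near $4.50 growing, as the sample size rises; with a sample of ten there is a single possible sample, whose mean is the true mean.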

we interview the 100 students whose numbers have been selected and ask for their attitudes toward the student code: whether they approve or disapprove. Suppose this operation gives us 48 students who approve of the code and 52 who disapprove. This summary description of a variable in a sample is called a statistic. We present this statistic by placing a dot on the x axis at the point representing 48 percent.

FIGURE 7-8 Range of Possible Sample Study Results. Shifting to a more realistic example, let's assume that we want to sample student attitudes concerning a proposed conduct code. Let's assume that 50 percent of the whole student body approves and 50 percent disapproves, though the researcher doesn't know that.
[Horizontal axis: percent of students approving of the student code, 0 to 100, with 50 at the midpoint.]

Now let's suppose we select another sample of 100 students in exactly the same fashion and measure their approval or disapproval of the student code. Perhaps 51 students in the second sample approve of the code. We place another dot in the appropriate place on the x axis. Repeating this process once more, we may discover that 52 students in the third sample approve of the code.

Figure 7-9 presents the three different sample statistics representing the percentages of students in each of the three random samples who approved of the student code. The basic rule of random sampling is that such samples drawn from a population give estimates of the parameter that exists in the total population. Each of the random samples, then, gives us an estimate of the percentage of students in the total student body who approve of the student code. Unhappily, however, we have selected three samples and now have three separate estimates.

FIGURE 7-9 Results Produced by Three Hypothetical Studies. Assuming a large student body, let's suppose that we selected three different samples, each of substantial size. We would not necessarily expect those samples to perfectly reflect attitudes in the whole student body, but they should come reasonably close.
[Horizontal axis: percent of students approving of the student code, 0 to 100. Points marked: Sample 1 (48%), Sample 2 (51%), Sample 3 (52%).]

To retrieve ourselves from this problem, let's draw more and more samples of 100 students each, question each of the samples concerning their approval or disapproval of the code, and plot the new sample statistics on our summary graph. In drawing many such samples, we discover that some of the new samples provide duplicate estimates, as in the illustration of ten cases. Figure 7-10 shows the sampling distribution of, say, hundreds of samples. This is often referred to as a normal curve.

Note that by increasing the number of samples selected and interviewed, we've also increased the range of estimates provided by the sampling operation. In one sense we've increased our dilemma in attempting to guess the parameter in the population. Probability theory, however, provides certain important rules regarding the sampling distribution presented in Figure 7-10.

First, if many independent random samples are selected from a population, the sample statistics provided by those samples will be distributed around the population parameter in a known way. Thus, although Figure 7-10 shows a wide range of estimates, more of them are in the vicinity of 50 percent than elsewhere in the graph. Probability theory tells us, then, that the true value is in the vicinity of 50 percent.

Second, probability theory gives us a formula for estimating how closely the sample statistics are clustered around the true value. To put it another way, probability theory enables us to estimate

statistic: The summary description of a variable in a sample, used to estimate a population parameter.

the sampling error: the degree of error to be expected for a given sample design. This formula contains three factors: the parameter, the sample size, and the standard error (a measure of sampling error):

$$ s = \sqrt{\frac{P \times Q}{n}} $$

The symbols P and Q in the formula equal the population parameters for the binomial: If 60 percent of the student body approve of the code and 40 percent disapprove, P and Q are 60 percent and 40 percent, respectively, or 0.6 and 0.4. Note that Q = 1 - P and P = 1 - Q. The symbol n equals the number of cases in each sample, and s is the standard error.

Let's assume that the population parameter in the student example is 50 percent approving of the code and 50 percent disapproving. Recall that we've been selecting samples of 100 cases each. When these numbers are put into the formula, we find that the standard error equals 0.05, or 5 percent.

FIGURE 7-10 The Sampling Distribution. If we were to select a large number of good samples, we would expect them to cluster around the true value (50 percent), but given enough such samples, a few would fall far from the mark.
[Plot: number of samples (vertical axis) against percent of students approving of the student code, 0 to 100 (horizontal axis).]

In probability theory, the standard error is a valuable piece of information because it indicates the extent to which the sample estimates will be distributed around the population parameter. (If you're familiar with the standard deviation in statistics, you may recognize that the standard error, in this case, is the standard deviation of the sampling distribution.) Specifically, probability theory indicates that certain proportions of the sample estimates will fall within specified increments, each equal to one standard error, from the population parameter. Approximately 34 percent (0.3413) of the sample estimates will fall within one standard error increment above the population parameter, and another 34 percent will fall within one standard error below the parameter. In our example, the standard error increment is 5 percent, so we know that 34 percent of our samples will give estimates of student approval between 50 percent (the parameter) and 55 percent (one standard error above); another 34 percent of the samples will give estimates between 50 percent and 45 percent (one standard error below the parameter). Taken together, then, we know that roughly two-thirds (68 percent) of the samples will give estimates within 5 percentage points of the parameter.

Moreover, probability theory dictates that roughly 95 percent of the samples will fall within plus or minus two standard errors of the true value, and about 99.7 percent of the samples will fall within plus

sampling error: The degree of error to be expected in probability sampling. The formula for determining sampling error contains three factors: the parameter, the sample size, and the standard error.
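As a worked illustration of the standard error formula for a binomial, s = sqrt(P × Q / n), the arithmetic for the student example can be written out step by step. The sketch below assumes the values given in the text (P = Q = 0.5, n = 100) and simply adds and subtracts one, two, and three standard errors from the parameter.

```latex
\begin{align*}
s &= \sqrt{\frac{P \times Q}{n}} = \sqrt{\frac{0.5 \times 0.5}{100}} = \sqrt{0.0025} = 0.05 \\
P \pm 1s &= 0.50 \pm 0.05 \;\Rightarrow\; 45\% \text{ to } 55\% \\
P \pm 2s &= 0.50 \pm 0.10 \;\Rightarrow\; 40\% \text{ to } 60\% \\
P \pm 3s &= 0.50 \pm 0.15 \;\Rightarrow\; 35\% \text{ to } 65\%
\end{align*}
```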

or minus three standard errors. In our present example, then, we know that only about three samples out of a thousand would give an estimate lower than 35 percent approval or higher than 65 percent.

The proportion of samples falling within one, two, or three standard errors of the parameter is constant for any random sampling procedure such as the one just described, providing that a large number of samples are selected. The size of the standard error in any given case, however, is a function of the population parameter and the sample size. If we return to the formula for a moment, we note that the standard error will increase as a function of an increase in the quantity P times Q. Note further that this quantity reaches its maximum in the situation of an even split in the population. If P = 0.5, PQ = 0.25; if P = 0.6, PQ = 0.24; if P = 0.8, PQ = 0.16; if P = 0.99, PQ = 0.0099. By extension, if P is either 0.0 or 1.0 (either 0 percent or 100 percent approve of the student code), the standard error will be 0. If everyone in the population has the same attitude (no variation), then every sample will give exactly that estimate.

The standard error is also a function of the sample size: an inverse function. As the sample size increases, the standard error decreases. As the sample size increases, the several samples will be clustered nearer to the true value. Another general guideline is evident in the formula: Because of the square root formula, the standard error is reduced by half if the sample size is quadrupled. In our present example, samples of 100 produce a standard error of 5 percent; to reduce the standard error to 2.5 percent, we must increase the sample size to 400.

All of this information is provided by established probability theory in reference to the selection of large numbers of random samples. (If you've taken a statistics course, you may know this as the Central Limit Theorem.) If the population parameter is known and many random samples are selected, we can predict how many of the sample estimates will fall within specified intervals from the parameter.

Recognize that this discussion illustrates only the logic of probability sampling; it does not describe the way research is actually conducted. Usually, we don't know the parameter: The very reason we conduct a sample survey is to estimate that value. Moreover, we don't actually select large numbers of samples: We select only one sample. Nevertheless, the preceding discussion of probability theory provides the basis for inferences about the typical social research situation. Knowing what it would be like to select thousands of samples allows us to make assumptions about the one sample we do select and study.

Confidence Levels and Confidence Intervals

Whereas probability theory specifies that 68 percent of that fictitious large number of samples would produce estimates falling within one standard error of the parameter, we can turn the logic around and infer that any single random sample estimate has a 68 percent chance of falling within that range. This observation leads us to the two key components of sampling error estimates: confidence level and confidence interval. We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter. For example, we may say we are 95 percent confident that our sample statistics (for example, 50 percent favor the new student code) are within plus or minus 5 percentage points of the population parameter. As the confidence interval is expanded for a given statistic, our confidence increases. For example, we may say that we are 99.7 percent confident that our statistic falls within three standard errors of the true value.

Although we may be confident (at some level) of being within a certain range of the parameter, we've already noted that we seldom know what the parameter is. To resolve this problem, we substitute our sample estimate for the parameter in

confidence level: The estimated probability that a population parameter lies within a given confidence interval. Thus, we might be 95 percent confident that between 35 and 45 percent of all voters favor Candidate A.

confidence interval: The range of values within which a population parameter is estimated to lie.
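The two dependencies described on this page (the standard error rises with P times Q and falls with the square root of the sample size) are easy to check numerically. The following is a minimal sketch rather than a prescribed procedure; it assumes the binomial formula s = sqrt(P × Q / n) used throughout this section, and the function names `standard_error` and `cases_needed` are hypothetical.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion: s = sqrt(P * Q / n), Q = 1 - P."""
    return math.sqrt(p * (1.0 - p) / n)


def cases_needed(p, target_se):
    """Smallest n whose standard error does not exceed target_se.

    Rearranging the formula gives n = P * Q / s**2. The tiny subtraction
    guards against floating-point noise pushing an exact answer (such as
    400) up to the next integer when rounding upward.
    """
    return math.ceil(p * (1.0 - p) / target_se**2 - 1e-9)


if __name__ == "__main__":
    # P * Q peaks at an even split and shrinks toward zero at the extremes.
    for p in (0.5, 0.6, 0.8, 0.99, 1.0):
        print(f"P={p:.2f}  PQ={p * (1 - p):.4f}  s(n=100)={standard_error(p, 100):.4f}")

    # Quadrupling the sample size halves the standard error.
    print(f"s(n=100)={standard_error(0.5, 100):.3f}  s(n=400)={standard_error(0.5, 400):.3f}")
    print("cases needed for s = 0.025:", cases_needed(0.5, 0.025))
```

Because the parameter is rarely known in practice, a cautious choice when planning a study is to plug in P = 0.5, the value at which P times Q (and therefore the required sample size) is largest.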

the formula; that is, lacking the true value, we substitute the best available guess.

The result of these inferences and estimations is that we can estimate a population parameter and also the expected degree of error on the basis of one sample drawn from a population. Beginning with the question "What percentage of the student body approves of the student code?" you could select a random sample of 100 students and interview them. You might then report that your best estimate is that 50 percent of the student body approves of the code and that you are 95 percent confident that between 40 and 60 percent (plus or minus two standard errors) approve. The range from 40 to 60 percent is the confidence interval. (At the 68 percent confidence level, the confidence interval would be 45 to 55 percent.)

The logic of confidence levels and confidence intervals also provides the basis for determining the appropriate sample size for a study. Once you've decided on the degree of sampling error you can tolerate, you'll be able to calculate the number of cases needed in your sample. Thus, for example, if you want to be 95 percent confident that your study findings are accurate within plus or minus 5 percentage points of the population parameters, you should select a sample of at least 400. (Appendix F is a convenient guide in this regard.)

This, then, is the basic logic of probability sampling. Random selection permits the researcher to link findings from a sample to the body of probability theory so as to estimate the accuracy of those findings. All statements of accuracy in sampling must specify both a confidence level and a confidence interval. The researcher must report that he or she is x percent confident that the population parameter is between two specific values. In this example, I've demonstrated the logic of sampling error using a variable analyzed in percentages. A different statistical procedure would be required to calculate the standard error for a mean, for example, but the overall logic is the same.

Notice that nowhere in this discussion of sample size and accuracy of estimates did we consider the size of the population being studied. This is because the population size is almost always irrelevant. A sample of 2,000 respondents drawn properly to represent Vermont voters will be no more accurate than a sample of 2,000 drawn properly to represent all voters in the United States, even though the Vermont sample would be a substantially larger proportion of that small state's voters than would the same number chosen to represent the nation's voters. The reason for this counterintuitive fact is that the equations for calculating sampling error all assume that the populations being sampled are infinitely large, so every sample would equal 0 percent of the whole.

Of course, this is not literally true in practice. However, a sample of 2,000 represents only 0.68 percent of the Vermonters who voted for president in the 2000 election, and a sample of 2,000 U.S. voters represents a mere 0.002 percent of the national electorate. Both of these proportions are sufficiently small as to approach the situation with infinitely large populations.

Unless a sample represents, say, 5 percent or more of the population it's drawn from, that proportion is irrelevant. In those rare cases of large proportions being selected, a "finite population correction" can be calculated to adjust the confidence intervals. Simply subtract the proportion from 1.0, take the square root of the result, and multiply that by the sampling error. As you can see, with proportions close to zero, this will make no difference. If, on the other hand, your sample were half of the population, the sampling error would be reduced by roughly 30 percent by this procedure. In the extreme, if you included the whole population in your sample, the sample-to-population proportion would be 1.0, and you would multiply the calculated standard error by 0.0, suggesting there was no sampling error, which would, of course, be the case. (How cool is that?)

Two cautions are in order before we conclude this discussion of the basic logic of probability sampling. First, the survey uses of probability theory as discussed here are technically not wholly justified. The theory of sampling distribution makes assumptions that almost never apply in survey conditions. The exact proportion of samples contained within specified increments of standard errors, for example, mathematically assumes an infinitely large population, an infinite number of samples, and sampling with replacement; that is, every sampling unit selected is "thrown back into the

pot" and could be selected again. Second, our discussion has greatly oversimplified the inferential jump from the distribution of several samples to the probable characteristics of one sample.

I offer these cautions to provide perspective on the uses of probability theory in sampling. Social researchers often appear to overestimate the precision of estimates produced by the use of probability theory. As I'll mention elsewhere in this chapter and throughout the book, variations in sampling techniques and nonsampling factors may further reduce the legitimacy of such estimates. For example, those selected in a sample who fail or refuse to participate further detract from the representativeness of the sample.

Nevertheless, the calculations discussed in this section can be extremely valuable to you in understanding and evaluating your data. Although the calculations do not provide as precise estimates as some researchers might assume, they can be quite valid for practical purposes. They are unquestionably more valid than less rigorously derived estimates based on less-rigorous sampling methods. Most important, being familiar with the basic logic underlying the calculations can help you react sensibly both to your own data and to those reported by others.

Populations and Sampling Frames

The preceding section introduced the theoretical model for social research sampling. Although as students, research consumers, and researchers we need to understand that theory, it's no less important to appreciate the less-than-perfect conditions that exist in the field. In this section we'll look at one aspect of field conditions that requires a compromise with idealized theoretical conditions and assumptions: the congruence of or disparity between populations and sampling frames.

Simply put, a sampling frame is the list or quasi list of elements from which a probability sample is selected. If a sample of students is selected from a student roster, the roster is the sampling frame. If the primary sampling unit for a complex population sample is the census block, the list of census blocks composes the sampling frame, in the form of a printed booklet, a magnetic tape file, or some other computerized record. Here are some reports of sampling frames appearing in research journals. In each example I've italicized the actual sampling frames.

The data for this research were obtained from a random sample of parents of children in the third grade in public and parochial schools in Yakima County, Washington.
(Petersen and Maynard 1981: 92)

The sample at Time 1 consisted of 160 names drawn randomly from the telephone directory of Lubbock, Texas.
(Tan 1980: 242)

The data reported in this paper ... were gathered from a probability sample of adults aged 18 and over residing in households in the 48 contiguous United States. Personal interviews with 1,914 respondents were conducted by the Survey Research Center of the University of Michigan during the fall of 1975.
(Jackman and Senter 1980: 345)

Properly drawn samples provide information appropriate for describing the population of elements composing the sampling frame, and nothing more. I emphasize this point in view of the all-too-common tendency for researchers to select samples from a given sampling frame and then make assertions about a population similar to, but not identical to, the population defined by the sampling frame.

For example, take a look at this report, which discusses the drugs most frequently prescribed by U.S. physicians:

Information on prescription drug sales is not easy to obtain. But Rinaldo V. DeNuzzo, a

sampling frame: That list or quasi list of units composing a population from which a sample is selected. If the sample is to be representative of the population, it is essential that the sampling frame include all (or nearly all) members of the population.

professor of pharmacy at the Albany College of Pharmacy, Union University, Albany, NY, has been tracking prescription drug sales for 15 years by polling nearby drugstores. He publishes the results in an industry trade magazine, MM&M.

DeNuzzo's latest survey, covering 1980, is based on reports from 66 pharmacies in 48 communities in New York and New Jersey. Unless there is something peculiar about that part of the country, his findings can be taken as representative of what happens across the country.
(Moskowitz 1981: 33)

What is striking in the excerpt is the casual comment about whether there is anything peculiar about New York and New Jersey. There is. The lifestyle in these two states hardly typifies the other 48. We cannot assume that residents in these large, urbanized, eastern seaboard states necessarily have the same drug-use patterns that residents of Mississippi, Nebraska, or Vermont do.

Does the survey even represent prescription patterns in New York and New Jersey? To determine that, we would have to know something about the way the 48 communities and the 66 pharmacies were selected. We should be wary in this regard, in view of the reference to "polling nearby drugstores." As we'll see, there are several methods for selecting samples that ensure representativeness, and unless they're used, we shouldn't generalize from the study findings.

A sampling frame, then, must be consonant with the population we wish to study. In the simplest sample design, the sampling frame is a list of the elements composing the study population. In practice, though, existing sampling frames often define the study population rather than the other way around. That is, we often begin with a population in mind for our study; then we search for possible sampling frames. Having examined and evaluated the frames available for our use, we decide which frame presents a study population most appropriate to our needs.

Studies of organizations are often the simplest from a sampling standpoint because organizations typically have membership lists. In such cases, the list of members constitutes an excellent sampling frame. If a random sample is selected from a membership list, the data collected from that sample may be taken as representative of all members, provided that all members are included in the list.

Populations that can be sampled from good organizational lists include elementary school, high school, and university students and faculty; church members; factory workers; fraternity or sorority members; members of social, service, or political clubs; and members of professional associations.

The preceding comments apply primarily to local organizations. Often, statewide or national organizations do not have a single membership list. There is, for example, no single list of Episcopalian church members. However, a slightly more complex sample design could take advantage of local church membership lists by first sampling churches and then subsampling the membership lists of those churches selected. (More about that later.)

Other lists of individuals may be especially relevant to the research needs of a particular study. Government agencies maintain lists of registered voters, for example, that might be used if you wanted to conduct a preelection poll or an in-depth examination of voting behavior, but you must ensure that the list is up-to-date. Similar lists contain the names of automobile owners, welfare recipients, taxpayers, business permit holders, licensed professionals, and so forth. Although it may be difficult to gain access to some of these lists, they provide excellent sampling frames for specialized research purposes.

Realizing that the sampling elements in a study need not be individual persons, we may note that lists of other types of elements also exist: universities, businesses of various types, cities, academic journals, newspapers, unions, political clubs, professional associations, and so forth.

Telephone directories are frequently used for "quick and dirty" public opinion polls. Undeniably they're easy and inexpensive to use, which is no doubt the reason for their popularity. And, if you want to make assertions about telephone subscribers, the

directory is a fairly good sampling frame. (Realize, of course, that a given directory will not include new subscribers or those who have requested unlisted numbers. Sampling is further complicated by the directories' inclusion of nonresidential listings.) Unfortunately, telephone directories are all too often used as a listing of a city's population or of its voters. Of the many defects in this reasoning, the chief one involves a bias, as we have seen. Poor people are less likely to have telephones; rich people may have more than one line. A telephone directory sample, therefore, is likely to have a middle- or upper-class bias.

The class bias inherent in telephone directory samples is often hidden. Preelection polls conducted in this fashion are sometimes quite accurate, perhaps because of the class bias evident in voting itself: Poor people are less likely to vote. Frequently, then, these two biases nearly coincide, so that the results of a telephone poll may come very close to the final election outcome. Unhappily, you never know for sure until after the election. And sometimes, as in the case of the 1936 Literary Digest poll, you may discover that the voters have not acted according to the expected class biases. The ultimate disadvantage of this method, then, is the researcher's inability to estimate the degree of error to be expected in the sample findings.

Street directories and tax maps are often used for easy samples of households, but they may also suffer from incompleteness and possible bias. For example, in strictly zoned urban regions, illegal housing units are unlikely to appear on official records. As a result, such units could not be selected, and sample findings could not be representative of those units, which are often poorer and more crowded than the average.

Though the preceding comments apply to the United States, the situation is quite different in some other countries. In Japan, for example, the government maintains quite accurate population registration lists. Moreover, citizens are required by law to keep their information up-to-date, such as changes in residence or births and deaths in the household. As a consequence, you can select simple random samples of the population more easily in Japan than in the United States. Such a registration list in the United States would conflict directly with this country's norms regarding individual privacy.

Review of Populations and Sampling Frames

Because social research literature gives surprisingly little attention to the issues of populations and sampling frames, I've devoted special attention to them. Here is a summary of the main guidelines to remember:

1. Findings based on a sample can be taken as representing only the aggregation of elements that compose the sampling frame.

2. Often, sampling frames do not truly include all the elements their names might imply. Omissions are almost inevitable. Thus, a first concern of the researcher must be to assess the extent of the omissions and to correct them if possible. (Of course, the researcher may feel that he or she can safely ignore a small number of omissions that cannot easily be corrected.)

3. To be generalized even to the population composing the sampling frame, all elements must have equal representation in the frame. Typically, each element should appear only once. Elements that appear more than once will have a greater probability of selection, and the sample will, overall, overrepresent those elements.

Other, more practical matters relating to populations and sampling frames will be treated elsewhere in this book. For example, the form of the sampling frame (such as a list in a publication, a 3-by-5 card file, CD-ROMs, or magnetic tapes) can affect how easy it is to use. And ease of use may often take priority over scientific considerations: An "easier" list may be chosen over a "harder" one, even though the latter is more appropriate to the target population. We should not take a dogmatic position in this regard, but

every researcher should carefully weigh the relative advantages and disadvantages of such alternatives.

Types of Sampling Designs

Up to this point, we've focused on simple random sampling (SRS). Indeed, the body of statistics typically used by social researchers assumes such a sample. As you'll see shortly, however, you have several options in choosing your sampling method, and you'll seldom if ever choose simple random sampling. There are two reasons for this. First, with all but the simplest sampling frame, simple random sampling is not feasible. Second, and probably surprisingly, simple random sampling may not be the most accurate method available. Let's turn now to a discussion of simple random sampling and the other options available.

Simple Random Sampling

As noted, simple random sampling is the basic sampling method assumed in the statistical computations of social research. Because the mathematics of random sampling are especially complex, we'll detour around them in favor of describing the ways of employing this method in the field.

Once a sampling frame has been properly established, to use simple random sampling the researcher assigns a single number to each element in the list, not skipping any number in the process. A table of random numbers (Appendix C) is then used to select elements for the sample. "Using a Table of Random Numbers" explains its use.

simple random sampling: A type of probability sampling in which the units composing a population are assigned numbers. A set of random numbers is then generated, and the units having those numbers are included in the sample.

If your sampling frame is in a machine-readable form, such as CD-ROM or magnetic tape, a simple random sample can be selected automatically by computer. (In effect, the computer program numbers the elements in the sampling frame, generates its own series of random numbers, and prints out the list of elements selected.)

Figure 7-11 offers a graphic illustration of simple random sampling. Note that the members of our hypothetical micropopulation have been numbered from 1 to 100. Moving to Appendix C, we decide to use the last two digits of the first column and to begin with the third number from the top. This yields person number 30 as the first one selected into the sample. Number 67 is next, and so forth. (Person 100 would have been selected if "00" had come up in the list.)

Systematic Sampling

Simple random sampling is seldom used in practice. As you'll see, it's not usually the most efficient method, and it can be laborious if done manually. Typically, simple random sampling requires a list of elements. When such a list is available, researchers usually employ systematic sampling instead.

systematic sampling: A type of probability sampling in which every kth unit in a list is selected for inclusion in the sample (for example, every 25th student in the college directory of students). You compute k by dividing the size of the population by the desired sample size; k is called the sampling interval. Within certain constraints, systematic sampling is a functional equivalent of simple random sampling and usually easier to do. Typically, the first unit is selected at random.

In systematic sampling, every kth element in the total list is chosen (systematically) for inclusion in the sample. If the list contained 10,000 elements and you wanted a sample of 1,000, you would select every tenth element for your sample. To ensure against any possible human bias in using this method, you should select the first element at random. Thus, in the preceding example, you would begin by selecting a random number between one and ten. The element having that number is included in the sample, plus every tenth element following it. This method is technically referred to as a systematic sample with a random start. Two terms are frequently used in connection with systematic

Using a Table of Random Numbers

In social research, it's often appropriate to select a set of random numbers from a table such as the one in Appendix C. Here's how to do that.

Suppose you want to select a simple random sample of 100 people (or other units) out of a population totaling 980.

1. To begin, number the members of the population: in this case, from 1 to 980. Now the problem is to select 100 random numbers. Once you've done that, your sample will consist of the people having the numbers you've selected. (Note: It's not essential to actually number them, as long as you're sure of the total. If you have them in a list, for example, you can always count through the list after you've selected the numbers.)

2. The next step is to determine the number of digits you'll need in the random numbers you select. In our example, there are 980 members of the population, so you'll need three-digit numbers to give everyone a chance of selection. (If there were 11,825 members of the population, you'd need to select five-digit numbers.) Thus, we want to select 100 random numbers in the range from 001 to 980.

3. Now turn to the first page of Appendix C. Notice there are several rows and columns of five-digit numbers, and there are several pages. The table represents a series of random numbers in the range from 00001 to 99999. To use the table for your hypothetical sample, you have to answer these questions:

a. How will you create three-digit numbers out of five-digit numbers?
b. What pattern will you follow in moving through the table to select your numbers?
c. Where will you start?

Each of these questions has several satisfactory answers. The key is to create a plan and follow it. Here's an example.

4. To create three-digit numbers from five-digit numbers, let's agree to select five-digit numbers from the table but consider only the left-most three digits in each case. If we picked the first number on the first page (10480), we'd only consider the 104. (We could agree to take the digits farthest to the right, 480, or the middle three digits, 048, and any of these plans would work.) The key is to make a plan and stick with it. For convenience, let's use the left-most three digits.

5. We can also choose to progress through the tables any way we want: down the columns, up them, across to the right or to the left, or diagonally. Again, any of these plans will work just fine as long as we stick to it. For convenience, let's agree to move down the columns. When we get to the bottom of one column, we'll go to the top of the next.

6. Now, where do we start? You can close your eyes and stick a pencil into the table and start wherever the pencil point lands. (I know it doesn't sound scientific, but it works.) Or, if you're afraid you'll hurt the book or miss it altogether, close your eyes and make up a column number and a row number. ("I'll pick the number in the fifth row of column 2.") Start with that number.

7. Let's suppose we decide to start with the fifth number in column 2. If you look on the first page of Appendix C, you'll see that the starting number is 39975. We've selected 399 as our first random number, and we have 99 more to go. Moving down the second column, we select 069, 729, 919, 143, 368, 695, 409, 939, and so forth. At the bottom of column 2 (on the second page of this table), we select number 017 and continue to the top of column 3: 015, 255, and so on.

8. See how easy it is? But trouble lies ahead. When we reach column 5, we're moving along, selecting 816, 309, 763, 078, 061, 277, 988 ... Wait a minute! There are only 980 students in the senior class. How can we pick number 988? The solution is simple: Ignore it. Any time you come across a number that lies outside your range, skip it and continue on your way: 188, 174, and so forth. The same solution applies if the same number comes up more than once. If you select 399 again, for example, just ignore it the second time.

9. That's it. You keep up the procedure until you've selected 100 random numbers. Returning to your list, your sample consists of person number 399, person number 69, person number 729, and so forth.

FIGURE 7-11 A Simple Random Sample. Having numbered everyone in the population, we can use a table of random numbers to select a representative sample from the overall population. Anyone whose number is chosen from the table is in the sample.

[Figure: a micropopulation numbered 1 to 100, shown beside an excerpt from the table of random numbers. Selected numbers legible in the figure include 30, 67, 70, 21, 62, 79, 75, 18, and 53. The excerpt reads:]

10480 15011 01536
22368 46573 25595
24130 48360 22527
42167 93093 06243
37570 39975 81837
77921 06907 11008
99562 72905 56420
96301 91977 05463
89579 14342 63661
85475 36857 53342
28918 69578 88231
63553 40961 48235
09429 93969 52636

sampling. The sampling interval is the standard distance between elements selected in the sample: ten in the preceding sample. The sampling ratio is the proportion of elements in the population that are selected: 1/10 in the example.

$$\text{sampling interval} = \frac{\text{population size}}{\text{sample size}}$$

$$\text{sampling ratio} = \frac{\text{sample size}}{\text{population size}}$$

sampling interval: The standard distance between elements selected from a population for a sample.

sampling ratio: The proportion of elements in the population that are selected to be in a sample.
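A systematic sample with a random start, together with the sampling interval and sampling ratio, can be sketched as follows. The function name `systematic_sample` and its parameters are hypothetical, and the sketch assumes the population size is an exact multiple of the desired sample size, as in the 10,000 and 1,000 example.

```python
import random


def systematic_sample(elements, sample_size, rng=random):
    """Select every kth element after a random start between 1 and k."""
    interval = len(elements) // sample_size   # k, the sampling interval
    start = rng.randint(1, interval)          # random start, 1..k inclusive
    sample = elements[start - 1::interval]    # that element plus every kth after it
    return sample, interval, start


population = list(range(1, 10001))            # 10,000 numbered elements
sample, interval, start = systematic_sample(population, 1000)
ratio = len(sample) / len(population)         # the sampling ratio
print(interval, len(sample), ratio)           # 10 1000 0.1
```

Whatever start is drawn, the sample contains exactly 1,000 elements here. When the population size is not a multiple of the sample size, the interval has to be rounded and the realized sample size is only approximate.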

In practice, systematic sampling is virtually identical to simple random sampling. If the list of elements is indeed randomized before sampling, one might argue that a systematic sample drawn from that list is in fact a simple random sample. By now, debates over the relative merits of simple random sampling and systematic sampling have been resolved largely in favor of the latter, simpler method. Empirically, the results are virtually identical. And, as you'll see in a later section, systematic sampling, in some instances, is slightly more accurate than simple random sampling.

There is one danger involved in systematic sampling. The arrangement of elements in the list can make systematic sampling unwise. Such an arrangement is usually called periodicity. If the list of elements is arranged in a cyclical pattern that coincides with the sampling interval, a grossly biased sample may be drawn. Here are two examples that illustrate this danger.

In a classic study of soldiers during World War II, the researchers selected a systematic sample from unit rosters. Every tenth soldier on the roster was selected for the study. The rosters, however, were arranged in a table of organizations: sergeants first, then corporals and privates, squad by squad. Each squad had ten members. As a result, every tenth person on the roster was a squad sergeant. The systematic sample selected contained only sergeants. It could, of course, have been the case that no sergeants were selected for the same reason.

As another example, suppose we select a sample of apartments in an apartment building. If the sample is drawn from a list of apartments arranged in numerical order (for example, 101, 102, 103, 104, 201, 202, and so on), there is a danger of the sampling interval coinciding with the number of apartments on a floor or some multiple thereof. Then the samples might include only northwest-corner apartments or only apartments near the elevator. If these types of apartments have some other particular characteristic in common (for example, higher rent), the sample will be biased. The same danger would appear in a systematic sample of houses in a subdivision arranged with the same number of houses on a block.

In considering a systematic sample from a list, then, you should carefully examine the nature of that list. If the elements are arranged in any particular order, you should figure out whether that order will bias the sample to be selected, then you should take steps to counteract any possible bias (for example, take a simple random sample from cyclical portions).

Usually, however, systematic sampling is superior to simple random sampling, in convenience if nothing else. Problems in the ordering of elements in the sampling frame can usually be remedied quite easily.

Stratified Sampling

So far we've discussed two methods of sample selection from a list: random and systematic. Stratification is not an alternative to these methods; rather, it represents a possible modification of their use.

Simple random sampling and systematic sampling both ensure a degree of representativeness and permit an estimate of the error present. Stratified sampling is a method for obtaining a greater degree of representativeness by decreasing the probable sampling error. To understand this method, we must return briefly to the basic theory of sampling distribution.

Recall that sampling error is reduced by two factors in the sample design. First, a large sample produces a smaller sampling error than a small sample does. Second, a homogeneous population produces samples with smaller sampling errors than a heterogeneous population does. If 99 percent of the population agrees with a certain statement, it's extremely unlikely that any probability sample will greatly misrepresent the extent of agreement. If the population is split 50-50 on the statement, then the sampling error will be much greater.

stratification: The grouping of the units composing a population into homogeneous groups (or strata) before sampling. This procedure, which may be used in conjunction with simple random, systematic, or cluster sampling, improves the representativeness of a sample, at least in terms of the stratification variables.
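The periodicity danger in the soldier-roster example can be reproduced with a short Python sketch. The roster here is hypothetical: the text says only that each ten-member squad lists its sergeant first, followed by corporals and privates, so the split of two corporals and seven privates, the number of squads, and the function names are all assumptions made for illustration.

```python
def build_roster(squads=20):
    """Build a roster arranged squad by squad, sergeant first (assumed layout)."""
    roster = []
    for squad in range(1, squads + 1):
        ranks = ["sergeant"] + ["corporal"] * 2 + ["private"] * 7
        roster.extend((squad, rank) for rank in ranks)
    return roster


def every_kth(elements, k, start):
    """Systematic sample: element number `start`, then every kth one after it."""
    return elements[start - 1::k]


roster = build_roster()

# Interval 10 coincides with the squad size, so each start hits one rank only.
for start in range(1, 11):
    ranks = sorted({rank for _, rank in every_kth(roster, 10, start)})
    print(start, ranks)

# Interval 7 does not coincide with the cycle, so the ranks are mixed.
print(sorted({rank for _, rank in every_kth(roster, 7, 1)}))
```

With an interval of 10, a random start of 1 yields nothing but sergeants, and every other start yields no sergeants at all. With an interval of 7 all three ranks appear, which is why the order of the list should be checked against the sampling interval before the sample is drawn.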

Stratified sampling is based on this second factor in sampling theory. Rather than selecting a sample from the total population at large, the researcher ensures that appropriate numbers of elements are drawn from homogeneous subsets of that population. To get a stratified sample of university students, for example, you would first organize your population by college class and then draw appropriate numbers of freshmen, sophomores, juniors, and seniors. In a nonstratified sample, representation by class would be subjected to the same sampling error as other variables would. In a sample stratified by class, the sampling error on this variable is reduced to zero.

More-complex stratification methods are also possible. In addition to stratifying by class, you might also stratify by gender, by GPA, and so forth. In this fashion you might be able to ensure that your sample would contain the proper numbers of male sophomores with a 3.5 average, of female sophomores with a 4.0 average, and so forth.

The ultimate function of stratification, then, is to organize the population into homogeneous subsets (with heterogeneity between subsets) and to select the appropriate number of elements from each. To the extent that the subsets are homogeneous on the stratification variables, they may be homogeneous on other variables as well. Because age is related to college class, a sample stratified by class will be more representative in terms of age as well, compared with an unstratified sample. Because occupational aspirations still seem to be related to gender, a sample stratified by gender will be more representative in terms of occupational aspirations.

The choice of stratification variables typically depends on what variables are available. Gender can often be determined in a list of names. University lists are typically arranged by class. Lists of faculty members may indicate their departmental affiliation. Government agency files may be arranged by geographic region. Voter registration lists are arranged according to precinct.

In selecting stratification variables from among those available, however, you should be concerned primarily with those that are presumably related to variables you want to represent accurately. Because gender is related to many variables and is often available for stratification, it's often used. Education is related to many variables, but it's often not available for stratification. Geographic location within a city, state, or nation is related to many things. Within a city, stratification by geographic location usually increases representativeness in social class, ethnic group, and so forth. Within a nation, it increases representativeness in a broad range of attitudes as well as in social class and ethnicity.

When you're working with a simple list of all elements in the population, two methods of stratification predominate. In one method, you sort the population elements into discrete groups based on whatever stratification variables are being used. On the basis of the relative proportion of the population represented by a given group, you select, randomly or systematically, several elements from that group constituting the same proportion of your desired sample size. For example, if sophomore men with a 4.0 average compose 1 percent of the student population and you desire a sample of 1,000 students, you would select 10 sophomore men with a 4.0 average.

The other method is to group students as described and then put those groups together in a continuous list beginning with all freshmen men with a 4.0 average and ending with all senior women with a 1.0 or below. You would then select a systematic sample, with a random start, from the entire list. Given the arrangement of the list, a systematic sample would select proper numbers (within an error range of 1 or 2) from each subgroup. (Note: A simple random sample drawn from such a composite list would cancel out the stratification.)

Figure 7-12 offers a graphic illustration of stratified, systematic sampling. As you can see, we lined up our micropopulation according to gender and race. Then, beginning with a random start of "3," we've taken every tenth person thereafter: 3, 13, 23, ..., 93.

Stratified sampling ensures the proper representation of the stratification variables; this, in turn, enhances the representation of other variables related to them. Taken as a whole, then, a stratified sample is more likely than a simple random sample to be more representative on several variables. Although the simple random sample is still regarded as somewhat sacred, it should now be clear that you can often do better.

FIGURE 7-12 A Stratified, Systematic Sample with a Random Start. A stratified, systematic sample involves two stages. First the members of the population are gathered into homogeneous strata; this simple example merely uses gender as a stratification variable, but more could be used. Then every kth (in this case, every 10th) person in the stratified arrangement is selected into the sample.

[Figure: the micropopulation numbered 1 to 100 in its stratified arrangement, with a random start at 3 and the selected persons 3, 13, 23, 33, 43, 53, 63, 73, 83, and 93 marked.]

Implicit Stratification in Systematic Sampling

I mentioned that systematic sampling can, under certain conditions, be more accurate than simple random sampling. This is the case whenever the arrangement of the list creates an implicit stratification. As already noted, if a list of university students is arranged by class, then a systematic sample provides a stratification by class where a simple random sample would not.

In a study of students at the University of Hawaii, after stratification by school class, the students were arranged by their student identification numbers. These numbers, however, were their social security numbers. The first three digits of the social security number indicate the state in which the number was issued. As a result, within a class, students were arranged by the state in which they were issued a social security number, providing a rough stratification by geographic origin.

An ordered list of elements, therefore, may be more useful to you than an unordered, randomized list. I've stressed this point in view of the unfortunate belief that lists should be randomized

before systematic sampling. Only if the arrangement presents the problems discussed earlier should the list be rearranged.

Illustration: Sampling University Students

Let's put these principles into practice by looking at an actual sampling design used to select a sample of university students. The purpose of the study was to survey, with a mail-out questionnaire, a representative cross section of students attending the main campus of the University of Hawaii. The following sections describe the steps and decisions involved in selecting that sample.

Study Population and Sampling Frame

The obvious sampling frame available for use in this sample selection was the computerized file maintained by the university administration. The tape contained students' names, local and permanent addresses, and social security numbers, as well as a variety of other information such as field of study, class, age, and gender.

The computer database, however, contained files on all people who could, by any conceivable definition, be called students, many of whom seemed inappropriate for the purposes of the study. As a result, researchers needed to define the study population in a somewhat more restricted fashion. The final definition included those 15,225 day-program degree candidates who were registered for the fall semester on the Manoa campus of the university, including all colleges and departments, both undergraduate and graduate students, and both U.S. and foreign students. The computer program used for sampling, therefore, limited consideration to students fitting this definition.

Stratification

The sampling program also permitted stratification of students before sample selection. The researchers decided that stratification by college class would be sufficient, although the students might have been further stratified within class, if desired, by gender, college, major, and so forth.

Sample Selection

Once the students had been arranged by class, a systematic sample was selected across the entire rearranged list. The sample size for the study was initially set at 1,100. To achieve this sample, the sampling program was set for a 1/14 sampling ratio. The program generated a random number between 1 and 14; the student having that number and every fourteenth student thereafter was selected in the sample.

Once the sample had been selected, the computer was instructed to print each student's name and mailing address on self-adhesive mailing labels. These labels were then simply transferred to envelopes for mailing the questionnaires.

Sample Modification

This initial design of the sample had to be modified. Before the mailing of questionnaires, the researchers discovered that, because of unexpected expenses in the production of the questionnaires, they couldn't cover the costs of mailing to all 1,100 students. As a result, one-third of the mailing labels were systematically selected (with a random start) for exclusion from the sample. The final sample for the study was thereby reduced to 733 students.

I mention this modification in order to illustrate the frequent need to alter a study plan in midstream. Because the excluded students were systematically omitted from the initial systematic sample, the remaining 733 students could still be taken as reasonably representing the study population. The reduction in sample size did, of course, increase the range of sampling error.

Multistage Cluster Sampling

The preceding sections have dealt with reasonably simple procedures for sampling from lists of elements. Such a situation is ideal. Unfortunately, however, much interesting social research requires the selection of samples from populations that cannot easily be listed for sampling purposes: the population of a city, state, or nation; all university students in the United States; and so forth. In such

cases, the sample design must be much more complex. Such a design typically involves the initial sampling of groups of elements (clusters), followed by the selection of elements within each of the selected clusters.

cluster sampling: A multistage sampling in which natural groups (clusters) are sampled initially, with the members of each selected group being subsampled afterward. For example, you might select a sample of U.S. colleges and universities from a directory, get lists of the students at all the selected schools, then draw samples of students from each.

Cluster sampling may be used when it's either impossible or impractical to compile an exhaustive list of the elements composing the target population, such as all church members in the United States. Often, however, the population elements are already grouped into subpopulations, and a list of those subpopulations either exists or can be created practically. For example, church members in the United States belong to discrete churches, which are either listed or could be. Following a cluster sample format, then, researchers could sample the list of churches in some manner (for example, a stratified, systematic sample). Next, they would obtain lists of members from each of the selected churches. Each of the lists would then be sampled, to provide samples of church members for study. (For an example, see Glock, Ringer, and Babbie 1967.)

Another typical situation concerns sampling among population areas such as a city. Although there is no single list of a city's population, citizens reside on discrete city blocks or census blocks. Researchers can, therefore, select a sample of blocks initially, create a list of people living on each of the selected blocks, and take a subsample of the people on each block.

In a more complex design, researchers might sample blocks, list the households on each selected block, sample the households, list the people residing in each household, and, finally, sample the people within each selected household. This multistage sample design leads ultimately to a selection of a sample of individuals but does not require the initial listing of all individuals in the city's population.

Multistage cluster sampling, then, involves the repetition of two basic steps: listing and sampling. The list of primary sampling units (churches, blocks) is compiled and, perhaps, stratified for sampling. Then a sample of those units is selected. The selected primary sampling units are then listed and perhaps stratified. The list of secondary sampling units is then sampled, and so forth.

The listing of households on even the selected blocks is, of course, a labor-intensive and costly activity, one of the elements making face-to-face, household surveys quite expensive. Vincent Iannacchione, Jennifer Staab, and David Redden (2003) report some initial success using postal mailing lists for this purpose. Although the lists are not perfect, they may be close enough to warrant the significant savings in cost.

Multistage cluster sampling makes possible those studies that would otherwise be impossible. Specific research circumstances often call for special designs, as "Sampling Iran" demonstrates.

Multistage Designs and Sampling Error

Although cluster sampling is highly efficient, the price of that efficiency is a less-accurate sample. A simple random sample drawn from a population list is subject to a single sampling error, but a two-stage cluster sample is subject to two sampling errors. First, the initial sample of clusters will represent the population of clusters only within a range of sampling error. Second, the sample of elements selected within a given cluster will represent all the elements in that cluster only within a range of sampling error. Thus, for example, a researcher runs a certain risk of selecting a sample of disproportionately wealthy city blocks, plus a sample of disproportionately wealthy households within those blocks. The best solution to this problem lies in the number of clusters selected initially and the number of elements within each cluster.

Typically, researchers are restricted to a total sample size; for example, you may be limited to conducting 2,000 interviews in a city. Given this broad limitation, however, you have several options in designing your cluster sample. At the extremes you could choose one cluster and select 2,000 elements within that cluster, or you could select 2,000 clusters with one element selected within each. Of course, neither approach is advisable, but a broad

range of choices lies between them. Fortunately, the logic of sampling distributions provides a general guideline for this task.

Sampling Iran

Whereas most of the examples given in this textbook are taken from its country of origin, the United States, the basic methods of sampling would apply in other national settings as well. At the same time, researchers may need to make modifications appropriate to local conditions. In selecting a national sample of Iran, for example, Abdollahyan and Azadarmaki (2000: 21) from the University of Tehran began by stratifying the nation on the basis of cultural differences, dividing the country into nine cultural zones as follows:

1. Tehran
2. Central region including Isfahan, Arak, Qum, Yazd and Kerman
3. The southern provinces including Hormozgan, Khuzistan, Bushehr and Fars
4. The marginal western region including Lorestan, Charmahal and Bakhtiari, Kogiluyeh and Eelam
5. The western provinces including western and eastern Azarbaijan, Zanjan, Ghazvin and Ardebil
6. The eastern provinces including Khorasan and Semnan
7. The northern provinces including Gilan, Mazandran and Golestan
8. Systan
9. Kurdistan

Within each of these cultural areas, the researchers selected samples of census blocks and, on each selected block, a sample of households. Their sample design made provisions for getting the proper numbers of men and women as respondents within households and provisions for replacing those households where no one was at home.

Source: Hamid Abdollahyan and Taghi Azadarmaki, Sampling Design in a Survey Research: The Sampling Practice in Iran, paper presented to the meetings of the American Sociological Association, August 12-16, 2000, Washington, DC.

Recall that sampling error is reduced by two factors: an increase in the sample size and increased homogeneity of the elements being sampled. These factors operate at each level of a multistage sample design. A sample of clusters will best represent all clusters if a large number are selected and if all clusters are very much alike. A sample of elements will best represent all elements in a given cluster if a large number are selected from the cluster and if all the elements in the cluster are very much alike.

With a given total sample size, however, if the number of clusters is increased, the number of elements within a cluster must be decreased. In this respect, the representativeness of the clusters is increased at the expense of more poorly representing the elements composing each cluster, or vice versa. Fortunately, homogeneity can be used to ease this dilemma.

Typically, the elements composing a given natural cluster within a population are more homogeneous than all elements composing the total population are. The members of a given church are more alike than all church members are; the residents of a given city block are more alike than the residents of a whole city are. As a result, relatively few elements may be needed to represent a given natural cluster adequately, although a larger number of clusters may be needed to represent adequately the diversity found among the clusters. This fact is most clearly seen in the extreme case of very different clusters composed of identical elements within each. In such a situation, a large number of clusters would adequately represent all its members. Although this extreme situation never exists in reality, it's closer to the truth in most cases than its opposite: identical clusters composed of grossly divergent elements.

The general guideline for cluster design, then, is to maximize the number of clusters selected while decreasing the number of elements within each cluster. However, this scientific guideline must be balanced against an administrative constraint. The efficiency of cluster sampling is based on the ability to minimize the listing of population elements. By initially selecting clusters, you need only list the elements composing the selected clusters, not all elements in the entire population. Increasing the number of clusters, however, goes directly against this efficiency factor. A small number of clusters

may be listed more quickly and more cheaply than a large number. (Remember that all the elements in a selected cluster must be listed even if only a few are to be chosen in the sample.)

The final sample design will reflect these two constraints. In effect, you'll probably select as many clusters as you can afford. Lest this issue be left too open-ended at this point, here's one general guideline. Population researchers conventionally aim at the selection of 5 households per census block. If a total of 2,000 households are to be interviewed, you would aim at 400 blocks with 5 household interviews on each. Figure 7-13 presents a graphic overview of this process.

Before we turn to other, more detailed procedures available to cluster sampling, let me reiterate that this method almost inevitably involves a loss of accuracy. The manner in which this appears, however, is somewhat complex. First, as noted earlier, a multistage sample design is subject to a sampling error at each stage. Because the sample size is necessarily smaller at each stage than the total sample size, the sampling error at each stage will be greater than would be the case for a single-stage random sample of elements. Second, sampling error is estimated on the basis of observed variance among the sample elements. When those elements are drawn from among relatively homogeneous clusters, the estimated sampling error will be too optimistic and must be corrected in the light of the cluster sample design.

Stratification in Multistage Cluster Sampling

Thus far, we've looked at cluster sampling as though a simple random sample were selected at each stage of the design. In fact, stratification techniques can be used to refine and improve the sample being selected.

The basic options here are essentially the same as those in single-stage sampling from a list. In selecting a national sample of churches, for example, you might initially stratify your list of churches by denomination, geographic region, size, rural or urban location, and perhaps by some measure of social class.

Once the primary sampling units (churches, blocks) have been grouped according to the relevant, available stratification variables, either simple random or systematic sampling techniques can be used to select the sample. You might select a specified number of units from each group, or stratum, or you might arrange the stratified clusters in a continuous list and systematically sample that list.

To the extent that clusters are combined into homogeneous strata, the sampling error at this stage will be reduced. The primary goal of stratification, as before, is homogeneity.

There's no reason why stratification couldn't take place at each level of sampling. The elements listed within a selected cluster might be stratified before the next stage of sampling. Typically, however, this is not done. (Recall the assumption of relative homogeneity within clusters.)

Probability Proportionate to Size (PPS) Sampling

This section introduces you to a more sophisticated form of cluster sampling, one that is used in many large-scale survey-sampling projects. In the preceding discussion, I talked about selecting a random or systematic sample of clusters and then a random or systematic sample of elements within each cluster selected. Notice that this produces an overall sampling scheme in which every element in the whole population has the same probability of selection.

Let's say we're selecting households within a city. If there are 1,000 city blocks and we initially select a sample of 100, that means that each block has a 100/1,000 or 0.1 chance of being selected. If we next select 1 household in 10 from those residing on the selected blocks, each household has a 0.1 chance of selection within its block. To calculate the overall probability of a household being selected, we simply multiply the probabilities at the individual steps in sampling. That is, each household has a 1/10 chance of its block being selected and a 1/10 chance of that specific household being selected if the block is one of those chosen. Each household, in this case, has a 1/10 × 1/10 = 1/100 chance of selection overall.
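The multiplication of stage probabilities described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name overall_selection_probability and the use of exact fractions are assumptions made for the example, not part of any standard sampling library.

```python
from fractions import Fraction

def overall_selection_probability(stage_probabilities):
    """Multiply the selection probabilities of successive sampling stages."""
    overall = Fraction(1)
    for probability in stage_probabilities:
        overall *= probability
    return overall

# Stage 1: 100 of the 1,000 city blocks are selected.
p_block = Fraction(100, 1000)
# Stage 2: 1 household in 10 is selected on each chosen block.
p_household_given_block = Fraction(1, 10)

p_overall = overall_selection_probability([p_block, p_household_given_block])
print(p_overall)  # prints 1/100
```

Exact fractions are used rather than decimals so that the result reads the same way as the hand calculation in the text.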

[FIGURE 7-13 Multistage Cluster Sampling. Stage One: Identify blocks and select a sample (selected blocks are shaded). Stage Two: Go to each selected block and list all households in order (the example lists the 30 households of one block). Stage Three: For each list, select a sample of households (in this example, every sixth household has been selected, starting with #5, which was selected at random). In multistage cluster sampling, we begin by selecting a sample of the clusters (in this case, city blocks). Then, we make a list of the elements (households, in this case) and select a sample of elements from each of the selected clusters.]

Because each household would have the same chance of selection, the sample so selected should be representative of all households in the city.

There are dangers in this procedure, however. In particular, the variation in the size of blocks (measured in numbers of households) presents a problem. Let's suppose that half the city's population resides in 10 densely packed blocks filled with high-rise apartment buildings, and suppose that the rest of the population lives in single-family dwellings spread out over the remaining 990 blocks. When we first select our sample of 1/10 of the blocks, it's quite possible that we'll miss all of the 10 densely packed high-rise blocks. No matter what happens in the second stage of sampling, our final sample of households will be grossly unrepresentative of the city, comprising only single-family dwellings.
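How likely is it that the first stage misses every one of the 10 high-rise blocks? The sketch below gives a rough answer, assuming the 100 blocks are drawn as a simple random sample without replacement from the 1,000 blocks in the example. The function name prob_miss_all is hypothetical, introduced only for this illustration.

```python
from math import comb

def prob_miss_all(total_blocks, sampled_blocks, special_blocks):
    """Chance that a simple random sample (without replacement) of blocks
    contains none of the special blocks."""
    ways_avoiding_special = comb(total_blocks - special_blocks, sampled_blocks)
    ways_total = comb(total_blocks, sampled_blocks)
    return ways_avoiding_special / ways_total

# 1,000 blocks in the city, 100 sampled, 10 of them high-rise blocks.
p_miss = prob_miss_all(1000, 100, 10)
print(round(p_miss, 3))
```

Under these assumptions the chance of missing all 10 high-rise blocks comes out at roughly one in three, which is consistent with the text's remark that such an outcome is quite possible.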

Whenever the clusters sampled are of greatly differing sizes, it's appropriate to use a modified sampling design called PPS (probability proportionate to size). This design guards against the problem I've just described and still produces a final sample in which each element has the same chance of selection.

As the name suggests, each cluster is given a chance of selection proportionate to its size. Thus, a city block with 200 households has twice the chance of selection as one with only 100 households. Within each cluster, however, a fixed number of elements is selected, say, 5 households per block. Notice how this procedure results in each household having the same probability of selection overall.

Let's look at households of two different city blocks. Block A has 100 households; Block B has only 10. In PPS sampling, we would give Block A ten times as good a chance of being selected as Block B. So if, in the overall sample design, Block A has a 1/20 chance of being selected, that means Block B would only have a 1/200 chance. Notice that this means that all the households on Block A would have a 1/20 chance of having their block selected; Block B households have only a 1/200 chance.

If Block A is selected and we're taking 5 households from each selected block, then the households on Block A have a 5/100 chance of being selected into the block's sample. Because we can multiply probabilities in a case like this, we see that every household on Block A has an overall chance of selection equal to 1/20 × 5/100 = 5/2000 = 1/400.

If Block B happens to be selected, on the other hand, its households stand a much better chance of being among the 5 chosen there: 5/10. When this is combined with their relatively poorer chance of having their block selected in the first place, however, they end up with the same chance of selection as those on Block A: 1/200 × 5/10 = 5/2000 = 1/400.

Further refinements to this design make it a very efficient and effective method for selecting large cluster samples. For now, however, it's enough to understand the basic logic involved.

Disproportionate Sampling and Weighting

Ultimately, a probability sample is representative of a population if all elements in the population have an equal chance of selection in that sample. Thus, in each of the preceding discussions, we've noted that the various sampling procedures result in an equal chance of selection, even though the ultimate selection probability is the product of several partial probabilities.

More generally, however, a probability sample is one in which each population element has a known nonzero probability of selection, even though different elements may have different probabilities. If controlled probability sampling procedures have been used, any such sample may be representative of the population from which it is drawn if each sample element is assigned a weight equal to the inverse of its probability of selection. Thus, where all sample elements have had the same chance of selection, each is given the same weight: 1. This is called a self-weighting sample.

Sometimes it's appropriate to give some cases more weight than others, a process called weighting. Disproportionate sampling and weighting come into play in two basic ways. First, you may sample subpopulations disproportionately to ensure sufficient numbers of cases from each for analysis. For example, a given city may have a suburban area containing one-fourth of its total population. Yet you might be especially interested in a detailed analysis of households in that area and may feel that one-fourth of this total sample size would be too few. As a result, you might decide to select the same number of households from the suburban area as from the remainder of the city.

weighting: Assigning different weights to cases that were selected into a sample with different probabilities of selection. In the simplest scenario, each case is given a weight equal to the inverse of its probability of selection. When all cases have the same chance of selection, no weighting is necessary.

PPS (probability proportionate to size): This refers to a type of multistage cluster sample in which clusters are selected, not with equal probabilities (see EPSEM) but with probabilities proportionate to their sizes, as measured by the number of units to be subsampled.
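The PPS arithmetic for Blocks A and B in this section can be reproduced with a short sketch. The constant RATE (a block's chance of selection per household it contains) is an assumption chosen so that Block A's chance comes out to the 1/20 used in the text; the names in the code are hypothetical and exist only for this illustration.

```python
from fractions import Fraction

HOUSEHOLDS_PER_BLOCK = 5   # fixed number taken from each selected block (from the text)
RATE = Fraction(1, 2000)   # assumed: a block's chance of selection per household it contains

def pps_probabilities(block_size):
    """Return (block chance, within-block chance, overall chance) for one household."""
    p_block = RATE * block_size                            # proportionate to block size
    p_within = Fraction(HOUSEHOLDS_PER_BLOCK, block_size)  # 5 out of the block's households
    return p_block, p_within, p_block * p_within

for name, size in [("Block A", 100), ("Block B", 10)]:
    p_block, p_within, p_overall = pps_probabilities(size)
    print(name, p_block, p_within, p_overall)
# Block A 1/20 1/20 1/400
# Block B 1/200 1/2 1/400
```

The block size cancels when the two stage probabilities are multiplied, which is why every household ends up with the same overall chance regardless of how large its block is.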

Households in the suburban area, then, are given a disproportionately better chance of selection than are those located elsewhere in the city.

As long as you analyze the two area samples separately or comparatively, you need not worry about the differential sampling. If you want to combine the two samples to create a composite picture of the entire city, however, you must take the disproportionate sampling into account. If n is the number of households selected from each area, then the households in the suburban area had a chance of selection equal to n divided by one-fourth of the total city population. Because the total city population and the sample size are the same for both areas, the suburban-area households should be given a weight of (1/4)n, and the remaining households should be given a weight of (3/4)n. This weighting procedure could be simplified by merely giving a weight of 3 to each of the households selected outside the suburban area.

Here's an example of the problems that can be created when disproportionate sampling is not accompanied by a weighting scheme. When the Harvard Business Review decided to survey its subscribers on the issue of sexual harassment at work, it seemed appropriate to oversample women because female subscribers were vastly outnumbered by male subscribers. Here's how G. C. Collins and Timothy Blodgett explained the matter:

    We also skewed the sample another way: to ensure a representative response from women, we mailed a questionnaire to virtually every female subscriber, for a male/female ratio of 68% to 32%. This bias resulted in a response of 52% male and 44% female (and 4% who gave no indication of gender), compared to HBR's U.S. subscriber proportion of 93% male and 7% female. (1981: 78)

Notice a couple of things in this excerpt. First, it would be nice to know a little more about what "virtually every female" means. Evidently, the authors of the study didn't send questionnaires to all female subscribers, but there's no indication of who was omitted and why. Second, they didn't use the term representative with its normal social science usage. What they mean, of course, is that they wanted to get a substantial or "large enough" response from women, and oversampling is a perfectly acceptable way of accomplishing that.

By sampling more women than a straightforward probability sample would have produced, the authors were able to "select" enough women (812) to compare with the men (960). Thus, when they report, for example, that 32 percent of the women and 66 percent of the men agree that "the amount of sexual harassment at work is greatly exaggerated," we know that the female response is based on a substantial number of cases. That's good. There are problems, however.

To begin with, subscriber surveys are always problematic. In this case, the best the researchers can hope to talk about is "what subscribers to Harvard Business Review think." In a loose way, it might make sense to think of that population as representing the more sophisticated portion of corporate management. Unfortunately, the overall response rate was 25 percent. Although that's quite good for subscriber surveys, it's a low response rate in terms of generalizing from probability samples.

Beyond that, however, the disproportionate sample design creates another problem. When the authors state that 73 percent of respondents favor company policies against harassment (Collins and Blodgett 1981: 78), that figure is undoubtedly too high, because the sample contains a disproportionately high percentage of women, who are more likely than men to favor such policies. And, when the researchers report that top managers are more likely to feel that claims of sexual harassment are exaggerated than are middle- and lower-level managers (1981: 81), that finding is also suspect. As the researchers report, women are disproportionately represented in lower management. That alone might account for the apparent differences among levels of management. In short, the failure to take account of the oversampling of women confounds all survey results that don't separate the findings by gender. The solution to this problem would have been to weight the responses by gender, as described earlier in this section.
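The weighting logic described earlier in this section (the city whose suburban area holds one-fourth of the population, with the same number of households sampled from each area) can be sketched as follows. The response counts are invented purely for illustration, and weighted_proportion is a hypothetical helper, not a function from any survey package. The weights 1 and 3 are the simplified weights mentioned in the text.

```python
def weighted_proportion(strata):
    """strata: list of (weight, cases, yes_answers) tuples, one per sampled area."""
    weighted_yes = sum(weight * yes for weight, _, yes in strata)
    weighted_cases = sum(weight * cases for weight, cases, _ in strata)
    return weighted_yes / weighted_cases

# Hypothetical results: 200 households sampled in each area.
suburb = (1, 200, 120)        # weight 1; 120 of 200 answer yes
rest_of_city = (3, 200, 80)   # weight 3; 80 of 200 answer yes

unweighted = (120 + 80) / (200 + 200)
weighted = weighted_proportion([suburb, rest_of_city])
print(unweighted, weighted)  # prints 0.5 0.45
```

In this made-up case, simply pooling the two samples overstates the citywide figure because the oversampled suburban households count for too much. The same kind of adjustment, applied by gender, is what the Harvard Business Review figures would have needed.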

In the 2000 and 2004 election campaign polling, survey weighting became a controversial topic, as some polling agencies weighted their results on the basis of party affiliation and other variables, whereas others did not. Weighting in this instance involved assumptions regarding the differential participation of Republicans and Democrats in opinion polls and on election day, plus a determination of how many Republicans and Democrats there were. This is likely to be a topic of debate among pollsters and politicians in the years to come. Alan Reifman has created a website devoted to a discussion of this topic (http://www.hs.ttu.edu/hdfs3390/weighting.htm).

Probability Sampling in Review

Much of this chapter has been devoted to the key sampling method used in controlled survey research: probability sampling. In each of the variations examined, we've seen that elements are chosen for study from a population on a basis of random selection with known nonzero probabilities.

Depending on the field situation, probability sampling can be either very simple or extremely difficult, time-consuming, and expensive. Whatever the situation, however, it remains the most effective method for the selection of study elements. There are two reasons for this.

First, probability sampling avoids researchers' conscious or unconscious biases in element selection. If all elements in the population have an equal (or unequal and subsequently weighted) chance of selection, there is an excellent chance that the sample so selected will closely represent the population of all elements.

Second, probability sampling permits estimates of sampling error. Although no probability sample will be perfectly representative in all respects, controlled selection methods permit the researcher to estimate the degree of expected error.

In this lengthy chapter, we've taken on a basic issue in much social research: selecting observations that will tell us something more general than the specifics we've actually observed. This issue confronts field researchers, who face more action and more actors than they can observe and record fully, as well as political pollsters who want to predict an election but can't interview all voters. As we proceed through the book, we'll see in greater detail how social researchers have found ways to deal with this issue.

MAIN POINTS

Introduction
• Social researchers must select observations that will allow them to generalize to people and events not observed. Often this involves sampling, a selection of people to observe.
• Understanding the logic of sampling is essential to doing social research.

A Brief History of Sampling
• Sometimes you can and should select probability samples using precise statistical techniques, but other times nonprobability techniques are more appropriate.

Nonprobability Sampling
• Nonprobability sampling techniques include relying on available subjects, purposive or judgmental sampling, snowball sampling, and quota sampling. In addition, researchers studying a social group may make use of informants. Each of these techniques has its uses, but none of them ensures that the resulting sample will be representative of the population being sampled.

The Theory and Logic of Probability Sampling
• Probability sampling methods provide an excellent way of selecting representative samples from large, known populations. These methods counter the problems of conscious and unconscious sampling bias by giving each element in the population a known (nonzero) probability of selection.
• The key to probability sampling is random selection.
• The most carefully selected sample will never provide a perfect representation of the population from which it was selected. There will always be some degree of sampling error.

• By predicting the distribution of samples with respect to the target parameter, probability sampling methods make it possible to estimate the amount of sampling error expected in a given sample.
• The expected error in a sample is expressed in terms of confidence levels and confidence intervals.

Populations and Sampling Frames
• A sampling frame is a list or quasi list of the members of a population. It is the resource used in the selection of a sample. A sample's representativeness depends directly on the extent to which a sampling frame contains all the members of the total population that the sample is intended to represent.

Types of Sampling Designs
• Several sampling designs are available to researchers.
• Simple random sampling is logically the most fundamental technique in probability sampling, but it is seldom used in practice.
• Systematic sampling involves the selection of every kth member from a sampling frame. This method is more practical than simple random sampling; with a few exceptions, it is functionally equivalent.
• Stratification, the process of grouping the members of a population into relatively homogeneous strata before sampling, improves the representativeness of a sample by reducing the degree of sampling error.

Multistage Cluster Sampling
• Multistage cluster sampling is a relatively complex sampling technique that frequently is used when a list of all the members of a population does not exist. Typically, researchers must balance the number of clusters and the size of each cluster to achieve a given sample size. Stratification can be used to reduce the sampling error involved in multistage cluster sampling.
• Probability proportionate to size (PPS) is a special, efficient method for multistage cluster sampling.
• If the members of a population have unequal probabilities of selection into the sample, researchers must assign weights to the different observations made, in order to provide a representative picture of the total population. The weight assigned to a particular sample member should be the inverse of its probability of selection.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

cluster sampling
confidence interval
confidence level
element
EPSEM
informant
nonprobability sampling
parameter
population
PPS
probability sampling
purposive (judgmental) sampling
quota sampling
random selection
representativeness
sampling error
sampling frame
sampling interval
sampling ratio
sampling unit
simple random sampling
snowball sampling
statistic
stratification
study population
systematic sampling
weighting

REVIEW QUESTIONS AND EXERCISES

1. Review the discussion of the 1948 Gallup Poll that predicted that Thomas Dewey would defeat Harry Truman for president. What are some ways Gallup could have modified his quota sample design to avoid the error?
2. Using Appendix C of this book, select a simple random sample of 10 numbers in the range of 1 to 9,876. What is each step in the process?
3. What are the steps involved in selecting a multistage cluster sample of students taking first-year English in U.S. colleges and universities?

4. In Chapter 9, we'll discuss surveys conducted on the Internet. Can you anticipate possible problems concerning sampling frames, representativeness, and the like? Do you see any solutions?
5. Using InfoTrac College Edition, locate studies using (1) a quota sample, (2) a multistage cluster sample, and (3) a systematic sample. Write a brief description of each study.

ADDITIONAL READINGS

Frankfort-Nachmias, Chava, and Anna Leon-Guerrero. 2000. Social Statistics for a Diverse Society. 2nd ed. Thousand Oaks, CA: Pine Forge Press. See Chapter 11 especially. This statistics textbook covers many of the topics we've discussed in this chapter but in a more statistical context. It demonstrates the links between probability sampling and statistical analyses.

Kalton, Graham. 1983. Introduction to Survey Sampling. Newbury Park, CA: Sage. Kalton goes into more of the mathematical details of sampling than the present chapter does, without attempting to be as definitive as Kish, described next.

Kish, Leslie. 1965. Survey Sampling. New York: Wiley. Unquestionably the definitive work on sampling in social research. Kish's coverage ranges from the simplest matters to the most complex and mathematical, both highly theoretical and downright practical. Easily readable and difficult passages intermingle as Kish exhausts everything you could want or need to know about each aspect of sampling.

Sudman, Seymour. 1983. "Applied Sampling." Pp. 145-94 in Handbook of Survey Research, edited by Peter H. Rossi, James D. Wright, and Andy B. Anderson. New York: Academic Press. An excellent, practical guide to survey sampling.

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you'll also find a detailed primer on using SPSS.

Online Study Resources

SociologyNow: Research Methods

1. Before you do your final review of the chapter, take the SociologyNow: Research Methods diagnostic quiz to help identify the areas on which you should concentrate. You'll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.
2. As you review, take advantage of the SociologyNow: Research Methods customized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.
3. When you're finished with your review, take the posttest to confirm that you're ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 11TH EDITION

Go to your book's website at http://sociology.wadsworth.com/babbie_practice11e for tools to aid you in studying for your exams. You'll find Tutorial Quizzes with feedback, Internet Exercises, Flashcards, and Chapter Tutorials, as well as Extended Projects, InfoTrac College Edition search terms, Social Research in Cyberspace, GSS Data, Web Links, and primers for using various data-analysis software such as SPSS and NVivo.

WEB LINKS FOR THIS CHAPTER

Please realize that the Internet is an evolving entity, subject to change. Nevertheless, these few websites should be fairly stable. Also, check your book's website for even more Web Links. These websites, current at the time of this book's publication, provide opportunities to learn about sampling.

Bill Trochim, Probability Sampling
http://www.socialresearchmethods.net/kb/sampprob.htm

Survey Sampling, Inc., The Frame
http://www.worldopinion.com/the_frame/

Bureau of Labor Statistics and Census Bureau, Sampling
http://www.bls.census.gov/cps/bsampdes.htm

Having explored the structuring of inquiry in depth, we're now ready to dive into the various observational techniques available to social scientists.

Experiments are usually thought of in connection with the physical sciences. In Chapter 8 we'll see how social scientists use experiments. This is the most rigorously controllable of the methods we'll examine. Understanding experiments is also a useful way to enhance your understanding of the general logic of social scientific research.

Chapter 9 will describe survey research, one of the most popular methods in social science. This type of research involves collecting data by asking people questions, either in self-administered questionnaires or through interviews, which, in turn, can be conducted face-to-face or over the telephone.

Chapter 10, on qualitative field research, examines perhaps the most natural form of data collection used by social scientists: the direct observation of social phenomena in natural settings. As you'll see, some researchers go beyond mere observation to participate in what they're studying, because they want a more intimate view and a fuller understanding of it.

Chapter 11 discusses three forms of unobtrusive data collection that take advantage of some of the data available all around us. For example, content analysis is a method of collecting social data through carefully specifying and counting social artifacts such as books, songs, speeches, and paintings. Without making any personal contact with people, you can use this method to examine a wide variety of social phenomena. The analysis of existing statistics offers another way of studying people without having to talk to them. Governments and a variety of private organizations regularly compile great masses of data, which you often can use with little or no modification to answer properly posed questions. Finally, historical documents are a valuable resource for social scientific analysis.

Chapter 12, on evaluation research, looks at a rapidly growing subfield in social science, involving the application of experimental and quasi-experimental models to the testing of social interventions in real life. You might use evaluation research, for example, to test the effectiveness of a drug rehabilitation program or the efficiency of a new school cafeteria. In the same chapter, we'll look briefly at social indicators as a way of assessing broader social processes.

Before we turn to the actual descriptions of these research methods, two points should be made. First, you'll probably discover that you've been using these scientific methods casually in your daily life for as long as you can remember. You use some form of field research every day. You employ a crude form of content analysis every time you judge an author's motivation from her or his writings. You engage in at least casual experiments frequently. Part 3 will show you how to improve your use of these methods so as to avoid certain pitfalls.

Second, none of the data-collection methods described in these chapters is appropriate to all research topics and situations. I give you some ideas, early in each chapter, of when a given method might be appropriate. Still, I could never anticipate all the research topics that may one day interest you. As a general guideline, you should always use a variety of techniques in the study of any topic. Because each method has its weaknesses, the use of several methods can help fill in any gaps; if the different, independent approaches to the topic all yield the same conclusion, you've achieved a form of replication.

Experiments

Introduction
Topics Appropriate to Experiments
The Classical Experiment
  Independent and Dependent Variables
  Pretesting and Posttesting
  Experimental and Control Groups
  The Double-Blind Experiment
Selecting Subjects
  Probability Sampling
  Randomization
  Matching
  Matching or Randomization?
Variations on Experimental Design
  Preexperimental Research Designs
  Validity Issues in Experimental Research
An Illustration of Experimentation
Alternative Experimental Settings
  Web-Based Experiments
  "Natural" Experiments
Strengths and Weaknesses of the Experimental Method

SociologyNow: Research Methods
Use this online tool to help you make the grade on your next exam. After reading this chapter, go to the "Online Study Resources" at the end of the chapter for instructions on how to benefit from SociologyNow: Research Methods.

Introduction

This chapter addresses the research method most commonly associated with structured science in general: the experiment. Here we'll focus on the experiment as a mode of scientific observation in social research. At base, experiments involve (1) taking action and (2) observing the consequences of that action. Social researchers typically select a group of subjects, do something to them, and observe the effect of what was done. In this chapter, we'll examine the logic and some of the techniques of social scientific experiments.

It's worth noting at the outset that we often use experiments in nonscientific inquiry. In preparing a stew, for example, we add salt, taste, add more salt, and taste again. In defusing a bomb, we clip the red wire, observe whether the bomb explodes, clip another, and ...

We also experiment copiously in our attempts to develop generalized understandings about the world we live in. All skills are learned through experimentation: eating, walking, talking, riding a bicycle, swimming, and so forth. Through experimentation, students discover how much studying is required for academic success. Through experimentation, professors learn how much preparation is required for successful lectures. This chapter discusses how social researchers use experiments to develop generalized understandings. We'll see that, like other methods available to the social researcher, experimenting has its special strengths and weaknesses.

Topics Appropriate to Experiments

Experiments are more appropriate for some topics and research purposes than others. Experiments are especially well suited to research projects involving relatively limited and well-defined concepts and propositions. In terms of the traditional image of science, discussed earlier in this book, the experimental model is especially appropriate for hypothesis testing. Because experiments focus on determining causation, they're also better suited to explanatory than to descriptive purposes.

Let's assume, for example, that we want to discover ways of reducing prejudice against African Americans. We hypothesize that learning about the contribution of African Americans to U.S. history will reduce prejudice, and we decide to test this hypothesis experimentally. To begin, we might test a group of experimental subjects to determine their levels of prejudice against African Americans. Next, we might show them a documentary film depicting the many important ways African Americans have contributed to the scientific, literary, political, and social development of the nation. Finally, we would measure our subjects' levels of prejudice against African Americans to determine whether the film has actually reduced prejudice.

Experimentation has also been successful in the study of small-group interaction. Thus, we might bring together a small group of experimental subjects and assign them a task, such as making recommendations for popularizing car pools. We observe, then, how the group organizes itself and deals with the problem. Over the course of several such experiments, we might systematically vary the nature of the task or the rewards for handling the task successfully. By observing differences in the way groups organize themselves and operate under these varying conditions, we can learn a great deal about the nature of small-group interaction and the factors that influence it. For example, attorneys sometimes present evidence in different ways to different mock juries, to see which method is the most effective.

We typically think of experiments as being conducted in laboratories. Indeed, most of the examples in this chapter involve such a setting. This need not be the case, however. Increasingly, social researchers are using the World Wide Web as a vehicle for conducting experiments. Further, sometimes we can construct what are called natural experiments: "experiments" that occur in the regular course of social events. The latter portion of this chapter deals with such research.
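
As a rough illustration of the prejudice example in this section (measure, show the film, measure again), the before-and-after comparison can be sketched in a few lines of Python. The subject IDs and scores below are invented for illustration and are not data from any study; the function name individual_changes is likewise hypothetical.

```python
from statistics import mean

# Hypothetical prejudice scores for five subjects (higher = more prejudiced).
# "before" is measured prior to the film, "after" once the film has been shown.
before = {"s1": 70, "s2": 64, "s3": 58, "s4": 75, "s5": 63}
after = {"s1": 62, "s2": 60, "s3": 59, "s4": 66, "s5": 53}


def individual_changes(first, second):
    """Return second-minus-first score for every subject measured both times."""
    return {sid: second[sid] - first[sid] for sid in first if sid in second}


# Change for each individual subject (negative = lower prejudice score).
changes = individual_changes(before, after)

# Change in the average level of the whole group.
group_change = mean(after.values()) - mean(before.values())

print(changes)
print(group_change)
```

With these made-up numbers the group average falls by 6 points, although one subject (s3) moves slightly in the other direction. The arithmetic shows only that the scores changed, not why they changed.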

Chapter 8: Experiments

The Classical Experiment

In both the natural and the social sciences, the most conventional type of experiment involves three major pairs of components: (1) independent and dependent variables, (2) pretesting and posttesting, and (3) experimental and control groups. This section looks at each of these components and the way they're put together in the execution of the experiment.

Independent and Dependent Variables

Essentially, an experiment examines the effect of an independent variable on a dependent variable. Typically, the independent variable takes the form of an experimental stimulus, which is either present or absent. That is, the stimulus is a dichotomous variable, having two attributes, present or not present. In this typical model, the experimenter compares what happens when the stimulus is present to what happens when it is not.

In the example concerning prejudice against African Americans, prejudice is the dependent variable and exposure to African American history is the independent variable. The researcher's hypothesis suggests that prejudice depends, in part, on a lack of knowledge of African American history. The purpose of the experiment is to test the validity of this hypothesis by presenting some subjects with an appropriate stimulus, such as a documentary film. In other terms, the independent variable is the cause and the dependent variable is the effect. Thus, we might say that watching the film caused a change in prejudice or that reduced prejudice was an effect of watching the film.

The independent and dependent variables appropriate to experimentation are nearly limitless. Moreover, a given variable might serve as an independent variable in one experiment and as a dependent variable in another. For example, prejudice is the dependent variable in our example, but it might be the independent variable in an experiment examining the effect of prejudice on voting behavior.

To be used in an experiment, both independent and dependent variables must be operationally defined. Such operational definitions might involve a variety of observation methods. Responses to a questionnaire, for example, might be the basis for defining prejudice. Speaking to or ignoring African Americans, or agreeing or disagreeing with them, might be elements in the operational definition of interaction with African Americans in a small-group setting.

Conventionally, in the experimental model, dependent and independent variables must be operationally defined before the experiment begins. However, as you'll see in connection with survey research and other methods, it's sometimes appropriate to make a wide variety of observations during data collection and then determine the most useful operational definitions of variables during later analyses. Ultimately, however, experimentation, like other quantitative methods, requires specific and standardized measurements and observations.

Pretesting and Posttesting

In the simplest experimental design, subjects are measured in terms of a dependent variable (pretesting), exposed to a stimulus representing an independent variable, and then remeasured in terms of the dependent variable (posttesting). Any differences between the first and last measurements on the dependent variable are then attributed to the independent variable.

pretesting: The measurement of a dependent variable among subjects.

posttesting: The remeasurement of a dependent variable among subjects after they've been exposed to an independent variable.

In the example of prejudice and exposure to African American history, we'd begin by pretesting the extent of prejudice among our experimental subjects. Using a questionnaire asking about attitudes toward African Americans, for example, we could measure both the extent of prejudice exhibited by each individual subject and the average prejudice level of the whole group. After exposing the subjects to the African American history film, we could administer the same questionnaire again. Responses given in this posttest would permit us to measure the later extent of prejudice for each subject and the average prejudice level of the group as a whole. If we discovered a lower level of prejudice during the second administration of the questionnaire, we might conclude that the film had indeed reduced prejudice.

In the experimental examination of attitudes such as prejudice, we face a special practical problem relating to validity. As you may already have imagined, the subjects might respond differently to the questionnaires the second time even if their attitudes remain unchanged. During the first administration of the questionnaire, the subjects might be unaware of its purpose. By the second measurement, they might have figured out that the researchers were interested in measuring their prejudice. Because no one wishes to seem prejudiced, the subjects might "clean up" their answers the second time around. Thus, the film would seem to have reduced prejudice although, in fact, it had not.

This is an example of a more general problem that plagues many forms of social research: The very act of studying something may change it. The techniques for dealing with this problem in the context of experimentation will be discussed in various places throughout the chapter. The first technique involves the use of control groups.

Experimental and Control Groups

Laboratory experiments seldom, if ever, involve only the observation of an experimental group to which a stimulus has been administered. In addition, the researchers also observe a control group, which does not receive the experimental stimulus.

experimental group: In experimentation, a group of subjects to whom an experimental stimulus is administered.

control group: In experimentation, a group of subjects to whom no experimental stimulus is administered and who should resemble the experimental group in all other respects. The comparison of the control group and the experimental group at the end of the experiment points to the effect of the experimental stimulus.

In the example of prejudice and African American history, we might examine two groups of subjects. To begin, we give each group a questionnaire designed to measure their prejudice against African Americans. Then we show the film only to the experimental group. Finally, we administer a posttest of prejudice to both groups. Figure 8-1 illustrates this basic experimental design.

FIGURE 8-1
Diagram of Basic Experimental Design. The fundamental purpose of an experiment is to isolate the possible effect of an independent variable (called the stimulus in experiments) on a dependent variable. Members of the experimental group(s) are exposed to the stimulus, while those in the control group(s) are not.
[Diagram. Experimental group: measure dependent variable, administer experimental stimulus (film), remeasure dependent variable. Control group: measure dependent variable, remeasure dependent variable. Compare the first measurements of the two groups (same?) and the remeasurements (different?).]

Using a control group allows the researcher to detect any effects of the experiment itself. If the posttest shows that the overall level of prejudice exhibited by the control group has dropped as much as that of the experimental group, then the apparent reduction in prejudice must be a function of the experiment or of some external factor rather than a function of the film. If, on the other hand, prejudice is reduced only in the experimental group, this reduction would seem to be a consequence of exposure to the film, because that's the only difference between the two groups. Alternatively, if prejudice is reduced in both groups but to a greater degree in the experimental group than in the control group, that, too, would be grounds for assuming that the film reduced prejudice.

The need for control groups in social research became clear in connection with a series of studies of employee satisfaction conducted by F. J. Roethlisberger and W. J. Dickson (1939) in the late 1920s and early 1930s. These two researchers were interested in discovering what changes in working conditions would improve employee satisfaction and productivity. To pursue this objective, they studied working conditions in the telephone "bank wiring room" of the Western Electric Works in the Chicago suburb of Hawthorne, Illinois.

To the researchers' great satisfaction, they discovered that improving the working conditions increased satisfaction and productivity consistently. As the workroom was brightened up through better lighting, for example, productivity went up. When lighting was further improved, productivity went up again.

To further substantiate their scientific conclusion, the researchers then dimmed the lights. Whoops, productivity improved again!

At this point it became evident that the wiring-room workers were responding more to the attention given them by the researchers than to improved working conditions. As a result of this phenomenon, often called the Hawthorne effect, social researchers have become more sensitive to and cautious about the possible effects of experiments themselves. In the wiring-room study, the use of a proper control group (one that was studied intensively without any other changes in the working conditions) would have pointed to the presence of this effect.

The need for control groups in experimentation has been nowhere more evident than in medical research. Time and again, patients who participate in medical experiments have appeared to improve, but it has been unclear how much of the improvement has come from the experimental treatment and how much from the experiment. In testing the effects of new drugs, then, medical researchers frequently administer a placebo (a "drug" with no relevant effect, such as sugar pills) to a control group. Thus, the control-group patients believe that they, like the experimental group, are receiving an experimental drug. Often, they improve. If the new drug is effective, however, those receiving the actual drug will improve more than those receiving the placebo.

In social scientific experiments, control groups guard against not only the effects of the experiments themselves but also the effects of any events outside the laboratory during the experiments. In the example of the study of prejudice, suppose that a popular African American leader is assassinated in the middle of, say, a weeklong experiment. Such an event may very well horrify the experimental subjects, requiring them to examine their own attitudes toward African Americans, with the result of reduced prejudice. Because such an effect should happen about equally for members of the control and experimental groups, a greater reduction of prejudice among the experimental group would, again, point to the impact of the experimental stimulus: the documentary film.

Sometimes an experimental design requires more than one experimental or control group. In the case of the documentary film, for example, we might also want to examine the impact of reading a book on African American history. In that case, we might have one group see the film and read the book, another group only see the movie, still another group only read the book, and the control group do neither. With this kind of design, we could determine the impact of each stimulus separately, as well as their combined effect.

The Double-Blind Experiment

Like patients who improve when they merely think they're receiving a new drug, sometimes experimenters tend to prejudge results. In medical research, the experimenters may be more likely to "observe" improvements among patients receiving the experimental drug than among those receiving the placebo. (This would be most likely, perhaps, for the researcher who developed the drug.) A double-blind experiment eliminates this possibility, because in this design neither the subjects nor the experimenters know which is the experimental group and which is the control. In the

double-blind experiment: An experimental design in which neither the subjects nor the experimenters know which is the experimental group and which is the control.
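
A minimal Python sketch, under stated assumptions, of how the control-group comparison and the double-blind idea from this section might fit together. Scores are recorded under neutral group codes, the change from pretest to posttest is averaged per code, and only then is the key linking codes to conditions opened. The scores, the codes A and B, the function names, and the idea of a key held by a third party are illustrative assumptions, not details given in the text.

```python
from statistics import mean

# Hypothetical pretest/posttest scores (higher = more of the measured trait),
# recorded under neutral group codes so that whoever scores the
# questionnaires cannot tell which group received the stimulus.
records = [
    # (subject_id, group_code, pretest, posttest)
    ("s1", "A", 70, 52),
    ("s2", "A", 64, 50),
    ("s3", "A", 58, 45),
    ("s4", "B", 68, 63),
    ("s5", "B", 62, 59),
    ("s6", "B", 59, 55),
]

# Assumption: the key linking codes to conditions is held by a third party
# and is consulted only after all measurements are in.
sealed_key = {"A": "experimental", "B": "control"}


def mean_change_by_code(rows):
    """Return {group_code: mean(posttest - pretest)} without using the key."""
    changes = {}
    for _subject, code, pre, post in rows:
        changes.setdefault(code, []).append(post - pre)
    return {code: mean(values) for code, values in changes.items()}


def unblind(changes_by_code, key):
    """Relabel the coded results with their conditions once the key is opened."""
    return {key[code]: change for code, change in changes_by_code.items()}


coded = mean_change_by_code(records)
by_condition = unblind(coded, sealed_key)

# Change in the experimental group beyond the change seen in the control group.
net_effect = by_condition["experimental"] - by_condition["control"]

print(coded)
print(by_condition)
print(net_effect)
```

With these invented numbers, coded group A drops by 15 points on average and group B by 4. Once the key is opened, the experimental group's reduction exceeds the control group's by 11 points, which is the pattern the text treats as grounds for assuming that the stimulus had an effect.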

