Chapter 7: The Logic of Sampling

they report, for example, that 32 percent of the women and 66 percent of the men agree that “the amount of sexual harassment at work is greatly exaggerated,” we know that the female response is based on a substantial number of cases. That’s good. There are problems, however.

To begin with, subscriber surveys are always problematic. In this case, the best the researchers can hope to talk about is “what subscribers to Harvard Business Review think.” In a loose way, it might make sense to think of that population as representing the more sophisticated portion of corporate management. Unfortunately, the overall response rate was 25 percent. Although that’s quite good for subscriber surveys, it’s a low response rate in terms of generalizing from probability samples.

Beyond that, however, the disproportionate sample design creates another problem. When the authors state that 73 percent of respondents favor company policies against harassment (Collins and Blodgett 1981: 78), that figure is undoubtedly too high, because the sample contains a disproportionately high percentage of women, who are more likely than men to favor such policies. And, when the researchers report that top managers are more likely to feel that claims of sexual harassment are exaggerated than are middle- and lower-level managers (1981: 81), that finding is also suspect. As the researchers report, women are disproportionately represented in lower management. That alone might account for the apparent differences among levels of management. In short, the failure to take account of the oversampling of women confounds all survey results that don’t separate the findings by gender. The solution to this problem would have been to weight the responses by gender, as described earlier in this section.

In the 2000 and 2004 election campaign polling, survey weighting became a controversial topic, as some polling agencies weighted their results on the basis of party affiliation and other variables, whereas others did not. Weighting in this instance involved assumptions regarding the differential participation of Republicans and Democrats in opinion polls and on election day, plus a determination of how many Republicans and Democrats there were. This is likely to be a topic of debate among pollsters and politicians in the years to come. Alan Reifman has created a website devoted to a discussion of this topic (go to the link on this book’s website: http://www.cengage.com/sociology/babbie).

Probability Sampling in Review

Much of this chapter has been devoted to the key sampling method used in controlled survey research: probability sampling. In each of the variations examined, we’ve seen that elements are chosen for study from a population on a basis of random selection with known nonzero probabilities.

Depending on the field situation, probability sampling can be either very simple or extremely difficult, time-consuming, and expensive. Whatever the situation, however, it remains the most effective method for the selection of study elements. There are two reasons for this.

First, probability sampling avoids researchers’ conscious or unconscious biases in element selection. If all elements in the population have an equal (or unequal and subsequently weighted) chance of selection, there is an excellent chance that the sample so selected will closely represent the population of all elements.

Second, probability sampling permits estimates of sampling error. Although no probability sample will be perfectly representative in all respects, controlled selection methods permit the researcher to estimate the degree of expected error.

In this lengthy chapter, we’ve taken on a basic issue in much social research: selecting observations that will tell us something more general than the specifics we’ve actually observed. This issue confronts field researchers, who face more action and more actors than they can observe and record fully, as well as political pollsters who want to predict an election but can’t interview all voters. As we proceed through the book, we’ll see in greater detail how social researchers have found ways to deal with this issue.
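To make the weighting remedy for an oversampled group concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure of the researchers discussed above: the population and sample counts are hypothetical, chosen only so that 32 percent of the women and 66 percent of the men agree, and nonresponse is ignored. Each respondent is weighted by the inverse of the probability of selection for his or her group.

```python
from fractions import Fraction

# Hypothetical counts: women are heavily oversampled relative to the population.
# Only the 32 percent and 66 percent agreement rates come from the text.
strata = {
    "women": {"population": 1000, "sampled": 500, "agree": 160},   # 32% agree
    "men": {"population": 19000, "sampled": 500, "agree": 330},    # 66% agree
}


def unweighted_share(strata):
    """Share agreeing if every respondent counts equally."""
    agree = sum(s["agree"] for s in strata.values())
    sampled = sum(s["sampled"] for s in strata.values())
    return Fraction(agree, sampled)


def weighted_share(strata):
    """Share agreeing when each respondent is weighted by 1 / P(selection)."""
    weighted_agree = Fraction(0)
    weighted_total = Fraction(0)
    for s in strata.values():
        selection_probability = Fraction(s["sampled"], s["population"])
        weight = 1 / selection_probability
        weighted_agree += weight * s["agree"]
        weighted_total += weight * s["sampled"]
    return weighted_agree / weighted_total


print(float(unweighted_share(strata)))
print(float(weighted_share(strata)))
```

With these made-up counts, the unweighted figure is 49 percent and the weighted figure is 64.3 percent. The gap shows how far an overall percentage can drift when an oversampled group answers differently from everyone else.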
The Ethics of Sampling

The key purpose of the sampling techniques discussed in this chapter is to allow researchers to make relatively few observations but gain an accurate picture of a much larger population. In the case of quantitative studies using probability sampling, the result should be a statistical profile, based on the sample, that closely mirrors the profile that would have been gained from observing the whole population. In addition to using legitimate sampling techniques, researchers should be careful to point out the possibility of errors: sampling error, flaws in the sampling frame, nonresponse error, or anything else that might make the results misleading.

Sometimes, more typically in qualitative studies, the purpose of sampling may be to tap into the breadth of variation within a population rather than to focus on the “average” or “typical” member of that population. While this is a legitimate and valuable approach, it poses the risk that readers may mistake the display of differences to reflect the distribution of characteristics in the population. In such a case, the researcher should make sure that the reader is not misled.

MAIN POINTS

Introduction

• Social researchers must select observations that will allow them to generalize to people and events not observed. Often this involves sampling a selection of people to observe.

• Understanding the logic of sampling is essential to doing social research.

A Brief History of Sampling

• Sometimes you can and should select probability samples using precise statistical techniques, but other times nonprobability techniques are more appropriate.

Nonprobability Sampling

• Nonprobability sampling techniques include relying on available subjects, purposive or judgmental sampling, snowball sampling, and quota sampling. In addition, researchers studying a social group may make use of informants. Each of these techniques has its uses, but none of them ensures that the resulting sample will be representative of the population being sampled.

The Theory and Logic of Probability Sampling

• Probability-sampling methods provide an excellent way of selecting representative samples from large, known populations. These methods counter the problems of conscious and unconscious sampling bias by giving each element in the population a known (nonzero) probability of selection.

• Random selection is often a key element in probability sampling.

• The most carefully selected sample will never provide a perfect representation of the population from which it was selected. There will always be some degree of sampling error.

• By predicting the distribution of samples with respect to the target parameter, probability-sampling methods make it possible to estimate the amount of sampling error expected in a given sample.

• The expected error in a sample is expressed in terms of confidence levels and confidence intervals.

Populations and Sampling Frames

• A sampling frame is a list or quasi list of the members of a population. It is the resource used in the selection of a sample. A sample’s representativeness depends directly on the extent to which a sampling frame contains all the members of the total population that the sample is intended to represent.

Types of Sampling Designs

• Several sampling designs are available to researchers.

• Simple random sampling is logically the most fundamental technique in probability sampling, but it is seldom used in practice.

• Systematic sampling involves the selection of every kth member from a sampling frame. This method is more practical than simple random sampling; with a few exceptions, it is functionally equivalent.
• Stratification, the process of grouping the members of a population into relatively homogeneous strata before sampling, improves the representativeness of a sample by reducing the degree of sampling error.

Multistage Cluster Sampling

• Multistage cluster sampling is a relatively complex sampling technique that frequently is used when a list of all the members of a population does not exist. Typically, researchers must balance the number of clusters and the size of each cluster to achieve a given sample size. Stratification can be used to reduce the sampling error involved in multistage cluster sampling.

• Probability proportionate to size (PPS) is a special, efficient method for multistage cluster sampling.

• If the members of a population have unequal probabilities of selection into the sample, researchers must assign weights to the different observations made, in order to provide a representative picture of the total population. The weight assigned to a particular sample member should be the inverse of its probability of selection.

Probability Sampling in Review

• Probability sampling remains the most effective method for the selection of study elements for two reasons: it avoids researcher bias in element selection and it permits estimates of sampling error.

The Ethics of Sampling

• Because probability sampling always carries a risk of error, the researcher must inform readers of any errors that might make results misleading.

• Sometimes, nonprobability sampling methods are used to obtain the breadth of variations in a population. In this case, the researcher must ensure that readers do not confuse variations with what’s typical in the population.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

cluster sampling, confidence interval, confidence level, element, EPSEM, informant, nonprobability sampling, parameter, population, PPS, probability sampling, purposive (judgmental) sampling, quota sampling, random selection, representativeness, sampling error, sampling frame, sampling interval, sampling ratio, sampling unit, simple random sampling, snowball sampling, statistic, stratification, study population, systematic sampling, weighting

PROPOSING SOCIAL RESEARCH: SAMPLING

In this portion of the proposal, you’ll describe how you’ll select from among all the possible observations you might make. Depending on the data collection method you plan to employ, either probability or nonprobability sampling may be more appropriate to your study. Similarly, this aspect of your proposal may involve the sampling of subjects or informants, or it could involve the sampling of corporations, cities, books, and so forth.

Your proposal, then, must specify what units you’ll be sampling among, the data you’ll use (such as a sampling frame) for purposes of your sample selection, and the actual sampling methods you’ll use.

REVIEW QUESTIONS AND EXERCISES

1. Review the discussion of the 1948 Gallup Poll that predicted that Thomas Dewey would defeat Harry Truman for president. What are some ways Gallup could have modified his quota sample design to avoid the error?

2. Using Appendix C of this book, select a simple random sample of 10 numbers in the range of 1 to 9,876. What is each step in the process?

3. What are the steps involved in selecting a multistage cluster sample of students taking first-year English in U.S. colleges and universities?

4. In Chapter 9 we’ll discuss surveys conducted on the Internet. Can you anticipate possible problems concerning sampling frames, representativeness, and the like? Do you see any solutions?
5. Using InfoTrac College Edition, locate studies using (1) a quota sample, (2) a multistage cluster sample, and (3) a systematic sample. Write a brief description of each study.

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you’ll also find a detailed primer on using SPSS.

Online Study Resources

If your book came with an access code card, visit www.cengage.com/login to register. To purchase access, please visit www.ichapters.com.

1. Before you do your final review of the chapter, take the CengageNOW pretest to help identify the areas on which you should concentrate. You’ll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.

2. As you review, take advantage of the CengageNOW personalized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.

3. When you’re finished with your review, take the posttest to confirm that you’re ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 12TH EDITION

Go to your book’s website at www.cengage.com/sociology/babbie for tools to aid you in studying for your exams. You’ll find Tutorial Quizzes with feedback, Internet Exercises, Flash Cards, Glossaries, and Essay Quizzes, as well as InfoTrac College Edition search terms, suggestions for additional reading, Web Links, and primers for using data-analysis software such as SPSS.
PART 3

Modes of Observation: Quantitative and Qualitative

8 Experiments
9 Survey Research
10 Qualitative Field Research
11 Unobtrusive Research
12 Evaluation Research

Having explored the structuring of inquiry in depth, we’re now ready to dive into the various observational techniques available to social scientists.

Experiments are usually thought of in connection with the physical sciences. In Chapter 8 we’ll see how social scientists use experiments. This is the most rigorously controllable of the methods we’ll examine. Understanding experiments is also a useful way to enhance your understanding of the general logic of social science research.

Chapter 9 will describe survey research, one of the most popular methods in social science. This type of research involves collecting data by asking people questions, either in self-administered questionnaires or through interviews, which, in turn, can be conducted face-to-face or over the telephone.

Chapter 10, on qualitative field research, examines perhaps the most natural form of data collection used by social scientists: the direct observation of social phenomena in natural settings. As you’ll see, some researchers go beyond mere observation to participate in what they’re studying, because they want a more intimate view and a fuller understanding of it.

Chapter 11 discusses three forms of unobtrusive data collection that take advantage of some of the data available all around us. For example, content analysis is a method of collecting social data through carefully specifying and counting social artifacts such as books, songs, speeches, and paintings. Without making any personal contact with people, you can use this method to examine a wide variety of social phenomena. The analysis of existing statistics offers another way of studying people without having to talk to them. Governments and a variety of private organizations regularly compile great masses of data, which you often can use with little or no modification to answer properly posed questions. Finally, historical documents are a valuable resource for social science analysis.

Chapter 12, on evaluation research, looks at a rapidly growing subfield in social science, involving the application of experimental and quasi-experimental models to the testing of social interventions in real life. You might use evaluation research, for example, to test the effectiveness of a drug rehabilitation program or the efficiency of a new school cafeteria. In the same chapter, we’ll look briefly at social indicators as a way of assessing broader social processes.

Before we turn to the actual descriptions of these research methods, two points should be made. First, you’ll probably discover that you’ve been using these scientific methods casually in your daily life for as long as you can remember. You use some form of field research every day. You employ a crude form of content analysis every time you judge an author’s motivation from her or his writings. You engage in at least casual experiments frequently. Part 3 will show you how to improve your use of these methods so as to avoid certain pitfalls.

Second, none of the data-collection methods described in these chapters is appropriate to all research topics and situations. I give you some ideas, early in each chapter, regarding when a given method might be appropriate. Still, I could never anticipate all the research topics that may one day interest you. As a general guideline, you should always use a variety of techniques in the study of any topic. Because each method has its weaknesses, the use of several methods can help fill any gaps; if the different, independent approaches to the topic all yield the same conclusion, you’ve achieved a form of replication.
CHAPTER EIGHT

Experiments

CHAPTER OVERVIEW

An experiment is a mode of observation that enables researchers to probe causal relationships. Many experiments in social research are conducted under the controlled conditions of a laboratory, but experimenters can also take advantage of natural occurrences to study the effects of events in the social world.

Introduction
Topics Appropriate for Experiments
The Classical Experiment
    Independent and Dependent Variables
    Pretesting and Posttesting
    Experimental and Control Groups
    The Double-Blind Experiment
Selecting Subjects
    Probability Sampling
    Randomization
    Matching
    Matching or Randomization?
Variations on Experimental Design
    Preexperimental Research Designs
    Validity Issues in Experimental Research
An Illustration of Experimentation
Alternative Experimental Settings
    Web-Based Experiments
    “Natural” Experiments
Strengths and Weaknesses of the Experimental Method
Ethics and Experiments

CengageNOW for Sociology

Use this online tool to help you make the grade on your next exam. After reading this chapter, go to “Online Study Resources” at the end of the chapter for instructions on how to benefit from CengageNOW.
Introduction

This chapter addresses the controlled experiment: a research method associated more with the natural than the social sciences. We begin Part 3 with this method because the logic and basic techniques of the controlled experiment provide a useful backdrop for understanding other techniques more commonly used in social science, especially for explanatory purposes. We’ll also see in this chapter some of the inventive ways social scientists have conducted experiments.

At base, experiments involve (1) taking action and (2) observing the consequences of that action. Social researchers typically select a group of subjects, do something to them, and observe the effect of what was done.

It’s worth noting at the outset that we often use experiments in nonscientific inquiry. In preparing a stew, for example, we add salt, taste, add more salt, and taste again. In defusing a bomb, we clip the red wire, observe whether the bomb explodes, clip another, and . . .

We also experiment copiously in our attempts to develop generalized understandings about the world we live in. All skills are learned through experimentation: eating, walking, talking, riding a bicycle, swimming, and so forth. Through experimentation, students discover how much studying is required for academic success. Through experimentation, professors learn how much preparation is required for successful lectures. This chapter discusses how social researchers use experiments to develop generalized understandings. We’ll see that, like other methods available to the social researcher, experimenting has its special strengths and weaknesses.

Topics Appropriate for Experiments

Experiments are more appropriate for some topics and research purposes than others. Experiments are especially well suited to research projects involving relatively limited and well-defined concepts and propositions. In terms of the traditional image of science, discussed earlier in this book, the experimental model is especially appropriate for hypothesis testing. Because experiments focus on determining causation, they’re also better suited to explanatory than to descriptive purposes.

Let’s assume, for example, that we want to discover ways of reducing prejudice against African Americans. We hypothesize that learning about the contribution of African Americans to U.S. history will reduce prejudice, and we decide to test this hypothesis experimentally. To begin, we might test a group of experimental subjects to determine their levels of prejudice against African Americans. Next, we might show them a documentary film depicting the many important ways African Americans have contributed to the scientific, literary, political, and social development of the nation. Finally, we would measure our subjects’ levels of prejudice against African Americans to determine whether the film has actually reduced prejudice.

Experimentation has also been successful in the study of small-group interaction. Thus, we might bring together a small group of experimental subjects and assign them a task, such as making recommendations for popularizing car pools. We observe, then, how the group organizes itself and deals with the problem. Over the course of several such experiments, we might systematically vary the nature of the task or the rewards for handling the task successfully. By observing differences in the way groups organize themselves and operate under these varying conditions, we can learn a great deal about the nature of small-group interaction and the factors that influence it. For example, attorneys sometimes present evidence in different ways to different mock juries, to see which method is the most effective.

We typically think of experiments as being conducted in laboratories. Indeed, most of the examples in this chapter involve such a setting. This need not be the case, however. Increasingly, social researchers are using the World Wide Web as a vehicle for conducting experiments. Further,
sometimes we can construct what are called natural experiments: “experiments” that occur in the regular course of social events. The latter portion of this chapter deals with such research.

The Classical Experiment

In both the natural and the social sciences, the most conventional type of experiment involves three major pairs of components: (1) independent and dependent variables, (2) pretesting and posttesting, and (3) experimental and control groups. This section looks at each of these components and the way they’re put together in the execution of the experiment.

Independent and Dependent Variables

Essentially, an experiment examines the effect of an independent variable on a dependent variable. Typically, the independent variable takes the form of an experimental stimulus, which is either present or absent. That is, the stimulus is a dichotomous variable, having two attributes, present or not present. In this typical model, the experimenter compares what happens when the stimulus is present to what happens when it is not.

In the example concerning prejudice against African Americans, prejudice is the dependent variable and exposure to African American history is the independent variable. The researcher’s hypothesis suggests that prejudice depends, in part, on a lack of knowledge of African American history. The purpose of the experiment is to test the validity of this hypothesis by presenting some subjects with an appropriate stimulus, such as a documentary film. In other terms, the independent variable is the cause and the dependent variable is the effect. Thus, we might say that watching the film caused a change in prejudice or that reduced prejudice was an effect of watching the film.

The independent and dependent variables appropriate for experimentation are nearly limitless. Moreover, a given variable might serve as an independent variable in one experiment and as a dependent variable in another. For example, prejudice is the dependent variable in our example, but it might be the independent variable in an experiment examining the effect of prejudice on voting behavior.

To be used in an experiment, both independent and dependent variables must be operationally defined. Such operational definitions might involve a variety of observation methods. Responses to a questionnaire, for example, might be the basis for defining prejudice. Speaking to or ignoring African Americans, or agreeing or disagreeing with them, might be elements in the operational definition of interaction with African Americans in a small-group setting.

Conventionally, in the experimental model, dependent and independent variables must be operationally defined before the experiment begins. However, as you’ll see in connection with survey research and other methods, it’s sometimes appropriate to make a wide variety of observations during data collection and then determine the most useful operational definitions of variables during later analyses. Ultimately, however, experimentation, like other quantitative methods, requires specific and standardized measurements and observations.

Pretesting and Posttesting

In the simplest experimental design, subjects are measured in terms of a dependent variable (pretesting), exposed to a stimulus representing an independent variable, and then remeasured in terms of the dependent variable (posttesting). Any differences between the first and last measurements of the dependent variable are then attributed to the independent variable.

pretesting: The measurement of a dependent variable among subjects.

posttesting: The remeasurement of a dependent variable among subjects after they’ve been exposed to an independent variable.
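The pretest and posttest logic can be sketched in a few lines of Python. This is only an illustration: the prejudice scores are hypothetical, and the function name `pretest_posttest_change` is our own. The sketch subtracts each subject's pretest score from his or her posttest score and averages the differences for the group.

```python
from statistics import mean


def pretest_posttest_change(pretest, posttest):
    """Return each subject's change (posttest minus pretest) and the group average."""
    if len(pretest) != len(posttest):
        raise ValueError("each subject needs both a pretest and a posttest score")
    changes = [post - pre for pre, post in zip(pretest, posttest)]
    return changes, mean(changes)


# Hypothetical prejudice scores (higher = more prejudiced) for four subjects,
# listed in the same subject order for both measurements.
pretest_scores = [14, 9, 17, 12]
posttest_scores = [11, 9, 13, 10]

changes, average_change = pretest_posttest_change(pretest_scores, posttest_scores)
print(changes, average_change)
```

A negative average indicates that measured prejudice dropped between the two measurements. On its own, this figure cannot tell us whether the stimulus or something else produced the drop.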
In the example of prejudice and exposure to African American history, we’d begin by pretesting the extent of prejudice among our experimental subjects. Using a questionnaire asking about attitudes toward African Americans, for example, we could measure both the extent of prejudice exhibited by each individual subject and the average prejudice level of the whole group. After exposing the subjects to the African American history film, we could administer the same questionnaire again. Responses given in this posttest would permit us to measure the later extent of prejudice for each subject and the average prejudice level of the group as a whole. If we discovered a lower level of prejudice during the second administration of the questionnaire, we might conclude that the film had indeed reduced prejudice.

In the experimental examination of attitudes such as prejudice, we face a special practical problem relating to validity. As you may already have imagined, the subjects might respond differently to the questionnaires the second time even if their attitudes remain unchanged. During the first administration of the questionnaire, the subjects might be unaware of its purpose. By the second measurement, they might have figured out that the researchers were interested in measuring their prejudice. Because no one wishes to seem prejudiced, the subjects might “clean up” their answers the second time around. Thus, the film would seem to have reduced prejudice although, in fact, it had not.

This is an example of a more general problem that plagues many forms of social research: The very act of studying something may change it. The techniques for dealing with this problem in the context of experimentation will be discussed in various places throughout the chapter. The first technique involves the use of control groups.

Experimental and Control Groups

Laboratory experiments seldom, if ever, involve only the observation of an experimental group to which a stimulus has been administered. In addition, the researchers also observe a control group, which does not receive the experimental stimulus.

In the example of prejudice and African American history, we might examine two groups of subjects. To begin, we give each group a questionnaire designed to measure their prejudice against African Americans. Then we show the film to only the experimental group. Finally, we administer a posttest of prejudice to both groups. Figure 8-1 illustrates this basic experimental design.

Using a control group allows the researcher to detect any effects of the experiment itself. If the posttest shows that the overall level of prejudice exhibited by the control group has dropped as much as that of the experimental group, then the apparent reduction in prejudice must be a function of the experiment or of some external factor rather than a function of the film. If, on the other hand, prejudice is reduced only in the experimental group, this reduction would seem to be a consequence of exposure to the film, because that’s the only difference between the two groups. Alternatively, if prejudice is reduced in both groups but to a greater degree in the experimental group than in the control group, that, too, would be grounds for assuming that the film reduced prejudice.

The need for control groups in social research became clear in connection with a series of studies of employee satisfaction conducted by F. J. Roethlisberger and W. J. Dickson (1939) in the late 1920s and early 1930s. These two researchers were interested in discovering what changes in working conditions would improve employee satisfaction and productivity. To pursue this objective, they studied working conditions in the telephone “bank wiring room” of the Western Electric Works in the Chicago suburb of Hawthorne, Illinois.

experimental group: In experimentation, a group of subjects to whom an experimental stimulus is administered.

control group: In experimentation, a group of subjects to whom no experimental stimulus is administered and who should resemble the experimental group in all other respects. The comparison of the control group and the experimental group at the end of the experiment points to the effect of the experimental stimulus.
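The comparison of the experimental and control groups can be sketched as a short Python example. It is a sketch under stated assumptions rather than a prescribed analysis: all scores are hypothetical, the function names are our own, and no test of statistical significance is attempted. The idea is to subtract the control group's average change from the experimental group's average change, so that whatever affected both groups equally cancels out.

```python
from statistics import mean


def mean_change(pretest, posttest):
    """Average posttest score minus average pretest score for one group."""
    return mean(posttest) - mean(pretest)


def estimated_stimulus_effect(experimental, control):
    """Experimental group's change minus the control group's change."""
    return mean_change(**experimental) - mean_change(**control)


# Hypothetical prejudice scores (higher = more prejudiced).
experimental_group = {"pretest": [14, 10, 16, 12], "posttest": [10, 8, 12, 10]}
control_group = {"pretest": [13, 11, 15, 13], "posttest": [12, 10, 14, 12]}

print(mean_change(**experimental_group))
print(mean_change(**control_group))
print(estimated_stimulus_effect(experimental_group, control_group))
```

In this made-up case both groups show less prejudice at the posttest, but the experimental group's drop is larger. Only the difference between the two changes would be credited to the stimulus.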
FIGURE 8-1 Diagram of Basic Experimental Design. The fundamental purpose of an experiment is to isolate the possible effect of an independent variable (called the stimulus in experiments) on a dependent variable. Members of the experimental group(s) are exposed to the stimulus, while those in the control group(s) are not.

To the researchers’ great satisfaction, they discovered that improving the working conditions increased satisfaction and productivity consistently. As the workroom was brightened up through better lighting, for example, productivity went up. When lighting was further improved, productivity went up again.

To further substantiate their scientific conclusion, the researchers then dimmed the lights. Whoops: productivity improved again!

At this point it became evident that the wiring-room workers were responding more to the attention given them by the researchers than to improved working conditions. As a result of this phenomenon, often called the Hawthorne effect, social researchers have become more sensitive to and cautious about the possible effects of experiments themselves. In the wiring-room study, the use of a proper control group, one that was studied intensively without any other changes in the working conditions, would have pointed to the presence of this effect.

The need for control groups in experimentation has been nowhere more evident than in medical research. Time and again, patients who participate in medical experiments have appeared to improve, but it has been unclear how much of the improvement has come from the experimental treatment and how much from the experiment. In testing the effects of new drugs, then, medical researchers frequently administer a placebo (a “drug” with no relevant effect, such as sugar pills) to a control group. Thus, the control-group patients believe that they, like the experimental group, are receiving an experimental drug. Often, they improve. If the new drug is effective, however, those receiving the actual drug will improve more than those receiving the placebo.

In social science experiments, control groups guard against not only the effects of the experiments themselves but also the effects of any events outside the laboratory during the experiments. In the example of the study of prejudice, suppose that a popular African American leader is assassinated in the middle of, say, a weeklong experiment. Such an event may very well horrify the experimental subjects, requiring them to examine their own attitudes toward African Americans, with the result of reduced prejudice. Because such an effect should happen about equally for members of the control and experimental groups, a greater reduction of prejudice among the experimental group would, again, point to the impact of the experimental stimulus: the documentary film.

Sometimes an experimental design requires more than one experimental or control group. In the case of the documentary film, for example, we might also want to examine the impact of reading a book about African American history. In that case, we might have one group see the film and read the book, another group only see the movie, still another group only read the book, and the control group do neither. With this kind of design, we could determine the impact of each stimulus separately, as well as their combined effect.

The Double-Blind Experiment

Like patients who improve when they merely think they’re receiving a new drug, sometimes experimenters tend to prejudge results. In medical
research, the experimenters may be more likely to “observe” improvements among patients receiving the experimental drug than among those receiving the placebo. (This would be most likely, perhaps, for the researcher who developed the drug.) A double-blind experiment eliminates this possibility, because in this design neither the subjects nor the experimenters know which is the experimental group and which is the control. In the medical case, those researchers who were responsible for administering the drug and for noting improvements would not be told which subjects were receiving the drug and which the placebo. Conversely, the researcher who knew which subjects were in which group would not administer the experiment.

In social science experiments, as in medical experiments, the danger of experimenter bias is further reduced to the extent that the operational definitions of the dependent variables are clear and precise. Thus, medical researchers would be less likely to unconsciously bias their reading of a patient’s temperature than they would be to bias their assessment of how lethargic the patient was. For the same reason, the small-group researcher would be less likely to misperceive which subject spoke, or to whom he or she spoke, than whether the subject’s comments sounded cooperative or competitive, a more subjective judgment that’s difficult to define in precise behavioral terms.

As I’ve indicated several times, seldom can we devise operational definitions and measurements that are wholly precise and unambiguous. This is another reason why it can be appropriate to employ a double-blind design in social research experiments.

Selecting Subjects

In Chapter 7 we discussed the logic of sampling, which involves selecting a sample that is representative of some population. Similar considerations apply to experiments. Because most social researchers work in colleges and universities, it seems likely that research laboratory experiments would be conducted with college undergraduates as subjects. Typically, the experimenter asks students enrolled in his or her classes to participate in experiments or advertises for subjects in a college newspaper. Subjects may or may not be paid for participating in such experiments (recall also from Chapter 3 the ethical issues involved in asking students to participate in such studies).

In relation to the norm of generalizability in science, this tendency clearly represents a potential defect in social research. Simply put, college undergraduates are not typical of the public at large. There is a danger, therefore, that we may learn much about the attitudes and actions of college undergraduates but not about social attitudes and actions in general.

However, this potential defect is less significant in explanatory research than in descriptive research. True, having noted the level of prejudice among a group of college undergraduates in our pretesting, we would have little confidence that the same level existed among the public at large. On the other hand, if we found that a documentary film reduced whatever level of prejudice existed among those undergraduates, we would have more confidence, without being certain, that it would have a comparable effect in the community at large. Social processes and patterns of causal relationships appear to be more generalizable and more stable than specific characteristics such as an individual’s level of prejudice.

Aside from the question of generalizability, the cardinal rule of subject selection in experimentation concerns the comparability of experimental and control groups. Ideally, the control group represents what the experimental group would be like if it had not been exposed to the experimental stimulus. The logic of experiments requires, therefore, that experimental and control groups be as similar as possible. There are several ways to accomplish this.

double-blind experiment An experimental design in which neither the subjects nor the experimenters know which is the experimental group and which is the control.
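As a rough illustration of the bookkeeping behind a double-blind design, consider the following Python sketch. It is not taken from any actual study: the function name blind_assignment, the neutral codes A and B, and the pool of 40 numbered subjects are all assumptions made for the example. One researcher, who takes no part in running the sessions, would hold the key; the researchers who administer the experiment and record outcomes would see only the coded schedule.

```python
import random


def blind_assignment(subject_ids, seed=None):
    """Split subjects at random into two groups and hide the split behind codes.

    Returns (key, schedule):
      key      maps subject id -> "experimental" or "control"
               (kept by a researcher who does not run the sessions)
      schedule maps subject id -> neutral code "A" or "B"
               (all that the administering researchers see)
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    key = {sid: "experimental" for sid in shuffled[:half]}
    key.update({sid: "control" for sid in shuffled[half:]})

    # Decide at random which neutral code stands for which condition,
    # so the code letter itself gives nothing away.
    codes = ["A", "B"]
    rng.shuffle(codes)
    code_for = {"experimental": codes[0], "control": codes[1]}

    schedule = {sid: code_for[group] for sid, group in key.items()}
    return key, schedule


# Hypothetical pool of 40 subjects numbered 1 through 40.
key, schedule = blind_assignment(range(1, 41), seed=0)
```

Only after the outcome measurements are recorded would the key be combined with the schedule to reveal which code was the experimental condition.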
Probability Sampling

The discussions of the logic and techniques of probability sampling in Chapter 7 provide one method for selecting two groups of people that are similar to each other. Beginning with a sampling frame composed of all the people in the population under study, the researcher might select two probability samples. If these samples each resemble the total population from which they’re selected, they’ll also resemble each other.

Recall also, however, that the degree of resemblance (representativeness) achieved by probability sampling is largely a function of the sample size. As a general guideline, probability samples of fewer than 100 are not likely to be terribly representative, and social science experiments seldom involve that many subjects in either experimental or control groups. As a result, then, probability sampling is seldom used in experiments to select subjects from a larger population. Researchers do, however, use the logic of random selection when they assign subjects to groups.

Randomization

Having recruited, by whatever means, a total group of subjects, the experimenter may randomly assign those subjects to either the experimental or the control group. The researcher might accomplish such randomization by numbering all of the subjects serially and selecting numbers by means of a random number table. Alternatively, the experimenter might assign the odd-numbered subjects to the experimental group and the even-numbered subjects to the control group.

Let’s return again to the basic concept of probability sampling. If we recruit 40 subjects altogether, in response to a newspaper advertisement, for example, there’s no reason to believe that the 40 subjects represent the entire population from which they’ve been drawn. Nor can we assume that the 20 subjects randomly assigned to the experimental group represent that larger population. We can have greater confidence, however, that the 20 subjects randomly assigned to the experimental group will be reasonably similar to the 20 assigned to the control group.

Following the logic of our earlier discussions of sampling, we can see our 40 subjects as a population from which we select two probability samples, each consisting of half the population. Because each sample reflects the characteristics of the total population, the two samples will mirror each other.

As we saw in Chapter 7, our assumption of similarity in the two groups depends in part on the number of subjects involved. In the extreme case, if we recruited only two subjects and assigned, by the flip of a coin, one as the experimental subject and one as the control, there would be no reason to assume that the two subjects are similar to each other. With larger numbers of subjects, however, randomization makes good sense.

randomization A technique for assigning experimental subjects to experimental and control groups randomly.

Matching

Another way to achieve comparability between the experimental and control groups is through matching. This process is similar to the quota-sampling methods discussed in Chapter 7. If 12 of our subjects are young white men, we might assign 6 of them at random to the experimental group and the other 6 to the control group. If 14 are middle-aged African American women, we might assign 7 to each group. We repeat this process for every relevant grouping of subjects.

matching In connection with experiments, the procedure whereby pairs of subjects are matched on the basis of their similarities on one or more variables, and one member of the pair is assigned to the experimental group and the other to the control group.

The overall matching process could be most efficiently achieved through the creation of a quota matrix constructed of all the most relevant characteristics. Figure 8-2 provides a simplified illustration of such a matrix. In this example, the experimenter
has decided that the relevant characteristics are race, age, and gender. Ideally, the quota matrix is constructed to result in an even number of subjects in each cell of the matrix. Then, half the subjects in each cell go into the experimental group and half into the control group.

FIGURE 8-2 Quota Matrix Illustration. Sometimes the experimental and control groups are created by finding pairs of matching subjects and assigning one to the experimental group and the other to the control group.

Alternatively, we might recruit more subjects than our experimental design requires. We might then examine many characteristics of the large initial group of subjects. Whenever we discover a pair of quite similar subjects, we might assign one at random to the experimental group and the other to the control group. Potential subjects who are unlike anyone else in the initial group might be left out of the experiment altogether.

Whatever method we employ, the desired result is the same. The overall average description of the experimental group should be the same as that of the control group. For example, on average both groups should have about the same ages, the same gender composition, the same racial composition, and so forth. This test of comparability should be used whether the two groups are created through probability sampling or through randomization.

Thus far I’ve referred to the “relevant” variables without saying clearly what those variables are. Of course, these variables cannot be specified in any definite way, any more than I could specify in Chapter 7 which variables should be used in stratified sampling. Which variables are relevant ultimately depends on the nature and purpose of an experiment. As a general rule, however, the control and experimental groups should be comparable in terms of those variables that are most likely to be related to the dependent variable under study. In a study of prejudice, for example, the two groups should be alike in terms of education, ethnicity, and age, among other characteristics. In some cases, moreover, we may delay assigning subjects to experimental and control groups until we have initially measured the dependent variable. Thus, for example, we might administer a questionnaire measuring subjects’ prejudice and then match the experimental and control groups on this variable to assure ourselves that the two groups exhibit the same overall level of prejudice.

Matching or Randomization?

When assigning subjects to the experimental and control groups, you should be aware of two arguments in favor of randomization over matching.
First, you may not be in a position to know in advance which variables will be relevant for the matching process. Second, most of the statistics used to analyze the results of experiments assume randomization. Failure to design your experiment that way, then, makes your later use of those statistics less meaningful.

On the other hand, randomization only makes sense if you have a fairly large pool of subjects, so that the laws of probability sampling apply. With only a few subjects, matching would be a better procedure.

Sometimes researchers can combine matching and randomization. When conducting an experiment on the educational enrichment of young adolescents, for example, J. Milton Yinger and his colleagues (1977) needed to assign a large number of students, aged 13 and 14, to several different experimental and control groups in a way that ensured the comparability of the students composing each of the groups. They achieved this goal by the following method.

Beginning with a pool of subjects, the researchers first created strata of students nearly identical to one another in terms of some 15 variables. From each of the strata, students were randomly assigned to the different experimental and control groups. In this fashion, the researchers actually improved on conventional randomization. Essentially, they had used a stratified-sampling procedure (Chapter 7), except that they had employed far more stratification variables than are typically used in, say, survey sampling.

Thus far I’ve described the classical experiment, the experimental design that best represents the logic of causal analysis in the laboratory. In practice, however, social researchers use a great variety of experimental designs. Let’s look at some now.

Variations on Experimental Design

Donald Campbell and Julian Stanley (1963), in a classic book on research design, describe some 16 different experimental and quasi-experimental designs. This section describes some of these variations to better show the potential for experimentation in social research.

Preexperimental Research Designs

To begin, Campbell and Stanley discuss three “preexperimental” designs, not to recommend them but because they’re frequently used in less-than-professional research. These designs are called preexperimental to indicate that they do not meet the scientific standards of experimental designs, and sometimes they may be used because the conditions for full-fledged experiments are impossible to meet. In the first such design, the one-shot case study, the researcher measures a single group of subjects on a dependent variable following the administration of some experimental stimulus. Suppose, for example, that we show the African American history film, mentioned earlier, to a group of people and then administer a questionnaire that seems to measure prejudice against African Americans. Suppose further that the answers given to the questionnaire seem to represent a low level of prejudice. We might be tempted to conclude that the film reduced prejudice. Lacking a pretest, however, we can’t be sure. Perhaps the questionnaire doesn’t really represent a sensitive measure of prejudice, or perhaps the group we’re studying was low in prejudice to begin with. In either case, the film might have made no difference, though our experimental results might have misled us into thinking it did.

The second preexperimental design discussed by Campbell and Stanley adds a pretest for the experimental group but lacks a control group. This design, which the authors call the one-group pretest-posttest design, suffers from the possibility that some factor other than the independent variable might cause a change between the pretest and posttest results, such as the assassination of a respected African American leader. Thus, although we can see that prejudice has been reduced, we can’t be sure that the film is what caused that reduction.

To round out the possibilities for preexperimental designs, Campbell and Stanley point out that
some research is based on experimental and control groups but has no pretests. They call this design the static-group comparison. For example, we might show the African American history film to one group and not to another and then measure prejudice in both groups. If the experimental group had less prejudice at the conclusion of the experiment, we might assume the film was responsible. But unless we had randomized our subjects, we would have no way of knowing that the two groups had the same degree of prejudice initially; perhaps the experimental group started out with less.

FIGURE 8-3 Three Preexperimental Research Designs. These preexperimental designs anticipate the logic of true experiments but leave themselves open to errors of interpretation. Can you see the errors that might be made in each of these designs? The various risks are solved by the addition of control groups, pretesting, and posttesting.

Figure 8-3 graphically illustrates these three preexperimental research designs by using a different research question: Does exercise cause weight reduction? To make the several designs
clearer, the figure shows individuals rather than groups, but the same logic pertains to group comparisons. Let’s review the three preexperimental designs in this new example.

The one-shot case study represents a common form of logical reasoning in everyday life. Asked whether exercise causes weight reduction, we may bring to mind an example that would seem to support the proposition: someone who exercises and is thin. There are problems with this reasoning, however. Perhaps the person was thin long before beginning to exercise. Or perhaps he became thin for some other reason, like eating less or getting sick. The observations shown in the diagram do not guard against these other possibilities. Moreover, the observation that the man in the diagram is in trim shape depends on our intuitive idea of what constitutes trim and overweight body shapes. All told, this is very weak evidence for testing the relationship between exercise and weight loss.

The one-group pretest-posttest design offers somewhat better evidence that exercise produces weight loss. Specifically, we’ve ruled out the possibility that the man was thin before beginning to exercise. However, we still have no assurance that his exercising is what caused him to lose weight.

Finally, the static-group comparison eliminates the problem of our questionable definition of what constitutes trim or overweight body shapes. In this case, we can compare the shapes of the man who exercises and the one who does not. This design, however, reopens the possibility that the man who exercises was thin to begin with.

Validity Issues in Experimental Research

At this point I want to present in a more systematic way the factors that affect the validity of experimental research. First we’ll look at what Campbell and Stanley call the sources of internal invalidity, reviewed and expanded in a follow-up book by Thomas Cook and Donald Campbell (1979). Then we’ll consider the problem of generalizing experimental results to the “real” world, referred to as external invalidity. Having examined these, we’ll be in a position to appreciate the advantages of some of the more sophisticated experimental and quasi-experimental designs social science researchers sometimes use.

Sources of Internal Invalidity

The problem of internal invalidity refers to the possibility that the conclusions drawn from experimental results may not accurately reflect what has gone on in the experiment itself. The threat of internal invalidity is present whenever anything other than the experimental stimulus can affect the dependent variable.

internal invalidity Refers to the possibility that the conclusions drawn from experimental results may not accurately reflect what went on in the experiment itself.

Campbell and Stanley (1963: 5–6) and Cook and Campbell (1979: 51–55) point to several sources of internal invalidity. Here are 12:

1. History. During the course of the experiment, historical events may occur that will confound the experimental results. The assassination of an African American leader during the course of an experiment on reducing anti–African American prejudice is one example; the arrest of an African American leader for some heinous crime, which might increase prejudice, is another.

2. Maturation. People are continually growing and changing, and such changes can affect the results of the experiment. In a long-term experiment, the fact that the subjects grow older (and wiser?) may have an effect. In shorter experiments, they may grow tired, sleepy, bored, or hungry, or change in other ways that affect their behavior in the experiment.

3. Testing. As we’ve seen, often the process of testing and retesting influences people’s behavior, thereby confounding the experimental results. Suppose we administer a questionnaire to a group as a way of measuring their prejudice. Then we administer an experimental stimulus and remeasure their prejudice. By the time we conduct the
posttest, the subjects will probably have become more sensitive to the issue of prejudice and will be more thoughtful in their answers. In fact, they may have figured out that we’re trying to find out how prejudiced they are, and, because few people like to appear prejudiced, they may give answers that they think we want or that will make them look good.

4. Instrumentation. The process of measurement in pretesting and posttesting brings in some of the issues of conceptualization and operationalization discussed earlier in the book. If we use different measures of the dependent variable in the pretest and posttest (say, different questionnaires about prejudice), how can we be sure they’re comparable to each other? Perhaps prejudice will seem to decrease simply because the pretest measure was more sensitive than the posttest measure. Or if the measurements are being made by the experimenters, their standards or their abilities may change over the course of the experiment.

5. Statistical regression. Sometimes it’s appropriate to conduct experiments on subjects who start out with extreme scores on the dependent variable. If you were testing a new method for teaching math to hard-core failures in math, you’d want to conduct your experiment on people who previously had done extremely poorly in math. But consider for a minute what’s likely to happen to the math achievement of such people over time without any experimental interference. They’re starting out so low that they can only stay at the bottom or improve: They can’t get worse. Even without any experimental stimulus, then, the group as a whole is likely to show some improvement over time. Referring to a regression to the mean, statisticians often point out that extremely tall people as a group are likely to have children shorter than themselves, and extremely short people as a group are likely to have children taller than themselves. There is a danger, then, that changes occurring by virtue of subjects’ starting out in extreme positions will be attributed erroneously to the effects of the experimental stimulus.

6. Selection biases. We discussed selection bias earlier when we examined different ways of selecting subjects for experiments and assigning them to experimental and control groups. Comparisons don’t have any meaning unless the groups are comparable at the start of an experiment.

7. Experimental mortality. Although some social experiments could, I suppose, kill subjects, experimental mortality refers to a more general and less extreme problem. Often, experimental subjects will drop out of the experiment before it’s completed, and this can affect statistical comparisons and conclusions. In the classical experiment involving an experimental and a control group, each with a pretest and posttest, suppose that the bigots in the experimental group are so offended by the African American history film that they tell the experimenter to forget it, and they leave. Those subjects sticking around for the posttest will have been less prejudiced to start with, so the group results will reflect a substantial “decrease” in prejudice.

8. Causal time order. Though rare in social research, ambiguity about the time order of the experimental stimulus and the dependent variable can arise. Whenever this occurs, the research conclusion that the stimulus caused the dependent variable can be challenged with the explanation that the “dependent” variable actually caused changes in the stimulus.

9. Diffusion or imitation of treatments. When experimental and control-group subjects can communicate with each other, experimental subjects may pass on some elements of the experimental stimulus to the control group. For example, suppose there’s a lapse of time between our showing of the African American history film and the posttest administration of the questionnaire. Members of the experimental group might tell control-group subjects about the film. In that case, the control group becomes affected by the stimulus and is not a real control. Sometimes we speak of the control group as having been “contaminated.”

10. Compensation. As you’ll see in Chapter 12, in experiments in real-life situations, such as a special educational program, subjects in the control group are often deprived of something considered to be of value. In such cases, there may be pressures to offer some form of compensation. For example, hospital staff might feel sorry for
control-group patients and give them extra “tender loving care.” In such a situation, the control group is no longer a genuine control group.

11. Compensatory rivalry. In real-life experiments, the subjects deprived of the experimental stimulus may try to compensate for the missing stimulus by working harder. Suppose an experimental math program is the experimental stimulus; the control group may work harder than before on their math in an attempt to beat the “special” experimental subjects.

12. Demoralization. On the other hand, feelings of deprivation within the control group may result in their giving up. In educational experiments, demoralized control-group subjects may stop studying, act up, or get angry.

These, then, are some of the sources of internal invalidity in experiments. Aware of these, experimenters have devised designs aimed at handling them. The classical experiment, if coupled with proper subject selection and assignment, addresses each of these problems. Let’s look again at that study design, presented graphically in Figure 8-4.

If we use the experimental design shown in Figure 8-4, we should expect two findings. For the experimental group, the level of prejudice measured in their posttest should be less than was found in their pretest. In addition, when the two posttests are compared, less prejudice should be found in the experimental group than in the control group.

This design also guards against the problem of history in that anything occurring outside the experiment that might affect the experimental group should also affect the control group. Consequently, there should still be a difference in the two posttest results. The same comparison guards against problems of maturation as long as the subjects have been randomly assigned to the two groups. Testing and instrumentation can’t be problems, because both the experimental and control groups are subject to the same tests and experimenter effects. If the subjects have been assigned to the two groups randomly, statistical regression should affect both equally, even if people with extreme scores on prejudice are being studied. Selection bias is ruled out by the random assignment of subjects. Experimental mortality is more complicated to handle, but the data provided in this study design offer several ways to deal with it. Slight modifications to the design, for example administering a placebo (such as a film having nothing to do with African Americans) to the control group, can make the problem even easier to manage.

The remaining five problems of internal invalidity are avoided through the careful administration of a controlled experimental design. The experimental design we’ve been discussing facilitates the clear specification of independent and dependent variables. Experimental and control subjects can be kept separate, reducing the possibility of diffusion or imitation of treatments. Administrative controls can avoid compensations given to the control group, and compensatory rivalry can be watched for and taken into account in evaluating the results of the experiment, as can the problem of demoralization.

Sources of External Invalidity

Internal invalidity accounts for only some of the complications faced by experimenters. In addition, there is the problem that Campbell and Stanley call external invalidity, which relates to the generalizability of experimental findings to the “real” world. Even if the results of an experiment provide an accurate gauge of what happened during that experiment, do they really tell us anything about life in the wilds of society?

external invalidity Refers to the possibility that conclusions drawn from experimental results may not be generalizable to the “real” world.

Campbell and Stanley describe four forms of this problem; I’ll present one as an illustration. The generalizability of experimental findings is jeopardized, as the authors point out, if there’s an interaction between the testing situation and the experimental stimulus (1963: 18). Here’s an example of what they mean.

Staying with the study of prejudice and the African American history film, let’s suppose that our
experimental group, in the classical experiment, has less prejudice in its posttest than in its pretest and that its posttest shows less prejudice than that of the control group. We can be confident that the film actually reduced prejudice among our experimental subjects. But would it have the same effect if the film were shown in theaters or on television? We can’t be sure, because the film might be effective only when people have been sensitized to the issue of prejudice, as the subjects may have been in taking the pretest. This is an example of interaction between the testing and the stimulus. The classical experimental design cannot control for that possibility. Fortunately, experimenters have devised other designs that can.

FIGURE 8-4 The Classical Experiment: Using an African American History Film to Reduce Prejudice. This diagram illustrates the basic structure of the classical experiment as a vehicle for testing the impact of a film on prejudice. Notice how the control group, the pretesting, and the posttesting function.

The Solomon four-group design (D. Campbell and Stanley 1963: 24–25) addresses the problem of testing interaction with the stimulus. As the name suggests, it involves four groups of subjects, assigned randomly from a pool. Figure 8-5 presents this design graphically.

Notice that Groups 1 and 2 in Figure 8-5 compose the classical experiment, with Group 2 being the control group. Group 3 is administered the experimental stimulus without a pretest, and Group 4 is only posttested. This experimental design permits four meaningful comparisons, which are described in the figure. If the African American history film really reduces prejudice (unaccounted for by the problem of internal invalidity and unaccounted for by an interaction between the testing and the stimulus), we should expect four findings:

1. In Group 1, posttest prejudice should be less than pretest prejudice.

2. In Group 2, prejudice should be the same in the pretest and the posttest.

3. The Group 1 posttest should show less prejudice than the Group 2 posttest does.

4. The Group 3 posttest should show less prejudice than the Group 4 posttest does.

Notice that finding (4) rules out any interaction between the testing and the stimulus. And remember that these comparisons are meaningful
only if subjects have been assigned randomly to the different groups, thereby providing groups of equal prejudice initially, even though their preexperimental prejudice is measured only in Groups 1 and 2.

FIGURE 8-5 The Solomon Four-Group Design. The classical experiment runs the risk that pretesting will have an effect on subjects, so the Solomon four-group design adds experimental and control groups that skip the pretest.

There is a side benefit to this research design, as the authors point out. Not only does the Solomon four-group design rule out interactions between testing and the stimulus, it also provides data for comparisons that will reveal how much of this interaction has occurred in a classical experiment. This knowledge allows a researcher to review and evaluate the value of any prior research that used the simpler design.

The last experimental design I’ll mention here is what Campbell and Stanley (1963: 25–26) call the posttest-only control group design; it consists of the second half (Groups 3 and 4) of the Solomon design. As the authors argue persuasively, with proper randomization, only Groups 3 and 4 are needed for a true experiment that controls for the problems of internal invalidity as well as for the interaction between testing and stimulus. With randomized assignment to experimental and control groups (which distinguishes this design from the static-group comparison discussed earlier), the subjects will be initially comparable on the dependent variable (comparable enough to satisfy the conventional statistical tests used to evaluate the results), so it’s not necessary to measure them. Indeed, Campbell and Stanley suggest that the only justification for pretesting in this situation is tradition. Experimenters have simply grown accustomed to pretesting and feel more secure with research designs that include it. Be clear, however, that this point applies only to experiments in which subjects have been assigned to experimental and control groups randomly, because that’s what justifies the assumption that the groups are equivalent without having been measured to find out.

This discussion has introduced the intricacies of experimental design, its problems, and some solutions. There are, of course, a great many other experimental designs in use. Some involve more than one stimulus and combinations of stimuli. Others involve several tests of the dependent variable over time and the administration of the stimulus at different times for different groups. If you’re interested in pursuing this topic, you might look at the Campbell and Stanley book.

An Illustration of Experimentation

Experiments have been used to study a wide variety of topics in the social sciences. Some experiments have been conducted within laboratory situations; others occur out in the “real world” and are referred to as field experiments. The following discussion provides a glimpse of both. We’ll begin with an example of a field experiment.
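Before turning to real studies, it may help to see the logic of the Solomon four-group comparisons worked through with made-up numbers. The following is only a minimal sketch in Python, not anything taken from Campbell and Stanley: the function names (run_solomon, findings), the prejudice scale, the size of the film's effect, and the tolerance used to call two means "the same" are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def run_solomon(n_per_group=2000, film_effect=-5.0, interaction=0.0, seed=0):
    """Simulate the six measurements of a Solomon four-group design.

    All numbers are hypothetical: prejudice is an invented score,
    film_effect is the change the film causes on its own, and interaction
    is an extra change that appears only when a pretest precedes the film.
    """
    rng = random.Random(seed)
    # One pool of subjects, each with a baseline prejudice score.
    pool = [rng.gauss(50, 10) for _ in range(4 * n_per_group)]
    rng.shuffle(pool)  # random assignment from the single pool
    g1, g2, g3, g4 = (pool[i * n_per_group:(i + 1) * n_per_group] for i in range(4))

    def posttest(baseline, stimulus, pretested):
        score = baseline + rng.gauss(0, 2)  # small measurement noise
        if stimulus:
            score += film_effect
            if pretested:
                score += interaction  # testing-stimulus interaction
        return score

    return {
        "g1_pre": mean(g1),  # Group 1: pretest, film, posttest
        "g1_post": mean(posttest(b, True, True) for b in g1),
        "g2_pre": mean(g2),  # Group 2: pretest, no film, posttest
        "g2_post": mean(posttest(b, False, True) for b in g2),
        "g3_post": mean(posttest(b, True, False) for b in g3),   # film, posttest only
        "g4_post": mean(posttest(b, False, False) for b in g4),  # posttest only
    }


def findings(m, tol=2.0):
    """Check the four expected findings; tol is an arbitrary cutoff, not a statistical test."""
    return {
        1: m["g1_post"] < m["g1_pre"] - tol,
        2: abs(m["g2_post"] - m["g2_pre"]) <= tol,
        3: m["g1_post"] < m["g2_post"] - tol,
        4: m["g3_post"] < m["g4_post"] - tol,
    }


if __name__ == "__main__":
    print(findings(run_solomon()))
    print(findings(run_solomon(film_effect=0.0, interaction=-5.0)))
```

With the default settings, the film works on its own and all four findings should hold. In the second call, the film is assumed to work only on subjects who were pretested, so findings 1 and 3 should still hold while finding 4 fails. That is the situation the classical experiment (Groups 1 and 2 alone) cannot distinguish from a genuine effect. A real analysis would replace the crude tolerance with the statistical tests discussed in Chapter 16.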
In George Bernard Shaw’s well-loved play Pygmalion (the basis of the long-running Broadway musical My Fair Lady), Eliza Doolittle speaks of the powers others have in determining our social identity. Here’s how she distinguishes the way she’s treated by her tutor, Professor Higgins, and by Higgins’s friend, Colonel Pickering:

You see, really and truly, apart from the things anyone can pick up (the dressing and the proper way of speaking, and so on), the difference between a lady and a flower girl is not how she behaves, but how she’s treated. I shall always be a flower girl to Professor Higgins, because he always treats me as a flower girl, and always will, but I know I can be a lady to you, because you always treat me as a lady, and always will.
(Act V)

The sentiment Eliza expresses here is basic social science, addressed more formally by sociologists such as Charles Horton Cooley (the “looking-glass self”) and George Herbert Mead (“the generalized other”). The basic point is that who we think we are (our self-concept) and how we behave are largely a function of how others see and treat us. Related to this, the way others perceive us is largely conditioned by expectations they have in advance. If they’ve been told we’re stupid, for example, they’re likely to see us that way, and we may come to see ourselves that way and actually act stupidly. “Labeling theory” addresses the phenomenon of people acting in accord with the ways that others perceive and label them. These theories have served as the premise for numerous movies, such as the 1983 film Trading Places, in which Eddie Murphy and Dan Aykroyd play a derelict converted into a stockbroker and vice versa.

The tendency to see in others what we’ve been led to expect takes its name from Shaw’s play. Called the Pygmalion effect, it’s nicely suited to controlled experiments. In one of the best-known experimental investigations of the Pygmalion effect, Robert Rosenthal and Lenore Jacobson (1968) administered what they called the “Harvard Test of Inflected Acquisition” to students in a West Coast school. Subsequently, they met with the students’ teachers to present the results of the test. In particular, Rosenthal and Jacobson identified certain students as very likely to exhibit a sudden spurt in academic abilities during the coming year, based on the results of the test.

When IQ test scores were compared later, the researchers’ predictions proved accurate. The students identified as “spurters” far exceeded their classmates during the following year, suggesting that the predictive test was a powerful one. In fact, the test was a hoax! The researchers had made their predictions randomly among both good and poor students. What they told the teachers did not really reflect students’ test scores at all. The progress made by the “spurters” was simply a result of the teachers expecting the improvement and paying more attention to those students, encouraging them, and rewarding them for achievements. (Notice the similarity between this situation and the Hawthorne effect discussed earlier in this chapter.)

The Rosenthal-Jacobson study attracted a great deal of popular as well as scientific attention. Subsequent experiments have focused on specific aspects of what has become known as the attribution process, or the expectations communication model. This research, largely conducted by psychologists, parallels research primarily by sociologists, which takes a slightly different focus and is often gathered under the label expectations-states theory. Psychological studies focus on situations in which the expectations of a dominant individual affect the performance of subordinates, as in the case of a teacher and students, or a boss and employees. The sociological research has tended to focus more on the role of expectations among equals in small, task-oriented groups. In a jury, for example, how do jurors initially evaluate each other, and how do those initial assessments affect their later interactions? (You can learn more about this phenomenon, including attempts to find practical applications, by searching the web for “Pygmalion Effect.”)

Here’s an example of an experiment conducted to examine the way our perceptions of our abilities and the abilities of others affect our willingness to accept the other person’s ideas. Martha Foschi, G. Keith Warriner, and Stephen Hart (1985) were
particularly interested in the role “standards” play in that respect:

In general terms, by “standards” we mean how well or how poorly a person has to perform in order for an ability to be attributed or denied him/her. In our view, standards are a key variable affecting how evaluations are processed and what expectations result. For example, depending on the standards used, the same level of success may be interpreted as a major accomplishment or dismissed as unimportant.
(1985: 108–9)

To begin examining the role of standards, the researchers designed an experiment involving four experimental groups and a control. Subjects were told that the experiment involved something called “pattern recognition ability,” defined as an innate ability some people had and others didn’t. The researchers said subjects would be working in pairs on pattern recognition problems.

In fact, of course, there’s no such thing as pattern recognition ability. The object of the experiment was to determine how information about this supposed ability affected subjects’ subsequent behavior.

The first stage of the experiment was to “test” each subject’s pattern recognition abilities. If you had been a subject in the experiment, you would have been shown a geometric pattern for 8 seconds, followed by two more patterns, each of which was similar to but not the same as the first one. Your task would be to choose which of the subsequent set had a pattern closest to the first one you saw. You would be asked to do this 20 times, and a computer would print out your “score.” Half the subjects would be told that they had gotten 14 correct; the other half would be told that they had gotten only 6 correct, regardless of which patterns they matched with which. Depending on the luck of the draw, you would think you had done either quite well or quite badly. Notice, however, that you wouldn’t really have any standard for judging your performance: maybe getting 4 correct would be considered a great performance.

At the same time you were given your score, however, you would also be given your “partner’s score,” although both the “partners” and their “scores” would also be computerized fictions. (Subjects were told they would be communicating with their partners via computer terminals but would not be allowed to see each other.) If you were assigned a score of 14, you would be told your partner had a score of 6; if you were assigned 6, you would be told your partner had 14.

This procedure meant that you would enter the teamwork phase of the experiment believing either (1) you had done better than your partner or (2) you had done worse than your partner. This information constituted part of the “standard” you would be operating under in the experiment. In addition, half of each group was told that a score between 12 and 20 meant the subject definitely had pattern recognition ability; the other subjects were told that a score of 14 wasn’t really high enough to prove anything definite. Thus, you would emerge from this with one of the following beliefs:

1. You are definitely better at pattern recognition than your partner.
2. You are possibly better than your partner.
3. You are possibly worse than your partner.
4. You are definitely worse than your partner.

The control group for this experiment was told nothing about their own abilities or those of their partners. In other words, they had no expectations.

The final step in the experiment was to set the “teams” to work. As before, you and your partner would be given an initial pattern, followed by a comparison pair to choose from. When you entered your choice in this round, however, you would be told what your partner had answered; then you would be asked to choose again. In your final choice, you could either stick with your original choice or switch. The “partner’s” choice was, of course, created by the computer, and as you can guess, there were often disagreements in the teams: 16 out of 20 times, in fact.

The dependent variable in this experiment was the extent to which subjects would switch their choices to match those of their partners. The researchers hypothesized that the definitely better
group would switch least often, followed by the possibly better group, followed by the control group, followed by the possibly worse group, followed by the definitely worse group, who would switch most often.

The number of times subjects in the five groups switched their answers follows. Realize that each had 16 opportunities to do so. These data indicate that each of the researchers’ expectations was correct, with the exception of the comparison between the possibly worse and definitely worse groups. Although the latter group was in fact the more likely to switch, the difference was too small to be taken as a confirmation of the hypothesis. (Chapter 16 will discuss the statistical tests that let researchers make decisions like this.)

Group               Mean Number of Switches
Definitely better   5.05
Possibly better     6.23
Control group       7.95
Possibly worse      9.23
Definitely worse    9.28

In more-detailed analyses, it was found that the same basic pattern held for both men and women, though it was somewhat clearer for women than for men. Here are the actual data:

                    Mean Number of Switches
Group               Women    Men
Definitely better   4.50     5.66
Possibly better     6.34     6.10
Control group       7.68     8.34
Possibly worse      9.36     9.09
Definitely worse    10.00    8.70

Because specific research efforts like this one sometimes seem extremely focused in their scope, you might wonder about their relevance to anything. As part of a larger research effort, however, studies like this one add concrete pieces to our understanding of more-general social processes.

It’s worth taking a minute to consider some of the life situations where “expectation states” might have very real and important consequences. I’ve mentioned the case of jury deliberations. How about all forms of prejudice and discrimination? Or, consider how expectation states figure into job interviews or meeting your heartthrob’s parents. If you think about it, you’ll undoubtedly see other situations where these laboratory concepts apply in real life.

Alternative Experimental Settings

Although we tend to equate the terms experiment and laboratory experiment, many important social science experiments occur outside controlled settings, as we’ve seen in our example of the Rosenthal-Jacobson study of the Pygmalion effect. Two other special circumstances deserve mention here: web-based experiments and “natural” experiments.

Web-Based Experiments

Increasingly, researchers are using the World Wide Web as a vehicle for conducting experiments. Because representative samples are not essential in most experiments, researchers can often use volunteers who respond to invitations online. One site you might visit to get a better idea of this form of experimentation is Online Social Psychology Studies. This website offers hot links to numerous professional and student research projects on such topics as “interpersonal relations,” “beliefs and attitudes,” and “personality and individual differences.” In addition, the site offers some resources for conducting web experiments. (See the link on this book’s website: http://www.cengage.com/sociology/babbie.)

“Natural” Experiments

Important social science experiments can occur in the course of normal social events, outside controlled settings. Sometimes nature designs and
executes experiments that we can observe and analyze; sometimes social and political decision makers serve this natural function.

Imagine, for example, that a hurricane has struck a particular town. Some residents of the town suffer severe financial damages, and others escape relatively lightly. What, we might ask, are the behavioral consequences of suffering a natural disaster? Are those who suffer most more likely to take precautions against future disasters than those who suffer least are? To answer these questions, we might interview residents of the town some time after the hurricane. We might question them regarding their precautions before the hurricane and the ones they’re currently taking, comparing the people who suffered greatly from the hurricane with those who suffered relatively little. In this fashion, we might take advantage of a natural experiment, which we could not have arranged even if we’d been perversely willing to do so.

A similar example comes from the annals of social research concerning World War II. After the war ended, social researchers undertook retrospective surveys of wartime morale among civilians in several German cities. Among other things, they wanted to determine the effect of mass bombing on the morale of civilians. They compared the reports of wartime morale of residents in heavily bombed cities with reports from cities that received relatively little bombing. (Bombing did not reduce morale.)

Because the researcher must take things pretty much as they occur, natural experiments raise many of the validity problems discussed earlier. Thus, when Stanislav Kasl, Rupert Chisolm, and Brenda Eskenazi (1981) chose to study the impact that the Three Mile Island (TMI) nuclear accident in Pennsylvania had on plant workers, they had to be especially careful in the study design:

Disaster research is necessarily opportunistic, quasi-experimental, and after-the-fact. In the terminology of Campbell and Stanley’s classical analysis of research designs, our study falls into the “static-group comparison” category, considered one of the weak research designs. However, the weaknesses are potential and their actual presence depends on the unique circumstances of each study.
(1981: 474)

The foundation of this study was a survey of the people who had been working at Three Mile Island on March 28, 1979, when the cooling system failed in the number 2 reactor and began melting the uranium core. The survey was conducted five to six months after the accident. Among other things, the survey questionnaire measured workers’ attitudes toward working at nuclear power plants. If they had measured only the TMI workers’ attitudes after the accident, the researchers would have had no idea whether attitudes had changed as a consequence of the accident. But they improved their study design by selecting another nearby, seemingly comparable nuclear power plant (abbreviated as PB) and surveying workers there as a control group: hence their reference to a static-group comparison.

Even with an experimental and a control group, the authors were wary of potential problems in their design. In particular, their design was based on the idea that the two sets of workers were equivalent to each other, except for the single fact of the accident. The researchers could have assumed this if they had been able to assign workers to the two plants randomly, but of course that was not the case. Instead, they needed to compare characteristics of the two groups and infer whether or not they were equivalent. Ultimately, the researchers concluded that the two sets of workers were very much alike, and the plant the employees worked at was merely a function of where they lived.

Even granting that the two sets of workers were equivalent, the researchers faced another problem of comparability. They could not contact all the workers who had been employed at TMI at the time of the accident. The researchers discussed the problem as follows:

One special attrition problem in this study was the possibility that some of the no-contact nonrespondents among the TMI subjects, but not PB subjects, had permanently left the area because of the accident. This biased attrition
would, most likely, attenuate the estimated extent of the impact. Using the evidence of disconnected or “not in service” telephone numbers, we estimate this bias to be negligible (1 percent).
(Kasl, Chisolm, and Eskenazi 1981: 475)

The TMI example points to both the special problems involved in natural experiments and the possibility for taking those problems into account. Social research generally requires ingenuity and insight; natural experiments call for a little more than the average.

Earlier in this chapter, we used a hypothetical example of studying whether an African American history film reduced prejudice. Sandra Ball-Rokeach, Joel Grube, and Milton Rokeach (1981) were able to address that topic in real life through a natural experiment. In 1977, the television dramatization of Alex Haley’s Roots, a historical saga about African Americans, was presented by ABC on eight consecutive nights. It garnered the largest audiences in television history up to that time. Ball-Rokeach and her colleagues wanted to know whether Roots changed white Americans’ attitudes toward African Americans. Their opportunity arose in 1979, when a sequel, Roots: The Next Generation, was televised. Although it would have been nice (from a researcher’s point of view) to assign random samples of Americans either to watch or not to watch the show, that wasn’t possible. Instead, the researchers selected four samples in Washington State and mailed questionnaires that measured attitudes toward African Americans. Following the last episode of the show, respondents were called and asked how many, if any, episodes they had watched. Subsequently, questionnaires were sent to respondents, remeasuring their attitudes toward African Americans.

By comparing attitudes before and after for both those who watched the show and those who didn’t, the researchers reached several conclusions. For example, they found that people with already egalitarian attitudes were much more likely to watch the show than were those who were more prejudiced toward African Americans: a self-selection phenomenon. Comparing the before and after attitudes of those who watched the show, moreover, suggested the show itself had little or no effect. Those who watched it were no more egalitarian afterward than they had been before.

This example anticipates the subject of Chapter 12, evaluation research, which can be seen as a special type of natural experiment. As you’ll see, evaluation research involves taking the logic of experimentation into the field to observe and evaluate the effects of stimuli in real life. Because this is an increasingly important form of social research, an entire chapter is devoted to it.

Strengths and Weaknesses of the Experimental Method

Experiments are the primary tool for studying causal relationships. However, like all research methods, experiments have both strengths and weaknesses.

The chief advantage of a controlled experiment lies in the isolation of the experimental variable’s impact over time. This is seen most clearly in terms of the basic experimental model. A group of experimental subjects are found, at the outset of the experiment, to have a certain characteristic; following the administration of an experimental stimulus, they are found to have a different characteristic. To the extent that subjects have experienced no other stimuli, we may conclude that the change of characteristics is attributable to the experimental stimulus.

Further, because individual experiments are often rather limited in scope, requiring relatively little time and money and relatively few subjects, we often can replicate a given experiment several times using several different groups of subjects. (This isn’t always the case, of course, but it’s usually easier to repeat experiments than, say, surveys.) As in all other forms of scientific research, replication of research findings strengthens our confidence in the validity and generalizability of those findings.

The greatest weakness of laboratory experiments lies in their artificiality. Social processes that occur in a laboratory setting might not necessarily
occur in natural social settings. For example, an African American history film might genuinely reduce prejudice among a group of experimental subjects. This would not necessarily mean, however, that the same film shown in neighborhood movie theaters throughout the country would reduce prejudice among the general public. Artificiality is not as much of a problem, of course, for natural experiments as for those conducted in the laboratory.

In discussing several of the sources of internal and external invalidity mentioned by Campbell, Stanley, and Cook, we saw that we can create experimental designs that logically control such problems. This possibility points to one of the great advantages of experiments: They lend themselves to a logical rigor that is often much more difficult to achieve in other modes of observation.

Ethics and Experiments

As you’ve probably seen, researchers must consider many important ethical issues in conducting social science experiments. I’ll mention only two here.

First, experiments almost always involve deception. In most cases, explaining the purpose of the experiment to subjects would probably cause them to behave differently (trying to look less prejudiced, for example). It’s important, therefore, to determine (1) whether a particular deception is essential to the experiment and (2) whether the value of what may be learned from the experiment justifies the ethical violation.

Second, experiments are typically intrusive. Subjects often are placed in unusual situations and asked to undergo unusual experiences. Even when the subjects are not physically injured (don’t do that, by the way), there is always the possibility that they will be psychologically damaged, as some of the previous examples in this chapter have illustrated. As with the matter of deception, you’ll find yourself balancing the potential value of the research against the potential damage to subjects.

MAIN POINTS

Introduction

• In experiments, social researchers typically select a group of subjects, do something to them, and observe the effect of what was done.

Topics Appropriate for Experiments

• Experiments are an excellent vehicle for the controlled testing of causal processes.

The Classical Experiment

• The classical experiment tests the effect of an experimental stimulus (the independent variable) on a dependent variable through the pretesting and posttesting of experimental and control groups.
• It is generally less important that a group of experimental subjects be representative of some larger population than that experimental and control groups be similar to each other.
• A double-blind experiment guards against experimenter bias, because neither the experimenter nor the subject knows which subjects are in the control group(s) and which in the experimental group(s).

Selecting Subjects

• Probability sampling, randomization, and matching are all methods of achieving comparability in the experimental and control groups. Randomization is the generally preferred method. In some designs, it can be combined with matching.

Variations on Experimental Design

• Campbell and Stanley describe three forms of preexperiments: the one-shot case study, the one-group pretest-posttest design, and the static-group comparison. None of these designs features all the controls available in a true experiment.
• Campbell and Stanley list, among others, 12 sources of internal invalidity in experimental design. The classical experiment with random assignment of subjects guards against each of these problems.
• Experiments also face problems of external invalidity: Experimental findings may not reflect real life.
• The interaction of testing and stimulus is an example of external invalidity that the classical experiment does not guard against.
• The Solomon four-group design and other variations on the classical experiment can safeguard against external invalidity.
• Campbell and Stanley suggest that, given proper randomization in the assignment of subjects to the experimental and control groups, there is no need for pretesting in experiments.

An Illustration of Experimentation

• Experiments on “expectation states” demonstrate experimental designs and show how experiments can prove relevant to real-world concerns.

Alternative Experimental Settings

• More and more, researchers are using the Internet for conducting experiments.
• Natural experiments often occur in the course of social life in the real world, and social researchers can implement them in somewhat the same way they would design and conduct laboratory experiments.

Strengths and Weaknesses of the Experimental Method

• Like all research methods, experiments have strengths and weaknesses. Their primary weakness is artificiality: What happens in an experiment may not reflect what happens in the outside world. Their strengths include the isolation of the independent variable, which permits causal inferences; the relative ease of replication; and scientific rigor.

Ethics and Experiments

• Experiments typically involve deceiving subjects.
• By their intrusive nature, experiments open the possibility of inadvertently causing damages to subjects.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

control group
double-blind experiment
experimental group
external invalidity
internal invalidity
matching
posttesting
pretesting
randomization

PROPOSING SOCIAL RESEARCH: EXPERIMENTS

In the next series of exercises, we’ll focus on specific data-collection techniques, beginning with experiments here. If you’re doing these exercises as part of an assignment in the course, your instructor will tell you whether you should skip those chapters dealing with methods you won’t use. If you’re doing these exercises on your own, to improve your understanding of the topics in the book, you can temporarily modify your proposed data-collection method and explore how you would research your topic using the method at hand, in this case experimentation.

In the proposal, you’ll describe the experimental stimulus and how it will be administered, as well as detailing the experimental and control groups you’ll use. You’ll also describe the pretesting and posttesting that will be involved in your experiment. What will be the setting for your experiments: a laboratory or more natural circumstances?

It may be appropriate for you to conduct a double-blind experiment, in which case you should describe how you will accomplish it. You may also need to explore some of the internal and external problems of validity that might complicate your analysis of your results.

Finally, the experimental model is used to test specific hypotheses, so you should detail how you will accomplish that in terms of your study.

REVIEW QUESTIONS AND EXERCISES

1. In the library or on the web, locate a research report of an experiment. Identify the dependent variable and the stimulus.
2. Pick 6 of the 12 sources of internal invalidity discussed in this chapter and make up examples (not discussed in the chapter) to illustrate each.
3. Create a hypothetical experimental design that illustrates one of the problems of external invalidity.
4. Think of a recent natural disaster you’ve witnessed or read about. Frame a research question
that might be studied by treating that disaster as a natural experiment. In two or three paragraphs, outline how the study might be done.
5. In this chapter, we looked briefly at the problem of “placebo effects.” On the web, find a study in which the placebo effect figured importantly. Write a brief report on the study, including the source of your information. (Hint: you might want to do a search on “placebo.”)

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you’ll also find a detailed primer on using SPSS.

Online Study Resources

If your book came with an access code card, visit www.cengage.com/login to register. To purchase access, please visit www.ichapters.com.

1. Before you do your final review of the chapter, take the CengageNOW pretest to help identify the areas on which you should concentrate. You’ll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.
2. As you review, take advantage of the CengageNOW personalized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.
3. When you’re finished with your review, take the posttest to confirm that you’re ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 12TH EDITION

Go to your book’s website at www.cengage.com/sociology/babbie for tools to aid you in studying for your exams. You’ll find Tutorial Quizzes with feedback, Internet Exercises, Flash Cards, Glossaries, and Essay Quizzes, as well as InfoTrac College Edition search terms, suggestions for additional reading, Web Links, and primers for using data-analysis software such as SPSS.
CHAPTER 9

Survey Research

CHAPTER OVERVIEW

Researchers have many methods for collecting data through surveys, from mail questionnaires to personal interviews to online surveys conducted over the Internet. Social researchers should know how to select an appropriate method and how to implement it effectively.

Introduction

Topics Appropriate for Survey Research

Guidelines for Asking Questions
  Choose Appropriate Question Forms
  Make Items Clear
  Avoid Double-Barreled Questions
  Respondents Must Be Competent to Answer
  Respondents Must Be Willing to Answer
  Questions Should Be Relevant
  Short Items Are Best
  Avoid Negative Items
  Avoid Biased Items and Terms

Questionnaire Construction
  General Questionnaire Format
  Formats for Respondents
  Contingency Questions
  Matrix Questions
  Ordering Items in a Questionnaire
  Questionnaire Instructions
  Pretesting the Questionnaire
  A Composite Illustration

Self-Administered Questionnaires
  Mail Distribution and Return
  Monitoring Returns
  Follow-up Mailings
  Response Rates
  A Case Study

Interview Surveys
  The Role of the Survey Interviewer
  General Guidelines for Survey Interviewing
  Coordination and Control

Telephone Surveys
  Computer-Assisted Telephone Interviewing (CATI)
  Response Rates in Interview Surveys

Online Surveys

Comparison of the Different Survey Methods

Strengths and Weaknesses of Survey Research

Secondary Analysis

Ethics and Survey Research

CengageNOW for Sociology

Use this online tool to help you make the grade on your next exam. After reading this chapter, go to “Online Study Resources” at the end of the chapter for instructions on how to benefit from CengageNOW.
Introduction

Surveys are a very old research technique. In the Old Testament, for example, we find the following:

    After the plague the Lord said to Moses and to Eleazar the son of Aaron, the priest, “Take a census of all the congregation of the people of Israel, from twenty years old and upward.”
    (Numbers 26: 1–2)

Ancient Egyptian rulers conducted censuses to help them administer their domains. Jesus was born away from home because Joseph and Mary were journeying to Joseph’s ancestral home for a Roman census.

A little-known survey was attempted among French workers in 1880. A German political sociologist mailed some 25,000 questionnaires to workers to determine the extent of their exploitation by employers. The rather lengthy questionnaire included items such as these:

    Does your employer or his representative resort to trickery in order to defraud you of a part of your earnings?

    If you are paid piece rates, is the quality of the article made a pretext for fraudulent deductions from your wages?

The survey researcher in this case was not George Gallup but Karl Marx ([1880] 1956: 208). Though 25,000 questionnaires were mailed out, there is no record of any being returned.

Today, survey research is a frequently used mode of observation in the social sciences. In a typical survey, the researcher selects a sample of respondents and administers a standardized questionnaire to them. Chapter 7 discussed sampling techniques in detail. This chapter discusses how to prepare a questionnaire and describes the various options for administering it so that respondents answer your questions adequately.

respondent: A person who provides data for analysis by responding to a survey questionnaire.

The chapter includes a short discussion of secondary analysis, the analysis of survey data collected by someone else. This use of survey results has become an important aspect of survey research in recent years, and it’s especially useful for students and others with scarce research funds.

Let’s begin by looking at the kinds of topics that researchers can appropriately study by using survey research.

Topics Appropriate for Survey Research

Surveys may be used for descriptive, explanatory, and exploratory purposes. They are chiefly used in studies that have individual people as the units of analysis. Although this method can be used for other units of analysis, such as groups or interactions, some individual persons must serve as respondents or informants. Thus, we could undertake a survey in which divorces were the unit of analysis, but we would need to administer the survey questionnaire to the participants in the divorces (or to some other respondents).

Survey research is probably the best method available to the social researcher who is interested in collecting original data for describing a population too large to observe directly. Careful probability sampling provides a group of respondents whose characteristics may be taken to reflect those of the larger population, and carefully constructed standardized questionnaires provide data in the same form from all respondents.

Surveys are also excellent vehicles for measuring attitudes and orientations in a large population. Public opinion polls (for example, Gallup, Harris, Roper, and Yankelovich) are well-known examples of this use. Indeed, polls have become so prevalent that at times the public seems unsure what to think of them. Pollsters are criticized by those who don’t think (or want to believe) that polls are accurate (candidates who are “losing” in polls often tell voters not to trust the polls). But
polls are also criticized for being too accurate: for example, when exit polls on election day are used to predict a winner before the actual voting is complete.

The general attitude toward public opinion research is further complicated by scientifically unsound “surveys” that nonetheless capture people’s attention because of the topics they cover and/or their “findings.” A good example is the “Hite Reports” on human sexuality. While enjoying considerable attention in the popular press, Shere Hite was roundly criticized by the research community for her data-collection methods. For example, a 1987 Hite report was based on questionnaires completed by women around the country. But which women? Hite reported that she distributed some 100,000 questionnaires through various organizations, and around 4,500 were returned.

Now 4,500 and 100,000 are large numbers in the context of survey sampling. However, given Hite’s research methods, her 4,500 respondents didn’t necessarily represent U.S. women any more than the Literary Digest’s enormous 1936 sample represented the U.S. electorate when their 2 million sample ballots indicated that Alf Landon would bury FDR in a landslide.

Sometimes, people use the pretense of survey research for quite different purposes. For example, you may have received a telephone call indicating you’ve been selected for a survey, only to find that the first question was “How would you like to make thousands of dollars a week right there in your own home?” Or you may have been told you could win a prize if you could name the president whose picture is on the penny. (Tell them it’s Elvis.) Unfortunately, a few unscrupulous telemarketers try to prey on the general cooperation people have given to survey researchers.

By the same token, political parties and charitable organizations have begun conducting phony “surveys.” Often under the guise of collecting public opinion about some issue, callers ultimately ask respondents for a monetary contribution.

Recent political campaigns have produced another form of bogus survey, the “push poll.” Here’s what the American Association for Public Opinion Research has said in condemning this practice (see also Figure 3-1):

    A “push poll” is a telemarketing technique in which telephone calls are used to canvass potential voters, feeding them false or misleading “information” about a candidate under the pretense of taking a poll to see how this “information” affects voter preferences. In fact, the intent is not to measure public opinion but to manipulate it, to “push” voters away from one candidate and toward the opposing candidate. Such polls defame selected candidates by spreading false or misleading information about them. The intent is to disseminate campaign propaganda under the guise of conducting a legitimate public opinion poll.
    (Bednarz 1996)

In short, the labels “survey” and “poll” are sometimes misused. Done properly, however, survey research can be a useful tool of social inquiry. Designing useful (and trustworthy) survey research begins with formulating good questions. Let’s turn to that topic now.

Guidelines for Asking Questions

In social research, variables are often operationalized when researchers ask people questions as a way of getting data for analysis and interpretation. Sometimes the questions are asked by an interviewer; sometimes they are written down and given to respondents for completion. In either case, several general guidelines can help researchers frame and ask questions that serve as excellent operationalizations of variables while avoiding pitfalls that can result in useless or even misleading information.

Surveys include the use of a questionnaire, an instrument specifically designed to elicit information that will be useful for analysis. Although some of the specific points to follow are more appropriate to structured questionnaires than
to the more open-ended questionnaires used in qualitative, in-depth interviewing, the underlying logic is valuable whenever we ask people questions in order to gather data.

questionnaire: A document containing questions and other types of items designed to solicit information appropriate for analysis. Questionnaires are used primarily in survey research but also in experiments, field research, and other modes of observation.

Choose Appropriate Question Forms

Let’s begin with some of the options available to you in creating questionnaires. These options include using questions or statements and choosing open-ended or closed-ended questions.

Questions and Statements

Although the term questionnaire suggests a collection of questions, an examination of a typical questionnaire will probably reveal as many statements as questions. This is not without reason. Often, the researcher is interested in determining the extent to which respondents hold a particular attitude or perspective. If you can summarize the attitude in a fairly brief statement, you can present that statement and ask respondents whether they agree or disagree with it. As you may remember, Rensis Likert greatly formalized this procedure through the creation of the Likert scale, a format in which respondents are asked to strongly agree, agree, disagree, or strongly disagree, or perhaps strongly approve, approve, and so forth.

Both questions and statements can be used profitably. Using both in a given questionnaire gives you more flexibility in the design of items and can make the questionnaire more interesting as well.

Open-Ended and Closed-Ended Questions

In asking questions, researchers have two options. They can ask open-ended questions, in which case the respondent is asked to provide his or her own answers to the questions. For example, the respondent may be asked, “What do you feel is the most important issue facing the United States today?” and be provided with a space to write in the answer (or be asked to report it verbally to an interviewer). As we’ll see in Chapter 10, in-depth, qualitative interviewing relies almost exclusively on open-ended questions. However, they are also used in survey research.

In the case of closed-ended questions, the respondent is asked to select an answer from among a list provided by the researcher. Closed-ended questions are very popular in survey research because they provide a greater uniformity of responses and are more easily processed than open-ended ones.

open-ended questions: Questions for which the respondent is asked to provide his or her own answers. In-depth, qualitative interviewing relies almost exclusively on open-ended questions.

closed-ended questions: Survey questions in which the respondent is asked to select an answer from among a list provided by the researcher. Popular in survey research because they provide a greater uniformity of responses and are more easily processed than open-ended questions.

Open-ended responses must be coded before they can be processed for computer analysis, as we’ll see in Chapter 14. This coding process often requires the researcher to interpret the meaning of responses, opening the possibility of misunderstanding and researcher bias. There is also a danger that some respondents will give answers that are essentially irrelevant to the researcher’s intent. Closed-ended responses, on the other hand, can often be transferred directly into a computer format.

The chief shortcoming of closed-ended questions lies in the researcher’s structuring of responses. When the relevant answers to a given question are relatively clear, there should be no problem. In other cases, however, the researcher’s structuring of responses may overlook some important responses. In asking about “the most important issue facing the United States,” for example, his or her checklist of issues might omit certain
issues that respondents would have said were important.

The construction of closed-ended questions should be guided by two structural requirements. First, the response categories provided should be exhaustive: They should include all the possible responses that might be expected. Often, researchers ensure this by adding a category such as “Other (Please specify: ________).” Second, the answer categories must be mutually exclusive: The respondent should not feel compelled to select more than one. (In some cases, you may wish to solicit multiple answers, but these may create difficulties in data processing and analysis later on.) To ensure that your categories are mutually exclusive, carefully consider each combination of categories, asking yourself whether a person could reasonably choose more than one answer. In addition, it’s useful to add an instruction to the question asking the respondent to select the one best answer, but this technique is not a satisfactory substitute for a carefully constructed set of responses.

Make Items Clear

It should go without saying that questionnaire items need to be clear and unambiguous, but the broad proliferation of unclear and ambiguous questions in surveys makes the point worth emphasizing. We can become so deeply involved in the topic under examination that opinions and perspectives are clear to us but not to our respondents, many of whom have paid little or no attention to the topic. Or, if we have only a superficial understanding of the topic, we may fail to specify the intent of a question sufficiently. The question “What do you think about the proposed peace plan?” may evoke in the respondent a counterquestion: “Which proposed peace plan?” Questionnaire items should be precise so that the respondent knows exactly what the researcher is asking. The possibilities for misunderstanding are endless, and no researcher is immune (Polivka and Rothgeb 1993).

One of the most established research projects in the United States is the Census Bureau’s ongoing “Current Population Survey” or CPS, which measures, among other critical data, the nation’s unemployment rate. A part of the measurement of employment patterns focuses on a respondent’s activities during “last week,” by which the Census Bureau means Sunday through Saturday. Studies undertaken to determine the accuracy of the survey found that more than half the respondents took “last week” to include only Monday through Friday. By the same token, whereas the Census Bureau defines “working full-time” as 35 or more hours a week, the same evaluation studies showed that some respondents used the more traditional definition of 40 hours per week. As a consequence, the wording of these questions in the CPS was modified in 1994 to specify the Census Bureau’s definitions.

Similarly, the use of the term Native American to mean American Indian often produces an overrepresentation of that ethnic group in surveys. Clearly, many respondents understand the term to mean “born in the United States.”

Avoid Double-Barreled Questions

Frequently, researchers ask respondents for a single answer to a question that actually has multiple parts. That seems to happen most often when the researcher has personally identified with a complex question. For example, you might ask respondents to agree or disagree with the statement “The United States should abandon its space program and spend the money on domestic programs.” Although many people would unequivocally agree with the statement and others would unequivocally disagree, still others would be unable to answer. Some would want to abandon the space program and give the money back to the taxpayers. Others would want to continue the space program but also put more money into domestic programs. These latter respondents could neither agree nor disagree without misleading you.

As a general rule, whenever the word and appears in a question or questionnaire statement, check whether you’re asking a double-barreled question. See “Double-Barreled and Beyond” for some imaginative variations on this theme.
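Two of the guidelines in this section lend themselves to simple mechanical checks before a questionnaire is pretested: whether a set of numeric response categories is exhaustive and mutually exclusive, and whether an item contains the word and. The Python sketch below is offered only as an illustration under stated assumptions. The function names, the representation of categories as inclusive integer ranges, and the hours-worked categories are all hypothetical; they do not come from any survey discussed in the text. A flag from either check is a prompt for human judgment, not a verdict.

```python
import re


def check_numeric_categories(categories, low, high):
    """Check integer response categories against the two structural requirements.

    categories: list of (label, lower, upper) tuples with inclusive bounds.
    low, high: the full range of answers that might reasonably be expected.
    Returns (gaps, overlaps): values that no category covers, and values
    that fall into more than one category.
    """
    gaps, overlaps = [], []
    for value in range(low, high + 1):
        hits = [label for label, lo, hi in categories if lo <= value <= hi]
        if not hits:
            gaps.append(value)        # categories are not exhaustive
        elif len(hits) > 1:
            overlaps.append(value)    # categories are not mutually exclusive
    return gaps, overlaps


def mentions_and(item):
    """Flag an item for manual review if the whole word 'and' appears in it."""
    return re.search(r"\band\b", item, flags=re.IGNORECASE) is not None


# Hypothetical hours-worked categories; 40 is deliberately listed twice.
hours = [("Under 35", 0, 34), ("35 to 40", 35, 40), ("40 or more", 40, 99)]
gaps, overlaps = check_numeric_categories(hours, 0, 99)
print(gaps, overlaps)  # [] [40]

statement = ("The United States should abandon its space program "
             "and spend the money on domestic programs.")
print(mentions_and(statement))  # True
```

The word-boundary pattern keeps words such as “expand” from triggering the flag. Nominal categories (for example, a checklist of issues) cannot be checked this way and still require the case-by-case reasoning described above.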
Double-Barreled and Beyond

Even established, professional researchers have sometimes created double-barreled questions and worse. Consider this question, asked of U.S. citizens in April 1986, at a time when the country’s relationship with Libya was at an especially low point. Some observers suggested that the United States might end up in a shooting war with the small North African nation. The Harris Poll sought to find out what U.S. public opinion was.

    If Libya now increases its terrorist acts against the U.S. and we keep inflicting more damage on Libya, then inevitably it will all end in the U.S. going to war and finally invading that country which would be wrong.

Respondents were given the opportunity of answering “Agree,” “Disagree,” or “Not sure.” Notice the elements contained in the complex statement:

1. Will Libya increase its terrorist acts against the U.S.?
2. Will the U.S. inflict more damage on Libya?
3. Will the U.S. inevitably or otherwise go to war against Libya?
4. Would the U.S. invade Libya?
5. Would that be right or wrong?

These several elements offer the possibility of numerous points of view, far more than the three alternatives offered to the survey respondents. Even if we were to assume hypothetically that Libya would “increase its terrorist attacks” and the United States would “keep inflicting more damage” in return, you might have any one of at least seven distinct expectations about the outcome:

                                               U.S. will not   War is probable      War is
                                               go to war       but not inevitable   inevitable
U.S. will not invade Libya                     1               2                    3
U.S. will invade Libya but it would be wrong                   4                    5
U.S. will invade Libya and it would be right                   6                    7

The examination of prognoses about the Libyan situation is not the only example of double-barreled questions sneaking into public opinion research. Here are some questions the Harris Poll asked in an attempt to gauge U.S. public opinion about then Soviet General Secretary Gorbachev:

    He looks like the kind of Russian leader who will recognize that both the Soviets and the Americans can destroy each other with nuclear missiles so it is better to come to verifiable arms control agreements.

    He seems to be more modern, enlightened, and attractive, which is a good sign for the peace of the world.

    Even though he looks much more modern and attractive, it would be a mistake to think he will be much different from other Russian leaders.

How many elements can you identify in each of the questions? How many possible opinions could people have in each case? What does a simple “agree” or “disagree” really mean in such cases?

Sources: Reported in World Opinion Update, October 1985 and May 1986, respectively.

Respondents Must Be Competent to Answer

In asking respondents to provide information, you should continually ask yourself whether they can do so reliably. In a study of child rearing, you might ask respondents to report the age at which they first talked back to their parents. Quite aside from the problem of defining talking back to parents, it’s doubtful that most respondents would remember with any degree of accuracy.

As another example, student government leaders occasionally ask their constituents to indicate how students’ fees ought to be spent. Typically, respondents are asked to indicate the percentage of available funds that should be devoted to a long list of activities. Without a fairly good knowledge of the nature of those activities and the costs involved in them, the respondents cannot provide meaningful answers. Administrative costs, for example, will receive little support although they may be essential to the program as a whole.
One group of researchers examining the driving experience of teenagers insisted on asking an open-ended question concerning the number of miles driven since receiving a license. Although consultants argued that few drivers would be able to estimate such information with any accuracy, the question was asked nonetheless. In response, some teenagers reported driving hundreds of thousands of miles.

Respondents Must Be Willing to Answer

Often, we would like to learn things from people that they are unwilling to share with us. For example, Yanjie Bian indicates that it has often been difficult to get candid answers from people in China.

    [Here] people are generally careful about what they say on nonprivate occasions in order to survive under authoritarianism. During the Cultural Revolution between 1966 and 1976, for example, because of the radical political agenda and political intensity throughout the country, it was almost impossible to use survey techniques to collect valid and reliable data inside China about the Chinese people’s life experiences, characteristics, and attitudes towards the Communist regime.
    (1994: 19–20)

Sometimes, U.S. respondents say they’re undecided when, in fact, they have an opinion but think they’re in a minority. Under that condition, they may be reluctant to tell a stranger (the interviewer) what that opinion is. Given this problem, the Gallup Organization, for example, has used a “secret ballot” format, which simulates actual election conditions, in that the “voter” enjoys complete anonymity. In an analysis of the Gallup Poll election data from 1944 to 1988, Andrew Smith and G. F. Bishop (1992) have found that this technique substantially reduced the percentage of respondents who said they were undecided about how they would vote.

This problem is not limited to survey research, however. Richard Mitchell (1991: 100) faced a similar problem in his field research among U.S. survivalists:

    Survivalists, for example, are ambivalent about concealing their identities and inclinations. They realize that secrecy protects them from the ridicule of a disbelieving majority, but enforced separatism diminishes opportunities for recruitment and information exchange. . . .

    “Secretive” survivalists eschew telephones, launder their mail through letter exchanges, use nicknames and aliases, and carefully conceal their addresses from strangers. Yet once I was invited to group meetings, I found them cooperative respondents.

Questions Should Be Relevant

Similarly, questions asked in a questionnaire should be relevant to most respondents. When attitudes are requested on a topic that few respondents have thought about or really care about, the results are not likely to be useful. Of course, because the respondents may express attitudes even though they’ve never given any thought to the issue, you run the risk of being misled.

This point is illustrated occasionally when researchers ask for responses relating to fictitious people and issues. In one political poll I conducted, I asked respondents whether they were familiar with each of 15 political figures in the community. As a methodological exercise, I made up a name: Tom Sakumoto. In response, 9 percent of the respondents said they were familiar with him. Of those respondents familiar with him, about half reported seeing him on television and reading about him in the newspapers.

When you obtain responses to fictitious issues, you can disregard those responses. But when the issue is real, you may have no way of telling which responses genuinely reflect attitudes and which reflect meaningless answers to an irrelevant question.

Ideally, we would like respondents to simply report that they don’t know, have no opinion, or are undecided in those instances where that is the case. Unfortunately, however, they often make up answers.
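The fictitious-name exercise described in this section can be expressed as a small screening routine. The Python sketch below is purely illustrative: the data layout (one dictionary per respondent), the function name screen_fictitious, and the 100-respondent sample are assumptions made for the example and are not the data from the poll described in the text. It shows one way to estimate how many respondents claim familiarity with a made-up item and to identify which answers might be set aside.

```python
def screen_fictitious(responses, fictitious_item):
    """Split respondents by whether they claimed to know a made-up item.

    responses: list of dicts mapping an item name to True/False ("familiar").
    Returns (share_claiming, flagged_indexes), where flagged_indexes are the
    positions of respondents who claimed familiarity with the invented item.
    """
    flagged = [i for i, r in enumerate(responses) if r.get(fictitious_item, False)]
    share = len(flagged) / len(responses) if responses else 0.0
    return share, flagged


# Hypothetical data: 100 respondents, 9 of whom claim to know the invented name.
sample = [{"Tom Sakumoto": i < 9} for i in range(100)]
share, flagged = screen_fictitious(sample, "Tom Sakumoto")
print(f"{share:.0%} claimed familiarity")  # 9% claimed familiarity
```

As the text cautions, a check of this kind works only for the fictitious item itself. It cannot tell you which answers to real items are equally empty.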
Short Items Are Best

In the interests of being unambiguous and precise and of pointing to the relevance of an issue, researchers tend to create long and complicated items. That should be avoided. Respondents are often unwilling to study an item in order to understand it. The respondent should be able to read an item quickly, understand its intent, and select or provide an answer without difficulty. In general, assume that respondents will read items quickly and give quick answers. Accordingly, provide clear, short items that will not be misinterpreted under those conditions.

Avoid Negative Items

The appearance of a negation in a questionnaire item paves the way for easy misinterpretation. Asked to agree or disagree with the statement “The United States should not recognize Cuba,” a sizable portion of the respondents will read over the word not and answer on that basis. Thus, some will agree with the statement when they’re in favor of recognition, and others will agree when they oppose it. And you may never know which are which.

Similar considerations apply to other “negative” words. In a study of support for civil liberties, for example, respondents were asked whether they felt “the following kinds of people should be prohibited from teaching in public schools” and were presented with a list including such items as a Communist, a Ku Klux Klansman, and so forth. The response categories “yes” and “no” were given beside each entry. A comparison of the responses to this item with other items reflecting support for civil liberties strongly suggested that many respondents gave the answer “yes” to indicate willingness for such a person to teach, rather than to indicate that such a person should be prohibited from teaching. (A later study in the series using the answer categories “permit” and “prohibit” produced much clearer results.)

In 1993 a national survey commissioned by the American Jewish Committee produced shocking results: One American in five believed that the Nazi Holocaust, in which six million Jews were killed, never happened; further, one in three Americans expressed some doubt that it had occurred. This research finding suggested that the Holocaust Revisionist movement in America was powerfully influencing public opinion (“1 in 5 Polled Voices Doubt on Holocaust” 1993).

In the aftermath of this shocking news, researchers reexamined the actual question that had been asked: “Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?” On reflection, it seemed clear that the complex, double-negative question could have confused some respondents.

A new survey was commissioned and asked, “Does it seem possible to you that the Nazi extermination of the Jews never happened, or do you feel certain that it happened?” In the follow-up survey, only 1 percent of the respondents believed the Holocaust never happened, and another 8 percent said they weren’t sure (“Poll on Doubt of Holocaust Is Corrected” 1994).

Avoid Biased Items and Terms

Recall from our discussion of conceptualization and operationalization in Chapter 5 that there are no ultimately true meanings for any of the concepts we typically study in social science. Prejudice has no ultimately correct definition; whether a given person is prejudiced depends on our definition of that term. The same general principle applies to the responses we get from people completing a questionnaire.

bias: That quality of a measurement device that tends to result in a misrepresentation of what is being measured in a particular direction. For example, the questionnaire item “Don’t you agree that the president is doing a good job?” would be biased in that it would generally encourage more favorable responses.

The meaning of someone’s response to a question depends in large part on its wording. This is true of every question and answer. Some questions seem to encourage particular responses more than other questions do. In the context of questionnaires, bias
Guidelines for Asking Questions ■ 261

refers to any property of questions that encourages respondents to answer in a particular way.

Most researchers recognize the likely effect of a question that begins, “Don’t you agree with the President of the United States that . . .” No reputable researcher would use such an item. Unhappily, the biasing effect of items and terms is far subtler than this example suggests.

The mere identification of an attitude or position with a prestigious person or agency can bias responses. The item “Do you agree or disagree with the recent Supreme Court decision that . . .” would have a similar effect. Such wording may not produce consensus or even a majority in support of the position identified with the prestigious person or agency, but it will likely increase the level of support over what would have been obtained without such identification.

Sometimes the impact of different forms of question wording is relatively subtle. For example, when Kenneth Rasinski (1989) analyzed the results of several General Social Survey studies of attitudes toward government spending, he found that the way programs were identified had an impact on the amount of public support they received. Here are some comparisons:

More Support                         Less Support
“Assistance to the poor”             “Welfare”
“Halting rising crime rate”          “Law enforcement”
“Dealing with drug addiction”        “Drug rehabilitation”
“Solving problems of big cities”     “Assistance to big cities”
“Improving conditions of blacks”     “Assistance to blacks”
“Protecting social security”         “Social security”

In 1986, for example, 62.8 percent of the respondents said too little money was being spent on “assistance to the poor,” whereas in a matched survey that year, only 23.1 percent said we were spending too little on “welfare.”

In this context, be wary of what researchers call the social desirability of questions and answers. Whenever we ask people for information, they answer through a filter of what will make them look good. This is especially true if they’re interviewed face-to-face. Thus, for example, during the 2008 Democratic primary, many voters who might have been reluctant to vote for an African American (Barack Obama) or a woman (Hillary Clinton) might have also been reluctant to admit their racial or gender prejudice to a survey interviewer. (Some, to be sure, were not reluctant to say how they felt.)

The best way to guard against this problem is to imagine how you would feel giving each of the answers you intend to offer to respondents. If you would feel embarrassed, perverted, inhumane, stupid, irresponsible, or otherwise socially disadvantaged by any particular response, give serious thought to how willing others will be to give those answers.

The biasing effect of particular wording is often difficult to anticipate. For example, in both surveys and experiments, researchers sometimes ask respondents to consider hypothetical situations and say how they think they would behave. Those situations often involve other people, however, and the names used can affect responses. For instance, researchers have long known that male names for the hypothetical people can produce different responses than female names do. Research by Joseph Kasof (1993) points to the importance of what the specific names are: whether they generally evoke positive or negative images in terms of attractiveness, age, intelligence, and so forth. Kasof’s review of past research suggests there has been a tendency to use more positively valued names for men than for women.

The Centers for Disease Control and Prevention (Choi and Pak 2005) has provided an excellent analysis of various ways in which your choice of terms can bias and otherwise confuse responses to questionnaires. Among other things, they warn against using ambiguous, technical, uncommon, or vague words. Their thorough analysis provides many concrete illustrations.

As in all other research, carefully examine the purpose of your inquiry and construct items that will be most useful to it. You should never be misled into thinking there are ultimately “right” and “wrong” ways of asking the questions. When in doubt about the best question to ask, moreover, remember that you should ask more than one.
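The Rasinski comparison reported earlier in this section can be restated as simple arithmetic. What follows is a minimal Python sketch, not part of the original study: the helper name `wording_gap` and the dictionary layout are hypothetical, and only the two 1986 percentages come from the text.

```python
# Percent of 1986 respondents saying "too little" is being spent,
# keyed by the label used for the program (figures quoted in the text).
TOO_LITTLE_1986 = {
    "assistance to the poor": 62.8,
    "welfare": 23.1,
}


def wording_gap(pct_form_a, pct_form_b):
    """Percentage-point difference between two wordings (hypothetical helper)."""
    return round(pct_form_a - pct_form_b, 1)


gap = wording_gap(
    TOO_LITTLE_1986["assistance to the poor"],
    TOO_LITTLE_1986["welfare"],
)
print(f"Wording gap: {gap} percentage points")  # Wording gap: 39.7 percentage points
```

Because the two items were asked in matched surveys in the same year, a gap of this size is read as an effect of the label rather than as a shift in opinion.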
262 ■ Chapter 9: Survey Research

These, then, are some general guidelines for writing questions to elicit data for analysis and interpretation. Next we look at how to construct questionnaires.

Questionnaire Construction

Questionnaires are used in connection with many modes of observation in social research. Although structured questionnaires are essential to and most directly associated with survey research, they are also widely used in experiments, field research, and other data-collection activities. For this reason, questionnaire construction can be an important practical skill for researchers. As we discuss the established techniques for constructing questionnaires, let’s begin with some issues of questionnaire format.

General Questionnaire Format

The format of a questionnaire is just as important as the nature and wording of the questions asked. An improperly laid out questionnaire can lead respondents to miss questions, confuse them about the nature of the data desired, and even lead them to throw the questionnaire away.

As a general rule, a questionnaire should be spread out and uncluttered. If a self-administered questionnaire is being designed, inexperienced researchers tend to fear that their questionnaire will look too long; as a result, they squeeze several questions onto a single line, abbreviate questions, and try to use as few pages as possible. These efforts are ill-advised and even dangerous. Putting more than one question on a line will cause some respondents to miss the second question altogether. Some respondents will misinterpret abbreviated questions. More generally, respondents who find they have spent considerable time on the first page of what seemed like a short questionnaire will be more demoralized than respondents who quickly complete the first several pages of what initially seemed like a rather long form. Moreover, the latter will have made fewer errors and will not have been forced to reread confusing, abbreviated questions. Nor will they have been forced to write a long answer in a tiny space.

Similar problems can arise for interviewers in a face-to-face or telephone interview. Like respondents to a self-administered questionnaire, interviewers may miss questions, lose their place, and generally become frustrated and flustered. Interview questionnaires need to be laid out in a way that supports the interviewer’s work, including special instructions and guidelines that go beyond what respondents to a self-administered questionnaire would need.

The desirability of spreading out questions in the questionnaire cannot be overemphasized. Squeezed-together questionnaires are disastrous, whether completed by the respondents themselves or administered by trained interviewers. The processing of such questionnaires is another nightmare; I’ll have more to say about that in Chapter 14.

Formats for Respondents

In one of the most common types of questionnaire items, the respondent is expected to check one response from a series. For this purpose my experience has been that boxes adequately spaced apart are the best format. Word processing makes the use of boxes a practical technique these days; setting boxes in type can be accomplished easily and neatly. You can approximate boxes by using brackets: [ ], but if you’re creating a questionnaire on a computer, you should take the few extra minutes to use genuine boxes that will give your questionnaire a more professional look. Here are some easy examples:

□   ❍   ❑

Rather than providing boxes to be checked, you might print a code number beside each response and ask the respondent to circle the appropriate number (see Figure 9-1). This method has the added advantage of specifying the code number to be entered later in the processing stage (see Chapter 14). If numbers are to be circled, however, you should provide clear and prominent instructions to the respondent, because many will be
tempted to cross out the appropriate number, which makes data processing more difficult. (Note that the technique can be used more safely when interviewers administer the questionnaires, because the interviewers themselves record the responses.)

FIGURE 9-1 Circling the Answer

Contingency Questions

Quite often in questionnaires, certain questions will be relevant to some of the respondents and irrelevant to others. In a study of birth control methods, for instance, you would probably not want to ask men if they take birth control pills.

This sort of situation often arises when researchers wish to ask a series of questions about a certain topic. You may want to ask whether your respondents belong to a particular organization and, if so, how often they attend meetings, whether they have held office in the organization, and so forth. Or, you might want to ask whether respondents have heard anything about a certain political issue and then learn the attitudes of those who have heard of it.

Each subsequent question in series such as these is called a contingency question: Whether it is to be asked and answered is contingent on the responses to the first question in the series. The proper use of contingency questions can facilitate the respondents’ task in completing the questionnaire, because they are not faced with trying to answer questions irrelevant to them.

contingency question  A survey question intended for only some respondents, determined by their responses to some other question. For example, all respondents might be asked whether they belong to the Cosa Nostra, and only those who said yes would be asked how often they go to company meetings and picnics. The latter would be a contingency question.

FIGURE 9-2 Contingency Question Format. Contingency questions offer a structure for exploring subject areas logically in some depth.

There are several formats for contingency questions. The one shown in Figure 9-2 is probably the clearest and most effective. Note two key elements in this format. First, the contingency question is isolated from the other questions by being set off to the side and enclosed in a box. Second, an arrow connects the contingency question to the answer on which it is contingent. In the illustration, only those respondents answering yes are expected to answer the contingency question. The rest of the respondents should simply skip it.

Note that the questions shown in Figure 9-2 could have been dealt with in a single question. The question might have read, “How many times, if any, have you smoked marijuana?” The response categories, then, might have read: “Never,” “Once,” “2 to 5 times,” and so forth. This single question would apply to all respondents, and each would find an appropriate answer category. Such a question, however, might put some pressure on
respondents to report having smoked marijuana, because the main question asks how many times they have smoked it, even though it allows for those exceptional cases who have never smoked marijuana even once. (The emphases used in the previous sentence give a fair indication of how respondents might read the question.) The contingency question format illustrated in Figure 9-2 should reduce the subtle pressure on respondents to report having smoked marijuana.

Used properly, even rather complex sets of contingency questions can be constructed without confusing the respondent. Figure 9-3 illustrates a more complicated example.

FIGURE 9-3 Contingency Table. Sometimes it will be appropriate for certain kinds of respondents to skip over inapplicable questions. To avoid confusion, you should be sure to provide clear instructions to that end.

Sometimes a set of contingency questions is long enough to extend over several pages. Suppose you’re studying political activities of college students, and you wish to ask a large number of questions of those students who have voted in a national, state, or local election. You could separate out the relevant respondents with an initial question such as “Have you ever voted in a national, state, or local election?” but it would be confusing to place the contingency questions in a box stretching over several pages. It would make more sense to enter instructions, in parentheses after each answer, telling respondents to answer or skip the contingency questions. Figure 9-4 provides an illustration of this method.

FIGURE 9-4 Instructions to Skip

In addition to these instructions, it’s worthwhile to place an instruction at the top of each page containing only the contingency questions. For example, you might say, “This page is only for respondents who have voted in a national, state, or local election.” Clear instructions such as these spare respondents the frustration of reading and puzzling over questions irrelevant to them and increase the likelihood of responses from those for whom the questions are relevant.

Matrix Questions

Quite often, you’ll want to ask several questions that have the same set of answer categories. This is typically the case whenever the Likert response categories are used. In such cases, it is often possible to construct a matrix of items and answers as illustrated in Figure 9-5.

This format offers several advantages over other formats. First, it uses space efficiently. Second, respondents will probably find it faster to complete a set of questions presented in this fashion than in other ways. In addition, this format may increase the comparability of responses given to different questions for the respondent as well as for the researcher. Because respondents can quickly review their answers to earlier items in the set, they might choose between, say, “strongly agree” and “agree” on a given statement by comparing the strength of their agreement with their earlier responses in the set.

There are some dangers inherent in using this format, however. Its advantages may encourage you to structure an item so that the responses fit
into the matrix format when a different, more idiosyncratic set of responses might be more appropriate. Also, the matrix question format can foster a response-set among some respondents: They may develop a pattern of, say, agreeing with all the statements. This would be especially likely if the set of statements began with several that indicated a particular orientation (for example, a liberal political perspective) with only a few later ones representing the opposite orientation. Respondents might assume that all the statements represented the same orientation and, reading quickly, misread some of them, thereby giving the wrong answers. This problem can be reduced somewhat by alternating statements representing different orientations and by making all statements short and clear.

FIGURE 9-5 Matrix Question Format. Matrix questions offer an efficient format for presenting a set of closed-ended questionnaire items that have the same response categories.

Ordering Items in a Questionnaire

The order in which questionnaire items are presented can also affect responses. First, the appearance of one question can affect the answers given to later ones. For example, if several questions have been asked about the dangers of terrorism to the United States and then a question asks respondents to volunteer (open-endedly) what they believe to represent dangers to the United States, terrorism will receive more citations than would otherwise be the case. In this situation, it’s preferable to ask the open-ended question first.

Similarly, if respondents are asked to assess their overall religiosity (“How important is your religion to you in general?”), their responses to later questions concerning specific aspects of religiosity will be aimed at consistency with the prior assessment. The converse is true as well. If respondents are first asked specific questions about different aspects of their religiosity, their subsequent overall assessment will reflect the earlier answers. The order of responses within a question can also make a difference (Bishop and Smith 2001).

The impact of item order is not uniform. When J. Edwin Benton and John Daly (1991) conducted a local government survey, they found that the less-educated respondents were more influenced by the order of questionnaire items than those with more education were.

Some researchers attempt to overcome this effect by randomizing the order of items. This effort is usually futile. In the first place, a randomized set of items will probably strike respondents as chaotic and worthless. The random order also makes it more difficult for respondents to answer, because they must continually switch their attention from one topic to another. Finally, even a randomized ordering of items will have the effect discussed previously, except that you’ll have no control over the effect.

The safest solution is sensitivity to the problem. Although you cannot avoid the effect of item order,
try to estimate what that effect will be so that you can interpret results meaningfully. If the order of items seems especially important in a given study, you might construct more than one version of the questionnaire with different orderings of the items. You will then be able to determine the effects by comparing responses to the various versions. At the very least, you should pretest your questionnaire in the different forms. (We’ll discuss pretesting in a moment.)

The desired ordering of items differs between interviews and self-administered questionnaires. In the latter, it’s usually best to begin the questionnaire with the most interesting set of items. The potential respondents who glance casually over the first few items should want to answer them. Perhaps the items will ask for attitudes they’re aching to express. At the same time, however, the initial items should not be threatening. (It might be a bad idea to begin with items about sexual behavior or drug use.) Requests for duller, demographic data (age, gender, and the like) should generally be placed at the end of a self-administered questionnaire. Placing these items at the beginning, as many inexperienced researchers are tempted to do, gives the questionnaire the initial appearance of a routine form, and the person receiving it may not be motivated to complete it.

Just the opposite is generally true for interview surveys. When the potential respondent’s door first opens, the interviewer must gain rapport quickly. After a short introduction to the study, the interviewer can best begin by enumerating the members of the household, getting demographic data about each. Such items are easily answered and generally nonthreatening. Once the initial rapport has been established, the interviewer can then move into the area of attitudes and more sensitive matters. An interview that began with the question “Do you believe in witchcraft?” would probably end rather quickly.

Questionnaire Instructions

Every questionnaire, whether it is to be completed by respondents or administered by interviewers, should contain clear instructions and introductory comments where appropriate.

It’s useful to begin every self-administered questionnaire with basic instructions for completing it. Although many people these days have experience with forms and questionnaires, begin by telling them exactly what you want: that they are to indicate their answers to certain questions by placing a check mark or an X in the box beside the appropriate answer or by writing in their answer when asked to do so. If many open-ended questions are used, respondents should be given some guidelines about whether brief or lengthy answers are expected. If you wish to encourage your respondents to elaborate on their responses to closed-ended questions, that should be noted.

If a questionnaire has subsections (political attitudes, religious attitudes, background data), introduce each with a short statement concerning its content and purpose. For example, “In this section, we would like to know what people consider to be the most important community problems.” Demographic items at the end of a self-administered questionnaire might be introduced thus: “Finally, we would like to know just a little about you so we can see how different types of people feel about the issues we have been examining.”

Short introductions such as these help the respondent make sense of the questionnaire. They make the questionnaire seem less chaotic, especially when it taps a variety of data. And they help put the respondent in the proper frame of mind for answering the questions.

Some questions may require special instructions to facilitate proper answering. This is especially true if a given question varies from the general instructions pertaining to the whole questionnaire. Some specific examples will illustrate this situation.

Despite attempts to provide mutually exclusive answers in closed-ended questions, often more than one answer will apply for respondents. If you want a single answer, you should make this perfectly clear in the question. An example would be “From the list below, please check the primary reason for your decision to attend college.” Often the main question can be followed by a parenthetical note: “Please check the one best answer.” If, on the other hand, you want the respondent to check as many answers as apply, you should make this clear.
When the respondent is supposed to rank-order a set of answer categories, the instructions should indicate this, and a different type of answer format should be used (for example, blanks instead of boxes). These instructions should indicate how many answers are to be ranked (for example: all; only the first and second; only the first and last; the most important and least important). These instructions should also spell out the order of ranking (for example: “Place a 1 beside the most important item, a 2 beside the next most important, and so forth”). Rank-ordering of responses is often difficult for respondents, however, because they may have to read and reread the list several times, so this technique should be used only in those situations where no other method will produce the desired result.

In multiple-part matrix questions, giving special instructions is useful unless the same format is used throughout the questionnaire. Sometimes respondents will be expected to check one answer in each column of the matrix; in other questionnaires they’ll be expected to check one answer in each row. Whenever the questionnaire contains both formats, it’s useful to add an instruction clarifying which is expected in each case.

Pretesting the Questionnaire

No matter how carefully researchers design a data-collection instrument such as a questionnaire, there is always the possibility, indeed the certainty, of error. They will always make some mistake: an ambiguous question, one that people cannot answer, or some other violation of the rules just discussed.

The surest protection against such errors is to pretest the questionnaire in full or in part. Give the questionnaire to the ten people in your bowling league, for example. It’s not usually essential that the pretest subjects comprise a representative sample, although you should use people for whom the questionnaire is at least relevant.

By and large, it’s better to ask people to complete the questionnaire than to read through it looking for errors. All too often, a question seems to make sense on a first reading, but it proves to be impossible to answer.

Stanley Presser and Johnny Blair (1994) describe several different pretesting strategies and report on the effectiveness of each. They also provide data on the cost of the various methods. Paul Beatty and Gordon Willis (2007) offer a useful review of “cognitive interviewing.” In this technique, the pretest includes gathering respondents’ comments about the questionnaire itself, so that the researchers can see which questions are communicating effectively and collecting the information sought.

There are many more tips and guidelines for questionnaire construction, but covering them all would take a book in itself. For now I’ll complete this discussion with an illustration of a real questionnaire, showing how some of these comments find substance in practice.

Before turning to the illustration, however, I want to mention a critical aspect of questionnaire design: precoding. Because the information collected by questionnaires is typically transformed into some type of computer format, it’s usually appropriate to include data-processing instructions on the questionnaire itself. These instructions indicate where specific pieces of information will be stored in the machine-readable data files. Notice that the following illustration has been precoded with the mysterious numbers that appear near questions and answer categories.

A Composite Illustration

Figure 9-6 is part of a questionnaire used by the University of Chicago’s National Opinion Research Center in its General Social Survey. The questionnaire dealt with people’s attitudes toward the government and was designed to be self-administered, though most of the GSS is conducted in face-to-face interviews.

Self-Administered Questionnaires

So far we’ve discussed how to formulate questions and how to design effective questionnaires. As important as these tasks are, the labor will be wasted

(Text continues after Figure 9-6.)
10. Here are some things the government might do for the economy. Circle one number for each action to show whether you are in favor of it or against it.

    1. Strongly in favor of
    2. In favor of
    3. Neither in favor of nor against
    4. Against
    5. Strongly against

                                                            PLEASE CIRCLE A NUMBER
    a. Control of wages by legislation ................................. 1 2 3 4 5   28/
    b. Control of prices by legislation ................................ 1 2 3 4 5   29/
    c. Cuts in government spending ..................................... 1 2 3 4 5   30/
    d. Government financing of projects to create new jobs ............. 1 2 3 4 5   31/
    e. Less government regulation of business .......................... 1 2 3 4 5   32/
    f. Support for industry to develop new products and technology ..... 1 2 3 4 5   33/
    g. Supporting declining industries to protect jobs ................. 1 2 3 4 5   34/
    h. Reducing the work week to create more jobs ...................... 1 2 3 4 5   35/

11. Listed below are various areas of government spending. Please indicate whether you would like to see more or less government spending in each area. Remember that if you say “much more,” it might require a tax increase to pay for it.

    1. Spend much more
    2. Spend more
    3. Spend the same as now
    4. Spend less
    5. Spend much less
    8. Can’t choose

                                          PLEASE CIRCLE A NUMBER
    a. The environment ......................... 1 2 3 4 5 8   36/
    b. Health .................................. 1 2 3 4 5 8   37/
    c. The police and law enforcement .......... 1 2 3 4 5 8   38/
    d. Education ............................... 1 2 3 4 5 8   39/
    e. The military and defense ................ 1 2 3 4 5 8   40/
    f. Retirement benefits ..................... 1 2 3 4 5 8   41/
    g. Unemployment benefits ................... 1 2 3 4 5 8   42/
    h. Culture and the arts .................... 1 2 3 4 5 8   43/

12. If the government had to choose between keeping down inflation or keeping down unemployment, to which do you think it should give highest priority?   44/

    Keeping down inflation ................. 1
    Keeping down unemployment .............. 2
    Can’t choose ........................... 8

13. Do you think that labor unions in this country have too much power or too little power?   45/

    Far too much power ..................... 1
    Too much power ......................... 2
    About the right amount of power ........ 3
    Too little power ....................... 4
    Far too little power ................... 5
    Can’t choose ........................... 8

FIGURE 9-6 A Sample Questionnaire.
This questionnaire excerpt is from the General Social Survey, a major source of data for analysis by social researchers around the world.
14. How about business and industry, do they have too much power or too little power?   46/

    Far too much power ..................... 1
    Too much power ......................... 2
    About the right amount of power ........ 3
    Too little power ....................... 4
    Far too little power ................... 5
    Can’t choose ........................... 8

15. And what about the federal government, does it have too much power or too little power?   47/

    Far too much power ..................... 1
    Too much power ......................... 2
    About the right amount of power ........ 3
    Too little power ....................... 4
    Far too little power ................... 5
    Can’t choose ........................... 8

16. In general, how good would you say labor unions are for the country as a whole?   48/

    Excellent .............................. 1
    Very good .............................. 2
    Fairly good ............................ 3
    Not very good .......................... 4
    Not good at all ........................ 5
    Can’t choose ........................... 8

17. What do you think the government’s role in each of these industries should be?

    1. Own it
    2. Control prices and profits but not own it
    3. Neither own it nor control its prices and profits
    8. Can’t choose

                                   PLEASE CIRCLE A NUMBER
    a. Electric power ...................... 1 2 3 8   49/
    b. The steel industry .................. 1 2 3 8   50/
    c. Banking and insurance ............... 1 2 3 8   51/

18. On the whole, do you think it should or should not be the government’s responsibility to . . .

    1. Definitely should be
    2. Probably should be
    3. Probably should not be
    4. Definitely should not be
    8. Can’t choose

                                                   PLEASE CIRCLE A NUMBER
    a. Provide a job for everyone who wants one ............ 1 2 3 4 8   52/
    b. Keep prices under control ........................... 1 2 3 4 8   53/
    c. Provide health care for the sick .................... 1 2 3 4 8   54/
    d. Provide a decent standard of living for the old ..... 1 2 3 4 8   55/

FIGURE 9-6 (Continued)
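The precoding numbers printed beside the items in Figure 9-6 can be read as a record layout. The Python sketch below assumes that a code such as 49/ names column 49 of a fixed-width data record and that each circled answer occupies a single column. The 80-column record width and the names `COLUMN_FOR_ITEM` and `encode_record` are assumptions made for illustration; only the item-to-column pairs for question 17 are taken from the figure.

```python
# Column assignments for question 17, as printed in Figure 9-6.
COLUMN_FOR_ITEM = {"17a": 49, "17b": 50, "17c": 51}


def encode_record(answers, width=80):
    """Place each circled code in its precoded column (hypothetical helper).

    answers: dict mapping item label to the circled number (0-9).
    width: assumed length of the fixed-width record.
    """
    record = [" "] * width
    for item, code in answers.items():
        if not 0 <= code <= 9:
            raise ValueError(f"{item}: code must fit in one column")
        column = COLUMN_FOR_ITEM[item]   # columns are numbered from 1
        record[column - 1] = str(code)   # Python lists are indexed from 0
    return "".join(record)


# One respondent: "control prices" for electric power, "neither" for steel,
# and "can't choose" for banking and insurance.
record = encode_record({"17a": 2, "17b": 3, "17c": 8})
print(record[48:51])  # 238
```

Printing the column number on the questionnaire itself means the person entering the data never has to consult a separate layout sheet.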
270 ■ Chapter 9: Survey Research

unless the questionnaire produces useful data, which means that respondents actually complete the questionnaire. We turn now to the major methods for getting responses to questionnaires.

I’ve referred several times in this chapter to interviews and self-administered questionnaires. Actually, there are three main methods of administering survey questionnaires to a sample of respondents: self-administered questionnaires, in which respondents are asked to complete the questionnaire themselves; surveys administered by interviewers in face-to-face encounters; and surveys conducted by telephone. This section and the next two discuss each of these methods in turn. A fourth section addresses online surveys, a new technique growing in popularity.

The most common form of self-administered questionnaire is the mail survey. However, there are several other techniques that are often used as well. At times, it may be appropriate to administer a questionnaire to a group of respondents gathered at the same place at the same time. For example, a survey of students taking introductory psychology might be conducted during class. High school students might be surveyed during homeroom period.

Some recent experimentation has been conducted with regard to the home delivery of questionnaires. A research worker delivers the questionnaire to the home of sample respondents and explains the study. Then the questionnaire is left for the respondent to complete, and the researcher picks it up later.

Home delivery and the mail can also be used in combination. Questionnaires are mailed to families, and then research workers visit homes to pick up the questionnaires and check them for completeness. Just the opposite technique is to have questionnaires hand-delivered by research workers with a request that the respondents mail the completed questionnaires to the research office.

On the whole, when a research worker either delivers the questionnaire, picks it up, or both, the completion rate seems higher than it is for straightforward mail surveys. Additional experimentation with this technique is likely to point to other ways to improve completion rates while reducing costs. The remainder of this section, however, is devoted specifically to the mail survey, which is still the typical form of self-administered questionnaire.

Mail Distribution and Return

The basic method for collecting data through the mail has been to send a questionnaire accompanied by a letter of explanation and a self-addressed, stamped envelope for returning the questionnaire. The respondent is expected to complete the questionnaire, put it in the envelope, and return it. If, by any chance, you’ve received such a questionnaire and failed to return it, it would be valuable to recall the reasons you had for not returning it and keep them in mind any time you plan to send questionnaires to others.

A common reason for not returning questionnaires is that it’s too much trouble. To overcome this problem, researchers have developed several ways to make returning them easier. For instance, a self-mailing questionnaire requires no return envelope: When the questionnaire is folded a particular way, the return address appears on the outside. The respondent therefore doesn’t have to worry about losing the envelope.

More-elaborate designs are available also. The university student questionnaire to be described later in this chapter was bound in a booklet with a special, two-panel back cover. Once the questionnaire was completed, the respondent needed only to fold out the extra panel, wrap it around the booklet, and seal the whole thing with the adhesive strip running along the edge of the panel. The foldout panel contained my return address and postage. When I repeated the study a couple of years later, I improved on the design. Both the front and back covers had foldout panels: one for sending the questionnaire out and the other for getting it back, thus avoiding the use of envelopes altogether.

The point here is that anything you can do to make the job of completing and returning the questionnaire easier will improve your study. Imagine receiving a questionnaire that made no provisions for its return to the researcher. Suppose you had to (1) find an envelope, (2) write the address on it, (3) figure out how much postage it required,
Self-Administered Questionnaires ■ 271

and (4) put the stamps on it. How likely is it that you would return the questionnaire?

A few brief comments on postal options are in order. You have options for mailing questionnaires out and for getting them returned. On outgoing mail, your choices are essentially between first-class postage and bulk rate. First class is more certain, but bulk rate is far cheaper. (Check your local post office for rates and procedures.) On return mail, your choice is between postage stamps and business-reply permits. Here, the cost differential is more complicated. If you use stamps, you pay for them whether people return their questionnaires or not. With the business-reply permit, you pay for only those that are used, but you pay an additional surcharge of about a nickel. This means that stamps are cheaper if a lot of questionnaires are returned, but business-reply permits are cheaper if fewer are returned (and you won’t know in advance how many will be returned).

There are many other considerations involved in choosing among the several postal options. Some researchers, for example, feel that using postage stamps communicates more “humanness” and sincerity than using bulk rate and business-reply permits does. Others worry that respondents will steam off the stamps and use them for some purpose other than returning the questionnaires. Because both bulk rate and business-reply permits require establishing accounts at the post office, you’ll probably find stamps much easier for small surveys.

Monitoring Returns

The mailing of questionnaires sets up a new research question that may prove valuable to a study. Researchers shouldn’t sit back idly as questionnaires are returned; instead, they should undertake a careful recording of the varying rates of return among respondents.

An invaluable tool in this activity is a return rate graph. The day on which questionnaires were mailed is labeled Day 1 on the graph, and every day thereafter the number of returned questionnaires is logged on the graph. It’s usually best to compile two graphs. One shows the number returned each day, rising over time and then dropping. The second reports the cumulative number or percentage. In part, this activity provides the researchers with gratification, as they get to draw a picture of their successful data collection. More important, however, it serves as their guide to how the data collection is going. If follow-up mailings are planned, the graph provides a clue about when such mailings should be launched. (The dates of subsequent mailings should be noted on the graph.)

As completed questionnaires are returned, each should be opened, scanned, and assigned an identification (ID) number. These numbers should be assigned serially as the questionnaires are returned, even if other identification numbers have already been assigned. Two examples should illustrate the important advantages of this procedure.

Let’s assume you’re studying attitudes toward a political figure. In the middle of the data collection, the media break the story that the politician is having extramarital affairs. By knowing the date of that public disclosure and the dates when questionnaires were received, you’ll be in a position to determine the effects of the disclosure. (Recall from Chapter 8 the discussion of history in connection with experiments.)

In a less sensational way, serialized ID numbers can be valuable in estimating non-response biases in the survey. Barring more-direct tests of bias, you may wish to assume that those who failed to answer the questionnaire will be more like respondents who delayed answering than like those who answered right away. An analysis of questionnaires received at different points in the data collection might then be used for estimates of sampling bias. For example, if the grade point averages (GPAs) reported by student respondents decrease steadily through the data collection, with those replying right away having higher GPAs and those replying later having lower GPAs, you might tentatively conclude that those who failed to answer at all have lower GPAs yet. Although it would not be advisable to make statistical estimates of bias in this fashion, you could take advantage of approximate estimates based on the patterns you’ve observed.

If respondents have been identified for purposes of follow-up mailing, then preparations
for those mailings should be made as the questionnaires are returned. The case study later in this section discusses this process in greater detail.

Follow-up Mailings

Follow-up mailings may be administered in several ways. In the simplest, nonrespondents are simply sent a letter of additional encouragement to participate. A better method, however, is to send a new copy of the survey questionnaire with the follow-up letter. If potential respondents have not returned their questionnaires after two or three weeks, the questionnaires have probably been lost or misplaced. Receiving a follow-up letter might encourage them to look for the original questionnaire, but if they can’t find it easily, the letter may go for naught.

The methodological literature strongly suggests that follow-up mailings provide an effective method for increasing return rates in mail surveys. In general, the longer a potential respondent delays replying, the less likely he or she is to do so at all. Properly timed follow-up mailings, then, provide additional stimuli to respond.

The effects of follow-up mailings will be seen in the response rate curves recorded during data collection. The initial mailings will be followed by a rise and subsequent subsiding of returns; the follow-up mailings will spur a resurgence of returns; and more follow-ups will do the same. In practice, three mailings (an original and two follow-ups) seem the most efficient.

The timing of follow-up mailings is also important. Here the methodological literature offers less-precise guides, but I’ve found that two or three weeks is a reasonable space between mailings. (This period might be increased by a few days if the mailing time, out and in, is more than two or three days.)

If the individuals in the survey sample are not identified on the questionnaires, it may not be possible to remail only to nonrespondents. In such a case, send your follow-up mailing to all members of the sample, thanking those who may have already participated and encouraging those who have not to do so. (The case study reported later describes yet another method you can use in an anonymous mail survey.)

Response Rates

A question that new survey researchers frequently ask concerns the percentage return rate, or the response rate, that should be achieved in a survey. The body of inferential statistics used in connection with survey analysis assumes that all members of the initial sample complete the survey. Because this almost never happens, non-response bias becomes a concern, with the researcher testing (and hoping) for the possibility that the respondents look essentially like a random sample of the initial sample, and thus a somewhat smaller random sample of the total population.

response rate: The number of people participating in a survey divided by the number selected in the sample, in the form of a percentage. This is also called the completion rate or, in self-administered surveys, the return rate: the percentage of questionnaires sent out that are returned.

Nevertheless, overall response rate is one guide to the representativeness of the sample respondents. If a high response rate is achieved, there is less chance of significant non-response bias than with a low rate. Conversely, a low response rate is a danger signal, because the nonrespondents are likely to differ from the respondents in ways other than just their willingness to participate in the survey. Richard Bolstein (1991), for example, found that those who did not respond to a preelection political poll were less likely to vote than those who did participate. Estimating the turnout rate from just the survey respondents, then, would have overestimated the number who would show up at the polls. Ironically, of course, since the nonrespondents were unlikely to vote, the preferences of the survey participants might offer a good estimate of the election results.

As you can imagine, one of the more persistent discussions among survey researchers concerns
ways of increasing response rates. You’ll recall that this was a chief concern in the earlier discussion of options for mailing out and receiving questionnaires. Survey researchers have developed many ingenious techniques addressing this problem. Some have experimented with novel formats. Others have tried paying respondents to participate. The problem with paying, of course, is that it’s expensive to make meaningfully high payment to hundreds or thousands of respondents, but some imaginative alternatives have been used. Some researchers have said, “We want to get your two-cents’ worth on some issues, and we’re willing to pay,” enclosing two pennies. Another enclosed a quarter, suggesting that the respondent make some little child happy. Still others have enclosed paper money. Similarly, Michael Davern and his colleagues (2003) found that financial incentives also increased completion rates in face-to-face interview surveys (discussed in the next section).

Don Dillman (2007) has spent decades painstakingly assessing the various techniques that survey researchers have used to increase return rates on mail surveys, and he evaluates the impact of each. More important, Dillman stresses the necessity of paying attention to all aspects of the study (what he calls the “Tailored Design Method”) rather than one or two special gimmicks.

Having said all this, there is no absolutely acceptable level of response to a mail survey, except for 100 percent. While it is possible to achieve response rates of 70 percent or more, most mail surveys probably fall below that level. Thus, it’s important to test for non-response bias wherever possible.

A Case Study

The steps involved in the administration of a mail survey are many and can best be appreciated in a walk-through of an actual study. Accordingly, this section concludes with a detailed description of how the student survey we discussed in Chapter 7 as an illustration of systematic sampling was administered. This study did not represent the theoretical ideal for such studies, but in that regard it serves our present purposes all the better. The study was conducted by the students in my graduate seminar in survey research methods.

As you may recall, 1,100 students were selected from the university registration database through a stratified, systematic sampling procedure. For each student selected, six self-adhesive mailing labels were printed by the computer.

By the time we were ready to distribute the questionnaires, it became apparent that our meager research funds wouldn’t cover several mailings to the entire sample of 1,100 students (questionnaire printing costs were higher than anticipated). As a result, we chose a systematic two-thirds sample of the mailing labels, yielding a subsample of 733 students.

Earlier, we had decided to keep the survey anonymous in the hope of encouraging more-candid responses to some sensitive questions. (Later surveys of the same issues among the same population indicated this anonymity was unnecessary.) Thus, the questionnaires would carry no identification of students on them. At the same time, we hoped to reduce the follow-up mailing costs by mailing only to nonrespondents.

To achieve both of these aims, a special postcard method was devised. Each student was mailed a questionnaire that carried no identifying marks, plus a postcard addressed to the research office, with one of the student’s mailing labels affixed to the reverse side of the card. The introductory letter asked the student to complete and return the questionnaire (assuring anonymity) and to return the postcard simultaneously. Receiving the postcard would tell us, without indicating which questionnaire it was, that the student had returned his or her questionnaire. This procedure would then facilitate follow-up mailings.

The 32-page questionnaire was printed in booklet form. The three-panel cover described earlier in this chapter permitted the questionnaire to be returned without an additional envelope.

A letter introducing the study and its purposes was printed on the front cover of the booklet. It explained why the study was being conducted (to learn how students feel about a variety of issues), how students had been selected for the study, the
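The bookkeeping behind the postcard method described above can be sketched in a few lines of Python. This is only an illustration under assumed names: `follow_up_list`, `postcard_return_rate`, `sample_ids`, and `returned_postcards` are hypothetical and not part of the original study, and the numeric labels are made up. The only figure taken from the text is the subsample size of 733. Only the postcards are matched against the mailing list; the questionnaires are never linked to a name.

```python
def follow_up_list(sample_ids, returned_postcards):
    """Return the sample members whose postcards have not come back.

    sample_ids: identifiers taken from the mailing labels (hypothetical).
    returned_postcards: identifiers read off the postcards received so far.
    The questionnaires carry no identifier, so anonymity is preserved.
    """
    returned = set(returned_postcards)
    return [sid for sid in sample_ids if sid not in returned]


def postcard_return_rate(sample_ids, returned_postcards):
    """Percentage of the sample whose postcard has been received."""
    sample = set(sample_ids)
    # Count each student once and ignore cards that match no label.
    returned = set(returned_postcards) & sample
    return 100.0 * len(returned) / len(sample)


# Hypothetical illustration: 733 students labelled 1..733.
sample_ids = list(range(1, 734))
returned_postcards = [2, 5, 5, 700]  # one student's card counted twice by mistake
remaining = follow_up_list(sample_ids, returned_postcards)
rate = postcard_return_rate(sample_ids, returned_postcards)
```

Here `remaining` would serve as the mailing list for the next follow-up, and `rate` is the kind of cumulative percentage that could be logged on a return rate graph.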