

Earl Babbie, The Practice of Social Research

Published by dinakan, 2021-08-12 20:20:06

Description: This e-book is for reading purposes only and is not for profit.


Content Analysis

Coding in Content Analysis

Content analysis is essentially a coding operation. Coding is the process of transforming raw data into a standardized form. In content analysis, communications (oral, written, or other) are coded or classified according to some conceptual framework. Thus, for example, newspaper editorials may be coded as liberal or conservative. Radio broadcasts may be coded as propagandistic or not, novels as romantic or not, paintings as representational or not, and political speeches as containing character assassinations or not. Recall that because terms such as these are subject to many interpretations, the researcher must specify definitions clearly.

Coding in content analysis involves the logic of conceptualization and operationalization, which I discussed in Chapter 5. As in other research methods, you must refine your conceptual framework and develop specific methods for observing in relation to that framework.

Manifest and Latent Content

In the earlier discussions of field research, we found that the researcher faces a fundamental choice between depth and specificity of understanding. Often, this represents a choice between validity and reliability, respectively. Typically, field researchers opt for depth, preferring to base their judgments on a broad range of observations and information, even at the risk that another observer might reach a different judgment of the same situation. Survey research, through the use of standardized questionnaires, represents the other extreme: total specificity, even though the specific measures of variables may not be adequately valid reflections of those variables. The content analyst has some choice in this matter, however.

Coding the manifest content (the visible, surface content) of a communication is analogous to using a standardized questionnaire. To determine, for example, how erotic certain novels are, you might simply count the number of times the word love appears in each novel or the average number of appearances per page. Or, you might use a list of words, such as love, kiss, hug, and caress, each of which might serve as an indicator of the erotic nature of the novel. This method would have the advantage of ease and reliability in coding and of letting the reader of the research report know precisely how eroticism was measured. It would have a disadvantage, on the other hand, in terms of validity. Surely the phrase erotic novel conveys a richer and deeper meaning than the number of times the word love is used.

Alternatively, you could code the latent content of the communication: its underlying meaning. In the present example, you might read an entire novel or a sample of paragraphs or pages and make an overall assessment of how erotic the novel was. Although your total assessment might very well be influenced by the appearance of words such as love and kiss, it would not depend fully on their frequency.

Clearly, this second method seems better designed for tapping the underlying meaning of communications, but its advantage comes at a cost to reliability and specificity. Especially if more than one person is coding the novel, somewhat different definitions or standards may be employed. A passage that one coder regards as erotic may not seem erotic to another. Even if you do all of the coding yourself, there is no guarantee that your definitions and standards will remain constant throughout the enterprise. Moreover, the reader of your research report will likely be uncertain about the definitions you've employed.

Wherever possible, the best solution to this dilemma is to use both methods. For example, Carol Auster was interested in changes in the socialization of young women in Girl Scouts. To explore this, she undertook a content analysis of the Girl Scout manuals as revised over time. In particular, Auster was

coding: The process whereby raw data are transformed into standardized form suitable for machine processing and analysis.

manifest content: In connection with content analysis, the concrete terms contained in a communication, as distinguished from latent content.

latent content: As used in connection with content analysis, the underlying meaning of communications, as distinguished from their manifest content.
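The manifest-content approach described on this page (counting indicator words such as love, kiss, hug, and caress) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the sample sentence, and the choice to report a rate per 1,000 words (rather than per page, as the text suggests) are assumptions made for the example, not part of the original discussion.

```python
import re
from collections import Counter

# Indicator words mentioned in the text; any real study would need to
# define and justify its own list.
INDICATOR_WORDS = {"love", "kiss", "hug", "caress"}


def code_manifest_content(text, indicators=INDICATOR_WORDS):
    """Count indicator words in a text and record the base (total words)."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Only exact matches are counted: "loved" or "kisses" would be missed.
    counts = Counter(w for w in words if w in indicators)
    total_hits = sum(counts.values())
    return {
        "total_words": len(words),
        "indicator_counts": dict(counts),
        "indicator_total": total_hits,
        # A rate makes long and short texts comparable.
        "per_1000_words": 1000 * total_hits / len(words) if words else 0.0,
    }


# Hypothetical sample text, invented for this example.
sample = "Love is patient. They kiss, they hug, and they love again."
result = code_manifest_content(sample)
print(result)
```

A count like this is easy to reproduce, which is the reliability advantage the text describes. It says nothing about underlying meaning, which is the validity cost: the code cannot tell an erotic passage from a greeting card.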

Chapter 11: Unobtrusive Research

interested in the view that women should be limited to homemaking. Her analysis of the manifest content suggested a change: "I found that while 23% of the badges in 1913 centered on home life, this was true of only 13% of the badges in 1963 and 7% of the badges in 1980" (1985: 361).

An analysis of the latent content also pointed to an emancipation of Girl Scouts, similar to that occurring in US society at large. The change of uniform was one indicator: "The shift from skirts to pants may reflect an acknowledgement of the more physically active role of women as well as the variety of physical images available to modern women" (Auster 1985: 362). Supporting evidence was found in the appearance of badges such as "Science Sleuth," "Aerospace," and "Ms. Fix-It."

Conceptualization and the Creation of Code Categories

For all research methods, conceptualization and operationalization typically involve the interaction of theoretical concerns and empirical observations. If, for example, you believe some newspaper editorials to be liberal and others to be conservative, ask yourself why you think so. Read some editorials, asking yourself which ones are liberal and which ones are conservative. Was the political orientation of a particular editorial most clearly indicated by its manifest content or by its tone? Was your decision based on the use of certain terms (for example, leftist, fascist, and so on) or on the support or opposition given to a particular issue or political personality?

Both inductive and deductive methods should be used in this activity. If you're testing theoretical propositions, your theories should suggest empirical indicators of concepts. If you begin with specific empirical observations, you should attempt to derive general principles relating to them and then apply those principles to the other empirical observations.

Bruce Berg (1989: 111) places code development in the context of grounded theory and likens it to solving a puzzle:

    Coding and other fundamental procedures associated with grounded theory development are certainly hard work and must be taken seriously, but just as many people enjoy finishing a complicated jigsaw puzzle, many researchers find great satisfaction in coding and analysis. As researchers ... begin to see the puzzle pieces come together to form a more complete picture, the process can be downright thrilling.

Throughout this activity, remember that the operational definition of any variable is composed of the attributes included in it. Such attributes, moreover, should be mutually exclusive and exhaustive. A newspaper editorial, for example, should not be described as both liberal and conservative, though you should probably allow for some to be middle-of-the-road. It may be sufficient for your purposes to code novels as erotic or nonerotic, but you may also want to consider that some could be anti-erotic. Paintings might be classified as representational or not, if that satisfied your research purpose, or you might wish to classify them as impressionistic, abstract, allegorical, and so forth.

Realize further that different levels of measurement may be used in content analysis. You might, for example, use the nominal categories of liberal and conservative for characterizing newspaper editorials, or you might wish to use a more refined ordinal ranking, ranging from extremely liberal to extremely conservative. Bear in mind, however, that the level of measurement implicit in your coding methods (nominal, ordinal, interval, or ratio) does not necessarily reflect the nature of your variables. If the word love appeared 100 times in Novel A and 50 times in Novel B, you would be justified in saying that the word love appeared twice as often in Novel A, but not that Novel A was twice as erotic as Novel B. Similarly, agreeing with twice as many anti-Semitic statements in a questionnaire as someone else does not necessarily make one twice as anti-Semitic as that other person.

Counting and Record Keeping

If you plan to evaluate your content analysis data quantitatively, your coding operation must be amenable to data processing. This means, first, that the end product of your coding must be numerical. If you're counting the frequency of certain words,

phrases, or other manifest content, the coding is necessarily numerical. But even if you're coding latent content on the basis of overall judgments, it will be necessary to represent your coding decision numerically: 1 = very liberal, 2 = moderately liberal, 3 = moderately conservative, and so on.

Second, your record keeping must clearly distinguish between units of analysis and units of observation, especially if these two are different. The initial coding, of course, must relate to the units of observation. If novelists are the units of analysis, for example, and you wish to characterize them through a content analysis of their novels, your primary records will represent novels as the units of observation. You may then combine your scoring of individual novels to characterize each novelist, the unit of analysis.

Third, while you're counting, it will normally be important to record the base from which the counting is done. It would probably be useless to know the number of realistic paintings produced by a given painter without knowing the number he or she has painted all together; the painter would be regarded as realistic if a high percentage of paintings were of that genre. Similarly, it would tell us little that the word love appeared 87 times in a novel if we did not know about how many words there were in the entire novel. The issue of observational base is most easily resolved if every observation is coded in terms of one of the attributes making up a variable. Rather than simply counting the number of liberal editorials in a given collection, for example, code each editorial by its political orientation, even if it must be coded "no apparent orientation."

Let's suppose we want to describe and explain the editorial policies of different newspapers. Figure 11-3 presents part of a tally sheet that might result from the coding of newspaper editorials. Note that newspapers are the units of analysis. Each newspaper has been assigned an identification number to facilitate mechanized processing. The second column has a space for the number of editorials coded for each newspaper. This will be an important piece of information, because we want to be able to say, for example, "Of all the editorials, 22 percent were pro-United Nations," not just "There were eight pro-United Nations editorials."

FIGURE 11-3 Sample Tally Sheet (Partial)

Subjective evaluation scale: 1 = Very liberal; 2 = Moderately liberal; 3 = Middle-of-road; 4 = Moderately conservative; 5 = Very conservative

Newspaper ID | Number of editorials evaluated | Subjective evaluation | Number of "isolationist" editorials | Number of "pro-United Nations" editorials | Number of "anti-United Nations" editorials
001 | 37 | 2 | 0 | 8 | 0
002 | 26 | 5 | 10 | 0 | 6
003 | 44 | 4 | 2 | 1 | 2
004 | 22 | 3 | 1 | 2 | 3
005 | 30 | 1 | 0 | 6 | 0
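The point about recording the base can be sketched as a small Python calculation. This is a hedged illustration only: the field names and the helper function are hypothetical, and the rows are my best reading of the partial tally sheet in Figure 11-3, so treat the numbers as illustrative rather than authoritative.

```python
# Rows as best read from Figure 11-3 (partial tally sheet); illustrative only.
# Field names are hypothetical and chosen for this example.
tally = [
    {"newspaper_id": "001", "editorials": 37, "pro_un": 8, "anti_un": 0},
    {"newspaper_id": "002", "editorials": 26, "pro_un": 0, "anti_un": 6},
    {"newspaper_id": "003", "editorials": 44, "pro_un": 1, "anti_un": 2},
    {"newspaper_id": "004", "editorials": 22, "pro_un": 2, "anti_un": 3},
    {"newspaper_id": "005", "editorials": 30, "pro_un": 6, "anti_un": 0},
]


def percent_of_base(count, base):
    """Express a raw count as a percentage of the recorded base."""
    if base == 0:
        raise ValueError("no editorials were coded, so no percentage exists")
    return 100 * count / base


# Newspapers are the units of analysis, so one percentage per newspaper.
pro_un_percent = {
    row["newspaper_id"]: percent_of_base(row["pro_un"], row["editorials"])
    for row in tally
}

for newspaper_id, pct in pro_un_percent.items():
    print(f"Newspaper {newspaper_id}: {pct:.0f} percent pro-United Nations")
```

For newspaper 001, eight pro-United Nations editorials out of 37 come to about 22 percent, which is the statement the text says a researcher wants to be able to make. Without the second column, only the raw count of eight would be available.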

One column in Figure 11-3 is for assigning a subjective overall assessment of the newspapers' editorial policies. (Such assignments might later be compared with the several objective measures.) Other columns provide space for recording numbers of editorials reflecting specific editorial positions. In a real content analysis, there would be spaces for recording other editorial positions plus noneditorial information about each newspaper, such as the region in which it is published, its circulation, and so forth.

Qualitative Data Analysis

Not all content analysis results in counting. Sometimes a qualitative assessment of the materials is most appropriate, as in Carol Auster's examination of changes in Girl Scout uniforms and handbook language.

Bruce Berg (1989: 123-25) discusses "negative case testing" as a technique for qualitative hypothesis testing. First, in the grounded theory tradition, you begin with an examination of the data, which may yield a general hypothesis. Let's say that you're examining the leadership of a new community association by reviewing the minutes of meetings to see who made motions that were subsequently passed. Your initial examination of the data suggests that the wealthier members are the most likely to assume this leadership role.

The second stage in the analysis is to search your data to find all the cases that contradict the initial hypothesis. In this instance, you would look for poorer members who made successful motions and wealthy members who never did. Third, you must review each of the disconfirming cases and either (1) give up the hypothesis or (2) see how it needs to be fine-tuned.

Let's say that in your analysis of disconfirming cases, you notice that each of the unwealthy leaders has a graduate degree, whereas each of the wealthy nonleaders has very little formal education. You may revise your hypothesis to consider both education and wealth as routes to leadership in the association. Perhaps you'll discover some threshold for leadership (a white-collar job, a level of income, and a college degree) beyond which those with the most money, education, or both are the most active leaders.

This process is an example of what Barney Glaser and Anselm Strauss (1967) called analytic induction. It is inductive in that it primarily begins with observations, and it is analytic because it goes beyond description to find patterns and relationships among variables.

There are, of course, dangers in this form of analysis, as in all others. The chief risk is misclassifying observations so as to support an emerging hypothesis. For example, you may erroneously conclude that a nonleader didn't graduate from college or you may decide that the job of factory foreman is "close enough" to being white-collar.

Berg (1989: 124) offers techniques for avoiding these errors:

1. If there are sufficient cases, select some at random from each category in order to avoid merely picking those that best support the hypothesis.

2. Give at least three examples in support of every assertion you make about the data.

3. Have your analytic interpretations carefully reviewed by others uninvolved in the research project to see whether they agree.

4. Report whatever inconsistencies you do discover: any cases that simply do not fit your hypotheses. Realize that few social patterns are 100 percent consistent, so you may have discovered something important even if it doesn't apply to absolutely all of social life. However, you should be honest with your readers in that regard.

An Illustration of Content Analysis

Several studies have indicated that women are stereotyped on television. R. Stephen Craig (1992) took this line of inquiry one step further to examine the portrayal of both men and women during different periods of television programming.

To study gender stereotyping in television commercials, Craig selected a sample of 2,209 network

commercials during several periods between January 6 and 14, 1990.

    The weekday day part (in this sample, Monday-Friday, 2-4 P.M.) consisted exclusively of soap operas and was chosen for its high percentage of women viewers. The weekend day part (two consecutive Saturday and Sunday afternoons during sports telecasts) was selected for its high percentage of men viewers. Evening "prime time" (Monday-Friday, 9-11 P.M.) was chosen as a basis for comparison with past studies and the other day parts.
    (1992: 199)

Each of the commercials was coded in several ways. "Characters" were coded as

All male adults
All female adults
All adults, mixed gender
Male adults with children or teens (no women)
Female adults with children or teens (no men)
Mixture of ages and genders

In addition, Craig's coders noted which character was on the screen longest during the commercial (the "primary visual character") as well as the roles played by the characters (such as spouse, celebrity, parent), the type of product advertised (such as body product, alcohol), the setting (such as kitchen, school, business), and the voice-over narrator.

Table 11-1 indicates the differences in the times when men and women appeared in commercials. Women appeared most during the daytime (with its soap operas), men predominated during the weekend commercials (with its sports programming), and men and women were equally represented during evening prime time.

TABLE 11-1
Percent of Adult Primary Visual Characters by Sex Appearing in Commercials in Three Day Parts

             | Daytime | Evening | Weekend
Adult male   | 40      | 52      | 80
Adult female | 60      | 48      | 20

Source: R. Stephen Craig, "The Effect of Television Day Part on Gender Portrayals in Television Commercials: A Content Analysis," Sex Roles 26, nos. 5/6 (1992): 204.

Craig found other differences in the ways men and women were portrayed.

    Further analysis indicated that male primary characters were proportionately more likely than females to be portrayed as celebrities and professionals in every day part, while women were proportionately more likely to be portrayed as interviewer/demonstrators, parent/spouses, or sex object/models in every day part.... Women were proportionately more likely to appear as sex object/models during the weekend than during the day.
    (1992: 204)

The research also showed that different products were advertised during different time periods. As you might imagine, almost all the daytime commercials dealt with body, food, or home products. These products accounted for only one in three on the weekends. Instead, weekend commercials stressed automotive products (29 percent), business products or services (27 percent), or alcohol (10 percent). There were virtually no alcohol ads during evenings and daytime.

As you might suspect, women were most likely to be portrayed in home settings, men most likely to be shown away from home. Other findings dealt with the different roles played by men and women.

    The women who appeared in weekend ads were almost never portrayed without men and seldom as the commercial's primary character. They were generally seen in roles subservient to men (e.g., hotel receptionist, secretary, or stewardess), or as sex objects or models in which their only function seemed to be to lend an aspect of eroticism to the ad.
    (Craig 1992: 208)

Although some of Craig's findings may seem unsurprising, remember that "common knowledge" does not always correspond with reality. It's always worthwhile to check out widely held assumptions. And even when we think we know about a given situation, it's often useful to know specific details

such as those provided by a content analysis like this one.

Strengths and Weaknesses of Content Analysis

Probably the greatest advantage of content analysis is its economy in terms of both time and money. A college student might undertake a content analysis, whereas undertaking a survey, for example, might not be feasible. There is no requirement for a large research staff; no special equipment is needed. As long as you have access to the material to be coded, you can undertake content analysis.

Content analysis also has the advantage of allowing the correction of errors. If you discover you've botched up a survey or an experiment, you may be forced to repeat the whole research project with all its attendant costs in time and money. If you botch up your field research, it may be impossible to redo the project; the event under study may no longer exist. In content analysis, it's usually easier to repeat a portion of the study than it is in other research methods. You might be required, moreover, to recode only a portion of your data rather than all of it.

A third advantage of content analysis is that it permits the study of processes occurring over a long time. You might focus on the imagery of African Americans conveyed in U.S. novels of 1850 to 1860, for example, or you might examine changing imagery from 1850 to the present.

Finally, content analysis has the advantage of all unobtrusive measures, namely, that the content analyst seldom has any effect on the subject being studied. Because the novels have already been written, the paintings already painted, the speeches already presented, content analyses can have no effect on them.

Content analysis has disadvantages as well. For one thing, it's limited to the examination of recorded communications. Such communications may be oral, written, or graphic, but they must be recorded in some fashion to permit analysis.

As we've seen, content analysis has both advantages and disadvantages in terms of validity and reliability. Problems of validity are likely unless you happen to be studying communication processes per se.

On the other side of the ledger, the concreteness of materials studied in content analysis strengthens the likelihood of reliability. You can always code and recode and even recode again if you want, making certain that the coding is consistent. In field research, by contrast, there's probably nothing you can do after the fact to ensure greater reliability in observation and categorization.

Let's move from content analysis now and turn to a related research method: the analysis of existing data. Although numbers rather than communications are analyzed in this case, I think you'll see the similarity to content analysis.

Analyzing Existing Statistics

Frequently you can or must undertake social scientific inquiry through the use of official or quasi-official statistics. This differs from secondary analysis, in which you obtain a copy of someone else's data and undertake your own statistical analysis. In this section, we're going to look at ways of using the data analyses that others have already done.

This method is particularly significant because existing statistics should always be considered at least a supplemental source of data. If you were planning a survey of political attitudes, for example, you would do well to examine and present your findings within a context of voting patterns, rates of voter turnout, or similar statistics relevant to your research interest. Or, if you were doing evaluation research on an experimental morale-building program on an assembly line, then statistics on absenteeism, sick leave, and so on would probably be interesting and revealing in connection with the data from your own research. Existing statistics, then, can often provide a historical or conceptual context within which to locate your original research.

Existing statistics can also provide the main data for a social scientific inquiry. An excellent example is the classic study mentioned at the beginning of this chapter, Emile Durkheim's Suicide ([1897] 1951).

Let's take a closer look at Durkheim's work before considering some of the special problems this method presents.

Durkheim's Study of Suicide

Why do people kill themselves? Undoubtedly every suicide case has a unique history and explanation, yet all such cases could no doubt be grouped according to certain common causes: financial failure, trouble in love, disgrace, and other kinds of personal problems. The French sociologist Emile Durkheim had a slightly different question in mind when he addressed the matter of suicide, however. He wanted to discover the environmental conditions that encouraged or discouraged it, especially social conditions.

The more Durkheim examined the available records, the more patterns of differences became apparent to him. One of the first things to attract his attention was the relative stability of suicide rates. Looking at several countries, he found suicide rates to be about the same year after year. He also discovered that a disproportionate number of suicides occurred in summer, leading him to hypothesize that temperature might have something to do with suicide. If this were the case, suicide rates should be higher in the southern European countries than in the temperate ones. However, Durkheim discovered that the highest rates were found in countries in the central latitudes, so temperature couldn't be the answer.

He explored the role of age (35 was the most common suicide age), gender (men outnumbered women around four to one), and numerous other factors. Eventually, a general pattern emerged from different sources.

In terms of the stability of suicide rates over time, for instance, Durkheim found that the pattern was not totally stable. There were spurts in the rates during times of political turmoil, which occurred in several European countries around 1848. This observation led him to hypothesize that suicide might have something to do with "breaches in social equilibrium." Put differently, social stability and integration seemed to be a protection against suicide.

This general hypothesis was substantiated and specified through Durkheim's analysis of a different set of data. The different countries of Europe had radically different suicide rates. The rate in Saxony, for example, was about ten times that of Italy, and the relative ranking of various countries persisted over time. As Durkheim considered other differences among the various countries, he eventually noticed a striking pattern: Predominantly Protestant countries had consistently higher suicide rates than Catholic ones did. The predominantly Protestant countries had 190 suicides per million population; mixed Protestant-Catholic countries, 96; and predominantly Catholic countries, 58 (Durkheim [1897] 1951: 152).

Although suicide rates thus seemed to be related to religion, Durkheim reasoned that some other factor, such as level of economic and cultural development, might explain the observed differences among countries. If religion had a genuine effect on suicide, then the religious difference would have to be found within given countries as well. To test this idea, Durkheim first noted that the German state of Bavaria had both the most Catholics and the lowest suicide rates in that country, whereas heavily Protestant Prussia had a much higher suicide rate. Not content to stop there, however, Durkheim examined the provinces composing each of those states.

Table 11-2 shows what he found. As you can see, in both Bavaria and Prussia, provinces with the highest proportion of Protestants also had the highest suicide rates. Increasingly, Durkheim became confident that religion played a significant role in the matter of suicide.

Returning eventually to a more general theoretical level, Durkheim combined the religious findings with the earlier observation about increased suicide rates during times of political turmoil. As we've seen, Durkheim suggested that many suicides are a product of anomie, that is, "normlessness," or a general sense of social instability and disintegration. During times of political strife, people may feel that the old ways of society are collapsing. They become demoralized and depressed, and suicide is one answer to the severe discomfort. Seen from the other direction, social

integration and solidarity (reflected in personal feelings of being part of a coherent, enduring social whole) offer protection against depression and suicide. That was where the religious difference fit in. Catholicism, as a far more structured and integrated religious system, gave people a greater sense of coherence and stability than did the more loosely structured Protestantism.

From these theories, Durkheim created the concept of anomic suicide. More importantly, as you know, he added the concept of anomie to the lexicon of the social sciences.

This account of Durkheim's classic study is greatly simplified, of course. Anyone studying social research would profit from studying the original. For our purposes, Durkheim's approach provides a good illustration of the possibilities for research contained in the masses of data regularly gathered and reported by government agencies and other organizations.

TABLE 11-2
Suicide Rates in Various German Provinces, Arranged in Terms of Religious Affiliation

Religious Character of Province | Suicides per Million Inhabitants

Bavarian Provinces (1867-1875)*

Less than 50% Catholic
  Rhenish Palatinate | 167
  Central Franconia | 207
  Upper Franconia | 204
  Average | 192
50% to 90% Catholic
  Lower Franconia | 157
  Swabia | 118
  Average | 135
More than 90% Catholic
  Upper Palatinate | 64
  Upper Bavaria | 114
  Lower Bavaria | 19
  Average | 75

Prussian Provinces (1883-1890)

More than 90% Protestant
  Saxony | 309.4
  Schleswig | 312.9
  Pomerania | 171.5
  Average | 264.6
68% to 89% Protestant
  Hanover | 212.3
  Hesse | 200.3
  Brandenburg and Berlin | 296.3
  East Prussia | 171.3
  Average | 220.0
40% to 50% Protestant
  West Prussia | 123.9
  Silesia | 260.2
  Westphalia | 107.5
  Average | 163.6
28% to 32% Protestant
  Posen | 96.4
  Rhineland | 100.3
  Hohenzollern | 90.1
  Average | 95.6

*Note: The population below 15 years has been omitted.
Source: Adapted from Emile Durkheim, Suicide (Glencoe, IL: Free Press, [1897] 1951), 153.

The Consequences of Globalization

The notion of "globalization" has become increasingly controversial in the United States and around the world, with reactions ranging from scholarly debates to violent confrontations in the streets. One point of view sees the spread of US-style capitalism to developing countries as economic salvation for those countries. A very different point of view sees globalization as essentially neocolonial exploitation, in which multinational conglomerates exploit the resources and people of poor countries. And, of course, there are numerous variations on these contradictory views.

Jeffrey Kentor (2001) wanted to bring data to bear on the question of how globalization affects the developing countries that host the process. To that end, he used data available from the World Bank's "World Development Indicators." (You can learn more about these data at http://www.worldbank.org/data/.) Noting past variations in the way globalization was measured, Kentor used the amount of foreign investment in a country's economy as a percentage of that country's whole economy. He reasoned that dependence on foreign investments

Analyzing Existing Statistics 333

was more important than the amount of the investment.

In his analysis of 88 countries with a per capita gross domestic product (the total goods and services produced in a country) of less than $10,000, Kentor found that dependence on foreign investment tended to increase income inequality among the citizens of a country. The greater the degree of dependence, the greater the income inequality. Kentor reasoned that globalization produced well-paid elites who, by working with the foreign corporations, maintained a status well above that of the average citizen. But because the profits derived from the foreign investments tended to be returned to the investors' countries rather than enriching the poor countries, the great majority of the population in the latter reaped little or no economic benefit.

Income inequality, in turn, was found to increase birth rates and, hence, population growth, in a process too complex to summarize here. Population growth, of course, brings a wide range of problems to countries already too poor to feed and provide for the other needs of their people.

This research example, along with our brief look at Durkheim's studies, should broaden your understanding of the kinds of social phenomena that we can study through data already collected and compiled by others.

Units of Analysis

The unit of analysis involved in the analysis of existing statistics is often not the individual. Durkheim, for example, was required to work with political-geographic units: countries, regions, states, and cities. The same situation would probably appear if you were to undertake a study of crime rates, accident rates, or disease. By their nature, most existing statistics are aggregated: They describe groups.

The aggregate nature of existing statistics can present a problem, though not an insurmountable one. As we saw, for example, Durkheim wanted to determine whether Protestants or Catholics were more likely to commit suicide. The difficulty was that none of the records available to him indicated the religion of those people who committed suicide. Ultimately, then, it was not possible for him to say whether Protestants committed suicide more often than Catholics did, though he inferred as much. Because Protestant countries, regions, and states had higher suicide rates than did Catholic countries, regions, and states, he drew the obvious conclusion.

There's danger in drawing this kind of conclusion, however. It's always possible that patterns of behavior at a group level do not reflect corresponding patterns on an individual level. Such errors are due to an ecological fallacy, which was discussed in Chapter 4. In the case of Durkheim's study, it was altogether possible, for example, that it was Catholics who committed suicide in the predominantly Protestant areas. Perhaps Catholics in predominantly Protestant areas were so badly persecuted that they were led into despair and suicide. In that case it would be possible for Protestant countries to have high suicide rates without any Protestants committing suicide.

Durkheim avoided the danger of the ecological fallacy in two ways. First, his general conclusions were based as much on rigorous theoretical deductions as on the empirical facts. The correspondence between theory and fact made a counterexplanation, such as the one I just made up, less likely. Second, by extensively retesting his conclusions in a variety of ways, Durkheim further strengthened the likelihood that they were correct. Suicide rates were higher in Protestant countries than in Catholic ones; higher in Protestant regions of Catholic countries than in Catholic regions of Protestant countries; and so forth. The replication of findings added to the weight of evidence in support of his conclusions.

Problems of Validity

Whenever we base research on an analysis of data that already exist, we're obviously limited to what exists. Often, the existing data do not cover exactly what we're interested in, and our measurements may not be altogether valid representations of the variables and concepts we want to make conclusions about.
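
To make the ecological fallacy described under Units of Analysis concrete, here is a minimal sketch in Python. Every number in it is invented for illustration, and the names (Region, aggregate_rate, rate_per_million) are hypothetical; none of this is Durkheim's data. It builds two made-up regions in which only Catholics commit suicide, then shows that the aggregate table still reports the higher rate for the mostly Protestant region.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A hypothetical region; all counts are invented for illustration."""
    name: str
    protestants: int
    catholics: int
    protestant_suicides: int  # individual-level detail an aggregate table hides
    catholic_suicides: int

    @property
    def population(self) -> int:
        return self.protestants + self.catholics

    @property
    def pct_protestant(self) -> float:
        return 100 * self.protestants / self.population

    @property
    def aggregate_rate(self) -> float:
        """Suicides per million inhabitants, the only figure an aggregate table reports."""
        total = self.protestant_suicides + self.catholic_suicides
        return 1_000_000 * total / self.population


def rate_per_million(events: int, population: int) -> float:
    return 1_000_000 * events / population


regions = [
    Region("Mostly Protestant", 900_000, 100_000, 0, 250),
    Region("Mostly Catholic", 100_000, 900_000, 0, 90),
]

# Group-level view: what a researcher working from existing statistics sees.
for r in regions:
    print(f"{r.name}: {r.pct_protestant:.0f}% Protestant, "
          f"{r.aggregate_rate:.0f} suicides per million")

# Individual-level view: what the aggregate figures cannot reveal.
protestant_rate = rate_per_million(
    sum(r.protestant_suicides for r in regions),
    sum(r.protestants for r in regions),
)
catholic_rate = rate_per_million(
    sum(r.catholic_suicides for r in regions),
    sum(r.catholics for r in regions),
)
print(f"Protestants: {protestant_rate:.0f} per million")
print(f"Catholics: {catholic_rate:.0f} per million")
```

In this invented case the mostly Protestant region has the higher aggregate rate even though no Protestant commits suicide, which is exactly the counterexplanation the text raises. The sketch does not suggest that this is what happened in Durkheim's data; it only shows why group-level figures alone cannot rule it out.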

334 Chapter 11: Unobtrusive Research

Two characteristics of science are used to handle the problem of validity in analysis of existing statistics: logical reasoning and replication. Durkheim's strategy provides an example of logical reasoning. Although he could not determine the religion of people who committed suicide, he reasoned that most of the suicides in a predominantly Protestant region would be Protestants.

Replication can be a general solution to problems of validity in social research. Recall the earlier discussion of the interchangeability of indicators (Chapter 5). Crying in sad movies isn't necessarily a valid measure of compassion; nor is putting little birds back in their nests nor giving money to charity. None of these things, taken alone, would prove that one group (women, say) was more compassionate than another (men). But if women appeared more compassionate than men by all these measures, that would create a weight of evidence in support of the conclusion. In the analysis of existing statistics, a little ingenuity and reasoning can usually turn up several independent tests of a given hypothesis. If all the tests seem to confirm the hypothesis, then the weight of evidence supports the validity of the measure.

Problems of Reliability

The analysis of existing statistics depends heavily on the quality of the statistics themselves: Do they accurately report what they claim to report? This can be a substantial problem sometimes, because the weighty tables of government statistics, for example, are sometimes grossly inaccurate.

Consider research into crime. Because a great deal of this research depends on official crime statistics, this body of data has come under critical evaluation. The results have not been too encouraging. As an illustration, suppose you were interested in tracing long-term trends in marijuana use in the United States. Official statistics on the numbers of people arrested for selling or possessing marijuana would seem to be a reasonable measure of use, right? Not necessarily.

To begin, you face a hefty problem of validity. Before the passage of the Marihuana Tax Act in 1937, "grass" was legal in the United States, so arrest records would not give you a valid measure of use. But even if you limited your inquiry to the times after 1937, you would still have problems of reliability that stem from the nature of law enforcement and crime recording.

Law enforcement, for example, is subject to various pressures. A public outcry against marijuana, led perhaps by a vocal citizens' group, often results in a police crackdown on drug trafficking, especially during an election or budget year. A sensational story in the press can have a similar effect. In addition, the volume of other business facing the police can affect marijuana arrests.

In tracing the pattern of drug arrests in Chicago between 1942 and 1970, Lois DeFleur (1975) demonstrates that the official records present a far less accurate history of drug use than of police practices and political pressure on police. On a different level of analysis, Donald Black (1970) and others have analyzed the factors influencing whether an offender is actually arrested by police or let off with a warning. Ultimately, official crime statistics are influenced by whether specific offenders are well or poorly dressed, whether they are polite or abusive to police officers, and so forth. When we consider unreported crimes, sometimes estimated to be as much as ten times the number of crimes known to police, the reliability of crime statistics gets even shakier.

These comments concern crime statistics at a local level. Often it's useful to analyze national crime statistics, such as those reported in the FBI's annual Uniform Crime Reports. Additional problems are introduced at the national level. For example, different local jurisdictions define crimes differently. Also, participation in the FBI program is voluntary, so the data are incomplete.

Finally, the process of record keeping affects the data available to researchers. Whenever a law-enforcement unit improves its record-keeping system (computerizes it, for example), the apparent crime rates increase dramatically. This can happen even if the number of crimes committed, reported, and investigated does not increase.

Researchers' first protection against the problems of reliability in the analysis of existing statistics is knowing that the problem may exist. Investigating the nature of the data collection and tabulation

may enable you to assess the nature and degree of unreliability so that you can judge its potential impact on your research interest. If you also use logical reasoning and replication, you can usually cope with the problem. "Is America #1?" provides an example of what you might discover by carefully examining the use of existing statistics.

Is America #1?

On September 19, 1999, ABC-TV broadcast a special show, hosted by John Stossel, to examine where the United States stood in the ranking of the world's societies. As the show unfolded, it became clear that the USA was doing okay (arguably #1) and that our success was primarily due to our laissez-faire capitalist system. To make the latter point more strongly, Stossel pointed to other success stories that also owed their success to laissez-faire capitalism.

According to Stossel, Hong Kong stood out among the world's nations as the leader of free-market economics. As evidence of Hong Kong's success, Stossel reported that it had "the only government in the world that makes a surplus, a big surplus." What do you think about that conclusion? Is it convincing to you?

Here's what the media watchdog, Fairness and Accuracy in Reporting (FAIR), had to say about Stossel's assertion:

    As anyone who pays attention to Washington politics knows, the U.S. government has been running a federal budget surplus for more than a year; it amounted to $70 billion last year. Other countries with budget surpluses last year included the United Kingdom, Canada, Australia, Denmark, Finland, Iceland, Ireland, New Zealand, Norway and Sweden.

Stossel went on to contrast Hong Kong (a capitalist success story) with the alternative to a free economy: "stagnation, and often poverty. Consider China, now mired in Third World poverty. They were once the leader of the world." Again, FAIR suggests a different assessment:

    Actually, China's economy is anything but "stagnant." As the Treasury Department's Lawrence Summers said in a speech last year, "China has been the fastest economy in history since [economic] reform began in 1980." While China has adopted some aspects of market economics, a large proportion of its business firms are still owned by the government.

In the media and elsewhere, you'll often find assertions of fact that appear to be based on statistical analyses. However, it's usually a good idea to check the facts.

Source: Fairness and Accuracy in Reporting, "Action Alert: ABC News Gives Up on Accuracy?" September 28, 1999 (www.fair.org/activism/stossel-america.html).

Sources of Existing Statistics

It would take a whole book just to list the sources of data available for analysis. In this section, I want to mention a few sources and point you in the direction of finding others relevant to your research interest.

Undoubtedly, the single most valuable book you can buy is the annual Statistical Abstract of the United States, published by the United States Department of Commerce. Unquestionably the best source of data about the United States, it includes statistics on the individual states and (less extensively) cities as well as on the nation as a whole. Where else can you find the number of work stoppages in the country year by year, the residential property taxes of major cities, the number of water pollution discharges reported around the country, the number of business proprietorships in the nation, and hundreds of other such handy bits of information? To make things even better, Hoover's Business Press offers the same book in soft cover for less cost. This commercial version, entitled The American Almanac, shouldn't be confused with other almanacs that are less reliable and less useful for social scientific research. Better yet, you can buy the Statistical Abstract on a CD-ROM, making the search for and transfer of data quite easy. Best of all, you can download the Statistical Abstract from the web for free (your tax dollars at work for you). You'll find it at http://www.census.gov/statab/www/.

Federal agencies (the Departments of Labor, Agriculture, Transportation, and so forth) publish numerous data series. To find out what's available,

go to your library, find the government documents section, and spend a few hours browsing through the shelves. You can also visit the U.S. Government Printing Office website (http://www.access.gpo.gov) and look around.

The web is the latest development in access to existing statistics. Here are just a few websites to illustrate the richness of this new resource:

• Bureau of the Census: http://www.census.gov/
• Bureau of Labor Statistics: http://stats.bls.gov/
• Department of Education: http://www.ed.gov/
• Bureau of Transportation Statistics: http://www.bts.gov/
• Federal Bureau of Investigation: http://www.fbi.gov/
• Central Intelligence Agency: http://www.cia.gov/
• The World Bank: http://www.worldbank.org/

World statistics are available through the United Nations. Its Demographic Yearbook presents annual vital statistics (births, deaths, and other data relevant to population) for the individual nations of the world. Other publications report a variety of other kinds of data. Again, a trip to your library, along with a web search, is the best introduction to what's available.

The amount of data provided by nongovernment agencies is as staggering as the amount your taxes buy. Chambers of commerce often publish data reports on business, as do private consumer groups. Ralph Nader has information on automobile safety, and Common Cause covers politics and government. George Gallup publishes reference volumes on public opinion as tapped by Gallup Polls since 1935.

Organizations such as the Population Reference Bureau publish a variety of demographic data, U.S. and international, that a secondary analyst could use. Their World Population Data Sheet and Population Bulletin are resources heavily used by social scientists. Social indicator data can be found in the journal SINET: A Quarterly Review of Social Reports and Research on Social Indicators, Social Trends, and the Quality of Life.

The sources I've listed represent only a tiny fraction of the thousands that are available. With so much data already collected, the lack of funds to support expensive data collection is no reason for not doing good and useful social research.

Suffering around the World

In 1992, the Population Crisis Committee, a nonprofit organization committed to combating the population explosion, undertook to analyze the relative degree of suffering in nations around the world. Every country with a population of one million or more was evaluated in terms of the following ten indicators, with a score of 10 on any indicator representing the highest level of adversity:

• Life expectancy
• Daily per capita calorie supply
• Percentage of the population with access to clean drinking water
• Proportion of infant immunization
• Rate of secondary school enrollment
• Gross national product
• Inflation
• Number of telephones per 1,000 people
• Political freedom
• Civil rights

Here's how the world's nations ranked in terms of these indicators. Remember, high scores are signs of overall suffering.

Extreme Human Suffering
    93: Mozambique
    92: Somalia
    89: Afghanistan, Haiti, Sudan
    88: Zaire
    87: Laos
    86: Guinea, Angola
    85: Ethiopia, Uganda
    84: Cambodia, Sierra Leone
    82: Chad, Guinea-Bissau
    81: Ghana, Burma
    79: Malawi
    77: Cameroon, Mauritania
    76: Rwanda, Vietnam, Liberia
    75: Burundi, Kenya, Madagascar, Yemen

High Human Suffering
    74: Ivory Coast
    73: Bhutan, Burkina Faso, Central African Republic
    71: Tanzania, Togo
    70: Lesotho, Mali, Niger, Nigeria
    69: Guatemala, Nepal
    68: Bangladesh, Bolivia, Zambia
    67: Pakistan
    66: Nicaragua, Papua-New Guinea, Senegal, Swaziland, Zimbabwe
    65: Iraq
    64: Gambia, Congo, El Salvador, Indonesia, Syria
    63: Comores, India, Paraguay, Peru
    62: Benin, Honduras
    61: Lebanon, China, Guyana, South Africa
    59: Egypt, Morocco
    58: Ecuador, Sri Lanka
    57: Botswana
    56: Iran
    55: Suriname
    54: Algeria, Thailand
    53: Dominican Republic, Mexico, Tunisia, Turkey
    51: Libya, Colombia, Venezuela
    50: Brazil, Oman, Philippines

Moderate Human Suffering
    49: Solomon Islands
    47: Albania
    45: Vanuatu
    44: Jamaica, Romania, Saudi Arabia, Seychelles, Yugoslavia (former)
    43: Mongolia
    41: Jordan
    40: Malaysia, Mauritius
    39: Argentina
    38: Cuba, Panama
    37: Chile, Uruguay, North Korea
    34: Costa Rica, South Korea, United Arab Emirates
    33: Poland
    32: Bulgaria, Hungary, Qatar
    31: Soviet Union (former)
    29: Bahrain, Hong Kong, Trinidad and Tobago
    28: Kuwait, Singapore
    25: Czechoslovakia, Portugal, Taiwan

Minimal Human Suffering
    21: Israel
    19: Greece
    16: United Kingdom
    12: Italy
    11: Barbados, Ireland, Spain, Sweden
    8: Finland, New Zealand
    7: France, Iceland, Japan, Luxembourg
    6: Austria, Germany
    5: United States
    4: Australia, Norway
    3: Canada, Switzerland
    2: Belgium, Netherlands
    1: Denmark

The availability of existing statistics also makes creating some fairly sophisticated measures possible. "Suffering around the World" describes an analysis published by the Population Crisis Committee based

on the kinds of data available in government practice.

Let's move now from an inherently quantitative method to one that is typically qualitative: comparative and historical research.

Comparative and Historical Research

Comparative and historical research differs substantially from the methods discussed so far, though it overlaps somewhat with field research, content analysis, and the analysis of existing statistics. It involves the use of historical methods by sociologists, political scientists, and other social scientists.

The discussion of longitudinal research designs in Chapter 4 notwithstanding, our examination of research methods so far has focused primarily on studies anchored in one point in time and in one locale, whether a small group or a nation. Although accurately portraying the main thrust of contemporary social scientific research, this focus conceals the fact that social scientists are also interested in tracing the development of social forms over time and comparing those developmental processes across cultures. James Mahoney and Dietrich Rueschemeyer (2003: 4) suggest that current comparative and historical researchers "focus on a wide range of topics, but they are united by a commitment to providing historically grounded explanations of large-scale and substantively important outcomes." Thus, you find comparative and historical studies dealing with the topics social class, capitalism, religion, revolution, and the like.

After describing some major instances of comparative and historical research, past and present, this section discusses some of the key elements of this method.

comparative and historical research: The examination of societies (or other social units) over time and in comparison with one another.

Examples of Comparative and Historical Research

Auguste Comte, who coined the term sociologie, saw that new discipline as the final stage in a historical development of ideas. With his broadest brush, he painted an evolutionary picture that took humans from a reliance on religion to metaphysics to science. With a finer brush, he portrayed science as evolving from the development of biology and the other natural sciences to the development of psychology and, finally, to the development of scientific sociology.

A great many later social scientists have also turned their attention to broad historical processes. Several have examined the historical progression of social forms from the simple to the complex, from rural-agrarian to urban-industrial societies. The U.S. anthropologist Lewis Morgan, for example, saw a progression from "savagery" to "barbarism" to "civilization" (1870). Robert Redfield, another anthropologist, wrote more recently of a shift from "folk society" to "urban society" (1941). Emile Durkheim saw social evolution largely as a process of ever-greater division of labor ([1893] 1964). In a more specific analysis, Karl Marx examined economic systems progressing historically from primitive to feudal to capitalistic forms ([1867] 1967). All history, he wrote in this context, was a history of class struggle: the "haves" struggling to maintain their advantages and the "have-nots" struggling for a better lot in life. Looking beyond capitalism, Marx saw the development of socialism and finally communism.

Not all historical studies in the social sciences have had this evolutionary flavor, however. Some social scientific readings of the historical record, in fact, point to grand cycles rather than to linear progressions. No scholar better represents this view than Pitirim A. Sorokin. A participant in the Russian Revolution of 1917, Sorokin served as secretary to Prime Minister Kerensky. Both Kerensky and Sorokin fell from favor, however, and Sorokin began his second career, as a U.S. sociologist.

Whereas Comte read history as a progression from religion to science, Sorokin (1937-1940) suggested that societies alternate cyclically between

two points of view, which he called "ideational" and "sensate." Sorokin's sensate point of view defines reality in terms of sense experiences. The ideational, by contrast, places a greater emphasis on spiritual and religious factors. Sorokin's reading of the historical record further indicated that the passage between the ideational and sensate was through a third point of view, which he called the "idealistic." This third view combined elements of the sensate and ideational in an integrated, rational view of the world.

These examples indicate some of the topics comparative and historical researchers have examined. To get a better sense of what comparative and historical research entails, let's look at a few examples in somewhat more detail.

Weber and the Role of Ideas

In his analysis of economic history, Karl Marx put forward a view of economic determinism. That is, he postulated that economic factors determined the nature of all other aspects of society. For example, Marx's analysis showed that a function of European churches was to justify and support the capitalist status quo: religion was a tool of the powerful in maintaining their dominance over the powerless. "Religion is the sigh of the oppressed creature," Marx wrote in a famous passage, "the sentiment of a heartless world, and the soul of soulless conditions. It is the opium of the people" (Bottomore and Rubel [1843] 1956: 27).

Max Weber, a German sociologist, disagreed. Without denying that economic factors could and did affect other aspects of society, Weber argued that economic determinism did not explain everything. Indeed, Weber said, economic forms could come from noneconomic ideas. In his research in the sociology of religion, Weber examined the extent to which religious institutions were the source of social behavior rather than mere reflections of economic conditions. His most noted statement of this side of the issue is found in The Protestant Ethic and the Spirit of Capitalism ([1905] 1958). Here's a brief overview of Weber's thesis.

John Calvin (1509-1564), a French theologian, was an important figure in the Protestant reformation of Christianity. Calvin taught that the ultimate salvation or damnation of every individual had already been decided by God; this idea is called predestination. Calvin also suggested that God communicated his decisions to people by making them either successful or unsuccessful during their earthly existence. God gave each person an earthly "calling" (an occupation or profession) and manifested their success or failure through that medium. Ironically, this point of view led Calvin's followers to seek proof of their coming salvation by working hard, saving their money, and generally striving for economic success.

In Weber's analysis, Calvinism provided an important stimulus for the development of capitalism. Rather than "wasting" their money on worldly comforts, the Calvinists reinvested it in their economic enterprises, thus providing the capital necessary for the development of capitalism. In arriving at this interpretation of the origins of capitalism, Weber researched the official doctrines of the early Protestant churches, studied the preaching of Calvin and other church leaders, and examined other relevant historical documents.

In three other studies, Weber conducted detailed historical analyses of Judaism ([1934] 1952) and the religions of China ([1934] 1951) and India ([1934] 1958). Among other things, Weber wanted to know why capitalism had not developed in the ancient societies of China, India, and Israel. In none of the three religions did he find any teaching that would have supported the accumulation and reinvestment of capital, which strengthened his conclusion about the role of Protestantism in that regard.

Japanese Religion and Capitalism

Weber's thesis regarding Protestantism and capitalism has become a classic in the social sciences. Not surprisingly, other scholars have attempted to test it in other historical situations. No analysis has been more interesting, however, than Robert Bellah's examination of the growth of capitalism in Japan during the late nineteenth and early twentieth centuries, Tokugawa Religion (1957).

As both an undergraduate and a graduate student, Bellah had developed interests in Weber and

in Japanese society. Given these two interests, it was perhaps inevitable that he would, in 1951, first conceive his Ph.D. thesis topic as "nothing less than an 'Essay on the Economic Ethic of Japan' to be a companion to Weber's studies of China, India, and Judaism: The Economic Ethic of the World Religions" (recalled in Bellah 1967: 168). Originally, Bellah sketched his research design as follows:

    Problems would have to be specific and limited (no general history would be attempted) since time span is several centuries. Field work in Japan on the actual economic ethic practiced by persons in various situations, with, if possible, controlled matched samples from the U.S. (questionnaires, interviews, etc.).
    (1967: 168)

Bellah's original plan, then, called for surveys of contemporary Japanese and Americans. However, he did not receive the financial support necessary for the study as originally envisioned. So instead, he immersed himself in the historical records of Japanese religion, seeking the roots of the rise of capitalism in Japan.

In the course of several years' research, Bellah uncovered numerous leads. In a 1952 term paper on the subject, Bellah felt he had found the answer in the samurai code of Bushido and in the Confucianism practiced by the samurai class:

    Here I think we find a real development of this worldly asceticism, at least equaling anything found in Europe. Further, in this class the idea of duty in occupation involved achievement without traditionalistic limits, but to the limits of one's capacities, whether in the role of bureaucrat, doctor, teacher, scholar, or other role open to the Samurai.
    (Quoted in Bellah 1967: 171)

The samurai, however, made up only a portion of Japanese society. So Bellah kept looking at the religions among the Japanese generally. His understanding of the Japanese language was not yet very good, but he wanted to read religious texts in the original. Under these constraints and experiencing increased time pressure, Bellah decided to concentrate his attention on a single group: Shingaku, a religious movement among merchants in the eighteenth and nineteenth centuries. He found that Shingaku had two influences on the development of capitalism. It offered an attitude toward work similar to the Calvinist notion of a "calling," and it had the effect of making business a more acceptable calling for Japanese. Previously, commerce had held a very low standing in Japan.

In other aspects of his analysis, Bellah examined the religious and political roles of the Emperor and the economic impact of periodically appearing emperor cults. Ultimately, Bellah's research pointed to the variety of religious and philosophical factors that laid the groundwork for capitalism in Japan. It seems unlikely that he would have achieved anything approaching that depth of understanding if he had been able to pursue his original plan to interview matched samples of U.S. and Japanese citizens.

I've presented these two studies in some depth to demonstrate the way comparative and historical researchers dig down into the variables relevant to their analyses. Here are a few briefer examples to illustrate some of the topics interesting to comparative and historical scholars today.

• The Rise of Christianity: Rodney Stark (1997) lays out his research question in the book's subtitle: How the Obscure, Marginal Jesus Movement Became the Dominant Religious Force in the Western World in a Few Centuries. For many people the answer to this puzzle is a matter of faith in the miraculous destiny of Christianity. Without debunking Christian faith, Stark looks for a scientific explanation, undertaking an analysis of existing historical records that sketch out the population growth of Christianity during its early centuries. He notes, among other things, that the early growth rate of Christianity, rather than being unaccountably rapid, was very similar to the contemporary growth of Mormonism. He then goes on to examine elements in early Christian practice that gave it growth advantages over the predominant paganism of the Roman Empire. For example, the early Christian churches were friendlier to women than

paganism was, and much of the early growth occurred among women, who often converted their husbands later on. And in an era of deadly plagues, the early Christians were more willing to care for stricken friends and family members, which not only enhanced the survival of Christians but also made Christianity a more attractive conversion prospect. At every turn in the analysis, Stark makes rough calculations of the demographic impact of cultural factors. This study is an illustration of how social research methods can shed light on nonscientific realms such as faith and religion.

• International Policing: Mathieu Deflem (2002) set out to learn how contemporary systems of international cooperation among police agencies came about. All of us have heard movie and TV references to Interpol. Deflem went back to the middle of the nineteenth century and traced its development through World War II. In part, his analysis examines the strains between the bureaucratic integration of police agencies in their home governments and the need for independence from those governments.

• Understanding America: Charles Perrow (2002) wanted to understand the roots of the uniquely American form of capitalism. Compared with European nations, the United States has shown less interest in providing for the needs of average citizens and has granted greater power to gigantic corporations. Perrow feels the die was pretty much cast by the end of the nineteenth century, resting primarily on Supreme Court decisions in favor of corporations and the experiences of the textile and railroad industries.

• American Democracy: Theda Skocpol (2003) turns her attention to something that fascinated Alexis de Tocqueville in his 1840 Democracy in America: the grassroots commitment to democracy, which appeared in all aspects of American community life. It almost seemed as though democratic decision making was genetic in the new world, but what happened? Skocpol's analysis of contemporary U.S. culture suggests a "diminished democracy" that cannot be easily explained by the ideologies of either the right or the left.

These examples of comparative and historical research should give you some sense of the potential power of the method. Let's turn now to an examination of the sources and techniques used in this method.

Sources of Comparative and Historical Data

As we saw in the case of existing statistics, there is no end of data available for analysis in historical research. To begin, historians may have already reported on whatever it is you want to examine, and their analyses can give you an initial grounding in the subject, a jumping-off point for more in-depth research.

Most likely you'll ultimately want to go beyond others' conclusions and examine some "raw data" to draw your own conclusions. These data vary, of course, according to the topic under study. In Bellah's study of Tokugawa religion, raw data included the sermons of Shingaku teachers. When W. I. Thomas and Florian Znaniecki (1918) studied the adjustment process for Polish peasants coming to the United States early in this century, they examined letters written by the immigrants to their families in Poland. (They obtained the letters through newspaper advertisements.) Other researchers have analyzed old diaries. Such personal documents only scratch the surface, however. In discussing procedures for studying the history of family life, Ellen Rothman points to the following sources:

    In addition to personal sources, there are public records which are also revealing of family history. Newspapers are especially rich in evidence on the educational, legal, and recreational aspects of family life in the past as seen from a local point of view. Magazines reflect more general patterns of family life; students often find them interesting to explore for data on perceptions and expectations of mainstream family values. Magazines offer several different kinds

    of sources at once: visual materials (illustrations and advertisements), commentary (editorial and advice columns), and fiction. Popular periodicals are particularly rich in the last two. Advice on many questions of concern to families - from the proper way to discipline children to the economics of wallpaper - fills magazine columns from the early nineteenth century to the present. Stories that suggest common experiences or perceptions of family life appear with the same continuity.

    (1981: 53)

Organizations generally document themselves, so if you're studying the development of some organization, as Bellah studied Shingaku, for example, you should examine its official documents: charters, policy statements, speeches by leaders, and so on. Once, when I was studying the rise of a contemporary Japanese religious group, Sokagakkai, I discovered not only weekly newspapers and magazines published by the group but also a published collection of all the speeches given by the original leaders. With these sources, I could trace changes in recruitment patterns over time. At the outset, followers were enjoined to enroll all the world. Later, the emphasis shifted specifically to Japan. Once a sizable Japanese membership had been established, an emphasis on enrolling all the world returned (Babbie 1966).

Often, official government documents provide the data needed for analysis. To better appreciate the history of race relations in the United States, A. Leon Higginbotham, Jr. (1978) examined 200 years of laws and court cases involving race. Himself the first African American appointed a federal judge, Higginbotham found that, rather than protecting African Americans, the law embodied bigotry and oppression. In the earliest court cases, there was considerable ambiguity over whether African Americans were indentured servants or, in fact, slaves. Later court cases and laws clarified the matter, holding African Americans to be something less than human.

The sources of data for historical analysis are too extensive to cover even in outline here, though the examples we've looked at should suggest some ideas. Whatever resources you use, however, a couple of cautions are in order.

As we saw in the case of existing statistics, you can't trust the accuracy of records, whether official or unofficial, primary or secondary. Your protection lies in replication: In the case of historical research, that means corroboration. If several sources point to the same set of "facts," your confidence in them might reasonably increase.

At the same time, you need always be wary of bias in your data sources. If all your data on the development of a political movement are taken from the movement itself, you're unlikely to gain a well-rounded view of it. The diaries of well-to-do gentry of the Middle Ages may not give you an accurate view of life in general during those times. Where possible, obtain data from a variety of sources representing different points of view. Here's what Bellah said regarding his analysis of Shingaku:

    One could argue that there would be a bias in what was selected for notice by Western scholars. However, the fact that there was material from Western scholars with varied interests from a number of countries and over a period of nearly a century reduced the probability of bias.

    (Bellah 1967: 179)

The issues raised by Bellah are important ones. As Ron Aminzade and Barbara Laslett indicate in "Reading and Evaluating Documents," there is an art to knowing how to regard such documents and what to make of them.

Incidentally, the critical review that Aminzade and Laslett urge for the reading of historical documents is useful in many areas of your life besides the pursuit of comparative and historical research. Consider applying some of their questions to presidential press conferences, advertising, or (gasp) college textbooks. None of these offers a direct view of reality; all have human authors and human subjects.

Reading and Evaluating Documents
by Ron Aminzade and Barbara Laslett
University of Minnesota

The purpose of the following comments is to give you some sense of the kind of interpretive work historians do and the critical approach they take toward their sources. It should help you to appreciate some of the skills historians develop in their efforts to reconstruct the past from residues, to assess the evidentiary status of different types of documents, and to determine the range of permissible inferences and interpretations. Here are some of the questions historians ask about documents:

1. Who composed the documents? Why were they written? Why have they survived all these years? What methods were used to acquire the information contained in the documents?

2. What are some of the biases in the documents and how might you go about checking or correcting them? How inclusive or representative is the sample of individuals, events, and so on, contained in the document? What were the institutional constraints and the general organizational routines under which the document was prepared? To what extent does the document provide more of an index of institutional activity than of the phenomenon being studied? What is the time lapse between the observation of the events documented and the witnesses' documentation of them? How confidential or public was the document meant to be? What role did etiquette, convention, and custom play in the presentation of the material contained within the document? If you relied solely upon the evidence contained in these documents, how might your vision of the past be distorted? What other kinds of documents might you look at for evidence on the same issues?

3. What are the key categories and concepts used by the writer of the document to organize the information presented? What selectivities or silences result from these categories of thought?

4. What sorts of theoretical issues and debates do these documents cast light on? What kinds of historical and/or sociological questions do they help to answer? What sorts of valid inferences can one make from the information contained in these documents? What sorts of generalizations can one make on the basis of the information contained in these documents?

Analytical Techniques

The analysis of comparative and historical data is another large subject that I can't cover exhaustively here. Moreover, because comparative and historical research is usually a qualitative method, there are no easily listed steps to follow in the analysis of historical data. Nevertheless, a few comments are in order.

Max Weber used the German term verstehen ("understanding") in reference to an essential quality of social research. He meant that the researcher must be able to take on, mentally, the circumstances, views, and feelings of those being studied, so that the researcher can interpret their actions appropriately. Certainly this concept applies to comparative and historical research. The researcher's imaginative understanding is what breathes life and meaning into the evidence being analyzed.

The comparative and historical researcher must find patterns among the voluminous details describing the subject matter of study. Often this takes the form of what Weber called ideal types: conceptual models composed of the essential characteristics of social phenomena. Thus, for example, Weber himself did considerable research on bureaucracy. Having observed numerous actual bureaucracies, Weber ([1925] 1946) detailed those qualities essential to bureaucracies in general: jurisdictional areas, hierarchically structured authority, written files, and so on. Weber did not merely list those characteristics common to all the actual bureaucracies he observed. Rather, to create a theoretical model of the "perfect" (ideal type) bureaucracy, he needed to understand fully the essentials of bureaucratic operation. Figure 11-4 offers a more recent, graphic portrayal of some positive and negative aspects of bureaucracy as a general social phenomenon.

Often, comparative and historical research is informed by a particular theoretical paradigm. Thus, Marxist scholars may undertake historical analyses of particular situations, such as the history of Latinos and Latinas in the United States, to

determine whether they can be understood in terms of the Marxist version of conflict theory. Sometimes, comparative and historical researchers attempt to replicate prior studies in new situations, for example, Bellah's study of Tokugawa religion in the context of Weber's studies of religion and economics.

[FIGURE 11-4: Some Positive and Negative Aspects of Bureaucracy. Labels recoverable from the figure: division of labor; rules and regulations; employment based on technical qualification; inefficiency and rigidity; resistance to change; perpetuation of race, class, and gender inequalities. Source: Diana Kendall, Sociology in Our Times, 4th ed. (Belmont, CA: Wadsworth, ©2003). Used by permission.]

Although comparative and historical research is often regarded as a qualitative rather than quantitative technique, this is by no means necessary. Historical analysts sometimes use time-series data to monitor changing conditions over time, such as data on population, crime rates, unemployment, infant mortality rates, and so forth. The analysis of such data sometimes requires sophistication, however. For example, Larry Isaac and Larry Griffin (1989) discuss the uses of a variation on regression techniques (see Chapter 16) in determining the meaningful breaking points in historical processes, as well as for specifying the periods within which certain relationships occur among variables. Criticizing the tendency to regard history as a steadily unfolding process, the authors focus their attention on the statistical relationship between unionization and the frequency of strikes, demonstrating that the relationship has shifted importantly over time.

Isaac and Griffin raise several important issues regarding the relationship among theory, research methods, and the "historical facts" they address. Their analysis, once again, warns against the naive assumption that history as documented necessarily coincides with what actually happened.

MAIN POINTS

Introduction

• Unobtrusive measures are ways of studying social behavior without affecting it in the process.

Content Analysis

• Content analysis is a social research method appropriate for studying human communications through social artifacts. Researchers can use it to study not only communication processes but other aspects of social behavior as well.

• Common units of analysis in content analysis include elements of communications: words, paragraphs, books, and so forth. Standard probability sampling techniques are sometimes appropriate in content analysis.

• Content analysis involves coding, that is, transforming raw data into categories based on some conceptual scheme. Coding may attend to both manifest and latent content. The determination of latent content requires judgments on the part of the researcher.

• Both quantitative and qualitative techniques are appropriate for interpreting content analysis data.

• The advantages of content analysis include economy, safety, and the ability to study processes occurring over a long time. Its disadvantages are that it is limited to recorded communications and can raise issues of reliability and validity.

Analyzing Existing Statistics

• A variety of government and nongovernment agencies provide aggregate statistical data for studying aspects of social life.

• Problems of validity in the analysis of existing statistics can often be handled through logical reasoning and replication.

• Existing statistics often have problems of reliability, so they must be used with caution.

Comparative and Historical Research

• Social scientists use comparative and historical methods to discover patterns in the histories of different cultures.

• Although often regarded as a qualitative method, comparative and historical research can make use of quantitative techniques.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

coding
comparative and historical research
content analysis
latent content
manifest content
unobtrusive research

REVIEW QUESTIONS AND EXERCISES

1. Outline a content analysis design to determine whether the Republican or the Democratic party is the more supportive of a basic constitutional right such as free speech, freedom of religion, or protection against self-incrimination. Be sure to specify units of analysis and sampling methods. Describe a coding scheme that you could use for the content analysis.

2. Identify an international news story involving a conflict between two nations or cultural groups, such as clashes between Israelis and Palestinians. Locate on the web a newspaper report of the event from within each of the countries or cultures involved. Note differences in the way the event is reported. Now, find a report of

the event in a newspaper in a third, distant country. (For example, compare reports from the Jerusalem Post, the Palestine Chronicle, and the New York Times.) Does the third report seem to favor one of the two original reports? If so, would you conclude that the third report is biased toward one side or that one of the original reports was simply inaccurate? Explain how and why you reached that conclusion. (You might use World Press Review, http://www.worldpress.org, as an alternative source of data. They present contrasting articles on a given story.)

3. Using the World Wide Web, find out how many countries have a higher "expected life expectancy" than the United States. (You might want to try the Population Reference Bureau at http://www.prb.org.)

4. Max Weber undertook extensive studies of some of the world's major religions. Create an annotated bibliography of his works in this area.

5. On the web, locate the American Sociological Association's section called "Comparative and Historical Sociology" (http://www2.asanet.org/sectionchs/). Summarize an article in the section's newsletter.

ADDITIONAL READINGS

Baker, Vern, and Charles Lambert. 1990. "The National Collegiate Athletic Association and the Governance of Higher Education." Sociological Quarterly 31 (3): 403-21. A historical analysis of the factors producing and shaping the NCAA.

Berg, Bruce L. 1998. Qualitative Research Methods for the Social Sciences. 3rd ed. Boston: Allyn and Bacon. Contains excellent materials on unobtrusive measures, including a chapter on content analysis. Although focusing on qualitative research, Berg shows the logical links between qualitative and quantitative approaches.

Evans, William. 1996. "Computer-Supported Content Analysis: Trends, Tools, and Techniques." Social Science Computer Review 14 (3): 269-79. Here's a review of current computer software for content analysis, such as CETA, DICTION, INTEXT, MCCA, MECA, TEXTPACK, VBPro, and WORDLINK.

Gould, Roger V., ed. 2000. The Rational Choice Controversy in Historical Sociology. Chicago: University of Chicago Press. A lively debate over the state of theory in historical research and about rational choice in sociology in general.

Øyen, Else, ed. 1990. Comparative Methodology: Theory and Practice in International Social Research. Newbury Park, CA: Sage. Here are a variety of viewpoints on different aspects of comparative research. Appropriately, the contributors are from many different countries.

U.S. Bureau of the Census. 2004. Statistical Abstract of the United States, 2004, National Data Book and Guide to Sources. Washington, DC: U.S. Government Printing Office. This is absolutely the best book bargain available (present company excluded). Although the hundreds of pages of tables of statistics are not exciting bedtime reading (the plot is a little thin), it is an absolutely essential resource volume for every social scientist. This document is now also available on CD-ROM and on the web.

Webb, Eugene T., Donald T. Campbell, Richard D. Schwartz, Lee Sechrest, and Janet Belew Grove. 2000. Unobtrusive Measures. Rev. ed. Thousand Oaks, CA: Sage. A compendium of unobtrusive measures. Includes physical traces, a variety of archival sources, and observations. Good discussion of the ethics involved and the limitations of such measures.

Weber, Robert Philip. 1990. Basic Content Analysis. Newbury Park, CA: Sage. Here's an excellent beginner's book for the design and execution of content analysis. Both general issues and specific techniques are presented.

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you'll also find a detailed primer on using SPSS.

Online Study Resources

SociologyNow: Research Methods

1. Before you do your final review of the chapter, take the SociologyNow: Research Methods diagnostic quiz to help identify the areas on which you should concentrate. You'll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.

2. As you review, take advantage of the SociologyNow: Research Methods customized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.

3. When you're finished with your review, take the posttest to confirm that you're ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 11TH EDITION

Go to your book's website at http://sociology.wadsworth.com/babbie_practice11e for tools to aid you in studying for your exams. You'll find Tutorial Quizzes with feedback, Internet Exercises, Flashcards, and Chapter Tutorials, as well as Extended Projects, InfoTrac College Edition search terms, Social Research in Cyberspace, GSS Data, Web Links, and primers for using various data-analysis software such as SPSS and NVivo.

WEB LINKS FOR THIS CHAPTER

Please realize that the Internet is an evolving entity, subject to change. Nevertheless, these few websites should be fairly stable. Also, check your book's website for even more Web Links. These websites, current at the time of this book's publication, provide opportunities to learn about unobtrusive research.

U.S. Census Bureau, Statistical Abstract of the United States
http://www.census.gov/statab/www/
Here's a superb resource for the analysis of existing statistics on the United States.

ASA, Comparative and Historical Sociology
http://www2.asanet.org/sectionchs/
This is the homepage of the American Sociological Association's section on comparative and historical research. Hot links will take you to research tools, data sources, and other resources.

Evaluation Research

Introduction
Topics Appropriate to Evaluation Research
Formulating the Problem: Issues of Measurement
    Specifying Outcomes
    Measuring Experimental Contexts
    Specifying Interventions
    Specifying the Population
    New versus Existing Measures
    Operationalizing Success/Failure
Types of Evaluation Research Designs
    Experimental Designs
    Quasi-Experimental Designs
    Qualitative Evaluations
The Social Context
    Logistical Problems
    Some Ethical Issues
    Use of Research Results
Social Indicators Research
    The Death Penalty and Deterrence
    Computer Simulation

SociologyNow: Research Methods
Use this online tool to help you make the grade on your next exam. After reading this chapter, go to the "Online Study Resources" at the end of the chapter for instructions on how to benefit from SociologyNow: Research Methods.

Introduction

You may not be familiar with Twende na Wakati ("Let's Go with the Times"), but it's the most popular radio show in Tanzania. It's a soap opera. The main character, Mkwaju, is a truck driver with some pretty traditional ideas about gender roles and sex. By contrast, Fundi Mitindo, a tailor, and his wife, Mama Waridi, have more modern ideas regarding the roles of men and women, particularly in relation to the issues of overpopulation and family planning.

Twende na Wakati was the creation of Population Communications International (PCI) and other organizations working in conjunction with the Tanzanian government in response to two problems facing that country: (1) a population growth rate over twice that of the rest of the world and (2) an AIDS epidemic particularly heavy along the international truck route, where more than a fourth of the truck drivers and over half the commercial sex workers were found to be HIV positive in 1991. The prevalence of contraceptive use was 11 percent (Rogers et al. 1996: 5-6).

The purpose of the soap opera was to bring about a change in knowledge, attitudes, and practices (KAP) relating to contraception and family planning. Rather than instituting a conventional educational campaign, PCI felt it would be more effective to illustrate the message through entertainment.

Between 1993 and 1995, 208 episodes of Twende na Wakati were aired, aiming at the 67 percent of Tanzanians who listen to the radio. Eighty-four percent of the radio listeners reported listening to the PCI soap opera, making it the most popular show in the country. Ninety percent of the show's listeners recognized Mkwaju, the sexist truck driver, and only 3 percent regarded him as a positive role model. Over two-thirds identified Mama Waridi, a businesswoman, and her tailor husband as positive role models.

Surveys conducted to measure the impact of the show indicated it had affected knowledge, attitudes, and behavior. For example, 49 percent of the married women who listened to the show said they now practiced family planning, compared with only 19 percent of the nonlisteners. There were other impacts:

    Some 72 percent of the listeners in 1994 said that they adopted an HIV/AIDS prevention behavior because of listening to "Twende na Wakati," and this percentage increased to 82 percent in our 1995 survey. Seventy-seven percent of these individuals adopted monogamy, 16 percent began using condoms, and 6 percent stopped sharing razors and/or needles.

    (Rogers et al. 1996: 21)

We can judge the effectiveness of the soap opera because of a particular form of social science. Evaluation research refers to a research purpose rather than a specific method. This purpose is to evaluate the impact of social interventions such as new teaching methods or innovations in parole. Many methods (surveys, experiments, and so on) can be used in evaluation research.

Evaluation research is probably as old as social research itself. Whenever people have instituted a social reform for a specific purpose, they have paid attention to its actual consequences, even if they have not always done so in a conscious, deliberate, or sophisticated fashion. In recent years, however, the field of evaluation research has become an increasingly popular and active research specialty, as reflected in textbooks, courses, and projects. Moreover, the growth of evaluation research points to a more general trend in the social sciences. As a researcher, you'll likely be asked to conduct evaluations of your own.

In part, the growth of evaluation research reflects social researchers' increasing desire to make a difference in the world. At the same time, we can't discount the influence of (1) an increase in federal requirements that program evaluations must accompany the implementation of new programs and (2) the availability of research funds to fulfill those requirements. In any case, it seems clear that social researchers will be bringing their skills into the real world more than ever before.
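The Twende na Wakati survey figures quoted in this introduction lend themselves to a little arithmetic. The following is a minimal sketch in Python that uses only the percentages reported from Rogers et al. (1996). The function and variable names are illustrative, not part of any study. It assumes that the 77, 16, and 6 percent figures are shares of the 72 percent of 1994 listeners who reported adopting a prevention behavior, which is how the quotation reads. No sample sizes are given in the text, so no significance test is attempted.

```python
# Percentages as reported in the text (Rogers et al. 1996).
# All names below are illustrative assumptions, not from the study itself.
LISTENERS_FAMILY_PLANNING = 49.0     # % of married women listeners practicing family planning
NONLISTENERS_FAMILY_PLANNING = 19.0  # % of married women nonlisteners doing so
ADOPTED_PREVENTION_1994 = 72.0       # % of listeners reporting an HIV/AIDS prevention behavior


def percentage_point_gap(group_pct, comparison_pct):
    """Absolute difference between two percentages, in percentage points."""
    return group_pct - comparison_pct


def ratio(group_pct, comparison_pct):
    """How many times larger the first percentage is than the second."""
    return group_pct / comparison_pct


def share_of_all(outer_pct, inner_pct):
    """Convert 'inner_pct of the outer_pct subgroup' into a percentage of everyone."""
    return outer_pct * inner_pct / 100.0


gap = percentage_point_gap(LISTENERS_FAMILY_PLANNING, NONLISTENERS_FAMILY_PLANNING)
times = ratio(LISTENERS_FAMILY_PLANNING, NONLISTENERS_FAMILY_PLANNING)

# Assumed reading: each figure is a share of those who adopted some behavior.
behaviors = {
    "monogamy": 77.0,
    "condom use": 16.0,
    "stopped sharing razors/needles": 6.0,
}
shares = {name: share_of_all(ADOPTED_PREVENTION_1994, pct)
          for name, pct in behaviors.items()}

print(f"Listeners exceed nonlisteners by {gap:.0f} percentage points "
      f"({times:.1f} times the nonlistener rate).")
for name, pct in shares.items():
    print(f"{name}: about {pct:.1f} percent of all 1994 listeners")
```

A gap of this kind describes a difference between two groups; by itself it does not show that the program caused the difference, since listeners and nonlisteners may have differed in other ways.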

This chapter looks at some of the key elements in this form of social research. After considering the kinds of topics commonly subjected to evaluation, we'll move through some of its main operational aspects: measurement, study design, and execution. As you'll see, formulating questions is as important as answering them. Because it occurs within real life, evaluation research has its own problems, some of which we'll examine. Besides logistical problems, special ethical issues arise from evaluation research generally and from its specific, technical procedures. As you review reports of program evaluations, you should be especially sensitive to these problems.

Evaluation is a form of applied research; that is, it's intended to have some real-world effect. It will be useful, therefore, to consider whether and how it's actually applied. As you'll see, the obvious implications of an evaluation research project do not necessarily affect real life. They may become the focus of ideological, rather than scientific, debates. They may simply be denied out of hand, for political or other reasons. Perhaps most typically, they may simply be ignored and forgotten, left to collect dust in bookcases across the land.

The chapter concludes with a look at a particular resource for large-scale evaluation: social indicators research. This type of research is also a rapidly growing specialty. Essentially it involves the creation of aggregated indicators of the "health" of society, similar to the economic indicators that give diagnoses and prognoses of economies.

Topics Appropriate to Evaluation Research

Evaluation research is appropriate whenever some social intervention occurs or is planned. A social intervention is an action taken within a social context for the purpose of producing some intended result. In its simplest sense, evaluation research is the process of determining whether a social intervention has produced the intended result.

The topics appropriate to evaluation research are limitless. When the federal government abolished the selective service system (the draft), military researchers began paying special attention to the impact on enlistment. As individual states have liberalized their marijuana laws, researchers have sought to learn the consequences, both for marijuana use and for other forms of social behavior. Do no-fault divorce reforms increase the number of divorces, and do related social problems decrease or increase? Has no-fault automobile insurance really brought down insurance policy premiums? Agencies providing foreign aid also conduct evaluations to determine whether the desired effects were produced.

There are many variations in the intent of evaluation research. Needs assessment studies aim to determine the existence and extent of problems, typically among a segment of the population, such as the elderly. Cost-benefit studies determine whether the results of a program can be justified by its expense (both financial and other). Monitoring studies provide a steady flow of information about something of interest, such as crime rates or the outbreak of an epidemic.

evaluation research: Research undertaken for the purpose of determining the impact of some social intervention, such as a program aimed at solving a social problem.

needs assessment studies: Studies that aim to determine the existence and extent of problems, typically among a segment of the population, such as the elderly.

cost-benefit studies: Studies that determine whether the results of a program can be justified by its expense (both financial and other).

monitoring studies: Studies that provide a steady flow of information about something of interest, such as crime rates or the outbreak of an epidemic.

Sometimes the monitoring involves incremental interventions. Read this description of "adaptive management" by the Nature Conservancy, a public-interest group seeking to protect natural areas:

    First partners assess assumptions and set management goals for the conservation area. Based on this assessment, the team takes action, then monitors the environment to see how it responds. After measuring results, partners refine their assumptions, goals and monitoring regimen to reflect what they've learned from past

    experiences. With refinements in place, the entire process begins again.

    (2005: 3)

Much of evaluation research is referred to as program evaluation or outcome assessment: the determination of whether a social intervention is producing the intended result. Here's an example.

program evaluation/outcome assessment: The determination of whether a social intervention is producing the intended result.

Some years ago, a project evaluating the nation's drivers' education programs, conducted by the National Highway and Transportation Safety Administration (NHTSA), stirred up a controversy. Philip Hilts (1981: 4) reported on the study's findings:

    For years the auto insurance industry has given large insurance discounts for children who take drivers' education courses, because statistics show that they have fewer accidents.

    The preliminary results of a new major study, however, indicate that drivers' education does not prevent or reduce the incidence of traffic accidents at all.

Based on an analysis of 17,500 young people in DeKalb County, Georgia (including Atlanta), the preliminary findings indicated that students who took drivers' education had just as many accidents and traffic violations as those who didn't take it. The study also seemed to reveal some subtle aspects of driver training.

First, it suggested that the apparent impact of drivers' education was largely a matter of self-selection. The kind of students who took drivers' education were less likely to have accidents and traffic violations, with or without driver training. Students with high grades, for example, were more likely to sign up for driver training, and they were also less likely to have accidents.

More startling, however, was the suggestion that driver-training courses may have actually increased traffic accidents! The existence of drivers' education may have encouraged some students to get their licenses earlier than if there were no such courses. In a study of ten Connecticut towns that discontinued driver training, about three-fourths of those who probably would have been licensed through their classes delayed getting licenses until they were 18 or older (Hilts 1981: 4).

As you might imagine, these results were not well received by those most closely associated with driver training. This matter was complicated, moreover, by the fact that the NHTSA study was also evaluating a new, more intensive training program, and the preliminary results showed that the new program was effective.

Here's a very different example of evaluation research. Rudolf Andorka, a Hungarian sociologist, has been particularly interested in his country's shift to a market economy. Even before the dramatic events in Eastern Europe in 1989, Andorka and his colleagues had been monitoring the nation's "second economy," that is, jobs pursued outside the socialist economy. Their surveys followed the rise and fall of such jobs and examined their impact within Hungarian society. One conclusion was that "the second economy, which earlier probably tended to diminish income inequalities or at least improved the standard of living of the poorest part of the population, in the 1980s increasingly contributed to the growth of inequalities" (Andorka 1990: 111).

As you can see, the questions appropriate to evaluation research are of great practical significance: Jobs, programs, and investments as well as beliefs and values are at stake. Let's now examine how these questions are answered, that is, how evaluations are conducted.

Formulating the Problem: Issues of Measurement

Several years ago, I headed an institutional research office that conducted research directly relevant to the operation of the university. Often, we were asked to evaluate new programs in the curriculum. The following description is fairly typical of the problem that arose in that context, and it

352 Chapter 12: Evaluation Research

points to one of the key barriers to good evaluation research.

Faculty members would appear at my office to say they'd been told by the university administration to arrange for an evaluation of the new program they had permission to try. This points to a common problem: Often the people whose programs are being evaluated aren't thrilled at the prospect. For them, an independent evaluation threatens the survival of the program and perhaps even their jobs.

The main problem I want to introduce, however, has to do with the purpose of the intervention to be evaluated. The question "What is the intended result of the new program?" often produced a vague response such as "Students will get an in-depth and genuine understanding of mathematics, instead of simply memorizing methods of calculations." Fabulous! And how could we measure that "in-depth and genuine understanding"? Often, I was told that the program aimed at producing something that could not be measured by conventional aptitude and achievement tests. No problem there; that's to be expected when we're innovating and being unconventional. What would be an unconventional measure of the intended result? Sometimes this discussion came down to an assertion that the effects of the program would be "unmeasurable."

There's the common rub in evaluation research: measuring the "unmeasurable." Evaluation research is a matter of finding out whether something is there or not there, whether something happened or didn't happen. To conduct evaluation research, we must be able to operationalize, observe, and recognize the presence or absence of what is under study.

Often, outcomes can be derived from published program documents. Thus, when Edward Howard and Darlene Norman (1981) evaluated the performance of the Vigo County Public Library in Indiana, they began with the statement of purpose previously adopted by the library's Board of Trustees.

"To acquire by purchase or gift, and by recording and production, relevant and potentially useful information that is produced by, about, or for the citizens of the community;

"To organize this information for efficient delivery and convenient access, furnish the equipment necessary for its use, and provide assistance in its utilization; and

"To effect maximum use of this information toward making the community a better place in which to live through aiding the search for understanding by its citizens."

(1981: 306)

As the researchers said, "Everything that VCPL does can be tested against the Statement of Purpose." They then set about creating operational measures for each of the purposes.

Although "official" purposes of interventions are often the key to designing an evaluation, they may not always be sufficient. Anna-Marie Madison (1992), for example, warns that programs designed to help disadvantaged minorities do not always reflect what the proposed recipients of the aid may need and desire:

"The cultural biases inherent in how middle-class white researchers interpret the experiences of low-income minorities may lead to erroneous assumptions and faulty propositions concerning causal relationships, to invalid social theory, and consequently to invalid program theory. Descriptive theories derived from faulty premises, which have been legitimized in the literature as existing knowledge, may have negative consequences for program participants."

(1992: 38)

In setting up an evaluation, then, researchers must pay careful attention to issues of measurement. Let's take a closer look at the types of measurements that evaluation researchers must deal with.

Specifying Outcomes

As I've already suggested, a key variable for evaluation researchers to measure is the outcome, or what is called the response variable. If a social program is intended to accomplish something, we must be able to measure that something. If we

want to reduce prejudice, we need to be able to measure prejudice. If we want to increase marital harmony, we need to be able to measure that.

It's essential to achieve agreements on definitions in advance:

"The most difficult situation arises when there is disagreement as to standards. For example, many parties may disagree as to what defines serious drug abuse: is it defined best as 15% or more of students using drugs weekly, 5% or more using hard drugs such as cocaine or PCP monthly, students beginning to use drugs as young as seventh grade, or some combination of the dimensions of rate of use, nature of use, and age of user? ... Applied researchers should, to the degree possible, attempt to achieve consensus from research consumers in advance of the study (e.g., through advisory groups) or at least ensure that their studies are able to produce data relevant to the standards posited by all potentially interested parties."

(Hedrick, Bickman, and Rog 1993: 27)

In some cases you may find that the definitions of a problem and a sufficient solution are defined by law or by agency regulations; if so, you must be aware of such specifications and accommodate them. Moreover, whatever the agreed-on definitions, you must also achieve agreement on how the measurements will be made. Because there are different possible methods for estimating the percentage of students "using drugs weekly," for example, you'd have to be sure that all the parties involved understood and accepted the method(s) you've chosen.

In the case of the Tanzanian soap opera, there were several outcome measures. In part, the purpose of the program was to improve knowledge about both family planning and AIDS. Thus, for example, one show debunked the belief that the AIDS virus was spread by mosquitoes and could be avoided by the use of insect repellant. Studies of listeners showed a reduction in that belief (Rogers et al. 1996: 21).

PCI also wanted to change Tanzanian attitudes toward family size, gender roles, HIV/AIDS, and other related topics; the research indicated that the show had affected these as well. Finally, the program aimed at affecting behavior. We've already seen that radio listeners reported changing their behavior with regard to AIDS prevention. They reported a greater use of family planning as well. However, because there's always the possibility of a gap between what people say they do and what they actually do, the researchers sought independent data to confirm their conclusions.

Tanzania's national AIDS-control program had been offering condoms free of charge to citizens. In the areas covered by the soap opera, the number of condoms given out increased sixfold between 1992 and 1994. This far exceeded the increase of 1.4 times in the control area, where broadcasters did not carry the soap opera.

Measuring Experimental Contexts

Measuring the dependent variables directly involved in the experimental program is only a beginning. As Henry Riecken and Robert Boruch (1974: 120-21) point out, it's often appropriate and important to measure those aspects of the context of an experiment researchers think might affect the experiment. Though external to the experiment itself, some variables may affect it.

Suppose, for example, that you were conducting an evaluation of a program aimed at training unskilled people for employment. The primary outcome measure would be their success at gaining employment after completing the program. You would, of course, observe and calculate the subjects' employment rate, but you should also determine what has happened to the employment/unemployment rates of society at large during the evaluation. A general slump in the job market should be taken into account in assessing what might otherwise seem a pretty low employment rate for subjects. Or, if all the experimental subjects get jobs following the program, you should consider any general increase in available jobs. Combining complementary measures with proper control-group designs should allow you to pinpoint the effects of the program you're evaluating.
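To make the logic of comparing a program area with its context concrete, here is a minimal Python sketch. It uses the two growth factors reported above for condom distribution (sixfold in the broadcast area, 1.4 times in the control area). The function name `growth_ratio` is my own illustration, not something taken from the Tanzanian study, and the calculation is only one simple way to express the comparison.

```python
def growth_ratio(program_factor: float, control_factor: float) -> float:
    """Return how many times larger the program area's growth factor is
    than the control area's growth factor (hypothetical helper)."""
    if control_factor <= 0:
        raise ValueError("control_factor must be positive")
    return program_factor / control_factor


# Growth factors reported in the text for 1992-1994.
broadcast_area = 6.0  # condoms given out increased sixfold
control_area = 1.4    # increase where the soap opera was not carried

ratio = growth_ratio(broadcast_area, control_area)
print(f"Growth in the broadcast area was about {ratio:.1f} times the control area's growth")
```

The point of dividing by the control figure is the same one made about job markets above: part of any change would probably have happened anyway, so the program's apparent effect should be read against what happened where the program was absent.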

Specifying Interventions

Besides making measurements relevant to the outcomes of a program, researchers must measure the program intervention: the experimental stimulus. In part, this measurement will be handled by the assignment of subjects to experimental and control groups, if that's the research design. Assigning a person to the experimental group is the same as scoring that person yes on the stimulus, and assignment to the control group represents a score of no. In practice, however, it's seldom that simple.

Let's stick with the job-training example. Some people will participate in the program; others will not. But imagine for a moment what job-training programs are probably like. Some subjects will participate fully; others will miss a lot of sessions or fool around when they are present. So you may need measures of the extent or quality of participation in the program. If the program is effective, you should find that those who participated fully have higher employment rates than those who participated less do.

Other factors may further confound the administration of the experimental stimulus. Suppose we're evaluating a new form of psychotherapy designed to cure sexual impotence. Several therapists administer it to subjects composing an experimental group. We plan to compare the recovery rate of the experimental group with that of a control group, which receives some other therapy or none at all. It may be useful to include the names of the therapists treating specific subjects in the experimental group, because some may be more effective than others. If this turns out to be the case, we must find out why the treatment worked better for some therapists than for others. What we learn will further develop our understanding of the therapy itself.

Specifying the Population

In evaluating an intervention, it's important to define the population of possible subjects for whom the program is appropriate. Ideally, all or a sample of appropriate subjects will then be assigned to experimental and control groups as warranted by the study design. Defining the population, however, can itself involve specifying measurements. If we're evaluating a new form of psychotherapy, for example, it's probably appropriate for people with mental problems. But how will "mental problems" be defined and measured? The job-training program mentioned previously is probably intended for people who are having trouble finding work, but what counts as "having trouble"?

Beyond defining the relevant population, then, the researcher should make fairly precise measurements of the variables considered in the definition. For example, even though the randomization of subjects in the psychotherapy study would ensure an equal distribution of those with mild and those with severe mental problems into the experimental and control groups, we'd need to keep track of the relative severity of different subjects' problems in case the therapy turns out to be effective only for those with mild disorders. Similarly, we should measure such demographic variables as gender, age, race, and so forth in case the therapy works only for women, the elderly, or some other group.

New versus Existing Measures

In providing for the measurement of these different kinds of variables, the researcher must continually choose whether to create new measures or use ones already devised by others. If a study addresses something that's never been measured before, the choice is easy. If it addresses something that others have tried to measure, the researcher will need to evaluate the relative worth of various existing measurement devices in terms of her or his specific research situations and purpose. Recall that this is a general issue in social research that applies well beyond evaluation research. Let's briefly compare creating new measures and using existing ones.

Creating measurements specifically for a study can offer greater relevance and validity than using existing measures would. If the psychotherapy we're evaluating aims at a specific aspect of recovery, we can create measures that pinpoint that

aspect. We might not be able to find any standardized psychological measures that hit that aspect right on the head. However, creating our own measure will cost us the advantages to be gained from using preexisting measures. Creating good measures takes time and energy, both of which could be saved by adopting an existing technique. Of greater scientific significance, measures that have been used frequently by other researchers carry a body of possible comparisons that might be important to our evaluation. If the experimental therapy raises scores by an average of ten points on a standardized test, we'll be in a position to compare that therapy with others that had been evaluated using the same measure. Finally, measures with a long history of use usually have known degrees of validity and reliability, but newly created measures will require pretesting or will be used with considerable uncertainty.

Operationalizing Success/Failure

Potentially one of the most taxing aspects of evaluation research is determining whether the program under review succeeded or failed. The purpose of a foreign language program may be to help students better learn the language, but how much better is enough? The purpose of a conjugal visit program at a prison may be to raise morale, but how high does morale need to be raised to justify the program?

As you may anticipate, clear-cut answers to questions like these almost never arrive. This dilemma has surely been the source of what is generally called cost-benefit analysis. How much does the program cost in relation to what it returns in benefits? If the benefits outweigh the cost, keep the program going. If the reverse, junk it. That's simple enough, and it seems to apply in straightforward economic situations: If it cost you $20 to produce something and you can sell it for only $18, there's no way you can make up the difference in volume.

Unfortunately, the situations faced by evaluation researchers are seldom amenable to straightforward economic accounting. The foreign language program may cost the school district $100 per student, and it may raise students' performances on tests by an average of 15 points. Because the test scores can't be converted into dollars, there's no obvious ground for weighing the costs and benefits.

Sometimes, as a practical matter, the criteria of success and failure can be handled through competition among programs. If a different foreign language program costs only $50 per student and produces an increase of 20 points in test scores, it will undoubtedly be considered more successful than the first program, assuming that test scores are seen as an appropriate measure of the purpose of both programs and the less expensive program has no unintended negative consequences.

Ultimately, the criteria of success and failure are often a matter of agreement. The people responsible for the program may commit themselves in advance to a particular outcome that will be regarded as an indication of success. If that's the case, all you need to do is make absolutely certain that the research design will measure the specified outcome. I mention this obvious requirement simply because researchers sometimes fail to meet it, and there's little or nothing more embarrassing than that.

In summary, researchers must take measurement quite seriously in evaluation research, carefully determining all the variables to be measured and getting appropriate measures for each. As I've implied, however, such decisions are typically not purely scientific ones. Evaluation researchers often must work out their measurement strategy with the people responsible for the program being evaluated. It usually doesn't make sense to determine whether a program achieves Outcome X when its purpose is to achieve Outcome Y. (Realize, however, that evaluation designs sometimes have the purpose of testing for unintended consequences.)

There is a political aspect to these choices, also. Because evaluation research often affects other people's professional interests (their pet program may be halted, or they may be fired or lose professional standing), the results of evaluation research are often argued about.

Let's turn now to some of the research designs commonly employed by evaluators.
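The comparison between the two foreign language programs discussed above can be sketched in a few lines of Python. The dollar and point figures come from the text; the `Program` class and the cost-per-point yardstick are my own illustrative assumptions. The sketch takes for granted that test-score gain is the agreed-on outcome and that neither program has unintended negative consequences, which is exactly what a real evaluation would have to establish.

```python
from dataclasses import dataclass


@dataclass
class Program:
    """A hypothetical record of one program's cost and measured outcome."""
    name: str
    cost_per_student: float  # dollars per student
    score_gain: float        # average test-score points gained

    def cost_per_point(self) -> float:
        """Dollars spent per test-score point gained."""
        return self.cost_per_student / self.score_gain


# Figures taken from the example in the text.
first = Program("first program", cost_per_student=100.0, score_gain=15.0)
second = Program("second program", cost_per_student=50.0, score_gain=20.0)

# The program with the lowest cost per point wins this particular competition.
best = min([first, second], key=Program.cost_per_point)

for program in (first, second):
    print(f"{program.name}: ${program.cost_per_point():.2f} per point")
print(f"More cost-effective on this yardstick: {best.name}")
```

Notice that the sketch never converts test scores into dollars. It only ranks programs against one another on a shared measure, which is why competition among programs can sidestep the cost-benefit problem without solving it.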

Types of Evaluation Research Designs

As I noted at the start of this chapter, evaluation research is not itself a method, but rather one application of social research methods. As such, it can involve any of several research designs. Here we'll consider three main types of research design that are appropriate for evaluations: experimental designs, quasi-experimental designs, and qualitative evaluations.

Experimental Designs

Many of the experimental designs introduced in Chapter 8 can be used in evaluation research. By way of illustration, let's see how the classical experimental model might be applied to our evaluation of a new psychotherapy treatment for sexual impotence.

In designing our evaluation, we should begin by identifying a population of patients appropriate to the therapy. This identification might be made by researchers experimenting with the new therapy. Let's say we're dealing with a clinic that already has 100 patients being treated for sexual impotence. We might take that group and the clinic's definition of sexual impotence as a starting point, and we should maintain any existing assessments of the severity of the problem for each specific patient.

For purposes of evaluation research, however, we would need to develop a more specific measure of impotence. Maybe it would involve whether patients have sexual intercourse at all within a specified time, how often they have intercourse, or whether and how often they reach orgasm. Alternatively, the outcome measure might be based on the assessments of independent therapists not involved in the therapy who interview the patients later. In any event, we would need to agree on the measures to be used.

In the simplest design, we would assign the 100 patients randomly to experimental and control groups; the former would receive the new therapy, and the latter would be taken out of therapy altogether during the experiment. Because ethical practice would probably prevent withdrawing therapy altogether from the control group, however, it's more likely that the control group would continue to receive their conventional therapy.

Having assigned subjects to the experimental and control groups, we would need to agree on the length of the experiment. Perhaps the designers of the new therapy feel it ought to be effective within two months, and an agreement could be reached. The duration of the study doesn't need to be rigid, however. One purpose of the experiment and evaluation might be to determine how long it actually takes for the new therapy to be effective. Conceivably, then, an agreement could be struck to measure recovery rates weekly, say, and let the ultimate length of the experiment rest on a continual review of the results.

Let's suppose the new therapy involves showing pornographic movies to patients. We'd need to specify that stimulus. How often would patients see the movies, and how long would each session be? Would they see the movies in private or in groups? Should therapists be present? Perhaps we should observe the patients while the movies are being shown and include our observations among the measurements of the experimental stimulus. Do some patients watch the movies eagerly but others look away from the screen? We'd have to ask these kinds of questions and create specific measurements to address them.

Having thus designed the study, all we have to do is "roll 'em." The study is set in motion, the observations are made and recorded, and the mass of data is accumulated for analysis. Once the study has run its course, we can determine whether the new therapy had its intended (or perhaps some unintended) consequences. We can tell whether the movies were most effective for mild problems or severe ones, whether they worked for young subjects but not older ones, and so forth.

This simple illustration shows how the standard experimental designs presented in Chapter 8 can be used in evaluation research. Many, perhaps most, of the evaluations reported in the research literature don't look exactly like this illustration, however. Because it's nested in real life, evaluation

Types of Evaluation Research Designs 357

research often calls for quasi-experimental designs. Let's see what this means.

Quasi-Experimental Designs

Quasi experiments are distinguished from "true" experiments primarily by the lack of random assignment of subjects to an experimental and a control group. In evaluation research, it's often impossible to achieve such an assignment of subjects. Rather than forgo evaluation altogether, researchers sometimes create designs that give some evaluation of the program in question. This section describes some of these designs.

Time-Series Designs

To illustrate the time-series design (which involves measurements taken over time), I'll begin by asking you to assess the meaning of some hypothetical data. Suppose I come to you with what I say is an effective technique for getting students to participate in classroom sessions of a course I'm teaching. To prove my assertion, I tell you that on Monday only four students asked questions or made a comment in class; on Wednesday I devoted the class time to an open discussion of a controversial issue raging on campus; and on Friday, when we returned to the subject matter of the course, eight students asked questions or made comments. In other words, I contend, the discussion of a controversial issue on Wednesday has doubled classroom participation. This simple set of data is presented graphically in Figure 12-1.

[FIGURE 12-1: Two Observations of Class Participation: Before and After an Open Discussion. Horizontal axis: Monday, Wednesday (controversial discussion), Friday.]

Have I persuaded you that the open discussion on Wednesday has had the consequence I claim for it? Probably you'd object that my data don't prove the case. Two observations (Monday and Friday) aren't really enough to prove anything. Ideally I should have had two classes, with students assigned randomly to each, held an open discussion in only one, and then compared the two on Friday. But I don't have two classes with random assignment of students. Instead, I've been keeping a record of class participation throughout the semester for the one class. This record allows you to conduct a time-series evaluation.

quasi experiments: Nonrigorous inquiries somewhat resembling controlled experiments but lacking key elements such as pre- and posttesting and/or control groups.

time-series design: A research design that involves measurements made over some period, such as the study of traffic accident rates before and after lowering the speed limit.

Figure 12-2 presents three possible patterns of class participation over time, both before and after the open discussion on Wednesday. Which of these patterns would give you some confidence that the discussion had the impact I contend it had?

If the time-series results looked like the first pattern in Figure 12-2, you'd probably conclude that the process of greater class participation had begun on the Wednesday before the discussion and had continued, unaffected, after the day devoted to the discussion. The long-term data suggest that the trend would have occurred even without the discussion on Wednesday. The first pattern, then, contradicts my assertion that the special discussion increased class participation.

The second pattern contradicts my assertion by indicating that class participation has been bouncing up and down in a regular pattern throughout

[FIGURE 12-2: Three Patterns of Class Participation in a Longer Historical Period. Panels: Pattern 1, Pattern 2, Pattern 3; each panel marks the controversial discussion. Horizontal axis: successive class days (Monday, Wednesday, Friday).]

the semester. Sometimes it increases from one class to the next, and sometimes it decreases; the open discussion on that Wednesday simply came at a time when the level of participation was about to increase. More to the point, we note that class participation decreased again at the class following the alleged postdiscussion increase.

Only the third pattern in Figure 12-2 supports my contention that the open discussion mattered. As depicted there, the level of discussion before that Wednesday had been a steady four students per class. Not only did the level of participation double following the day of the discussion, but it continued to increase afterward. Although these data do not protect us against the possible influence of some extraneous factor (I might also have mentioned that participation would figure into students' grades), they do exclude the possibility that the increase results from a process of maturation (indicated in the first pattern) or from regular fluctuations (indicated in the second).

Nonequivalent Control Groups

The time-series design just described involves only an "experimental" group; it doesn't provide the value to be gained from having a control group. Sometimes, when researchers can't create experimental and control groups by random assignment from a common pool, they can find an existing "control" group that appears similar to the experimental group. Such a group is called a nonequivalent control group. If an innovative foreign language program is being tried in one class in a large high school, for example, you may be able to find another foreign language class in the same school that has a very similar student population: one that has about the same composition in terms of grade in school, gender, ethnicity, IQ, and so forth. The second class, then, could provide a point of comparison even though it is not formally part of the study. At the end of the semester, you could give both classes the same foreign language test and then compare performances.

Here's how two junior high schools were selected for purposes of evaluating a program aimed at discouraging tobacco, alcohol, and drug use:

"The pairing of the two schools and their assignment to 'experimental' and 'control' conditions was not random. The local Lung Association had identified the school where we delivered the program as one in which administrators were seeking a solution to admitted problems of smoking, alcohol, and drug abuse. The 'control' school was chosen as a convenient and nearby demographic match where administrators were willing to allow our surveying and breath-testing procedures. The principal of that school considered the existing program of health education to be effective and believed that the onset of smoking was relatively uncommon among his students. The communities served by the two schools were very similar. The rate of parental smoking reported by the students was just above 40 percent in both schools."

(McAlister et al. 1980: 720)

In the initial set of observations, the experimental and control groups reported virtually the same (low) frequency of smoking. Over the 21 months of the study, smoking increased in both groups, but it increased less in the experimental group than in the control group, suggesting that the program affected students' behavior.

nonequivalent control group: A control group that is similar to the experimental group but is not created by the random assignment of subjects. This sort of control group differs significantly from the experimental group in terms of the dependent variable or variables related to it.

multiple time-series designs: The use of more than one set of data that were collected over time, as in accident rates over time in several states or cities, so that comparisons can be made.

Multiple Time-Series Designs

Sometimes the evaluation of processes occurring outside of "pure" experimental controls can be made easier by the use of more than one time-series analysis. Multiple time-series designs are an improved version of the nonequivalent control

group design just described, Carol Weiss has presented a useful example:

An interesting example of multiple time series was the evaluation of the Connecticut crackdown on highway speeding. Evaluators collected reports of traffic fatalities for several periods before and after the new program went into effect. They found that fatalities went down after the crackdown, but since the series had had an unstable up-and-down pattern for many years, it was not certain that the drop was due to the program. They then compared the statistics with time-series data from four neighboring states where there had been no changes in traffic enforcement. Those states registered no equivalent drop in fatalities. The comparison lent credence to the conclusion that the crackdown had had some effect.

(1972: 69)

Although this study design is not as good as one in which subjects are assigned randomly, it's nonetheless an improvement over assessing the experimental group's performance without any comparison. That's what makes these designs quasi experiments instead of just fooling around. The key in assessing this aspect of evaluation studies is comparability, as the following example illustrates.

A growing concern in the poor countries of the world, rural development has captured the attention and support of many rich countries. Through national foreign assistance programs and through international agencies such as the World Bank, the developed countries are in the process of sharing their technological knowledge and skills with the developing countries. Such programs have had mixed results, however. Often, modern techniques do not produce the intended results when applied in traditional societies.

Rajesh Tandon and L. Dave Brown (1981) undertook an experiment in which technological training would be accompanied by instruction in village organization. They felt it was important for poor farmers to learn how to organize and exert collective influence within their villages: getting needed action from government officials, for example. Only then would their new technological skills bear fruit.

Both intervention and evaluation were attached to an ongoing program in which 25 villages had been selected for technological training. Two poor farmers from each village had been trained in new agricultural technologies. Then they had been sent home to share their new knowledge with their village and to organize other farmers into "peer groups" who would assist in spreading that knowledge. Two years later, the authors randomly selected two of the 25 villages (subsequently called Group A and Group B) for special training and 11 other untrained groups as controls. A careful comparison of demographic characteristics showed the experimental and control groups to be strikingly similar, suggesting they were sufficiently comparable for the study.

The peer groups from the two experimental villages were brought together for special training in organization building. The participants were given some information about organizing and making demands on the government, and they were also given opportunities to act out dramas similar to the situations they faced at home. The training took three days.

The outcome variables considered by the evaluation all had to do with the extent to which members of the peer groups initiated group activities designed to improve their situation. Six types were studied. "Active initiative," for example, was defined as "active effort to influence persons or events affecting group members versus passive response or withdrawal" (Tandon and Brown 1981: 180). The data for evaluation came from the journals that the peer group leaders had been keeping since their initial technological training. The researchers read through the journals and counted the number of initiatives taken by members of the peer groups. Two researchers coded the journals independently and compared their work to test the reliability of the coding process.

Figure 12-3 compares the number of active initiatives by members of the two experimental groups with those coming from the control groups. Similar results were found for the other outcome measures.
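The reliability check mentioned above, in which two researchers code the same journals independently and then compare their work, can be sketched in a few lines. This is only an illustration, not Tandon and Brown's procedure: the text does not say which agreement statistic they used, and the category labels and codes below are hypothetical. The sketch computes simple percent agreement and Cohen's kappa, which adjusts agreement for what two coders would be expected to match on by chance.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items that both coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same, non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    categories = set(freq_a) | set(freq_b)
    # Chance agreement: product of each coder's marginal proportions, summed.
    p_expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for eight journal entries (not data from the study).
coder_a = ["active", "passive", "active", "active",
           "passive", "active", "passive", "active"]
coder_b = ["active", "passive", "active", "passive",
           "passive", "active", "passive", "active"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"Percent agreement: {agreement:.3f}")
print(f"Cohen's kappa:     {kappa:.3f}")
```

With these made-up codes the two coders disagree on one entry out of eight. Kappa comes out lower than raw agreement because some matches would occur by chance alone.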

[Figure: line graph of the number of active initiatives over time (1976-1977) for Group B (village with training), Group A (village with training), and Controls (11 villages without training).]

FIGURE 12-3 Active Initiatives over Time
Source: Rajesh Tandon and L. Dave Brown, "Organization-Building for Rural Development: An Experiment in India," Journal of Applied Behavioral Science, April-June 1981, p. 182.

Notice two things about the graph. First, there is a dramatic difference in the number of initiatives by the two experimental groups as compared with the eleven controls. This would seem to confirm the effectiveness of the special training program. Second, notice that the number of initiatives also increased among the control groups. The researchers explain this latter pattern as a result of contagion. Because all the villages were near each other, the lessons learned by peer group members in the experimental groups were communicated in part to members of the control villages.

This example illustrates the strengths of multiple time-series designs in situations where true experiments are inappropriate to the program being evaluated.

Qualitative Evaluations

Although I've laid out the steps involved in tightly structured, mostly quantitative evaluation research, evaluations can also be less structured and more qualitative. For example, Pauline Bart and Patricia

O'Brien (1985) wanted to evaluate different ways to stop rape, so they undertook in-depth interviews with rape victims and with women who had successfully fended off rape attempts. As a general rule, they found that resistance (e.g., yelling, kicking, running away) was more likely to succeed than to make the situation worse, as women sometimes fear it will.

Sometimes even structured quantitative evaluations can yield unexpected qualitative results. Paul Steel is a social researcher specializing in the evaluation of programs aimed at pregnant drug users. One program he evaluated involved counseling by public health nurses, who warned pregnant drug users that continued drug use would likely result in underweight babies whose skulls would be an average of 10 percent smaller than normal. In his in-depth interviews with program participants, however, he discovered that the program omitted one important piece of information: that undersized babies were a bad thing. Many of the young women Steel interviewed thought that smaller babies would mean easier deliveries.

In another program, a local district attorney had instituted what would generally be regarded as a progressive, enlightened program. If a pregnant drug user were arrested, she could avoid prosecution if she would (1) agree to stop using drugs and (2) successfully complete a drug-rehabilitation program. Again, in-depth interviews suggested that the program did not always operate on the ground the way it did in principle. Specifically, Steel discovered that whenever a young woman was arrested for drug use, her fellow inmates would advise her to get pregnant as soon as she was released on bail. That way, she would be able to avoid prosecution (personal communication, November 22, 1993).

The most effective evaluation research is one that combines qualitative and quantitative components. Making statistical comparisons is useful, and so is gaining an in-depth understanding of the processes producing the observed results, or preventing the expected results from appearing.

The evaluation of the Tanzanian soap opera, presented earlier in this chapter, employed several research techniques. I've already mentioned the listener surveys and data obtained from clinics. In addition, the researchers conducted numerous focus groups to probe more deeply into the impact the shows had on listeners. Also, content analyses were done on the soap opera episodes themselves and on the many letters received from listeners. Both quantitative and qualitative analyses were undertaken (Swalehe et al. 1995).

The soap opera research also offers an opportunity to see the impact of different cultures on the conduct of research. I had an opportunity to experience this firsthand when I consulted on the evaluation of soap operas being planned in Ethiopia. In contrast to the Western concern for confidentiality in social research, respondents selected for interviews in rural Ethiopian villages often took a special pride at being selected and wanted their answers broadly known in the community.

Or, sometimes, local researchers' desires to please the client got in the way of the evaluation. For example, some pilot episodes were tested in focus groups to determine whether listeners would recognize any of the social messages being communicated. The results were more encouraging than could have been expected. When I asked how the focus group subjects had been selected, the researcher described his introductory conversation: "We would like you to listen to some radio programs designed to encourage people to have small families, and we'd like you to tell us whether we've been successful." Not surprisingly, the small-family theme came through clearly to the focus group subjects.

These experiences, along with earlier comments in previous sections, have hinted at the possibility of problems in the actual execution of evaluation research projects. Of course, all forms of research can run into problems, but evaluation research has a special propensity for it, as we shall now explore further.

The Social Context

This section looks at some of the logistical problems and special ethical issues in evaluation research. It concludes with some observations about using evaluation research results.

Logistical Problems

In a military context, logistics refers to moving supplies around: making sure people have food, guns, and tent pegs when they need them. Here, I use it to refer to getting subjects to do what they're supposed to do, getting research instruments distributed and returned, and other seemingly simple tasks. These tasks are more challenging than you might guess!

Motivating Sailors

When Kent Crawford, Edmund Thomas, and Jeffrey Fink (1980) set out to find a way to motivate "low performers" in the U.S. Navy, they found out just how many problems can occur. The purpose of the research was to test a three-pronged program for motivating sailors who were chronically poor performers and often in trouble aboard ship. First, a workshop was to be held for supervisory personnel, training them in the effective leadership of low performers. Second, a few supervisors would be selected and trained as special counselors and role models: people the low performers could turn to for advice or just as sounding boards. Finally, the low performers themselves would participate in workshops aimed at training them to be more motivated and effective in their work and in their lives. The project was to be conducted aboard a particular ship, with a control group selected from sailors on four other ships.

To begin, the researchers reported that the supervisory personnel were not exactly thrilled with the program.

Not surprisingly, there was considerable resistance on the part of some supervisors toward dealing with these issues. In fact, their reluctance to assume ownership of the problem was reflected by "blaming" any of several factors that can contribute to their personnel problem. The recruiting system, recruit training, parents, and society at large were named as influencing low performance - factors that were well beyond the control of the supervisors.

(Crawford et al. 1980: 488)

Eventually, the reluctant supervisors came around and "this initial reluctance gave way to guarded optimism and later to enthusiasm" (1980: 489).

The low performers themselves were even more of a problem, however. The research design called for pre- and posttesting of attitudes and personalities, so that changes brought about by the program could be measured and evaluated.

Unfortunately, all of the LPs (Low Performers) were strongly opposed to taking these so-called personality tests and it was therefore concluded that the data collected under these circumstances would be of questionable validity. Ethical concerns also dictated that we not force "testing" on the LPs.

(Crawford et al. 1980: 490)

As a consequence, the researchers had to rely on interviews with the low performers and on the judgments of supervisors for their measures of attitude change. The subjects continued to present problems, however.

Initially, the ship's command ordered 15 low performers to participate in the experiment. Of the 15, however, one went into the hospital, another was assigned duties that prevented participation, and a third went "over the hill" (absent without leave). Thus, the experiment began with 12 subjects. But before it was completed, three more subjects completed their tour of duty and left the Navy, and another was thrown out for disciplinary reasons. The experiment concluded, then, with eight subjects. Although the evaluation pointed to positive results, the very small number of subjects warranted caution in any generalizations from the experiment.

The special, logistical problems of evaluation research grow out of the fact that it occurs within the context of real life. Although evaluation research is modeled after the experiment (which suggests that the researchers have control over what happens), it takes place within frequently uncontrollable daily life. Of course, the participant-observer in field research doesn't have control over what is observed either, but that method doesn't strive for control. Given the objectives of evaluation research,

lack of control can create real dilemmas for the researcher.

Administrative Control

As suggested in the previous example, the logistical details of an evaluation project often fall to program administrators. Let's suppose you're evaluating the effects of a "conjugal visit" program on the morale of married prisoners. The program allows inmates periodic visits from their spouses during which they can have sexual relations. On the fourth day of the program, a male prisoner dresses up in his wife's clothes and escapes. Although you might be tempted to assume that his morale was greatly improved by escaping, that turn of events would complicate your study design in many ways. Perhaps the warden will terminate the program altogether, and where's your evaluation then? Or, if the warden is brave, he or she may review the files of all those prisoners you selected randomly for the experimental group and veto the "bad risks." There goes the comparability of your experimental and control groups. As an alternative, stricter security measures may be introduced to prevent further escapes, but the security measures may have a dampening effect on morale. So the experimental stimulus has changed in the middle of your research project. Some of the data will reflect the original stimulus; other data will reflect the modification. Although you'll probably be able to sort it all out, your carefully designed study has become a logistical snake pit.

Or suppose you've been engaged to evaluate the effect of race-relations lectures on prejudice in the army. You've carefully studied the soldiers available to you for study, and you've randomly assigned some to attend the lectures and others to stay away. The rosters have been circulated weeks in advance, and at the appointed day and hour, the lectures begin. Everything seems to be going smoothly until you begin processing the files: The names don't match. Checking around, you discover that military field exercises, KP duty, and a variety of emergencies required some of the experimental subjects to be elsewhere at the time of the lectures. That's bad enough, but then you learn that helpful commanding officers sent others to fill in for the missing soldiers. And whom do you suppose they picked to fill in? Soldiers who didn't have anything else to do or who couldn't be trusted to do anything important. You might learn this bit of information a week or so before the deadline for submitting your final report on the impact of the race-relations lectures.

These are some of the logistical problems confronting evaluation researchers. You need to be familiar with the problems to understand why some research procedures may not measure up to the design of the classical experiment. As you read reports of evaluation research, however, you'll find that, my earlier comments notwithstanding, it is possible to carry out controlled social research in conjunction with real-life experiments.

Just as evaluation research has special logistical problems, it also raises special ethical concerns. Because ethical problems can affect the scientific quality of the research, we should look at them briefly.

Some Ethical Issues

Ethics and evaluation are intertwined in many ways. Sometimes the social interventions being evaluated raise ethical issues. Evaluating the impact of busing school children to achieve educational integration will throw the researchers directly into the political, ideological, and ethical issues of busing itself. It's not possible to evaluate a sex-education program in elementary schools without becoming involved in the heated issues surrounding sex education itself, and the researcher will find it difficult to remain impartial. The evaluation study design will require that some children receive sex education; in fact, you may very well be the one who decides which children do. (From a scientific standpoint, you should be in charge of selection.) This means that when parents become outraged that their child is being taught about sex, you'll be directly responsible.

Now let's look on the "bright" side. Maybe the experimental program is of great value to those participating in it. Let's say that the new industrial safety program being evaluated reduces injuries

dramatically. What about the control-group members who were deprived of the program by the research design? The evaluators' actions could be an important part of the reason that a control-group subject suffered an injury.

Sometimes the name of evaluation research has actually served as a mask for unethical behavior. In Chapter 9 I discussed push polls, which pretend to evaluate the impact of various political campaign accusations but intend to spread malicious misinformation. That's not the worst example, however.

In 1932, researchers in Tuskegee, Alabama, began a program of providing free treatment for syphilis to poor, African American men suffering from the disease. Over the years that followed, several hundred men participated in the program. What they didn't know was that they were not actually receiving any treatment at all; the physicians conducting the study merely wanted to observe the natural progress of the disease. Even after penicillin was found to be an effective cure, the researchers withheld the treatment. Although there is unanimous agreement today about the unethical nature of the study, this was not the case at the time. Even when the study began being reported in research publications, the researchers refused to acknowledge they had done anything wrong. When professional complaints were finally lodged with the U.S. Center for Disease Control in 1965, there was no reply (Jones 1981).

My purpose in these comments has not been to cast a shadow on evaluation research. Rather, I want to bring home the real-life consequences of the evaluation researcher's actions. Ultimately, all social research has ethical components.

Use of Research Results

One more facts-of-life aspect of evaluation research concerns how evaluations are used. Because the purpose of evaluation research is to determine the success or failure of social interventions, you might think it reasonable that a program would automatically be continued or terminated based on the results of the research.

Reality isn't that simple and reasonable, however. Other factors intrude on the assessment of evaluation research results, sometimes blatantly and sometimes subtly. As president, Richard Nixon appointed a blue-ribbon national commission to study the consequences of pornography. After a diligent, multifaceted evaluation, the commission reported that pornography didn't appear to have any of the negative social consequences often attributed to it. Exposure to pornographic materials, for example, didn't increase the likelihood of sex crimes. You might have expected liberalized legislation to follow from the research. Instead, the president said the commission was wrong.

Less-dramatic examples of the failure to follow the implications of evaluation research could be listed endlessly. Undoubtedly every evaluation researcher can point to studies he or she conducted (studies providing clear research results and obvious policy implications) that were ignored, as "The Impact of 'Three Strikes' Laws" illustrates.

There are three important reasons why the implications of the evaluation research results are not always put into practice. First, the implications may not always be presented in a way that the nonresearchers can understand. Second, evaluation results sometimes contradict deeply held beliefs. That was certainly true in the case of the pornography commission. If "everybody knows" that pornography causes all manner of sexual deviance, then research results to the contrary will probably have little immediate impact. By the same token, people thought Copernicus was crazy when he said the earth revolved around the sun. Anybody could tell the earth was standing still. The third barrier to the use of evaluation results is vested interests. If I've devised a new rehabilitation program that I'm convinced will keep ex-convicts from returning to prison, and if people have taken to calling it "The Babbie Plan," how do you think I'm going to feel when your evaluation suggests the program doesn't work? I might apologize for misleading people, fold up my tent, and go into another line of work. More likely, I'd call your research worthless and begin intense lobbying with the appropriate authorities to have my program continue.

The Impact of "Three Strikes" Laws

SACRAMENTO (AP) - The author of California's five-year-old "three strikes" law says it's prevented more than a million crimes and has saved $21.7 billion.

Secretary of State Bill Jones offered his interpretation of the "three strikes" results to a Doris Tate Crime Victims Bureau conference on Friday in Sacramento.

(BayInsider, March 1, 1999)

The 1990s saw the passage of "three strikes" laws at the federal level and in numerous states. The intention was to reduce crime rates by locking up "career criminals." Under the 1994 California law, for example, having a past felony conviction would double your punishment when you were convicted of your second felony, and the third felony conviction would bring a mandatory sentence of 25 years to life. Over the years, only California has enforced such laws with any vigor.

Those who supported the passage of "three strikes" legislation, such as Bill Jones, quoted earlier, were quick to link the dramatic drop in crime rates during the 1990s to the new policy of getting tough with career criminals. While acknowledging that "three strikes" may not be the only cause of the drop in crime, Jones added, "If you can have a 51 percent reduction in the homicide rate in five years, I would guarantee you three strikes is a big part of the reason."

In spite of the politicians' guarantees, other observers have looked for additional evidence to support the impact of "three strikes" laws. Some critics of these laws, for example, have noted that crime rates have been dropping dramatically across the country, not only in California but in states that have no "three strikes" laws and in those where the courts have not enforced the "three strikes" laws that exist. In fact, crime rates have dropped in those California counties that have tended to ignore that state's law. Moreover, the drop in California crime rates began before the "three strikes" law went into effect.

In 1994, Peter Greenwood and his colleagues at the Rand Corporation estimated that implementation of the law would cost California's criminal justice system approximately $5.5 billion more per year, especially in prison costs as "career criminals" were sentenced to longer terms. Although the Rand group did not deny that the "three strikes" legislation would have some impact on crime (those serving long terms in prison can't commit crimes on the streets), a follow-up study (Greenwood, Rydell, and Model 1996) suggested it was an inefficient way of attacking crime. They estimated that a million dollars spent on "three strikes" would prevent 60 crimes, whereas the same amount spent on programs encouraging high school students to stay in school and graduate would prevent 258 crimes.

Criminologists have long recognized that most crimes are committed by young men. Focusing attention on older "career criminals" has little or no effect on the youthful offenders. In fact, "three strikes" sentences disproportionately fall on those approaching the end of their criminal careers by virtue of growing older.

In a more general critique, John Irwin and James Austin (1997) suggest that people in the United States tend to overuse prisons as a solution to crime, ignoring other, more effective, solutions. Often, imprisonment causes problems more serious than those it was intended to remedy.

As with many other social interventions, however, much of the support for "three strikes" laws in California and elsewhere has mostly to do with public emotions about crime and the political implications of such emotions. Thus, evaluation research on these laws may eventually bring about changes, but its impact is likely to be much slower than you might logically expect.

Sources: ... Karyn Model, Diverting Children from a ... (Santa Monica, CA: Rand Corporation, 1996); Peter W. Greenwood et al., Three Strikes and You're Out: Estimated Benefits and Costs of ... (Santa Monica, CA: Rand Corporation, 1994); John Irwin and James Austin, ... (Belmont, CA: Wadsworth, 1997); "State Saved $21.7 Billion with Five-Year-Old 'Three Strikes' Law," BayInsider, March 1, 1999.

In the earlier example of the evaluation of drivers' education, Philip Hilts reported some of the reactions to the researchers' preliminary results:

Ray Burneson, traffic safety specialist with the National Safety Council, criticized the study, saying that it was a product of a group (NHTSA) run by people who believe "that you can't do anything to train drivers. You can only improve medical facilities and build stronger cars for when the accidents happen. . . . This knocks the whole philosophy of education."

(1981: 4)

By its nature, evaluation research takes place in the midst of real life, affecting it and being affected

by it. Here's another example, well known to social researchers.

Rape Reform Legislation

For years, many social researchers and other observers have noted certain problems with the prosecution of rape cases. All too often, it is felt, the victim ends up suffering almost as much on the witness stand as in the rape itself. Frequently the defense lawyers portray her as having encouraged the sex act and being of shady moral character; other personal attacks are intended to deflect responsibility from the accused rapist.

Criticisms such as these have resulted in a variety of state-level legislation aimed at remedying the problems. Cassie Spohn and Julie Horney (1990) were interested in tracking the impact of such legislation. The researchers summarize the ways in which new laws were intended to make a difference:

The most changes are: (1) redefining rape and replacing the single crime of rape with a series of graded offenses defined by the presence or absence of aggravating conditions; (2) changing the consent standard by eliminating the requirement that the victim physically resist her attacker; (3) eliminating the requirement that the victim's testimony be corroborated; and (4) placing restrictions on the introduction of evidence of the victim's prior sexual conduct.

(1990: 2)

It was generally expected that such legislation would encourage women to report being raped and would increase convictions when the cases were brought to court. To examine the latter expectation, the researchers focused on the period from 1970 to 1985 in Cook County, Illinois: "Our data file includes 4,628 rape cases, 405 deviate sexual assault cases, 745 aggravated criminal sexual assault cases, and 37 criminal sexual assault cases" (1990: 4). Table 12-1 shows some of what they discovered.

TABLE 12-1
Analysis of Rape Cases Before and After Legislation

                                                    Rape
Outcome of case                             Before         After
                                            (N = 2,252)    (N = 2,369)
Convicted of original charge                45.8%          45.4%
Convicted of another charge                 20.6           19.4
Not convicted                               33.6           35.1

Median prison sentence in months
For those convicted of original charge      96.0           144.0
For those convicted of another charge       36.0           36.0

Spohn and Horney summarized these findings as follows:

The only significant effects revealed by our analyses were increases in the average maximum prison sentences; there was an increase of almost 48 months for rape and of almost 36 months for sex offenses. Because plots of the data indicated an increase in the average sentence before the reform took effect, we modeled the series with the intervention moved back one year earlier than the actual reform date. The size of the effect was even larger and still significant, indicating that the effect should not be attributed to the legal reform.

(1990: 10)

Notice in the table that there was virtually no change in the percentages of cases ending in conviction for rape or some other charge (e.g., assault). Hence the change in laws didn't have any effect on the likelihood of conviction. As the researchers note, the one change that is evident (an increase in the length of sentences) cannot be attributed to the reform legislation itself.

In addition to the analysis of existing statistics, Spohn and Horney interviewed judges and lawyers to determine what they felt about the impact of the laws. Their responses were somewhat more encouraging.

Judges, prosecutors and defense attorneys in Chicago stressed that rape cases are taken more seriously and rape victims treated more humanely as a result of the legal changes.

These educative effects clearly are important and should please advocates of rape reform legislation.

(1990: 17)

Thus, the study found other effects besides the qualitative results the researchers looked for. This study demonstrates the importance of following up on social interventions to determine whether, in what ways, and to what degree they accomplished their intended results.

Preventing Wife Battering

In a somewhat similar study, researchers in Indianapolis focused their attention on the problem of wife battering, with a special concern for whether prosecuting the batterers can lead to subsequent violence. David Ford and Mary Jean Regoli (1992) set about studying the consequences of various options for prosecution allowed within the "Indianapolis Prosecution Experiment" (IPE).

Wife-battering cases can follow a variety of patterns, as Ford and Regoli summarize:

After a violent attack on a woman, someone may or may not call the police to the scene. If the police are at the scene, they are expected to investigate for evidence to support probable cause for a warrantless arrest. If it exists, they may arrest at their discretion. Upon making such an on-scene arrest, officers fill out a probable cause affidavit and slate the suspect into court for an initial hearing. When the police are not called, or if they are called but do not arrest, a victim may initiate charges on her own by going to the prosecutor's office and swearing out a probable cause affidavit with her allegation against the man. Following a judge's approval, the alleged batterer may either be summoned to court or be arrested on a warrant and taken to court for his initial hearing.

(1992: 184)

What if a wife brings charges against her husband and then reconsiders later on? Many courts have a policy of prohibiting such actions, in the belief that they are serving the interests of the victim by forcing the case to be pursued to completion. In the IPE, however, some victims are offered the possibility of dropping the charges if they so choose later in the process. In addition, the court offers several other options. Because wife battering is largely a function of sexism, stress, and an inability to deal with anger, some of the innovative possibilities in the IPE involve educational classes with anger-control counseling.

If the defendant admits his guilt and is willing to participate in an anger-control counseling program, the judge may postpone the trial for that purpose and can later dismiss the charges if the defendant successfully completes the program. Alternatively, the defendant may be tried and, if found guilty, be granted probation provided he participates in the anger-control program. Finally, the defendant can be tried and, if found guilty, be given a conventional punishment such as imprisonment.

Which of these possibilities most effectively prevents subsequent wife battering? That's the question Ford and Regoli addressed. Here are some of their findings.

First, their research shows that men who are brought to court for a hearing are less likely to continue beating their wives, no matter what the outcome of the hearing. Simply being brought into the criminal justice system has an impact.

Second, women who have the right to drop charges later on are less likely to be abused subsequently than those who do not have that right. In particular, the combined policies of arresting defendants by warrant and allowing victims to drop charges provides victims with greater security from subsequent violence than any of the other prosecution policies do (Ford and Regoli 1992).

However, giving victims the right to drop charges has a somewhat strange impact. Women who exercise that right are more likely to be abused later than those who insist on the prosecution proceeding to completion. The researchers interpret this as showing that future violence can be decreased when victims have a sense of control supported by a clear and consistent alliance with criminal justice agencies.

A decisive system response to any violation of conditions for pretrial release, including of

course new violence, should serve notice that the victim-system alliance is strong. It tells the defendant that the victim is serious in her resolve to end the violence and that the system is unwavering in its support of her interest in securing protection.

(Ford and Regoli 1992: 204)

The effectiveness of anger-control counseling cannot be assessed simply. Policies aimed at getting defendants into anger-control counseling seem to be relatively ineffective in preventing new violence. The researchers noted, however, that the policy effects should not be confused with actual counseling outcomes. Some defendants scheduled for treatment never received it. Considerably more information on implementing counseling is needed for a proper evaluation.

Moreover, the researchers cautioned that their results point to general patterns, and that battered wives must choose courses of action appropriate to their particular situations and should not act blindly on the basis of the overall patterns. The research is probably more useful in what it says about ways of structuring the criminal justice system (giving victims the right to drop charges, for example) than in guiding the actions of individual victims.

Finally, the IPE offers an example of a common problem in evaluation research. Often, actual practices differ from what might be expected in principle. For example, the researchers considered the impact of different alternatives for bringing suspects into court: Specifically, the court can issue either a summons ordering the husband to appear in court or a warrant to have the husband arrested. The researchers were concerned that having the husband arrested might actually add to his anger over the situation. They were somewhat puzzled, therefore, to find no difference in the anger of husbands summoned or arrested.

The solution of the puzzle lay in the discrepancy between principle and practice:

Although a warrant arrest should in principle be at least as punishing as on-scene arrest, in practice it may differ little from a summons. A man usually knows about a warrant for his arrest and often elects to turn himself in at his convenience, or he is contacted by the warrant service agency and invited to turn himself in. Thus, he may not experience the obvious punishment of, say, being arrested, handcuffed, and taken away from a workplace.

(Ford 1989: 9-10)

In summary, many factors besides the scientific quality of evaluation research affect how its results are used. And, as we saw earlier, factors outside the evaluator's control can affect the quality of the study itself. But this "messiness" is balanced by the potential contributions that evaluation research can make toward the betterment of human life.

Social Indicators Research

I want to conclude this chapter with a type of research that combines what you've learned about evaluation research and about the analysis of existing data. A rapidly growing field in social research involves the development and monitoring of social indicators, aggregated statistics that reflect the social condition of a society or social subgroup. Researchers use social indicators to monitor aspects of social life in much the way that economists use indexes such as gross national product (GNP) per capita as an indicator of a nation's economic development.

social indicators: Measurements that reflect the quality or nature of social life, such as crime rates, infant mortality rates, number of physicians per 100,000 population, and so forth. Social indicators are often monitored to determine the nature of social change in a society.

Suppose we wanted to compare the relative health conditions in different societies. One strategy would be to compare their death rates (number of deaths per 1,000 population). Or, more specifically, we could look at infant mortality: the number of infants who die during their first year of life among every 1,000 births. Depending on the particular aspect of health conditions we were interested in, we could devise any number of other measures: physicians per capita, hospital beds per capita, days of

hospitalization per capita, and so forth. Notice that intersocietal comparisons are facilitated by calculating per capita rates (dividing by the size of the population).

Before we go further, recall from Chapter 11 the problems involved in using existing statistics. In a word, they're often unreliable, reflecting their modes of collection, storage, and calculation. With this in mind, we'll look at some of the ways we can use social indicators for evaluation research on a large scale.

The Death Penalty and Deterrence

Does the death penalty deter capital crimes such as murder? This question is hotly debated every time a state considers eliminating or reinstating capital punishment and every time someone is executed. Those supporting capital punishment often argue that the threat of execution will deter potential murderers from killing people. Opponents of capital punishment often argue that it has no effect in that regard. Social indicators can help shed some light on the question.

If capital punishment actually deters people from committing murder, then we should expect to find murder rates lower in those states that have the death penalty than in those that do not. The relevant comparisons in this instance are not only possible, they've been compiled and published. Table 12-2 presents data compiled by William Bailey (1975) that directly contradict the view that the death penalty deters murderers. In both 1967 and 1968, those states with capital punishment had dramatically higher murder rates than those without capital punishment did. Some people criticized the interpretation of Bailey's data, saying that most states had not used the death penalty in recent years, even when they had it on the books. That could explain why it didn't seem to work as a deterrent. Further analysis, however, contradicts this explanation. When Bailey compared those states that hadn't used the death penalty with those that had, he found no real difference in murder rates.

TABLE 12-2
Average Rate per 100,000 Population of First- and Second-Degree Murders for Capital-Punishment and Non-Capital-Punishment States, 1967 and 1968

                         Non-Capital-Punishment States    Capital-Punishment States
                         1967        1968                 1967        1968
First-degree murder      0.18        0.21                 0.47        0.58
Second-degree murder     0.30        0.43                 0.92        1.03
Total murders            0.48        0.64                 1.38        1.59

Source: Adapted from William C. Bailey, "Murder and Capital Punishment," in William J. Chambliss, ed., Criminal Law in Action. Copyright © 1975 by John Wiley & Sons, Inc. Used by permission.

Another counterexplanation is possible, however. It could be the case that the interpretation given Bailey's data was backward. Maybe the existence of the death penalty as an option was a consequence of high murder rates: Those states with high rates instituted it; those with low rates didn't institute it or repealed it if they had it on the books. It could be the case, then, that instituting the death penalty would bring murder rates down, and repealing it would increase murders and still produce, in a broad aggregate, the data presented in Table 12-2. Not so, however. Analyses over time do not show an increase in murder rates when a state repeals the death penalty nor a decrease in murders when one is instituted.

Notice from the preceding discussion that it's possible to use social indicators data for comparison across groups either at one time or across some period of time. Often, doing both sheds the most light on the subject.

Though overall murder rates have increased substantially, by the way, the pattern observed by Bailey in 1967 and 1968 has persisted over time. In 1999, for example, the 38 death-penalty states had a combined murder rate of 5.86 per 100,000, compared with a combined murder rate of 3.84 among the 12 states that lack the death penalty (U.S. Bureau of the Census 2001: 22, 183).

At present, work on the use of social indicators is proceeding on two fronts. On the one hand, researchers are developing ever more-refined
indicators, finding which indicators of a general variable are the most useful in monitoring social life. At the same time, research is being devoted to discovering the relationships among variables within whole societies.

As with many aspects of social research, the World Wide Web has become a valuable resource. To pursue the possibilities of social indicators, you might check out Sociometrics Corporation (http://www.socio.com/), for example. Or simply search for "social indicators" using one of the web search engines.

Computer Simulation

One of the most exciting prospects for social indicators research lies in the area of computer simulation. As researchers begin compiling mathematical equations describing the relationships that link social variables to one another (for example, the relationship between growth in population and the number of automobiles), those equations can be stored and linked to one another in a computer. With a sufficient number of adequately accurate equations on tap, researchers one day will be able to test the implications of specific social changes by computer rather than in real life.

Suppose a state contemplated doubling the size of its tourism industry, for example. We could enter that proposal into a computer-simulation model and receive in seconds or minutes a description of all the direct and indirect consequences of the increase in tourism. We could know what new public facilities would be required, which public agencies such as police and fire departments would have to be increased and by how much, what the labor force would look like, what kind of training would be required to provide it, how much new income and tax revenue would be produced, and so forth, through all the intended and unintended consequences of the action. Depending on the results, the public planners might say, "Suppose we increased the industry only by half," and have a new printout of consequences immediately.

An excellent illustration of computer simulation linking social and physical variables is to be found in the research of Donella and Dennis Meadows and their colleagues at Dartmouth and Massachusetts Institute of Technology (Meadows et al. 1972, 1992). They've taken as input data known and estimated reserves of various nonreplaceable natural resources (for example, oil, coal, iron), past patterns of population and economic growth, and the relationships between growth and use of resources. Using a complex computer-simulation model, they've been able to project, among other things, the probable number of years various resources will last in the face of alternative usage patterns in the future. Going beyond the initially gloomy projections, such models also make it possible to chart out less gloomy futures, specifying the actions required to achieve them. Clearly, the value of computer simulation is not limited to evaluation research, though it can serve an important function in that regard.

This potentiality points to the special value of evaluation research in general. Throughout human history, we've been tinkering with our social arrangements, seeking better results. Evaluation research provides a means for us to learn right away whether a particular tinkering really makes things better. Social indicators allow us to make that determination on a broad scale; coupling them with computer simulation opens up the possibility of knowing how much we would like a particular intervention, without having to experience its risks.

MAIN POINTS

Introduction/Topics Appropriate to Evaluation Research

• Evaluation research is a form of applied research that studies the effects of social interventions.

Formulating the Problem: Issues of Measurement

• A careful formulation of the problem, including relevant measurements and criteria of success or failure, is essential in evaluation research. In particular, evaluators must carefully specify outcomes, measure experimental contexts,
specify the intervention being studied and the population targeted by the intervention, and decide whether to use existing measures or devise new ones.

Types of Evaluation Research Designs

• Evaluation researchers typically use experimental or quasi-experimental designs. Examples of quasi-experimental designs include time-series studies and the use of nonequivalent control groups.

• Evaluators can also use qualitative methods of data collection. Both quantitative and qualitative data analyses can be appropriate in evaluation research, sometimes in the same study.

The Social Context

• Evaluation research entails special logistical and ethical problems because it's embedded in the day-to-day events of real life.

• The implications of evaluation research won't necessarily be put into practice, especially if they conflict with official points of view.

Social Indicators Research

• Social indicators can provide an understanding of broad social processes.

• Computer-simulation models hold the promise of allowing researchers to study the possible results of social interventions without having to incur those results in real life.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

cost-benefit studies
evaluation research
monitoring studies
multiple time-series designs
needs assessment studies
nonequivalent control group
program evaluation/outcome assessment
quasi experiments
social indicators
time-series design

REVIEW QUESTIONS AND EXERCISES

1. Suppose a community establishes an alcohol- and drug-free teen center as a way of reducing the use of alcohol and drugs by teenagers. Describe how you might go about evaluating the effectiveness of the center. Indicate whether your design would be experimental, quasi-experimental, or qualitative (or some combination of these).

2. Review the evaluation of the Navy low-performer program discussed in the chapter. Redesign the program and the evaluation to handle the problems that appeared in the actual study.

3. Discuss some of the potential political and ethical issues that might be involved in the study you described in Exercise 1.

4. Take a minute to think of the many ways your society has changed during your own lifetime. Specify three or four social indicators that could be used in monitoring the effects of at least one of those changes on the quality of life in your society.

5. The U.S. Bureau of Prisons engages in evaluation research into various aspects of prison operations. Locate one of their studies and write a short summary of the study design and the findings. See http://www.bop.gov/news/research_projects.jsp.

ADDITIONAL READINGS

Berg, Richard, and Peter H. Rossi. 1998. Thinking about Program Evaluation. Thousand Oaks, CA: Sage. A great book if you want a good foundation in evaluation research along with a wide range of examples.

Bickman, Leonard, and Debra J. Rog, eds. 1998. Handbook of Applied Social Research Methods. Thousand Oaks, CA: Sage. The two editors of this book have provided examples that illustrate all stages in evaluation, from planning through the collection and analysis of data. In addition, they cover ethical issues in the particular context of evaluation research.

Chen, Huey-Tsyh. 1990. Theory-Driven Evaluations. Newbury Park, CA: Sage. Chen argues that evaluation research must be firmly based in theory if it is to be meaningful and useful.

Cunningham, J. Barton. 1993. Action Research and Organizational Development. Westport,
CT: Praeger. This book urges researchers to bridge the gap between theory and action, becoming engaged participants in the evolution of organizational life and using social research to monitor problems and solutions.

Hedrick, Terry E., Leonard Bickman, and Debra J. Rog. 1993. Applied Research Design: A Practical Guide. Newbury Park, CA: Sage. This introduction to evaluation research is, as its subtitle claims, a practical guide, dealing straight-out with the compromises that must usually be made in research design and execution.

Rossi, Peter H., and Howard E. Freeman. 1996. Evaluation: A Systematic Approach. Newbury Park, CA: Sage. This thorough examination of evaluation research is an excellent resource. In addition to discussing the key concepts of evaluation research, the authors provide numerous examples that can help you guide your own designs.

Swanson, David A., and Louis G. Pol. 2004. "Contemporary Developments in Applied Demography with the United States." Joint issue of Journal of Applied Sociology 21 (2) and Sociological Practice 6 (2): 26-56. Demography offers tools for the evaluation of many macro-level programs, and the authors detail recent trends to strengthen the applied aspects of this field.

SPSS EXERCISES

See the booklet that accompanies your text for exercises using SPSS (Statistical Package for the Social Sciences). There are exercises offered for each chapter, and you'll also find a detailed primer on using SPSS.

Online Study Resources

SociologyNow: Research Methods

1. Before you do your final review of the chapter, take the SociologyNow: Research Methods diagnostic quiz to help identify the areas on which you should concentrate. You'll find information on this online tool, as well as instructions on how to access all of its great resources, in the front of the book.

2. As you review, take advantage of the SociologyNow: Research Methods customized study plan, based on your quiz results. Use this study plan with its interactive exercises and other resources to master the material.

3. When you're finished with your review, take the posttest to confirm that you're ready to move on to the next chapter.

WEBSITE FOR THE PRACTICE OF SOCIAL RESEARCH 11TH EDITION

Go to your book's website at http://sociology.wadsworth.com/babbie_practice11e for tools to aid you in studying for your exams. You'll find Tutorial Quizzes with feedback, Internet Exercises, Flashcards, and Chapter Tutorials, as well as Extended Projects, InfoTrac College Edition search terms, Social Research in Cyberspace, GSS Data, Web Links, and primers for using various data-analysis software such as SPSS and NVivo.

WEB LINKS FOR THIS CHAPTER

Please realize that the Internet is an evolving entity, subject to change. Nevertheless, these few websites should be fairly stable. Also, check your book's website for even more Web Links. These websites, current at the time of this book's publication, provide opportunities to learn about evaluation research.

ERIC, Clearinghouse on Assessment and Evaluation
http://ericae.net/
The Educational Resources Information Center (ERIC) provides a powerful resource for evaluation research in education.

Evaluation Exchange
http://www.gse.harvard.edu/hfrp/eval.html
This is a free, online journal covering evaluation methodology, sponsored by the Harvard Family Research Project.

UNICEF, Research and Evaluation
http://www.unicef.org/evaluation/index.html
This will tell you how the United Nations evaluates its many programs aimed at improving life conditions for women and children around the world.

U.S. General Accounting Office, Evaluation Research and Methodology
http://www.gao.gov/special.pubs/erm.html
As an example of your tax dollars at work for you, this will give you a glimpse of how highly evaluation research is regarded by this congressional watchdog agency.
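
Returning to the per capita comparisons described in this chapter's section on social indicators, the arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names rate_per and rate_ratio and the death and population counts for the two societies are hypothetical, invented for the example. The murder rates are the "Total murders" figures from Table 12-2.

```python
def rate_per(events, population, base=1_000):
    """Return the number of events per `base` members of a population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * base / population


def rate_ratio(rates, year):
    """Ratio of the capital-punishment rate to the non-capital-punishment rate."""
    return rates["capital"][year] / rates["non_capital"][year]


# Hypothetical raw counts for two societies of very different sizes.
societies = {
    "Society A": {"deaths": 90_000, "population": 10_000_000},
    "Society B": {"deaths": 12_000, "population": 1_000_000},
}

# "Total murders" rates per 100,000 population, copied from Table 12-2.
total_murder_rates = {
    "non_capital": {1967: 0.48, 1968: 0.64},
    "capital": {1967: 1.38, 1968: 1.59},
}

if __name__ == "__main__":
    # Dividing by population size makes societies of different sizes comparable.
    for name, counts in societies.items():
        rate = rate_per(counts["deaths"], counts["population"])
        print(f"{name}: {rate:.1f} deaths per 1,000 population")

    # Compare the two groups of states within each year.
    for year in (1967, 1968):
        ratio = rate_ratio(total_murder_rates, year)
        print(f"{year}: capital-punishment rate is {ratio:.1f} times the other rate")
```

In this made-up example, Society A records more deaths in absolute terms, but Society B has the higher death rate once population size is taken into account. The ratios for the murder rates describe a difference between groups of states; by themselves they say nothing about its cause.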

In this part of the book, we'll discuss the analysis of social research data, and we'll examine the steps that separate observation from the final reporting of findings.

In Chapter 1, I made a fundamental distinction between qualitative and quantitative data. In the subsequent discussions, we've seen that many of the fundamental concerns in social research apply equally to both types of data. The analyses of qualitative and quantitative data, however, are quite different and will be discussed separately.

Before outlining the specifics of Part 4, I want to offer an observation about the ease or difficulty of producing high-quality data analyses, as represented in the following table, where "1" is the easiest to do and "4" is the hardest.

                Simplistic    Sophisticated
Qualitative     1             4
Quantitative    2             3

