introduction_to_categorical_data_analysis_805

Published by orawansa, 2019-07-09 08:41:05

2.3 THE ODDS RATIO

Because the sampling distribution is closer to normality for $\log\hat{\theta}$ than for $\hat{\theta}$, it is better to construct confidence intervals for $\log\theta$. Transform back (that is, take antilogs, using the exponential function, discussed below) to form a confidence interval for $\theta$. A large-sample confidence interval for $\log\theta$ is

$$\log\hat{\theta} \pm z_{\alpha/2}(SE)$$

Exponentiating endpoints of this confidence interval yields one for $\theta$.

For Table 2.3, the natural log of $\hat{\theta}$ equals $\log(1.832) = 0.605$. From (2.5), the SE of $\log\hat{\theta}$ equals

$$SE = \sqrt{\frac{1}{189} + \frac{1}{10{,}933} + \frac{1}{104} + \frac{1}{10{,}845}} = 0.123$$

For the population, a 95% confidence interval for $\log\theta$ equals $0.605 \pm 1.96(0.123)$, or $(0.365, 0.846)$. The corresponding confidence interval for $\theta$ is

$$[\exp(0.365), \exp(0.846)] = (e^{0.365}, e^{0.846}) = (1.44, 2.33)$$

[The symbol $e^x$, also expressed as $\exp(x)$, denotes the exponential function evaluated at $x$. The exponential function is the antilog for the logarithm using the natural log scale.² This means that $e^x = c$ is equivalent to $\log(c) = x$. For instance, $e^0 = \exp(0) = 1$ corresponds to $\log(1) = 0$; similarly, $e^{0.7} = \exp(0.7) = 2.0$ corresponds to $\log(2) = 0.7$.]

Since the confidence interval $(1.44, 2.33)$ for $\theta$ does not contain 1.0, the true odds of MI seem different for the two groups. We estimate that the odds of MI are at least 44% higher for subjects taking placebo than for subjects taking aspirin. The endpoints of the interval are not equally distant from $\hat{\theta} = 1.83$, because the sampling distribution of $\hat{\theta}$ is skewed to the right.

The sample odds ratio $\hat{\theta}$ equals 0 or $\infty$ if any $n_{ij} = 0$, and it is undefined if both entries in a row or column are zero. The slightly amended estimator

$$\tilde{\theta} = \frac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$$

corresponding to adding 1/2 to each cell count, is preferred when any cell counts are very small. In that case, the SE formula (2.5) replaces $\{n_{ij}\}$ by $\{n_{ij} + 0.5\}$.

² All logarithms in this text use this natural log scale, which has $e = e^1 = 2.718\ldots$ as the base. To find $e^x$ on pocket calculators, enter the value for $x$ and press the $e^x$ key.
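The interval arithmetic above can be retraced numerically. The following is a minimal Python sketch, not taken from the text: the function name `odds_ratio_ci`, its `add_half` switch, and the cell layout (189 and 10,845 in one row, 104 and 10,933 in the other, as implied by the SE formula) are assumptions made for illustration, and 1.96 is used for $z_{\alpha/2}$ at the 95% level.

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96, add_half=False):
    """Sample odds ratio with a large-sample interval built on the log scale."""
    if add_half:
        # Amended estimator: add 1/2 to every cell, in the estimate and in the SE.
        n11, n12, n21, n22 = (c + 0.5 for c in (n11, n12, n21, n22))
    theta = (n11 * n22) / (n12 * n21)
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)  # SE of log(theta-hat)
    log_theta = math.log(theta)
    # Build the interval for log(theta), then exponentiate the endpoints.
    lower = math.exp(log_theta - z * se)
    upper = math.exp(log_theta + z * se)
    return theta, se, (lower, upper)

# Cell counts assumed from the SE formula above (layout is an assumption).
theta, se, (lower, upper) = odds_ratio_ci(189, 10845, 104, 10933)
print(f"theta-hat = {theta:.3f}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")

theta_tilde, _, _ = odds_ratio_ci(189, 10845, 104, 10933, add_half=True)
print(f"theta-tilde = {theta_tilde:.3f}")
```

Working on the log scale and exponentiating at the end is what makes the resulting interval asymmetric about $\hat{\theta}$, in line with the right skew noted above.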

For Table 2.3, $\tilde{\theta} = (189.5 \times 10{,}933.5)/(10{,}845.5 \times 104.5) = 1.828$ is close to $\hat{\theta} = 1.832$, since no cell count is especially small.

2.3.4 Relationship Between Odds Ratio and Relative Risk

A sample odds ratio of 1.83 does not mean that $p_1$ is 1.83 times $p_2$. That's the interpretation of a relative risk of 1.83, since that measure is a ratio of proportions rather than odds. Instead, $\hat{\theta} = 1.83$ means that the odds value $p_1/(1 - p_1)$ is 1.83 times the odds value $p_2/(1 - p_2)$.

From equation (2.4) and from the sample analog of definition (2.2),

$$\text{Odds ratio} = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \text{Relative risk} \times \frac{1 - p_2}{1 - p_1}$$

When $p_1$ and $p_2$ are both close to zero, the fraction in the last term of this expression equals approximately 1.0. The odds ratio and relative risk then take similar values. Table 2.3 illustrates this similarity. For each group, the sample proportion of MI cases is close to zero. Thus, the sample odds ratio of 1.83 is similar to the sample relative risk of 1.82 that Section 2.2.3 reported. In such a case, an odds ratio of 1.83 does mean that $p_1$ is approximately 1.83 times $p_2$.

This relationship between the odds ratio and the relative risk is useful. For some data sets direct estimation of the relative risk is not possible, yet one can estimate the odds ratio and use it to approximate the relative risk, as the next example illustrates.

2.3.5 The Odds Ratio Applies in Case–Control Studies

Table 2.4 refers to a study that investigated the relationship between smoking and myocardial infarction. The first column refers to 262 young and middle-aged women (age < 69) admitted to 30 coronary care units in northern Italy with acute MI during a 5-year period. Each case was matched with two control patients admitted to the same hospitals with other acute disorders. The controls fall in the second column of the table. All subjects were classified according to whether they had ever been smokers.
The “yes” group consists of women who were current smokers or ex-smokers, whereas the “no” group consists of women who never were smokers. We refer to this variable as smoking status.

Table 2.4. Cross Classification of Smoking Status and Myocardial Infarction

Ever Smoker    MI Cases    Controls
Yes            172         173
No              90         346

Source: A. Gramenzi et al., J. Epidemiol. Community Health, 43: 214–217, 1989. Reprinted with permission by BMJ Publishing Group.

We would normally regard MI as a response variable and smoking status as an explanatory variable. In this study, however, the marginal distribution of MI is fixed by the sampling design, there being two controls for each case. The outcome measured for each subject is whether she ever was a smoker. The study, which uses a retrospective design to look into the past, is called a case–control study. Such studies are common in health-related applications, for instance to ensure a sufficiently large sample of subjects having the disease studied.

We might wish to compare ever-smokers with nonsmokers in terms of the proportion who suffered MI. These proportions refer to the conditional distribution of MI, given smoking status. We cannot estimate such proportions for this data set. For instance, about a third of the sample suffered MI. This is because the study matched each MI case with two controls, and it does not make sense to use 1/3 as an estimate of the probability of MI. We can estimate proportions in the reverse direction, for the conditional distribution of smoking status, given myocardial infarction status. For women suffering MI, the proportion who ever were smokers was 172/262 = 0.656, while it was 173/519 = 0.333 for women who had not suffered MI.

When the sampling design is retrospective, we can construct conditional distributions for the explanatory variable, within levels of the fixed response. It is not possible to estimate the probability of the response outcome of interest, or to compute the difference of proportions or relative risk for that outcome. Using Table 2.4, for instance, we cannot estimate the difference between nonsmokers and ever smokers in the probability of suffering MI. We can compute the odds ratio, however.
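The reverse-direction proportions quoted above can be reproduced from the four counts in Table 2.4. The short Python sketch below is only an illustration: the dictionary layout and variable names are assumptions, not part of the study. It also forms the odds ratio from those column proportions and compares it with the cross-product ratio of the counts.

```python
# Counts from Table 2.4: keys are (smoking status, MI status).
table = {
    ("ever", "case"): 172, ("ever", "control"): 173,
    ("never", "case"): 90, ("never", "control"): 346,
}

# Column totals, which the sampling design fixed.
cases = table[("ever", "case")] + table[("never", "case")]
controls = table[("ever", "control")] + table[("never", "control")]

# Conditional distribution of smoking status given MI status (estimable here).
p_smoker_given_case = table[("ever", "case")] / cases
p_smoker_given_control = table[("ever", "control")] / controls

def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

# Odds ratio built from the column proportions ...
or_from_columns = odds(p_smoker_given_case) / odds(p_smoker_given_control)
# ... and directly as the cross-product ratio of the four counts.
or_cross_product = (172 * 346) / (173 * 90)

print(f"P(ever smoker | MI case) = {p_smoker_given_case:.3f}")
print(f"P(ever smoker | control) = {p_smoker_given_control:.3f}")
print(f"odds ratio: {or_from_columns:.2f} (columns), {or_cross_product:.2f} (cross product)")
```

The two odds ratio calculations agree because the column totals cancel, so the value depends only on the four cell counts and not on how many cases and controls the design happened to sample.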
The odds ratio can be computed here because it takes the same value when it is defined using the conditional distribution of $X$ given $Y$ as it does when defined [as in equation (2.3)] using the distribution of $Y$ given $X$; that is, it treats the variables symmetrically. The odds ratio is determined by the conditional distributions in either direction. It can be calculated even if we have a study design that measures a response on $X$ within each level of $Y$.

In Table 2.4, the sample odds ratio is $[0.656/(1 - 0.656)]/[0.333/(1 - 0.333)] = (172 \times 346)/(173 \times 90) = 3.8$. The estimated odds of ever being a smoker were about 2 for the MI cases (i.e., 0.656/0.344) and about 1/2 for the controls (i.e., 0.333/0.667), yielding an odds ratio of about 2/(1/2) = 4.

We noted that, when $P(Y = 1)$ is small for each value of $X$, the odds ratio and relative risk take similar values. Even if we can estimate only conditional probabilities of $X$ given $Y$, if we expect $P(Y = 1 \mid X)$ to be small, then the sample odds ratio is a rough indication of the relative risk. For Table 2.4, we cannot estimate the relative risk of MI or the difference of proportions suffering MI. Since the probability of young or middle-aged women suffering MI is probably small regardless of smoking status, however, the odds ratio value of 3.8 is also a rough estimate of the relative risk. We estimate that women who had ever smoked were about four times as likely to suffer MI as women who had never smoked.

In Table 2.4, it makes sense to treat each column, rather than each row, as a binomial sample. Because of the matching that occurs in case–control studies, however, the binomial samples in the two columns are dependent rather than independent. Each observation in column 1 is naturally paired with two of the observations in column 2. Chapters 8–10 present specialized methods for analyzing correlated binomial responses.

2.3.6 Types of Observational Studies

By contrast to the study summarized by Table 2.4, imagine a study that follows a sample of women for the next 20 years, observing the rates of MI for smokers and nonsmokers. Such a sampling design is prospective.

There are two types of prospective studies. In cohort studies, the subjects make their own choice about which group to join (e.g., whether to be a smoker), and we simply observe in future time who suffers MI. In clinical trials, we randomly allocate subjects to the two groups of interest, such as in the aspirin study described in Section 2.2.2, again observing in future time who suffers MI.

Yet another approach, a cross-sectional design, samples women and classifies them simultaneously on the group classification and their current response. As in a case–control study, we can then gather the data at once, rather than waiting for future events.

Case–control, cohort, and cross-sectional studies are observational studies. We observe who chooses each group and who has the outcome of interest. By contrast, a clinical trial is an experimental study, the investigator having control over which subjects enter each group, for instance, which subjects take aspirin and which take placebo. Experimental studies have fewer potential pitfalls for comparing groups, because the randomization tends to balance the groups on lurking variables that could be associated both with the response and the group identification. However, observational studies are often more practical for biomedical and social science research.

2.4 CHI-SQUARED TESTS OF INDEPENDENCE

Consider the null hypothesis ($H_0$) that cell probabilities equal certain fixed values $\{\pi_{ij}\}$. For a sample of size $n$ with cell counts $\{n_{ij}\}$, the values $\{\mu_{ij} = n\pi_{ij}\}$ are expected frequencies. They represent the values of the expectations $\{E(n_{ij})\}$ when $H_0$ is true.

This notation refers to two-way tables, but similar notions apply to a set of counts for a single categorical variable or to multiway tables. To illustrate, for each of $n$ observations of a binary variable, let $\pi$ denote the probability of success. For the null hypothesis that $\pi = 0.50$, the expected frequency of successes equals $\mu = n\pi = n/2$, which also equals the expected frequency of failures. If $H_0$ is true, we expect about half the sample to be of each type.

To judge whether the data contradict $H_0$, we compare $\{n_{ij}\}$ to $\{\mu_{ij}\}$. If $H_0$ is true, $n_{ij}$ should be close to $\mu_{ij}$ in each cell. The larger the differences $\{n_{ij} - \mu_{ij}\}$, the stronger the evidence against $H_0$. The test statistics used to make such comparisons have large-sample chi-squared distributions.

2.4.1 Pearson Statistic and the Chi-Squared Distribution

The Pearson chi-squared statistic for testing $H_0$ is

$$X^2 = \sum \frac{(n_{ij} - \mu_{ij})^2}{\mu_{ij}} \tag{2.6}$$

It was proposed in 1900 by Karl Pearson, the British statistician known also for the Pearson product–moment correlation estimate, among many contributions. This statistic takes its minimum value of zero when all $n_{ij} = \mu_{ij}$. For a fixed sample size, greater differences $\{n_{ij} - \mu_{ij}\}$ produce larger $X^2$ values and stronger evidence against $H_0$.

Since larger $X^2$ values are more contradictory to $H_0$, the P-value is the null probability that $X^2$ is at least as large as the observed value. The $X^2$ statistic has approximately a chi-squared distribution, for large $n$. The P-value is the chi-squared right-tail probability above the observed $X^2$ value. The chi-squared approximation improves as $\{\mu_{ij}\}$ increase, and $\{\mu_{ij} \geq 5\}$ is usually sufficient for a decent approximation.

The chi-squared distribution is concentrated over nonnegative values. It has mean equal to its degrees of freedom (df), and its standard deviation equals $\sqrt{2\,df}$. As df increases, the distribution concentrates around larger values and is more spread out. The distribution is skewed to the right, but it becomes more bell-shaped (normal) as df increases. Figure 2.2 displays chi-squared densities having df = 1, 5, 10, and 20.

Figure 2.2. Examples of chi-squared distributions.
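To make the Pearson statistic and its right-tail P-value concrete, here is a small Python sketch for the binary-variable null hypothesis $\pi = 0.50$. The counts (60 successes, 40 failures) are hypothetical and chosen only for illustration, and the function names are assumptions. The sketch takes df = 1, on the reasoning that the alternative has one free parameter ($\pi$) and $H_0$ has none, and it uses the fact that a chi-squared variable with df = 1 is the square of a standard normal variable.

```python
import math

def pearson_x2(observed, expected):
    """Pearson statistic: sum over cells of (n - mu)^2 / mu."""
    return sum((n - mu) ** 2 / mu for n, mu in zip(observed, expected))

def chi2_right_tail_df1(x2):
    """Right-tail probability of a chi-squared variable with df = 1.

    Since such a variable is Z^2 for a standard normal Z,
    P(X2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x2 / 2))

# Hypothetical data (not from the text): 60 successes, 40 failures, H0: pi = 0.50.
observed = [60, 40]
n = sum(observed)
expected = [n * 0.50, n * 0.50]  # mu = n * pi for each outcome

x2 = pearson_x2(observed, expected)
p_value = chi2_right_tail_df1(x2)
print(f"X2 = {x2:.2f}, P-value = {p_value:.4f}")

# Mean (df) and standard deviation (sqrt(2 * df)) for the df values shown in Figure 2.2.
for df in (1, 5, 10, 20):
    print(f"df = {df:2d}: mean = {df}, sd = {math.sqrt(2 * df):.2f}")
```

For other df values the right-tail probability has no such one-line form in the Python standard library; a statistics package would normally be used instead.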

The df value equals the difference between the number of parameters in the alternative hypothesis and in the null hypothesis, as explained later in this section.

2.4.2 Likelihood-Ratio Statistic

Of the types of statistics Section 1.4.1 summarized, the Pearson statistic $X^2$ is a score statistic. (This means that $X^2$ is based on a covariance matrix for the counts that is estimated under $H_0$.) An alternative statistic presented in Section 1.4.1 results from the likelihood-ratio method for significance tests.

Recall that the likelihood function is the probability of the data, viewed as a function of the parameter once the data are observed. The likelihood-ratio test determines the parameter values that maximize the likelihood function (a) under the assumption that $H_0$ is true, (b) under the more general condition that $H_0$ may or may not be true. As Section 1.4.1 explained, the test statistic uses the ratio of the maximized likelihoods, through

$$-2 \log\left(\frac{\text{maximum likelihood when parameters satisfy } H_0}{\text{maximum likelihood when parameters are unrestricted}}\right)$$

The test statistic value is nonnegative. When $H_0$ is false, the ratio of maximized likelihoods tends to be far below 1, for which the logarithm is negative; then, $-2$ times the log ratio tends to be a large positive number, more so as the sample size increases.

For two-way contingency tables with likelihood function based on the multinomial distribution, the likelihood-ratio statistic simplifies to

$$G^2 = 2 \sum n_{ij} \log\left(\frac{n_{ij}}{\mu_{ij}}\right) \tag{2.7}$$

This statistic is called the likelihood-ratio chi-squared statistic. Like the Pearson statistic, $G^2$ takes its minimum value of 0 when all $n_{ij} = \mu_{ij}$, and larger values provide stronger evidence against $H_0$.

The Pearson $X^2$ and likelihood-ratio $G^2$ provide separate test statistics, but they share many properties and usually provide the same conclusions.
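As a rough numerical illustration of how the two statistics compare, the Python sketch below evaluates both for a single set of counts. Everything in it is assumed for the sake of the example: the three-category counts, the null probabilities, and the function names are hypothetical and do not come from the text. Cells with a zero count are given a contribution of 0 to $G^2$, the usual convention.

```python
import math

def pearson_x2(observed, expected):
    """Pearson statistic: sum over cells of (n - mu)^2 / mu."""
    return sum((n - mu) ** 2 / mu for n, mu in zip(observed, expected))

def likelihood_ratio_g2(observed, expected):
    """G2 = 2 * sum of n * log(n / mu); a cell with n = 0 contributes 0."""
    return 2 * sum(n * math.log(n / mu)
                   for n, mu in zip(observed, expected) if n > 0)

# Hypothetical counts (not from the text) for a three-category variable,
# tested against hypothetical null probabilities (0.5, 0.3, 0.2).
observed = [110, 50, 40]
null_probs = [0.5, 0.3, 0.2]
n = sum(observed)
expected = [n * p for p in null_probs]  # expected frequencies mu = n * pi

x2 = pearson_x2(observed, expected)
g2 = likelihood_ratio_g2(observed, expected)
print(f"X2 = {x2:.3f}, G2 = {g2:.3f}")

# Both statistics are exactly 0 when every count equals its expected frequency.
print(pearson_x2(expected, expected), likelihood_ratio_g2(expected, expected))
```

In this example the two values differ only in the first decimal place, which is the kind of agreement one tends to see when the expected frequencies are not small.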
When $H_0$ is true and the expected frequencies are large, the two statistics have the same chi-squared distribution, and their numerical values are similar.

2.4.3 Tests of Independence

In two-way contingency tables with joint probabilities $\{\pi_{ij}\}$ for two response variables, the null hypothesis of statistical independence is

$$H_0\colon \pi_{ij} = \pi_{i+}\pi_{+j} \quad \text{for all } i \text{ and } j$$

The marginal probabilities then determine the joint probabilities. To test $H_0$, we identify $\mu_{ij} = n\pi_{ij} = n\pi_{i+}\pi_{+j}$ as the expected frequency. Here, $\mu_{ij}$ is the expected value of $n_{ij}$ assuming independence. Usually, $\{\pi_{i+}\}$ and $\{\pi_{+j}\}$ are unknown, as is this expected value.

To estimate the expected frequencies, substitute sample proportions for the unknown marginal probabilities, giving

$$\hat{\mu}_{ij} = n p_{i+} p_{+j} = n \left(\frac{n_{i+}}{n}\right)\left(\frac{n_{+j}}{n}\right) = \frac{n_{i+} n_{+j}}{n}$$

This is the row total for the cell multiplied by the column total for the cell, divided by the overall sample size. The $\{\hat{\mu}_{ij}\}$ are called estimated expected frequencies. They have the same row and column totals as the observed counts, but they display the pattern of independence.

For testing independence in $I \times J$ contingency tables, the Pearson and likelihood-ratio statistics equal

$$X^2 = \sum \frac{(n_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}}, \qquad G^2 = 2 \sum n_{ij} \log\left(\frac{n_{ij}}{\hat{\mu}_{ij}}\right) \tag{2.8}$$

Their large-sample chi-squared distributions have $df = (I - 1)(J - 1)$.

The df value means the following: under $H_0$, $\{\pi_{i+}\}$ and $\{\pi_{+j}\}$ determine the cell probabilities. There are $I - 1$ nonredundant row probabilities. Because they sum to 1, the first $I - 1$ determine the last one through $\pi_{I+} = 1 - (\pi_{1+} + \cdots + \pi_{I-1,+})$. Similarly, there are $J - 1$ nonredundant column probabilities. So, under $H_0$, there are $(I - 1) + (J - 1)$ parameters. The alternative hypothesis $H_a$ merely states that there is not independence. It does not specify a pattern for the $IJ$ cell probabilities. The probabilities are then solely constrained to sum to 1, so there are $IJ - 1$ nonredundant parameters. The value for df is the difference between the number of parameters under $H_a$ and $H_0$, or

$$df = (IJ - 1) - [(I - 1) + (J - 1)] = IJ - I - J + 1 = (I - 1)(J - 1)$$

2.4.4 Example: Gender Gap in Political Affiliation

Table 2.5, from the 2000 General Social Survey, cross classifies gender and political party identification. Subjects indicated whether they identified more strongly with the Democratic or Republican party or as Independents.

Table 2.5 also contains estimated expected frequencies for $H_0$: independence. For instance, the first cell has $\hat{\mu}_{11} = n_{1+}n_{+1}/n = (1557 \times 1246)/2757 = 703.7$.

The chi-squared test statistics are $X^2 = 30.1$ and $G^2 = 30.0$, with $df = (I - 1)(J - 1) = (2 - 1)(3 - 1) = 2$. This chi-squared distribution has a mean of $df = 2$ and a standard deviation of $\sqrt{2\,df} = \sqrt{4} = 2$. So, a value of 30 is far out in the right-hand tail. Each statistic has a P-value < 0.0001. This evidence of association