

Logistic Regression_Kleinbaum_2010

Published by orawansa, 2019-07-09 08:44:41




236  7. Modeling Strategy for Assessing Interaction and Confounding

briefly how you would assess whether the variable CT needs to be controlled for precision reasons.

10. What problems are associated with the assessment of confounding and precision described in Exercises 8 and 9?

Test

The following questions consider the use of logistic regression on data obtained from a matched case-control study of cervical cancer in 313 women from Sydney, Australia (Brock et al., 1988). The outcome variable is cervical cancer status (1 = present, 0 = absent). The matching variables are age and socioeconomic status. Additional independent variables not matched on are smoking status, number of lifetime sexual partners, and age at first sexual intercourse. The independent variables are listed below together with their computer abbreviation and coding scheme.

Variable                    Abbreviation   Coding
Smoking status              SMK            1 = ever, 0 = never
Number of sexual partners   NS             1 = 4+, 0 = 0-3
Age at first intercourse    AS             1 = 20+, 0 = <=19
Age of subject              AGE            Category matched
Socioeconomic status        SES            Category matched

Assume that at the end of the variable specification stage, the following E, V, W model has been defined as the initial model to be considered:

logit P(X) = α + β SMK + Σ γ*i V*i + γ1 NS + γ2 AS + γ3 NS × AS + δ1 SMK × NS + δ2 SMK × AS + δ3 SMK × NS × AS,

where the V*i are dummy variables indicating matching strata, the γ*i are the coefficients of the V*i variables, SMK is the only exposure variable of interest, and the variables NS, AS, AGE, and SES are being considered for control.

1. For the above model, which variables are interaction terms?

2. For the above model, list the steps you would take to assess interaction using a hierarchically backward elimination approach.

3. Assume that at the end of interaction assessment, the only interaction term found significant is the product term SMK × NS. What variables are left in the model at

Answers to Practice Exercises  237

the end of the interaction stage? Which of the V variables in the model cannot be deleted from any further models considered? Explain briefly your answer to the latter question.

4. Based on the scenario described in Question 3 (i.e., the only significant interaction term is SMK × NS), what is the expression for the odds ratio that describes the effect of SMK on cervical cancer status at the end of the interaction assessment stage?

5. Based again on the scenario described in Question 3, what is the expression for the odds ratio that describes the effect of SMK on cervical cancer status if the variable NS × AS is dropped from the model that remains at the end of the interaction assessment stage?

6. Based again on the scenario described in Question 3, how would you assess whether the variable NS × AS should be retained in the model? (In answering this question, consider both confounding and precision issues.)

7. Suppose the variable NS × AS is dropped from the model based on the scenario described in Question 3. Describe how you would assess confounding and precision for any other V terms still eligible to be deleted from the model after interaction assessment.

8. Suppose the final model obtained from the cervical cancer study data is given by the following printout results:

Variable    b        S.E.     Chi sq   P
SMK          1.9381  0.4312   20.20    0.0000
NS           1.4963  0.4372   11.71    0.0006
AS          -0.6811  0.3473    3.85    0.0499
SMK × NS    -1.1128  0.5997    3.44    0.0635

Describe briefly how you would use the above information to summarize the results of your study. (In your answer, you need only describe the information to be used rather than actually calculate numerical results.)

Answers to Practice Exercises

1. A "chunk" test for overall significance of interaction terms can be carried out using a likelihood ratio test that compares the initial (full) model with a reduced model under the null hypothesis of no interaction terms.
The likelihood ratio test will be a chi-square test with two degrees of freedom (because two inter- action terms are being tested simultaneously).
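Each Wald chi-square statistic in the printout of Test Question 8 is the squared ratio of a coefficient to its standard error. A minimal Python check, using only the b and S.E. values quoted in that printout (the dictionary layout is my own):

```python
# Wald chi-square = (coefficient / standard error)**2, applied to the
# printout values from Test Question 8 of this chapter.
printout = {              # variable: (b, SE)
    "SMK":      (1.9381, 0.4312),
    "NS":       (1.4963, 0.4372),
    "AS":      (-0.6811, 0.3473),
    "SMK x NS": (-1.1128, 0.5997),
}
wald = {v: round((b / se) ** 2, 2) for v, (b, se) in printout.items()}
print(wald)  # → {'SMK': 20.2, 'NS': 11.71, 'AS': 3.85, 'SMK x NS': 3.44}
```

The computed values reproduce the Chi sq column of the printout, which also confirms that p = 0.0499 for AS belongs with chi-square 3.85 and p = 0.0635 for SMK × NS belongs with chi-square 3.44.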

238  7. Modeling Strategy for Assessing Interaction and Confounding

2. Using a backward elimination procedure, one first determines which of the two product terms HT × AGE and HT × SEX is the least significant in a model containing these terms and all main effect terms. If this least significant term is significant, then both interaction terms are retained in the model. If the least significant term is nonsignificant, it is then dropped from the model. The model is then refitted with the remaining product term and all main effects. In the refitted model, the remaining interaction term is tested for significance. If significant, it is retained; if not significant, it is dropped.

3. Interaction assessment would be carried out first using a "chunk" test for overall interaction as described in Exercise 1. If this test is not significant, one could drop both interaction terms from the model as being not significant overall. If the chunk test is significant, then backward elimination, as described in Exercise 2, can be carried out to decide if both interaction terms need to be retained or whether one of the terms can be dropped. Also, even if the chunk test is not significant, backward elimination may be carried out to determine whether a significant interaction term can still be found despite the chunk test results.

4. The odds ratio formula is given by exp(b), where b is the coefficient of the HT variable. All V variables remain in the model at the end of the interaction assessment stage. These are HS, CT, AGE, and SEX. To evaluate which of these terms are confounders, one has to consider whether the odds ratio given by exp(b) changes as one or more of the V variables are dropped from the model. If, for example, HS and CT are dropped and exp(b) does not change from the (gold standard) model containing all Vs, then HS and CT do not need to be controlled as confounders.
Ideally, one should consider as candidates for control any subset of the four V variables that will give the same odds ratio as the gold standard.

5. If CT and AGE do not need to be controlled for confounding, then, to assess precision, we must look at the confidence intervals around the odds ratio for a model which contains neither CT nor AGE. If this confidence interval is meaningfully narrower than the corresponding confidence interval around the gold standard odds ratio, then precision is gained by dropping CT and AGE. Otherwise, even though these variables need not be controlled for confounding, they

Answers to Practice Exercises  239

should be retained in the model if precision is not gained by dropping them.

6. The odds ratio formula is given by exp(b + d1 AGE + d2 SEX).

7. Using the hierarchy principle, CT and HS are eligible to be dropped as nonconfounders.

8. Drop CT, HS, or both CT and HS from the model and determine whether the coefficients b, d1, and d2 in the odds ratio expression change. Alternatively, determine whether the odds ratio itself changes by comparing tables of odds ratios for specified values of the effect modifiers AGE and SEX. If there is no change in coefficients and/or in odds ratio tables, then the variables dropped do not need to be controlled for confounding.

9. Drop CT from the model and determine if the confidence interval around the odds ratio is wider than the corresponding confidence interval for the model that contains CT. Because the odds ratio is defined by the expression exp(b + d1 AGE + d2 SEX), a table of confidence intervals for both the model without CT and with CT will need to be obtained by specifying different values for the effect modifiers AGE and SEX. To assess whether CT needs to be controlled for precision reasons, one must compare these tables of confidence intervals. If the confidence intervals when CT is not in the model are narrower in some overall sense than when CT is in the model, precision is gained by dropping CT. Otherwise, CT should be controlled as precision is not gained when the CT variable is removed.

10. Assessing confounding and precision in Exercises 8 and 9 requires subjective comparisons of either several regression coefficients, several odds ratios, or several confidence intervals. Such subjective comparisons are likely to lead to highly debatable conclusions, so that a safe course of action is to control for all V variables regardless of whether they are confounders or not.
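Because the odds ratio in Answers 6 through 9 depends on the effect modifiers AGE and SEX, it is summarized as a table over specified modifier values rather than as a single number. A minimal Python sketch of building such a table; the coefficient values b, d1, d2 below are hypothetical placeholders for illustration, not estimates from the study:

```python
from math import exp

def odds_ratio(b, d1, d2, age, sex):
    """OR for the exposure from a model with exposure-by-AGE and
    exposure-by-SEX product terms: OR = exp(b + d1*AGE + d2*SEX)."""
    return exp(b + d1 * age + d2 * sex)

# Hypothetical coefficients, for illustration only:
b, d1, d2 = 0.9, 0.02, -0.3

# Table of ORs at specified values of the effect modifiers
table = {(age, sex): round(odds_ratio(b, d1, d2, age, sex), 2)
         for age in (40, 50, 60) for sex in (0, 1)}
for (age, sex), or_hat in sorted(table.items()):
    print(f"AGE={age}, SEX={sex}: OR = {or_hat}")
```

The same table structure, with confidence limits added at each (AGE, SEX) combination, is what Answer 9 asks you to compare for the models with and without CT.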

8  Additional Modeling Strategy Issues

Contents
Introduction  242
Abbreviated Outline  242
Objectives  243
Presentation  244
Detailed Outline  286
Practice Exercises  289
Test  293
Answers to Practice Exercises  298

D.G. Kleinbaum and M. Klein, Logistic Regression, Statistics for Biology and Health, DOI 10.1007/978-1-4419-1742-3_8, © Springer Science+Business Media, LLC 2010  241

242  8. Additional Modeling Strategy Issues

Introduction

In this chapter, we consider five issues on modeling strategy that were not covered in the previous two chapters on this topic:

1. Modeling strategy when there are two or more exposure variables
2. Screening variables when modeling
3. Collinearity diagnostics
4. Influential observations
5. Multiple testing

Each of these issues represents an important feature of any regression analysis that typically requires attention when determining a "best" model, although our specific focus concerns a binary logistic regression model.

Abbreviated Outline

The outline below gives the user a preview of the material to be covered by the presentation. A detailed outline for review purposes follows the presentation.

I. Overview (page 244)
II. Modeling strategy involving several exposure variables (pages 244-262)
III. Screening variables (pages 263-270)
IV. Collinearity diagnostics (pages 270-275)
V. Influential observations (pages 275-279)
VI. Multiple testing (pages 280-282)
VII. Summary (pages 283-285)

Objectives  243

Upon completing this chapter, the learner should be able to:

1. Given a binary logistic model involving two or more exposures, describe or illustrate how to carry out a modeling strategy to determine a "best" model.

2. Given a fitted binary logistic model involving a large number of exposure variables and/or covariates (potential confounders or effect modifiers), describe or illustrate how to conduct screening to reduce the number of variables to be considered in your initial multivariate model.

3. Explain by illustration when it is questionable to screen covariates using statistical testing for a crude association with the outcome variable.

4. Given a binary logistic model involving several exposures and/or covariates, describe and/or illustrate how to assess collinearity and how to proceed if a collinearity problem is identified.

5. Given a binary logistic model involving several exposure variables and/or covariates, describe and/or illustrate how to determine whether there are any influential observations and how to proceed with the analysis if influential observations are found.

6. Given a binary logistic model involving several exposure variables and/or covariates, describe and/or illustrate how to consider (or possibly correct for) multiple testing when carrying out a modeling strategy to determine a "best" model.

244  8. Additional Modeling Strategy Issues

Presentation

I. Overview

Focus: modeling issues not considered in previous chapters; apply to any regression analysis; goal: determine a "best" model; binary logistic model.

This presentation addresses several modeling strategy issues not considered in the previous two chapters (6 and 7). These issues represent important features of any regression analysis that typically require attention when going about the process of determining a "best" model, although our specific focus concerns a binary logistic regression model.

We consider five issues, each of which will be described and illustrated in the sections that follow:

1. Modeling strategy when there are two or more exposure variables
2. Screening variables when modeling
3. Collinearity diagnostics
4. Influential observations
5. Multiple testing

II. Modeling Strategy for Several Exposure Variables

In this section, we extend the modeling strategy guidelines described in the previous two chapters to consider two or more exposure variables, controlling for covariates that are potential confounders and/or effect modifiers:

Outcome: D(0,1)
Exposures: E1, E2, . . . , Eq
Control variables: C1, C2, . . . , Cp

EXAMPLE: Two Es

We begin with an example involving exactly two exposure variables. A cross-sectional study carried out at Grady Hospital in Atlanta, Georgia involved 297 adult patients seen in an emergency department whose blood cultures taken within 24 hours of admission were found to have Staphylococcus aureus infection (Rezende et al., 2002). Information was obtained on several variables that were considered as potential predictors of methicillin-resistant infection (MRSA).

Presentation: II. Modeling Strategy for Several Exposure Variables  245

EXAMPLE (continued)

The outcome variable is MRSA status (1 = yes, 0 = no), and covariates of interest included the following variables: PREVHOSP (1 = previous hospitalization, 0 = no previous hospitalization), PAMU (1 = antimicrobial drug use in the previous 3 months, 0 = no previous antimicrobial drug use), AGE (continuous), and GENDER (1 = male, 0 = female).

For these data, we consider the following question: Are the variables PREVHOSP and PAMU associated with MRSA outcome, controlling for AGE and GENDER?

For this question, our predictors include two Es (E1 = PREVHOSP and E2 = PAMU) and two Cs (C1 = AGE and C2 = GENDER).

We now consider an initial EVW model that includes both Es and both Cs as main effects plus product terms involving each E with each C and the product of the two Es:

Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2) + (δ11 E1 W1 + δ12 E1 W2 + δ21 E2 W1 + δ22 E2 W2) + δ* E1 E2,

where V1 = C1 = W1 and V2 = C2 = W2.

This initial model considers the control variables AGE and GENDER as both potential confounders (i.e., V1 and V2) and as potential effect modifiers (i.e., W1 and W2) of both E1 and E2. The model also contains an interaction term involving the two Es (E1 E2).

Modeling Strategy with Several Exposures

Step 1: Variable Specification (Initial Model)

As recommended in the previous chapters when only one exposure variable was being considered, we continue to emphasize that the first step in one's modeling strategy, even with two or more Es, is to specify the initial model. This step requires consideration of the literature about the study question and/or outcome and/or variables needing to be controlled, based on one's biological/medical conceptualization of the study question. That is, variable specification considers the following (as previously recommended with only one E):

Study question
Literature review
Biological/medical conceptualization

246  8. Additional Modeling Strategy Issues

EXAMPLE (continued)

For our example, therefore, we have assumed that AGE and GENDER are well-known risk factors for MRSA, and that there is also interest to assess whether each of these variables is an effect modifier of either or both of the exposure variables. We also assume that the interaction of the exposures with each other is of interest.

If, on the other hand, we decided that interaction of any kind was either not of interest or not practically interpretable, our initial model would omit such interaction terms. In such a case, the initial model would still involve the two Es, but it would be a no-interaction model:

Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2)

The primary distinction between the modeling strategy for a single E variable vs. several Es is that, in the latter situation, there are two types of interactions to consider: interactions of Es with Ws (EiWj terms) and interactions of Es with other Es (EiEk terms). Also, when there are several Es, we may consider omitting some Es (as nonsignificant) in the final ("best") model.

General form: EVW model for several Es

Although we will return to this example shortly, we show here the general form of the EVW model when there are several exposures. This model is written rather succinctly using summation signs, including double summation signs when considering interactions. Notice that in this general form, there are q exposure variables, p1 potential confounders, and p2 potential effect modifiers:

Logit P(X) = α + Σ(i=1..q) βi Ei + Σ(j=1..p1) γj Vj + Σ(i=1..q) Σ(k=1..p2) δik Ei Wk + Σ(i=1..q) Σ(i'=1..q, i' ≠ i) δ*ii' Ei Ei'

Presentation: II. Modeling Strategy for Several Exposure Variables  247

This same model is alternatively written without summation signs and is divided into four groups of predictor variables:

Logit P(X) = α + β1 E1 + β2 E2 + ... + βq Eq                          (Es)
  + γ1 V1 + γ2 V2 + ... + γp1 Vp1                                     (Vs)
  + δ11 E1 W1 + δ12 E1 W2 + ... + δ1,p2 E1 Wp2
  + δ21 E2 W1 + δ22 E2 W2 + ... + δ2,p2 E2 Wp2
  + ...
  + δq1 Eq W1 + δq2 Eq W2 + ... + δq,p2 Eq Wp2                        (EWs)
  + δ*12 E1 E2 + δ*13 E1 E3 + ... + δ*1q E1 Eq
  + δ*23 E2 E3 + δ*24 E2 E4 + ... + δ*2q E2 Eq
  + ...
  + δ*q-1,q Eq-1 Eq                                                   (EEs)

The first group lists the E variables. The second group lists the V variables. The third group lists the EW variables, the first line of which contains products of E1 with each of the Wj's, the second line contains products of E2 with each of the Wj's, and so on, with the last line of the group containing products of Eq with each of the Wj's. Finally, the fourth group lists the EE variables, with the first line containing products of E1 with all other Es, the second line containing products of E2 with all other Es except E1, and so on, with the last line containing the single product term Eq-1 Eq.

EXAMPLE (continued)

Returning to our initial model for the MRSA data, there are q = 2 E variables, p1 = 2 V variables, and p2 = 2 W variables, which yields 4 EW variables and a single EE variable:

Logit P(X) = α + β1 E1 + β2 E2 + γ1 V1 + γ2 V2 + δ11 E1 W1 + δ12 E1 W2 + δ21 E2 W1 + δ22 E2 W2 + δ* E1 E2

Step 2: Assess Interaction

So, how do we proceed once we have identified our initial model? Following our previous strategy for one E variable, we recommend assessing interaction as the next step. But, since there are two types of product terms, EWs and EEs, should we consider these types separately or simultaneously, and if separately, do we first consider EWs or EEs?
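The bookkeeping behind these four groups of predictors can be sketched in code. Below is a minimal Python sketch (the function and label names are my own) that enumerates the E, V, EW, and EE terms for given q, p1, and p2, and confirms that the MRSA example yields 2 + 2 + 4 + 1 = 9 predictors:

```python
from itertools import combinations

def evw_terms(q, p1, p2):
    """Enumerate predictors of the general EVW model:
    main-effect Es and Vs, all EiWk products, and all EiEj products."""
    Es  = [f"E{i}" for i in range(1, q + 1)]
    Vs  = [f"V{j}" for j in range(1, p1 + 1)]
    EWs = [f"E{i}W{k}" for i in range(1, q + 1) for k in range(1, p2 + 1)]
    EEs = [f"E{i}E{j}" for i, j in combinations(range(1, q + 1), 2)]
    return Es + Vs + EWs + EEs

# MRSA example: q = 2 exposures, p1 = 2 Vs, p2 = 2 Ws
terms = evw_terms(2, 2, 2)
print(len(terms), terms)
```

With q = 3 exposures and the same two Vs and Ws, the count grows to 3 + 2 + 6 + 3 = 14 terms, which illustrates how quickly the initial model expands as exposures are added.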
The answer, not surprisingly, is: it depends! That is, there are several reasonable options.

248  8. Additional Modeling Strategy Issues

Option A: Overall (chunk) LR test for interaction; then "subchunk" LR tests for EWs and EEs; then Vs; finally Es.

One option (A) begins with a "chunk" LR test that simultaneously evaluates all product terms. We then test separate "subchunks" involving EWs and EEs, after which we assess the Vs for confounding and precision. Finally, we consider dropping nonsignificant Es.

For the initial MRSA model, since there are five product terms, the overall chunk test would involve a chi-square statistic with 5 degrees of freedom under H0: δ11 = δ12 = δ21 = δ22 = δ* = 0. The two "subchunks" would involve the 4 EW terms and the single EE term:

LR ~ chi-square with 4 df under H01: δ11 = δ12 = δ21 = δ22 = 0
LR ~ chi-square with 1 df under H02: δ* = 0

Option B: Assess EWs first, then EEs, prior to Vs and Es.

Alternatively, a second option (B) differs from Option A by simply skipping the overall chunk test. Both Options A and B make sense if we decide that assessing interaction should always precede assessing confounding and precision, and that EWs should always be assessed prior to EEs.

Option C: Assess EWs first, then Vs, prior to EEs and Es.

As another option (C), recall that when we considered a model with only a single E, we left this E in the model throughout the entire process of evaluating interaction, confounding, and then precision. An analogous approach for several Es is to evaluate effect modifiers (Ws) and potential confounders (Vs) before considering any terms involving Es, including product terms (EEs).

We will apply Options A through C to the MRSA data. First, we present edited results from fitting the initial model:

Initial Model Output: -2 ln L = 275.683
(Analysis of maximum likelihood estimates for the Es, Vs, EWs, and EEs; table values not shown.)

We now show the results from using Option A for the reduced model (A) that eliminates all five interaction terms from the initial model:

Reduced Model A: Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2)

Presentation: II. Modeling Strategy for Several Exposure Variables  249

EXAMPLE (continued)

Model A Output: -2 ln L = 279.317

The LR statistic for the overall "chunk" test that compares the initial and reduced models is

LR = -2 ln L(reduced model A) - (-2 ln L(full model)) = 279.317 - 275.683 = 3.634 with 5 df (P = 0.6032),

which is highly nonsignificant. This suggests that the no-interaction MRSA model A is preferable to the initial model containing five interaction terms.

Nevertheless, to consider the possibility that some of the product terms are significant despite the nonsignificant overall chunk test results, we now carry out a test for the "subchunk" of EW terms followed by another test for the "subchunk" of EE terms (of which there is only one: E1 E2). We are now essentially considering (the start of) Option B.

Testing for the EW terms first, we obtain the reduced model (B) by eliminating the four EW terms from the initial model, thereby keeping the single EE term in the model:

Reduced Model B (w/o EW terms):
Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2) + δ* E1 E2

Model B Output: -2 ln L = 277.667

From the output, the LR statistic for the "subchunk" test that compares the initial model with reduced model B is

LR = 277.667 - 275.683 = 1.984 with 4 df (P = 0.7387),

which is highly nonsignificant. This suggests that the reduced model B (that excludes all EW terms) is preferable to the initial model containing five interaction terms.

Focusing on the single EE term E1 E2 (= PRHPAM) in the reduced model B, we can see from the output for this model that the Wald test for H0: δ* = 0 is nonsignificant: Wald statistic = 1.089 with 1 df (P = 0.2122). The corresponding LR statistic (that compares model A with model B) is also nonsignificant:

LR = 279.317 - 277.667 = 1.650 with 1 df (P = 0.1990).

The above interaction results using Options A and B indicate that the no-interaction model A,

Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2),

is preferable to a model involving any EW or EE terms.
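Each of these likelihood-ratio statistics is just a difference of the -2 ln L values quoted in the output. A short Python check of the arithmetic (variable names are my own):

```python
# Likelihood-ratio statistics as differences of -2 ln L values
# quoted for the MRSA models in the text.
neg2lnL_full    = 275.683  # initial model (all 5 product terms)
neg2lnL_model_A = 279.317  # no-interaction model A
neg2lnL_model_B = 277.667  # model B (EE term only, no EW terms)

lr_chunk = round(neg2lnL_model_A - neg2lnL_full, 3)    # 5 df: all 5 products
lr_EW    = round(neg2lnL_model_B - neg2lnL_full, 3)    # 4 df: the 4 EW terms
lr_EE    = round(neg2lnL_model_A - neg2lnL_model_B, 3) # 1 df: the E1E2 term

print(lr_chunk, lr_EW, lr_EE)  # → 3.634 1.984 1.65
```

Note that the chunk statistic decomposes exactly: 3.634 = 1.984 + 1.650, since the EW and EE subchunks partition the five product terms.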

250  8. Additional Modeling Strategy Issues

EXAMPLE (continued): Options A and B (continued)

To assess confounding, we need to determine whether the estimated OR meaningfully changes (e.g., by more than 10%) when either AGE or GENDER or both are dropped from the model. Here, the gold standard (GS) model is the no-interaction model A just shown.

The formula for the odds ratio for the GS model is

OR_GS(A) = exp[β1(E1* - E1) + β2(E2* - E2)],

where X* = (E1*, E2*) and X = (E1, E2) are two specifications of the two exposures PREVHOSP (i.e., E1) and PAMU (i.e., E2).

There are several ways to specify X* and X for PREVHOSP and PAMU. Here, for convenience and simplicity, we will choose to compare a subject X* = (E1* = 1, E2* = 1) who is positive (i.e., yes) for both Es with a subject X = (E1 = 0, E2 = 0) who is negative (i.e., no) for both Es.

Based on the above choices, the OR formula for our GS reduced model A simplifies to

OR_GS(A) = exp[β1(1 - 0) + β2(1 - 0)] = exp[β1 + β2].

To assess confounding, we must now determine whether estimates of our simplified OR_GS(A) meaningfully change when we drop AGE and/or GENDER. This requires us to consider a table of ORs:

Vs in model:   AGE, GEN   AGE     GEN     Neither
OR:            OR_I       OR_II   OR_III  OR_IV

To complete the above table, we need to fit the four models listed below. The first model, which we have already described, is the GS(A) model containing PREVHOSP, PAMU, AGE, and GENDER. The other three models exclude GENDER, AGE, or both from the model:

I. Logit P_I(X) = α + β1 E1 + β2 E2 + γ1 V1 + γ2 V2
II. Logit P_II(X) = α + β1 E1 + β2 E2 + γ1 V1
III. Logit P_III(X) = α + β1 E1 + β2 E2 + γ2 V2
IV. Logit P_IV(X) = α + β1 E1 + β2 E2

Since all four models involve the same two E variables, the general formula for the OR that compares a subject who is exposed on both Es (E1* = 1, E2* = 1) vs. a subject who is not exposed on both Es (E1 = 0, E2 = 0) has the same algebraic form for each model, including the GS model:

OR = exp[β1 + β2]

Presentation: II. Modeling Strategy for Several Exposure Variables  251

EXAMPLE (continued): Options A and B (continued)

However, since the models do not all have the same predictors, the estimates of the regression coefficients are likely to differ somewhat. For each of the four models, we obtain the values of the two estimated regression coefficients together with their corresponding OR estimates. From this information, we must decide which one or more of the four models controls for confounding. Certainly, the GS model controls for confounding, but do any of the other models do so also?

An equivalent question is: which of the other three models yields the "same" estimated OR as obtained for the GS model? A quick glance at the table indicates that the OR estimate for the GS model is somewhat higher than the estimates for the other three models, suggesting that only the GS model controls for confounding.

Moreover, if we use a "change-of-estimate" rule of 10%, we find that none of models II, III, or IV has an estimated OR within 10% of the estimated OR of 26.2430 for the GS model (I), although model III comes very close. (Note: ±10% of 26.2430 gives the interval (23.6187, 28.8673).)

This result indicates that the only model that controls for confounding is the GS model. That is, we cannot drop either AGE or GENDER from the model.

We therefore have decided that both Vs need to stay in the model, but we have not yet addressed the Es in the model, which at this point contains E1, E2, V1, and V2.

The only other variables that we might consider dropping at this point are E1 or E2, provided we decide that one of these is nonsignificant, controlling for the other. However, on inspection of the output for this model (Model A Output: -2 ln L = 279.317), we find that the Wald statistic for E1 (PREVHOSP) is significant (P = 0.0002), as is the Wald statistic for E2 (PAMU) (P < 0.0001).
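The 10% change-of-estimate screen above is simple arithmetic. A minimal Python sketch (the function name is my own; the GS value 26.2430 is the estimate quoted in the text):

```python
# "Change-of-estimate" screen for confounding: is a candidate model's
# estimated OR within +/-10% of the gold-standard (GS) model's OR?
def within_10_percent(or_candidate, or_gs):
    return abs(or_candidate - or_gs) <= 0.10 * or_gs

or_gs = 26.2430  # GS model I (controls for both AGE and GENDER)
lower, upper = 0.9 * or_gs, 1.1 * or_gs
print(round(lower, 4), round(upper, 4))  # → 23.6187 28.8673
```

Any candidate model whose estimated OR falls outside (23.6187, 28.8673) fails the screen, which is why AGE and GENDER cannot be dropped here.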

252  8. Additional Modeling Strategy Issues

EXAMPLE (continued): Options A and B (continued)

Thus, based on these Wald statistics, we cannot drop either PREVHOSP or PAMU from the model (and similar conclusions follow from LR tests). Consequently, using Options A or B, our best model is the (reduced) no-interaction model A, which we have called the gold standard model:

Logit P(X) = α + β1 E1 + β2 E2 + γ1 V1 + γ2 V2

For this model, then, the OR that compares a subject X* = (E1* = 1, E2* = 1) who is positive (i.e., yes) for both Es with a subject X = (E1 = 0, E2 = 0) who is negative (i.e., no) for both Es simplifies to the exponential formula

OR_model A = exp[β1(1 - 0) + β2(1 - 0)] = exp[β1 + β2].

The estimated OR and a 95% confidence interval around this odds ratio are

OR = exp[b1 + b2] = exp[1.4855 + 1.7819] = 26.2415, 95% CI: (11.5512, 59.6146).

These results show that there is a very strong and significant (but highly variable) combined effect of PREVHOSP and PAMU when comparing subjects X* and X.

Alternatively, we might wish to compute the odds ratios for the effects of each E variable, separately, controlling for the other E and the two V variables. The results, which can also be obtained using the output for reduced model A shown earlier, are:

OR(E1 | E2, V1, V2) = exp[b1] = exp[1.4855] = 4.417, 95% CI: (2.2004, 9.734)
OR(E2 | E1, V1, V2) = exp[b2] = exp[1.7819] = 5.941, 95% CI: (2.873, 12.285)

From these results, we can conclude using Options A or B that both PREVHOSP and PAMU have moderately strong and significant individual effects (estimated ORs of 4.417 and 5.941, respectively) when controlling for the other three variables in the final model, i.e., no-interaction model A.
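The point estimates above follow directly from exponentiating the fitted coefficients b1 = 1.4855 and b2 = 1.7819 quoted in the text; a short Python check:

```python
from math import exp

# Coefficient estimates from no-interaction model A, as quoted in the text
b1 = 1.4855  # PREVHOSP
b2 = 1.7819  # PAMU

or_combined = exp(b1 + b2)  # both exposures "yes" vs. both "no"
or_e1 = exp(b1)             # PREVHOSP, controlling for PAMU, AGE, GENDER
or_e2 = exp(b2)             # PAMU, controlling for PREVHOSP, AGE, GENDER

print(round(or_combined, 1), round(or_e1, 3), round(or_e2, 3))
# → 26.2 4.417 5.941
```

Note also that, in this no-interaction model, the combined OR is the product of the two individual ORs, since exp(b1 + b2) = exp(b1) × exp(b2).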

Presentation: II. Modeling Strategy for Several Exposure Variables  253

EXAMPLE (continued)

Recall that both Options A and B assessed interaction of EWs and EEs before considering confounding and precision, where Option A used an overall (chunk) test for interaction and Option B did not:

Option A: Overall (chunk) interaction test, then, in order, EWs, EEs, Vs, and Es
Option B: Assess EWs first, then, in order, EEs, Vs, and Es
Option C: Assess EWs first, then, in order, Vs, EEs, and Es

We are now ready to consider Option C, which assesses interactions involving EWs first, then confounding and precision (i.e., the Vs), after which EEs and finally Es are evaluated.

Since all three options, including Option C, assess EWs before EEs, Vs, and Es, we have already determined the results for the EWs. That is, we can drop all the EWs, which yields reduced model B (preferred to the full interaction model):

Reduced Model B (w/o EW terms):
Logit P(X) = α + (β1 E1 + β2 E2) + (γ1 V1 + γ2 V2) + δ* E1 E2

Model B Output: -2 ln L = 277.667
(Edited analysis of maximum likelihood estimates for the Es, Vs, and the EE term PRHPAM; table values not shown.)

This model retains the EE product term PRHPAM (= E1 E2), which, using Option C, will not be considered for exclusion until we address confounding for AGE and GENDER (i.e., the Vs).

To assess confounding, we need to determine whether the estimated OR meaningfully changes (e.g., by more than 10%) when either AGE or GENDER or both are dropped from the model. Here, the gold standard (GS) model is the reduced model B above, which contains the E1 E2 term.
ORGSðBÞ ¼ exp½b1ðE1* À E1Þ The formula for the odds ratio for the GS þ b2ðE2* À E2Þ model is shown at the left, where (E1*, E2*) þ d*ðE1*E2* À E1E2ފ; and (E1, E2) denote two specifications of the two exposures PREVHOSP (i.e., E1) and PAMU where X* ¼ (E1*, E2*) and X ¼ (E1, E2) (i.e., E2). This formula contains three para- are two specifications of the two Es meters: b1, b2, and d*.

EXAMPLE (continued) — Option C (continued)

As previously noted (for Option A), there are several ways to specify X* and X. Here, again, we will choose to compare a subject X* who is positive (i.e., yes) for both Es with a subject X who is negative (i.e., no) for both Es:

X* = (E1* = 1, E2* = 1) vs. X = (E1 = 0, E2 = 0)

Based on these choices, the OR formula for our GS reduced model B simplifies to

OR_GS(B) = exp[β1(1 − 0) + β2(1 − 0) + δ*([1 × 1] − [0 × 0])] = exp[β1 + β2 + δ*].

To assess confounding, we must once again (as with Option A) consider a table of ORs:

Vs in model:  AGE, GENDER | AGE    | GENDER  | Neither
OR:           OR_I*       | OR_II* | OR_III* | OR_IV*

To complete this table, we need to fit the four models shown below. The first model (I*), which we have already described, is the gold standard (GS(B)) model containing PREVHOSP, PAMU, AGE, GENDER, and PRHPAM. The other three models exclude GENDER, AGE, or both:

I*.   Logit P_I*(X) = α + β1E1 + β2E2 + γ1V1 + γ2V2 + δ*E1E2
II*.  Logit P_II*(X) = α + β1E1 + β2E2 + γ1V1 + δ*E1E2
III*. Logit P_III*(X) = α + β1E1 + β2E2 + γ2V2 + δ*E1E2
IV*.  Logit P_IV*(X) = α + β1E1 + β2E2 + δ*E1E2

Since all four models involve the same two E variables, the general formula for the OR that compares a subject who is exposed on both Es (E1* = 1, E2* = 1) vs. a subject who is not exposed on either E (E1 = 0, E2 = 0) has the same algebraic form for each model, including the GS(B) model:

OR = exp[β1 + β2 + δ*].

However, since the models do not all have the same predictors, the estimates of the regression coefficients are likely to differ somewhat for each model.

EXAMPLE (continued) — Option C (continued)

The table below shows, for each model, the values of the three estimated regression coefficients together with the corresponding OR estimates.

Estimated Regression Coefficients and ORs:

Model:        I* (GS)      II*      III*     IV*
Vs in model:  AGE, GENDER  AGE      GENDER   Neither
β̂1:           1.0503       1.1224   1.2851   1.2981
β̂2:           0.9772       1.0021   0.8002   0.8251
δ̂*:           1.0894       0.8557   0.9374   0.8398
OR:           22.5762      19.6918  20.5467  19.3560

From this information, we must decide whether any one or more of models II*, III*, and IV* yields the "same" OR as obtained for the GS model (I*).

Notice, first, that the OR estimate for the GS model (22.5762) is somewhat higher than the estimates for the other three models, suggesting that only the GS model controls for confounding.

However, using a "change-of-estimate" rule of 10%, we find that the OR (20.5467) for model III*, which drops AGE but retains GENDER, is within 10% of the OR (22.5762) for the GS model (±10% of 22.5762: (20.3186, 24.8338)). This result suggests that there are two candidate models (I* and III*) that control for confounding.

Change-of-Estimate Results (10% rule):

Model:             I* (GS)      II*      III*     IV*
Vs in model:       AGE, GENDER  AGE      GENDER   Neither
OR:                22.5762      19.6918  20.5467  19.3560
Within 10% of GS?  —            No       Yes      No

From these results, we must decide at this point which of two alternative conclusions to draw about confounding: (a) the only model that controls for confounding is the GS model; or (b) both the GS model (I*) and model III* control for confounding.
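The 10% change-of-estimate screen in the table above can be written out in a few lines of code. A sketch in Python, with the OR estimates taken from the table (the model labels are just strings used for illustration):

```python
gs_or = 22.5762                                      # Model I* (gold standard)
candidates = {"II*": 19.6918, "III*": 20.5467, "IV*": 19.3560}

lo, hi = 0.90 * gs_or, 1.10 * gs_or                  # +/-10% band around the GS OR
within_10pct = {name: lo <= or_hat <= hi
                for name, or_hat in candidates.items()}
# band = (20.3186, 24.8338); only model III* falls inside it
```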

EXAMPLE (continued) — Option C (continued)

Suppose we decide that only the GS(B) model controls for confounding. Then we cannot drop either AGE or GENDER, so both Vs need to stay in the model, which at this point contains E1, E2, V1, V2, and E1E2. We have not yet, however, addressed the Es in the model.

For the next step, we would test whether the E1E2 product term is significant. From our output for reduced model B given previously, we find that the Wald test for the PRHPAM term (i.e., E1E2) is not significant:

Wald χ² (reduced model B) = 1.5562, P = 0.2122 (n.s.)

The corresponding LR test is obtained by comparing the −2 ln L statistics for reduced models A and B, yielding an LR statistic of 1.650, also nonsignificant:

LR = (−2 ln L_model A) − (−2 ln L_model B) = 279.317 − 277.667 = 1.650 ~ χ² with 1 df (P = 0.1989)

We can now reduce our model further by dropping the E1E2 term, which yields the no-interaction model A:

Logit P(X) = α + (β1E1 + β2E2) + (γ1V1 + γ2V2),

where V1 = C1 = AGE, V2 = C2 = GENDER, E1 = PREVHOSP, and E2 = PAMU.

Recall that Model A was chosen as the best model using Options A and B. Consequently, using Option C, if we decide that the only model that controls for confounding is the GS(B) model (I* above), then our best model for Option C is also Model A.

This conclusion (i.e., Model A is best) nevertheless resulted from the decision that only the GS(B) model controlled for confounding. We alternatively allowed two candidate models to control for confounding: the GS(B) model (I*) and model III*, which dropped AGE from the model. If we decide to consider model III* in addition to the GS(B) model, how do we decide between these two models? The answer, according to the modeling strategy described in Chap. 7, is to consider precision.
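The likelihood ratio computation above is easy to verify by hand. A sketch in Python (for 1 df, the chi-square upper-tail probability equals erfc(√(x/2)), so no statistics library is needed):

```python
import math

neg2lnL_A = 279.317          # -2 ln L for reduced model A (no E1E2 term)
neg2lnL_B = 277.667          # -2 ln L for reduced model B (with E1E2 term)

lr = neg2lnL_A - neg2lnL_B   # LR statistic, ~ chi-square with 1 df under H0
p_value = math.erfc(math.sqrt(lr / 2))   # upper-tail area: about 0.1989
```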

EXAMPLE (continued) — Option C (continued)

Since both Models I* and III* include the interaction term E1E2, the OR formula has the same structure for both models:

OR = exp[β1(E1* − E1) + β2(E2* − E2) + δ*(E1*E2* − E1E2)],

where X* = (E1*, E2*) and X = (E1, E2) are two specifications of the two Es. To evaluate precision for each odds ratio, we must therefore compute (say, 95%) confidence intervals (CIs) for the OR for each model.

The CI limits for each OR will depend on how we specify X* and X. As we did for confounding, we again focus on comparing a subject X* who is positive (i.e., yes) for both Es with a subject X who is negative (i.e., no) for both Es, so the OR formula simplifies:

X* = (1, 1) vs. X = (0, 0)  ⇒  OR = exp[β1 + β2 + δ*]

To assess precision, we must now consider a table that gives the 95% CI for the OR for each model, and then decide whether precision is gained when AGE is dropped from the GS model:

Model               OR       95% CI for OR
I* (GS(B))          22.5762  (10.0175, 50.8871)
III* (without AGE)  20.5467  (9.4174, 44.8250)

CI widths: Model I*: 50.8871 − 10.0175 = 40.8696; Model III*: 44.8250 − 9.4174 = 35.4076.

From these results, we can see that, although both models give wide (i.e., imprecise) confidence intervals, Model III* has a tighter confidence interval than Model I*. Therefore, we suggest that Model III* be chosen as the "better" model, since it gives the "same" (within 10%) OR estimate and provides more precision:

Logit P_III*(X) = α + (β1E1 + β2E2) + γ2V2 + δ*E1E2

At this point, using model III*, we have decided to drop V1 = AGE from our initial model; the model now contains E1, E2, V2, and E1E2. Nevertheless, we have not yet addressed the Es in the model.
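The precision comparison above amounts to comparing confidence-interval widths. A sketch in Python using the interval limits from the table:

```python
ci_limits = {
    "I* (GS, with AGE)":  (10.0175, 50.8871),
    "III* (without AGE)": (9.4174, 44.8250),
}

# width of each 95% CI; the narrower interval is the more precise estimate
widths = {model: hi - lo for model, (lo, hi) in ci_limits.items()}
# I*: 40.8696, III*: 35.4076 -> Model III* gives the tighter CI
more_precise = min(widths, key=widths.get)
```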

EXAMPLE (continued) — Option C (continued)

For the next step, we would test whether the E1E2 product term is significant, using the output for Model III*:

Model III* Output — Analysis of maximum likelihood estimates:

Param      DF  Estimate  Std Err  ChiSq    Pr > ChiSq
Intercept  1   −2.6264   0.4209   38.9414  <.0001
PREVHOSP   1   1.2851    0.5107   6.3313   0.0119
PAMU       1   0.8002    0.7317   1.1960   0.2741
GENDER     1   0.4633    0.3066   2.2835   0.1308
PRHPAM     1   0.9374    0.8432   1.2358   0.2663

We find that the Wald test for the PRHPAM term (i.e., E1E2) is not significant (P = 0.2663); the corresponding LR test is also not significant. We can now reduce our model further by dropping the E1E2 term, which yields reduced Model C:

Logit P(X) = α + (β1E1 + β2E2) + γ2V2

Can we drop E1 or E2 from Model C? The only other variables that we might consider dropping at this point are E1 or E2, provided one of them is not significant when controlling for the other. However, on inspection of the output for Model C, we find that the Wald statistic for E1 is highly significant (P < 0.0001), as is the Wald statistic for E2 (P < 0.0001). Thus, based on these Wald statistics, we cannot drop either E variable from the model (with similar conclusions from LR tests).

Consequently, if we decide to use Option C, and we allow Models I* and III* to be candidate models that control for confounding, then our best model is Model C. To make this choice, we considered precision as well as the significance of the Es in the model.

For this model, then, the OR that compares a subject X* who is positive (i.e., yes) for both Es with a subject X who is negative (i.e., no) for both Es simplifies to

OR_Model C = exp[β1 + β2] = exp[1.6627 + 1.4973] = 23.5708, 95% CI: (10.7737, 51.5684),

which indicates a very strong and significant (but highly variable) effect.

EXAMPLE (continued)

Best Model Summary: Options A, B, C

Summarizing the results obtained from the above analyses of the MRSA data, we have found two different final choices for the best model, depending on the modeling strategy used — Options A and B (same result) vs. Option C:

Options A and B — Model A: contains PREVHOSP, PAMU, AGE, and GENDER
Option C — Model C: contains PREVHOSP, PAMU, and GENDER

Both models are no-interaction models, and both contain the main effects of two highly significant E variables, PREVHOSP and PAMU.

Model A Output (Best: Options A and B) — Analysis of maximum likelihood estimates:

Param      DF  Estimate  Std Err  ChiSq    Pr > ChiSq
Intercept  1   −5.0583   0.7643   43.8059  <.0001
PREVHOSP   1   1.4855    0.4032   13.5745  0.0002
PAMU       1   1.7819    0.3707   23.1113  <.0001
AGE        1   0.0353    0.0092   14.7004  0.0001
GENDER     1   0.9329    0.3418   7.4513   0.0063

Model C Output (Best: Option C) — Analysis of maximum likelihood estimates:

Param      DF  Estimate  Std Err  ChiSq    Pr > ChiSq
Intercept  1   −2.7924   0.4123   45.8793  <.0001
PREVHOSP   1   1.6627    0.3908   18.1010  <.0001
PAMU       1   1.4973    0.3462   18.7090  <.0001
GENDER     1   0.4335    0.3030   2.4066   0.1525

The estimated coefficients of PREVHOSP and PAMU differ somewhat between models: the estimate for PREVHOSP is 1.4855 in Model A whereas it is 1.6627 in Model C, and the estimate for PAMU is 1.7819 in Model A compared with 1.4973 in Model C.

OR estimates from each model are shown in the table below. Both models show moderately strong effects for each E variable and a very strong effect when comparing X* = (E1 = 1, E2 = 1) with X = (E1 = 0, E2 = 0). However, the effect of PREVHOSP is 16% lower in Model A than in Model C, whereas the effect of PAMU is 25% higher in Model A than in Model C.

ORs:
Model  PREVHOSP exp[β1]  PAMU exp[β2]  COMBINED exp[β1 + β2]
A      4.417             5.941         26.242
C      5.274             4.470         23.571

We see, therefore, that for our MRSA example, modeling strategy Options A, B, and C give similar, but slightly different, numerical conclusions involving the two E variables. In general, as shown by this example, there is no guarantee that these three options will always yield the same conclusions. Therefore, the researcher may have to decide which option he/she prefers and/or which conclusion makes the most (biologic) sense.

In summary, we recommend that the initial model have the general form

Logit P(X) = α + Σ_{i=1..q} βiEi + Σ_{j=1..p1} γjVj + Σ_{i=1..q} Σ_{k=1..p2} δik EiWk + Σ_{i=1..q} Σ_{i'=1..q, i'≠i} δ*_{ii'} EiEi'.

This model involves Es, Vs, EWs, and EEs, so there are two types of interaction terms to consider.
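The entries in the OR table above follow directly from exponentiating the fitted coefficients. A sketch in Python:

```python
import math

coefs = {
    "A": {"PREVHOSP": 1.4855, "PAMU": 1.7819},   # Model A estimates
    "C": {"PREVHOSP": 1.6627, "PAMU": 1.4973},   # Model C estimates
}

ors = {
    model: {
        "PREVHOSP": math.exp(b["PREVHOSP"]),
        "PAMU": math.exp(b["PAMU"]),
        # combined effect compares X* = (1, 1) with X = (0, 0)
        "COMBINED": math.exp(b["PREVHOSP"] + b["PAMU"]),
    }
    for model, b in coefs.items()
}
# Model A: 4.417, 5.941, 26.242;  Model C: 5.274, 4.470, 23.571
```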

Modeling Strategy Summary: Several Es

Step 1: Define the initial model (general formula above).
Step 2: Assess interaction.
  Option A: Overall chunk test, followed by Option B or C.
  Option B: Test EWs, then EEs.
  Option C: Test EWs, but assess Vs before EEs.
Step 3: Assess confounding and precision (Vs).
  Options A and B (cont'd): Vs after EWs and EEs.
  Option C (cont'd): Vs after EWs, but prior to EEs.
Step 4: Test for nonsignificant Es if not components of significant EEs.

We thus recommend assessing interaction first by deciding whether to do an overall chunk test, then testing for the EWs, after which a choice has to be made as to whether to test for the EE terms prior to or subsequent to assessing confounding and precision (involving the Vs). The resulting model can then be further assessed to see whether any of the E terms in the model can be dropped as nonsignificant.

Special Cases: Several Es

(a) All Vs are controlled as main effects, i.e., confounding and precision for the Vs are not considered.

What if you decide to control for all Vs as main effects (without assessing confounding and/or precision)? In this case (a), we only need to consider Options A and B, so Step 3 of the strategy described above can be omitted:

Modeling Strategy: All Vs Controlled
Step 1: Define the initial model (general formula above).
Step 2: Assess interaction. Option A: overall chunk test, followed by Option B. Option B: test EWs, then EEs.
Step 4: Test for nonsignificant Es if not components of significant EEs.

EXAMPLE

For example, using the MRSA data, the initial model for special case (a), shown below, contains two Es, two Vs, four EWs, and one EE term:

Logit P(X) = α + (β1E1 + β2E2) + (γ1V1 + γ2V2) + (δ11E1W1 + δ12E1W2 + δ21E2W1 + δ22E2W2) + δ*E1E2

When previously applying Options A and B to this model, we dropped all interaction terms, resulting in reduced model A. If we decide in advance to control for both Vs, then this is our final model, since both Es were significant in this model:

Model A (Final Model):
Logit P(X) = α + (β1E1 + β2E2) + (γ1V1 + γ2V2)

(b) The model contains only Es and EEs, but no Cs (i.e., no Vs or Ws): a "hypothesis generating" model.

As a second special case, what if our model contains only Es, so there are no Cs to control? This case is often referred to as a "hypothesis generating" model, since we are essentially assuming that we have limited knowledge of all possible predictors and that no risk factors have been established.

General Model: Only Es and EEs

In this case (b), our general model takes the simplified form

Logit P(X) = α + Σ_{i=1..q} βiEi + Σ_{i=1..q} Σ_{i'=1..q, i'≠i} δ*_{ii'} EiEi'.

For this model, we recommend a correspondingly simplified strategy that involves statistical testing only, first for EE terms, and then for Es that are not components of significant EEs. In terms of the options we previously described, we only need to consider a modified version of Option A, and Step 3, once again, can be omitted:

Modeling Strategy: All Es, No Cs
Step 1: Define the initial model (formula above).
Step 2: Assess interaction involving Es. Option A*: overall chunk test for EEs, followed by backward elimination of EEs.
Step 4: Test for nonsignificant Es if not components of significant EEs.

EXAMPLE

Applying this situation to our MRSA data, the initial model (without the Cs) is

Logit P(X) = α + β1E1 + β2E2 + δ*E1E2.

Testing H0: δ* = 0 in this model yields a nonsignificant result (data not shown), and the final model (since individual Es cannot be dropped) is the no-interaction model

Logit P(X) = α + β1E1 + β2E2, where E1 = PREVHOSP and E2 = PAMU.

One other issue: specifying the initial model. We now address one other issue, which concerns how to specify the initial model. We describe this issue in the context of the MRSA example.

EXAMPLE

Possible Causal Diagrams for the MRSA Study, where D = MRSA (0,1), V1 = AGE, V2 = GENDER, E1 = PREVHOSP, and E2 = PAMU:

Diagram 1: E1 → E2 → D, with V1 and V2 affecting both E1 and D
Diagram 2: E1 → D ← E2, with V1 and V2 affecting E1, E2, and D

Diagram 1 indicates that PAMU (i.e., E2) is an intervening variable in the causal pathway between PREVHOSP (i.e., E1) and the MRSA outcome, and that AGE and GENDER (i.e., V1 and V2) are confounders of the relationship between PREVHOSP and MRSA.

EXAMPLE (continued)

Diagram 2 indicates that PREVHOSP and PAMU are independent risk factors for the MRSA outcome, and that AGE and GENDER are confounders of both PREVHOSP and PAMU.

The initial model that we considered in our analysis of the MRSA data, containing both PREVHOSP and PAMU as E variables, can be justified if we decide that Diagram 2 is a correct representation of the causal pathways involved.

In contrast, if we decide that Diagram 1 is more appropriate than Diagram 2, we should not put both PREVHOSP and PAMU in the same model to assess their joint or separate effects. In other words, if PAMU is an intervening variable, we should consider a model involving only one E variable, preferably PREVHOSP. An example of such a model, which controls for AGE and GENDER and allows for interaction effects, is

Logit P(X) = α + β1E1 + γ1V1 + γ2V2 + δ11E1V1 + δ12E1V2.

The moral: as mentioned previously in Chap. 6, the choice of the initial model can be influenced by the causal diagram considered most appropriate for one's data.

Presentation: III. Screening Variables

III. Screening Variables

In this section, we address the following scenario. Suppose you wish to fit a binary logistic model involving a binary exposure variable E and a binary outcome variable D, controlling for the potential confounding and effect-modifying effects of a "large" number of variables Cj, j = 1, 2, ..., p, that you have identified from the literature. You would like to begin with a model containing E, the main effects of each Cj, and all product terms of the form E × Cj,

Logit P(X) = α + βE + Σ_{j=1..p} γjCj + Σ_{j=1..p} δjECj,

and then follow the hierarchical backward elimination (HBWE) strategy described in Chap. 7 to obtain a "best" model.

However, when you run a computer program (e.g., SAS's Proc Logistic) to fit this model, you find that the model does not run, or you decide that, even if the model runs, the resulting fitted model is too unreliable because of the large number of variables being considered. What do you do in this situation?

There are several possible options for this large-number-of-variables problem:

1. Screening: Use some kind of "screening" technique to exclude some of the Cj variables from the model one-at-a-time, and then begin again with a reduced-sized model that you hope is reasonably reliable and/or at least will run.

2. Collinearity diagnostics on the initial model: Use "collinearity" diagnostic methods, starting with the initial model, to exclude variables (typically product terms) that are strongly related to other variables in the model.

3. Forward algorithm for interactions: Use a forward regression algorithm that starts with a model containing E and all main-effect Cj terms, j = 1, ..., p, and proceeds to sequentially add statistically significant E × Cj product terms.

4. Backward for Cs, then forward for E × Cj: Start with a model containing E and all Cj terms, j = 1, ..., p, proceed backward to eliminate nonsignificant Cj terms, and then sequentially add statistically significant product terms E × Cj among the remaining Cj terms.

Comments/criticisms of these options:

Option 2 will be described in the next section. It cannot be used, however, if the initial model does not run.

Option 3 has the advantage of starting with a small-sized model, but it has two disadvantages: the model may still have reliability problems if there are a "large" number of Cs, and the forward approach to assessing interaction does not allow all interaction terms to be assessed simultaneously, as a backward approach does.

Option 4, although frequently used in practice, can be strongly criticized because it uses statistical testing to determine whether potential confounders Cj should stay in the model, whereas statistical testing should not be used to assess confounding. Furthermore, Option 4 excludes potential confounders prior to assessing interaction, whereas interaction should be assessed before confounding.

We now return to describe Option 1: screening. As we will describe below, there are good ways and questionable ways to carry out screening. The purpose of screening is to reduce the number of predictors being considered so that a reliable and interpretable final model can be obtained to help answer the study questions of interest.

The two primary drawbacks of screening are:

1. It does not accomplish simultaneous assessment of all exposure and control variables recommended from the literature or from conceptualization of one's research question.

2. There is no guarantee that one's final model contains all the relevant variables of interest, although there is no such guarantee for any modeling strategy.

Consider the following general screening situation: your dataset contains n subjects and k predictors (Xi, i = 1, ..., k), and you decide that k is large enough to warrant some kind of screening procedure to reduce the number of predictors in your initial model.

A typical approach (let's call it Method 0) is to consider the predictors one-at-a-time and screen out (i.e., remove from one's initial model) those Xi that are not individually significantly associated with the (binary) outcome D.

Questions about Method 0:

Q1. Is there anything that can be criticized about Method 0?
Q2. Should the use of Method 0 depend on the types of predictors? E.g., should it be used if there are several Es and Cs? One E and several Cs? Only Es?
Q3. How large does k have to be relative to n in order to justify screening? (k = 10, n = 50: 20%? k = 10, n = 100: 10%? k = 10, n = 200: 5%?)
Q4. Are there other ways (i.e., Methods A, B, C, ...) to carry out (one-at-a-time) screening, and when, if at all, should they be preferred to the typical approach?
Q5. Where does collinearity assessment fit in with this problem?

Answers:

Q1. Is there anything that can be criticized about Method 0?

Yes. Method 0 involves statistical testing only (questionable); it does not consider confounding or effect modification (interaction) when assessing variables one-at-a-time.

To assess confounding involving binary disease D, binary exposure E, and a single potential confounder C, you need to fit two regression models, one containing E and C and the other containing only E:

Logit P(X) = α + βE, where P(X) = Pr(D = 1 | E)
Logit P*(X) = α* + β*E + γ*C, where P*(X) = Pr(D = 1 | E, C)

Confounding is present if we conclude that the corresponding odds ratio estimates are meaningfully different for the two models:

OR_DE = exp(β̂) ≠ OR_DE|C = exp(β̂*)

To assess interaction involving binary disease D, binary exposure E, and a single potential confounder C, we need to fit the logistic regression model containing the main effects of E and C and the product term E × C:

Logit P(X) = α + βE + γC + δE × C, where P(X) = Pr(D = 1 | E, C, E × C)

Interaction is then assessed by testing the null hypothesis H0: δ = 0 using either a Wald test or a likelihood ratio test (preferred), where the test statistic is chi square with 1 df under H0:

Wald = (δ̂ / s_δ̂)² ~ χ² with 1 df under H0
LR = −2 ln L_R − (−2 ln L_F) ~ χ² with 1 df under H0
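The Wald statistic in the formula above is simply the squared ratio of the estimated coefficient to its standard error, referred to a chi-square distribution with 1 df. A sketch in Python, illustrated with the PRHPAM estimates (δ̂ = 0.9374, SE = 0.8432) from the Model III* output in the previous section:

```python
import math

def wald_chisq(estimate, std_err):
    """Wald chi-square statistic (1 df) and its upper-tail p-value."""
    chisq = (estimate / std_err) ** 2
    p = math.erfc(math.sqrt(chisq / 2))   # P(chi-square_1df > chisq)
    return chisq, p

chisq, p = wald_chisq(0.9374, 0.8432)     # about 1.236 and 0.266: not significant
```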

Answers about screening (continued):

Q2. Should the use of Method 0 depend on the types of predictors?

Yes, it depends. Method 0 makes the most sense when the model involves only Es, i.e., no potential confounders (Cs) and no corresponding changes in ORs are being considered. Method 0 is questionable whenever there are variables being controlled (Cs):

Use Method 0 if several Es and Cs? No.
Use Method 0 if one E and several Cs? No.
Use Method 0 if only Es? Yes.

Q3. How large does k have to be relative to n in order to justify screening?

There are no clear-cut rules for "large k." However, you will become aware that screening should be considered if your initial model does not run (see the next section on collinearity).

Q4. Are there other ways to carry out (one-at-a-time) screening, and when, if at all, should they be preferred to the typical approach?

Yes. There are several reasonable options for screening, all of which are variations of ways to assess possible confounding and/or effect modification involving covariates (e.g., using stratified analysis instead of logistic regression). Such options should be preferred whenever there is a mixture of Es and Cs to be considered.

Q5. Where does collinearity assessment fit in with this problem?

Collinearity may be considered prior to screening and/or after screening is performed. If the initial model does not run, typical collinearity diagnostics (e.g., condition indices, to be described in the next section) cannot be obtained, so you must start with screening. Also, once screening has been completed, collinearity assessment may determine that the reduced model (after screening) is still unreliable, in which case collinearity diagnostics should then be considered.

Question 2 (continued): What if the model contains only Es (no Cs considered)?

Following up on a previous question (Q2), suppose your starting model involves several Es but no Vs or Ws, i.e., no Cs are considered. How do you carry out (one-at-a-time) screening in this situation? We recommend using Method 0 here. Why? Because when there are no Cs to consider, confounding is not an issue. Consequently, using statistical testing for one-at-a-time screening of Es is appropriate.

Suppose instead that your initial model involves several Es and Cs. How do you carry out screening in this situation? The answer, as is often the case, is: it depends!

Option 1: You may decide to screen only the Cs, without using Method 0, and then consider the Es during your modeling strategy process.

Option 2: If you have large numbers of both E and C variables, you might screen both types of variables, making sure not to use Method 0 for the Cs while using Method 0 for the Es.

EXAMPLE

Examples of Screening: Single E and C1, C2, ..., C10

We now provide a few simple examples to illustrate screening. We will begin by assuming that we have a single E variable and ten C variables, C1, C2, ..., C10. Four different screening scenarios for this situation are described below. Which of these is "legitimate"?

i. Crude analyses relating D to each Ci identify only C1 and C4 to be significant predictors of D. The starting model then contains E, C1, C4, EC1, and EC4. The best model is determined using the hierarchical backward elimination (HBWE) approach outlined in Chap. 6.

ii. Stratified analyses relating D to E and each Ci identify C1 and C4 to be individual confounders, and C5 to be an effect modifier of the E, D effect. The starting model then contains E, C1, C4, C5, EC1, EC4, and EC5. The best model is determined using HBWE.

iii. Crude analyses relating D to each Ci identify only C1 and C4 to be significant predictors of D. The starting model then contains E, C1, and C4. Backward elimination on the Cs eliminates C4 but retains C1 (and E). The interaction term EC1 is then added. The best model is determined using HBWE.

iv. Logistic regression models relating D to E and each Ci identify C1 and C4 to be individual confounders, and C5 to be an effect modifier of the E, D effect. The starting model then contains E, C1, C4, C5, EC1, EC4, and EC5. The best model is determined using HBWE.

The answer to the above question is that scenarios ii and iv represent "legitimate" methods of screening, because neither scenario involves a significance test of a crude association between Ci and D. Scenarios i and iii incorrectly use significance tests to screen individual Ci; scenario iii differs from i in that backward elimination is (questionably) performed on the Cs before interaction is assessed.

Summarizing our main points about Method 0:
1. Method 0 does not consider confounding and/or interaction for predictors treated one at a time.
2. Method 0 makes most sense when the model involves only Es, but is questionable when both Es and Cs are being considered.

IV. Collinearity

Collinearity concerns the extent to which one or more of the predictor variables (the Xs) in one's model can be predicted from the other Xs in the model. If there are very strong relationships among some of the Xs, then the fitted model may yield unreliable estimated regression coefficients (b̂j) for some predictors. In other words, some coefficients may have high estimated variances, or the model may not even run. When this occurs, we say that the model has a collinearity problem.

Because collinearity problems may involve relationships among more than two Xs, it is not sufficient to diagnose collinearity by simply looking at correlations among pairs of variables. For example, if X3 were approximately equal to the difference between X1 and X2, this relationship could not be detected simply by looking at the pairwise correlations r(X3,X1), r(X3,X2), or r(X1,X2).

Presentation: IV. Collinearity 271

One popular way to diagnose collinearity uses a computer program or macro that produces a table containing two kinds of information: condition indices (CNIs) and variance decomposition proportions (VDPs). (See Kleinbaum et al., Applied Regression Analysis and Other Multivariable Methods, 4th Edition, Chap. 14, 2008, for mathematical details about CNIs and VDPs.)

Using such a table, a collinearity problem is diagnosed if
1. the largest of the CNIs is considered large (e.g., >30), and
2. at least two of the VDPs are large (e.g., >= 0.5).

[The illustrated diagnostic table, with a row of VDPs for each coefficient b0 through b7, is omitted here.] That table indicates at least one collinearity problem involving the variables E, C3, and E × C3, because the largest CNI exceeds 30 and two of the VDPs are as large as 0.5.

We now describe briefly how collinearity is diagnosed conceptually, and how this relates to available computer software for nonlinear models such as the logistic regression model. The objective of collinearity diagnostics is to determine whether (linear) relationships among the predictor variables result in a fitted model that is "unreliable." This essentially translates to determining whether one or more of the estimated variances Var(b̂j) (or the corresponding standard errors) become "large enough" to indicate unreliability.

The estimated variances are the diagonal components of the estimated variance-covariance matrix (V̂) obtained for the fitted model; the off-diagonal components are the estimated covariances. For nonlinear models in which ML estimation is used, the V̂ matrix is the inverse of the information matrix (I^-1) and is derived by taking the second derivatives of the likelihood function (L).
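To make the CNI/VDP recipe concrete, here is a small numeric sketch of Belsley-style diagnostics for a linear design matrix: columns are scaled to unit length, condition indices are ratios of singular values, and each coefficient's variance is decomposed across the singular directions. The simulated data reproduce the earlier X3 ≈ X1 − X2 example; this linear-model sketch is ours for illustration and is not the macro discussed in the text.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Belsley-style diagnostics for a design matrix X.
    Returns condition indices (largest last) and a VDP matrix whose
    row k gives, for each coefficient, the share of its variance
    attributable to the k-th singular direction."""
    Xs = X / np.linalg.norm(X, axis=0)           # scale columns to unit length
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    cni = s[0] / s                               # condition indices
    phi = (Vt.T ** 2) / s**2                     # phi[j, k] = v_jk^2 / d_k^2
    vdp = (phi / phi.sum(axis=1, keepdims=True)).T
    return cni, vdp

# Three predictors with X3 ~ X1 - X2: a collinearity that involves all
# three variables, so no single pairwise correlation gets close to 1
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
x3 = x1 - x2 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2, x3])

r = np.corrcoef(X, rowvar=False)
max_abs_r = np.abs(r[np.triu_indices(3, k=1)]).max()   # stays well below 1

cni, vdp = collinearity_diagnostics(X)
# Diagnosis rule from the text: largest CNI > 30 and at least two VDPs >= 0.5
problem = (cni[-1] > 30) and ((vdp[-1] >= 0.5).sum() >= 2)
```

Here the largest pairwise correlation is only moderate, yet the largest condition index is far above 30 and the VDPs at that index flag all three variables, illustrating why pairwise correlations alone are not sufficient.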

The CNIs and VDPs previously introduced are in turn derived from the V̂ matrix. The CNIs are used to identify whether or not a collinearity problem exists, and the VDPs are used to identify the variables that are the source of any collinearity problem. (Again, see Kleinbaum et al., 2008, Chapter 14, for a more mathematical description of CNIs, VDPs, and I^-1.)

Unfortunately, popular computer packages such as SAS, STATA, and SPSS do not contain programs (e.g., SAS's LOGISTIC procedure) that compute CNIs and VDPs for nonlinear models. However, a SAS macro (Zack et al.), developed at CDC and modified at Emory University's School of Public Health, allows computation of CNIs and VDPs for logistic and other nonlinear models (see Bibliography). We illustrate the use of this macro shortly.

Nevertheless, there are difficulties in diagnosing collinearity. These include determining how large is "large" for both the CNIs and the VDPs, and deciding how to proceed if a collinearity problem is found.

The classic textbook on collinearity diagnostics (Belsley, Kuh, and Welsch, 1980) recommends a cut-off of 30 for identifying a high CNI and a cut-off of 0.5 for identifying a high VDP. Nevertheless, these values were clearly described as "guidelines" rather than firm cut-points, and they were specified for linear regression models only. To what extent the guidelines (particularly for CNIs) should be modified (e.g., lowered) for nonlinear models remains an open question. Moreover, even for linear models, there is considerable flexibility in deciding how high is high.

For either linear or logistic models, we recommend that the largest CNI be "considerably" larger than 30 before deciding that one's model is unreliable, and then focusing on the VDPs corresponding to the largest CNI before addressing any other CNI. This is essentially a sequential approach: address the most likely collinearity problem first (drop a variable, refit the model, reassess collinearity) and continue until no collinearity problem remains.

Option 1 for correcting collinearity (the most popular): drop one of the variables identified (by the VDPs) as a source of the problem. If, for example, the VDPs identify the variables X1, X2, and X1 × X2, the typical solution is to drop the product term X1 × X2 from the model. Nevertheless, dropping such a term does not mean that the term is nonsignificant, but rather that having it together with the other collinear variables makes the model unreliable. So, by dropping an interaction term in this situation, we indicate that the interaction cannot be assessed, rather than that it is nonsignificant.

Option 2 for correcting collinearity: define a new variable from the variables causing the problem, provided the new variable is (conceptually and/or clinically) interpretable. Combining collinear variables will rarely make sense if a product term is a source of the problem. However, if, for example, main-effect variables such as height and weight were involved, then the derived variable BMI (= weight/height^2) might be used to replace both height and weight in the model.

We now illustrate the use of collinearity diagnostics for the MRSA dataset we have described earlier. We consider the initial model

Logit P(X) = a + b1E1 + b2E2 + g1V1 + g2V2 + d11E1W1 + d12E1W2 + d21E2W1 + d22E2W2 + d*E1E2,

which contains two Es, two Vs, four EWs, and a single EE, and where D = MRSA status (0, 1), E1 = PREVHOSP, E2 = PAMU, V1 = W1 = AGE, and V2 = W2 = GENDER.

Using the collinearity macro introduced above, we obtain a collinearity diagnostic table (edited output omitted here). From that table, we see that the highest CNI is 45.6, which is considerably higher than 30, and that there are two VDPs greater than 0.5, corresponding to the variable PREVHOSP (VDP = 0.95) and the product term PREVHOSP × AGE (VDP = 0.85). Based on these results, we decide that there is at least one collinearity problem associated with the highest CNI and that this problem involves the two variables PREVHOSP and PREVHOSP × AGE.

Proceeding sequentially, we drop the product term PREVHOSP × AGE from the model and reassess collinearity for the resulting reduced model. In the new diagnostic table, the highest CNI is now 34.3, which is slightly higher than 30, and there are two VDPs greater than 0.5, corresponding to the variable AGE (VDP = 0.74) and the product term PAMU × AGE (VDP = 0.75).

Since the highest CNI here (34.3) is only slightly above 30, we might decide that this value is not high enough to proceed further with collinearity assessment. Alternatively, proceeding conservatively, we could drop the product term PAMU × AGE and assess collinearity once more.

Presentation: V. Influential Observations 275

The collinearity diagnostics obtained when PAMU × AGE is dropped from the model show a largest CNI of 21.5, which is much smaller than 30. Thus, we conclude that after dropping both PREVHOSP × AGE and PAMU × AGE, there are no remaining collinearity problems.

So, after assessing collinearity in our MRSA example, we have arrived at the reduced model

Logit P(X) = a + b1E1 + b2E2 + g2V2 + d12E1W2 + d22E2W2 + d*E1E2.

(Note: E1W1 and E2W1 have been removed from the initial model.) This model then becomes a "revised" initial model from which we determine a final ("best") model using the hierarchical backward elimination (HBWE) strategy we have previously recommended.

V. Influential Observations

Another diagnostic issue concerns influential observations: those subjects (if any) in one's dataset that strongly "influence" the study results. Technically, a subject is an influential observation if removal of that subject from the dataset results in a "significant" change in one or more of the estimated b̂j (or in the estimated ORs of interest in a logistic model).

A popular approach for identifying influential observations is to compute, for each study subject, a measure of the change in one or more estimated regression coefficients when the subject is dropped from the data. For a given variable in the model, this measure is called a Delta-beta (Δbj).

For example, a model containing four predictors, say E, AGE, RACE, and SEX, would produce four Delta-betas for each subject i: ΔbE,i, ΔbAGE,i, ΔbRACE,i, and ΔbSEX,i. If the dataset contained 100 subjects (i = 1, 2, . . . , 100), 400 Delta-beta values would be computed, four for each subject.

If subject A, say, has a "large" or "significant" Delta-beta for the variable E (ΔbE,i=A), then one may conclude that removal of this subject from the analysis may change the conclusions drawn about the effect of E.

Also, a summary measure that combines the Delta-beta information from all variables is typically computed. For linear regression, one such measure is Cook's distance (CD), which combines the Δbj,i information over all predictor variables Xj in one's model, e.g., a weighted average of the form

CDi = Σj wj Δbj,i / Σj wj,  i = 1, 2, . . . , n.

For logistic regression, a similar measure (Pregibon, 1981) is used, often referred to as a Cook's distance-type index. This measure is derived using an approximation to the change in logit values when a subject is dropped from the data and, in essence, combines Delta-betas using a logistic model.

However, since the effect measure in logistic regression is typically an odds ratio, which exponentiates regression coefficients, a modified Cook's distance-type index that computes a weighted average of changes in exp[b], i.e.,

CD*i = Σj wj Δ(exp[b])j,i / Σj wj,

might be preferable, but it is not available in most computer packages.
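A brute-force way to obtain Delta-betas is simply to refit the model with each subject removed in turn. The sketch below does this with a bare-bones Newton-Raphson logistic fit and then forms a crude Cook-type summary by averaging each subject's squared Delta-betas with hypothetical inverse-variance weights. Packaged routines (e.g., Pregibon's index) instead use a one-step approximation, so this illustrates the idea rather than any package's exact statistic.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """ML fit of a logistic model by Newton-Raphson; returns coefficients."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        info = X.T @ (X * (p * (1.0 - p))[:, None])
        b = b + np.linalg.solve(info, X.T @ (y - p))
    return b

# Simulated data with one deliberately contrary, extreme subject (index 0)
rng = np.random.default_rng(7)
n = 80
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x))))
x[0], y[0] = 4.0, 0          # large x but outcome 0: a candidate influential point
X = np.column_stack([np.ones(n), x])

b_full = fit_logit(X, y)

# Delta-beta for subject i: full-data coefficients minus leave-one-out coefficients
delta_beta = np.empty((n, 2))
for i in range(n):
    keep = np.arange(n) != i
    delta_beta[i] = b_full - fit_logit(X[keep], y[keep])

# Crude Cook-type summary per subject (inverse-variance weights: our choice)
w = 1.0 / delta_beta.var(axis=0)
cook_like = (delta_beta**2 * w).sum(axis=1) / w.sum()
most_influential = int(np.argmax(cook_like))
```

Dropping the contrary extreme point lets the slope estimate rise, so its slope Delta-beta is negative and its Cook-type score stands out from the rest of the sample.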

We now briefly illustrate, using the MRSA example, how to use computer software to diagnose influential observations. We also discuss how to proceed with the analysis when influential observations are identified.

Most computer software packages, such as SAS, STATA, and SPSS, allow the user to obtain influence diagnostics for logistic regression, although each has its own version of the program code and of the statistics produced. For example, with SAS's LOGISTIC procedure, the user can specify two options after the model statement: the "influence" option and the "iplots" option.

Both of these LOGISTIC options produce a large collection of regression diagnostic information for any fitted model, including Delta-beta measures for each variable in the model plus overall Cook's distance-type measures. Here, we focus on the latter, which we henceforth refer to as "C measures." Two slightly different C measures are produced by the influence and iplots options, a "C" and a "Cbar" measure (Pregibon, 1981). These measures typically yield similar, though not always identical, conclusions as to which subjects are "influential."

The influence option produces a figure that lists each subject vertically (on the Y-axis) and plots the value of the influence measure (C or Cbar) horizontally (on the X-axis). The iplots option, on the other hand, produces a figure that lists the subjects horizontally and plots the influence measure on the vertical axis.

[Two figures omitted: the influence-option and iplots-option plots of the C measure for the first 42 subjects in the MRSA dataset.] In both figures, subjects 9 and 16 appear to have C scores much higher than the other subjects' scores. The fitted model is the MRSA no-interaction model:

Logit P(X) = a + b1PREVHOSP + b2PAMU + g1AGE + g2GENDER.

These results indicate that if either or both of these subjects were dropped from the dataset, the collection of estimated regression coefficients in the fitted model would meaningfully change, which, in turn, could result in meaningfully different estimated ORs (e.g., exp[b̂1]). Since the above figures consider only 42 of a total of 289 subjects, there may be other influential subjects.

Without looking for other influential subjects, we show below the output obtained for the no-interaction model when subjects 9 and 16 are dropped from the dataset, followed by the output for the same model fit to the full dataset.

No-interaction model without subjects 9 and 16:

Param      DF  Estimate  Std Err  exp[coeff]
Intercept   1   -5.3830   0.8018      -
PREVHOSP    1    1.6518   0.4237    5.217
PAMU        1    1.8762   0.3809    6.528
AGE         1    0.0370   0.0095    1.038
GENDER      1    0.9214   0.3809    2.513

No-interaction model, full data:

Param      DF  Estimate  Std Err  exp[coeff]
Intercept   1   -5.0583   0.7643      -
PREVHOSP    1    1.4855   0.4032    4.417
PAMU        1    1.7819   0.3707    5.941
AGE         1    0.0353   0.0092    1.036
GENDER      1    0.9329   0.3418    2.542

These results indicate that, particularly for the E variables PREVHOSP and PAMU, the corresponding b̂j and exp[b̂j] are somewhat different, although both sets of results indicate strong and statistically significant effects.

So, if we decide that some subjects (e.g., 9 and 16) are truly influential, what should we do? Drop them from the dataset? The answer, once again, is: it depends! A large influence statistic may be due to incorrect data on one or more subjects, but it can also be the result of an incorrect model, or it may even reflect the legitimate importance of (correct) data on a given subject.

Certainly, if a subject's data are erroneous, they should be corrected if possible. If such an error is not clearly correctable, then the subject may be dropped from the analysis.

However, if the data on an influential subject are not erroneous, the researcher has to decide whether the subject should be dropped. For example, if the subject is much older than most subjects (i.e., an outlier), the researcher may have to decide whether the age range initially allowed needs to be modified. Instead of deleting such an individual, the researcher may wish to report and interpret the presence of influential subjects in the "discussion" of results.

It is typically difficult to determine whether a large influence statistic results from an inappropriate model. Since the initial model is rarely one's final (i.e., best) model, a final decision as to whether a given subject is influential should wait until one's final model is determined.

In summary, the researcher must be careful when considering whether or not to delete an observation. A very conservative approach is to delete an observation only if it is obviously in error and cannot be corrected.

VI. Multiple Testing

The modeling strategy guidelines we have described when one's model contains either a single E (Chaps. 6 and 7) or several Es (earlier in this chapter) all involve carrying out statistical significance tests for interaction terms as well as for E terms. Nevertheless, performing several such tests on the same dataset may yield an incorrect "overfitted" final model if "too many" test results are found to be significant.

By "too many," we mean that the null hypothesis may actually be true for some significant test results, so that some "significant" variables (e.g., interaction terms) may remain in the final model even though the corresponding null hypotheses are true. This raises the question of whether we should adjust our modeling strategy to account for the number of statistical tests we perform and, if so, how we should carry out such adjustment.

A well-established statistical inference principle is that the more statistical tests one performs, the more likely at least one of them will reject its null hypothesis even if all null hypotheses are true. The parameter

a* = Pr(reject at least one H0i | all H0i true)

is often called the family-wise error rate (FWER), whereas the significance level a for an individual test, a = Pr(reject H0i | H0i true), is called the test-wise error rate. Mathematically, the above principle can be expressed by the formula

a* = 1 - (1 - a)^T,

where T is the number of independent tests of the H0i, i = 1, . . . , T.

For example, the table below shows that if a = 0.05 and T increases from 1 to 5 to 10 to 20, then a* increases from 0.05 at T = 1 to 0.64 at T = 20.

 T    a*
 1   0.05
 5   0.23
10   0.40
20   0.64
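The FWER formula is easy to verify numerically. The short sketch below (function names ours) reproduces the table's values and also previews the Bonferroni idea discussed next in the text: testing each of T = 10 hypotheses at level a0/T holds the family-wise error rate just under a0 = 0.05.

```python
def fwer(alpha, t):
    """Family-wise error rate for T independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** t

# Reproduce the table: a = 0.05 with T = 1, 5, 10, 20
table = {t: round(fwer(0.05, t), 2) for t in (1, 5, 10, 20)}
# table == {1: 0.05, 5: 0.23, 10: 0.4, 20: 0.64}

# Bonferroni correction: test each hypothesis at a0/T
a0, t = 0.05, 10
alpha_bonf = a0 / t             # 0.005
achieved = fwer(alpha_bonf, t)  # about 0.049, just under the desired 0.05
```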

Presentation: VI. Multiple Testing 281

A popular (Bonferroni) approach for ensuring that a* never exceeds a desired FWER of, say, a0 is to require the significance level (a) for each test to be a0/T. To illustrate, if a0 = 0.05 and T = 10, then a = 0.05/10 = 0.005, and a* calculates to 1 - (1 - 0.005)^10 = 0.049, close to 0.05.

A problem, however, with the Bonferroni approach is that it "over-adjusts" by making it more difficult to reject any given H0i; that is, its "power" to reject true alternative hypotheses is typically too low, so the model may be underfitted. Bonferroni-type alternatives (e.g., Sidak, 1967; Holm, 1979; Hochberg, 1988) have been offered to provide increased power and to allow for nonindependent significance tests.

Moreover, another adjustment approach (Benjamini and Hochberg, 1995) replaces the overall goal of adjustment, from obtaining a desired family-wise error rate (FWER) to obtaining a desired false discovery rate (FDR) = T0/T, the proportion of rejected null hypotheses that are incorrectly rejected (i.e., truly Type 1 errors), where T0 is the number of tests that incorrectly reject H0i (i.e., with H0i true) and T is the total number of tests that reject.

Nevertheless, there remains some controversy in the methodologic literature (Rothman, 1990) as to whether any attempt to correct for multiple testing is even warranted. Criticisms of "adjustment" include: (1) the assumption of a "universal" null hypothesis that all H0i are true is unrealistic; (2) paying a "penalty for peeking" (Light and Pillemer, 1984) reduces the importance of specific tests of interest; and (3) where does the need for correcting for multiple testing stop, considering all the tests that an individual researcher performs?

Finally, the literature on multiple testing focuses on the situation in which the researcher knows in advance how many tests are to be performed. This is not the situation being addressed when carrying out a modeling strategy to determine a "best" model, since the number of tests, say for interaction terms, is only determined during the process of obtaining one's final model. Consequently, when determining a "best" model, a Bonferroni-type adjustment is not possible, since the number of tests (T) to be performed cannot be specified in advance.

One ad hoc procedure for reducing the number of tests, nevertheless, is to use the result of a nonsignificant chunk test to drop all the variables in the chunk (e.g., all interaction terms), rather than continuing with backward elimination, which uses more tests. The drawback is that backward elimination may detect significant (interaction) effects that would be overlooked when using only a chunk test.

Thus, in summary, there is no foolproof method for adjusting for multiple testing when determining a best model. It is up to the researcher to decide whether, and how, to adjust at all.

Presentation: VII. Summary 283

VII. SUMMARY

This presentation is now complete. We have described five issues on modeling strategy guidelines that were not covered in the previous two chapters on this topic:

1. Modeling strategy when there are two or more exposure variables
2. Screening variables when modeling
3. Collinearity diagnostics
4. Influential observations
5. Multiple testing

Each of these issues represents an important feature of any regression analysis that typically requires attention when determining a "best" model.

Regarding issue 1, we recommend that the initial model have the general form

Logit P(X) = a + Σ(i=1 to q) biEi + Σ(j=1 to p1) gjVj + Σ(i=1 to q) Σ(k=1 to p2) dikEiWk + Σ(i=1 to q) Σ(i'=1 to q, i'≠i) d*ii'EiEi'.

This model involves Es, Vs, EWs, and EEs, so there are two types of interaction terms to consider. The recommended strategy is then:

Step 1: Define the initial model (above formula).
Step 2: Assess interaction: an (optional) overall chunk test, then tests for the EWs, after which a choice must be made whether to test the EE terms prior or subsequent to assessing confounding and precision.
Step 3: Assess confounding and precision (the Vs).
Step 4: Test for nonsignificant Es that are not components of significant EEs.

Regarding issue 2, we described an approach (called Method 0) in which predictors are considered one at a time, and those Xi not significantly associated with the (binary) outcome D are screened out (i.e., removed from one's initial model). Method 0 does not consider confounding and/or interaction for predictors treated one at a time. Thus, Method 0 makes most sense when the model involves only Es, but is questionable when both Es and Cs are being considered.

For issue 3, we described how collinearity can be diagnosed from two kinds of information: condition indices (CNIs) and variance decomposition proportions (VDPs). A collinearity problem is indicated if the largest of the CNIs is considered large (e.g., >30) and at least two of the VDPs are large (e.g., >= 0.5). Difficulties that remain when assessing collinearity include how large is large for the CNIs and VDPs, and how to proceed (e.g., sequentially?) once a problem is identified.

Issue 4, concerning influential observations, is typically addressed using measures of the extent to which estimated regression coefficients change when one or more data points (i.e., subjects) are dropped from one's model. Measures that focus on such changes in specific regression coefficients of interest are called Delta-betas (Δbj), whereas measures that combine changes over all regression coefficients (Xj) in one's model are called Cook's distance-type measures (C). Computer programs for logistic regression provide plots of these measures for each subject; subjects with extreme plotted values are typically identified as "influential."

The researcher must be careful when considering whether or not to delete an observation. A conservative approach is to delete an observation only if its data are in error and cannot be corrected.

Issue 5 (multiple testing) concerns whether or not the researcher should adjust the significance level (a) used for significance tests to account for the number of such tests performed.

Presentation: VII. Summary 285

Controversial issue:
    Use a Bonferroni-type adjustment
    vs.
    Do not do any adjustment

This is a controversial issue: various Bonferroni-type corrections have been recommended, but there are also conceptual arguments against any such adjustment.

When determining the best model:
    No well-established solution
    Number of tests not known in advance

Nevertheless, when carrying out the process of finding a "best" model, there is no well-established method for such adjustment, since the number of tests actually performed cannot be known in advance.

We suggest that you review the material covered in this chapter by reading the detailed outline that follows. Then do the practice exercises and test.

In the next two chapters, we address two other regression diagnostic procedures: goodness-of-fit tests and ROC curves.
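For concreteness, a Bonferroni-type adjustment is simple arithmetic: divide the desired family-wise significance level by the number of tests, or, equivalently, multiply each p-value by the number of tests before comparing it with the original α. The figures below are hypothetical.

```python
# Family-wise significance level alpha, spread over m tests
alpha = 0.05
m = 10                      # number of tests performed (hypothetical)
alpha_per_test = alpha / m  # Bonferroni-adjusted per-test level

# Equivalent view: inflate each p-value and compare with the original alpha
p_values = [0.004, 0.03, 0.20]
p_adjusted = [min(m * p, 1.0) for p in p_values]

print("per-test alpha:", alpha_per_test)
print("adjusted p-values:", p_adjusted)
```

As the text notes, the catch in model building is that m, the number of tests you will end up performing, is not known in advance.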

Detailed Outline

I. Overview (page 244)
    Focus: Five issues not considered in Chaps. 6 and 7
    - Apply to any regression analysis, but focus on the binary logistic model
    - Goal: determine the "best" model
    1. Modeling strategy when there are two or more exposure variables
    2. Screening variables when modeling
    3. Collinearity diagnostics
    4. Influential observations
    5. Multiple testing

II. Modeling Strategy for Several Exposure Variables (pages 244–262)
    A. Extend the modeling strategy to a (0,1) outcome, k exposures (Es), and p control variables (Cs)
    B. Example with two Es: cross-sectional study, Grady Hospital, Atlanta, GA, 297 adult patients
        Diagnosis: Staphylococcus aureus infection
        Question: Are PREVHOSP and PAMU associated with MRSA status, controlling for AGE and GENDER?
    C. Modeling strategy summary: several Es and Cs

        Model:
        logit P(X) = α + Σ_{i=1..q} βi Ei + Σ_{j=1..p1} γj Vj
                       + Σ_{i=1..q} Σ_{k=1..p2} δik Ei Wk
                       + Σ_{i=1..q} Σ_{i'=1..q, i'≠i} δ*ii' Ei Ei'

        Step 1: Define the initial model (the formula above)
        Step 2: Assess interaction
            Option A: Overall chunk test, then Option B or C
            Option B: Test EW terms, then EE terms
            Option C: Test EW terms, but assess Vs before EE terms
        Step 3: Assess confounding and precision (Vs)
            Options A and B (continued): assess Vs after EWs and EEs
            Option C (continued): assess Vs after EWs, but prior to EEs
        Step 4: Test for nonsignificant Es that are not components of significant EE terms
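Step 2's Option A calls for an overall chunk test of all interaction terms at once. This can be carried out as a likelihood ratio test comparing the initial model against the model with every EW and EE product term removed. The sketch below illustrates the idea on simulated data; the variable names echo the MRSA example, but the data and the Newton-Raphson fitting routine are entirely illustrative, not the study's actual data or software.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Newton-Raphson logistic fit; returns (coefficients, log-likelihood)."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        b = b + np.linalg.solve(X.T @ (X * (p * (1.0 - p))[:, None]),
                                X.T @ (y - p))
    p = np.clip(1.0 / (1.0 + np.exp(-X @ b)), 1e-12, 1.0 - 1e-12)
    return b, float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Simulated data loosely patterned on the example: two Es and two Vs
rng = np.random.default_rng(3)
n = 297
E1, E2 = rng.integers(0, 2, (2, n))      # e.g., PREVHOSP and PAMU (hypothetical)
V1 = rng.normal(size=n)                  # e.g., standardized AGE
V2 = rng.integers(0, 2, n)               # e.g., GENDER
y = rng.integers(0, 2, n).astype(float)  # e.g., MRSA status

ones = np.ones(n)
# Reduced model: Es and Vs only; full model adds all EW and EE product terms
X_red = np.column_stack([ones, E1, E2, V1, V2])
X_full = np.column_stack([X_red, E1 * V1, E1 * V2, E2 * V1, E2 * V2, E1 * E2])

_, ll_red = fit_logit(X_red, y)
_, ll_full = fit_logit(X_full, y)

lr = 2.0 * (ll_full - ll_red)  # chunk statistic on 5 df (5 product terms)
print(f"LR chunk statistic = {lr:.2f}; compare with chi-square(5) cutoff 11.07")
```

A significant chunk statistic sends you on to Option B or C to test the individual EW and EE terms.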

