ANOVA table for X, Y and XY can now be set up as shown below:

Anova Table for X, Y and XY

Source            d.f.   SS for X            SS for Y            Sum of product XY
Between groups      2          1588.13             519.60              908
Within groups      12   E_XX =  271.60     E_YY =  274.40      E_XY =   198
Total              14   T_XX = 1859.73     T_YY =  794.00      T_XY =  1106

Adjusted total SS = T_XX – (T_XY)²/T_YY
                  = 1859.73 – (1106)²/794
                  = (1859.73) – (1540.60) = 319.13

Adjusted SS within group = E_XX – (E_XY)²/E_YY
                         = 271.60 – (198)²/274.40
                         = (271.60) – (142.87) = 128.73

Adjusted SS between groups = (adjusted total SS) – (adjusted SS within group)
                           = (319.13 – 128.73) = 190.40

Anova Table for Adjusted X

Source            d.f.     SS        MS      F-ratio
Between groups      2     190.40     95.2      8.14
Within group       11     128.73     11.7
Total              13     319.13

At 5% level, the table value of F for v1 = 2 and v2 = 11 is 3.98 and at 1% level the table value of F is 7.21. Both these table values are less than the calculated value (i.e., the calculated value of 8.14 is greater than the table values) and accordingly we infer that the F-ratio is significant at both levels, which means the difference in group means is significant.

Adjusted means on X will be worked out as follows:

Regression coefficient for X on Y, i.e., b = Sum of product within groups / Sum of squares within groups for Y
                                           = 198/274.40 = 0.7216

               Final means of groups in X     Deviation of initial group means from
               (unadjusted)                   general mean (= 14) in case of Y
Group I                9.80                             –7.40
Group II              22.80                              0.40
Group III             35.00                              7.00

Adjusted means of groups in X = (Final mean) – b (deviation of initial mean from general mean in case of Y)

Hence,
Adjusted mean for Group I   = (9.80)  – 0.7216 (–7.40) = 15.14
Adjusted mean for Group II  = (22.80) – 0.7216 (0.40)  = 22.51
Adjusted mean for Group III = (35.00) – 0.7216 (7.00)  = 29.95

Questions

1. (a) Explain the meaning of analysis of variance. Describe briefly the technique of analysis of variance for one-way and two-way classifications.
   (b) State the basic assumptions of the analysis of variance.
2. What do you mean by the additive property of the technique of the analysis of variance? Explain how this technique is superior in comparison to sampling.
3. Write short notes on the following:
   (i) Latin-square design.
   (ii) Coding in context of analysis of variance.
   (iii) F-ratio and its interpretation.
   (iv) Significance of the analysis of variance.
4. Below are given the yields per acre of wheat for six plots entering a crop competition, three of the plots being sown with wheat of variety A and three with variety B.

   Variety        Yields in fields per acre
                    1      2      3
     A             30     32     22
     B             20     18     16

   Set up a table of analysis of variance and calculate F. State whether the difference between the yields of the two varieties is significant, taking 7.71 as the table value of F at 5% level for v1 = 1 and v2 = 4.
   (M.Com. II Semester EAFM Exam., Rajasthan University, 1976)
5. A certain manure was used on four plots of land A, B, C and D. Four beds were prepared in each plot and the manure used. The output of the crop in the beds of plots A, B, C and D is given below:

   Output on Plots

      A      B      C      D
      8      9      3      3
     12      4      8      7
      1      7      2      8
      3      1      5      2

   Find out whether the difference in the means of the production of crops of the plots is significant or not.
6. Present your conclusions after doing analysis of variance to the following results of the Latin-square design experiment conducted in respect of five fertilizers which were used on plots of different fertility.

   A 16   B 10   C 11   D 09   E 09
   E 10   C 09   A 14   B 12   D 11
   B 15   D 08   E 08   C 10   A 18
   D 12   E 06   B 13   A 13   C 12
   C 13   A 11   D 10   E 07   B 14

7. Test the hypothesis at the 0.05 level of significance that µ1 = µ2 = µ3 for the following data:

   Samples      No. one      No. two      No. three
                  (1)          (2)           (3)
                   6            2             6
                   7            4             8
                   6            5             9
                  –3            5            –4
                   3            2             9
   Total          19           18            28

8. Three varieties of wheat W1, W2 and W3 are treated with four different fertilizers viz., f1, f2, f3 and f4. The yields of wheat per acre were as under:

   Fertilizer          Varieties of wheat
   treatment        W1      W2      W3      Total
   f1               55      72      47       174
   f2               64      66      53       183
   f3               58      57      74       189
   f4               59      57      58       174
   Total           236     252     232       720

   Set up a table for the analysis of variance and work out the F-ratios in respect of the above. Are the F-ratios significant?
9. The following table gives the monthly sales (in thousand rupees) of a certain firm in three states by its four salesmen:

                          Salesmen
   States         A      B      C      D      Total
   X              5      4      4      7        20
   Y              7      8      5      4        24
   Z              9      6      6      7        28
   Total         21     18     15     18        72

   Set up an analysis of variance table for the above information. Calculate F-coefficients and state whether the difference between sales affected by the four salesmen and the difference between sales affected in the three States are significant.
10. The following table illustrates the sample psychological health ratings of corporate executives in the fields of Banking, Manufacturing and Fashion retailing:

   Banking               41     53     54     55     43
   Manufacturing         45     51     48     43     39
   Fashion retailing     34     44     46     45     51

   Can we consider the psychological health of corporate executives in the given three fields to be equal at 5% level of significance?
11. The following table shows the lives in hours of randomly selected electric lamps from four batches:

   Batch        Lives in hours
   1            1600   1610   1650   1680   1700   1720   1800
   2            1580   1640   1640   1700   1750
   3            1450   1550   1600   1620   1640   1660   1740   1820
   4            1510   1520   1530   1570   1600   1680

   Perform an analysis of variance of these data and show that a significance test does not reject their homogeneity.
   (M.Phil. (EAFM) Exam., Raj. University, 1979)
12. Is the interaction variation significant in case of the following information concerning mileage based on different brands of gasoline and cars?

   Cars        Brands of gasoline
               W      X      Y      Z
   A          13     12     12     11
              11     10     11     13
   B          12     10     11      9
              13     11     12     10
   C          14     11     13     10
              13     10     14      8

13. The following are paired observations for three experimental groups concerning an experiment involving three methods of teaching performed on a single class.

   Method A to Group I     Method B to Group II     Method C to Group III
      X          Y             X          Y              X          Y
     33         20            35         31             15         15
     40         32            50         45             10         20
     40         22            10         55             10         32
     24         50            33         35             15

   X represents initial measurement of achievement in a subject and Y the final measurement after the subject has been taught. 12 pupils were assigned at random to 3 groups of 4 pupils each, one group for each method as shown in the table. Apply the technique of analysis of covariance for analyzing the experimental results and then state whether the teaching methods differ significantly at 5% level. Also calculate the adjusted means on Y.
   [Ans: F-ratio is not significant and hence there is no difference due to teaching methods. Adjusted means on Y will be as under:
      For Group I    20.70
      For Group II   24.70
      For Group III  22.60]
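As a computational footnote to this chapter (an addition, not part of the original text), the covariance adjustments worked out at the beginning of this section, the kind of calculation also called for in Question 13, can be sketched in a few lines. The sketch below assumes NumPy is available and simply re-uses the totals of the worked example (T_XX = 1859.73, E_XY = 198, and so on); it is not a general ANCOVA routine.

```python
import numpy as np

# Totals taken from the worked ANCOVA example above (X adjusted for the covariate Y).
T_XX, T_YY, T_XY = 1859.73, 794.00, 1106.0   # total sums of squares and products
E_XX, E_YY, E_XY = 271.60, 274.40, 198.0     # within-group sums of squares and products
k, n = 3, 15                                 # number of groups and total observations

# Adjusted sums of squares, as in the text.
adj_total_SS   = T_XX - T_XY ** 2 / T_YY            # 319.13
adj_within_SS  = E_XX - E_XY ** 2 / E_YY            # 128.73
adj_between_SS = adj_total_SS - adj_within_SS       # 190.40

# F-ratio with (k - 1) and (n - k - 1) degrees of freedom (one d.f. lost to the covariate).
F = (adj_between_SS / (k - 1)) / (adj_within_SS / (n - k - 1))   # about 8.14

# Adjusted group means on X, using the within-group regression coefficient b = E_XY / E_YY.
b = E_XY / E_YY                                      # 0.7216
final_means_X  = np.array([9.80, 22.80, 35.00])
dev_means_Y    = np.array([-7.40, 0.40, 7.00])       # deviations of Y group means from 14
adjusted_means = final_means_X - b * dev_means_Y     # about [15.14, 22.51, 29.95]

print(round(F, 2), np.round(adjusted_means, 2))
```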

12

Testing of Hypotheses-II
(Nonparametric or Distribution-free Tests)

It has already been stated in earlier chapters that a statistical test is a formal technique, based on some probability distribution, for arriving at a decision about the reasonableness of an assertion or hypothesis. The test technique makes use of one or more values obtained from sample data [often called test statistic(s)] to arrive at a probability statement about the hypothesis. But such a test technique also makes use of some further assertions about the population from which the sample is drawn. For instance, it may assume that the population is normally distributed, that the sample drawn is a random sample, and similar other assumptions. The normality of the population distribution forms the basis for making statistical inferences about the sample drawn from the population. But no such assumptions are made in case of non-parametric tests.

In a statistical test, two kinds of assertions are involved viz., an assertion directly related to the purpose of investigation and other assertions to make a probability statement. The former is an assertion to be tested and is technically called a hypothesis, whereas the set of all other assertions is called the model. When we apply a test (to test the hypothesis) without a model, it is known as a distribution-free test, or the nonparametric test. Non-parametric tests do not make an assumption about the parameters of the population and thus do not make use of the parameters of the distribution. In other words, under non-parametric or distribution-free tests we do not assume that a particular distribution is applicable, or that a certain value is attached to a parameter of the population. For instance, while testing the two training methods, say A and B, for determining the superiority of one over the other, if we do not assume that the scores of the trainees are normally distributed or that the mean score of all trainees taking method A would be a certain value, then the testing method is known as a distribution-free or nonparametric method. In fact, there is a growing use of such tests in situations when the normality assumption is open to doubt. As a result many distribution-free tests have been developed that do not depend on the shape of the distribution or deal with the parameters of the underlying population. The present chapter discusses a few such tests.

IMPORTANT NONPARAMETRIC OR DISTRIBUTION-FREE TESTS

Tests of hypotheses with ‘order statistics’ or ‘nonparametric statistics’ or ‘distribution-free’ statistics are known as nonparametric or distribution-free tests. The following distribution-free tests are important and generally used:
(i) Test of a hypothesis concerning some single value for the given data (such as one-sample sign test).
(ii) Test of a hypothesis concerning no difference among two or more sets of data (such as two-sample sign test, Fisher-Irwin test, Rank sum test, etc.).
(iii) Test of a hypothesis of a relationship between variables (such as Rank correlation, Kendall’s coefficient of concordance and other tests for dependence).
(iv) Test of a hypothesis concerning variation in the given data i.e., test analogous to ANOVA viz., Kruskal-Wallis test.
(v) Tests of randomness of a sample based on the theory of runs viz., one sample runs test.
(vi) Test of hypothesis to determine if categorical data show dependency or if two classifications are independent viz., the chi-square test. (The chi-square test has already been dealt with in Chapter 10.) The chi-square test can as well be used to make comparison between theoretical populations and actual data when categories are used.

Let us explain and illustrate some of the above stated tests which are often used in practice.

1. Sign Tests

The sign test is one of the easiest nonparametric tests. Its name comes from the fact that it is based on the direction of the plus or minus signs of observations in a sample and not on their numerical magnitudes. The sign test may be one of the following two types:
(a) One sample sign test;
(b) Two sample sign test.

(a) One sample sign test: The one sample sign test is a very simple non-parametric test applicable when we sample a continuous symmetrical population, in which case the probability of getting a sample value less than the mean is 1/2 and the probability of getting a sample value greater than the mean is also 1/2. To test the null hypothesis µ = µ_H0 against an appropriate alternative on the basis of a random sample of size ‘n’, we replace the value of each and every item of the sample with a plus (+) sign if it is greater than µ_H0, and with a minus (–) sign if it is less than µ_H0. But if the value happens to be equal to µ_H0, then we simply discard it. After doing this, we test the null hypothesis that these + and – signs are values of a random variable having a binomial distribution with p = 1/2.* For performing the one sample sign test when the sample is small, we can use tables of binomial probabilities, but when the sample happens to be large, we use the normal approximation to the binomial distribution. Let us take an illustration to apply the one sample sign test.

* If it is not possible for one reason or another to assume a symmetrical population, even then we can use the one sample sign test, but we shall then be testing the null hypothesis µ̃ = µ̃_H0, where µ̃ is the population median.
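As a computational aside (an addition to the text, not part of the original), the small-sample version of this test amounts to a binomial tail probability. The minimal Python sketch below assumes SciPy is available and uses the data of the illustration worked next.

```python
from scipy.stats import binom

def one_sample_sign_test(values, mu0):
    """Sign test of H0: centre = mu0 against the one-sided alternative '< mu0'.

    Values equal to mu0 are discarded; the function returns the number of plus
    signs and the binomial tail probability P(successes <= observed | n, p = 1/2).
    """
    signs = [v for v in values if v != mu0]      # discard items equal to mu0
    n = len(signs)
    plus = sum(1 for v in signs if v > mu0)      # number of '+' signs
    p_value = binom.cdf(plus, n, 0.5)            # P(X <= plus) when p = 1/2
    return plus, p_value

# Golf totals of the illustration below; H0: mu = 284 vs. Ha: mu < 284.
scores = [280, 282, 290, 273, 283, 283, 275, 284, 282, 279, 281]
print(one_sample_sign_test(scores, 284))         # roughly (1, 0.011) -> reject at the 5% level
```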

Illustration 1
Suppose playing four rounds of golf at the City Club 11 professionals totalled 280, 282, 290, 273, 283, 283, 275, 284, 282, 279, and 281. Use the sign test at 5% level of significance to test the null hypothesis that professional golfers average µ_H0 = 284 for four rounds against the alternative hypothesis µ < 284.

Solution: To test the null hypothesis µ = 284 against the alternative hypothesis µ < 284 at 5% (or 0.05) level of significance, we first replace each value greater than 284 with a plus sign and each value less than 284 with a minus sign and discard the one value which actually equals 284. If we do this we get:
–, –, +, –, –, –, –, –, –, –
Now we can examine whether the one plus sign observed in 10 trials supports the null hypothesis p = 1/2 or the alternative hypothesis p < 1/2. The probability of one or fewer successes with n = 10 and p = 1/2 can be worked out as under:

10C1 p¹q⁹ + 10C0 p⁰q¹⁰ = 10 (1/2)¹ (1/2)⁹ + 1 (1/2)⁰ (1/2)¹⁰
                        = 10/2¹⁰ + 1/2¹⁰
                        = 0.010 + 0.001 = 0.011

(These values can also be seen from the table of binomial probabilities* when p = 1/2 and n = 10.)

Since this value is less than α = 0.05, the null hypothesis must be rejected. In other words, we conclude that professional golfers’ average is less than 284 for four rounds of golf.

Alternatively, we can as well use the normal approximation to the binomial distribution. If we do that, we find the observed proportion of success, on the basis of signs that we obtain, is 1/10 and that of failure is 9/10. The standard error of proportion, assuming the null hypothesis p = 1/2, is as under:

σ_prop = √(p·q/n) = √((1/2 × 1/2)/10) = 0.1581

For testing the null hypothesis i.e., p = 1/2 against the alternative hypothesis p < 1/2, a one-tailed test is appropriate, which can be indicated as shown in Fig. 12.1. By using the table of area under normal curve, we find the appropriate z value for 0.45 of the area under normal curve and it is 1.64. Using this, we now work out the limit (on the lower side as the alternative hypothesis is of < type) of the acceptance region as under:

p – z·σ_prop = 1/2 – (1.64) (0.1581)
             = 0.5 – 0.2593
             = 0.2407

* Table No. 8 given in appendix at the end of the book.

[Fig. 12.1: One-tailed test. The rejection region (0.05 of area) lies below the limit p – (1.64) σ_prop = 0.2407; the remaining 0.45 of area lies between this limit and p = 1/2. Shaded portion indicates the rejection region.]

As the observed proportion of success is only 1/10 or 0.1, which comes in the rejection region, we reject the null hypothesis at 5% level of significance and accept the alternative hypothesis. Thus, we conclude that professional golfers’ average is less than 284 for four rounds of golf.

(b) Two sample sign test (or the sign test for paired data): The sign test has important applications in problems where we deal with paired data. In such problems, each pair of values can be replaced with a plus (+) sign if the first value of the first sample (say X) is greater than the first value of the second sample (say Y), and we take a minus (–) sign if the first value of X is less than the first value of Y. In case the two values are equal, the concerning pair is discarded. (In case the two samples are not of equal size, then some of the values of the larger sample left over after the random pairing will have to be discarded.) The testing technique remains the same as stated in case of the one sample sign test. An example can be taken to explain and illustrate the two sample sign test.

Illustration 2
The following are the numbers of artifacts dug up by two archaeologists at an ancient cliff dwelling on 30 days.

By X   1 0 2 3 1 0 2 2 3 0 1 1 4 1 2 1 3 5 2 1 3 2 4 1 3 2 0 2 4 2
By Y   0 0 1 0 2 0 0 1 1 2 0 1 2 1 1 0 2 2 6 0 2 3 0 2 1 0 1 0 1 0

Use the sign test at 1% level of significance to test the null hypothesis that the two archaeologists, X and Y, are equally good at finding artifacts against the alternative hypothesis that X is better.

Solution: First of all the given paired values are changed into signs (+ or –) as under:

Table 12.1

By X          1  0  2  3  1  0  2  2  3  0  1  1  4  1  2  1  3  5  2  1  3  2  4  1  3  2  0  2  4  2
By Y          0  0  1  0  2  0  0  1  1  2  0  1  2  1  1  0  2  2  6  0  2  3  0  2  1  0  1  0  1  0
Sign (X – Y)  +  0  +  +  –  0  +  +  +  –  +  0  +  0  +  +  +  +  –  +  +  –  +  –  +  +  –  +  +  +

Total number of + signs = 20
Total number of – signs = 6
Hence, sample size = 26 (Since there are 4 zeros in the sign row and as such four pairs are discarded, we are left with 30 – 4 = 26.)

Thus the observed proportion of pluses (or successes) in the sample is 20/26 = 0.7692 and the observed proportion of minuses (or failures) in the sample is 6/26 = 0.2308.

We are to test the null hypothesis that the two archaeologists X and Y are equally good and, if that is so, the number of pluses and minuses should be equal and as such p = 1/2 and q = 1/2. Hence, the standard error of the proportion of successes, given the null hypothesis and the size of the sample, is:

σ_prop = √(p·q/n) = √((1/2 × 1/2)/26) = 0.0981

Since the alternative hypothesis is that the archaeologist X is better (or p > 1/2), we find a one-tailed test is appropriate. This can be indicated as under, applying the normal approximation to the binomial distribution in the given case:

[Fig. 12.2: One-tailed test. The rejection region (0.01 of area) lies above the limit p + 2.32 σ_prop = 0.7276; the remaining 0.49 of area lies between p = 1/2 and this limit. Shaded area represents the rejection region.]

By using the table of area under normal curve, we find the appropriate z value for 0.49 of the area under normal curve and it is 2.32. Using this, we now work out the limit (on the upper side as the alternative hypothesis is of > type) of the acceptance region as under:

p + 2.32 σ_prop = 0.5 + 2.32 (0.0981)
                = 0.5 + 0.2276
                = 0.7276

and we now find the observed proportion of successes is 0.7692 and this comes in the rejection region and as such we reject the null hypothesis, at 1% level of significance, that the two archaeologists X and Y are equally good. In other words, we accept the alternative hypothesis, and thus conclude that archaeologist X is better.

Sign tests, as explained above, are quite simple and they can be applied in the context of both one-tailed and two-tailed tests. They are generally based on the binomial distribution, but when the sample size happens to be large enough (such that n·p and n·q both happen to be greater than 5), we can as well make use of the normal approximation to the binomial distribution.

2. Fisher-Irwin Test

Fisher-Irwin test is a distribution-free test used in testing a hypothesis concerning no difference among two sets of data. It is employed to determine whether one can reasonably assume, for example, that two supposedly different treatments are in fact different in terms of the results they produce. Suppose the management of a business unit has designed a new training programme which is now ready and as such it wishes to test its performance against that of the old training programme. For this purpose a test is performed as follows:

Twelve newly selected workers are chosen for an experiment through a standard selection procedure so that we presume that they are of equal ability prior to the experiment. This group of twelve is then divided into two groups of six each, one group for each training programme. Workers are randomly assigned to the two groups. After the training is completed, all workers are given the same examination and the result is as under:

Table 12.2

                     No. passed    No. failed    Total
New Training (A)          5             1           6
Old Training (B)          3             3           6
Total                     8             4          12

A casual look at the above result shows that the new training programme is superior. But the question arises: Is it really so? It is just possible that the difference in the result of the two groups may be due to chance factor. Such a result may occur even though the two training programmes were equally good. Then how can a decision be made? We may test the hypothesis for the purpose. The hypothesis is that the two programmes are equally good. Prior to testing, the significance level (or the α value) must be specified and supposing the management fixes 5% level for the purpose, which must invariably be respected following the test to guard against bias entering into the result and to avoid the possibility of vacillation on the part of the decision maker. The required probability that the particular result or a better one for Group A would occur if the two training programmes were, in fact, equally good (alternatively, the probability that the particular result or a worse one for Group B would occur) should then be worked out. This should be done keeping in view the probability principles. For the given case, the probability that Group A has the particular result or a better one, given the null hypothesis that the two programmes are equally good, is as follows:

Pr. of Group A doing as well or better
   = Pr. (5 passing and 1 failing) + Pr. (6 passing and 0 failing)
   = (8C5 × 4C1)/12C6 + (8C6 × 4C0)/12C6
   = 224/924 + 28/924 = 0.24 + 0.03 = 0.27

Alternatively, we can work out as under:

Pr. of Group B doing as well or worse
   = Pr. (3 passing and 3 failing) + Pr. (2 passing and 4 failing)
   = (8C3 × 4C3)/12C6 + (8C2 × 4C4)/12C6
   = 224/924 + 28/924 = 0.24 + 0.03 = 0.27

Now we have to compare this calculated probability with the significance level of 5% or 0.05 already specified by the management. If we do so, we notice that the calculated value is greater than 0.05 and hence we must accept the null hypothesis. This means that at a significance level of 5% the result obtained in the above table is not significant. Hence, we can infer that both training programmes are equally good.

This test (Fisher-Irwin test), illustrated above, is applicable for those situations where the observed result for each item in the sample can be classified into one of the two mutually exclusive categories. For instance, in the given example the worker’s performance was classified as fail or pass and accordingly the numbers failed and passed in each group were obtained. But supposing the score of each worker is also given and we only apply the Fisher-Irwin test as above, then certainly we are discarding the useful information concerning how well a worker scored. This in fact is the limitation of the Fisher-Irwin test which can be removed if we apply some other test, say, the Wilcoxon test as stated in the pages that follow.

3. McNemer Test

McNemer test is one of the important nonparametric tests often used when the data happen to be nominal and relate to two related samples. As such this test is specially useful with before-after measurement of the same subjects. The experiment is designed for the use of this test in such a way that the subjects initially are divided into equal groups as to their favourable and unfavourable views about, say, any system. After some treatment, the same number of subjects are asked to express their views about the given system, whether they favour it or do not favour it. Through McNemer test we in fact try to judge the significance of any observed change in views of the same subjects before and after the treatment by setting up a table in the following form in respect of the first and second set of responses:

Table 12.3

                                   After treatment
Before treatment          Do not favour        Favour
Favour                          A                 B
Do not favour                   C                 D

Since A + D indicates change in people’s responses (B + C shows no change in responses), the expectation under the null hypothesis H0 is that (A + D)/2 cases change in one direction and the same proportion in the other direction. The test statistic under the McNemer test is worked out as under (as it uses the under-mentioned transformation of the chi-square test):

χ² = (|A – D| – 1)²/(A + D)    with d.f. = 1

The minus 1 in the above equation is a correction for continuity, as the chi-square test happens to be a continuous distribution whereas the observed data represent a discrete distribution. We illustrate this test by an example given below:

Illustration 3
In a certain before-after experiment the responses obtained from 1000 respondents, when classified, gave the following information:

                                After treatment
Before treatment          Unfavourable        Favourable
                            response            response
Favourable response        200 = A             300 = B
Unfavourable response      400 = C             100 = D

Test at 5% level of significance whether there has been a significant change in people’s attitude before and after the concerning experiment.

Solution: Since in the given question we have nominal data and the study involves before-after measurements of the two related samples, we can appropriately use the McNemer test.

We take the null hypothesis (H0) that there has been no change in people’s attitude before and after the experiment. This, in other words, means that the probability of favourable response before and unfavourable response after is equal to the probability of unfavourable response before and favourable response after i.e.,
H0: P(A) = P(D)
We can test this hypothesis against the alternative hypothesis (Ha) viz.,
Ha: P(A) ≠ P(D)
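As a computational aside (an addition, not part of the original text), the continuity-corrected statistic defined above can be evaluated directly for this table; the hand calculation follows below.

```python
def mcnemar_chi_square(A, D):
    """Continuity-corrected McNemer statistic (|A - D| - 1)^2 / (A + D), with 1 d.f.

    A and D are the two 'change' cells of the before-after table; B and C do not
    enter the statistic.
    """
    return (abs(A - D) - 1) ** 2 / (A + D)

# Cells of Illustration 3: A = 200, D = 100.
print(round(mcnemar_chi_square(200, 100), 2))   # 32.67, against the 5% critical value of 3.84 for 1 d.f.
```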

The test statistic, utilising the McNemer test, can be worked out as under:

χ² = (|A – D| – 1)²/(A + D)
   = (|200 – 100| – 1)²/(200 + 100)
   = (99 × 99)/300 = 32.67

Degrees of freedom = 1. From the chi-square distribution table, the value of χ² for 1 degree of freedom at 5% level of significance is 3.84. The calculated value of χ² is 32.67 which is greater than the table value, indicating that we should reject the null hypothesis. As such we conclude that the change in people’s attitude before and after the experiment is significant.

4. Wilcoxon Matched-pairs Test (or Signed Rank Test)

In various research situations in the context of two related samples (i.e., the case of matched pairs such as a study where husband and wife are matched, or when we compare the output of two similar machines, or where some subjects are studied in the context of a before-after experiment), when we can determine both direction and magnitude of difference between matched values, we can use an important non-parametric test viz., the Wilcoxon matched-pairs test. While applying this test, we first find the differences (di) between each pair of values and assign ranks to the differences from the smallest to the largest without regard to sign. The actual signs of each difference are then put to the corresponding ranks and the test statistic T is calculated, which happens to be the smaller of the two sums viz., the sum of the negative ranks and the sum of the positive ranks.

While using this test, we may come across two types of tie situations. One situation arises when the two values of some matched pair(s) are equal, i.e., the difference between the values is zero, in which case we drop out the pair(s) from our calculations. The other situation arises when two or more pairs have the same difference value, in which case we assign ranks to such pairs by averaging their rank positions. For instance, if two pairs have a rank score of 5, we assign the rank of 5.5 i.e., (5 + 6)/2 = 5.5 to each pair and rank the next largest difference as 7.

When the given number of matched pairs, after considering the number of dropped out pair(s), if any, as stated above, is equal to or less than 25, we use the table of critical values of T (Table No. 7 given in appendix at the end of the book) for the purpose of accepting or rejecting the null hypothesis of no difference between the values of the given pairs of observations at a desired level of significance. For this test, the calculated value of T must be equal to or smaller than the table value in order to reject the null hypothesis. In case the number exceeds 25, the sampling distribution of T is taken as approximately normal with mean U_T = n(n + 1)/4 and standard deviation

σ_T = √(n(n + 1)(2n + 1)/24)

where n = [(number of given matched pairs) – (number of dropped out pairs, if any)], and in such a situation the test statistic z is worked out as under:

z = (T – U_T)/σ_T

We may now explain the use of this test by an example.

Illustration 4
An experiment is conducted to judge the effect of brand name on quality perception. 16 subjects are recruited for the purpose and are asked to taste and compare two samples of product on a set of scale items judged to be ordinal. The following data are obtained:

Pair    Brand A    Brand B
 1        73         51
 2        43         41
 3        47         43
 4        53         41
 5        58         47
 6        47         32
 7        52         24
 8        58         58
 9        38         43
10        61         53
11        56         52
12        56         57
13        34         44
14        55         57
15        65         40
16        75         68

Test the hypothesis, using the Wilcoxon matched-pairs test, that there is no difference between the perceived quality of the two samples. Use 5% level of significance.

Solution: Let us first write the null and alternative hypotheses as under:
H0: There is no difference between the perceived quality of the two samples.
Ha: There is difference between the perceived quality of the two samples.
Using the Wilcoxon matched-pairs test, we work out the value of the test statistic T as under:

Table 12.4

Pair    Brand A    Brand B    Difference di    Rank of |di|    Rank with signs
                                                                  +        –
 1        73         51            22              13            13        …
 2        43         41             2               2.5           2.5      …
                                                                          Contd.

Pair    Brand A    Brand B    Difference di    Rank of |di|    Rank with signs
                                                                  +        –
 3        47         43             4               4.5           4.5      …
 4        53         41            12              11            11        …
 5        58         47            11              10            10        …
 6        47         32            15              12            12        …
 7        52         24            28              15            15        …
 8        58         58             0               –             –        –
 9        38         43            –5               6             …       –6
10        61         53             8               8             8        …
11        56         52             4               4.5           4.5      …
12        56         57            –1               1             …       –1
13        34         44           –10               9             …       –9
14        55         57            –2               2.5           …       –2.5
15        65         40            25              14            14        …
16        75         68             7               7             7        …
TOTAL                                                           101.5    –18.5

Hence, T = 18.5

We drop out pair 8 as the ‘d’ value for this pair is zero and as such our n = (16 – 1) = 15 in the given problem. The table value of T at five per cent level of significance when n = 15 is 25 (using a two-tailed test because our alternative hypothesis is that there is difference between the perceived quality of the two samples). The calculated value of T is 18.5 which is less than the table value of 25. As such we reject the null hypothesis and conclude that there is difference between the perceived quality of the two samples.

5. Rank Sum Tests

Rank sum tests are a whole family of tests, but we shall describe only two such tests commonly used viz., the U test and the H test. The U test is popularly known as the Wilcoxon-Mann-Whitney test, whereas the H test is also known as the Kruskal-Wallis test. A brief description of the said two tests is given below:

(a) Wilcoxon-Mann-Whitney test (or U-test): This is a very popular test amongst the rank sum tests. This test is used to determine whether two independent samples have been drawn from the same population. It uses more information than the sign test or the Fisher-Irwin test. This test applies under very general conditions and requires only that the populations sampled are continuous. However, in practice even the violation of this assumption does not affect the results very much.

To perform this test, we first of all rank the data jointly, taking them as belonging to a single sample, in either an increasing or decreasing order of magnitude. We usually adopt a low to high ranking process, which means we assign rank 1 to the item with the lowest value, rank 2 to the next higher item and so on. In case there are ties, then we would assign each of the tied observations the mean of the ranks which they jointly occupy. For example, if the sixth, seventh and eighth values are identical, we would assign each the rank (6 + 7 + 8)/3 = 7. After this we find the sum of the ranks assigned to the values of the first sample (and call it R1) and also the sum of the ranks assigned to the values of the second sample (and call it R2). Then we work out the test statistic i.e., U, which is a measurement of the difference between the ranked observations of the two samples, as under:

U = n1·n2 + n1(n1 + 1)/2 – R1

where n1 and n2 are the sample sizes and R1 is the sum of ranks assigned to the values of the first sample. (In practice, whichever rank sum can be conveniently obtained can be taken as R1, since it is immaterial which sample is called the first sample.)

In applying the U-test we take the null hypothesis that the two samples come from identical populations. If this hypothesis is true, it seems reasonable to suppose that the means of the ranks assigned to the values of the two samples should be more or less the same. Under the alternative hypothesis, the means of the two populations are not equal and if this is so, then most of the smaller ranks will go to the values of one sample while most of the higher ranks will go to those of the other sample.

If the null hypothesis that the n1 + n2 observations came from identical populations is true, the said ‘U’ statistic has a sampling distribution with

Mean = µ_U = n1·n2/2

and Standard deviation (or the standard error)

σ_U = √(n1·n2(n1 + n2 + 1)/12)

If n1 and n2 are sufficiently large (i.e., both greater than 8), the sampling distribution of U can be approximated closely with the normal distribution and the limits of the acceptance region can be determined in the usual way at a given level of significance. But if either n1 or n2 is so small that the normal curve approximation to the sampling distribution of U cannot be used, then exact tests may be based on special tables such as the one given in the appendix,* showing selected values of Wilcoxon’s (unpaired) distribution. We now can take an example to explain the operation of the U test.

Illustration 5
The values in one sample are 53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60 and 78. In another sample they are 44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53 and 72. Test at the 10% level the hypothesis that they come from populations with the same mean. Apply the U-test.

Solution: First of all we assign ranks to all observations, adopting a low to high ranking process on the presumption that all given items belong to a single sample. By doing so we get the following:

* Table No. 6 given in appendix at the end of the book.

Table 12.5

Size of sample item       Rank       Name of related sample:
in ascending order                   [A for sample one and B for sample two]
        32                  1                    B
        38                  2                    A
        39                  3                    A
        40                  4                    B
        41                  5                    B
        44                  6.5                  B
        44                  6.5                  B
        46                  8                    A
        48                  9                    A
        52                 10                    B
        53                 11.5                  A
        53                 11.5                  B
        57                 13                    A
        60                 14                    A
        61                 15                    B
        67                 16                    B
        69                 17                    A
        70                 18                    B
        72                 19.5                  B
        72                 19.5                  B
        73                 21.5                  A
        73                 21.5                  A
        74                 23                    A
        78                 24                    A

From the above we find that the sum of the ranks assigned to sample one items
R1 = 2 + 3 + 8 + 9 + 11.5 + 13 + 14 + 17 + 21.5 + 21.5 + 23 + 24 = 167.5
and similarly we find that the sum of the ranks assigned to sample two items
R2 = 1 + 4 + 5 + 6.5 + 6.5 + 10 + 11.5 + 15 + 16 + 18 + 19.5 + 19.5 = 132.5
and we have n1 = 12 and n2 = 12.

Hence, test statistic U = n1·n2 + n1(n1 + 1)/2 – R1
                        = (12)(12) + 12(12 + 1)/2 – 167.5
                        = 144 + 78 – 167.5 = 54.5

Since in the given problem n1 and n2 are both greater than 8, the sampling distribution of U approximates closely with the normal curve. Keeping this in view, we work out the mean and standard deviation, taking the null hypothesis that the two samples come from identical populations, as under:

µ_U = n1·n2/2 = (12)(12)/2 = 72

σ_U = √(n1·n2(n1 + n2 + 1)/12) = √((12)(12)(12 + 12 + 1)/12) = 17.32

As the alternative hypothesis is that the means of the two populations are not equal, a two-tailed test is appropriate. Accordingly the limits of the acceptance region, keeping in view the 10% level of significance as given, can be worked out as under:

[Fig. 12.3: Two-tailed test. The acceptance region (0.45 + 0.45 of area) lies between the limits µ_U – 1.64 σ_U = 43.6 and µ_U + 1.64 σ_U = 100.4, with µ_U = 72; the shaded portions of 0.05 of area on either side indicate the rejection regions.]

As the z value for 0.45 of the area under the normal curve is 1.64, we have the following limits of the acceptance region:

Upper limit = µ_U + 1.64 σ_U = 72 + 1.64 (17.32) = 100.40
Lower limit = µ_U – 1.64 σ_U = 72 – 1.64 (17.32) = 43.60

As the observed value of U is 54.5, which is in the acceptance region, we accept the null hypothesis and conclude that the two samples come from identical populations (or that the two populations have the same mean) at 10% level.

We can as well calculate the U statistic using the R2 value:

U = n1·n2 + n2(n2 + 1)/2 – R2
  = (12)(12) + 12(12 + 1)/2 – 132.5
  = 144 + 78 – 132.5 = 89.5

This value of U also lies in the acceptance region and as such our conclusion remains the same, even if we adopt this alternative way of finding U.

We can take one more example concerning the U test wherein n1 and n2 are both less than 8, and as such we see the use of the table given in the appendix concerning values of Wilcoxon’s (unpaired) distribution.
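Before turning to that second example, the computation just carried out can be checked in a few lines. The sketch below is an addition (not part of the original text) and assumes SciPy is available; note that scipy.stats.mannwhitneyu may report the U statistic based on either sample, so a value of 89.5 or 54.5 can appear, exactly as noted above.

```python
from scipy.stats import mannwhitneyu

sample_one = [53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, 78]
sample_two = [44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53, 72]

# Two-sided test of the hypothesis that both samples come from identical populations.
result = mannwhitneyu(sample_one, sample_two, alternative="two-sided")
print(result.statistic, result.pvalue)   # U = 89.5 (equivalently 54.5); p-value well above 0.10
```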

Illustration 6
Two samples with values 90, 94, 36 and 44 in one case and the other with values 53, 39, 6, 24, and 33 are given. Test, applying the Wilcoxon test, whether the two samples come from populations with the same mean at 10% level against the alternative hypothesis that these samples come from populations with different means.

Solution: Let us first assign ranks as stated earlier and we get:

Table 12.6

Size of sample item       Rank       Name of related sample
in ascending order                   (Sample one as A, Sample two as B)
         6                  1                    B
        24                  2                    B
        33                  3                    B
        36                  4                    A
        39                  5                    B
        44                  6                    A
        53                  7                    B
        90                  8                    A
        94                  9                    A

Sum of ranks assigned to items of sample one = 4 + 6 + 8 + 9 = 27
No. of items in this sample = 4
Sum of ranks assigned to items of sample two = 1 + 2 + 3 + 5 + 7 = 18
No. of items in this sample = 5

As the number of items in the two samples is less than 8, we cannot use the normal curve approximation technique as stated above and shall use the table giving values of Wilcoxon’s distribution. To use this table, we denote ‘Ws’ as the smaller of the two sums and ‘Wl’ the larger. Also, let ‘s’ be the number of items in the sample with the smaller sum and let ‘l’ be the number of items in the sample with the larger sum. Taking these notations we have for our question the following values:

Ws = 18;  s = 5;  Wl = 27;  l = 4

The value of Ws is 18 for sample two which has five items and as such s = 5. We now find the difference between Ws and the minimum value it might have taken, given the value of s. The minimum value that Ws could have taken, given that s = 5, is the sum of ranks 1 through 5 and this comes as equal to 1 + 2 + 3 + 4 + 5 = 15. Thus, (Ws – Minimum Ws) = 18 – 15 = 3. To determine the probability that a result as extreme as this or more so would occur, we find the cell of the table which is in the column headed by the number 3 and in the row for s = 5 and l = 4 (the specified values of l are given in the second column of the table). The entry in this cell is 0.056, which is the required probability of getting a value as small as or smaller than 3, and now we should compare it with the significance level of 10%.

Since the alternative hypothesis is that the two samples come from populations with different means, a two-tailed test is appropriate and accordingly 10% significance level will mean 5% in the left tail and 5% in the right tail. In other words, we should compare the calculated probability with the probability of 0.05, given the null hypothesis and the significance level. If the calculated probability happens to be greater than 0.05 (which actually is so in the given case as 0.056 > 0.05), then we should accept the null hypothesis. Hence, in the given problem, we must conclude that the two samples come from populations with the same mean.

(The same result we can get by using the value of Wl. The only difference is that the value Maximum Wl – Wl is required. Since for this problem the maximum value of Wl (given s = 5 and l = 4) is the sum of 6 through 9 i.e., 6 + 7 + 8 + 9 = 30, we have Max. Wl – Wl = 30 – 27 = 3, which is the same value that we worked out earlier as Ws – Minimum Ws. All other things then remain the same as we have stated above.)

(b) The Kruskal-Wallis test (or H test): This test is conducted in a way similar to the U test described above. This test is used to test the null hypothesis that ‘k’ independent random samples come from identical universes against the alternative hypothesis that the means of these universes are not equal. This test is analogous to the one-way analysis of variance, but unlike the latter it does not require the assumption that the samples come from approximately normal populations or universes having the same standard deviation.

In this test, like the U test, the data are ranked jointly from low to high or high to low as if they constituted a single sample. The test statistic is H for this test, which is worked out as under:

H = [12/(n(n + 1))] Σ (Ri²/ni) – 3(n + 1),   the sum being taken over i = 1, 2, …, k,

where n = n1 + n2 + ... + nk and Ri is the sum of the ranks assigned to the ni observations in the ith sample.

If the null hypothesis is true that there is no difference between the sample means and each sample has at least five items*, then the sampling distribution of H can be approximated with a chi-square distribution with (k – 1) degrees of freedom. As such we can reject the null hypothesis at a given level of significance if the H value calculated, as stated above, exceeds the concerned table value of chi-square.

Let us take an example to explain the operation of this test:

Illustration 7
Use the Kruskal-Wallis test at 5% level of significance to test the null hypothesis that a professional bowler performs equally well with the four bowling balls, given the following results:

Bowling Results in Five Games

With Ball No. A      271    282    257    248    262
With Ball No. B      252    275    302    268    276
With Ball No. C      260    255    239    246    266
With Ball No. D      279    242    297    270    258

* If any of the given samples has less than five items then the chi-square distribution approximation can not be used and the exact tests may be based on the table meant for it given in the book “Non-parametric statistics for the behavioural sciences” by S. Siegel.
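As a computational aside (an addition, not part of the original text), the same hypothesis can be checked directly with SciPy, assuming it is available; the hand computation of H follows in the solution below.

```python
from scipy.stats import kruskal

ball_A = [271, 282, 257, 248, 262]
ball_B = [252, 275, 302, 268, 276]
ball_C = [260, 255, 239, 246, 266]
ball_D = [279, 242, 297, 270, 258]

# Kruskal-Wallis H test: do the four samples come from identical universes?
H, p_value = kruskal(ball_A, ball_B, ball_C, ball_D)
print(round(H, 2), round(p_value, 3))   # H is about 4.51; p > 0.05, so H0 is not rejected
```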

Solution: To apply the H test or the Kruskal-Wallis test to this problem, we begin by ranking all the given figures from the highest to the lowest, indicating besides each the name of the ball as under:

Table 12.7

Bowling results      Rank      Name of the ball associated
      302              1                  B
      297              2                  D
      282              3                  A
      279              4                  D
      276              5                  B
      275              6                  B
      271              7                  A
      270              8                  D
      268              9                  B
      266             10                  C
      262             11                  A
      260             12                  C
      258             13                  D
      257             14                  A
      255             15                  C
      252             16                  B
      248             17                  A
      246             18                  C
      242             19                  D
      239             20                  C

For finding the values of Ri, we arrange the above table as under:

Table 12.7 (a): Bowling Results with Different Balls and Corresponding Ranks

Ball A    Rank      Ball B    Rank      Ball C    Rank      Ball D    Rank
  271       7         252      16         260      12         279       4
  282       3         275       6         255      15         242      19
  257      14         302       1         239      20         297       2
  248      17         268       9         246      18         270       8
  262      11         276       5         266      10         258      13
n1 = 5   R1 = 52    n2 = 5   R2 = 37    n3 = 5   R3 = 75    n4 = 5   R4 = 46

Now we calculate the H statistic as under:

H = [12/(n(n + 1))] Σ (Ri²/ni) – 3(n + 1)
  = [12/(20(20 + 1))] × [52²/5 + 37²/5 + 75²/5 + 46²/5] – 3(20 + 1)
  = (0.02857) (2362.8) – 63 = 67.51 – 63 = 4.51

As the four samples have five items* each, the sampling distribution of H approximates closely with the χ² distribution. Now taking the null hypothesis that the bowler performs equally well with the four balls, we have the value of χ² = 7.815 for (k – 1) or 4 – 1 = 3 degrees of freedom at 5% level of significance. Since the calculated value of H is only 4.51 and does not exceed the χ² value of 7.815, we accept the null hypothesis and conclude that the bowler performs equally well with the four bowling balls.

* For the application of the H test, it is not necessary that all samples should have an equal number of items.

6. One Sample Runs Test

One sample runs test is a test used to judge the randomness of a sample on the basis of the order in which the observations are taken. There are many applications in which it is difficult to decide whether the sample used is a random one or not. This is particularly true when we have little or no control over the selection of the data. For instance, if we want to predict a retail store’s sales volume for a given month, we have no choice but to use past sales data and perhaps prevailing conditions in general. None of this information constitutes a random sample in the strict sense. To allow us to test samples for the randomness of their order, statisticians have developed the theory of runs. A run is a succession of identical letters (or other kinds of symbols) which is followed and preceded by different letters or no letters at all. To illustrate, we take the following arrangement of healthy, H, and diseased, D, mango trees that were planted many years ago along a certain road:

HH   DD   HHHHH   DDD   HHHH   DDDDD   HHHHHHHHH
1st  2nd   3rd    4th    5th    6th       7th

Using underlines to combine the letters which constitute the runs, we find that first there is a run of two H’s, then a run of two D’s, then a run of five H’s, then a run of three D’s, then a run of four H’s, then a run of five D’s and finally a run of nine H’s. In this way there are 7 runs in all or r = 7. If there are too few runs, we might suspect a definite grouping or a trend; if there are too many runs, we might suspect some sort of repeated alternating pattern. In the given case there seems some grouping i.e., the diseased trees seem to come in groups. Through the one sample runs test, which is based on the idea that too few or too many runs show that the items were not chosen randomly, we can say whether the apparently seen grouping is significant or whether it can be attributed to chance. We shall use the following symbols for a test of runs:

n1 = number of occurrences of type 1 (say H in the given case)
n2 = number of occurrences of type 2 (say D in the given case)
r = number of runs.

In the given case the values of n1, n2 and r would be as follows:
n1 = 20;  n2 = 10;  r = 7

The sampling distribution of the ‘r’ statistic, the number of runs, is to be used and this distribution has its mean

µr = 2n1n2/(n1 + n2) + 1

and the standard deviation

σr = √[2n1n2(2n1n2 – n1 – n2) / ((n1 + n2)²(n1 + n2 – 1))]

In the given case, we work out the values of µr and σr as follows:

µr = 2(20)(10)/(20 + 10) + 1 = 14.33

σr = √[2(20)(10)(2 × 20 × 10 – 20 – 10) / ((20 + 10)²(20 + 10 – 1))] = 2.38

For testing the null hypothesis concerning the randomness of the planted trees, we should have been given the level of significance. Suppose it is 1% or 0.01. Since too many or too few runs would indicate that the process by which the trees were planted was not random, a two-tailed test is appropriate, which can be indicated as follows on the assumption* that the sampling distribution of r can be closely approximated by the normal distribution.

[Fig. 12.4: Two-tailed test. The acceptance region (0.495 + 0.495 of area) lies between the limits µr – 2.58 σr = 8.19 and µr + 2.58 σr = 20.47, with µr = 14.33; the shaded portions of 0.005 of area on either side show the rejection regions.]

* This assumption can be applied when n1 and n2 are sufficiently large i.e., they should not be less than 10. But in case n1 or n2 is so small that the normal curve approximation assumption cannot be used, then exact tests may be based on special tables which can be seen in the book Non-parametric Statistics for the Behavioural Sciences by S. Siegel.

By using the table of area under normal curve, we find the appropriate z value for 0.495 of the area under the curve and it is 2.58. Using this we now calculate the limits of the acceptance region:
Upper limit = µr + (2.58) (2.38) = 14.33 + 6.14 = 20.47
and
Lower limit = µr – (2.58) (2.38) = 14.33 – 6.14 = 8.19
We now find that the observed number of runs (i.e., r = 7) lies outside the acceptance region i.e., in the rejection region. Therefore, we cannot accept the null hypothesis of randomness at the given level of significance viz., α = 0.01. As such we conclude that there is a strong indication that the diseased trees come in non-random grouping.

One sample runs test, as explained above, is not limited only to testing the randomness of a series of attributes. Even a sample consisting of numerical values can be treated similarly by using the letters say ‘a’ and ‘b’ to denote respectively the values falling above and below the median of the sample. Numbers equal to the median are omitted. The resulting series of a’s and b’s (representing the data in their original order) can be tested for randomness on the basis of the total number of runs above and below the median, as per the procedure explained above. (The method of runs above and below the median is helpful in testing for trends or cyclical patterns concerning economic data. In case of an upward trend, there will be first mostly b’s and later mostly a’s, but in case of a downward trend, there will be first mostly a’s and later mostly b’s. In case of a cyclical pattern, there will be a systematic alternating of a’s and b’s and probably many runs.)

7. Spearman’s Rank Correlation

When the data are not available to use in numerical form for doing correlation analysis but when the information is sufficient to rank the data as first, second, third, and so forth, we quite often use the rank correlation method and work out the coefficient of rank correlation. In fact, the rank correlation coefficient is a measure of correlation that exists between the two sets of ranks. In other words, it is a measure of association that is based on the ranks of the observations and not on the numerical values of the data. It was developed by the famous statistician Charles Spearman in the early 1900s and as such it is also known as Spearman’s rank correlation coefficient.

For calculating the rank correlation coefficient, first of all the actual observations are replaced by their ranks, giving rank 1 to the highest value, rank 2 to the next highest value, and following this very order ranks are assigned for all values. If two or more values happen to be equal, then the average of the ranks which should have been assigned to such values had they been all different is taken, and the same rank (equal to the said average) is given to the concerning values. The second step is to record the difference between ranks (or ‘d’) for each pair of observations, then square these differences to obtain a total of such differences which can symbolically be stated as Σdi². Finally, Spearman’s rank correlation coefficient, r*, is worked out as under:

Spearman’s ‘r’ = 1 – [6 Σdi² / (n(n² – 1))]

where n = number of paired observations.

* Some authors use the symbol Rho (ρ) for this coefficient. Rho is to be used when the sample size does not exceed 30.

The value of Spearman’s rank correlation coefficient will always vary between +1 and –1, +1 indicating a perfect positive correlation and –1 indicating a perfect negative correlation between two variables. All other values of the correlation coefficient will show different degrees of correlation.

Suppose we get r = 0.756, which suggests a substantial positive relationship between the concerning two variables. But how should we test this value of 0.756? The testing device depends upon the value of n. For small values of n (i.e., n less than 30), the distribution of r is not normal and as such we use the table showing the values for Spearman’s Rank correlation (Table No. 5 given in Appendix at the end of the book) to determine the acceptance and rejection regions. Suppose we get r = 0.756 for a problem where n = 15 and want to test at 5% level of significance the null hypothesis that there is zero correlation in the concerning ranked data. In this case our problem is reduced to testing the null hypothesis that there is no correlation i.e., µr = 0 against the alternative hypothesis that there is a correlation i.e., µr ≠ 0 at 5% level. In this case a two-tailed test is appropriate and we look in the said table in the row for n = 15 and the column for a significance level of 0.05 and find that the critical values for r are ±0.5179 i.e., the upper limit of the acceptance region is 0.5179 and the lower limit of the acceptance region is –0.5179. And since our calculated r = 0.756 is outside the limits of the acceptance region, we reject the null hypothesis and accept the alternative hypothesis that there is a correlation in the ranked data.

In case the sample consists of more than 30 items, then the sampling distribution of r is approximately normal with a mean of zero and a standard deviation of 1/√(n – 1) and thus, the standard error of r is:

σr = 1/√(n – 1)

We can use the table of area under normal curve to find the appropriate z values for testing hypotheses about the population rank correlation and draw the inference as usual. We can illustrate it by an example.

Illustration 8
Personnel manager of a certain company wants to hire 30 additional programmers for his corporation. In the past, hiring decisions had been made on the basis of interview and also on the basis of an aptitude test. The agency doing the aptitude test had charged Rs. 100 for each test, but now wants Rs. 200 for a test. Performance on the test has been a good predictor of a programmer’s ability and Rs. 100 for a test was a reasonable price. But now the personnel manager is not sure that the test results are worth Rs. 200. However, he has kept over the past few years records of the scores assigned to applicants for programming positions on the basis of interviews taken by him. If he becomes confident (using 0.01 level of significance) that the rank correlation between his interview scores and the applicants’ scores on the aptitude test is positive, then he will feel justified in discontinuing the aptitude test in view of the increased cost of the test. What decision should he take on the basis of the following sample data concerning 35 applicants?

Sample Data Concerning 35 Applicants

Serial number    Interview score    Aptitude test score
      1                81                  113
      2                88                   88
      3                55                   76
      4                83                  129
      5                78                   99
      6                93                  142
      7                65                   93
      8                87                  136
      9                95                   82
     10                76                   91
     11                60                   83
     12                85                   96
     13                93                  126
     14                66                  108
     15                90                   95
     16                69                   65
     17                87                   96
     18                68                  101
     19                81                  111
     20                84                  121
     21                82                   83
     22                90                   79
     23                63                   71
     24                78                  109
     25                73                   68
     26                79                  121
     27                72                  109
     28                95                  121
     29                81                  140
     30                87                  132
     31                93                  135
     32                85                  143
     33                91                  118
     34                94                  147
     35                94                  138

Solution: To solve this problem we should first work out the value of Spearman’s r as under:

Table 12.8: Calculation of Spearman’s r

For each of the 35 applicants the interview score (X) and the aptitude test score (Y) are ranked separately, rank 1 being given to the highest value and tied values receiving the average of the ranks they jointly occupy. The rank difference di = (Rank X) – (Rank Y) is then recorded for every applicant and the squared differences di² are summed over the n = 35 applicants, giving

Σdi² = 3583

Spearman’s ‘r’ = 1 – [6 Σdi² / (n(n² – 1))]
              = 1 – [(6 × 3583) / (35(35² – 1))]
              = 1 – 21498/42840 = 0.498

Since n = 35, the sampling distribution of r is approximately normal with a mean of zero and a standard deviation of 1/√(n – 1). Hence the standard error of r is

σr = 1/√(n – 1) = 1/√(35 – 1) = 0.1715

As the personnel manager wishes to test his hypothesis at 0.01 level of significance, the problem can be stated:
Null hypothesis that there is no correlation between interview score and aptitude test score i.e., µr = 0.
Alternative hypothesis that there is positive correlation between interview score and aptitude test score i.e., µr > 0.
As such a one-tailed test is appropriate, which can be indicated as under in the given case:

[Fig. 12.5: One-tailed test. The rejection region (0.01 of area) lies above the limit µr + 2.32 σr = 0.3978, with µr = 0; 0.49 of area lies between µr = 0 and this limit. Shaded area shows the rejection region.]

By using the table of area under normal curve, we find the appropriate z value for 0.49 of the area under normal curve and it is 2.32. Using this we now work out the limit (on the upper side as the alternative hypothesis is of > type) of the acceptance region as under:

µr + (2.32) (0.1715) = 0 + 0.3978 = 0.3978

We now find the observed r = 0.498 and as such it comes in the rejection region and, therefore, we reject the null hypothesis at 1% level and accept the alternative hypothesis. Hence we conclude that the correlation between interview score and aptitude test score is positive. Accordingly the personnel manager should decide that the aptitude test be discontinued.

8. Kendall’s Coefficient of Concordance

Kendall’s coefficient of concordance, represented by the symbol W, is an important non-parametric measure of relationship. It is used for determining the degree of association among several (k) sets of ranking of N objects or individuals. When there are only two sets of rankings of N objects, we generally work out Spearman’s coefficient of correlation, but Kendall’s coefficient of concordance (W) is considered an appropriate measure of studying the degree of association among three or more sets of rankings. This descriptive measure of the agreement has special applications in providing a standard method of ordering objects according to consensus when we do not have an objective order of the objects.

The basis of Kendall’s coefficient of concordance is to imagine how the given data would look if there were no agreement among the several sets of rankings, and then to imagine how it would look if there were perfect agreement among the several sets. For instance, in case of, say, four interviewers interviewing, say, six job applicants and assigning rank order on suitability for employment, if there is observed perfect agreement amongst the interviewers, then one applicant would be assigned rank 1 by all the four and the sum of his ranks would be 1 + 1 + 1 + 1 = 4. Another applicant would be assigned rank 2 by all four and the sum of his ranks would be 2 + 2 + 2 + 2 = 8. The sums of ranks for the six applicants would be 4, 8, 12, 16, 20 and 24 (not necessarily in this very order). In general, when perfect agreement exists among ranks assigned by k judges to N objects, the rank sums are k, 2k, 3k, … Nk. The total sum of the N ranks for k judges is kN(N + 1)/2 and the mean rank sum is k(N + 1)/2. The degree of agreement between judges reflects itself in the variation in the rank sums. When all judges agree, this variation is a maximum. Disagreement between judges reflects itself in a reduction in the variation of rank sums. For maximum disagreement, the rank sums will tend to be more or less equal. This provides the basis for the definition of a coefficient of concordance. When perfect agreement exists between judges, W equals 1. When maximum disagreement exists, W equals 0. It may be noted that W does not take negative values because of the fact that with more than two judges complete disagreement cannot take place. Thus, the coefficient of concordance (W) is an index of divergence of the actual agreement shown in the data from the perfect agreement.

The procedure for computing and interpreting Kendall’s coefficient of concordance (W) is as follows:
(a) All the objects, N, should be ranked by all k judges in the usual fashion and this information may be put in the form of a k by N matrix;
(b) For each object determine the sum of ranks (Rj) assigned by all the k judges;
(c) Determine R̄j and then obtain the value of s as under:

    s = Σ (Rj – R̄j)²

(d) Work out the value of W using the following formula:

308 Research Methodology

W = s / [(1/12) k² (N³ − N)]

where s = Σ(Rj − R̄j)²;
k = no. of sets of rankings i.e., the number of judges;
N = number of objects ranked;
(1/12) k² (N³ − N) = maximum possible sum of the squared deviations i.e., the sum s which would occur with perfect agreement among k rankings.

Case of Tied Ranks
Where tied ranks occur, the average method of assigning ranks be adopted i.e., assign to each member the average rank which the tied observations occupy. If the ties are not numerous, we may compute 'W' as stated above without making any adjustment in the formula; but if the ties are numerous, a correction factor is calculated for each set of ranks. This correction factor is

T = Σ(t³ − t) / 12

where t = number of observations in a group tied for a given rank. For instance, if the ranks on X are 1, 2, 3.5, 5, 6, 3.5, 8, 10, 8, 8, we have two groups of ties, one of two ranks and one of three ranks. The correction factor for this set of ranks for X would be

T = [(2³ − 2) + (3³ − 3)] / 12 = 2.5

A correction factor T is calculated for each of the k sets of ranks and these are added together over the k sets to obtain ΣT. We then use the formula for finding the value of 'W' as under:

W = s / [(1/12) k² (N³ − N) − k ΣT]

The application of the correction in this formula tends to increase the size of W, but the correction factor has a very limited effect unless the ties are quite numerous.
(e) The method for judging whether the calculated value of W is significantly different from zero depends on the size of N as stated below:
(i) If N is 7 or smaller, Table No. 9 given in appendix at the end of the book gives critical values of s associated with W's significance at 5% and 1% levels. If an observed s is equal to or greater than that shown in the table for a particular level of significance, then H0 (i.e., k sets of rankings are independent) may be rejected at that level of significance.

Testing of Hypotheses-II 309

(ii) If N is larger than 7, we may use χ² value to be worked out as: χ² = k(N − 1)W with d.f. = (N − 1) for judging W's significance at a given level in the usual way of using χ² values.
(f) Significant value of W may be interpreted and understood as if the judges are applying essentially the same standard in ranking the N objects under consideration, but this should never mean that the orderings observed are correct for the simple reason that all judges can agree in ordering objects because they all might employ 'wrong' criterion. Kendall, therefore, suggests that the best estimate of the 'true' rankings of N objects is provided, when W is significant, by the order of the various sums of ranks, Rj. If one accepts the criterion which the various judges have agreed upon, then the best estimate of the 'true' ranking is provided by the order of the sums of ranks. The best estimate is related to the lowest value observed amongst Rj.
This can be illustrated with the help of an example.

Illustration 9
Seven individuals have been assigned ranks by four judges at a certain music competition as shown in the following matrix:

Individuals        A   B   C   D   E   F   G
Judge 1            1   3   2   5   7   4   6
Judge 2            2   4   1   3   7   5   6
Judge 3            3   4   1   2   7   6   5
Judge 4            1   2   5   4   6   3   7

Is there significant agreement in ranking assigned by different judges? Test at 5% level. Also point out the best estimate of the true rankings.
Solution: As there are four sets of rankings, we can work out the coefficient of concordance (W) for judging significant agreement in ranking by different judges. For this purpose we first develop the given matrix as under:

Table 12.9 (k = 4, N = 7)
Individuals             A    B    C    D    E    F    G
Judge 1                 1    3    2    5    7    4    6
Judge 2                 2    4    1    3    7    5    6
Judge 3                 3    4    1    2    7    6    5
Judge 4                 1    2    5    4    6    3    7
Sum of ranks (Rj)       7   13    9   14   27   18   24        ΣRj = 112
(Rj − R̄j)²             81    9   49    4  121    4   64        ∴ s = 332

310 Research Methodology

Here R̄j = ΣRj / N = 112 / 7 = 16 and s = 332 (as obtained above).

∴ W = s / [(1/12) k² (N³ − N)] = 332 / [(1/12) (4²) (7³ − 7)] = 332 / [(1/12) (16) (336)] = 332 / 448 = 0.741

To judge the significance of this W, we look into the Table No. 9 given in appendix for finding the value of s at 5% level for k = 4 and N = 7. This value is 217.0 and thus for accepting the null hypothesis (H0 that k sets of rankings are independent) our calculated value of s should be less than 217. But the worked out value of s is 332 which is higher than the table value which fact shows that W = 0.741 is significant. Hence, we reject the null hypothesis and infer that the judges are applying essentially the same standard in ranking the N objects i.e., there is significant agreement in ranking by different judges at 5% level in the given case. The lowest value observed amongst Rj is 7 and as such the best estimate of true rankings is in the case of individual A i.e., all judges on the whole place the individual A as first in the said music competition.

Illustration 10
Given is the following information:
k = 13, N = 20, W = 0.577
Determine the significance of W at 5% level.
Solution: As N is larger than 7, we shall work out the value of χ² for determining W's significance as under:
χ² = k(N − 1)W with N − 1 degrees of freedom
∴ χ² = 13(20 − 1) (0.577) or χ² = (247) (0.577) = 142.52
Table value of χ² at 5% level for N − 1 = 20 − 1 = 19 d.f. is 30.144 but the calculated value of χ² is 142.52 and this is considerably higher than the table value. This does not support the null hypothesis of independence and as such we can infer that W is significant at 5% level.
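Both illustrations lend themselves to a brief computational check. The sketch below is an editorial illustration only (the function name and layout are assumed); it uses nothing beyond the formulas W = s/[(1/12) k² (N³ − N)] and χ² = k(N − 1)W given above:

def kendalls_w(rank_matrix):
    # rank_matrix: one list of ranks per judge, all over the same N objects
    k = len(rank_matrix)
    n = len(rank_matrix[0])
    rank_sums = [sum(judge[j] for judge in rank_matrix) for j in range(n)]
    mean_rank_sum = sum(rank_sums) / n            # equals k(N + 1)/2
    s = sum((rj - mean_rank_sum) ** 2 for rj in rank_sums)
    w = s / ((k ** 2) * (n ** 3 - n) / 12.0)
    return s, w

# Ranks given to individuals A to G by the four judges (Illustration 9)
ranks = [
    [1, 3, 2, 5, 7, 4, 6],    # Judge 1
    [2, 4, 1, 3, 7, 5, 6],    # Judge 2
    [3, 4, 1, 2, 7, 6, 5],    # Judge 3
    [1, 2, 5, 4, 6, 3, 7],    # Judge 4
]
s, w = kendalls_w(ranks)
print(s, round(w, 3))         # 332 and 0.741, as worked out above

# Illustration 10: for N larger than 7, chi-square = k(N - 1)W with N - 1 d.f.
k, N, W = 13, 20, 0.577
print(round(k * (N - 1) * W, 2))   # 142.52, against a table value of 30.144 for 19 d.f.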

Testing of Hypotheses-II 311

RELATIONSHIP BETWEEN SPEARMAN'S r's AND KENDALL'S W
As stated above, W is an appropriate measure of studying the degree of association among three or more sets of ranks, but we can as well determine the degree of association among k sets of rankings by averaging the Spearman's correlation coefficients (r's) between all possible pairs (i.e., kC2 or k(k − 1)/2) of rankings, keeping in view that W bears a linear relation to the average r's taken over all possible pairs. The relationship between the average of Spearman's r's and Kendall's W can be put in the following form:

average of r's = (kW − 1)/(k − 1)

But the method of finding W using average of Spearman's r's between all possible pairs is quite tedious, particularly when k happens to be a big figure and as such this method is rarely used in practice for finding W.

Illustration 11
Using data of illustration No. 9 above, find W using average of Spearman's r's.
Solution: As k = 4 in the given question, the possible pairs are equal to k(k − 1)/2 = 4(4 − 1)/2 = 6 and we work out Spearman's r for each of these pairs as shown in Table 12.10. Now we can find W using the following relationship formula between the average of r's and W:
Average of r's = (kW − 1)/(k − 1)
or 0.655 = (4W − 1)/(4 − 1)
or (0.655) (3) = 4W − 1
or W = [(0.655) (3) + 1]/4 = 2.965/4 = 0.741
[Note: This value of W is exactly the same as we had worked out using the formula: W = s/[(1/12) (k²) (N³ − N)].]

CHARACTERISTICS OF DISTRIBUTION-FREE OR NON-PARAMETRIC TESTS
From what has been stated above in respect of important non-parametric tests, we can say that these tests share in the main the following characteristics:
1. They do not suppose any particular distribution and the consequential assumptions.
2. They are rather quick and easy to use i.e., they do not require laborious computations since in many cases the observations are replaced by their rank order and in many others we simply use signs.
3. They are often not as efficient or 'sharp' as tests of significance or the parametric tests. An interval estimate with 95% confidence may be twice as large with the use of non-parametric tests as with regular standard methods. The reason being that these tests do not use all the available information but rather use groupings or rankings and the price we pay is a loss in efficiency. In fact, when we use non-parametric tests, we make a trade-off: we lose sharpness in estimating intervals, but we gain the ability to use less information and to calculate faster.
4. When our measurements are not as accurate as is necessary for standard tests of significance, then non-parametric methods come to our rescue which can be used fairly satisfactorily.
5. Parametric tests cannot apply to ordinal or nominal scale data but non-parametric tests do not suffer from any such limitation.
6. The parametric tests of difference like 't' or 'F' make assumption about the homogeneity of the variances whereas this is not necessary for non-parametric tests of difference.

Table 12.10: Differences between Ranks (|di|) Assigned by k = 4 Judges and the Squares of such Differences (di²) for all Possible Pairs of Judges

Individuals     Pair 1-2      Pair 1-3      Pair 1-4      Pair 2-3      Pair 2-4      Pair 3-4
                |d|   d²      |d|   d²      |d|   d²      |d|   d²      |d|   d²      |d|   d²
A                1    1        2    4        0    0        1    1        1    1        2    4
B                1    1        1    1        1    1        0    0        2    4        2    4
C                1    1        1    1        3    9        0    0        4   16        4   16
D                2    4        3    9        1    1        1    1        1    1        2    4
E                0    0        0    0        1    1        0    0        1    1        1    1
F                1    1        2    4        1    1        1    1        2    4        3    9
G                0    0        1    1        1    1        1    1        1    1        2    4
Σdi²                  8            20            14             4            28            42

Spearman's coefficient of correlation, r = 1 − 6Σdi² / [N(N² − 1)]:
r12 = 0.857     r13 = 0.643     r14 = 0.750     r23 = 0.929     r24 = 0.500     r34 = 0.250

Average of Spearman's r's = (0.857 + 0.643 + 0.750 + 0.929 + 0.500 + 0.250)/6 = 3.929/6 = 0.655
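The figures in Table 12.10 can also be reproduced programmatically. The sketch below is an editorial illustration (the function name is assumed); it recomputes the six pairwise Spearman coefficients from the rank data of Illustration 9, averages them and recovers W through the relation given above:

from itertools import combinations

ranks = [
    [1, 3, 2, 5, 7, 4, 6],    # Judge 1
    [2, 4, 1, 3, 7, 5, 6],    # Judge 2
    [3, 4, 1, 2, 7, 6, 5],    # Judge 3
    [1, 2, 5, 4, 6, 3, 7],    # Judge 4
]
k, n = len(ranks), len(ranks[0])

def spearman_r(x, y):
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

pair_rs = [spearman_r(ranks[i], ranks[j]) for i, j in combinations(range(k), 2)]
average_r = sum(pair_rs) / len(pair_rs)        # about 0.655
w = (average_r * (k - 1) + 1) / k              # rearranged from: average r = (kW - 1)/(k - 1)
print([round(r, 3) for r in pair_rs], round(average_r, 3), round(w, 3))
# [0.857, 0.643, 0.75, 0.929, 0.5, 0.25]  0.655  0.741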

Testing of Hypotheses-II 313 CONCLUSION There are many situations in which the various assumptions required for standard tests of significance (such as that population is normal, samples are independent, standard deviation is known, etc.) cannot be met, then we can use non-parametric methods. Moreover, they are easier to explain and easier to understand. This is the reason why such tests have become popular. But one should not forget the fact that they are usually less efficient/powerful as they are based on no assumption (or virtually no assumption) and we all know that the less one assumes, the less one can infer from a set of data. But then the other side must also be kept in view that the more one assumes, the more one limits the applicability of one’s methods. Questions 1. Give your understanding of non-parametric or distribution free methods explaining their important characteristics. 2. Narrate the various advantages of using non-parametric tests. Also point out their limitations. 3. Briefly describe the different non-parametric tests explaining the significance of each such test. 4. On 15 occasions Mr. Kalicharan had to wait 4, 8, 2, 7, 7, 5, 8, 6, 1, 9, 6, 6, 5, 9 and 5 minutes for the bus he takes to reach his office. Use the sign test at 5% level of significance to test the bus company’s claim that on the average Mr. Kalicharan should not have to wait more than 5 minutes for a bus. 5. The following are the numbers of tickets issued by two policemen on 20 days: By first policeman: 7, 10, 14, 12, 6, 9, 11, 13, 7, 6, 10, 8, 14, 8, 12, 11, 9, 8, 10 and 15. By second policeman: 10, 13, 14, 11, 10, 7, 15, 11, 10, 9, 8, 12, 16, 10, 10, 14, 10, 12, 8 and 14. Use the sign test at 1% level of significance to test the null hypothesis that on the average the two policemen issue equal number of tickets against the alternative hypothesis that on the average the second policeman issues more tickets than the first one. 6. (a) Under what circumstances is the Fisher-Irwin test used? Explain. What is the main limitation of this test? (b) A housing contractor plans to build a large number of brick homes in the coming year. Two brick manufacturing concerns have given him nearly identical rates for supplying the bricks. But before placing his order, he wants to apply a test at 5% level of significance. The nature of the test is to subject each sampled brick to a force of 900 pounds. The test is performed on 8 bricks randomly chosen from a day’s production of concern A and on the same number of bricks randomly chosen from a day’s production of concern B. The results were as follows: Of the 8 bricks from concern A, two were broken and of the 8 bricks from concern B, five were broken. On the basis of these test results, determine whether the contractor should place order with concern A or with concern B if he prefers significantly stronger bricks. 7. Suppose that the breaking test described in problem 6(b) above is modified so that each brick is subjected to an increasing force until it breaks. The force applied at the time the brick breaks (calling it the breaking point) is recorded as under: Breaking-points Bricks of concern A 880, 950, 990, 975 895, 1030, 1025, 1010 Bricks of concern B 915, 790, 905, 900, 890, 825, 810 885.

314 Research Methodology On the basis of the above test results, determine whether the contractor should place order for bricks with concern A or with concern B (You should answer using U test or Wilcoxon-Mann-Whitney test). 8. The following are the kilometres per gallon which a test driver got for ten tankfuls each of three kinds of gasoline: Gasoline A 30, 41, 34, 43, 33, 34, 38, 26, 29, 36 Gasoline B 39, 28, 39, 29, 30, 31, 44, 43, 40, 33 Gasoline C 29, 41, 26, 36, 41, 43, 38, 38, 35, 40. Use the Kruskal-Wallis test at the level of significance α = 0.05 to test the null hypothesis that there is no difference in the average kilometre yield of the three types of gasoline. 9. (a) The following are the number of students absent from a college on 24 consecutive days: 29, 25, 31, 28, 30, 28, 33, 31, 35, 29, 31, 33, 35, 28, 36, 30, 33, 26, 30, 28, 32, 31, 38 and 27. Test for randomness at 1% level of significance. (b) The following arrangement indicates whether 25 consecutive persons interviewed by a social scientist are for (F) or against (A) an increase in the number of crimes in a certain locality: F, F, F, F, F, F, A, F, F, F, F, F, A, F, F, F, F, A, A, F, F, F, F, F, F. Test whether this arrangement of A’s and F’s may be regarded as random at 5% as well as at 10% level of significance. 10. Use a rank correlation at the 1% significance level and determine if there is significant positive correlation between the two samples on the basis of the following information: Blender A1 A2 A3 B C1 C2 D1 D2 E F1 F2 G1 G2 H model Sample 1 1 11 12 2 13 10 3 4 14 5 6 9 7 8 Sample 2 4 12 11 2 13 10 1 3 14 8 6 5 9 7 11. Three interviewers rank-order a group of 10 applicants as follows: Interviewers Applicants ab c d e f g h i j A 1 2 3 4 5 6 7 8 9 10 B 2 3 4 5 1 7 6 9 8 10 C 5 4 1 2 3 6 7 10 9 8 Compute the coefficient of concordance (W) and verify the same by using the relationship between average of Spearman’s r’s and the coefficient of concordance. Test the significance of W at 5% and 1% levels of significance and state what should be inferred from the same. Also point out the best estimate of true rankings. 12. Given are the values of Spearman’s r’s as under: rab = 0.607 rac = 0.429 rbc = 0.393 Calculate Kendall’s coefficient of concordance W from the above information and test its significance at 5% level.

Multivariate Analysis Techniques 315 13 Multivariate Analysis Techniques All statistical techniques which simultaneously analyse more than two variables on a sample of observations can be categorized as multivariate techniques. We may as well use the term ‘multivariate analysis’ which is a collection of methods for analyzing data in which a number of observations are available for each object. In the analysis of many problems, it is helpful to have a number of scores for each object. For instance, in the field of intelligence testing if we start with the theory that general intelligence is reflected in a variety of specific performance measures, then to study intelligence in the context of this theory one must administer many tests of mental skills, such as vocabulary, speed of recall, mental arithmetic, verbal analogies and so on. The score on each test is one variable, Xi, and there are several, k, of such scores for each object, represented as X1, X2 …Xk. Most of the research studies involve more than two variables in which situation analysis is desired of the association between one (at times many) criterion variable and several independent variables, or we may be required to study the association between variables having no dependency relationships. All such analyses are termed as multivariate analyses or multivariate techniques. In brief, techniques that take account of the various relationships among variables are termed multivariate analyses or multivariate techniques. GROWTH OF MULTIVARIATE TECHNIQUES Of late, multivariate techniques have emerged as a powerful tool to analyse data represented in terms of many variables. The main reason being that a series of univariate analysis carried out separately for each variable may, at times, lead to incorrect interpretation of the result. This is so because univariate analysis does not consider the correlation or inter-dependence among the variables. As a result, during the last fifty years, a number of statisticians have contributed to the development of several multivariate techniques. Today, these techniques are being applied in many fields such as economics, sociology, psychology, agriculture, anthropology, biology and medicine. These techniques are used in analyzing social, psychological, medical and economic data, specially when the variables concerning research studies of these fields are supposed to be correlated with each other and when rigorous probabilistic models cannot be appropriately used. Applications of multivariate techniques in practice have been accelerated in modern times because of the advent of high speed electronic computers.

316 Research Methodology CHARACTERISTICS AND APPLICATIONS Multivariate techniques are largely empirical and deal with the reality; they possess the ability to analyse complex data. Accordingly in most of the applied and behavioural researches, we generally resort to multivariate analysis techniques for realistic results. Besides being a tool for analyzing the data, multivariate techniques also help in various types of decision-making. For example, take the case of college entrance examination wherein a number of tests are administered to candidates, and the candidates scoring high total marks based on many subjects are admitted. This system, though apparently fair, may at times be biased in favour of some subjects with the larger standard deviations. Multivariate techniques may be appropriately used in such situations for developing norms as to who should be admitted in college. We may also cite an example from medical field. Many medical examinations such as blood pressure and cholesterol tests are administered to patients. Each of the results of such examinations has significance of its own, but it is also important to consider relationships between different test results or results of the same tests at different occasions in order to draw proper diagnostic conclusions and to determine an appropriate therapy. Multivariate techniques can assist us in such a situation. In view of all this, we can state that “if the researcher is interested in making probability statements on the basis of sampled multiple measurements, then the best strategy of data analysis is to use some suitable multivariate statistical technique.”1 The basic objective underlying multivariate techniques is to represent a collection of massive data in a simplified way. In other words, multivariate techniques transform a mass of observations into a smaller number of composite scores in such a way that they may reflect as much information as possible contained in the raw data obtained concerning a research study. Thus, the main contribution of these techniques is in arranging a large amount of complex information involved in the real data into a simplified visible form. Mathematically, multivariate techniques consist in “forming a linear composite vector in a vector subspace, which can be represented in terms of projection of a vector onto certain specified subspaces.”2 For better appreciation and understanding of multivariate techniques, one must be familiar with fundamental concepts of linear algebra, vector spaces, orthogonal and oblique projections and univariate analysis. Even then before applying multivariate techniques for meaningful results, one must consider the nature and structure of the data and the real aim of the analysis. We should also not forget that multivariate techniques do involve several complex mathematical computations and as such can be utilized largely with the availability of computer facility. CLASSIFICATION OF MULTIVARIATE TECHNIQUES Today, there exist a great variety of multivariate techniques which can be conveniently classified into two broad categories viz., dependence methods and interdependence methods. This sort of classification depends upon the question: Are some of the involved variables dependent upon others? If the answer is ‘yes’, we have dependence methods; but in case the answer is ‘no’, we have interdependence methods. Two more questions are relevant for understanding the nature of multivariate techniques. 
Firstly, in case some variables are dependent, the question is how many variables are dependent? The other question is, whether the data are metric or non-metric? This means whether 1K. Takeuchi, H. Yanai and B.N. Mukherji, The Foundations of Multivariate Analysis, p. 54. 2 Ibid., p. iii.

Multivariate Analysis Techniques 317

the data are quantitative, collected on interval or ratio scale, or whether the data are qualitative, collected on nominal or ordinal scale. The technique to be used for a given situation depends upon the answers to all these very questions. Jagdish N. Sheth in his article on "The multivariate revolution in marketing research"3 has given the flow chart that clearly exhibits the nature of some important multivariate techniques as shown in Fig. 13.1. Thus, we have two types of multivariate techniques: one type for data containing both dependent and independent variables, and the other type for data containing several variables without dependency relationship. In the former category are included techniques like multiple regression analysis, multiple discriminant analysis, multivariate analysis of variance and canonical analysis, whereas in the latter category we put techniques like factor analysis, cluster analysis, multidimensional scaling or MDS (both metric and non-metric) and the latent structure analysis.

[Fig. 13.1: Flow chart classifying all multivariate methods. The first question is "Are some variables dependent?" If yes (dependence methods), the chart asks how many variables are dependent: with one dependent variable, a metric one leads to multiple regression and a non-metric one to multiple discriminant analysis; with several dependent variables, the chart branches on whether they are metric, leading to multivariate analysis of variance and canonical analysis. If no (interdependence methods), the chart asks whether the inputs are metric: metric inputs lead to factor analysis, cluster analysis and metric MDS, while non-metric inputs lead to non-metric MDS and latent structure analysis.]

3Journal of Marketing, American Marketing Association, Vol. 35, No. 1 (Jan. 1971), pp. 13–19.
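The branching of Fig. 13.1 can also be written out as a small selection routine. The following sketch is purely an editorial illustration of the flow chart's logic (the function and its arguments are assumed, not taken from Sheth's article); for several dependent variables it simply lists the two candidate methods which the chart separates on the metric question:

def choose_technique(has_dependent, n_dependent=0, dependent_metric=True, inputs_metric=True):
    # Interdependence methods: no variable is treated as dependent
    if not has_dependent:
        if inputs_metric:
            return ["factor analysis", "cluster analysis", "metric MDS"]
        return ["non-metric MDS", "latent structure analysis"]
    # Dependence methods: at least one dependent variable
    if n_dependent == 1:
        return ["multiple regression"] if dependent_metric else ["multiple discriminant analysis"]
    # Several dependent variables: Fig. 13.1 distinguishes these two by whether they are metric
    return ["multivariate analysis of variance", "canonical analysis"]

print(choose_technique(True, n_dependent=1, dependent_metric=False))
# ['multiple discriminant analysis']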

318 Research Methodology

VARIABLES IN MULTIVARIATE ANALYSIS
Before we describe the various multivariate techniques, it seems appropriate to have a clear idea about the term 'variables' used in the context of multivariate analysis. Many variables used in multivariate analysis can be classified into different categories from several points of view. Important ones are as under:
(i) Explanatory variable and criterion variable: If X may be considered to be the cause of Y, then X is described as explanatory variable (also termed as causal or independent variable) and Y is described as criterion variable (also termed as resultant or dependent variable). In some cases both explanatory variable and criterion variable may consist of a set of many variables in which case set (X1, X2, X3, …, Xp) may be called a set of explanatory variables and the set (Y1, Y2, Y3, …, Yq) may be called a set of criterion variables if the variation of the former may be supposed to cause the variation of the latter as a whole. In economics, the explanatory variables are called external or exogenous variables and the criterion variables are called endogenous variables. Some people use the term external criterion for explanatory variable and the term internal criterion for criterion variable.
(ii) Observable variables and latent variables: Explanatory variables described above are supposed to be observable directly in some situations, and if this is so, the same are termed as observable variables. However, there are some unobservable variables which may influence the criterion variables. We call such unobservable variables as latent variables.
(iii) Discrete variable and continuous variable: Discrete variable is that variable which when measured may take only the integer value whereas continuous variable is one which, when measured, can assume any real value (even in decimal points).
(iv) Dummy variable (or Pseudo variable): This term is being used in a technical sense and is useful in algebraic manipulations in context of multivariate analysis. We call Xi (i = 1, …, m) a dummy variable, if only one of Xi is 1 and the others are all zero.

IMPORTANT MULTIVARIATE TECHNIQUES
A brief description of the various multivariate techniques named above (with special emphasis on factor analysis) is as under:
(i) Multiple regression*: In multiple regression we form a linear composite of explanatory variables in such a way that it has maximum correlation with a criterion variable. This technique is appropriate when the researcher has a single, metric criterion variable which is supposed to be a function of other explanatory variables. The main objective in using this technique is to predict the variability of the dependent variable based on its covariance with all the independent variables. One can predict the level of the dependent phenomenon through the multiple regression analysis model, given the levels of independent variables. Given a dependent variable, the linear-multiple regression problem is to estimate constants B1, B2, …, Bk and A such that the expression Y = B1X1 + B2X2 + … + BkXk + A provides a good estimate of an individual's Y score based on his X scores. In practice, Y and the several X variables are converted to standard scores: zy, z1, z2, …, zk; each z has a mean of 0 and standard deviation of 1. Then the problem is to estimate constants, βi, such that
z′y = β1z1 + β2z2 + … + βkzk
* See Chapter 7 also for other relevant information about multiple regression.

Multivariate Analysis Techniques 319 where z'y stands for the predicted value of the standardized Y score, zy. The expression on the right side of the above equation is the linear combination of explanatory variables. The constant A is eliminated in the process of converting X’s to z’s. The least-squares-method is used, to estimate the beta weights in such a way that the sum of the squared prediction errors is kept as small as possible d i2 i.e., the expression ∑ zy − zy′ is minimized. The predictive adequacy of a set of beta weights is indicated by the size of the correlation coefficient rzy ⋅ z′y between the predicted z′y scores and the actual zy scores. This special correlation coefficient from Karl Pearson is termed the multiple correlation coefficient (R). The squared multiple correlation, R2, represents the proportion of criterion (zy) variance accounted for by the explanatory variables, i.e., the proportion of total variance that is ‘Common Variance’. Sometimes the researcher may use step-wise regression techniques to have a better idea of the independent contribution of each explanatory variable. Under these techniques, the investigator adds the independent contribution of each explanatory variable into the prediction equation one by one, computing betas and R2 at each step. Formal computerized techniques are available for the purpose and the same can be used in the context of a particular problem being studied by the researcher. (ii) Multiple discriminant analysis: Through discriminant analysis technique, researcher may classify individuals or objects into one of two or more mutually exclusive and exhaustive groups on the basis of a set of independent variables. Discriminant analysis requires interval independent variables and a nominal dependent variable. For example, suppose that brand preference (say brand x or y) is the dependent variable of interest and its relationship to an individual’s income, age, education, etc. is being investigated, then we should use the technique of discriminant analysis. Regression analysis in such a situation is not suitable because the dependent variable is, not intervally scaled. Thus discriminant analysis is considered an appropriate technique when the single dependent variable happens to be non-metric and is to be classified into two or more groups, depending upon its relationship with several independent variables which all happen to be metric. The objective in discriminant analysis happens to be to predict an object’s likelihood of belonging to a particular group based on several independent variables. In case we classify the dependent variable in more than two groups, then we use the name multiple discriminant analysis; but in case only two groups are to be formed, we simply use the term discriminant analysis. We may briefly refer to the technical aspects* relating to discriminant analysis. (i) There happens to be a simple scoring system that assigns a score to each individual or object. This score is a weighted average of the individual’s numerical values of his independent variables. On the basis of this score, the individual is assigned to the ‘most likely’ category. For example, an individual is 20 years old, has an annual income of Rs 12,000,and has 10 years of formal education. Let b1, b2, and b3 be the weights attached to the independent variables of age, income and education respectively. 
The individual's score (z), assuming linear score, would be:
z = b1(20) + b2(12,000) + b3(10)
* Based on Robert Ferber, ed., Handbook of Marketing Research.

320 Research Methodology This numerical value of z can then be transformed into the probability that the individual is an early user, a late user or a non-user of the newly marketed consumer product (here we are making three categories viz. early user, late user or a non-user). (ii) The numerical values and signs of the b’s indicate the importance of the independent variables in their ability to discriminate among the different classes of individuals. Thus, through the discriminant analysis, the researcher can as well determine which independent variables are most useful in predicting whether the respondent is to be put into one group or the other. In other words, discriminant analysis reveals which specific variables in the profile account for the largest proportion of inter-group differences. (iii) In case only two groups of the individuals are to be formed on the basis of several independent variables, we can then have a model like this zi = b0 + b1X1i + b2X2i + ... + bnXni where Xji = the ith individual’s value of the jth independent variable; bj = the discriminant coefficient of the jth variable; zi = the ith individual’s discriminant score; zcrit. = the critical value for the discriminant score. The classification procedure in such a case would be If zi > zcrit., classify individual i as belonging to Group I If zi < zcrit, classify individual i as belonging to Group II. When n (the number of independent variables) is equal to 2, we have a straight line classification boundary. Every individual on one side of the line is classified as Group I and on the other side, every one is classified as belonging to Group II. When n = 3, the classification boundary is a two-dimensional plane in 3 space and in general the classification boundary is an n – 1 dimensional hyper-plane in n space. (iv) In n-group discriminant analysis, a discriminant function is formed for each pair of groups. If there are 6 groups to be formed, we would have 6(6 – 1)/2 = 15 pairs of groups, and hence 15 discriminant functions. The b values for each function tell which variables are important for discriminating between particular pairs of groups. The z score for each discriminant function tells in which of these two groups the individual is more likely to belong. Then use is made of the transitivity of the relation “more likely than”. For example, if group II is more likely than group I and group III is more likely than group II, then group III is also more likely than group I. This way all necessary comparisons are made and the individual is assigned to the most likely of all the groups. Thus, the multiple-group discriminant analysis is just like the two-group discriminant analysis for the multiple groups are simply examined two at a time. (v) For judging the statistical significance between two groups, we work out the Mahalanobis statistic, D2, which happens to be a generalized distance between two groups, where each group is characterized by the same set of n variables and where it is assumed that variance- covariance structure is identical for both groups. It is worked out thus: b g b gD2 = U1 − U2 v −1 U1 − U2 ′ where U = the mean vector for group I 1

Multivariate Analysis Techniques 321 U2 = the mean vector for group II v = the common variance matrix By transformation procedure, this D2 statistic becomes an F statistic which can be used to see if the two groups are statistically different from each other. From all this, we can conclude that the discriminant analysis provides a predictive equation, measures the relative importance of each variable and is also a measure of the ability of the equation to predict actual class-groups (two or more) concerning the dependent variable. (iii) Multivariate analysis of variance: Multivariate analysis of variance is an extension of bivariate analysis of variance in which the ratio of among-groups variance to within-groups variance is calculated on a set of variables instead of a single variable. This technique is considered appropriate when several metric dependent variables are involved in a research study along with many non-metric explanatory variables. (But if the study has only one metric dependent variable and several non- metric explanatory variables, then we use the ANOVA technique as explained earlier in the book.) In other words, multivariate analysis of variance is specially applied whenever the researcher wants to test hypotheses concerning multivariate differences in group responses to experimental manipulations. For instance, the market researcher may be interested in using one test market and one control market to examine the effect of an advertising campaign on sales as well as awareness, knowledge and attitudes. In that case he should use the technique of multivariate analysis of variance for meeting his objective. (iv) Canonical correlation analysis: This technique was first developed by Hotelling wherein an effort is made to simultaneously predict a set of criterion variables from their joint co-variance with a set of explanatory variables. Both metric and non-metric data can be used in the context of this multivariate technique. The procedure followed is to obtain a set of weights for the dependent and independent variables in such a way that linear composite of the criterion variables has a maximum correlation with the linear composite of the explanatory variables. For example, if we want to relate grade school adjustment to health and physical maturity of the child, we can then use canonical correlation analysis, provided we have for each child a number of adjustment scores (such as tests, teacher’s ratings, parent’s ratings and so on) and also we have for each child a number of health and physical maturity scores (such as heart rate, height, weight, index of intensity of illness and so on). The main objective of canonical correlation analysis is to discover factors separately in the two sets of variables such that the multiple correlation between sets of factors will be the maximum possible. Mathematically, in canonical correlation analysis, the weights of the two sets viz., a1, a2, … ak and yl, y2, y3, ... yj are so determined that the variables X = a1X1 + a2X2 +... + akXk + a and Y = y1Y1 + y2Y2 + … yjYj + y have a maximum common variance. The process of finding the weights requires factor analyses with two matrices.* The resulting canonical correlation solution then gives an over all description of the presence or absence of a relationship between the two sets of variables. (v) Factor analysis: Factor analysis is by far the most often used multivariate technique of research studies, specially pertaining to social and behavioural sciences. 
It is a technique applicable when there is a systematic interdependence among a set of observed or manifest variables and the researcher is interested in finding out something more fundamental or latent which creates this commonality. For instance, we might have data, say, about an individual’s income, education, occupation and dwelling * See, Eleanor W. Willemsen, Understanding Statistical Reasoning, p. 167–168.

322 Research Methodology

area and want to infer from these some factor (such as social class) which summarises the commonality of all the said four variables. The technique used for such purpose is generally described as factor analysis. Factor analysis, thus, seeks to resolve a large set of measured variables in terms of relatively few categories, known as factors. This technique allows the researcher to group variables into factors (based on correlation between variables) and the factors so derived may be treated as new variables (often termed as latent variables) and their value derived by summing the values of the original variables which have been grouped into the factor. The meaning and name of such new variable is subjectively determined by the researcher. Since the factors happen to be linear combinations of data, the coordinates of each observation or variable are measured to obtain what are called factor loadings. Such factor loadings represent the correlation between the particular variable and the factor, and are usually placed in a matrix of correlations between the variable and the factors.
The mathematical basis of factor analysis concerns a data matrix* (also termed as score matrix), symbolized as S. The matrix contains the scores of N persons on k measures. Thus a1 is the score of person 1 on measure a, a2 is the score of person 2 on measure a, and kN is the score of person N on measure k. The score matrix then takes the form shown below:

SCORE MATRIX (or Matrix S)
                           Measures (variables)
                         a      b      c     …     k
Persons (objects)   1    a1     b1     c1    …     k1
                    2    a2     b2     c2    …     k2
                    3    a3     b3     c3    …     k3
                    .     .      .      .           .
                    N    aN     bN     cN    …     kN

It is assumed that scores on each measure are standardized [i.e., xi = (Xi − X̄i)/σi]. This being so, the sum of scores in any column of the matrix, S, is zero and the variance of scores in any column is 1.0. Then factors (a factor is any linear combination of the variables in a data matrix and can be stated in a general way like: A = Waa + Wbb + … + Wkk) are obtained (by any method of factoring). After this, we work out factor loadings (i.e., factor-variable correlations). Then communality, symbolized as h², the eigen value and the total sum of squares are obtained and the results interpreted. For realistic results, we resort to the technique of rotation, because such rotations reveal different structures in the data. Finally, factor scores are obtained which help in explaining what the factors mean. They also facilitate comparison among groups of items as groups. With factor scores, one can also perform several other multivariate analyses such as multiple regression, cluster analysis, multiple discriminant analysis, etc.
*Alternatively the technique can be applied through the matrix of correlations, R as stated later on.
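Since every method described below starts from standardized scores and the matrix of correlations R, a short sketch may help fix these preliminaries. It is an editorial illustration only; the five-person, three-measure score matrix is invented for demonstration and NumPy is assumed to be available:

import numpy as np

# Raw score matrix S: N persons (rows) by k measures (columns); purely illustrative data
S = np.array([
    [10.0, 3.0, 7.0],
    [12.0, 5.0, 6.0],
    [ 9.0, 4.0, 9.0],
    [14.0, 6.0, 5.0],
    [11.0, 2.0, 8.0],
])

# Standardize each column: x_i = (X_i - mean of X_i) / sigma_i,
# so every column of Z has a mean of 0 and a variance of 1.0
Z = (S - S.mean(axis=0)) / S.std(axis=0)

# Product-moment correlation matrix R, with unities in the diagonal
N = Z.shape[0]
R = (Z.T @ Z) / N
print(np.round(R, 3))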

Multivariate Analysis Techniques 323 IMPORTANT METHODS OF FACTOR ANALYSIS There are several methods of factor analysis, but they do not necessarily give same results. As such factor analysis is not a single unique method but a set of techniques. Important methods of factor analysis are: (i) the centroid method; (ii) the principal components method; (ii) the maximum likelihood method. Before we describe these different methods of factor analysis, it seems appropriate that some basic terms relating to factor analysis be well understood. (i) Factor: A factor is an underlying dimension that account for several observed variables. There can be one or more factors, depending upon the nature of the study and the number of variables involved in it. (ii) Factor-loadings: Factor-loadings are those values which explain how closely the variables are related to each one of the factors discovered. They are also known as factor-variable correlations. In fact, factor-loadings work as key to understanding what the factors mean. It is the absolute size (rather than the signs, plus or minus) of the loadings that is important in the interpretation of a factor. (iii) Communality (h2): Communality, symbolized as h2, shows how much of each variable is accounted for by the underlying factor taken together. A high value of communality means that not much of the variable is left over after whatever the factors represent is taken into consideration. It is worked out in respect of each variable as under: h2 of the ith variable = (ith factor loading of factor A)2 + (ith factor loading of factor B)2 + … (iv) Eigen value (or latent root): When we take the sum of squared values of factor loadings relating to a factor, then such sum is referred to as Eigen Value or latent root. Eigen value indicates the relative importance of each factor in accounting for the particular set of variables being analysed. (v) Total sum of squares: When eigen values of all factors are totalled, the resulting value is termed as the total sum of squares. This value, when divided by the number of variables (involved in a study), results in an index that shows how the particular solution accounts for what all the variables taken together represent. If the variables are all very different from each other, this index will be low. If they fall into one or more highly redundant groups, and if the extracted factors account for all the groups, the index will then approach unity. (vi) Rotation: Rotation, in the context of factor analysis, is something like staining a microscope slide. Just as different stains on it reveal different structures in the tissue, different rotations reveal different structures in the data. Though different rotations give results that appear to be entirely different, but from a statistical point of view, all results are taken as equal, none superior or inferior to others. However, from the standpoint of making sense of the results of factor analysis, one must select the right rotation. If the factors are independent orthogonal rotation is done and if the factors are correlated, an oblique rotation is made. Communality for each variables will remain undisturbed regardless of rotation but the eigen values will change as result of rotation.

324 Research Methodology (vii) Factor scores: Factor score represents the degree to which each respondent gets high scores on the group of items that load high on each factor. Factor scores can help explain what the factors mean. With such scores, several other multivariate analyses can be performed. We can now take up the important methods of factor analysis. (A) Centroid Method of Factor Analysis This method of factor analysis, developed by L.L. Thurstone, was quite frequently used until about 1950 before the advent of large capacity high speed computers.* The centroid method tends to maximize the sum of loadings, disregarding signs; it is the method which extracts the largest sum of absolute loadings for each factor in turn. It is defined by linear combinations in which all weights are either + 1.0 or – 1.0. The main merit of this method is that it is relatively simple, can be easily understood and involves simpler computations. If one understands this method, it becomes easy to understand the mechanics involved in other methods of factor analysis. Various steps** involved in this method are as follows: (i) This method starts with the computation of a matrix of correlations, R, wherein unities are place in the diagonal spaces. The product moment formula is used for working out the correlation coefficients. (ii) If the correlation matrix so obtained happens to be positive manifold (i.e., disregarding the diagonal elements each variable has a large sum of positive correlations than of negative correlations), the centroid method requires that the weights for all variables be +1.0. In other words, the variables are not weighted; they are simply summed. But in case the correlation matrix is not a positive manifold, then reflections must be made before the first centroid factor is obtained. (iii) The first centroid factor is determined as under: (a) The sum of the coefficients (including the diagonal unity) in each column of the correlation matrix is worked out. (b) Then the sum of these column sums (T) is obtained. (c) The sum of each column obtained as per (a) above is divided by the square root of T obtained in (b) above, resulting in what are called centroid loadings. This way each centroid loading (one loading for one variable) is computed. The full set of loadings so obtained constitute the first centroid factor (say A). (iv) To obtain second centroid factor (say B), one must first obtain a matrix of residual coefficients. For this purpose, the loadings for the two variables on the first centroid factor are multiplied. This is done for all possible pairs of variables (in each diagonal space is the square of the particular factor loading). The resulting matrix of factor cross products may be named as Q1. Then Q1 is subtracted clement by element from the original matrix of *But since 1950, Principal components method, to be discussed a little later, is being popularly used. **See, Jum C. Nunnally, Psychometric Theory, 2nd ed., p. 349–357, for details.

Multivariate Analysis Techniques 325 correlation, R, and the result is the first matrix of residual coefficients, R1.* After obtaining R1, one must reflect some of the variables in it, meaning thereby that some of the variables are given negative signs in the sum [This is usually done by inspection. The aim in doing this should be to obtain a reflected matrix, R'1, which will have the highest possible sum of coefficients (T)]. For any variable which is so reflected, the signs of all coefficients in that column and row of the residual matrix are changed. When this is done, the matrix is named as ‘reflected matrix’ form which the loadings are obtained in the usual way (already explained in the context of first centroid factor), but the loadings of the variables which were reflected must be given negative signs. The full set of loadings so obtained constitutes the second centroid factor (say B). Thus loadings on the second centroid factor are obtained from R'1. (v) For subsequent factors (C, D, etc.) the same process outlined above is repeated. After the second centroid factor is obtained, cross products are computed forming, matrix, Q2. This is then subtracted from R1 (and not from R'1) resulting in R2. To obtain a third factor (C), one should operate on R2 in the same way as on R1. First, some of the variables would have to be reflected to maximize the sum of loadings, which would produce R'2 . Loadings would be computed from R'2 as they were from R'1. Again, it would be necessary to give negative signs to the loadings of variables which were reflected which would result in third centroid factor (C). We may now illustrate this method by an example. Illustration 1 Given is the following correlation matrix, R, relating to eight variables with unities in the diagonal spaces: Variables 1 2 3 4 56 7 8 Variables 1 1.000 .709 .204 .081 .626 .113 .155 .774 2 .709 1.000 .051 .089 .581 .098 .083 .652 3 .204 .051 1.000 .671 .123 .689 .582 .072 4 .081 .089 .671 1.000 .022 .798 .613 .111 5 .626 .581 .123 .022 1.000 .047 .201 .724 6 .113 .098 .689 .798 .047 1.000 .801 .120 7 .155 .083 .582 .613 .201 .801 1.000 .152 8 .774 .652 .072 .111 .724 .120 .152 1.000 Using the centroid method of factor analysis, work out the first and second centroid factors from the above information. * One should understand the nature of the elements in R1 matrix. Each diagonal element is a partial variance i.e., the variance that remains after the influence of the first factor is partialed. Each off-diagonal element is a partial co-variance i.e., the covariance between two variables after the influence of the first factor is removed. This can be verified by looking at the partial correlation coefficient between any two variables say 1 and 2 when factor A is held constant r12⋅ A = r12 − r1A ⋅ r2 A 1 − r12A 1 − r22A (The numerator in the above formula is what is found in R1 corresponding to the entry for variables 1 and 2. In the denominator, the square of the term on the left is exactly what is found in the diagonal element for variable 1 in R1. Likewise the partial variance for 2 is found in the diagonal space for that variable in the residual matrix.) contd.

326 Research Methodology Solution: Given correlation matrix, R, is a positive manifold and as such the weights for all variables be +1.0. Accordingly, we calculate the first centroid factor (A) as under: Table 13.1(a) Variables 1 2 34 5 6 7 8 Variables 1 1.000 .709 .204 .081 .626 .113 .155 .774 2 .709 1.000 .051 .089 .581 .098 .083 .652 3 .204 .051 1.000 .671 .123 .689 .582 .072 4 .081 .089 .671 1.000 .022 .798 .613 .111 5 .626 .581 .123 .022 1.000 .047 .201 .724 6 .113 .098 .689 .798 .047 1.000 .801 .120 7 .155 .083 .582 .613 .201 .801 1.000 .152 8 .774 .652 .072 .111 .724 .120 .152 1.000 Column sums 3.662 3.263 3.392 3.385 3.324 3.666 3.587 3.605 Sum of the column sums (T) = 27.884 ∴ T = 5.281 First centroid factor A = 3.662 , 3.263 3.392 , 3.385 3.324 3.666 3.587 , 3.605 , , , , 5.281 5.281 5.281 5.281 5.281 5.281 5.281 5.281 = .693, .618, .642, .641, .629, .694, .679, .683 We can also state this information as under: Table 13.1 (b) Variables Factor loadings concerning first Centroid factor A 1 2 .693 3 .618 4 .642 5 .641 6 .629 7 .694 8 .679 .683 To obtain the second centroid factor B, we first of all develop (as shown on the next page) the first matrix of factor cross product, Q1: Since in R1 the diagonal terms are partial variances and the off-diagonal terms are partial covariances, it is easy to convert the entire table to a matrix of partial correlations. For this purpose one has to divide the elements in each row by the square- root of the diagonal element for that row and then dividing the elements in each column by the square-root of the diagonal element for that column.
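The first-factor arithmetic shown above reduces to a few lines of code: each column sum of R is divided by the square root of T (here √27.884 = 5.281) to give the centroid loadings. The sketch below is an editorial illustration (NumPy assumed) and also forms the cross-product matrix Q1 and the first residual matrix R1 needed for the second centroid factor:

import numpy as np

# Correlation matrix R of Illustration 1 (unities in the diagonal)
R = np.array([
    [1.000, 0.709, 0.204, 0.081, 0.626, 0.113, 0.155, 0.774],
    [0.709, 1.000, 0.051, 0.089, 0.581, 0.098, 0.083, 0.652],
    [0.204, 0.051, 1.000, 0.671, 0.123, 0.689, 0.582, 0.072],
    [0.081, 0.089, 0.671, 1.000, 0.022, 0.798, 0.613, 0.111],
    [0.626, 0.581, 0.123, 0.022, 1.000, 0.047, 0.201, 0.724],
    [0.113, 0.098, 0.689, 0.798, 0.047, 1.000, 0.801, 0.120],
    [0.155, 0.083, 0.582, 0.613, 0.201, 0.801, 1.000, 0.152],
    [0.774, 0.652, 0.072, 0.111, 0.724, 0.120, 0.152, 1.000],
])

column_sums = R.sum(axis=0)              # 3.662, 3.263, 3.392, ...
T = column_sums.sum()                    # 27.884
loadings_A = column_sums / np.sqrt(T)    # first centroid factor A
print(np.round(loadings_A, 3))           # [0.693 0.618 0.642 0.641 0.629 0.694 0.679 0.683]

# Cross products Q1 and the first residual matrix R1, from which factor B is extracted
Q1 = np.outer(loadings_A, loadings_A)
R1 = R - Q1

The reflection step needed before extracting the second centroid factor is not shown; it would be applied to R1 by inspection, as described in step (iv) above.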

