average change in the dependent variable corresponding to a unit change in the independent variable.

The two regression coefficients $b$ and $b'$ must have the same sign, both + or both -. Also, $r$ has the same sign as $b$ and $b'$.

A positive value of the regression coefficient indicates that the relation between X and Y is direct. A negative value shows an inverse relationship between X and Y, i.e., high and low values are paired together.

Since $b = b_{yx} = r\,\dfrac{s_y}{s_x}$ and $b' = b_{xy} = r\,\dfrac{s_x}{s_y}$, we have

$$bb' = r^2 \quad \text{or} \quad b_{yx}\, b_{xy} = r^2$$

That $r$, $b$ and $b'$ have the same sign gives us an alternative definition of $r$. It is the square root of the product of $b$ and $b'$ and has the same sign as $b$ or $b'$:

$$r = \pm\sqrt{bb'} \quad \text{or} \quad r = \pm\sqrt{b_{yx}\, b_{xy}}$$

The product $bb'$ can never exceed 1 because $r$ cannot numerically exceed 1. This result can be used to distinguish between two regression lines for the same data. For example, if $b = -0.48$ and $b' = -1.66$, then $r = -\sqrt{(-0.48)(-1.66)} = -0.89$.

If the regression line of Y on X, i.e., $Y = a + bX$, exists, it does not imply that the regression of X on Y necessarily exists. If X is time and Y is sales, the regression of sales on time is expressed by $Y = a + bX$. But there is no question of regression (or dependence) of time on sales.

Can the Two Regression Lines Coincide?

The two regression lines are identical if and only if all the points in the scatter diagram lie on one straight line, i.e., if the correlation is perfect, $r = \pm 1$.

What is the point of intersection of the two regression lines?

$(\bar{X}, \bar{Y})$ is the only point common to both and hence the point of intersection. If we solve the two regression equations simultaneously we get $\bar{X}$ and $\bar{Y}$.

Example 8.4: For the following data showing index numbers of prices and production for 5 years, find the two regression lines and show that $bb' = r^2$.

Year    Index number of production    Index number of prices
1961    100                           107
1962    101                           123
1963    106                           133
1964     99                           109
1965     97                           129

Estimate the index number of prices when it is known that the index number of production is 110. Predict the index number of production when that of prices is known to be 120.

Solution: Use X for production and Y for prices. Subtract 100 from each value of X and 120 from each value of Y.
Note: $r$ does not change by change of scale and origin, i.e., by subtraction and division. $b$ changes by division (or multiplication) of observations by any number, but $b$ does not change by subtraction as done in this exercise. In simple regression problems, such subtraction may be avoided.

Thus, $u = X - 100$ and $v = Y - 120$.

  u      v     u²     v²     uv
  0    -13      0    169      0
  1      3      1      9      3
  6     13     36    169     78
 -1    -11      1    121     11
 -3      9      9     81    -27
Σu = 3   Σv = 1   Σu² = 47   Σv² = 549   Σuv = 65

(i) With $\bar{u} = 0.6$ and $\bar{v} = 0.2$,

$$r = \frac{\sum uv - n\bar{u}\bar{v}}{\sqrt{\sum u^2 - n\bar{u}^2}\,\sqrt{\sum v^2 - n\bar{v}^2}} = \frac{65 - 5 \times 0.6 \times 0.2}{\sqrt{47 - 5 \times 0.6 \times 0.6}\,\sqrt{549 - 5 \times 0.2 \times 0.2}} = \frac{64.4}{\sqrt{45.2}\,\sqrt{548.8}} = \frac{64.4}{6.72 \times 23.43} = 0.41$$

(ii) For the regression of Y on X no additional work is necessary, because $b$ does not change by change of origin only. It changes by change of scale.

$$b = \frac{\sum uv - n\bar{u}\bar{v}}{\sum u^2 - n\bar{u}^2} = \frac{64.4}{45.2} = 1.42$$

Now, $\bar{X} = \dfrac{\sum X}{n} = \dfrac{503}{5} = 100.6$ and $\bar{Y} = \dfrac{\sum Y}{n} = \dfrac{601}{5} = 120.2$, so

$$a = \bar{Y} - b\bar{X} = 120.2 - 1.42 \times 100.6 = -22.65$$

The regression of Y on X is given by:

$$Y = -22.65 + 1.42X$$

(iii) To find the regression of X on Y:

$$b' = \frac{\sum uv - n\bar{u}\bar{v}}{\sum v^2 - n\bar{v}^2} = \frac{64.4}{548.8} = 0.12$$

$$a' = \bar{X} - b'\bar{Y} = 100.6 - 0.12 \times 120.2 = 86.18$$

The regression of X on Y is given by:

$$X = 86.18 + 0.12Y$$

(iv) $\sqrt{bb'} = \sqrt{1.42 \times 0.12} = 0.41 = r$

(v) To predict Y from X, substitute X = 110 in the regression of Y on X:

Predicted Y = -22.65 + 1.42 × 110 = 133.55

To predict X from Y, substitute Y = 120 in the regression of X on Y:

Predicted X = 86.18 + 0.12 × 120 = 100.58
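The arithmetic of Example 8.4 can be checked with a short script. What follows is a minimal sketch in Python and is not part of the original text: the function name `regression_summary` is hypothetical, and the five price values are assumed to be the ones implied by the worked deviations v = -13, 3, 13, -11, 9. It works with unrounded coefficients, so intercepts and predictions may differ slightly from hand calculations carried to two decimals.

```python
from math import sqrt

# Production (X) and prices (Y) for 1961-1965. The Y values are an assumption:
# they are the ones implied by the deviations v = Y - 120 used in the solution.
X = [100, 101, 106, 99, 97]
Y = [107, 123, 133, 109, 129]


def regression_summary(xs, ys):
    """Return both regression lines (slope, intercept) and r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of squares and cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sxy / sxx  # slope of the regression of Y on X
    b_xy = sxy / syy  # slope of the regression of X on Y
    return {
        "b_yx": b_yx,
        "a_yx": mean_y - b_yx * mean_x,
        "b_xy": b_xy,
        "a_xy": mean_x - b_xy * mean_y,
        "r": sxy / sqrt(sxx * syy),
    }


s = regression_summary(X, Y)
print("Y on X: Y = %.2f + %.2f X" % (s["a_yx"], s["b_yx"]))
print("X on Y: X = %.2f + %.2f Y" % (s["a_xy"], s["b_xy"]))
print("r = %.2f, b*b' = %.4f, r^2 = %.4f" % (s["r"], s["b_yx"] * s["b_xy"], s["r"] ** 2))
print("Predicted Y at X = 110: %.2f" % (s["a_yx"] + s["b_yx"] * 110))
print("Predicted X at Y = 120: %.2f" % (s["a_xy"] + s["b_xy"] * 120))
```

Because the product of the two slopes equals $r^2$ exactly, comparing the last two figures on the third printed line is a quick consistency check on any pair of fitted regression lines.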
Example 8.5: A firm doubles the number of its employees and profit increases significantly. Does it imply profit depends on the number of employees?

Solution: It is likely that the increase in the employee number has come along with an increase in capital, efficiency or other development. The increase in employee number need not be the basic cause of the profit increase. If X and Y have nothing to do with each other logically but the observations on X and Y happen to move according to a pattern, the resulting regression equation, even if it is well fitted and has significant coefficients, is spurious and meaningless.

Example 8.6: In a linear regression analysis of 60 observations, the two lines of regression are

$$1000Y = 768X - 3608 \quad \text{and} \quad 5X = 6Y + 24$$

What is the coefficient of correlation in the data? Show that the ratio of the coefficient of variation of X to that of Y is $\dfrac{5}{24}$.

Solution: From the data we have

$$b = \frac{768}{1000}, \qquad b' = \frac{6}{5}$$

Coefficient of correlation $r = \sqrt{bb'} = \sqrt{0.922} = 0.96$

If we solve the two equations, we get $\bar{X} = 6$, $\bar{Y} = 1$. Since $b' = r\,\dfrac{s_x}{s_y}$, we have $\dfrac{s_x}{s_y} = \dfrac{1.2}{0.96} = 1.25$.

Coefficient of variation of X is $\dfrac{s_x}{\bar{X}} \times 100$.

Coefficient of variation of Y is $\dfrac{s_y}{\bar{Y}} \times 100$.

Their ratio is

$$\frac{s_x/\bar{X}}{s_y/\bar{Y}} = \frac{s_x}{s_y} \times \frac{\bar{Y}}{\bar{X}} = 1.25 \times \frac{1}{6} = \frac{5}{24}$$

Example 8.7: What if $bb' > 1$?

Solution: Since $r^2$ cannot exceed 1, if $bb' > 1$, interchange the dependent and independent variables in the two regression lines.

Example 8.8: The equations of two regression lines obtained in a correlation analysis of 60 observations are $5x = 6y + 24$ and $1000y = 768x - 3608$. What is the correlation coefficient and what is its probable error? Show that the ratio of the coefficient of variation of x to that of y is $\dfrac{5}{24}$. What is the ratio of the variances of x and y?

Solution: The equations of the regression lines are given as,

$$5x = 6y + 24 \quad \text{and} \quad 1000y = 768x - 3608$$
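From these equations,

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{6}{5} \qquad (1)$$

and

$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{768}{1000} \qquad (2)$$

Multiplying equations (1) and (2), we get

$$b_{xy} \times b_{yx} = r^2 = \frac{6}{5} \times \frac{768}{1000} \;\Rightarrow\; r = \pm 0.96$$

Since both $b_{xy}$ and $b_{yx}$ are positive, the correlation coefficient $r$ is also positive and hence $r = +0.96$.

Probable error of $r$:

$$\text{P.E.}(r) = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} = 0.6745 \times \frac{1 - 0.9216}{\sqrt{60}} \approx 0.0068$$

Each regression line passes through $(\bar{x}, \bar{y})$. So from the given equations of these lines we have

$$5\bar{x} = 6\bar{y} + 24 \quad \text{and} \quad 1000\bar{y} = 768\bar{x} - 3608$$

Solving these we get

$$\bar{x} = 6 \quad \text{and} \quad \bar{y} = 1 \qquad (3)$$

From equation (1), we have $r\,\dfrac{\sigma_x}{\sigma_y} = \dfrac{6}{5}$, where $r = 0.96$, or

$$\frac{\sigma_x}{\sigma_y} = \frac{6}{5} \times \frac{1}{0.96} = \frac{5}{4} \qquad (4)$$

The ratio of the coefficient of variation of x to that of y is therefore

$$\frac{\sigma_x/\bar{x}}{\sigma_y/\bar{y}} = \frac{\sigma_x}{\sigma_y} \times \frac{\bar{y}}{\bar{x}} = \frac{5}{4} \times \frac{1}{6} = \frac{5}{24} \quad \text{(from equations (3) and (4))}$$

and the ratio of the variances is $\sigma_x^2/\sigma_y^2 = (5/4)^2 = 25/16$.

Example 8.9: The two lines of regression are $x + 2y - 5 = 0$ and $2x + 3y - 8 = 0$, and the variance of x is 12. Calculate the values of $\bar{x}$, $\bar{y}$, $\sigma_y^2$ and $r$.

Solution: Since each regression line passes through $(\bar{x}, \bar{y})$, from the given equations we have

$$2\bar{y} = -\bar{x} + 5 \quad \text{and} \quad 2\bar{x} = -3\bar{y} + 8 \qquad (1)$$

Solving these, we get $\bar{x} = 1$, $\bar{y} = 2$.

Assume the lines of regression of y on x and x on y to be, respectively,

$$2y = -x + 5 \quad \text{and} \quad 2x = -3y + 8$$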
From equation (1) we have,

$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = -\frac{1}{2} \qquad (2)$$

and

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = -\frac{3}{2} \qquad (3)$$

Multiplying equations (2) and (3), we get

$$b_{yx} \times b_{xy} = r^2 = \left(-\frac{1}{2}\right) \times \left(-\frac{3}{2}\right) = \frac{3}{4} \;\Rightarrow\; r = \pm\frac{\sqrt{3}}{2} = \pm 0.866$$

Since $b_{yx}$ and $b_{xy}$ are negative, the correlation coefficient $r$ is negative. Thus, $r = -0.866$.

Now, $\sigma_x^2 = 12$ (given). From equation (2), we have

$$r^2\,\frac{\sigma_y^2}{\sigma_x^2} = \left(-\frac{1}{2}\right)^2 = \frac{1}{4} \quad \text{or} \quad \sigma_y^2 = \frac{\sigma_x^2}{4r^2} = \frac{12}{4 \times (-0.866)^2} = 4$$

Note: If we assume the lines of regression of y on x and x on y to be $3y = -2x + 8$ and $x = -2y + 5$, then we shall get $b_{yx} \times b_{xy} = \left(-\tfrac{2}{3}\right)(-2) = \tfrac{4}{3} > 1$, which is inadmissible.

8.6.2 Formulae used in Regression

Once a reasonable degree of correlation is established between two variables, we may be interested in estimating or predicting the value of one variable given the value of another. It is here that regression analysis comes into the picture. Regression analysis reveals the average relationship between two variables and this makes possible estimation or prediction via a mathematical equation connecting the two variables.

1. Regression equation of X on Y:

$$(X - \bar{X}) = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$$

$$r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{\sum y^2}$$

(if deviations are taken from the actual means of X and Y, i.e., where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$)
$$r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum d_x d_y - \dfrac{(\sum d_x)(\sum d_y)}{N}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{N}}$$

(if deviations are taken from assumed means of X and Y, i.e., if $d_x = X - A_x$ and $d_y = Y - A_y$, where $A_x$ and $A_y$ are the assumed means)

2. Regression equation of Y on X:

$$(Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$$

$$r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{\sum x^2}$$

(if the deviations are taken from the actual means of X and Y, i.e., if $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$)

$$r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum d_x d_y - \dfrac{(\sum d_x)(\sum d_y)}{N}}{\sum d_x^2 - \dfrac{(\sum d_x)^2}{N}}$$

(if deviations are taken from assumed means of X and Y, i.e., if $d_x = X - A_x$ and $d_y = Y - A_y$)

3. Regression coefficients:

$r\,\dfrac{\sigma_x}{\sigma_y}$, or $b_{xy}$, is the regression coefficient of X on Y.

$r\,\dfrac{\sigma_y}{\sigma_x}$, or $b_{yx}$, is the regression coefficient of Y on X.

4. $$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{N\sigma_y^2} = \frac{\mu_{11}}{\sigma_y^2} \qquad [\text{where } \mu_{11} = \text{Covariance}(x, y)]$$

Note: In case we deal with actual values of the X and Y variables and not the deviations, then

$$b_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2}$$

5. $$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{N\sigma_x^2} = \frac{\mu_{11}}{\sigma_x^2} \qquad [\text{where } \mu_{11} = \text{Covariance}(x, y)]$$

Note: In case we deal with actual values of the X and Y variables and not the deviations,
then

$$b_{yx} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}$$

6. Angle between the lines of regression: If $\theta$ is the acute angle between the two lines of regression (X on Y and Y on X), then

$$\tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

Notes:

(i) Both the lines of regression (of Y on X and of X on Y) pass through the point $(\bar{X}, \bar{Y})$, the means of the X and Y series.

(ii) If the two lines of regression coincide, then the correlation between X and Y is perfect, and by equating the respective slopes we get

$$r\,\frac{\sigma_y}{\sigma_x} = \frac{\sigma_y}{r\,\sigma_x} \quad \text{or} \quad r^2 = 1 \quad \text{or} \quad r = \pm 1$$

Hence, $\tan\theta = 0 \Rightarrow r^2 = 1 \Rightarrow r = \pm 1$.

(iii) If the coefficient of correlation $r$ between X and Y is zero, i.e., the variables X and Y are uncorrelated, it can be easily seen that the lines of regression of Y on X and of X on Y are respectively given by $Y = \bar{Y}$ and $X = \bar{X}$, and these two regression lines intersect at right angles. Therefore if $r = 0$, $\tan\theta = \infty \Rightarrow \theta = \pi/2$ or 90°.

7. Standard error of estimate: The standard error of regression of Y values from the computed values $Y_c$ is denoted by $S_{yx}$:

$$S_{yx} = \sqrt{\frac{\sum (Y - Y_c)^2}{N}} = \sqrt{\frac{\text{Unexplained variation}}{N}}$$

Also,

$$S_{yx} = \sigma_y\sqrt{1 - r^2} \quad \text{and} \quad S_{yx} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{N}}$$

Similarly, if $S_{xy}$ stands for the standard error of regression of X values from $X_c$, then

$$S_{xy} = \sqrt{\frac{\sum (X - X_c)^2}{N}}$$

Also,

$$S_{xy} = \sqrt{\frac{\sum X^2 - a\sum X - b\sum XY}{N}}$$

where, in each case, $a$ and $b$ are the constants of the corresponding regression equation.

Note: The standard error of estimate measures the accuracy of the estimated figures. The smaller the value of the standard error of estimate, the closer will be the dots to the regression line and the better the estimates based on the equation for this line. If the
standard error of estimate is zero, then there is no variation about the line and the correlation will be perfect. Thus, with the help of the standard error of estimate, it is possible for us to ascertain how good and representative the regression line is as a description of the average relationship between two series.

Properties of Regression Coefficients

(i) The coefficient of correlation $r$ between the variables x and y is the geometric mean of the two regression coefficients $b_{yx}$ and $b_{xy}$ (i.e., $r = \pm\sqrt{b_{yx} \times b_{xy}}$).

(ii) Though $r_{xy} = r_{yx}$ (always), $b_{xy} \neq b_{yx}$ in general. (They become equal only when $\sigma_x = \sigma_y$.)

(iii) If one of the regression coefficients is greater than unity numerically, then the other is less than unity numerically.

(iv) The arithmetic mean of the regression coefficients is greater than the coefficient of correlation $r$ (by and large).

(v) The covariance, the coefficient of correlation $r$ and the two regression coefficients have the same sign.

(vi) Though the correlation coefficient is independent of both scale and origin, the regression coefficients are independent of change of origin but not of scale.

8.7 CONCEPT OF ERROR

We have found a line through the scatter points which best fits the data. But how good is this fit? How reliable is the estimated value of $\hat{Y}$? How close are the values of $\hat{Y}$ to the observed values of Y? The closer these values are to each other, the better the fit. This means that if the points in the scatter diagram are closely spaced around the regression line, then the estimated value $\hat{Y}$ will be close to the observed value of Y and hence, this estimate can be considered as highly reliable. Accordingly, a measure of the variability of scatter around the regression line would determine the reliability of this estimate $\hat{Y}$. The smaller this measure, the more dependable the prediction will be. (This measure is similar in nature to the standard deviation, which is also a measure of the scatter of data around the mean.)
This measure is known as the standard error of the estimate and is used to determine the dispersion of observed values of Y about the regression line. This measure is designated by $S_{y.x}$ and is given by:

$$S_{y.x} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}$$

where,
Y = Observed value of the dependent variable
$\hat{Y}$ = Corresponding computed value of the dependent variable
n = Sample size
and (n - 2) = Degrees of freedom
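As a rough illustration of this formula, the sketch below (Python, not from the original text) fits the least squares line of Y on X and then computes $S_{y.x}$ with n - 2 degrees of freedom. The function name `standard_error_of_estimate` is hypothetical, and the data are assumed to be the fathers' and sons' heights used in this unit's examples.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys):
    """Fit Y = b0 + b1*X by least squares and return (b0, b1, S_y.x)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope of the fitted line
    b0 = mean_y - b1 * mean_x   # intercept of the fitted line
    # Sum of squared differences between observed Y and computed Y-hat
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return b0, b1, sqrt(sse / (n - 2))


# Assumed data: heights (in inches) of fathers (X) and their sons (Y)
fathers = [63, 65, 66, 67, 67, 68]
sons = [66, 68, 65, 67, 69, 70]

b0, b1, s_yx = standard_error_of_estimate(fathers, sons)
print("Y-hat = %.2f + %.3f X, S_y.x = %.3f" % (b0, b1, s_yx))
```

Dividing by n - 2 rather than n reflects the two constants estimated from the sample; with small samples such as this one the two divisors give noticeably different results.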
Based upon this relationship, a simpler formula for calculating $S_{y.x}$ would be:

$$S_{y.x} = \sqrt{\frac{\sum Y^2 - b_0\sum Y - b_1\sum XY}{n - 2}}$$

Example 8.10: Consider Example 2, which shows the relationship of heights between sons and their fathers. Let us calculate the standard error of the estimate $S_{y.x}$.

Solution: Now,

$$S_{y.x} = \sqrt{\frac{\sum Y^2 - b_0\sum Y - b_1\sum XY}{n - 2}} = \sqrt{\frac{27355 - 26.25(405) - 0.625(26740)}{4}} = \sqrt{\frac{11.25}{4}} = \sqrt{2.8125} = 1.678$$

8.8 COEFFICIENT OF DETERMINATION

The coefficient of determination ($r^2$), the square of the coefficient of correlation ($r$), is a more precise measure of the strength of the relationship between two variables and lends itself to more precise interpretation because it can be presented as a proportion or as a percentage.

The coefficient of determination ($r^2$) can be defined as the proportion of the variation in the dependent variable Y that is explained by the variation in the independent variable X in the regression model. In other words:

$$r^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{b_0\sum Y + b_1\sum XY - \dfrac{(\sum Y)^2}{n}}{\sum Y^2 - \dfrac{(\sum Y)^2}{n}}$$

Check Your Progress
5. What do you understand by coefficient of correlation?
6. Write the assumptions used to find the coefficient of correlation by Karl Pearson's method.
7. Define the coefficient of determination, $r^2$.
9. Can the two regression lines coincide?
11. Differentiate between correlation analysis and regression analysis.

Example 8.11: Let us calculate the coefficient of correlation $r$ and the coefficient of determination ($r^2$) from our example of heights of sons and fathers.

Father (X)    Son (Y)
63            66
65            68
66            65
67            67
67            69
68            70

Solution: Now,

$$r^2 = \frac{b_0\sum Y + b_1\sum XY - \dfrac{(\sum Y)^2}{n}}{\sum Y^2 - \dfrac{(\sum Y)^2}{n}}$$
Since all these values have been calculated earlier, we simply substitute these values in the formula to determine the value of $r^2$.

Hence,

$$r^2 = \frac{26.25(405) + 0.625(26740) - \dfrac{(405)^2}{6}}{27355 - \dfrac{(405)^2}{6}} = \frac{10631.25 + 16712.5 - 27337.5}{27355 - 27337.5} = \frac{6.25}{17.5} = 0.357$$

and $r = \sqrt{r^2} = \sqrt{0.357} = 0.597$

While the value of $r = 0.597$ is more of an abstract figure, the value of $r^2 = 0.357$ tells us that 35.7 per cent of the variation in Y is explained by the variation in X. This indicates a weak relationship, since a value of $r^2 = 0$ means no relationship at all and a value of $r^2 = 1$ or 100 per cent means a perfect relationship. In general, for a high degree of correlation which leads to better estimates and prediction, the coefficient of determination must have a high value.

8.9 APPLICATIONS OF CORRELATION AND REGRESSION

As we have seen, correlation and regression can be applied in many situations. They include the following:

• Testing hypotheses on cause-effect relationships
• Checking whether two variables are related, without necessarily inferring a cause and effect relationship
• Estimating the value of one variable corresponding to a particular value of the other variable

Correlation and regression analysis enable businesses to investigate the determinants of key variables, such as their sales. Variations in a company's sales are likely to be related to variation in product prices and in consumers' incomes, tastes and preferences. Multiple regression analysis can be used to investigate the nature of this relationship and correlation analysis can be used to test the goodness of fit.

One example can be given here. In 1964, tobacco companies argued that the US Surgeon General's report did not prove that cigarette smoking caused lung cancer. It only pointed out a statistically significant predictive value in terms of Americans who smoked and Americans who developed lung cancer.
Eventually other studies corroborated the Surgeon General's findings for several medical problems.

8.10 SUMMARY

In this unit you have learned that regression analysis is a mathematical process of using observations to find the line of best fit through data in order to make estimates and predictions about the behaviour of the variables. This line of best fit may be linear (straight) or curvilinear, following some mathematical formula. Correlation analysis is the process
of finding how well (or badly) the line fits the observations, such that if all the observations lie exactly on the line of best fit, the correlation is considered to be 1 or unity. The least squares method is the most widely used procedure for developing estimates of the model parameters. You have also learned that correlation and regression analysis are related since both deal with relationships among variables. Regression and correlation analyses thus determine the nature and strength of the relationship between two variables.

8.11 ANSWERS TO 'CHECK YOUR PROGRESS'

1. There are several types of correlations. These are:
(a) Positive or negative correlation
(b) Linear or non-linear correlation
(c) Simple, partial or multiple correlation

2. Correlation analysis is the statistical tool that is generally used to describe the degree to which one variable is related to another. The relationship, if any, is usually assumed to be a linear one. This analysis is used quite frequently in conjunction with regression analysis to measure how well the regression line explains the variations of the dependent variable. In fact, the word correlation refers to the relationship or the interdependence between two variables. There are various phenomena which are related to each other. For instance, when the demand for a certain commodity increases, its price goes up, and when its demand decreases, the price comes down.

3. The scatter diagram method calculates the constants in regression models by making use of a scatter diagram or dot diagram. A scatter diagram is a diagram that represents two series, with the known variable, i.e., the independent variable, plotted on the X-axis and the variable to be estimated, i.e., the dependent variable, plotted on the Y-axis.

4.
The least squares method is a method to calculate the constants in regression models for fitting a line through the scatter diagram that minimizes the sum of the squared vertical deviations from the fitted line. In other words, the line to be fitted will pass through the points of the scatter diagram in such a fashion that the sum of the squares of the vertical deviations of these points from the line will be a minimum.

5. The coefficient of correlation, which is symbolically denoted by r, is another important measure to describe how well one variable explains another. It measures the degree of relationship between the two causally related variables. The value of this coefficient can never be more than +1 or less than -1. Thus, +1 and -1 are the limits of this coefficient.

6. Karl Pearson's method is the most widely used method of measuring the relationship between two variables. This coefficient is based on the following assumptions:
(i) There is a linear relationship between the two variables, which means that a straight line would be obtained if the observed data is plotted on a graph.
(ii) The two variables are causally related, which means that one of the variables is independent and the other one is dependent.
(iii) A large number of independent causes operate in both the variables so as to produce a normal distribution.

7. The coefficient of determination ($r^2$), the square of the coefficient of correlation ($r$), is a more precise measure of the strength of the relationship between the two variables and lends itself to more precise interpretation because it can be presented as a proportion or as a percentage.

9. Two regression lines are identical and coincide if and only if all the points in the scatter diagram lie on one straight line, i.e., if the correlation is perfect, $r = \pm 1$.

11. Correlation analysis helps us in determining the degree to which two or more variables are related to each other. When there are only two variables we can determine the degree to which one variable is linearly related to the other. Regression analysis helps in determining the pattern of relationship between one or more independent variables and a dependent variable. This is done by an equation estimated with the help of data.

8.12 QUESTIONS AND EXERCISES

Short-Answer Questions

1. Explain the meaning and significance of regression.
2. Write any three predictions that can be assumed while using the regression technique.
3. Explain how the least squares method is useful in statistical calculations.
4. What is a 'Scatter Diagram'?
5. How does the scatter diagram help in studying the correlation between two variables?
6. Define correlation analysis.
7. What are the different types of correlation?
8. How will you calculate the coefficient of correlation?
9. Write the method for calculating the coefficient of correlation by Karl Pearson's Method.

Long-Answer Questions

1. Obtain the estimating equation by the method of least squares from the following information:

X (Independent variable)    Y (Dependent variable)
 2                          18
 4                          12
 5                          10
 6                           8
 8                           7
11                           5

2. Find out the coefficient of correlation between the two kinds of assessment of M.A. students' performance using:
(a) Karl Pearson's Method
(b) the method of least squares
S.No. of    Internal assessment    External assessment
students    (Marks obtained        (Marks obtained
            out of 100)            out of 100)
1           51                     49
2           63                     72
3           73                     74
4           46                     44
5           50                     58
6           60                     66
7           47                     50
8           36                     30
9           60                     35

Also, work out $r^2$ and interpret the same.

3. Calculate the correlation coefficient from the following results:

n = 10; ΣX = 140; ΣY = 150
Σ(X - 10)² = 180; Σ(Y - 15)² = 215
Σ(X - 10)(Y - 15) = 60

4. Given is the following information:

Observation    Test score X    Sales ('000 Rs) Y
 1             73              450
 2             78              490
 3             92              570
 4             61              380
 5             87              540
 6             81              500
 7             77              480
 8             70              430
 9             65              410
10             82              490
Total          766             4740

You are required to:
(i) Graph the scatter diagram for this data.
(ii) Find the regression equation and draw the line corresponding to the equation on the scatter diagram.
(iii) Make an estimate about sales if the test score happens to be 75.

5. Calculate the correlation coefficient and the two regression lines for the following information:

Ages of husbands    Ages of wives (in years)
(in years)          10-20    20-30    30-40    40-50    Total
10-20               20       26       -        -        46
20-30                8       14       37       -        59
30-40               -         4       18       3        25
40-50               -        -         4       6        10
Total               28       44       59       9        140
6. To know what relationship exists between unemployment and suicide attempts, a sociologist surveyed twelve cities and obtained the following data:

S.No. of    Unemployment rate    Number of suicide attempts
the city    (per cent)           per 1000 residents
 1           7.3                 22
 2           6.4                 17
 3           6.2                  9
 4           5.5                  8
 5           6.4                 12
 6           4.7                  5
 7           5.8                  7
 8           7.9                 19
 9           6.7                 13
10           9.6                 29
11          10.3                 33
12           7.2                 18

(i) Develop the estimating equation that best describes the given relationship.
(ii) Find a prediction interval (with 95% confidence level) for the attempted suicide rate when the unemployment rate happens to be 6%.

7. (i) Give an example of a pair of variables which would have:
(a) An increasing relationship
(b) No relationship
(c) A decreasing relationship

(ii) Suppose that the general relationship between height in inches (X) and weight in kg (Y) is Y = 10 + 2.2(X). Suppose that weights of persons of a given height are normally distributed with a dispersion measurable by $\sigma_e$ = 10 kg.
(a) What would be the expected weight for a person whose height is 65 inches?
(b) If a person whose height is 65 inches should weigh 161 kg, what value of e does this represent?
(c) What reasons might account for the value of e for the person of part (b)?
(d) What would be the probability that someone whose height is 70 inches would weigh between 124 and 184 kg?

8. Examine the following statements and state whether each one of the statements is true or false, giving reasons for your answer.
(i) If the value of the coefficient of correlation is 0.9 then this indicates that 90 per cent of the variation in the dependent variable has been explained by variation in the independent variable.
(ii) If a highly significant relationship is found between the two variables X and Y, then this constitutes definite proof that there is a causal relationship between
these two variables.
(iii) A negative value of the 'b' coefficient in a regression relationship indicates a weaker relationship between the variables involved than would a positive value for the 'b' coefficient in a regression relationship.
(iv) If the value for the 'b' coefficient in an estimating equation is less than 0.5, then the relationship will not be a significant one.
(v) $r^2 + k^2$ is always equal to one. From this, it can also be inferred that $r + k$ is equal to one.
[r = coefficient of correlation; $r^2$ = coefficient of determination; k = coefficient of alienation; $k^2$ = coefficient of non-determination.]

9. Show that if five students get respectively 1, 2, 3, 4, 5 marks (X) out of 10 in subject A and 3, 3, 5, 7, 6 marks (Y) out of 10 in subject B, the regression of subject B on subject A can be written Y = 1.8 + X.

10. Show that for 10 students ρ = 0.79 for marks given by

Judge I     23   20   19   17   16   28   24   25   27   22
Judge II    30   28   27   41   36   45   46   44   43   39

11. For the following results showing average increase in hours of sleep gained by using two drugs on 10 persons, find r and ρ.

Drug A    1.1   1.0   0.9   0.8   0.7   1.9   1.6   1.6   1.2   1.1
Drug B    1.2   0.9   0.9   0.6   0.7   1.6   1.7   1.4   1.3   1.2

12. In a regression analysis problem the following data is given: $s_x = 3$ and the regression lines 8X - 10Y + 66 = 0, 40X - 18Y = 214. Find $\bar{X}$, $\bar{Y}$, $r$ and $s_y$.

13. In each of the following cases, find (i) r; (ii) the two regression equations, and predict the values as asked. Draw the scatter diagrams and plot the lines.

(a)
Year    Index of industrial    National income
        production X           Y (Rs million)
1941    100                    500
1942    105                    520
1943    107                    580
1944     98                    600
1945    109                    610
1946    110                    600
1947    114                    650

Predict Y if X = 120. Predict X if Y = 700.
(b)
X     Y
10     0
10    -1
11    -2
11     4
12    -2
15     0
16    -2
11     2

Predict Y if X = % and predict X if Y = 18.

14. Find the rank correlation for the following rankings of 6 students in two tests:

Test I     4   3   5   6   2   1
Test II    5   4   3   1   2   6

15. For the following results showing marks obtained by 15 students show that the rank correlation is 0.89.

Marks in Maths     50   50   40   39   38   37   36   35   34   33   32   31   30   29   28
Marks in Stats.    50   49   51   52   43   47   42   40   44   40   30   41   32   33   31

16. Show that 0.86 is the rank correlation as well as the simple correlation for the following data showing ranks given to 12 qualities of roses by two judges.

Quality     1    2    3    4    5    6    7    8    9   10   11   12
Judge I     4   10    6    5    9    3   12   11    7    8    2    1
Judge II    5    9    7    8   11    2   12   10    6    4    1    3

8.13 FURTHER READING

Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi: Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Techniques. New Delhi: Vikas Publishing House Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan India Ltd.
Gupta, S. C. 2006. Fundamentals of Statistics. New Delhi: Himalaya Publishing House.
Gupta, S. P. 2005. Statistical Methods. New Delhi: S. Chand and Sons.
UNIT 9 BUSINESS FORECASTING TECHNIQUES: TIME SERIES ANALYSIS

Structure
9.0 Introduction
9.1 Unit Objectives
9.2 Components of Time Series Analysis
9.3 Fitting of Trends
    9.3.1 Trend Analysis
    9.3.2 Smoothing Techniques
    9.3.3 Measuring Cyclical Effect
9.4 Applications of Business Problems
    9.4.1 Simple Averages
    9.4.2 Moving Averages
    9.4.3 Measuring Irregular Variation and Seasonal Adjustments
9.5 Solved Problems
9.6 Summary
9.7 Answers to 'Check Your Progress'
9.8 Questions and Exercises
9.9 Further Reading

9.0 INTRODUCTION

In this unit, you will learn how time series analysis differs from regression analysis. We often see a number of charts on company drawing boards or in newspapers, where we see lines going up and down from left to right on a graph. The vertical axis represents a variable such as productivity or crime data in the city, and the horizontal axis represents the different periods of increasing time such as days, weeks, months or years. The analysis of the movements of such variables over periods of time is referred to as time series analysis. A time series can then be defined as a set of numeric observations of the dependent variable, measured at specific points in time in chronological order, usually at equal intervals, in order to determine the relationship of time to such variables.

In this unit, you will also learn that one of the major elements of planning, and specifically strategic planning, of any organization is accurately forecasting the future events that would have an impact on the operations of the organization. Previous performances must be studied so as to forecast future activity. Even in our daily lives, we plan our future events on the basis of a reasonable estimate of the future environment that could affect our plans, whether it is forecasting rain on our picnic on Saturday, or forecasting economic conditions for ten years.
Textbook publishers must predict future sales of books to print enough copies for students; financial advisors must predict the values of a variety of economic factors in order to advise clients regarding stocks, bonds and other business opportunities. Similarly, hotel builders in a city must project the future influx of tourists, and so on. The quality of such forecasts is strongly related to the relevant information that can be extracted and used from past data. In that respect, time series can be used to determine patterns in the data of the past over a period of time and extrapolate the data into the future.
9.1 UNIT OBJECTIVES

After going through this unit, you will be able to:
• Classify the time series
• Analyse the components of time series
• Describe the influence of time series analysis
• Explain the different methods of measuring trend
• Describe the different methods of measuring seasonal variations
• Explain smoothing techniques
• Calculate simple averages and moving averages
• Analyse exponential smoothing
• Measure irregular variations and seasonal adjustments

9.2 COMPONENTS OF TIME SERIES ANALYSIS

The time series analysis method is quite accurate where the future is expected to be similar to the past. The underlying assumption in time series is that the same factors will continue to influence the future patterns of economic activity in a similar manner as in the past. These techniques are fairly sophisticated and require experts to use them.

The classical approach is to analyse a time series in terms of four distinct types of variations or separate components that influence a time series.

1. Secular Trend or Simply Trend (T)

Trend is a general long-term movement in the time series value of the variable (Y) over a fairly long period of time. The variable (Y) is the factor that we are interested in evaluating for the future. It could be sales, population, crime rate, and so on.

Trend is a common word, popularly used in day-to-day conversation, such as population trends, inflation trends and birth rate. These variables are observed over a long period of time and any changes related to time are noted and calculated and a trend of these changes is established. There are many types of trends; the series may be increasing at a slow rate or at a fast rate or these may be decreasing at various rates. Some remain relatively constant and some reverse their trend from growth to decline or from decline to growth over a period of time.
These changes occur because of the general tendency of the data to increase or decrease under some identifiable influences. If a trend can be determined and the rate of change can be ascertained, then tentative estimates of the same series values into the future can be made. However, such forecasts are based upon the assumption that the conditions affecting the steady growth or decline are reasonably expected to remain unchanged in the future. A change in these conditions would affect the forecasts. As an example, a time series involving increase in population over time can be shown as follows:
[Figure: population plotted against time, showing an upward trend]

2. Cyclical Fluctuations (C)

Cyclical fluctuations refer to regular swings or patterns that repeat over a long period of time. The movements are considered cyclical only if they occur after time intervals of more than one year. These are the changes that take place as a result of economic booms or depressions. These may be up or down, are recurrent in nature and have a duration of several years, usually lasting for two to ten years. These movements also differ in intensity or amplitude and each phase of movement changes gradually into the phase that follows it. Some economists believe that the business cycle completes four phases every twelve to fifteen years. These four phases are: prosperity, recession, depression and recovery. However, there is no agreement on the nature or causes of these cycles.

Even though measurement and prediction of cyclical variation is very important for strategic planning, the reliability of such measurements is highly questionable for the following reasons:

(i) These cycles do not occur at regular intervals. In the twenty-five years from 1956 to 1981 in America, it is estimated that the peaks in the cyclical activity of the overall economy occurred in August 1957, April 1960, December 1969, November 1973 and January 1980.¹ This shows that they differ widely in timing, intensity and pattern, thus making reliable evaluation of trends very difficult.

(ii) The cyclic variations are affected by many erratic, irregular and random forces which cannot be isolated and identified separately, nor can their impact be measured accurately.

The cyclic variation for revenues in an industry against time is shown graphically as follows:

[Figure: industry revenue plotted against time, showing cyclic variation]

¹ Mark L. Berenson and David M. Levine, Basic Business Statistics (New Jersey: Prentice-Hall, 1983), 618.

3. Seasonal Variation (S)

Seasonal variation involves patterns of change that repeat over a period of one year or less. They then repeat from year to year and are brought about by fixed events. For example, sales of consumer items increase prior to Christmas due to the gift-giving tradition. The sale of automobiles in America is much higher during the last three to four months of the year due to the introduction of new models. This data may be measured monthly or quarterly.

Since these variations repeat during a period of twelve months, they can be predicted fairly accurately. Some factors that cause seasonal variations are as follows:

(i) Season and climate. Changes in the climate and weather conditions have a profound effect on sales. For example, the sale of umbrellas in India is always higher during the monsoons. Similarly, during winter, there is a greater demand for woollen clothes and hot drinks, while during summer months there is an increase in the sales of fans and air conditioners.

(ii) Customs and festivals. Customs and traditions affect the pattern of seasonal spending. For example, Mother's Day or Valentine's Day in America see an increase in gift sales preceding these days. In India, festivals such as Baisakhi and Diwali mean a big demand for sweets and candy. It is customary all over the world to give presents to children when they graduate from high school or college. Accordingly, the month of June, when most students graduate, is a time of increased sales of presents befitting the young.

An accurate assessment of seasonal behaviour is an aid in business planning and scheduling in areas such as production, inventory control, personnel, advertising, and so on. The seasonal fluctuations over four repeating quarters in a given year for the sale of a given item are illustrated as:

[Figure: sales plotted against time (quarters 1 to 4), showing seasonal fluctuations]

4. Irregular or Random Variation (I)

These variations are accidental, random or simply due to chance factors. Thus, they are wholly unpredictable. These fluctuations may be caused by such isolated incidents as floods, famines, strikes or wars.
Sudden changes in demand or a breakthrough in technological development may be included in this category. Accordingly, it is almost impossible to isolate and measure the value and the impact of these erratic movements on forecasting models or techniques. This phenomenon may be graphically shown as follows:
[Figure: irregular variation plotted against time]

It is traditionally acknowledged that the value of the time series (Y) is a function of the impact of trend (T), seasonal variation (S), cyclical variation (C) and irregular fluctuation (I). These relationships may vary depending upon assumptions and purposes. The effects of these four components might be additive, multiplicative, or a combination thereof in a number of ways. However, the traditional time series analysis model is characterized by a multiplicative relationship, so that:

\[ Y = T \times S \times C \times I \]

This model is appropriate for those situations where percentage changes best represent the movement in the series and the components are not viewed as absolute values but as relative values.

Another approach to defining the relationship may be additive, so that:

\[ Y = T + S + C + I \]

This model is useful when the variations in the time series are in absolute values and can be separated and traced to each of these four parts and each part can be measured independently.

Check Your Progress
1. What do you understand by the term 'trend'?
2. Explain the meaning of cyclical fluctuation.
3. What factors cause seasonal variations?

9.3 FITTING OF TRENDS

9.3.1 Trend Analysis

While chance variations are difficult to identify, separate, control or predict, a more precise measurement of trend, cyclical effects and seasonal effects can be made in order to make the forecasts more reliable. In this section, we discuss techniques that allow us to describe trend.

When a time series shows an upward or downward long-term linear trend, regression analysis can be used to estimate this trend and project it forward to forecast the future values of the variables involved. The equation for the straight line used to describe the linear relationship between the independent variable X and the dependent variable Y is:

\[ Y = b_0 + b_1 X \]

where,
\(b_0\) = Intercept on the Y-axis and
\(b_1\) = Slope of the straight line

In time series analysis, the independent variable is time, so we will use the symbol \(t\) in place of X and the symbol \(Y_t\) in place of Y, which we have used previously. Hence, the equation for linear trend is given as:

\[ Y_t = b_0 + b_1 t \]
where,
\(Y_t\) = Forecast value of the time series in period \(t\)
\(b_0\) = Intercept of the trend line on the Y-axis
\(b_1\) = Slope of the trend line
\(t\) = Time period

As discussed earlier, we can calculate the values of \(b_0\) and \(b_1\) by the following formulae:

\[ b_1 = \frac{n\sum ty - (\sum t)(\sum y)}{n\sum t^2 - (\sum t)^2} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{t} \]

where,
\(y\) = Actual value of the time series in period \(t\)
\(n\) = Number of periods
\(\bar{y}\) = Average value of the time series = \(\sum y / n\)
\(\bar{t}\) = Average value of \(t\) = \(\sum t / n\)

Knowing these values, we can calculate the value of \(Y_t\).

Example 9.1: A car fleet owner has 5 cars that have been in the fleet for different numbers of years. The manager wants to establish if there is a linear relationship between the age of the car and the repairs in hundreds of dollars for a given year. This way, he can predict the repair expenses for each year as the cars become older. The information on repair costs that he collected for the last year on these cars is as follows:

| Car # | Age (t) | Repairs (Y) |
|-------|---------|-------------|
| 1     | 1       | 4           |
| 2     | 3       | 6           |
| 3     | 3       | 7           |
| 4     | 5       | 7           |
| 5     | 6       | 9           |

The manager wants to predict the repair expenses for the next year for the two cars that are 3 years old.

Solution: The trend in repair costs suggests a linear relationship with the age of the car, so that the linear regression equation is given as:

\[ Y_t = b_0 + b_1 t \]

where

\[ b_1 = \frac{n\sum ty - (\sum t)(\sum y)}{n\sum t^2 - (\sum t)^2} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{t} \]
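The two formulae above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `fit_linear_trend` and the variable names are illustrative assumptions, and the data are the five cars of Example 9.1.

```python
def fit_linear_trend(t, y):
    """Return (b0, b1) for the least-squares trend line Y_t = b0 + b1 * t."""
    n = len(t)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    # Slope: (n*sum(ty) - sum(t)*sum(y)) / (n*sum(t^2) - (sum(t))^2)
    b1 = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
    # Intercept: mean of y minus slope times mean of t
    b0 = sum_y / n - b1 * (sum_t / n)
    return b0, b1


# Data from Example 9.1: age in years, repairs in hundreds of dollars.
ages = [1, 3, 3, 5, 6]
repairs = [4, 6, 7, 7, 9]

b0, b1 = fit_linear_trend(ages, repairs)
forecast_age_4 = b0 + b1 * 4  # trend value for a car that will be 4 years old
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, forecast at t = 4: {forecast_age_4:.2f}")
```

Because the sketch carries full precision rather than rounding at each step, its intermediate figures can differ slightly from a hand calculation, although the values agree to two decimal places for this data.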
To calculate the various values, let us form a new table as follows:

| Age of Car (t) | Repair Cost (Y) | tY  | t²  |
|----------------|-----------------|-----|-----|
| 1              | 4               | 4   | 1   |
| 3              | 6               | 18  | 9   |
| 3              | 7               | 21  | 9   |
| 5              | 7               | 35  | 25  |
| 6              | 9               | 54  | 36  |
| Total 18       | 33              | 132 | 80  |

Knowing that \(n = 5\), let us substitute these values to calculate the regression coefficients \(b_0\) and \(b_1\). Then,

\[ b_1 = \frac{5(132) - (18)(33)}{5(80) - (18)^2} = \frac{660 - 594}{400 - 324} = \frac{66}{76} = 0.87 \]

and

\[ b_0 = \bar{y} - b_1 \bar{t} \]

where

\[ \bar{y} = \frac{\sum Y}{n} = \frac{33}{5} = 6.6 \quad \text{and} \quad \bar{t} = \frac{\sum t}{n} = \frac{18}{5} = 3.6 \]

Then,

\[ b_0 = 6.6 - 0.87(3.6) = 6.6 - 3.13 = 3.47 \]

Hence,

\[ Y_t = 3.47 + 0.87t \]

The cars that are 3 years old now will be 4 years old next year, so that \(t = 4\). Hence,

\[ Y_4 = 3.47 + 0.87(4) = 3.47 + 3.48 = 6.95 \]

Accordingly, the repair costs on each car that is 3 years old are expected to be 695.00 dollars.

9.3.2 Smoothing Techniques

Smoothing techniques improve the forecasts of future trends provided that the time series is fairly stable with no significant trend, cyclical or seasonal effect and the objective is to smooth out the irregular component of the time series through the
averaging process. There are two techniques that are generally employed for such smoothing.

1. Moving averages
2. Exponential smoothing

1. Moving averages. The concept of moving averages is based on the idea that any large irregular component of a time series at any point in time will have a less significant impact on the trend if the observation at that point in time is averaged with the values immediately before and after the observation under consideration. For example, if we are interested in computing the three-period moving average for any time period, then we will take the average of the value in that time period, the value in the period immediately preceding it and the value in the time period immediately following it. Let us illustrate this concept with the help of an example.

Example 9.2: Let the following table represent the number of cars sold in the first 6 weeks of the first two months of the year by a given dealer. Our objective is to calculate the three-week moving average.

| Week | Sales |
|------|-------|
| 1    | 20    |
| 2    | 24    |
| 3    | 22    |
| 4    | 26    |
| 5    | 21    |
| 6    | 22    |

Solution: The moving average for the first three-week period is given as:

\[ \text{Moving average} = \frac{20 + 24 + 22}{3} = \frac{66}{3} = 22 \]

This moving average can then be used to forecast the sale of cars for week 4. Since the actual number of cars sold in week 4 is 26, we note that the error in the forecast is (26 - 22) = 4.

The calculation of the moving average for the next three periods is done by adding the value for week 4 and dropping the value for week 1, and taking the average for weeks 2, 3 and 4. Hence,

\[ \text{Moving average} = \frac{24 + 22 + 26}{3} = \frac{72}{3} = 24 \]

Then, this is considered to be the forecast of sales for week 5. Since the actual value of the sales for week 5 is 21, we have an error in our forecast of (21 - 24) = -3.

The next moving average, for weeks 3 to 5, as a forecast for week 6 is given as:

\[ \text{Moving average} = \frac{22 + 26 + 21}{3} = \frac{69}{3} = 23 \]

The error between the actual and the forecast value for week 6 is (22 - 23) = -1.
(Since the actual value of the sales for week 7 is not given, there is no need to forecast such values).
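The three-week moving average forecasts of Example 9.2 can be reproduced with a short Python sketch. This is an illustration rather than part of the original text: the function name `moving_average_forecasts` is an assumption, and the code follows the worked example in using the average of the preceding three weeks as the forecast for the next week.

```python
def moving_average_forecasts(values, window):
    """Forecast each period from the mean of the preceding `window` values.

    Returns a list of (period, forecast, error) tuples, where period is
    1-based and error = actual value - forecast.
    """
    results = []
    for i in range(window, len(values)):
        forecast = sum(values[i - window:i]) / window
        results.append((i + 1, forecast, values[i] - forecast))
    return results


# Weekly car sales from Example 9.2.
sales = [20, 24, 22, 26, 21, 22]

results = moving_average_forecasts(sales, window=3)
for period, forecast, error in results:
    print(f"week {period}: forecast = {forecast:.0f}, error = {error:+.0f}")
```

Changing `window` gives forecasts for other averaging lengths, for example a four-week moving average.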
Our objective is to predict the trend and forecast the value of a given variable in the future as accurately as possible so that the forecast is reasonably free from random variations. To do that, we must make the sum of the individual errors, as discussed earlier, as small as possible. However, since errors are irregular and random, it is expected that some errors would be positive in value and others negative, so that the sum of these errors would be highly distorted and would be closer to zero. This difficulty can be avoided by squaring each of the individual forecast errors and then taking the average. Naturally, the minimum values of these errors would also result in the minimum value of the 'average of the sum of squared errors'. This is shown as follows:

| Week | Time Series Value | Moving Average Forecast | Error | Error Squared |
|------|-------------------|-------------------------|-------|---------------|
| 1    | 20                |                         |       |               |
| 2    | 24                |                         |       |               |
| 3    | 22                |                         |       |               |
| 4    | 26                | 22                      | 4     | 16            |
| 5    | 21                | 24                      | -3    | 9             |
| 6    | 22                | 23                      | -1    | 1             |

Then the average of the sum of squared errors, also known as mean squared error (MSE), is given as:

\[ \text{MSE} = \frac{16 + 9 + 1}{3} = \frac{26}{3} = 8.67 \]

The value of MSE is an often-used measure of the accuracy of the forecasting method, and the method which results in the least value of MSE is considered more accurate than others. The value of MSE can be changed by varying the number of data values included in the moving average. For example, if we had calculated the value of MSE by taking 4 periods into consideration for calculating the moving average, rather than 3, then the value of MSE would be less (about 2.78 for this data). Accordingly, by using the trial and error method, the number of data values selected for use in forecasting would be such that the resulting MSE value is a minimum.

2. Exponential smoothing. In the moving average method, each observation in the moving average calculation receives the same weight. In other words, each value contributes equally towards the calculation of the moving average, irrespective of the number of time periods taken into consideration.
In most actual situations, this is not a realistic assumption. Because of the dynamics of the environment over a period of time, it is more likely that the forecast for the next period would be closer to the most recent previous period than to the more distant previous period, so that the more recent value should get more weight than the previous value, and so on. The exponential smoothing technique uses the moving average with appropriate weights assigned to the values taken into consideration in order to arrive at a more accurate or smoothed forecast. It takes into consideration the decreasing impact of the past time periods as we move further into the past. This decreasing impact is exponentially distributed and hence the name exponential smoothing.

In this method, the smoothed value for period \(t\), which is the weighted average of that period's actual value and the smoothed average from the previous period
\((t - 1)\), becomes the forecast for the next period \((t + 1)\). Then the exponential smoothing model for time period \((t + 1)\) can be expressed as follows:

\[ F_{t+1} = \alpha Y_t + (1 - \alpha) F_t \]

where
\(F_{t+1}\) = The forecast of the time series for period \((t + 1)\)
\(Y_t\) = Actual value of the time series in period \(t\)
\(\alpha\) = Smoothing factor \((0 \le \alpha \le 1)\)
\(F_t\) = Forecast of the time series for period \(t\)

The value of \(\alpha\) is selected by the decision-maker on the basis of the degree of smoothing required. A small value of \(\alpha\) means a greater degree of smoothing. A large value of \(\alpha\) means very little smoothing. When \(\alpha = 1\), there is no smoothing at all, so that the forecast for the next time period is exactly the same as the actual value of the time series in the current period. This can be seen by:

\[ F_{t+1} = \alpha Y_t + (1 - \alpha) F_t \]

when \(\alpha = 1\),

\[ F_{t+1} = Y_t + 0 \cdot F_t = Y_t \]

The exponential smoothing approach is simple to use and once the value of \(\alpha\) is selected, it requires only two pieces of information, namely \(Y_t\) and \(F_t\), to calculate \(F_{t+1}\).

To begin the exponential smoothing process, we let \(F_1\) equal the actual value of the time series in period 1, which is \(Y_1\). Hence, the forecast for period 2 is written as:

\[ F_2 = \alpha Y_1 + (1 - \alpha) F_1 \]

But since we have put \(F_1 = Y_1\),

\[ F_2 = \alpha Y_1 + (1 - \alpha) Y_1 = Y_1 \]

Let us now apply the exponential smoothing method to the problem of forecasting car sales as discussed in the case of moving averages. The data once again is given as follows:

| Week | Time Series Value (Y_t) |
|------|-------------------------|
| 1    | 20                      |
| 2    | 24                      |
| 3    | 22                      |
| 4    | 26                      |
| 5    | 21                      |
| 6    | 22                      |

Let \(\alpha = 0.4\).

Since \(F_2\) is calculated earlier as equal to \(Y_1 = 20\), we can calculate the value of \(F_3\) as follows:

\[ F_3 = 0.4 Y_2 + (1 - 0.4) F_2 \]

Since \(F_2 = Y_1\), we get

\[ F_3 = 0.4(24) + 0.6(20) = 9.6 + 12 = 21.6 \]
Similar values can be calculated for subsequent periods, so that,

\[ F_4 = 0.4 Y_3 + 0.6 F_3 = 0.4(22) + 0.6(21.6) = 8.8 + 12.96 = 21.76 \]

\[ F_5 = 0.4 Y_4 + 0.6 F_4 = 0.4(26) + 0.6(21.76) = 10.4 + 13.056 = 23.456 \]

\[ F_6 = 0.4 Y_5 + 0.6 F_5 = 0.4(21) + 0.6(23.456) = 8.4 + 14.07 = 22.47 \]

and,

\[ F_7 = 0.4 Y_6 + 0.6 F_6 = 0.4(22) + 0.6(22.47) = 8.8 + 13.48 = 22.28 \]

Now we can compare the exponential smoothing forecast values with the actual values for the six time periods and calculate the forecast error.

| Week | Time Series Value (Y_t) | Exponential Smoothing Forecast Value (F_t) | Error (Y_t - F_t) |
|------|-------------------------|--------------------------------------------|-------------------|
| 1    | 20                      | -                                          | -                 |
| 2    | 24                      | 20.000                                     | 4.0               |
| 3    | 22                      | 21.600                                     | 0.4               |
| 4    | 26                      | 21.760                                     | 4.24              |
| 5    | 21                      | 23.456                                     | -2.456            |
| 6    | 22                      | 22.470                                     | -0.47             |

(The value of \(F_7\) is not considered because the value of \(Y_7\) is not given.)

Let us now calculate the value of MSE for this method with the selected value of \(\alpha = 0.4\). From the previous table:

| Forecast Error (Y_t - F_t) | Squared Forecast Error (Y_t - F_t)² |
|----------------------------|--------------------------------------|
| 4                          | 16                                   |
| 0.4                        | 0.16                                 |
| 4.24                       | 17.98                                |
| -2.456                     | 6.03                                 |
| -0.47                      | 0.22                                 |
|                            | Total = 40.39                        |

Then, MSE = 40.39/5 = 8.08

The previous value of MSE, for the three-week moving average, was 8.67. Hence, by this measure, the current approach is the better one.
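The recursion and the MSE calculation above can be sketched in Python as follows. This is a minimal illustration under the same starting assumption as the text, namely that the first forecast equals the first actual value; the function names `exponential_smoothing` and `mean_squared_error` are assumptions made for this sketch.

```python
def exponential_smoothing(values, alpha):
    """Return forecasts [F_1, ..., F_(n+1)], starting from F_1 = Y_1.

    Each step applies F_(t+1) = alpha * Y_t + (1 - alpha) * F_t.
    """
    forecasts = [values[0]]
    for y in values:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def mean_squared_error(actuals, forecasts):
    """Average of the squared differences between actuals and forecasts."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    return sum(e * e for e in errors) / len(errors)


# Weekly car sales from Example 9.2, smoothed with alpha = 0.4.
sales = [20, 24, 22, 26, 21, 22]
forecasts = exponential_smoothing(sales, alpha=0.4)

# Compare weeks 2 to 6 with F_2 to F_6 (week 1 has no forecast error).
mse = mean_squared_error(sales[1:], forecasts[1:6])
print(f"F_3 = {forecasts[2]:.3f}, F_7 = {forecasts[6]:.3f}, MSE = {mse:.2f}")
```

Running the sketch with several values of `alpha` and comparing the resulting MSE values is one simple way to carry out the trial and error selection of the smoothing factor.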
The choice of the value for \(\alpha\) is very significant. Let us look at the exponential smoothing model again.

\[ F_{t+1} = \alpha Y_t + (1 - \alpha) F_t = F_t + \alpha (Y_t - F_t) \]

where \((Y_t - F_t)\) is the forecast error during the time period \(t\).

The accuracy of the forecast can be improved by carefully selecting the value of \(\alpha\). If the time series contains substantial random variability, then a small value of \(\alpha\) (known as the smoothing factor or smoothing constant) is preferable. On the other hand, a larger value of \(\alpha\) would be desirable for a time series with relatively little random variability \((Y_t - F_t)\).

9.3.3 Measuring Cyclical Effect

Cyclic variation, as we have discussed earlier, is a pattern that repeats over time periods longer than one year. These variations are generally unpredictable in relation to the time of occurrence, duration as well as amplitude. However, these variations have to be separated and identified. The measure we use to identify cyclical variation is the percentage of trend and the procedure used is known as the residual trend.

As we have discussed earlier, there are four components of time series. These are secular trend (T), seasonal variation (S), cyclical variation (C) and irregular (or chance) variation (I). Since the time period considered for seasonal variation is less than one year, it can be excluded from the study, because when we look at a time series consisting of annual data spread over many years, only the secular trend, cyclical variation and irregular variation are considered.

Since the secular trend component can be described by the trend line (usually calculated by the line of regression), we can isolate the cyclical and irregular components from the trend. Furthermore, since irregular variation occurs by chance and cannot be predicted or identified accurately, it can be reasonably assumed that most of the variation in the time series left unexplained by the trend component can be explained by the cyclical component.
In that respect, cyclical variation can be considered as the residual, once other causes of variation have been identified.

The measure of cyclic variation as percentage of trend is calculated as follows:
(1) Determine the trend line (usually by regression analysis).
(2) Compute the trend value \(Y_t\) for each time period \((t)\) under consideration.
(3) Calculate the ratio \(Y / Y_t\) for each time period.
(4) Multiply this ratio by 100 to get the percentage of trend, so that:

\[ \text{Percentage of trend} = \left( \frac{Y}{Y_t} \right) 100 \]

Example 9.3: The following is the data for energy consumption (measured in quadrillions of BTU) in the United States from 1981 to 1986 as reported in the Statistical Abstracts of the United States.
| Year | Time Period (t) | Annual Energy Consumption (Y) |
|------|-----------------|-------------------------------|
| 1981 | 1               | 74.0                          |
| 1982 | 2               | 70.8                          |
| 1983 | 3               | 70.5                          |
| 1984 | 4               | 74.1                          |
| 1985 | 5               | 74.0                          |
| 1986 | 6               | 73.9                          |

Assuming a linear trend, calculate the percentage of trend for each year (cyclical variation).

Solution: First we find the secular trend by the regression line method, which is given by:

\[ Y_t = b_0 + b_1 t \]

where

\[ b_1 = \frac{n\sum tY - (\sum t)(\sum Y)}{n\sum t^2 - (\sum t)^2} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{t} \]

Let us make a table for these values.

| t      | Y          | tY          | t²      |
|--------|------------|-------------|---------|
| 1      | 74.0       | 74.0        | 1       |
| 2      | 70.8       | 141.6       | 4       |
| 3      | 70.5       | 211.5       | 9       |
| 4      | 74.1       | 296.4       | 16      |
| 5      | 74.0       | 370.0       | 25      |
| 6      | 73.9       | 443.4       | 36      |
| Σt = 21 | ΣY = 437.3 | ΣtY = 1536.9 | Σt² = 91 |

Substituting these values we get,

\[ b_1 = \frac{6(1536.9) - (21)(437.3)}{6(91) - (21)^2} = \frac{9221.4 - 9183.3}{546 - 441} = \frac{38.1}{105} = 0.363 \]

and

\[ b_0 = \bar{y} - b_1 \bar{t} \]
where

\[ \bar{y} = \frac{\sum Y}{n} = \frac{437.3}{6} = 72.88 \quad \text{and} \quad \bar{t} = \frac{21}{6} = 3.5 \]

Hence,

\[ b_0 = 72.88 - 0.363(3.5) = 72.88 - 1.27 = 71.61 \]

Then,

\[ Y_t = 71.61 + 0.363t \]

Calculating the value of \(Y_t\) for each time period, we get the following table for the percentage of trend \((Y / Y_t)100\).

| Time Period (t) | Energy Consumption (Y) | Trend (Y_t) | Percentage of Trend (Y/Y_t)100 |
|-----------------|------------------------|-------------|--------------------------------|
| 1               | 74.0                   | 71.97       | 102.82                         |
| 2               | 70.8                   | 72.34       | 97.87                          |
| 3               | 70.5                   | 72.70       | 96.97                          |
| 4               | 74.1                   | 73.06       | 101.42                         |
| 5               | 74.0                   | 73.43       | 100.77                         |
| 6               | 73.9                   | 73.79       | 100.15                         |

The following graph shows the actual energy consumption (Y), the trend line \((Y_t)\) and the cyclical fluctuations above and below the trend line over the time period \((t)\) for 6 years.

[Figure: actual energy consumption and the trend line plotted against t (years), showing cyclical fluctuations above and below the trend line]

Frequently, we draw a graph of cyclic variation as the percentage of trend. This process eliminates the trend line and isolates the cyclical component of the time series. It must be understood that cyclical fluctuations are not accurately predictable, and hence we cannot predict future cyclic variations based upon such past cyclic variations.
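The percentage of trend calculation for Example 9.3 can be sketched in Python as follows. This is an illustration only; the function name `percentage_of_trend` is an assumption, and the trend line is fitted with the same least-squares formulae used in the text. Because the sketch does not round the coefficients before computing the trend values, its results may differ from the table in the second decimal place.

```python
def percentage_of_trend(t, y):
    """Fit Y_t = b0 + b1 * t by least squares and return 100 * Y / Y_t per period."""
    n = len(t)
    sum_t, sum_y = sum(t), sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    b1 = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
    b0 = sum_y / n - b1 * (sum_t / n)
    trend = [b0 + b1 * ti for ti in t]
    return [100 * yi / tr for yi, tr in zip(y, trend)]


# Energy consumption data from Example 9.3 (1981 to 1986).
periods = [1, 2, 3, 4, 5, 6]
consumption = [74.0, 70.8, 70.5, 74.1, 74.0, 73.9]

pct = percentage_of_trend(periods, consumption)
for year, p in zip(range(1981, 1987), pct):
    print(f"{year}: {p:.2f} per cent of trend")
```

Values above 100 indicate years in which actual consumption lay above the trend line, and values below 100 indicate years in which it lay below.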
[Figure: Graph of percentage of trend for 1981 to 1986, plotted around the trend line at 100 per cent]

The percentage of trend figures show that in 1981, the actual consumption of energy was 102.82 per cent of expected consumption that year and in 1983, the actual consumption was 96.97 per cent of the expected consumption.

9.4 APPLICATIONS OF BUSINESS PROBLEMS

Seasonal variation has been defined as predictable and repetitive movement around the trend line in a period of one year or less. For the measurement of seasonal variation, the time interval involved may be in terms of days, weeks, months or quarters.

Because of the predictability of seasonal trends, we can plan in advance to meet these variations. For example, studying the seasonal variations in the production data makes it possible to plan for hiring of additional personnel for peak periods of production, or to accumulate an inventory of raw materials, or to allocate vacation time to personnel, and so on.

In order to isolate and identify seasonal variations, we first eliminate, as far as possible, the effects of trend, cyclical variations and irregular fluctuations on the time series. Some of the methods used for the measurement of seasonal variations are described as follows.

9.4.1 Simple Averages

This is the simplest method of isolating seasonal fluctuations in time series. It is based on the assumption that the series contains only the seasonal and irregular fluctuations. Assume that the time series involves monthly data over a time period of, say, 5 years. Assume further that we want to find the seasonal index for the month of March. (The seasonal variation will be the same for March in every year. Seasonal index describes the degree of seasonal variation).
Then, the seasonal index for the month of March will be calculated as follows:

\[ \text{Seasonal index for March} = \left( \frac{\text{Monthly average for March}}{\text{Average of monthly averages}} \right) \times 100 \]

The following steps can be used in the calculation of the seasonal index (variation) for the month of March (or any month), over the five-year period, regarding the sale of cars by one distributor.
1. Calculate the average sale of cars for the month of March over the last 5 years.
2. Calculate the average sale of cars for each month over the 5 years and then calculate the average of these monthly averages.
3. Use the formula to calculate the seasonal index for March.

Let us say that the average sale of cars for the month of March over the period of 5 years is 360, and the average of all monthly averages is 316. Then the seasonal index for March = (360/316) × 100 = 113.92.

9.4.2 Moving Averages

This is the most widely used method of measuring seasonal variations. The seasonal index is based upon a mean of 100 with the degree of seasonal variation (seasonal index) measured by variations away from this base value. For example, if we look at the seasonality of rental of row boats at the lake during the three summer months (a quarter) and we find that the seasonal index is 135, and we also know that the total boat rentals for the entire last year were 1680, then we can estimate the number of summer rentals for the row boats.

The average number of boats rented per quarter = 1680/4 = 420.

The seasonal index of 135 for the summer quarter means that the summer rentals are 135 per cent of the average quarterly rentals.

Hence, summer rentals = 420 × (135/100) = 567.

The steps required to compute the seasonal index can be enumerated by illustrating an example.

Example 9.4: Assume that a record of rental of row boats for the previous 3 years on a quarterly basis is given as follows:

| Year | Quarter I | Quarter II | Quarter III | Quarter IV | Total |
|------|-----------|------------|-------------|------------|-------|
| 1991 | 350       | 300        | 450         | 400        | 1500  |
| 1992 | 330       | 360        | 500         | 410        | 1600  |
| 1993 | 370       | 350        | 520         | 440        | 1680  |

Solution:

Step 1. The first step is to calculate the four-quarter moving total for the time series. This total is associated with the middle data point in the set of values for the four quarters, shown as follows.

| Year | Quarter | Rentals | Moving Total |
|------|---------|---------|--------------|
| 1991 | I       | 350     |              |
|      | II      | 300     |              |
|      |         |         | 1500         |
|      | III     | 450     |              |
|      | IV      | 400     |              |
The moving total for the given values of four quarters is 1500, which is simply the sum of the four quarterly values. This value of 1500 is placed in the middle of the values 300 and 450 and recorded in the next column. For the next moving total of the four quarters, we will drop the value of the first quarter, which is 350, from the total and add the value of the fifth quarter (in other words, the first quarter of the next year), and this total will be placed in the middle of the next two values, which are 450 and 400, and so on. These values of the moving totals are shown in column 4 of the following table.

Step 2. The next step is to calculate the four-quarter moving average. This can be done by dividing the four-quarter moving total, as calculated in Step 1 earlier, by 4, since there are 4 quarters. The four-quarter moving average is recorded in column 5 in the table. The entire table of calculations is shown as follows (moving totals and moving averages are listed on separate rows because they fall between quarters):

| Year (1) | Quarter (2) | Rentals (3) | Four-Quarter Moving Total (4) | Four-Quarter Moving Average (5) | Centered Moving Average (6) | Percentage of Actual to Centered Moving Average (7) |
|----------|-------------|-------------|-------------------------------|---------------------------------|-----------------------------|------------------------------------------------------|
| 1991     | I           | 350         |                               |                                 |                             |                                                      |
|          | II          | 300         |                               |                                 |                             |                                                      |
|          |             |             | 1500                          | 375.0                           |                             |                                                      |
|          | III         | 450         |                               |                                 | 372.50                      | 120.80                                               |
|          |             |             | 1480                          | 370.0                           |                             |                                                      |
|          | IV          | 400         |                               |                                 | 377.50                      | 105.96                                               |
|          |             |             | 1540                          | 385.0                           |                             |                                                      |
| 1992     | I           | 330         |                               |                                 | 391.25                      | 84.35                                                |
|          |             |             | 1590                          | 397.5                           |                             |                                                      |
|          | II          | 360         |                               |                                 | 398.75                      | 90.28                                                |
|          |             |             | 1600                          | 400.0                           |                             |                                                      |
|          | III         | 500         |                               |                                 | 405.00                      | 123.45                                               |
|          |             |             | 1640                          | 410.0                           |                             |                                                      |
|          | IV          | 410         |                               |                                 | 408.75                      | 100.30                                               |
|          |             |             | 1630                          | 407.5                           |                             |                                                      |
| 1993     | I           | 370         |                               |                                 | 410.00                      | 90.24                                                |
|          |             |             | 1650                          | 412.5                           |                             |                                                      |
|          | II          | 350         |                               |                                 | 416.25                      | 84.08                                                |
|          |             |             | 1680                          | 420.0                           |                             |                                                      |
|          | III         | 520         |                               |                                 |                             |                                                      |
|          | IV          | 440         |                               |                                 |                             |                                                      |

Step 3. After the moving averages for each of the consecutive four quarters have been taken, we centre these moving averages. As we see from the table, the quarterly moving average falls between the quarters. This is because the number of quarters, which is 4, is even. If we had an odd number of time periods, such as 7 days of the week, then the moving average would already be centered and the third step
here would not be necessary. Accordingly, we centre our averages in order to associate each average with the corresponding quarter, rather than between the quarters. This is shown in column 6, where the centered moving average is calculated as the average of the two consecutive moving averages.

The moving average (or the centered moving average) aims to eliminate seasonal and irregular fluctuations (S and I) from the original time series, so that this average represents the cyclical and trend components of the series. As the following graph shows, the centered moving average has smoothed the peaks and troughs of the original time series.

[Figure: original time series of quarterly rentals, 1991 to 1993, plotted together with the centered moving average]

Step 4. Column 7 in the table contains calculated entries which are percentages of the actual values to the corresponding centered moving average values. For example, the first four-quarter centered moving average of 372.50 in the table has the corresponding actual value of 450, so that the percentage of actual value to centered moving average would be:

\[ \frac{\text{Actual value}}{\text{Centered moving average value}} \times 100 = \frac{450}{372.5} \times 100 = 120.80 \]

Step 5. The purpose of this step is to eliminate the remaining cyclical and irregular fluctuations still present in the values in column 7 of the table. This can be done by calculating the 'modified mean' for each quarter. The modified mean for each quarter of the three-year time period under consideration is calculated as follows.

(a) Make a table of values in column 7 of the previous table (percentage of actual to moving average values) for each quarter of the three years as shown in the following table.
| Year | Quarter I | Quarter II | Quarter III | Quarter IV |
|------|-----------|------------|-------------|------------|
| 1991 | -         | -          | 120.80      | 105.96     |
| 1992 | 84.35     | 90.28      | 123.45      | 100.30     |
| 1993 | 90.24     | 84.08      | -           | -          |

(b) We take the average of these values for each quarter. It should be noted that if many more years are taken into consideration instead of the three years we have used, then the highest and lowest values from each quarterly data would be discarded and the average of the remaining data would be considered. By discarding the highest and lowest values from each quarter's data, we tend to reduce the extreme cyclical and irregular fluctuations, which are further smoothed when we average the remaining values. Thus, the modified mean can be considered as an index of the seasonal component. This modified mean for each quarter's data is shown as follows:

Quarter I = (84.35 + 90.24)/2 = 87.295
Quarter II = (90.28 + 84.08)/2 = 87.180
Quarter III = (120.80 + 123.45)/2 = 122.125
Quarter IV = (105.96 + 100.30)/2 = 103.13
Total = 399.73

The modified means as calculated here are preliminary seasonal indices. These should average 100 per cent, or a total of 400 for the 4 quarters. However, our total is 399.73. This can be corrected by the following step.

Step 6. First, we calculate an adjustment factor. This is done by dividing the desired or expected total of 400 by the actual total obtained of 399.73, so that:

\[ \text{Adjustment} = \frac{400}{399.73} = 1.0007 \]

By multiplying the modified mean for each quarter by the adjustment factor, we get the seasonal index for each quarter, so that:

Quarter I = 87.295 × 1.0007 = 87.356
Quarter II = 87.180 × 1.0007 = 87.241
Quarter III = 122.125 × 1.0007 = 122.201
Quarter IV = 103.13 × 1.0007 = 103.202
Total = 400.000

\[ \text{Average seasonal index} = \frac{400}{4} = 100 \]
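Steps 1 to 6 can be collected into a short Python sketch. This is an illustration rather than the text's own procedure in every detail: the function name `seasonal_indices` is an assumption, the series is assumed to start in the first season of a year and to have an even number of seasons per year, and a plain mean is used in Step 5 because only two ratios are available per quarter (with more years, the highest and lowest ratios would be discarded first). The sketch keeps full precision, so its indices can differ from the hand calculation in the second decimal place.

```python
def seasonal_indices(values, periods_per_year=4):
    """Ratio-to-moving-average seasonal indices (one per season, averaging 100)."""
    p = periods_per_year
    # Steps 1 and 2: moving average over each run of p consecutive values.
    moving_avg = [sum(values[i:i + p]) / p for i in range(len(values) - p + 1)]
    # Step 3: centre by averaging each pair of consecutive moving averages.
    centered = [(a + b) / 2 for a, b in zip(moving_avg, moving_avg[1:])]
    # Step 4: percentage of actual to centered moving average.
    # centered[i] lines up with values[i + p // 2].
    ratios = {season: [] for season in range(p)}
    for i, cma in enumerate(centered):
        idx = i + p // 2
        ratios[idx % p].append(100 * values[idx] / cma)
    # Step 5: mean ratio per season (no trimming with so few years).
    means = [sum(ratios[s]) / len(ratios[s]) for s in range(p)]
    # Step 6: rescale so that the indices total 100 * p.
    factor = 100 * p / sum(means)
    return [m * factor for m in means]


# Quarterly row boat rentals from Example 9.4 (1991 to 1993).
rentals = [350, 300, 450, 400, 330, 360, 500, 410, 370, 350, 520, 440]

indices = seasonal_indices(rentals)
for name, value in zip(["I", "II", "III", "IV"], indices):
    print(f"Quarter {name}: {value:.2f}")
```

An index above 100, as for Quarter III here, marks a quarter in which rentals are seasonally high relative to the average quarter.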
(This average seasonal index is approximated to 100 because of rounding-off errors).

The logic behind this method is that the centered moving average represents the combined influence of secular trend and cyclical fluctuations (T × C), so dividing the original series by it removes that influence. This may be represented by the following expression:

\[ \frac{T \times S \times C \times I}{T \times C} = S \times I \]

where (T × S × C × I) is the influence of trend, seasonal variations, cyclic fluctuations and irregular or chance variations.

Thus, the ratio to moving average represents the influence of the seasonal and irregular components. However, if these ratios for each quarter over a period of years are averaged, then most of the random or irregular fluctuations would be eliminated, so that

\[ \frac{S \times I}{I} = S \]

and this would give us the value of the seasonal influences.

9.4.3 Measuring Irregular Variation and Seasonal Adjustments

Typically, irregular variation is random in nature, unpredictable and occurs over comparatively short periods of time. Because of its unpredictability, it is generally not measured or explained mathematically. Usually, subjective and logical reasoning explains such variation when it occurs; for example, a long period of cold weather in Brazil and Colombia results in an increase in the price of coffee beans, because cold weather destroys coffee plants. Similarly, the Persian Gulf War, an irregular factor, resulted in an increase in airline and ship travel for a number of months because of the movement of personnel and supplies. However, the irregular component can be isolated by eliminating the other components from the time series data. For example, the time series data contains (T × S × C × I) components and if we can eliminate the (T × S × C) elements from the data, then we are left with the (I) component. We can follow the previous example to determine the (I) component as follows. The data presented has already been provided or calculated earlier.
Year   Quarter   Rentals, Time Series    Centered Moving      (T × S × C × I) / (T × C)
                 Values (T × S × C × I)  Average (T × C)      = S × I
1991   I               350                    -                    -
       II              300                    -                    -
       III             450                  372.50               1.208
       IV              400                  377.50               1.060
1992   I               330                  391.25               0.843
       II              360                  398.75               0.903
       III             500                  405.00               1.235
       IV              410                  408.75               1.003
(Contd.)
Year   Quarter   Rentals   Centered Moving Average (T × C)   S × I
1993   I           370               410.00                  0.902
       II          350               416.25                  0.841
       III         520                  -                      -
       IV          440                  -                      -

The seasonal indices for each quarter have already been calculated as:

Quarter I   = 87.356
Quarter II  = 87.241
Quarter III = 122.201
Quarter IV  = 103.202

Then the seasonal influence is given by:

Quarter I   = 87.356 / 100  = 0.874
Quarter II  = 87.241 / 100  = 0.872
Quarter III = 122.201 / 100 = 1.222
Quarter IV  = 103.202 / 100 = 1.032

Making another table of the (S × I) values and the (S) values, and dividing (S × I) by (S), we get the values of (I) as follows:

Year   Quarter   (S × I)    (S)      (I)
1991   I            -        -        -
       II           -        -        -
       III        1.208    1.222    0.988
       IV         1.060    1.032    1.027
1992   I          0.843    0.874    0.965
       II         0.903    0.872    1.036
       III        1.235    1.222    1.011
       IV         1.003    1.032    0.972
1993   I          0.902    0.874    1.032
       II         0.841    0.872    0.964
       III          -        -        -
       IV           -        -        -

Check Your Progress
4. Explain the term 'Irregular Random Variation'.
5. What are the two methods adopted in smoothing techniques?
6. Enumerate the methods used to measure seasonal variation.

Seasonal adjustments

We often read of time series values being seasonally adjusted. This is accomplished by dividing the original time series values by their corresponding seasonal indices. These deseasonalized values allow more direct and equitable comparisons of values from different time periods. For example, in comparing the demands for rental row boats (the example that we have been following), it would not be equitable to compare the demand of the second quarter (spring) with the demand of the third quarter (summer), when the demand is traditionally higher. However, these demand values can be compared once we remove the seasonal influence from the time series values. The seasonally adjusted values for the demand of row boats in each quarter are based on the values previously calculated and are shown as follows.
Year   Quarter   Rentals              (S)      Seasonally        Rounded-off
                 (T × S × C × I)               Adjusted Values   Values
1991   I              350              -            -                -
       II             300              -            -                -
       III            450            1.222       368.25             368
       IV             400            1.032       387.60             388
1992   I              330            0.874       377.57             378
       II             360            0.872       412.84             413
       III            500            1.222       409.16             409
       IV             410            1.032       397.29             397
1993   I              370            0.874       423.34             423
       II             350            0.872       401.38             401
       III            520              -            -                -
       IV             440              -            -                -

The seasonally adjusted value for each quarter is calculated as:

Seasonally adjusted value = Original Value / Seasonal Index

These calculations complete the process of separating and identifying the four components of the time series, namely secular trend (T), seasonal variation (S), cyclical variation (C) and irregular variation (I).

9.5 SOLVED PROBLEMS

Problem 1. The following table shows the number of public sector industries' failures in India during the period 1987 to 1993. Using a four-year moving average method, calculate the mean square error (MSE) for this data.

Year   Number of Failures
1987          32
1988          26
1989          30
1990          28
1991          24
1992          22
1993          26

Solution: The four-year moving averages are calculated as follows:

(1) 1987 to 1990: Moving average = (32 + 26 + 30 + 28) / 4 = 29
(2) 1988 to 1991: Moving average = (26 + 30 + 28 + 24) / 4 = 27
(3) 1989 to 1992: Moving average = (30 + 28 + 24 + 22) / 4 = 26
(4) 1990 to 1993: Moving average = (28 + 24 + 22 + 26) / 4 = 25

To calculate the value of MSE, the following table is constructed. Each moving average is entered against the last year of its four-year period and compared with the time series value for that year.

Year   Time Series Value (Y(t))   Moving Average   Error   Error Squared
1987            32                      -             -          -
1988            26                      -             -          -
1989            30                      -             -          -
1990            28                     29            -1          1
1991            24                     27            -3          9
1992            22                     26            -4         16
1993            26                     25             1          1

Then, MSE = (1 + 9 + 16 + 1) / 4 = 27 / 4 = 6.75

Problem 2. For the time series data given in Problem 1, calculate the value of MSE by using the exponential smoothing method. The value of the smoothing factor α (alpha) is given as 0.6. The time series values for the various years are repeated as follows:

Year   (Y(t))
1987     32
1988     26
1989     30
1990     28
1991     24
1992     22
1993     26

Solution: The exponential smoothing model for time period (t + 1) is expressed as:

F(t+1) = αY(t) + (1 - α)F(t)

where,
F(t+1) = the forecast value of the time series for period (t + 1)
Y(t) = the actual value of the time series in period (t)
α = the smoothing factor (0 ≤ α ≤ 1)
F(t) = the forecast value of the time series for period (t)

To start with, we let F1 equal the actual value of the time series in period (t = 1), which is Y1, so that:

F1 = Y1

and,

F2 = αY1 + (1 - α)F1 = αY1 + (1 - α)Y1 = Y1

and so on. Now, let us calculate the values of F(i), where i = 1, 2, ..., 8. Then,
F1 = Y1 = 32
F2 = 0.6(32) + 0.4(32) = 32
F3 = 0.6(26) + 0.4(32) = 15.6 + 12.8 = 28.4
F4 = 0.6(30) + 0.4(28.4) = 18.0 + 11.36 = 29.36
F5 = 0.6(28) + 0.4(29.36) = 16.8 + 11.74 = 28.54
F6 = 0.6(24) + 0.4(28.54) = 14.4 + 11.42 = 25.82
F7 = 0.6(22) + 0.4(25.82) = 13.2 + 10.33 = 23.53
F8 = 0.6(26) + 0.4(23.53) = 15.6 + 9.41 = 25.01

(The value of F8 is not taken into consideration because the value of Y8 is not given.) All these values are tabulated as follows:

Year   Time Series      Exponential Smoothing     Error           Error Squared
       Value (Y(t))     Forecast Value (F(t))     (Y(t) - F(t))   (Y(t) - F(t))²
1987       32                   -                     -                -
1988       26                 32.00                 -6.00            36.00
1989       30                 28.40                  1.60             2.56
1990       28                 29.36                 -1.36             1.85
1991       24                 28.54                 -4.54            20.61
1992       22                 25.82                 -3.82            14.59
1993       26                 23.53                  2.47             6.10
                                                    Total            81.71

Then, MSE = 81.71 / 6 = 13.618

Problem 3. The Dean of the School of Business at Atlantic University, which operates on a trimester system, has compiled the following quarterly new enrolment of MBA students for the previous 3 years from 1992 to 1994 and the results are shown as follows:

Year   Fall   Winter   Spring   Summer
1992   200     180      185       95
1993   220     188      173       83
1994   220     176      161       87

By using the ratio to moving average method, calculate the seasonal index for each trimester.

Solution: In order to calculate the seasonal indices for the fall, winter, spring and summer academic sessions, we need to find the quarter moving averages, the quarter centered moving averages and the percentages of actual to centered moving averages as explained previously. We construct the following table:
Year   Quarter   Values   Quarter    Quarter    Quarter     Percentage of
                          Moving     Moving     Centered    Actual to Centered
                          Total      Average    Moving      Moving Average
                                                Average
(1)    (2)       (3)      (4)        (5)        (6)         (7)
1992   I         200
       II        180
                          660        165
       III       185                            167.5       110.45
                          680        170
       IV         95                            171.0        55.55
                          688        172
1993   I         220                            170.5       129.03
                          676        169
       II        188                            167.5       112.24
                          664        166
       III       173                            166.0       104.22
                          664        166
       IV         83                            164.5        50.46
                          652        163
1994   I         220                            161.5       136.22
                          640        160
       II        176                            160.5       109.66
                          644        161
       III       161
       IV         87

Now, we calculate the modified mean for each quarter. This can be done by the following steps. The first step is to make a table of the values already calculated and placed in column (7) of the previous table. These are the percentages of actual to centered moving average values for the various quarters of the three years. They are shown in the following table:

Year    Fall      Winter    Spring    Summer
1992      -         -       110.45    55.55
1993    129.03    112.24    104.22    50.46
1994    136.22    109.66      -         -

The second step is to take the average of these values for each quarter. The modified mean for each quarter is as follows:

Fall   = (129.03 + 136.22) / 2 = 265.25 / 2 = 132.625
Winter = (112.24 + 109.66) / 2 = 221.90 / 2 = 110.950
Spring = (110.45 + 104.22) / 2 = 214.67 / 2 = 107.335
Summer = (55.55 + 50.46) / 2 = 106.01 / 2 = 53.005
Total = 403.915

These modified means are preliminary seasonal indices. They should average 100, that is, total 400 for the 4 quarters. However, our total is 403.915. Accordingly, we calculate the adjustment factor as follows:

Adjustment Factor = 400 / 403.915 = 0.9903

We get the seasonal index for each quarter by multiplying the modified mean for each quarter by the adjustment factor. The seasonal index for each quarter is then as follows:

Fall:    132.625 × 0.9903 = 131.34
Winter:  110.950 × 0.9903 = 109.87
Spring:  107.335 × 0.9903 = 106.29
Summer:   53.005 × 0.9903 =  52.50
Total                     = 400.00

Problem 4. In Problem 3, which gives us the data about new admissions into the MBA programme of a university for each trimester, separate the seasonal and irregular influences on the time series and calculate the irregular (I) component as well as the seasonally adjusted values for each quarter.

Solution: We have already calculated the various values that are needed. We know that:

Time Series Values = T × S × C × I
Centered Moving Average = T × C

Hence,

S × I = (T × S × C × I) / (T × C)

Let us restate the needed values in the following table.

Year   Quarter   T × S × C × I   T × C    S × I
1992   I             200           -        -
       II            180           -        -
       III           185         167.5    1.105
       IV             95         171.0    0.556
1993   I             220         170.5    1.290
       II            188         167.5    1.122
       III           173         166.0    1.042
       IV             83         164.5    0.505
1994   I             220         161.5    1.362
       II            176         160.5    1.097
       III           161           -        -
       IV             87           -        -
The seasonal indices for each quarter have already been calculated as:

Fall:   131.34
Winter: 109.87
Spring: 106.29
Summer:  52.50

Then the seasonal influence (S) is given by:

Fall:   131.34 / 100 = 1.3134
Winter: 109.87 / 100 = 1.0987
Spring: 106.29 / 100 = 1.0629
Summer:  52.50 / 100 = 0.5250

Now, we make another table with the (S × I) values calculated in the previous table and the (S) values for each of the fall, winter, spring and summer quarters. Dividing the (S × I) values by the (S) values gives the values of (I). These are shown in the following table.

Year   Quarter   S × I     (S)       (I)
1992   I           -        -         -
       II          -        -         -
       III       1.105    1.0629    1.040
       IV        0.556    0.5250    1.059
1993   I         1.290    1.3134    0.982
       II        1.122    1.0987    1.021
       III       1.042    1.0629    0.980
       IV        0.505    0.5250    0.962
1994   I         1.362    1.3134    1.037
       II        1.097    1.0987    0.998
       III         -        -         -
       IV          -        -         -

Now we can find the seasonally adjusted values by dividing the original time series values by their corresponding seasonal indices. This is shown as follows:

Year   Quarter   Time Series Values   (S)       Seasonally
                 (T × S × C × I)                Adjusted Values
1992   I               200             -             -
       II              180             -             -
       III             185           1.0629       174.05
       IV               95           0.5250       180.95
1993   I               220           1.3134       167.50
       II              188           1.0987       171.11
       III             173           1.0629       162.76
       IV               83           0.5250       158.09
1994   I               220           1.3134       167.50
       II              176           1.0987       160.19
       III             161             -             -
       IV               87             -             -
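The forecast-error calculations in Problems 1 and 2 of this section can be cross-checked with a short script. The following is a minimal Python sketch and not part of the original text; the function names (moving_average_mse, exponential_forecasts, exponential_smoothing_mse) are hypothetical. It assumes, as the worked figures in Problem 1 imply, that each moving average is compared with the last value in its own window, and it keeps full precision rather than rounding each forecast to two decimals.

```python
# Number of public sector industry failures, 1987-1993 (Problems 1 and 2).
FAILURES = [32, 26, 30, 28, 24, 22, 26]


def moving_average_mse(values, window):
    """MSE of moving averages, each compared with the last value in its window.

    This pairing follows the worked figures in Problem 1; comparing each
    average with the following period would be a different, one-step-ahead check.
    """
    squared_errors = []
    for end in range(window - 1, len(values)):
        average = sum(values[end - window + 1:end + 1]) / window
        squared_errors.append((values[end] - average) ** 2)
    return sum(squared_errors) / len(squared_errors)


def exponential_forecasts(values, alpha):
    """Forecasts F1..F(n+1), with F1 = Y1 and F(t+1) = alpha*Y(t) + (1 - alpha)*F(t)."""
    forecasts = [values[0]]
    for actual in values:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def exponential_smoothing_mse(values, alpha):
    """MSE over periods 2..n; period 1 is skipped because F1 = Y1 by construction."""
    forecasts = exponential_forecasts(values, alpha)
    # zip stops at the shorter list, so the extra forecast F(n+1) is ignored.
    squared_errors = [(actual - forecast) ** 2
                      for actual, forecast in zip(values[1:], forecasts[1:])]
    return sum(squared_errors) / len(squared_errors)


print(f"Four-year moving average MSE: {moving_average_mse(FAILURES, 4):.2f}")
print(f"Exponential smoothing MSE (alpha = 0.6): "
      f"{exponential_smoothing_mse(FAILURES, 0.6):.2f}")
```

For the failures data this gives 6.75 for the four-year moving average and about 13.62 for exponential smoothing with α = 0.6. The hand-calculated 13.618 differs slightly because the forecasts there are rounded to two decimals at each step.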
9.6 SUMMARY

In this unit, you have learned that accurate forecasting is an essential element of planning for any organization or policy. This requires studying previous performances in order to forecast future activities. When a projection of the pattern of future economic activity is known and the level of future business activity is understood, the desirability of an alternative course of action and the selection of an optimum alternative can be examined and forecast. The quality of such forecasts is strongly related to the relevant information that can be extracted from past data. Hence, the time series analysis method helps in making accurate predictions in situations where the future is expected to be similar to, or at least predictable from, the past.

9.7 ANSWERS TO 'CHECK YOUR PROGRESS'

1. The term trend means the general long-term movement in the time series value (Y) over a fairly long period of time. Here, 'Y' stands for factors such as sales, population and crime rate that we are interested in evaluating for the future.
2. Regular swings or patterns that repeat over a long period of time are known as cyclical fluctuations. These are usually unpredictable in relation to the time of occurrence, the duration as well as the amplitude.
3. Factors like changes in climate and weather, and customs and traditions cause seasonal variations.
4. Those variations which are accidental, random or occur due to chance factors are known as irregular random variations.
5. The two methods adopted in smoothing techniques are:
   a. Moving Averages
   b. Exponential Smoothing
6. The methods used in measuring seasonal variation are:
   a. Simple Average Method
   b. Moving Average Method

9.8 QUESTIONS AND EXERCISES

Short-Answer Questions

1. Differentiate between secular trend and cyclic fluctuation.
2. How is irregular variation caused?
3. Define seasonal variation.
4. What do you mean by trend analysis?
5. How will you measure cyclical effect?
6. Describe the simple average method of isolating seasonal fluctuations in time series.
7. What are the ways to measure irregular variation?
8. How are seasonal adjustments made?
Long-Answer Questions

1. The following data shows the number of Lincoln Continental cars sold by a dealer in Queens during the 12 months of 1994.

Month   Number Sold
Jan         52
Feb         48
Mar         57
Apr         60
May         55
June        62
July        54
Aug         65
Sept        70
Oct         80
Nov         90
Dec         75

(a) Calculate the three-month moving average for this data.
(b) Calculate the five-month moving average for this data.
(c) Which one of these two moving averages is a better smoothing technique and why?

2. The owner of six gasoline stations in New Jersey would like to have some reasonable indication of future sales. He would like to use the moving average method to forecast future sales. He has recorded the quarterly gasoline sales (in thousands of gallons) for all his gas stations for the past three years. These are shown in the following table.

Year   Quarter   Sales
1      1          38
       2          58
       3          80
       4          30
2      1          40
       2          60
       3          50
       4          55
3      1          50
       2          45
       3          80
       4          [illegible]

(a) Calculate the three-quarter moving average.
(b) Calculate the five-quarter moving average.
(c) Plot the quarterly sales and also both the moving averages on the same graph. Which of these two moving averages seems to be a better smoothing technique?

3. The following data represents the sales (in millions of dollars) in a large department store for the twelve months of the last three years.
Month       1993    1994    1995
January      7.2     8.2     6.4
February     8.5     9.1     7.3
March        9.6    10.5     8.5
April       10.2    11.4    10.2
May         11.7    12.0    10.8
June        13.0    14.5    15.0
July        14.2    14.0    13.0
August      15.7    16.2    15.0
September   11.4    12.0    11.8
October      9.3     9.0     8.6
November    12.8     9.8    11.0
December     9.3     7.3     6.3

(a) Plot the data and check for any trends in the data.
(b) Compute a twelve-month moving average. Which components of the time series do these values reflect?

4. The General Manager of NCO (National Computer Outlet), which is a chain store in the North East, wants to determine the sales trend for one of the stores located in Flushing, Queens. He took the sales data for the last 20 years in millions of dollars. The yearly data is given as follows:

Year   Sales ($ million)
1975        3.5
1976        4.0
1977        5.7
1978        6.2
1979        5.0
1980        4.2
1981        3.4
1982        4.6
1983        5.6
1984        6.4
1985        7.2
1986        8.0
1987        [illegible]
1988        5.0
1989        4.2
1990        5.0
1991        6.2
1992        7.2
1993        8.2
1994        9.2

The General Manager asked the company statistician to calculate the five-year moving average.

(a) If you were the company statistician, how would you present the results to the General Manager?
(b) Calculate the seven-year moving average and explain to the General Manager the difference between the five-year moving average and the seven-year moving average, and which of the two smoothes the data better.
(c) Plot the actual sales and the seven-year moving average sales on the same graph and interpret it.

5. An economist has calculated the variable rate of return on money market funds for the last twelve months as follows:

Month       Rate of Return (%)
January           6.2
February          5.8
March             6.5
April             6.4
May               5.9
June              5.9
July              6.0
August            6.8
September         6.5
October           6.1
November          6.0
December          6.0

(a) Using a three-month moving average, forecast the rate of return for next January.
(b) Using the exponential smoothing method and setting α = 0.8, forecast the rate of return for next January.

6. The Indian Motorcycle Company is concerned about declining sales in the western region. The following data shows monthly sales (in millions of dollars) of the motorcycles for the past twelve months.

Month       Sales
January      6.5
February     6.0
March        6.3
April        5.1
May          5.6
June         4.8
July         4.0
August       3.6
September    3.5
October      3.1
November     3.0
December     3.0

(a) Plot the trend line and describe the relationship between sales and time.
(b) What is the average monthly change in sales?
(c) If the monthly sales fall below $2.4 million, then the West Coast office must be closed. Is it likely that the office will be closed during the next six months?
7. The chief economist for the New York State Department of Commerce reported the following seasonally adjusted values for the consumption of durable goods for the last twelve-month period.

Month       Index
January      119
February     114
March        115
April        116
May          120
June         113
July         115
August       112
September    117
October      116
November     121
December     124

(a) Develop the linear trend model.
(b) Calculate the r value.
(c) Based on the value of r, does it appear to be a good fit?

8. The following data represent the sales (in Rs lakh) for two outdoor furniture outlets for the last ten years.

        Sales
Year    Outlet A    Outlet B
1         118          95
2         114         100
3         130         118
4         125         124
5         140         130
6         143         145
7         147         160
8         158         181
9         149         190
10        161         205

(a) Calculate the regression coefficients for the data for both outlets.
(b) How does the average yearly change in sales differ from one outlet to another?
(c) Plot the sales against time for both outlets.
(d) Which one of the two regression lines seems to be a better fit with the given data?

9. An institution dealing with pension funds is interested in buying a large block of stock of Azumi Business Enterprises (ABE). The president of the institution has noted down the dividends paid out on common stock shares for the last ten years. This data is presented as follows:
Year   Dividend ($)
1985      3.20
1986      3.00
1987      2.80
1988      3.00
1989      2.50
1990      2.10
1991      1.60
1992      2.00
1993      1.10
1994      1.00

(a) Plot the data.
(b) Determine the value of the regression coefficients.
(c) Estimate the dividend expected in 1995.
(d) Calculate the points on the trend line for the years 1987 and 1991 and plot the trend line.

10. Rinkoo Camera Corporation has ten camera stores scattered in five areas of New York City. The president of the company wants to find out if there is any connection between the sales price and the sales volume of the Nikon F-1 camera in the various retail stores. He assigns different prices for the same camera in the different stores and collects data for a thirty-day period. The data is presented as follows. The sales volume is in number of units and the price is in dollars.

Store   Price   Volume
1        550     420
2        600     400
3        625     300
4        575     400
5        600     340
6        500     440
7        450     500
8        480     460
9        550     400
10       650     310

(a) Plot the data.
(b) Estimate the linear regression of sales on price.
(c) What effect would you expect on sales if the price of the camera in store number 7 is increased to $530?
(d) Calculate the points on the trend line for stores 4 and 7 and plot the trend line.

11. The sales of videocassette recorders (VCRs) have been increasing every year since 1984. The following data represents the number of households in the United States (in millions) who have at least one VCR in the house over the years 1984 through 1994.
Year   Number
1984     9.0
1985    17.2
1986    30.9
1987    40.0
1988    49.9
1989    58.4
1990    68.2
1991    80.0
1992    82.2
1993    82.7
1994    84.0

(a) Determine which variable is X and which variable is Y and plot the data.
(b) Estimate the number of households with VCRs in 1995.
(c) By what per cent, on an average, did the number of households with VCRs increase over the years?

12. Nishi Hardware store located in Staten Island has made a record of the net profits for the first eight weeks of 1995. The results are shown in the following table:

Week   Profits ($'000)
1           12
2           20
3            4
4           15
5            2
6           25
7            5
8           10

(a) At the given value of α = 0.4, use the exponential smoothing technique to forecast profits for future weeks.
(b) Forecast the future profits when the α value is changed to 0.9.
(c) Which of the two α values produces a more reliable estimate? Explain.

13. The following data shows the sales revenues for sales of used cars sold by Atlantic Company for the months of January to April in 1995.

Month      Sales ($'00,000)
January          95
February        105
March           100
April           110

Find the error between the actual value and the forecast value for the months of February, March and April of 1995, using the exponential smoothing method with α = 0.6.

14. The following data presents the rate of unemployment in South India for 12 years from 1982 to 1993.