
MASTER OF COMPUTER APPLICATIONS SEMESTER-II STATISTICAL & NUMERICAL METHODS MCA645

CHANDIGARH UNIVERSITY Institute of Distance and Online Learning Course Development Committee Chairman Prof. (Dr.) Parag Diwan Vice Chancellor, Chandigarh University, Gharuan, Punjab Advisors Prof. (Dr.) Bharat Bhushan, Director – IGNOU Prof. (Dr.) Majulika Srivastava, Director – CIQA, IGNOU Programme Coordinators & Editing Team Master of Business Administration (MBA) Bachelor of Business Administration (BBA) Coordinator – Dr. Rupali Arora Coordinator – Dr. Simran Jewandah Master of Computer Applications (MCA) Bachelor of Computer Applications (BCA) Coordinator – Dr. Raju Kumar Coordinator – Dr. Manisha Malhotra Master of Commerce (M.Com.) Bachelor of Commerce (B.Com.) Coordinator – Dr. Aman Jindal Coordinator – Dr. Minakshi Garg Master of Arts (Psychology) Bachelor of Science (Travel &Tourism Management) Coordinator – Dr. Samerjeet Kaur Coordinator – Dr. Shikha Sharma Master of Arts (English) Bachelor of Arts (General) Coordinator – Dr. Ashita Chadha Coordinator – Ms. Neeraj Gohlan Academic and Administrative Management Prof. (Dr.) R. M. Bhagat Prof. (Dr.) S.S. Sehgal Executive Director – Sciences Registrar Prof. (Dr.) Abhishek Prof. (Dr.) Inderpreet Kaur Executive Director – Management Director – IDOL Prof. (Dr.) Manaswini Acharya Executive Director – Liberal Arts © No part of this publication should be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise without the prior written permission of the authors and the publisher. SLM SPECIALLY PREPARED FOR CU IDOL STUDENTS Printed and Published by: SCHOOLGURU EDUSERVE PVT LTD B-903, Western Edge II, Western Express Highway, Borivali (E), Mumbai - 400066 Call Us: +91 22 4896 8005 Mail Us: [email protected] For: CHANDIGARH UNIVERSITY 2 Institute of Distance and Online Learning CU IDOL SELF LEARNING MATERIAL (SLM)

First Published in 2020 All rights reserved. No Part of this book may be reproduced or transmitted, in any form or by any means, without permission in writing from Chandigarh University. Any person who does any unauthorized act in relation to this book may be liable to criminal prosecution and civil claims for damages. This book is meant for educational and learning purpose. The authors of the book has/have taken all reasonable care to ensure that the contents of the book do not violate any existing copyright or other intellectual property rights of any person in any manner whatsoever. In the even the Authors has/ have been unable to track any source and if any copyright has been inadvertently infringed, please notify the publisher in writing for corrective action. 3 CU IDOL SELF LEARNING MATERIAL (SLM)

CONTENT Unit 1- Computer Arithmetic ..........................................................................................................5 Unit 2- Iterative Methods ..............................................................................................................33 Unit 3- Linear Equations and Differential Equations 1: ..............................................................68 Unit 4- Linear Equations and Differential Equations 2 ..............................................................83 Unit 5- Numerical Differentiation and Integration: ...................................................................108 Unit 6- Interpolation and Approximation 1: ..............................................................................134 Unit 7- Interpolation and Approximation 2: ..............................................................................151 Unit 8- Statistical methods..........................................................................................................163 Unit 9- Analysis of Variance: ......................................................................................................181 Unit 10- Time Series Analysis: ...................................................................................................209 4 CU IDOL SELF LEARNING MATERIAL (SLM)

UNIT 1- COMPUTER ARITHMETIC

Structure
1.0 Learning Objective
1.1 Introduction
1.2 Floating point representation of numbers
1.2.1 Fixed-point
1.2.2 Normalized floating-point
1.2.3 Representable numbers
1.2.4 IEEE standard
1.3 Arithmetic operations with normalized floating point numbers and their consequences
1.3.1 Addition
1.3.2 Subtraction
1.3.3 Multiplication
1.3.4 Division
1.3.5 Issues in Floating Point
1.3.6 Consequences
1.4 Error in number representation - pitfalls in computing
1.4.1 Representation errors
1.4.2 Numerical Error
1.4.3 Pitfalls in computing
1.5 Summary
1.6 Keywords
1.7 Learning Activity
1.8 Unit End Questions
1.9 References

1.0 LEARNING OBJECTIVES

After studying this unit, you will be able to:
 Explain the floating point representation of numbers
 Enumerate arithmetic operations with normalized floating point numbers and their consequences
 Comprehend errors in number representation

1.1 INTRODUCTION

Computer arithmetic is a branch of computer engineering that deals with methods of representing integers and real values (e.g., fixed- and floating-point numbers) in digital systems and efficient algorithms for manipulating such numbers by means of hardware circuits or software routines. On the hardware side, various types of adders, subtractors, multipliers, dividers, square-rooters, and circuit techniques for function evaluation are considered. Both abstract structures and technology-specific designs are dealt with. Software aspects of computer arithmetic include complexity, error characteristics, stability, and certifiability of computational algorithms. 1.2 FLOATING POINT REPRESENTATION OF NUMBERS In computing, floating-point arithmetic (FP) is arithmetic using formulaic representation of real numbers as an approximation to support a trade-off between range and precision. For this reason, floating-point computation is often found in systems which include very small and very large real numbers, which require fast processing times. A number is, in general, represented approximately to a fixed number of significant digits (the significand) and scaled using an exponent in some fixed base; the base for the scaling is normally two, ten, or sixteen. The term floating point refers to the fact that a number's radix point (decimal point, or, more commonly in computers, binary point) can \"float\"; that is, it can be placed anywhere relative to the significant digits of the number. This position is indicated as the exponent component, and thus the floating-point representation can be thought of as a kind of scientific notation. A floating-point system can be used to represent, with a fixed number of digits, numbers of different orders of magnitude: e.g. the distance between galaxies or the diameter of an atomic nucleus can be expressed with the same unit of length. The result of this dynamic range is that the numbers that can be represented are not uniformly spaced; the difference between two consecutive representable numbers varies with the chosen scale. Over the years, a variety of floating-point representations have been used in computers. In 1985, the IEEE 754 Standard for Floating-Point Arithmetic was established, and since the 1990s, the most commonly encountered representations are those defined by the IEEE. 1.2.1 Fixed-point One possibility for handling numbers with fractional parts is to add bits after the decimal point: The first bit after the decimal point is the halves place, the next bit the quarters place, the next bit the eighths place, and so on. 6 CU IDOL SELF LEARNING MATERIAL (SLM)

Suppose that we want to represent 1.625(10). We would want 1 in the ones place, leaving us with 0.625. Then we want 1 in the halves place, leaving us with 0.625 − 0.5 = 0.125. No quarters will fit, so put a 0 there. We want a 1 in the eighths place, and we subtract 0.125 from 0.125 to get 0. So the binary representation of 1.625 would be 1.101(2).

The idea of fixed-point representation is to split the bits of the representation between the places to the left of the decimal point and places to the right of the decimal point. For example, a 32-bit fixed-point representation might allocate 24 bits for the integer part and 8 bits for the fractional part.

Fig 1.1: 32 bits fixed-point representation

To represent 1.625, we would use the first 24 bits to indicate 1, and we'd use the remaining 8 bits to represent 0.625. Thus, our 32-bit representation would be: 00000000 00000000 00000001 10100000.

Fixed-point representation works reasonably well as long as you work with numbers within the supported range. The 32-bit fixed-point representation described above can represent any multiple of 1/256 from 0 up to 2^24 ≈ 16.7 million. But programs frequently need to work with numbers from a much broader range. For this reason, fixed-point representation isn't used very often in today's computing world. A notable exception is financial software. Here, all computations must be represented exactly to the penny, and in fact further precision is rarely desired, since results are always rounded to the penny. Moreover, most applications have no requirement for large amounts (like trillions of dollars), so the limited range of fixed-point numbers isn't an issue. Thus, programs typically use a variant of fixed-point representation that represents each amount as an integer multiple of 1/100, just as the fixed-point representation described above represents each number as a multiple of 1/256.
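To make this concrete, here is a small illustrative C++ sketch (not part of the original text; the names fixed_t, to_fixed and from_fixed are made up) that stores each value as an integer count of 1/256ths, mirroring the 24.8 split described above.

#include <cstdint>
#include <iostream>

// A value is stored as a 32-bit integer counting multiples of 1/256
// (24 integer bits, 8 fractional bits).
using fixed_t = std::int32_t;

fixed_t to_fixed(double x)    { return static_cast<fixed_t>(x * 256.0); }  // rounding ignored
double  from_fixed(fixed_t f) { return f / 256.0; }

int main() {
    fixed_t a = to_fixed(1.625);              // 1.625 * 256 = 416, bits ...0001 1010 0000
    fixed_t b = to_fixed(2.0);
    fixed_t sum  = a + b;                     // addition works directly on the integers
    fixed_t prod = (a * b) / 256;             // multiplication doubles the scale factor
    std::cout << from_fixed(sum)  << "\n";    // 3.625
    std::cout << from_fixed(prod) << "\n";    // 3.25
    return 0;
}

With only 8 fractional bits the smallest representable step is 1/256; a financial program would instead count integer multiples of 1/100, as the paragraph above notes.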

1.2.2 Normalized floating-point

Floating-point representation is an alternative technique based on scientific notation.

Floating-point basics

Though we'd like to use scientific notation, we'll base our scientific notation on powers of 2, not powers of 10, because we're working with computers that prefer binary. For example, 5.5(10) is 101.1(2) in binary, and it converts to a binary scientific notation of 1.011(2) × 2^2. (In converting to binary scientific notation here, we moved the decimal point to the left two places, just as we would in converting 101.1(10) to scientific notation: it would be 1.011(10) × 10^2.)

Once we have a number in binary scientific notation, we still must have a technique for mapping that into a set of bits. First, let us define the two parts of scientific representation: In 1.011(2) × 2^2, we call 1.011(2) the mantissa (or the significand), and we call 2 the exponent. In this section we'll use 8 bits to store such a number.

Fig 1.2: 8 bit memory

We use the first bit to represent the sign (1 for negative, 0 for positive), the next four bits for the sum of 7 and the actual exponent (we add 7 to allow for negative exponents), and the last three bits for the mantissa's fractional part. Note that we omit the integer part of the mantissa: Since the mantissa must have exactly one nonzero bit to the left of its decimal point, and the only nonzero bit is 1, we know that the bit to the left of the decimal point must be a 1. There's no point in wasting space in inserting this 1 into our bit pattern, so we include only the bits of the mantissa to the right of the decimal point.

For our example of 5.5(10) = 1.011(2) × 2^2, we add 7 to 2 to arrive at 9(10) = 1001(2) for the exponent bits. Into the mantissa bits we place the bits following the decimal point of the scientific notation, 011. This gives us 0 1001 011 as the 8-bit representation of 5.5(10). We call this floating-point representation because the values of the mantissa bits "float" along with the decimal point, based on the exponent's given value. This is in contrast to fixed-point representation, where the decimal point is always in the same place among the bits given.

Suppose we want to represent −96(10).
a. First, we convert our desired number to binary: −1100000(2).
b. Then we convert this to binary scientific notation: −1.100000(2) × 2^6.
c. Then we fit this into the bits.
1. We choose 1 for the sign bit since the number is negative.
2. We add 7 to the exponent and place the result into the four exponent bits. For this example, we arrive at 6 + 7 = 13(10) = 1101(2).
3. The three mantissa bits are the first three bits following the leading 1: 100. (If it happened that there were bits beyond the 1/8's place, we would need to round the mantissa to the nearest eighth.)
Thus we end up with 1 1101 100.

Conversely, suppose we want to decode the number 0 0101 100.

1. We observe that the number is positive, and the exponent bits represent 0101(2) = 5(10). This is 7 more than the actual exponent, and so the actual exponent must be −2. Thus, in binary scientific notation, we have 1.100(2) × 2−2. 2. We convert this to binary: 1.100(2) × 2−2 = 0.011(2). 3. We convert the binary into decimal: 0.011(2) = 1/4 + 1/8 = 3/8 = 0.375(10). Let's illustrate with an example, suppose that the 32-bit pattern is 1 1000 0001 011 0000 0000 0000 0000 0000, with:  S=1  E = 1000 0001  F = 011 0000 0000 0000 0000 0000 In the normalized form, the actual fraction is normalized with an implicit leading 1 in the form of 1.F. In this example, the actual fraction is 1.011 0000 0000 0000 0000 0000 = 1 + 1×2^-2 + 1×2^-3 = 1.375D. The sign bit represents the sign of the number, with S=0 for positive and S=1 for negative number. In this example with S=1, this is a negative number, i.e., -1.375D. In normalized form, the actual exponent is E-127 (so-called excess-127 or bias-127). This is because we need to represent both positive and negative exponent. With an 8-bit E, ranging from 0 to 255, the excess-127 scheme could provide actual exponent of -127 to 128. In this example, E-127=129- 127=2D. Hence, the number represented is -1.375×2^2=-5.5D. De-Normalized Form Normalized form has a serious problem, with an implicit leading 1 for the fraction, it cannot represent the number zero! Convince yourself on this! De-normalized form was devised to represent zero and other numbers. For E=0, the numbers are in the de-normalized form. An implicit leading 0 (instead of 1) is used for the fraction; and the actual exponent is always -126. Hence, the number zero can be represented with E=0 and F=0 (because 0.0×2^-126=0). We can also represent very small positive and negative numbers in de-normalized form with E=0. For example, if S=1, E=0, and F=011 0000 0000 0000 0000 0000. The actual fraction is 0.011=1×2^- 2+1×2^-3=0.375D. Since S=1, it is a negative number. With E=0, the actual exponent is -126. Hence the number is -0.375×2^-126 = -4.4×10^-39, which is an extremely small negative number (close to zero). Example 1: Suppose that IEEE-754 32-bit floating-point representation pattern is 0 10000000 110 0000 0000 0000 0000 0000. 9 CU IDOL SELF LEARNING MATERIAL (SLM)

Sign bit S = 0 ⇒ positive number E = 1000 0000B = 128D (in normalized form) Fraction is 1.11B (with an implicit leading 1) = 1 + 1×2^-1 + 1×2^-2 = 1.75D The number is +1.75 × 2^ (128-127) = +3.5D Example 2: Suppose that IEEE-754 32-bit floating-point representation pattern is 1 01111110 100 0000 0000 0000 0000 0000. Sign bit S = 1 ⇒ negative number E = 0111 1110B = 126D (in normalized form) Fraction is 1.1B (with an implicit leading 1) = 1 + 2^-1 = 1.5D The number is -1.5 × 2^ (126-127) = -0.75D Example 3: Suppose that IEEE-754 32-bit floating-point representation pattern is 1 01111110 000 0000 0000 0000 0000 0001. Sign bit S = 1 ⇒ negative number E = 0111 1110B = 126D (in normalized form) Fraction is 1.000 0000 0000 0000 0000 0001B (with an implicit leading 1) = 1 + 2^-23 The number is -(1 + 2^-23) × 2^ (126-127) = -0.500000059604644775390625 (may not be exact in decimal!) Example 4 (De-Normalized Form): Suppose that IEEE-754 32-bit floating-point representation pattern is 1 00000000 000 0000 0000 0000 0000 0001. Sign bit S = 1 ⇒ negative number 10 E = 0 (in de-normalized form) CU IDOL SELF LEARNING MATERIAL (SLM)

Fraction is 0.000 0000 0000 0000 0000 0001B (with an implicit leading 0) = 1×2^-23
The number is -2^-23 × 2^(-126) = -2^(-149) ≈ -1.4×10^-45

1.2.3 Representable numbers

This 8-bit floating-point format can represent a wide range of both small numbers and large numbers. To find the smallest possible positive number we can represent, we would want the sign bit to be 0, we would place 0 in all the exponent bits to get the smallest exponent possible, and we would put 0 in all the mantissa bits. This gives us 0 0000 000, which represents 1.000(2) × 2^(0 − 7) = 2^-7 ≈ 0.0078(10). To determine the largest positive number, we would want the sign bit still to be 0, we would place 1 in all the exponent bits to get the largest exponent possible, and we would put 1 in all the mantissa bits. This gives us 0 1111 111, which represents 1.111(2) × 2^(15 − 7) = 1.111(2) × 2^8 = 111100000(2) = 480(10).

Thus, our 8-bit floating-point format can represent positive numbers from about 0.0078(10) to 480(10). In contrast, the 8-bit two's-complement representation can only represent positive numbers between 1 and 127.

But notice that the floating-point representation can't represent all of the numbers in its range — this would be impossible, since eight bits can represent only 2^8 = 256 distinct values, and there are infinitely many real numbers in the range to represent. What's going on? Let's consider how to represent 51(10) in this scheme. In binary, this is 110011(2) = 1.10011(2) × 2^5. When we try to fit the mantissa into the 3-bit portion of our scheme, we find that the last two bits won't fit: We would be forced to round to 1.101(2) × 2^5, and the resulting bit pattern would be 0 1100 101. That rounding means that we're not representing the number precisely. In fact, 0 1100 101 translates to 1.101(2) × 2^(12 − 7) = 1.101(2) × 2^5 = 110100(2) = 52(10). Thus, in our 8-bit floating-point representation, 51 equals 52! That's pretty irritating, but it's a price we have to pay if we want to be able to handle a large range of numbers with such a small number of bits.

(By the way, in rounding numbers that are exactly between two possibilities, the typical policy is to round so that the final mantissa bit is 0. For example, taking the number 19, we end up with 1.0011(2) × 2^4, and we would round this up to 1.010(2) × 2^4 = 20(10). On the other hand, rounding

up the number 21 = 1.0101(2) × 2^4 would lead to 1.011(2) × 2^4, leaving a 1 in the final bit of the mantissa, which we want to avoid; so instead we round down to 1.010(2) × 2^4 = 20(10). Doubtless you were taught in grade school to "round up" all the time; computers don't do this because if we consistently round up, all those roundings will bias the total of the numbers upward. Rounding so the final bit is 0 ensures that exactly half of the possible numbers round up and exactly half round down.)

While a floating-point representation can't represent all numbers precisely, it does give us a guaranteed number of significant digits. For this 8-bit representation, we get a single digit of precision, which is pretty limited. To get more precision, we need more mantissa bits. Suppose we defined a similar 16-bit representation with 1 bit for the sign bit, 6 bits for the exponent plus 31, and 9 bits for the mantissa.

Fig 1.3: 16 bit memory

This representation, with its 9 mantissa bits, happens to provide three significant digits. Given a limited length for a floating-point representation, we have to compromise between more mantissa bits (to get more precision) and more exponent bits (to get a wider range of numbers to represent). For 16-bit floating-point numbers, the 6-and-9 split is a reasonable trade-off of range versus precision.

1.2.4 IEEE standard

Nearly all computers today follow the IEEE 754 standard for representing floating-point numbers. This standard was largely developed by 1980 and it was formally adopted in 1985, though several manufacturers continued to use their own formats throughout the 1980's. This standard is similar to the 8-bit and 16-bit formats we've explored already, but the standard deals with longer bit lengths to gain more precision and range; and it incorporates two special cases to deal with very small and very large numbers.

IEEE formats

There are two major varieties of the standard, for 32 bits and 64 bits, although it does define other lengths, like 128 bits.

format         sign bit   exponent bits   mantissa bits   exponent excess   significant digits
Our 8-bit      1          4               3               7                 1
Our 16-bit     1          6               9               31                3
IEEE 32-bit    1          8               23              127               6
IEEE 64-bit    1          11              52              1,023             15
IEEE 128-bit   1          15              112             16,383            34
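The 8-bit format above is small enough to try out in a few lines of code. The following C++ sketch is an added illustration (the helper names encode8 and decode8 are invented, and only normalized values are handled); it packs a value into 1 sign / 4 exponent (excess-7) / 3 mantissa bits and decodes it again, reproducing the worked examples: 5.5 encodes to 0 1001 011, −96 to 1 1101 100, 0.375 to 0 0101 100, and 51 comes back as 52.

#include <cmath>
#include <cstdio>

// Toy 8-bit float: 1 sign bit, 4 exponent bits (excess-7), 3 mantissa bits.
// Normalized values only; no overflow, underflow or round-to-even handling.
unsigned encode8(double x) {
    unsigned sign = (x < 0) ? 1u : 0u;
    double m = std::fabs(x);
    int e = static_cast<int>(std::floor(std::log2(m)));            // so that 1 <= m / 2^e < 2
    unsigned frac =
        static_cast<unsigned>(std::round((m / std::pow(2.0, e) - 1.0) * 8.0));
    if (frac == 8) { frac = 0; ++e; }                               // mantissa rounded up to 10.000
    return (sign << 7) | (static_cast<unsigned>(e + 7) << 3) | frac;
}

double decode8(unsigned bits) {
    unsigned sign = (bits >> 7) & 1u;
    int      e    = static_cast<int>((bits >> 3) & 0xFu) - 7;       // remove the excess of 7
    double   m    = 1.0 + (bits & 0x7u) / 8.0;                      // restore the hidden 1
    return (sign ? -1.0 : 1.0) * m * std::pow(2.0, e);
}

int main() {
    const double tests[] = {5.5, -96.0, 0.375, 51.0};
    for (double x : tests) {
        unsigned b = encode8(x);
        char s[9];
        for (int i = 0; i < 8; ++i) s[i] = ((b >> (7 - i)) & 1u) ? '1' : '0';
        s[8] = '\0';
        std::printf("%6g -> %.1s %.4s %.3s -> %g\n", x, s, s + 1, s + 5, decode8(b));
    }
    return 0;
}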

All formats use an offset for the exponent that is halfway up the range of integers that can fit into the exponent bits; this is called the excess. For the 8-bit format, we had 4 exponent bits; the largest number that can fit into 4 bits is 2^4 − 1 = 15, and so the excess is 7 ≈ 15/2. The IEEE 32-bit format has 8 exponent bits, and so the largest number that fits is 255, and the excess is 127 ≈ 255/2.

In C programming, the float type corresponds to 32 bits, and double corresponds to 64 bits. (Actually, C doesn't specify the length or representation for these types. But this is most common. Java, incidentally, requires that these types correspond to the respective IEEE formats.)

The representation scheme for 64-bit double-precision is similar to the 32-bit single-precision:
 The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
 The following 11 bits represent the exponent (E).
 The remaining 52 bits represent the fraction (F).

The value (N) is calculated as follows:
 Normalized form: For 1 ≤ E ≤ 2046, N = (-1)^S × 1.F × 2^(E-1023).
 Denormalized form: For E = 0, N = (-1)^S × 0.F × 2^(-1022). These are in the denormalized form.
 For E = 2047, N represents special values, such as ±INF (infinity), NaN (not a number).

There are three parts in the floating-point representation (a short decoding sketch follows this list):
 The sign bit (S) is self-explanatory (0 for positive numbers and 1 for negative numbers).
 For the exponent (E), a so-called bias (or excess) is applied so as to represent both positive and negative exponents. The bias is set at half of the range. For single precision with an 8-bit exponent, the bias is 127 (or excess-127). For double precision with an 11-bit exponent, the bias is 1023 (or excess-1023).
 The fraction (F) (also called the mantissa or significand) is composed of an implicit leading bit (before the radix point) and the fractional bits (after the radix point). The leading bit for normalized numbers is 1; while the leading bit for denormalized numbers is 0.
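Assuming float is the IEEE 32-bit format (true on essentially all current machines), the following C++ sketch, added here as an illustration, pulls the S, E and F fields out of a float with memcpy and recomputes its value using the excess-127 rule; it reproduces Example 2 above, where -0.75 has S = 1, E = 126 and F = 100...0.

#include <cstdint>
#include <cstring>
#include <cstdio>
#include <cmath>

int main() {
    float x = -0.75f;                          // Example 2: 1 01111110 100...0
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);       // reinterpret the 32 bits of the float

    unsigned s = bits >> 31;                   // sign bit
    unsigned e = (bits >> 23) & 0xFFu;         // 8-bit biased exponent
    unsigned f = bits & 0x7FFFFFu;             // 23-bit fraction

    // Normalized case (1 <= E <= 254): N = (-1)^S * 1.F * 2^(E - 127)
    double value = (s ? -1.0 : 1.0) * (1.0 + f / 8388608.0)   // 8388608 = 2^23
                   * std::pow(2.0, static_cast<int>(e) - 127);

    std::printf("S=%u E=%u F=0x%06X -> %g\n", s, e, f, value);
    return 0;
}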

Normalized Floating-Point Numbers In normalized form, the radix point is placed after the first non-zero digit, e.g., 9.8765D×10^- 23D, 1.001011B×2^11B. For binary number, the leading bit is always 1, and need not be represented explicitly - this saves 1 bit of storage. In IEEE 754's normalized form:  For single-precision, 1 ≤ E ≤ 254 with excess of 127. Hence, the actual exponent is from -126 to +127. Negative exponents are used to represent small numbers (< 1.0); while positive exponents are used to represent large numbers (> 1.0). N = (-1)^S × 1.F × 2^(E-127)  For double-precision, 1 ≤ E ≤ 2046 with excess of 1023. The actual exponent is from - 1022 to +1023, and N = (-1)^S × 1.F × 2^(E-1023) Take note that n-bit pattern has a finite number of combinations (=2^n), which could represent finite distinct numbers. It is not possible to represent the infinite numbers in the real axis (even a small range says 0.0 to 1.0 has infinite numbers). That is, not all floating-point numbers can be accurately represented. Instead, the closest approximation is used, which leads to loss of accuracy. The minimum and maximum normalized floating-point numbers are: Table 1: The minimum and maximum normalized floating-point numbers 14 CU IDOL SELF LEARNING MATERIAL (SLM)
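These limits can also be queried from a program. The snippet below is an added illustration (it is not the original table); it prints the smallest and largest positive normalized values and the smallest denormalized value for float and double using std::numeric_limits.

#include <limits>
#include <cstdio>

int main() {
    using F = std::numeric_limits<float>;
    using D = std::numeric_limits<double>;
    std::printf("float : min normalized %g, max %g, min denormalized %g\n",
                F::min(), F::max(), F::denorm_min());
    std::printf("double: min normalized %g, max %g, min denormalized %g\n",
                D::min(), D::max(), D::denorm_min());
    return 0;
}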

Fig 1.4: Representation of Normalized and Denormalized Floating- point numbers Denormalized Floating-Point Numbers If E = 0, but the fraction is non-zero, then the value is in denormalized form, and a leading bit of 0 is assumed, as follows:  For single-precision, E = 0, N = (-1) ^S × 0.F × 2^ (-126)  For double-precision, E = 0, N = (-1)^S × 0.F × 2^(-1022) Denormalized form can represent very small numbers closed to zero, and zero, which cannot be represented in normalized form, as shown in the above figure. The minimum and maximum of denormalized floating-point numbers are: Table 2: Minimum and maximum of denormalized floating-point numbers Special Values Zero: Zero cannot be represented in the normalized form, and must be represented in denormalized form with E=0 and F=0. There are two representations for zero: +0 with S=0 and -0 with S=1. Infinity: The value of +infinity (e.g., 1/0) and -infinity (e.g., -1/0) are represented with an exponent of all 1's (E = 255 for single-precision and E = 2047 for double-precision), F=0, and S=0 (for +INF) and S=1 (for -INF). Not a Number (NaN): NaN denotes a value that cannot be represented as real number (e.g. 0/0). NaN is represented with Exponent of all 1's (E = 255 for single-precision and E = 2047 for double-precision) and any non-zero fraction. 1.3 ARITHMETIC OPERATIONS WITH NORMALIZED FLOATING POINT NUMBERS AND THEIR CONSEQUENCES. Arithmetic operations on floating point numbers consist of addition, subtraction, multiplication and division. The operations are done with algorithms similar to those used on sign magnitude integers 15 CU IDOL SELF LEARNING MATERIAL (SLM)

(because of the similarity of representation) -- for example, only add numbers of the same sign. If the numbers are of opposite sign, must do subtraction.

1.3.1 Addition

Example 1: on decimal values given in scientific notation:

    3.25 x 10 ** 3
  + 2.63 x 10 ** -1
  -----------------

First step: align decimal points
Second step: add

    3.25     x 10 ** 3
  + 0.000263 x 10 ** 3
  --------------------
    3.250263 x 10 ** 3

(presumes use of infinite precision, without regard for accuracy)

Third step: normalize the result (already normalized!)

Example 2: on fl. pt. values given in binary:

    .25 = 0 01111101 00000000000000000000000
    100 = 0 10000101 10010000000000000000000

To add these fl. pt. representations,
Step 1: align radix points
Shifting the mantissa left by 1 bit decreases the exponent by 1.
Shifting the mantissa right by 1 bit increases the exponent by 1.
We want to shift the mantissa right, because the bits that fall off the end should come from the least significant end of the mantissa -> choose to shift the .25, since we want to increase its exponent.

-> shift by
    10000101
  - 01111101

--------- 00001000 (8) places. 0 01111101 00000000000000000000000 (original value) 0 01111110 10000000000000000000000 (shifted 1 place) (note that hidden bit is shifted into msb of mantissa) 0 01111111 01000000000000000000000 (shifted 2 places) 0 10000000 00100000000000000000000 (shifted 3 places) 0 10000001 00010000000000000000000 (shifted 4 places) 0 10000010 00001000000000000000000 (shifted 5 places) 0 10000011 00000100000000000000000 (shifted 6 places) 0 10000100 00000010000000000000000 (shifted 7 places) 0 10000101 00000001000000000000000 (shifted 8 places) step 2: add (don't forget the hidden bit for the 100) 0 10000101 1.10010000000000000000000 (100) + 0 10000101 0.00000001000000000000000 (.25) --------------------------------------- 0 10000101 1.10010001000000000000000 step 3: normalize the result (get the \"hidden bit\" to be a 1) it already is for this example. result is 0 10000101 10010001000000000000000 Example 3: Floating point addition: Add the following two decimal numbers in scientific notation: 8.70 × 10-1 with 9.95 × 101 1. Rewrite the smaller number such that its exponent matches with the exponent of the larger number. 8.70 × 10-1 = 0.087 × 101 2. Add the mantissas 9.95 + 0.087 = 10.037 and write the sum 10.037 × 101 3. Put the result in Normalised Form 10.037 × 101 = 1.0037 × 102 (shift mantissa, adjust exponent) check for overflow/underflow of the exponent after normalisation 17 CU IDOL SELF LEARNING MATERIAL (SLM)

4. Round the result If the mantissa does not fit in the space reserved for it, it has to be rounded off. For Example: If only 4 digits are allowed for mantissa 1.0037 × 102 ===> 1.004 × 102 Example 4: addition in binary Perform 0.5 + (-0.4375) 0.5 = 0.1 × 20 = 1.000 × 2-1 (normalised) -0.4375 = -0.0111 × 20 = -1.110 × 2-2 (normalised) 1. Rewrite the smaller number such that its exponent matches with the exponent of the larger number. -1.110 × 2-2 = -0.1110 × 2-1 2. Add the mantissas: 1.000 × 2-1 + -0.1110 × 2-1 = 0.001 × 2-1 3. Normalise the sum, checking for overflow/underflow: 0.001 × 2-1 = 1.000 × 2-4 -126 <= -4 <= 127 ===> No overflow or underflow 4. Round the sum: The sum fits in 4 bits so rounding is not required Check: 1.000 × 2-4 = 0.0625 which is equal to 0.5 - 0.4375 (which is correct) 1.3.2 Subtraction Like addition as far as alignment of radix points then the algorithm for subtraction of sign mag. numbers takes over. Before subtracting, compare magnitudes (don't forget the hidden bit!) change sign bit if order of operands is changed. Don't forget to normalize number afterward. Example 1: Let the two numbers be x = 9.75 y = – 0.5625 Converting them into 32-bit floating point representation 18 9.75’s representation in 32-bit format = 0 10000010 00111000000000000000000 – 0.5625’s representation in 32-bit format = 1 01111110 00100000000000000000000 Now, we find the difference of exponents to know how much shifting is required. (10000010 – 01111110)2 = (4)10 Now, we shift the mantissa of lesser number right side by 4 units. Mantissa of – 0.5625 = 1.00100000000000000000000 (note that 1 before decimal point is understood in 32-bit representation) CU IDOL SELF LEARNING MATERIAL (SLM)

Shifting right by 4 units, 0.00010010000000000000000 Mantissa of 9.75= 1. 00111000000000000000000 Subtracting mantissa of both 0. 00010010000000000000000 – 1. 00111000000000000000000 ———————————————— 1. 00100110000000000000000 Sign bit of bigger number = 0 So, finally the answer = x – y = 0 10000010 00100110000000000000000 1.3.3 Multiplication Example 1 on decimal values given in scientific notation: 3.0 x 10 ** 1 + 0.5 x 10 ** 2 ----------------- algorithm: multiply mantissas add exponents 3.0 x 10 ** 1 + 0.5 x 10 ** 2 ----------------- 1.50 x 10 ** 3 Example 2 in binary: use a mantissa that is only 4 bits so that I don't spend all day just doing the multiplication part. 0 10000100 0100 x 1 00111100 1100 ----------------- mantissa multiplication: 1.0100 (don't forget hidden bit) x 1.1100 ------ 00000 19 CU IDOL SELF LEARNING MATERIAL (SLM)

00000 10100 10100 10100 --------- 1000110000 becomes 10.00110000 add exponents: always add true exponents (otherwise, the bias gets added in twice) biased: 10000100 + 00111100 ---------- 10000100 01111111 (switch the order of the subtraction, - 01111111 - 00111100 so that we can get a negative value) ---------- ---------- 00000101 01000011 true exp true exp is 5. is -67 add true exponents 5 + (-67) is -62. re-bias exponent: -62 + 127 is 65. unsigned representation for 65 is 01000001.put the result back together (and add sign bit). 1 01000001 10.00110000 normalize the result: (moving the radix point one place to the left increases the exponent by 1.) 1 01000001 10.00110000 becomes 1 01000010 1.000110000 This is the value stored (not the hidden bit!): 1 01000010 000110000 Example 3: Floating Point Multiplication 20 Multiply the following two numbers in scientific notation by hand: 1.110 × 1010 × 9.200 × 10-5 CU IDOL SELF LEARNING MATERIAL (SLM)

1. Add the exponents to find New Exponent = 10 + (-5) = 5 If we add biased exponents, bias will be added twice. Therefore we need to subtract it once to compensate: (10 + 127) + (-5 + 127) = 259 259 - 127 = 132 which is (5 + 127) = biased new exponent 2. Multiply the mantissas 1.110 × 9.200 = 10.212000 Can only keep three digits to the right of the decimal point, so the result is 10.212 × 105 3. Normalise the result 1.0212 × 106 4. Round it 1.021 × 106 Example 4: multiplication in binary: 1.000 × 2-1 × -1.110 × 2-2 1. Add the biased exponents (-1 + 127) + (-2 + 127) - 127 = 124 ===> (-3 + 127) 2. Multiply the mantissas 1.000 × 1.110 ----------- 0000 1000 1000 + 1000 ----------- 1110000 ===> 1.110000 The product is 1.110000 × 2-3 Need to keep it to 4 bits 1.110 × 2-3 3. Normalise (already normalised) At this step check for overflow/underflow by making sure that -126 <= Exponent <= 127 1 <= Biased Exponent <= 254 4. Round the result (no change) 5. Adjust the sign. 21 CU IDOL SELF LEARNING MATERIAL (SLM)

Since the original signs are different, the result will be negative -1.110 × 2-3 1.3.4 Division Similar to multiplication. true division: do unsigned division on the mantissas (don't forget the hidden bit) subtract TRUE exponents The IEEE standard is very specific about how all this is done. Unfortunately, the hardware to do all this is pretty slow. Some comparisons of approximate times: 2's complement integer add 1 time unit fl. pt add 4-time units fl. pt. multiply 6-time units fl. pt. divide 13-time units There is a faster way to do division. It’s called division by reciprocal approximation. It takes about the same time as a fl. pt. multiply. Unfortunately, the results are not always the same as with true division. Division by reciprocal approximation: instead of doing a / b they do a x 1/b. figure out a reciprocal for b, and then use the fl. pt. multiplication hardware. example of a result that isn't the same as with true division. true division: 3/3 = 1 (exactly) reciprocal approx.: 1/3 = .33333333 3 x .33333333 = .99999999, not 1 It is not always possible to get a perfectly accurate reciprocal. 1.3.5 Issues in Floating Point Note: This discussion only touches the surface of some issues that people deal with. Entire courses could probably be taught on each of the issues. Rounding -------- Arithmetic operations on fl. pt. values compute results that cannot be represented in the given amount of precision. So, we must round results. 22 CU IDOL SELF LEARNING MATERIAL (SLM)

There are MANY ways of rounding. They each have \"correct\" uses, and exist for different reasons. The goal in a computation is to have the computer round such that the end result is as \"correct\" as possible. There are even arguments as to what is really correct. 3 methods of rounding: R toward 0 -- also called truncation. figure out how many bits (digits) are available. Take that many bits (digits) for the result and throw away the rest. This has the effect of making the value represented closer to 0. Example: .7783 if 3 decimal places available, .778 if 2 decimal places available, .77 R toward + infinity -- Regardless of the value, round towards +infinity. Example: 1.23 if 2 decimal places, 1.3 -2.86 if 2 decimal places, -2.8 Round toward - infinity -- Regardless of the value, round towards -infinity. Example: 1.23 if 2 decimal places, 1.2 -2.86 if 2 decimal places, -2.9 In binary -- rounding to 2 digits after radix point ---------------------------------------------------- round toward + infinity -- 1.1101 | 1.11 | 10.00 ------ 1.001 | 1.00 | 1.01 ----- 23 CU IDOL SELF LEARNING MATERIAL (SLM)

Round toward - infinity -- 1.1101 | 1.11 | 10.00 ------ 1.001 | 1.00 | 1.01 ----- Round toward zero (TRUNCATE) -- 1.1101 | 1.11 | 10.00 ------ 1.001 | 1.00 | 1.01 ----- -1.1101 | -10.00 | -1.11 ------ -1.001 | -1.01 | -1.00 ----- Round toward nearest -- ODD CASE: if there is anything other than 1000... to the right of the number of digits to be kept, then rounded in IEEE standard such that the least significant bit (to be kept) is a zero. 24 CU IDOL SELF LEARNING MATERIAL (SLM)

1.1111 | 1.11 | 10.00 ------ 1.1101 | 1.11 | 10.00 ------ 1.001 (ODD CASE) | 1.00 | 1.01 ----- -1.1101 (1/4 of the way between) | -10.00 | -1.11 ------ -1.001 (ODD CASE) | -1.01 | -1.00 ----- NOTE: this is a bit different than the \"round to nearest\" algorithm (for the \"tie\" case, .5) learned in elementary school for decimal numbers. Use of standards ---------------- --> allows all machines following the standard to exchange data and to calculate the exact same results. --> IEEE fl. pt. standard sets parameters of data representation (# bits for mantissa vs. exponent) --> Pentium architecture follows the standard Overflow and Underflow ---------------------- 25 CU IDOL SELF LEARNING MATERIAL (SLM)

Just as with integer arithmetic, floating point arithmetic operations can cause overflow. Detection of overflow in fl. pt. comes by checking exponents before/during normalization. Once overflow has occurred, an infinity value can be represented and propagated through a calculation. Underflow occurs in fl. pt. representations when a number is too small (close to 0) to be represented. (show number line!) if a fl. pt. value cannot be normalized (getting a 1 just to the left of the radix point would cause the exponent field to be all 0's) then underflow occurs. 1.3.6 Consequences: You might wish that the standard laws of arithmetic would apply to numeric representations. For example, it would be convenient if the commutative law (x + y = y + x) applied to addition. With integer arithmetic using two's-complement representation, all the standard laws apply except those that involve division. Identities involving division fail, of course, because division often results in non-integers, which can only be approximated in a two's-complement representation. With IEEE floating-point numbers, however, many laws fall apart. The commutative law still holds for both addition and multiplication. But the associative law of addition (x + (y + z) = (x + y) + z) does not: Suppose x and y were the largest possible numeric value, and z were the most negative numeric value. Then the value of x + (y + z) would be x, since y + z is 0 and x + 0 is x. But the value of (x + y) + z would be positive infinity, since x + y results in overflow, which is equivalent to infinity in the IEEE format, and any finite number added to infinity results in infinity again. The associative law fails for addition even without resorting to such special cases. Going back to our 8-bit representation, consider x = 1, y = 52, z = −52. If you evaluate x + (y + z), you first notice that y + z = 0 and so x + (y + z) is 1. But (x + y) + z is 0, since the value 1 + 52 is still 52 due to rounding. Another example of something that should be true but isn't in IEEE floating-point arithmetic is the following equation. 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1. The reason this fails is that 1/6 has no precise representation with a fixed number of mantissa bits, and so it must be approximated with the IEEE format. As it happens, it underestimates slightly. When we add this underestimation to itself 6 times, the total ends up being less than 1. 26 CU IDOL SELF LEARNING MATERIAL (SLM)
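Both failures are easy to reproduce. The short program below is an added illustration in double precision (the constant 1e18 is chosen simply because it is large enough that adding 1 to it has no effect):

#include <cstdio>

int main() {
    // Associativity of addition fails once rounding is involved.
    double x = 1.0, y = 1.0e18, z = -1.0e18;
    std::printf("x + (y + z) = %g\n", x + (y + z));   // 1: y + z is exactly 0
    std::printf("(x + y) + z = %g\n", (x + y) + z);   // 0: the 1 is lost when added to 1e18

    // Six copies of 1/6 do not add up to exactly 1.
    double sixth = 1.0 / 6.0, sum = 0.0;
    for (int i = 0; i < 6; ++i) sum += sixth;
    std::printf("sum = %.17g, equal to 1? %s\n", sum, sum == 1.0 ? "yes" : "no");
    return 0;
}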

Such oddities are problems, and good programmers are very aware of them. Nobody has really come up with an alternative that is convincingly better than the IEEE approach to representing numbers: All have some odd anomalies, and the IEEE approach avoids these better than most. Consequently, all modern computers use this representation for non-integral numbers.

1.4 ERROR IN NUMBER REPRESENTATION - PITFALLS IN COMPUTING

1.4.1 Representation errors

Floating-point numbers have a few obvious drawbacks:
1. Most real numbers cannot be represented exactly
2. Small representation errors can accumulate systematically during arithmetic operations

Let's write a program that (in a quite naive way) attempts to range over the values -3, -2.9, -2.8, ..., 2.8, 2.9 using a for loop that increments a float variable.

#include <iostream>
using std::cout;
using std::endl;
#include <iomanip>
using std::setw;

int main()
{
    cout.precision(10);
    int n = -30;
    for (float x = -3.0; x < 3.0; x += 0.1, n += 1)
        cout << setw(16) << x << setw(6) << 0.1*n << endl;
    return 0;
}

As the output of this program makes clear, repeatedly adding 0.1 to -3.0 does not give the desired result. Each value is slightly off, and the comparison (x < 3.0) fails to cut off the sequence at 2.9.

              -3    -3
    -2.900000095  -2.9
    -2.800000191  -2.8
    -2.700000286  -2.7
    -2.600000381  -2.6
    -0.400000602  -0.4
    -0.300000608  -0.3
    -0.2000006139 -0.2

    -0.1000006124  -0.1
 -6.124377023e-07     0
    0.09999939054   0.1
     0.199999392    0.2
    0.2999993861    0.3
    0.3999993801    0.4
     2.699999094    2.7
     2.799998999    2.8
     2.899998903    2.9
     2.999998808      3

Why does this happen? Although -3.0 can be exactly represented, 0.1 cannot. The closest we can get is 0.1000000015, which is represented as follows.

s/eeeeeeee/fffffffffffffffffffffff
0 01111011 10011001100110011001101

The exact number is a non-terminating binary fraction: 0.1 = (1.100110011001100...)_2 × 2^-4.

1.4.2 Numerical Error

Floating point numbers are a peculiar finite subset of the rationals, designed to span many orders of magnitude and to have a consistent number of values in each factor-two interval. Since the exponent field is finite, there are minimum and maximum values that can be represented.

type     largest number   smallest number   epsilon
float    3.40E+38         1.5E-45           1.19E-7
double   1.80E+308        5.0E-324          2.22E-16

This means that any number represented as a float that is greater than 3.4×10^38 is an inf; any that is smaller than 1.5×10^-45 is identically 0. Within the representable range, an arbitrary real number x = x̂ + δx will be closest to some floating point value x̂ but a distance δx away. It is easy to show that the relative error εx = δx/x is bounded by the machine epsilon εm, defined to be the smallest value such that 1 + εm > 1. (That is, the bounds on the absolute error δx scale with |x|.) In the case of a float, 1 = +2^(127-127) × (1 + 0.0...0). The next highest number has an identical bit pattern except that the least significant bit is on.

0/01111111/00000000000000000000000 = 1
0/01111111/00000000000000000000001 = 1 + εm

The difference corresponds to εm = 2^-23 = 1.19×10^-7. Hence, the boolean test 1.0F + 1E-9F == 1.0F evaluates to true (whereas 1.0 + 1E-9 == 1.0 evaluates to false, because εm = 2^-52 = 2.22×10^-16 for doubles).

Such representation errors are only a small part of what you need to worry about. There are many other sources of error.
1. Rounding and truncation errors occur during conversions between types
2. Errors are introduced when series expansions are cut off: e.g., cos(0.5*M_PI) != 0.0, even though cos(π/2) is zero; such behaviour is common to all transcendental functions in the cmath library
3. Addition of floating point values having very different orders of magnitude may lead to (mild) loss of significance
4. Subtraction of nearly equal values may lead to (severe) loss of significance
5. Multiplication can lead to underflow or overflow if the exponents of the two operands are both large or both small
6. Division is never entirely safe, since verifying that the divisor is not zero does not guarantee that a division will not overflow and yield infinity

1.4.3 Pitfalls in computing

HW vs. SW computing
 Floating point operations can be done by hardware (circuitry) or by software (program code).
 A programmer won't know which is occurring, without prior knowledge of the HW.
 SW is much slower than HW, by approx. 1000 times.
 A difficult (but good) exercise for students would be to design a SW algorithm for doing fl. pt. addition using only integer operations.
 SW to do fl. pt. operations is tedious. It takes lots of shifting and masking to get the data in the right form to use integer arithmetic operations to get a result -- and then more shifting and masking to put the number back into fl. pt. format.
 A common thing that manufacturers used to do is to offer 2 versions of the same architecture, one with HW, and the other with SW fl. pt. ops.
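Pitfalls 2 and 4 from the list above can be seen in a few lines of C++. The program below is an added sketch (the value x = 1e-8 is just an arbitrary small argument): cos(0.5*M_PI) is not exactly zero because M_PI is only the closest double to π, and computing 1 − cos(x) directly loses all significant digits, while the algebraically equivalent form 2·sin²(x/2) does not.

#include <cmath>
#include <cstdio>

int main() {
    // M_PI is the closest double to pi, so cos(pi/2) is about 6.1e-17, not 0.
    std::printf("cos(pi/2)    = %g\n", std::cos(0.5 * M_PI));

    // Catastrophic cancellation: cos(1e-8) rounds to exactly 1.0,
    // so the direct difference is 0; the rearranged form keeps the answer.
    double x = 1.0e-8;
    double direct     = 1.0 - std::cos(x);
    double rearranged = 2.0 * std::sin(0.5 * x) * std::sin(0.5 * x);
    std::printf("1 - cos(x)   = %g\n", direct);       // 0
    std::printf("2 sin^2(x/2) = %g\n", rearranged);   // 5e-17, the correct value
    return 0;
}

(M_PI is provided by <cmath> on most toolchains; some compilers require a feature macro such as _USE_MATH_DEFINES.)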

1.5 SUMMARY

Computer arithmetic is a field of computer science that investigates how computers should represent numbers and perform operations on them. It includes integer arithmetic, fixed-point arithmetic, and the arithmetic this book focuses on: floating-point (FP) arithmetic. FP arithmetic is the common way computers approximate real numbers, and it is described in the IEEE-754 standard [IEE 08]. As in scientific notation, numbers are represented using an exponent and a significand, except that this significand has to fit on a certain amount of bits. As this number of bits (called precision) is limited, each operation may be inexact due to the rounding. This makes computer arithmetic sometimes inaccurate: the result of a long computation may be far from the mathematical result that would have been obtained if all the computations were correct. This also makes computer arithmetic unintuitive: for instance, the FP addition is not always associative.

1.6 KEYWORDS

 Order of magnitude: An order of magnitude is an exponential change of plus-or-minus 1 in the value of a quantity or unit. The term is generally used in conjunction with power-of-10.
 Non-terminating, non-repeating decimal: a decimal number that continues endlessly, with no group of digits repeating endlessly.
 Rounding Error: Because floating-point numbers have a limited number of digits, they cannot represent all real numbers accurately: when there are more digits than the format allows, the leftover ones are omitted - the number is rounded.
 Truncation Error: A number that is truncated or chopped at a given digit has all digits after that digit removed. For example, the decimal number 3.14159 truncated after two decimal digits becomes 3.14.
 Numeric Value: anything of, relating to, or containing numbers.

1.7 LEARNING ACTIVITY

1. Explain IEEE 754 Floating-Point Standard
--------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------

2. How can a binary number be represented in 14-bit floating-point form?
--------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------

1.8 UNIT END QUESTIONS

A. Descriptive Type Questions
1. What is Fixed Point?
2. Explain Normalized Floating point?
3. Discuss Multiplication of Floating point?
4. What are the consequences of arithmetic operations with normalized floating point numbers? Discuss.
5. Write a note on error in number representation.

B. Multiple Choice Questions: 1. The numbers written to the power of 10 in the representation of decimal numbers are called as _____ a) Height factors b) Size factors c) Scale factors d) None of the mentioned 2. If the decimal point is placed to the right of the first significant digit, then the number is called ________ a) Orthogonal b) Normalized c) Determinate d) None of the mentioned 3. _______ constitute the representation of the floating number. a) Sign b) Significant digits c) Scale factor d) All of the mentioned 4. The sign followed by the string of digits is called as ______ a) Significant b) Determinant c) Mantissa d) Exponent 5. In IEEE 32-bit representations, the mantissa of the fraction is said to occupy ______ bits. a) 24 b) 23 c) 20 d) 16 Answer: 1 – c; 2 - b; 3 - d; 4 -c; 5 - b; 31 CU IDOL SELF LEARNING MATERIAL (SLM)

1.9 REFERENCES  Rajaraman V., Computer Oriented Numerical Method. New Delhi: Prentice Hall.  Salaria R.S.A, Textbook of Statistical and Numerical Methods in Engineering, Delhi: Khanna Book Publishing Company.  Gupta S.P. and Kapoor, V.K. (2014). Fundamentals of Mathematical Statistics. Delhi: Sultan Chand and Sons.  Anderson (1990). Statistical Modelling .New York: McGraw Publishing House.  Gupta S.P. and Kapoor, V.K. (2015). Fundamentals of Applied statistics. Delhi: Sultan Chand & Sons.  Graybill (1990). Introduction to Statistics. New York: McGraw Publishing House.  J. H. Wilkinson, The Algebraic Eigenvalue Problem (Numerical Mathematics and Scientific Computation), Clarendon Press  Numerical Methods & Analysis – Engineering App – Google Play store  https://en.wikibooks.org/wiki/Statistics/Numerical_Methods/Numerical_Comparison_of_Statistic al_Software 32 CU IDOL SELF LEARNING MATERIAL (SLM)

UNIT 2- ITERATIVE METHODS Structure 2.0 Learning Objective 2.1 Introduction 2.2 Bisection 2.2.1 Bisection Method Algorithm 2.2.2 Bisection Method Example 2.3 False position, 2.4 Newton-Raphson methods, 2.5 Polynomial evaluation, 2.6 Solving polynomial equations (Bairstow's Method). 2.7 Summary 2.8 Keywords 2.9 Learning Activity 2.10 Unit End Questions 2.11 References 2.0 LEARNING OBJECTIVE: After studying this unit, you will be able to:  Explain the Different iterative methods like Bisection, False position, Newton-Raphson methods  Comprehend the Polynomial evaluation and solving polynomial equations using (Bairstow's Method). 2.1 INTRODUCTION The Iterative Method is a mathematical way of solving a problem which generates a sequence of approximations. This method is applicable for both linear and nonlinear problems with large number of variables. The word Iterative or Iteration refers to the technique that solve any linear system problems with successive approximation at each step. 2.2 BISECTION METHOD The bisection method is used to find the roots of a polynomial equation. It separates the interval and subdivides the interval in which the root of the equation lies. The principle behind this method is the intermediate theorem for continuous functions. It works by narrowing the gap between the positive and negative intervals until it closes in on the correct answer. This method narrows the gap by taking 33 CU IDOL SELF LEARNING MATERIAL (SLM)

the average of the positive and negative intervals. It is a simple method, and it is relatively slow. The bisection method is also known as the interval halving method, root-finding method, binary search method or dichotomy method.

Let us consider a continuous function "f" which is defined on the closed interval [a, b] and is given with f(a) and f(b) of different signs. Then, by the intermediate value theorem, there exists a point x belonging to (a, b) for which f(x) = 0.

2.2.1 Bisection Method Algorithm

Follow the below procedure to get the solution for the continuous function. For any continuous function f(x),
 Find two points, say a and b, such that a < b and f(a)*f(b) < 0
 Find the midpoint of a and b, say "t"
 t is the root of the given function if f(t) = 0; else follow the next step
 Divide the interval [a, b]
 If f(t)*f(b) < 0, let a = t
 Else if f(t)*f(a) < 0, let b = t
 Repeat the above three steps until f(t) = 0 or the interval becomes sufficiently small

The bisection method is an approximation method to find the roots of the given equation by repeatedly dividing the interval. This method will divide the interval until the resulting interval is found, which is extremely small.

2.2.2 Bisection Method Example

Example-1: Find a root of the equation f(x) = x^3 - x - 1 using the Bisection method

Solution: Here x^3 - x - 1 = 0. Let f(x) = x^3 - x - 1.

x      0    1    2
f(x)  -1   -1    5

1st iteration :
Here f(1) = -1 < 0 and f(2) = 5 > 0
∴ Now, Root lies between 1 and 2

x0 = = 1.5 35 f(x0) = f(1.5) = 0.875 > 0 2nd iteration : Here f(1)= - 1 < 0 and f(1.5) = 0.875 > 0 ∴ Now, Root lies between 1 and 1.5 x1 = =1.25 f(x1) = f(1.25) = – 0.29688 < 0 3rd iteration : Here f(1.25) = – 0.29688 < 0 and f(1.5) = 0.875 > 0 ∴ Now, Root lies between 1.25 and 1.5 x2 = = 1.375 f(x2) = f(1.375) = 0.22461 > 0 4th iteration : Here f(1.25) = – 0.29688 < 0 and f(1.375) = 0.22461 > 0 ∴ Now, Root lies between 1.25 and 1.375 x3 = =1.3125 f(x3) = f(1.3125) = – 0.05151 < 0 CU IDOL SELF LEARNING MATERIAL (SLM)

5th iteration : 36 Here f(1.3125) = – 0.05151 < 0 and f(1.375) = 0.22461 > 0 ∴ Now, Root lies between 1.3125 and 1.375 x4 = 1.3125 + 1.3752 = 1.34375 f(x4) = f(1.34375) = 0.08261 > 0 6th iteration : Here f(1.3125)= – 0.05151 < 0 and f(1.34375) = 0.08261 > 0 ∴ Now, Root lies between 1.3125 and 1.34375 x5= = 1.32812 f(x5) = f(1.32812) = 0.01458>0 7th iteration : Here f(1.3125)=-0.05151<0 and f(1.32812)=0.01458>0 ∴ Now, Root lies between 1.3125 and 1.32812 x6 = =1.32031 f(x6) = f(1.32031) = – 0.01871 < 0 8th iteration : Here f(1.32031)= – 0.01871 < 0 and f(1.32812) = 0.01458 > 0 ∴ Now, Root lies between 1.32031 and 1.32812 CU IDOL SELF LEARNING MATERIAL (SLM)

x7= =1.32422 37 f(x7)=f(1.32422)=-0.00213<0 9th iteration : Here f(1.32422) = – 0.00213 < 0 and f(1.32812) = 0.01458 > 0 ∴ Now, Root lies between 1.32422 and 1.32812 x8= =1.32617 f(x8)=f(1.32617)=0.00621>0 10th iteration : Here f(1.32422) = – 0.00213<0 and f(1.32617) = 0.00621>0 ∴ Now, Root lies between 1.32422 and 1.32617 x9= =1.3252 f(x9) = f (1.3252) = 0.00204 > 0 11th iteration : Here f(1.32422)= – 0.00213 < 0 and f(1.3252) = 0.00204 > 0 ∴ Now, Root lies between 1.32422 and 1.3252 x10= = 1.32471 CU IDOL SELF LEARNING MATERIAL (SLM)

f(x10) = f(1.32471) = – 0.00005 < 0

Approximate root of the equation x^3 - x - 1 = 0 using the Bisection method is 1.32471

Example 2: Determine the root of the given equation x^2 - 3 = 0 for x ∈ [1, 2]

Solution: Given: x^2 - 3 = 0. Let f(x) = x^2 - 3. Now, find the value of f(x) at a = 1 and b = 2.
f(x=1) = 1^2 - 3 = 1 – 3 = -2 < 0
f(x=2) = 2^2 - 3 = 4 – 3 = 1 > 0
The given function is continuous, and the root lies in the interval [1, 2]. Let "t" be the midpoint of the interval, i.e., t = (1+2)/2 = 3/2 = 1.5. Therefore, the value of the function at "t" is
f(t) = f(1.5) = (1.5)^2 - 3 = 2.25 – 3 = -0.75 < 0
f(t) is negative and has the same sign as f(a), so a is replaced with t = 1.5 for the next iteration. The iterations for the given function are:

So, at the seventh iteration, we get the final interval [1.7266, 1.7344]. Hence, 1.7344 is the approximated solution.

Example 3: Find the root of f(x) = x^2 - 3. Let εstep = 0.01, εabs = 0.01 and start with the interval [1, 2].

Solution:

Table 1. Bisection method applied to f(x) = x^2 - 3.

a        b        f(a)      f(b)     c = (a + b)/2   f(c)      Update   new b − a
1.0      2.0      -2.0      1.0      1.5             -0.75     a = c    0.5
1.5      2.0      -0.75     1.0      1.75            0.0625    b = c    0.25
1.5      1.75     -0.75     0.0625   1.625           -0.3594   a = c    0.125
1.625    1.75     -0.3594   0.0625   1.6875          -0.1523   a = c    0.0625
1.6875   1.75     -0.1523   0.0625   1.7188          -0.0457   a = c    0.0313
1.7188   1.75     -0.0457   0.0625   1.7344          0.0081    b = c    0.0156
1.7188   1.7344   -0.0457   0.0081   1.7266          -0.0189   a = c    0.0078

Thus, with the seventh iteration, we note that the final interval, [1.7266, 1.7344], has a width less than 0.01 and |f(1.7344)| < 0.01, and therefore we chose b = 1.7344 to be our approximation of the root.

Example 4: Find the root of f(x) = e^-x (3.2 sin(x) - 0.5 cos(x)) on the interval [3, 4], this time with εstep = 0.001, εabs = 0.001.

Solution:

Table 2. Bisection method applied to f(x) = e^-x (3.2 sin(x) - 0.5 cos(x)).

a        b        f(a)          f(b)           c = (a + b)/2   f(c)           Update   new b − a
3.0      4.0      0.047127      -0.038372      3.5             -0.019757      b = c    0.5
3.0      3.5      0.047127      -0.019757      3.25            0.0058479      a = c    0.25
3.25     3.5      0.0058479     -0.019757      3.375           -0.0086808     b = c    0.125
3.25     3.375    0.0058479     -0.0086808     3.3125          -0.0018773     b = c    0.0625
3.25     3.3125   0.0058479     -0.0018773     3.2812          0.0018739      a = c    0.0313
3.2812   3.3125   0.0018739     -0.0018773     3.2968          -0.000024791   b = c    0.0156
3.2812   3.2968   0.0018739     -0.000024791   3.289           0.00091736     a = c    0.0078
3.289    3.2968   0.00091736    -0.000024791   3.2929          0.00044352     a = c    0.0039
3.2929   3.2968   0.00044352    -0.000024791   3.2948          0.00021466     a = c    0.002
3.2948   3.2968   0.00021466    -0.000024791   3.2958          0.000094077    a = c    0.001
3.2958   3.2968   0.000094077   -0.000024791   3.2963          0.000034799    a = c    0.0005

Thus, after the 11th iteration, we note that the final interval, [3.2958, 3.2968], has a width less than 0.001 and |f(3.2968)| < 0.001, and therefore we chose b = 3.2968 to be our approximation of the root.
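A compact implementation of the procedure from 2.2.1 is sketched below in C++ (an added illustration, not code from the text; the function names are arbitrary). Applied to Example 1, f(x) = x^3 - x - 1 on [1, 2], it returns the same root, about 1.3247.

#include <cstdio>

// Bisection: assumes f is continuous on [a, b] and f(a)*f(b) < 0.
double bisect(double (*f)(double), double a, double b, double tol, int max_iter) {
    double fa = f(a);
    for (int i = 0; i < max_iter && (b - a) > tol; ++i) {
        double t  = 0.5 * (a + b);     // midpoint of the current interval
        double ft = f(t);
        if (ft == 0.0) return t;       // exact root found
        if (fa * ft < 0.0) {           // root lies in [a, t]
            b = t;
        } else {                       // root lies in [t, b]
            a = t;
            fa = ft;
        }
    }
    return 0.5 * (a + b);
}

double g(double x) { return x * x * x - x - 1.0; }   // f(x) from Example 1

int main() {
    double root = bisect(g, 1.0, 2.0, 1e-5, 100);
    std::printf("root of x^3 - x - 1 = 0 is about %.5f\n", root);   // about 1.32472
    return 0;
}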

lies in (a, c) or (c, b) we repeat the procedure. If x0, x1, x2…… are the sequence of approximations, then we stop the iteration when | xk+1 − xk| < given error tolerance. Fig 2.1: Graphical representation of Method of False Position The equation of line chord joining (a, f(a)), (b, f(b)) is Setting y = 0, we set the point of intersection with X-axis as givens If we denote x0 = a, x1 = b, then the iteration formula can be written as The rate of convergence is still linear but faster than that of the bisection method. 41 Both these methods will fail if f has a double root. CU IDOL SELF LEARNING MATERIAL (SLM)

Example 1: Obtain the positive root of the equation by Regua Falsi method Solution: Let f(x)= . Since f (0) = , f (2) = 3, Let us take that the root lies in (0, 2). We have x0 = 0, x1 = 2. Then, using (2), we get We obtain the next approximations as x5 = 0.9756, x6 = 0.9918, x7 = 0.9973, x8 = 0.9990. Since, |x8 – x7| = 0.0017 < 0.005, the approximation x8 = 0.9990 is correct to decimal places. Note that in this problem, the lower end of the interval tends to the root, and the minimum error tends to zero, but the upper limit and maximum error remain fixed. In other problems, the opposite may happen. This is the property to the regula falsi method. Example-2: Find a root of an equation f(x) = x3 – x – 1 using False Position method Solution: Here x3 – x – 1 = 0 Let f(x)=x3 – x – 1 Here 0 12 x -1 -1 5 f(x) 1st iteration : 42 CU IDOL SELF LEARNING MATERIAL (SLM)

Here f(1) = – 1< 0 and f(2) = 5 > 0 43 ∴ Now, Root lies between x0=1 and x1=2 x2 = x0 – f(x0)⋅ x2 = 1 – (–1)⋅ x2 = 1.16667 f(x2) = f (1.16667) = – 0.5787 < 0 2nd iteration : Here f (1.16667) = –0.5787 < 0 and f (2) = 5 > 0 ∴ Now, Root lies between x0=1.16667 and x1=2 x3 = x0 – f(x0)⋅ x3 = 1.16667 – (–0.5787)⋅ x3 = 1.25311 f(x3) = f(1.25311) = –0.28536<0 3rd iteration: Here f(1.25311) = –0.28536 < 0 and f (2) = 5 > 0 ∴ Now, Root lies between x0=1.25311 and x1=2 CU IDOL SELF LEARNING MATERIAL (SLM)

x4= x0 – f(x0)⋅ 44 x4 = 1.25311– (–0.28536)⋅ x4 = 1.29344 f(x4) = f(1.29344) = –0.12954 < 0 4th iteration : Here f(1.29344)= –0.12954 < 0 and f(2) = 5 > 0 ∴ Now, Root lies between x0=1.29344 and x1=2 x5= x0 – f(x0)⋅ x5 = 1.29344– (–0.12954)⋅ x5=1.31128 f(x5)=f (1.31128)= –0.05659 < 0 5th iteration : Here f(1.31128) = –0.05659 < 0and f(2) = 5 > 0 ∴ Now, Root lies between x0=1.31128 and x1=2 x6= x0 – f(x0)⋅ CU IDOL SELF LEARNING MATERIAL (SLM)

x6 = 1.31128– (–0.05659) ⋅ f(x6) = f(1.31899) = –0.0243 < 0 6th iteration : Here f(1.31899) = –0.0243 < 0 and f(2) = 5>0 ∴ Now, Root lies between x0=1.31899 and x1=2 x7 = x0 – f(x0)⋅ x7 = 1.31899– (–0.0243) ⋅ x7 = 1.32228 f(x7) = f(1.32228) = –0.01036 < 0 7th iteration : Here f(1.32228)= –0.01036 < 0and f(2) = 5 > 0 ∴ Now, Root lies between x0 =1.32228 and x1 =2 x8= x0 – f(x0)⋅ x7 = 1.32228– (–0.01036) ⋅ x8 =1.32368 f(x8) = f (1.32368) = – 0.0044 <0 8th iteration: 45 CU IDOL SELF LEARNING MATERIAL (SLM)

Here f (1.32368) = – 0.0044 < 0 and f(2) = 5 > 0 46 ∴ Now, Root lies between x0 =1.32368 and x1 =2 x9= x0 – f(x0)⋅ x9 = 1.32368– (–0.0044) ⋅ x9 =1.32428 f(x9) = f (1.32428) = –0.00187 < 0 9th iteration: Here f (1.32428) = –0.00187 < 0 and f (2) = 5 > 0 ∴ Now, Root lies between x0 =1.32428 and x1=2 x10 = x0 – f(x0) ⋅ x10 = 1.32428– (–0.00187) ⋅ x10 =1.32453 f(x10) = f (1.32453) = –0.00079<0 10th iteration: Here f (1.32453) =-0.00079<0 and f (2) =5>0 ∴ Now, Root lies between x0 =1.32453 and x1 =2 CU IDOL SELF LEARNING MATERIAL (SLM)

x11 = x0 – f(x0) ⋅ x11 =1.32453– (– 0.00079). x11 = 1.32464 f(x11) = f (1.32464) = –0.00034 < 0 Approximate root of the equation x3-x-1=0 using False Position method is 1.32464 2.5 NEWTON-RAPHSON METHOD This method is also called Newton-Raphson method. We assume that f is a differentiable function in some interval [a, b] containing the root. The graph of y = f(x) is plotted in Figure 2.2 (a). The point of intersection x = r, is the required root. 47 CU IDOL SELF LEARNING MATERIAL (SLM)

Fig 2.2 (a): Graph of y = f(x) Let x0 be an initial approximation of r. Then, (x0, f(x0)) is a point as the curve Fig 2.2 (b): (x0, f(x0)) point on the curve Draw the tangent line to the curve at the point (x0 , f(x0 )) . This line intersects the x-axis at a new point, say x1. Now, x1 is a better approximation to r, than x0 . We now repeat this process, yielding new points x2 , x3 , ... until we are “close enough” to r. Figure2.2( b) shows one more iteration of this process, determining x2 Fig 2.2 (c): One more iteration determining x2 Now, we derive this method algebraically. The equation of the tangent at (x0, f(x0)) is given by y – f(x0) =f’(x0)(x – x0) 48 CU IDOL SELF LEARNING MATERIAL (SLM)

Where f'(x0) is the slope of the curve at (x0, f(x0)). Setting y = 0, we get the point of intersection of the tangent with the x-axis as
x = x0 − f(x0) / f'(x0)
But this is our next approximation, that is
x1 = x0 − f(x0) / f'(x0)
Iterating this process, we get the Newton-Raphson method as
xk+1 = xk − f(xk) / f'(xk), k = 0, 1, 2, ...

Example 1: Find the smallest positive root of x^7 + 9x^5 − 13x − 17 = 0.

Solution: Let f(x) = x^7 + 9x^5 − 13x − 17. We have f(0) < 0, f(1) < 0 and f(1)f(2) < 0. Hence, the smallest positive root lies in (1, 2). We can take any value in (1, 2) or one of the end points as the initial approximation. Let x0 = 1; we have f'(x) = 7x^6 + 45x^4 − 13. The Newton-Raphson method becomes
xk+1 = xk − (xk^7 + 9xk^5 − 13xk − 17) / (7xk^6 + 45xk^4 − 13)
Starting with x0 = 1, we obtain the values given in the table. After 6 iterations of Newton's method, the iterates have converged to the root.
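A direct implementation of this iteration is sketched below in C++ (an added illustration; the stopping tolerance of 1e-6 is an arbitrary choice). Applied to Example 1 with x0 = 1, the iterates settle near 1.258 after about six steps.

#include <cmath>
#include <cstdio>

// Newton-Raphson iteration x_{k+1} = x_k - f(x_k) / f'(x_k) for Example 1.
double f(double x)  { return std::pow(x, 7) + 9.0 * std::pow(x, 5) - 13.0 * x - 17.0; }
double fp(double x) { return 7.0 * std::pow(x, 6) + 45.0 * std::pow(x, 4) - 13.0; }

int main() {
    double x = 1.0;                               // initial approximation x0
    for (int k = 1; k <= 20; ++k) {
        double x_next = x - f(x) / fp(x);
        std::printf("x%d = %.6f\n", k, x_next);
        if (std::fabs(x_next - x) < 1e-6) break;  // successive values agree
        x = x_next;
    }
    return 0;
}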


Like this book? You can publish your book online for free in a few minutes!
Create your own flipbook