or an integer between 1 and the sample size. Thus, this random variable can take on only a finite number of values with nonzero probability. A random variable of this type is said to be discrete.

DEFINITION 3.1
A random variable Y is said to be discrete if it can assume only a finite or countably infinite¹ number of distinct values.

A less formidable characterization of discrete random variables can be obtained by considering some practical examples. The number of bacteria per unit area in the study of drug control on bacterial growth is a discrete random variable, as is the number of defective television sets in a shipment of 100 sets. Indeed, discrete random variables often represent counts associated with real phenomena.

Let us now consider the relation of the material in Chapter 2 to this chapter. Why study the theory of probability? The answer is that the probability of an observed event is needed to make inferences about a population. The events of interest are often numerical events that correspond to values of discrete random variables. Hence, it is imperative that we know the probabilities of these numerical events.

Because certain types of random variables occur so frequently in practice, it is useful to have at hand the probability for each value of a random variable. This collection of probabilities is called the probability distribution of the discrete random variable. We will find that many experiments exhibit similar characteristics and generate random variables with the same type of probability distribution.
Consequently, knowledge of the probability distributions for random variables associated with common types of experiments will eliminate the need for solving the same probability problems over and over again.

3.2 The Probability Distribution for a Discrete Random Variable

Notationally, we will use an uppercase letter, such as Y, to denote a random variable and a lowercase letter, such as y, to denote a particular value that a random variable may assume. For example, let Y denote any one of the six possible values that could be observed on the upper face when a die is tossed. After the die is tossed, the number actually observed will be denoted by the symbol y. Note that Y is a random variable, but the specific observed value, y, is not random.

The expression (Y = y) can be read, "the set of all points in S assigned the value y by the random variable Y." It is now meaningful to talk about the probability that Y takes on the value y, denoted by P(Y = y). As in Section 2.11, this probability is defined as the sum of the probabilities of appropriate sample points in S.

1. Recall that a set of elements is countably infinite if the elements in the set can be put into one-to-one correspondence with the positive integers.
Chapter 3 Discrete Random Variables and Their Probability Distributions

DEFINITION 3.2
The probability that Y takes on the value y, P(Y = y), is defined as the sum of the probabilities of all sample points in S that are assigned the value y. We will sometimes denote P(Y = y) by p(y).

Because p(y) is a function that assigns probabilities to each value y of the random variable Y, it is sometimes called the probability function for Y.

DEFINITION 3.3
The probability distribution for a discrete variable Y can be represented by a formula, a table, or a graph that provides p(y) = P(Y = y) for all y.

Notice that p(y) ≥ 0 for all y, but the probability distribution for a discrete random variable assigns nonzero probabilities to only a countable number of distinct y values. Any value y not explicitly assigned a positive probability is understood to be such that p(y) = 0. We illustrate these ideas with an example.

EXAMPLE 3.1
A supervisor in a manufacturing plant has three men and three women working for him. He wants to choose two workers for a special job. Not wishing to show any biases in his selection, he decides to select the two workers at random. Let Y denote the number of women in his selection. Find the probability distribution for Y.

Solution
The supervisor can select two workers from six in \(\binom{6}{2} = 15\) ways. Hence, S contains 15 sample points, which we assume to be equally likely because random sampling was employed. Thus, \(P(E_i) = 1/15\), for i = 1, 2, . . . , 15. The values for Y that have nonzero probability are 0, 1, and 2. The number of ways of selecting Y = 0 women is \(\binom{3}{0}\binom{3}{2}\) because the supervisor must select zero workers from the three women and two from the three men. Thus, there are \(\binom{3}{0}\binom{3}{2} = 1 \cdot 3 = 3\) sample points in the event Y = 0, and

\[ p(0) = P(Y = 0) = \frac{\binom{3}{0}\binom{3}{2}}{15} = \frac{3}{15} = \frac{1}{5}. \]

Similarly,

\[ p(1) = P(Y = 1) = \frac{\binom{3}{1}\binom{3}{1}}{15} = \frac{9}{15} = \frac{3}{5}, \]

\[ p(2) = P(Y = 2) = \frac{\binom{3}{2}\binom{3}{0}}{15} = \frac{3}{15} = \frac{1}{5}. \]

Notice that (Y = 1) is by far the most likely outcome.
This should seem reasonable since the number of women equals the number of men in the original group.

The probability distribution of the random variable Y considered in Example 3.1 is summarized in Table 3.1. The same distribution is given in graphical form in Figure 3.1. If we regard the width of each bar in Figure 3.1 as one unit, then the area in a bar is equal to the probability that Y takes on the value over which the bar is centered. This concept of areas representing probabilities was introduced in Section 1.2.

Table 3.1 Probability distribution for Example 3.1

y    p(y)
0    1/5
1    3/5
2    1/5

FIGURE 3.1 Probability histogram for Table 3.1 [bars of height 1/5, 3/5, and 1/5 centered at y = 0, 1, and 2; vertical axis p(y), horizontal axis y]

The most concise method of representing discrete probability distributions is by means of a formula. For Example 3.1 we see that the formula for p(y) can be written as

\[ p(y) = \frac{\binom{3}{y}\binom{3}{2-y}}{\binom{6}{2}}, \qquad y = 0, 1, 2. \]

Notice that the probabilities associated with all distinct values of a discrete random variable must sum to 1. In summary, the following properties must hold for any discrete probability distribution:

THEOREM 3.1
For any discrete probability distribution, the following must be true:
1. 0 ≤ p(y) ≤ 1 for all y.
2. \(\sum_y p(y) = 1\), where the summation is over all values of y with nonzero probability.

As mentioned in Section 1.5, the probability distributions we derive are models, not exact representations, for the frequency distributions of populations of real data that occur (or would be generated) in nature. Thus, they are models for real distributions of data similar to the distributions discussed in Chapter 1. For example, if we were to randomly select two workers from among the six described in Example 3.1, we would observe a single y value. In this instance the observed y value would be 0, 1, or 2. If the experiment were repeated many times, many y values would be generated. A relative frequency histogram for the resulting data, constructed in the manner described in Chapter 1, would be very similar to the probability histogram of Figure 3.1.
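The repeated-sampling experiment just described can be sketched in Python. This is only an illustration under stated assumptions, not part of the text's derivation: the worker labels, the function names `exact_pmf` and `simulate_pmf`, the seed, and the number of trials are all hypothetical choices made for the example. The first function evaluates the formula for p(y) from Example 3.1 exactly; the second draws two workers at random many times and records the relative frequency of each value of Y.

```python
import random
from collections import Counter
from fractions import Fraction
from math import comb


def exact_pmf():
    """Exact p(y) for Example 3.1: y women among 2 workers drawn from 3 women and 3 men."""
    total = comb(6, 2)  # 15 equally likely samples of two workers
    return {y: Fraction(comb(3, y) * comb(3, 2 - y), total) for y in range(3)}


def simulate_pmf(trials=100_000, seed=0):
    """Relative frequencies of Y over repeated random selections (illustrative setup)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    workers = ["W1", "W2", "W3", "M1", "M2", "M3"]  # hypothetical labels: W = woman, M = man
    counts = Counter()
    for _ in range(trials):
        chosen = rng.sample(workers, 2)  # select two workers without replacement
        counts[sum(name.startswith("W") for name in chosen)] += 1
    return {y: counts[y] / trials for y in range(3)}


if __name__ == "__main__":
    exact = exact_pmf()
    approx = simulate_pmf()
    for y in range(3):
        print(f"y={y}  p(y)={exact[y]}  relative frequency={approx[y]:.4f}")
```

With a large number of trials, the relative frequencies should lie close to 1/5, 3/5, and 1/5, and both sets of values sum to 1, as Theorem 3.1 requires.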
Such simulation studies are very useful. By repeating some experiments over and over again, we can generate measurements of discrete random variables that possess frequency distributions very similar to the probability distributions derived in this chapter, reinforcing the conviction that our models are quite accurate.

Exercises

3.1 When the health department tested private wells in a county for two impurities commonly found in drinking water, it found that 20% of the wells had neither impurity, 40% had impurity A, and 50% had impurity B. (Obviously, some had both impurities.) If a well is randomly chosen from those in the county, find the probability distribution for Y, the number of impurities found in the well.

3.2 You and a friend play a game where you each toss a balanced coin. If the upper faces on the coins are both tails, you win $1; if the faces are both heads, you win $2; if the coins do not match (one shows a head, the other a tail), you lose $1 (win (−$1)). Give the probability distribution for your winnings, Y, on a single play of this game.

3.3 A group of four components is known to contain two defectives. An inspector tests the components one at a time until the two defectives are located. Once she locates the two defectives, she stops testing, but the second defective is tested to ensure accuracy. Let Y denote the number of the test on which the second defective is found. Find the probability distribution for Y.

3.4 Consider a system of water flowing through valves from A to B. (See the accompanying diagram.) Valves 1, 2, and 3 operate independently, and each correctly opens on signal with probability .8. Find the probability distribution for Y, the number of open paths from A to B after the signal is given. (Note that Y can take on the values 0, 1, and 2.)
[Diagram for Exercise 3.4: two paths from A to B, one through valve 1 and the other through valves 2 and 3.]

3.5 A problem in a test given to small children asks them to match each of three pictures of animals to the word identifying that animal. If a child assigns the three words at random to the three pictures, find the probability distribution for Y, the number of correct matches.

3.6 Five balls, numbered 1, 2, 3, 4, and 5, are placed in an urn. Two balls are randomly selected from the five, and their numbers noted. Find the probability distribution for the following:
a The largest of the two sampled numbers
b The sum of the two sampled numbers

3.7 Each of three balls is randomly placed into one of three bowls. Find the probability distribution for Y = the number of empty bowls.

3.8 A single cell can either die, with probability .1, or split into two cells, with probability .9, producing a new generation of cells. Each cell in the new generation dies or splits into two cells independently with the same probabilities as the initial cell. Find the probability distribution for the number of cells in the next generation.
3.9 In order to verify the accuracy of their financial accounts, companies use auditors on a regular basis to verify accounting entries. The company’s employees make erroneous entries 5% of the time. Suppose that an auditor randomly checks three entries.
a Find the probability distribution for Y, the number of errors detected by the auditor.
b Construct a probability histogram for p(y).
c Find the probability that the auditor will detect more than one error.

3.10 A rental agency, which leases heavy equipment by the day, has found that one expensive piece of equipment is leased, on the average, only one day in five. If rental on one day is independent of rental on any other day, find the probability distribution of Y, the number of days between a pair of rentals.

3.11 Persons entering a blood bank are such that 1 in 3 have type O+ blood and 1 in 15 have type O− blood. Consider three randomly selected donors for the blood bank. Let X denote the number of donors with type O+ blood and Y denote the number with type O− blood. Find the probability distributions for X and Y. Also find the probability distribution for X + Y, the number of donors who have type O blood.

3.3 The Expected Value of a Random Variable or a Function of a Random Variable

We have observed that the probability distribution for a random variable is a theoretical model for the empirical distribution of data associated with a real population. If the model is an accurate representation of nature, the theoretical and empirical distributions are equivalent. Consequently, as in Chapter 1, we attempt to find the mean and the variance for a random variable and thereby to acquire numerical descriptive measures, parameters, for the probability distribution p(y) that are consistent with those discussed in Chapter 1.

DEFINITION 3.4
Let Y be a discrete random variable with the probability function p(y). Then the expected value of Y, E(Y), is defined to be²

\[ E(Y) = \sum_y y\,p(y). \]

If p(y) is an accurate characterization of the population frequency distribution, then E(Y) = μ, the population mean.

2. To be precise, the expected value of a discrete random variable is said to exist if the sum, as given earlier, is absolutely convergent, that is, if

\[ \sum_y |y|\,p(y) < \infty. \]

This absolute convergence will hold for all examples in this text and will not be mentioned each time an expected value is defined.

Definition 3.4 is completely consistent with the definition of the mean of a set of measurements that was given in Definition 1.1. For example, consider a discrete