Psych of Learning and Behavior

CHAPTER 6 Operant Conditioning: Introduction

QUICK QUIZ F

1. The word positive, when combined with the words reinforcement or punishment, means only that the behavior is followed by the ______________ of something. The word negative, when combined with the words reinforcement or punishment, means only that the behavior is followed by the ______________ of something.

2. The word positive, when combined with the words reinforcement or punishment, (does/does not) ______________ mean that the consequence is good or pleasant. Similarly, the term negative, when combined with the words reinforcement or punishment, (does/does not) ______________ mean that the consequence is bad or unpleasant.

3. Within the context of reinforcement and punishment, positive refers to the (addition/subtraction) ________________ of something, and negative refers to the (addition/subtraction) ______________ of something.

4. Reinforcement is related to a(n) (increase/decrease) _____________ in behavior, whereas punishment is related to a(n) (increase/decrease) ______________ in behavior.

Positive Reinforcement

Positive reinforcement consists of the presentation of a stimulus (one that is usually considered pleasant or rewarding) following a response, which then leads to an increase in the future strength of that response. Loosely speaking, the behavior results in the delivery of something the recipient likes, so the person or animal is more likely to behave that way in the future. Some of the earlier illustrations have been examples of positive reinforcement. The standard rat procedure in which lever pressing produces food is an example of positive reinforcement because the consequence of food leads to an increase in lever pressing. It is reinforcement because the behavior increases in frequency, and it is positive reinforcement because the consequence involves the presentation of something, namely food (which we would call a positive reinforcer).
Here are some additional examples of positive reinforcement:

Turn on TV → See the show (R → SR)
Smile at person → The person smiles at you (R → SR)
Order coffee → Receive coffee (R → SR)
Study diligently for quiz → Obtain an excellent mark (R → SR)
Compliment partner → Receive a kiss (R → SR)

Negative Reinforcement

Negative reinforcement is the removal of a stimulus (one that is usually considered unpleasant or aversive) following a response, which then leads to an increase in the future strength of that response. Loosely speaking, the behavior results in the prevention or removal of something the person or animal hates, so the subject is more likely to behave that way in the future. For example, if by pressing a lever a rat terminates an electric shock that it is receiving, it will become more likely to press the lever the next time it receives an electric shock. This is an example of reinforcement because the behavior increases in strength; it is negative reinforcement because the consequence consists of taking something away. Here are some additional examples:

Open umbrella → Escape rain (R → SR)
Claim illness → Avoid writing an exam (R → SR)
Take aspirin → Eliminate headache (R → SR)
Turn on the heater → Escape the cold (R → SR)

The last example is interesting because it illustrates how it is sometimes a matter of interpretation as to whether something is an example of negative reinforcement or positive reinforcement. Does the person turn on the heater to escape the cold (negative reinforcement) or to obtain warmth (positive reinforcement)? Either interpretation would be correct.

Negative reinforcement involves two types of behavior: escape and avoidance. Escape behavior results in the termination (stopping) of an aversive stimulus. In the example of the person getting rained on, by opening the umbrella the person stops this from happening. Likewise, taking aspirin removes a headache, and turning on the heater allows one to escape the cold. Avoidance is similar to escape except that avoidance behavior occurs before the aversive stimulus is presented and therefore prevents its delivery. For example, if the umbrella were opened before stepping out into the rain, the person would avoid getting rained on. And by pretending to be ill, a student avoids having to write an exam. Escape and avoidance are discussed in more detail in Chapter 9.

QUICK QUIZ G

1. When you reached toward the dog, he nipped at your hand. You quickly pulled your hand back.
As a result, he now nips at your hand whenever you reach toward him. The consequence for the dog's behavior of nipping consisted of the (presentation/removal) ________________ of a stimulus (namely, your hand), and his behavior of nipping subsequently (increased/decreased) _________________ in frequency; therefore, this is an example of __________________ reinforcement.

2. When the dog sat at your feet and whined during breakfast one morning, you fed him. As a result, he sat at your feet and whined during breakfast the next morning. The consequence for the dog's whining consisted of the (presentation/removal) _______________ of a stimulus, and his behavior of whining subsequently (increased/decreased) ________________ in frequency; therefore, this is an example of ______________ reinforcement.

3. Karen cries while saying to her boyfriend, "John, I don't feel as though you love me." John gives Karen a big hug saying, "That's not true, dear, I love you very much." If John's hug is a reinforcer, Karen is (more/less) ___________________ likely to cry the next time she feels insecure about her relationship. More specifically, this is an example of __________________ reinforcement of Karen's crying behavior.

4. With respect to escape and avoidance, an ______________ response is one that terminates an aversive stimulus, while an ______________ response is one that prevents an aversive stimulus from occurring. Escape and avoidance responses are two classes of behavior that are maintained by (positive/negative) ______________ reinforcement.

5. Turning down the heat because you are too hot is an example of an (escape/avoidance) ________________ response; turning it down before you become too hot is an example of an (escape/avoidance) ________________ response.

Positive Punishment

Positive punishment consists of the presentation of a stimulus (one that is usually considered unpleasant or aversive) following a response, which then leads to a decrease in the future strength of that response. Loosely speaking, the behavior results in the delivery of something the person or animal hates, so the subject is less likely to behave that way in the future. For example, if a rat received a shock when it pressed a lever, it would stop pressing the lever. This is an example of punishment because the behavior decreases in strength, and it is positive punishment because the consequence involves the presentation of something (i.e., shock).
Consider some further examples of positive punishment:

Talk back to the boss → Get reprimanded (R → SP)
Swat at the wasp → Get stung (R → SP)
Meow constantly → Get sprayed with water (R → SP)

In each case, the behavior is followed by the presentation of an aversive stimulus, with the result that there is a decrease in the future probability of the behavior.

People frequently confuse positive punishment with negative reinforcement. One reason for this is the fact that many behaviorists, including Skinner, use the term negative reinforcer to refer to an aversive (unpleasant) stimulus and the term positive reinforcer to refer to an appetitive (pleasant) stimulus. Unfortunately, people with less knowledge of the field have then assumed that the presentation of a negative reinforcer is an instance of negative reinforcement, which it is not. Within the framework presented here, it is instead an instance of positive punishment.

Negative Punishment

Negative punishment consists of the removal of a stimulus (one that is usually considered pleasant or rewarding) following a response, which then leads to a decrease in the future strength of that response. Loosely speaking, the behavior results in the removal of something the person or animal likes, so the subject is less likely to behave that way in the future. Here are some examples of negative punishment:

Stay out past curfew → Lose car privileges (R → SP)
Argue with boss → Lose job (R → SP)
Play with food → Lose dessert (R → SP)
Tease sister → Sent to room (loss of social contact) (R → SP)

In each case, it is punishment because the behavior decreases in strength, and it is negative punishment because the consequence consists of the removal of something. The last example is known as "time-out" and is employed by many parents as a replacement for spanking. Removal of social contact is usually one consequence of such a procedure; more generally, however, the child loses the opportunity to receive any type of positive reinforcer during the time-out interval. Children find such situations to be quite unpleasant, with the result that even very brief time-outs can be quite effective.

Consider another example of negative punishment: Jonathan's girlfriend, who is quite jealous, completely ignored him (withdrew her attention from him) when she observed him having a conversation with another woman at a party. As a result, he stopped talking to the other women at the party.

Jonathan talks to other women → His girlfriend ignores him (R → SP)

Jonathan's behavior of talking to other women at parties has been negatively punished. It is punishment in that the frequency with which he talked to other women at the party declined, and it is negative punishment because the consequence that produced that decline was the withdrawal of his girlfriend's attention.

Question: In this scenario, Jonathan's behavior has been negatively punished. But what contingencies are operating on the girlfriend's behavior? When she ignored him, he stopped talking to other women at the party. Given that this occurred, she might ignore him at future parties if she again sees him talking to other women. If so, her behavior has been negatively reinforced by the fact that it was effective in getting him to stop doing something that she disliked. If we diagram this interaction from the perspective of each person, we get the following:

For Jonathan: I talk to other women → My girlfriend ignores me (R → SP)

For his girlfriend: I ignore Jonathan → He stops talking to other women (R → SR)

As you can see, a reduction in one person's behavior as a result of punishment can negatively reinforce the behavior of the person who implemented the punishment. This is the reason we are so often enticed to use punishment: Punishment is often successful in immediately getting a person to stop behaving in ways that we dislike. That success then reinforces our tendency to use punishment in the future, which of course can create major problems in the long run. We discuss the uses and abuses of punishment more fully in Chapter 9.

Some students mistakenly equate behaviorism with the use of punishment. It is important to recognize that behaviorists actually emphasize the use of positive reinforcement. Indeed, Skinner (1953) believed that many societal problems can be traced to the overuse of punishment as well as negative reinforcement. For example, teachers too often control their students by attempting to punish maladaptive behavior rather than by reinforcing adaptive behavior. Moreover, the educational system in general is designed in such a way that students too often study to avoid failure (a negative reinforcer) rather than to obtain knowledge (a positive reinforcer). As a result, schooling is often more onerous and less effective than it could be.

Similarly, in interpersonal relationships, people too often attempt to change each other's behavior through the use of aversive consequences, such as complaining, when positive reinforcement for appropriate behavior might work just as well or better. Marsha, for example, says that Roger forgets to call whenever he is going to be late, even though she often complains about it. Perhaps a more effective approach would be for her to express her appreciation when he does call.
Furthermore, although many people believe that the key to a great relationship is open communication, research has shown that a much more important element is the ratio of positive (pleasant) interactions to negative (aversive) interactions. In fact, one of the best predictors of a successful marriage is when the positives outweigh the negatives by a ratio of about five to one (Gottman, 1994). Even volatile relationships, in which there seems to be an enormous amount of bickering, can thrive if the number of positive exchanges, such as teasing, hugging, and praising, greatly outweighs the number of negative exchanges.

To help strengthen your understanding of the four types of contingencies (positive reinforcement, negative reinforcement, positive punishment, and negative punishment) and deal with examples that are potentially confusing, see also "Four Types of Contingencies: Tricky Examples" in the And Furthermore box.

(Footnote 5: The labels for the two types of punishment are not standardized. For example, positive and negative punishment are sometimes called Type 1 and Type 2 punishment (e.g., Chance, 1994) or punishment by contingent application and punishment by contingent withdrawal (e.g., L. Miller, 1997).)
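For readers who like to see the logic spelled out, the four contingencies reduce to answering two questions: was a stimulus presented or removed following the response, and did the response subsequently increase or decrease in frequency? A minimal sketch in Python (the function name and labels are our own illustration, not part of the text):

```python
def classify_contingency(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant contingency from two observations.

    stimulus_change: "presented" or "removed"  -> positive vs. negative
    behavior_change: "increases" or "decreases" -> reinforcement vs. punishment
    """
    sign = {"presented": "positive", "removed": "negative"}[stimulus_change]
    effect = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {effect}"

# Lever press produces food; pressing increases:
print(classify_contingency("presented", "increases"))  # positive reinforcement
# Lever press terminates shock; pressing increases:
print(classify_contingency("removed", "increases"))    # negative reinforcement
# Lever press produces shock; pressing decreases:
print(classify_contingency("presented", "decreases"))  # positive punishment
# Staying out past curfew loses car privileges; staying out decreases:
print(classify_contingency("removed", "decreases"))    # negative punishment
```

Note that the classification depends only on what actually happened to the stimulus and the behavior, not on whether the consequence feels good or bad, which is exactly the point the quiz items above drill.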

And Furthermore

Four Types of Contingencies: Tricky Examples

After learning about the four types of contingencies, students are sometimes dismayed when they encounter examples that suddenly confuse them. When this happens, the contingency typically has been worded in an unusual way. For example, suppose that a mother tells her son that if he does not clean his room, then he will not get dessert. What type of contingency is the mother specifying? To begin with, it sounds like a negative contingency because the consequence seems to involve the threatened loss of something, namely the dessert. It also sounds like reinforcement because the goal is to increase the probability of a certain behavior: cleaning the room. We might therefore conclude that this is an example of negative reinforcement. But does this make sense? In everyday terms, negative reinforcement involves strengthening a behavior by removing something that the person dislikes, while here we are talking about removing something that the person likes. So what type of contingency is this?

To clarify situations like this, it helps to reword the example in terms of the occurrence of a behavior rather than its nonoccurrence because in reality it is only the occurrence of a behavior that is reinforced or punished. By doing so, and depending on the behavior we focus on, this example can be interpreted as fitting either of two types of contingencies. On one hand, if we focus on the behavior of cleaning the room, it can be viewed as an example of positive reinforcement: if the son cleans his room, he can have dessert. On the other hand, if we focus on the behavior of "doing something other than cleaning the room" (or something other than following his mother's instructions), it can be viewed as an example of negative punishment: if the son does something other than clean his room, he will not get dessert.
Thus, all behaviors other than room cleaning, such as watching television, will result in the loss of dessert. In fact, to the extent that the mother made her request in a threatening manner, she probably intended something like the latter. But note how she could just as easily have worded her request in the form of positive reinforcement ("If you clean your room, you can have some dessert") and how much more pleasant that sounds. Unfortunately, many parents too often choose the unpleasant version, especially when they are frustrated or angry, which in turn helps to create a decidedly unpleasant atmosphere in the household.

QUICK QUIZ H

1. When Sasha was teasing the dog, it bit her. As a result, she no longer teases the dog. The consequence for Sasha's behavior of teasing the dog was the (presentation/removal) _________________ of a stimulus, and the teasing behavior subsequently (increased/decreased) ________________ in frequency; therefore, this is an example of ________________ ________________.

2. Whenever Sasha pulled the dog's tail, the dog left and went into another room. As a result, Sasha now pulls the dog's tail less often when it is around. The consequence for pulling the dog's tail was the (presentation/removal) _____________ of a stimulus, and the behavior of pulling the dog's tail subsequently (increased/decreased) ____________ in frequency; therefore, this is an example of _______________ _____________.

3. When Alex burped in public during his date with Stephanie, she got angry with him. Alex now burps quite often when he is out on a date with Stephanie. The consequence for burping was the ______________ of a stimulus, and the behavior of belching subsequently ______________ in frequency; therefore, this is an example of ______________ ______________.

4. When Alex held the car door open for Stephanie, she made a big fuss over what a gentleman he was becoming. Alex no longer holds the car door open for her. The consequence for holding open the door was the ______________ of a stimulus, and the behavior of holding open the door subsequently ______________ in frequency; therefore, this is an example of ______________ ______________.

5. When Tenzing shared his toys with his brother, his mother stopped criticizing him. Tenzing now shares his toys with his brother quite often. The consequence for sharing the toys was the ______________ of a stimulus, and the behavior of sharing the toys subsequently ______________ in frequency; therefore, this is an example of _______________ _____________.

Positive Reinforcement: Further Distinctions

Because behaviorists so strongly emphasize positive reinforcement, let us have a closer look at this type of contingency. More specifically, we will examine various categories of positive reinforcement.

Immediate Versus Delayed Reinforcement

A reinforcer can be presented either immediately after a behavior occurs or following some delay. In general, the more immediate the reinforcer, the stronger its effect on the behavior. Suppose, for example, that you wish to reinforce a child's quiet playing by giving him a treat. The treat should ideally be given while the quiet period is still in progress.
If, instead, you deliver the treat several minutes later, while he is engaged in some other behavior (e.g., banging a stick on his toy box), you might inadvertently reinforce that behavior rather than the one you wish to reinforce.

The weak effect of delayed reinforcers on behavior accounts for some major difficulties in life. Do you find it tough to stick to a diet or an exercise regime? This is because the benefits of exercise and proper eating are delayed and therefore weak, whereas the enjoyable effects of alternate activities, such as watching television and drinking a soda, are immediate and therefore powerful. Similarly, have you ever promised yourself that you would study all weekend, only to find that you completely wasted your time reading novels, watching television, and going out with friends? The immediate reinforcement associated with these recreational activities effectively outweighed the delayed reinforcement associated with studying. Of course, what we are talking about here is the issue of self-control, a topic that is more fully discussed in Chapter 10.
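One common way behavioral researchers model this weakening effect of delay is hyperbolic discounting, in which a reinforcer's effective value falls off sharply with the delay to its delivery. The model and all numbers below are an illustrative assumption on our part, not taken from this text:

```python
def discounted_value(amount: float, delay: float, k: float = 1.0) -> float:
    """Hyperbolic discounting: effective value of a reinforcer shrinks with delay.

    amount: size of the reinforcer (arbitrary units)
    delay: time until the reinforcer is delivered (arbitrary units)
    k: discounting rate; higher k means delay hurts more (illustrative value)
    """
    return amount / (1 + k * delay)

# A small, immediate reinforcer (watching TV now) can effectively outweigh
# a much larger, delayed one (a good mark next week).
tv_now = discounted_value(amount=10, delay=0)        # 10.0
good_mark_later = discounted_value(amount=100, delay=60)  # about 1.64
print(tv_now > good_mark_later)  # True
```

This captures the studying example above: the delayed reinforcer is objectively larger, but its discounted value at the moment of choice is smaller, so the recreational activity wins.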

The importance of immediate reinforcement is so profound that some behaviorists (e.g., Malott, 1989; Malott & Suarez, 2004) argue that a delayed reinforcer does not, on its own, actually function as a "reinforcer." They point to experimental evidence indicating that delaying a reinforcer by even a few seconds can often severely reduce its effectiveness (e.g., Grice, 1948; Keesey, 1964; see also J. Williams, 1973). This finding suggests that delayed reinforcers, to the extent that they are effective, may function by a different mechanism from immediate reinforcement, especially in humans. Thus, receiving a good mark on that essay you wrote last week does not reinforce the behavior of essay writing in the same way that immediately receiving a food pellet reinforces a rat's tendency to press a lever (or immediately seeing your mother's smile reinforces your tendency to give her another compliment). Rather, in the case of humans, behaviors that appear to be strengthened by long-delayed reinforcers are often under the control of rules or instructions that we have received from others or generated for ourselves. These rules or instructions describe to us the delayed consequences that can result from a behavior (e.g., "Gee, if I work on that essay tonight, I am likely to get a good mark on it next week"), thereby bridging the gap between the behavior and the consequence.

In this text, for simplicity, we will ignore some of the complexities associated with the issue of rules and delayed reinforcement, though we will briefly discuss rule-governed behavior in Chapter 12. For the present purposes, it is sufficient to note that delayed reinforcement is usually much less potent than, and perhaps even qualitatively different from, immediate reinforcement.
This also makes clear the crucial importance of immediate reinforcement when dealing with young children (and animals) who have little or no language capacity, since the use of rules is essentially dependent on language.

QUICK QUIZ I

1. In general, the more _________________ the reinforcer, the stronger its effect on behavior.

2. It is sometimes difficult for students to study in that the reinforcers for studying are ______________ and therefore w______________, whereas the reinforcers for alternative activities are ______________ and therefore s______________.

3. It has been suggested that delayed reinforcers (do/do not) __________________ function in the same manner as immediate reinforcers. Rather, the effectiveness of delayed reinforcers in humans is largely dependent on the use of i______________ or r________________ to bridge the gap between the behavior and the delay.

Primary and Secondary Reinforcers

A primary reinforcer (also called an unconditioned reinforcer) is an event that is innately reinforcing. Loosely speaking, primary reinforcers are those things we are born to like rather than learn to like and that therefore naturally reinforce our behavior. Examples of primary reinforcers are food, water, proper temperature (neither too hot nor too cold), and sexual contact.

Many primary reinforcers are associated with basic physiological needs, and their effectiveness is closely tied to a state of deprivation. For example, food is a highly effective reinforcer when we are food deprived and hungry but not when we are satiated. Some primary reinforcers, however, do not seem to be associated with a physiological state of deprivation. An animal (or person) cooped up in a boring environment will likely find access to a more stimulating environment highly reinforcing and will perform a response such as lever pressing (or driving to the mall) to gain such access. In cases such as this, the deprivation seems more psychological than physiological.

A secondary reinforcer (also called a conditioned reinforcer) is an event that is reinforcing because it has been associated with some other reinforcer. Loosely speaking, secondary reinforcers are those events that we have learned to like because they have become associated with other things that we like. Much of our behavior is directed toward obtaining secondary reinforcers, such as good marks, fine clothes, and a nice car. Because of our experiences with these events, they can function as effective reinforcers for our current behavior. Thus, if good marks in school are consistently associated with praise, then the good marks themselves can serve as reinforcers for behaviors such as studying. And just seeing a professor who once provided you with lots of praise and encouraged you to make the most of your life may be an effective reinforcer for the behavior of visiting her after you graduate.

Conditioned stimuli that have been associated with appetitive unconditioned stimuli (USs) can also function as secondary reinforcers.
For example, the sound of a metronome that has been paired with food to produce a classically conditioned response of salivation:

Metronome: Food → Salivation
(NS)       (US)    (UR)

Metronome → Salivation
(CS)         (CR)

can then be used as a secondary reinforcer for an operant response such as lever pressing:

Lever press → Metronome
(R)            (SR)

Because the metronome has been closely associated with food, it can now serve as a reinforcer for the operant response of lever pressing. The animal essentially seeks out the metronome because of its pleasant associations. Similarly, we may seek out music that has been closely associated with a romantic episode in our life because of its pleasant associations.

Discriminative stimuli associated with positive reinforcers can likewise function as secondary reinforcers. A tone that has served as an SD signaling the availability of food for lever pressing:

Tone: Lever press → Food
(SD)  (R)            (SR)

can then function as a secondary reinforcer for some other behavior, such as running in a wheel:

Run in wheel → Tone
(R)            (SR)

An important type of secondary reinforcer is known as a generalized reinforcer. A generalized reinforcer (also known as a generalized secondary reinforcer) is a type of secondary reinforcer that has been associated with several other reinforcers. For example, money is a powerful generalized reinforcer for humans because it is associated with an almost unlimited array of other reinforcers including food, clothing, furnishings, entertainment, and even dates (insofar as money will likely increase our attractiveness to others). In fact, money can become such a powerful reinforcer that some people would rather just have the money than the things it can buy. Social attention, too, is a highly effective generalized reinforcer, especially for young children (though some aspects of it, such as touching, are probably also primary reinforcers). Attention from caretakers is usually associated with a host of good things such as food and play and comfort, with the result that attention by itself can become a powerful reinforcer. It is so powerful that some children will even misbehave to get someone to pay attention to them.

Generalized reinforcers are often used in behavior modification programs. In a "token economy," tokens are used in institutional settings, such as mental institutions, prisons, or classrooms for problem children, to increase the frequency of certain desirable behaviors, such as completing an assigned task, dressing appropriately, or behaving sociably. Attendants deliver the tokens immediately following the occurrence of the behavior. These tokens can later be exchanged for "backup reinforcers" such as treats, trips into the community, or television viewing time.
In essence, just as the opportunity to earn money (and what it can buy) motivates many of us to behave appropriately, so too does the opportunity to earn tokens (and what they can be exchanged for) motivate the residents of that setting to behave appropriately. (See Miltenberger, 1997, for an in-depth discussion of token economies.)

Note that an event can function as both a primary reinforcer and a secondary reinforcer. A Thanksgiving dinner, for example, can be both a primary reinforcer, in the sense of providing food, and a secondary reinforcer due to its association with a beloved grandmother who prepared many similar dinners in your childhood.

Finally, just as stimuli that are associated with reinforcement can become secondary reinforcers, so also can the behaviors that are associated with reinforcement. For example, children who are consistently praised for helping others might eventually find the behavior of helping others to be reinforcing in and of itself. They will then help others, not to receive praise but because they "like to help." We might then describe such children as having an altruistic nature. By a similar mechanism, even hard work can sometimes become a secondary reinforcer (see "Learned Industriousness" in the And Furthermore box).
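The mechanics of a token economy described above, deliver tokens immediately after target behaviors, then exchange accumulated tokens for backup reinforcers, can be sketched as a small program. All names, token values, and exchange costs below are hypothetical:

```python
class TokenEconomy:
    """Minimal sketch of a token economy (illustrative, not a clinical protocol).

    Tokens (generalized reinforcers) are delivered immediately after target
    behaviors and later exchanged for backup reinforcers.
    """

    def __init__(self, token_values, backup_costs):
        self.token_values = token_values  # behavior -> tokens earned
        self.backup_costs = backup_costs  # backup reinforcer -> tokens required
        self.balance = 0

    def record_behavior(self, behavior):
        # Token delivery is immediate, which is what makes the system effective.
        self.balance += self.token_values.get(behavior, 0)

    def exchange(self, reinforcer):
        # Exchange tokens for a backup reinforcer if the balance covers the cost.
        cost = self.backup_costs[reinforcer]
        if self.balance >= cost:
            self.balance -= cost
            return True
        return False

economy = TokenEconomy(
    token_values={"completed assigned task": 2, "dressed appropriately": 1},
    backup_costs={"TV time": 3},
)
economy.record_behavior("completed assigned task")
economy.record_behavior("dressed appropriately")
print(economy.exchange("TV time"))  # True; balance is now 0
```

The design mirrors the text's point about money: the tokens have no value of their own; they work only because of what they can be exchanged for.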

And Furthermore

Learned Industriousness

Some people seem to enjoy hard work while others do not. Why is this? According to learned industriousness theory, if working hard (displaying high effort) on a task has been consistently associated with reinforcement, then working hard might itself become a secondary reinforcer (Eisenberger, 1992). This can result in a generalized tendency to work hard. Experiments with both humans and animals have confirmed this possibility. For example, rats that have received reinforcers for emitting forceful lever presses will then run faster down an alleyway to obtain food (Eisenberger, Carlson, Guile, & Shapiro, 1979). Similarly, students who have received reinforcers for solving complex math problems will later write essays of higher quality (Eisenberger, Masterson, & McDermitt, 1982). Experiments have also confirmed the opposite: Rats and humans that have received reinforcers for displaying low effort on a task will show a generalized tendency to be lazy (see Eisenberger, 1992). (Something to think about if you have a strong tendency to take the easy way out.)

QUICK QUIZ J

1. Events that are innately reinforcing are called ______________ reinforcers. They are sometimes also called un______________ reinforcers.

2. Events that become reinforcers through their association with other reinforcers are called _______________ reinforcers. They are sometimes also called _____________ reinforcers.

3. Honey is for most people an example of a ______________ reinforcer, while a coupon that is used to purchase the honey is an example of a ______________ reinforcer.

4. A (CS/US) ______ that has been associated with an appetitive (CS/US) ______ can serve as a secondary reinforcer for an operant response. As well, a stimulus that serves as a(n) ______ for an operant response can also serve as a secondary reinforcer for some other response.

5.
A generalized reinforcer (or generalized secondary reinforcer) is a secondary reinforcer that has been associated with _________________________________.

6. Two generalized secondary reinforcers that have strong effects on human behavior are __________________________________________________.

7. Behavior modification programs in institutional settings often utilize generalized reinforcers in the form of t_______________. This type of arrangement is known as a t______________ e______________.

Intrinsic and Extrinsic Reinforcement

In the preceding discussion, we noted that operant behavior itself can sometimes be reinforcing. Such a behavior is said to be intrinsically reinforcing or motivating. Thus, intrinsic reinforcement is reinforcement provided by the mere act of performing the behavior. We rollerblade because it is invigorating, we party with friends because we like their company, and we work hard at something partly because hard work has, through experience, become enjoyable (though you are probably still not convinced about that one). Animals, too, sometimes engage in activities for their own sake. In some of the earliest research on intrinsic motivation, it was found that with no additional incentive, monkeys repeatedly solved mechanical puzzles (Harlow, Harlow, & Meyer, 1950).

Unfortunately, many activities are not intrinsically reinforcing and instead require additional incentives to ensure their performance. Extrinsic reinforcement is the reinforcement provided by some consequence that is external to the behavior (i.e., an "extrinsic reinforcer"). For example, perhaps you are reading this text solely because of an upcoming exam. Passing the exam is the extrinsic consequence that is motivating your behavior. Other examples of extrinsically motivated behaviors are driving to get somewhere, working for money, and dating an attractive individual merely to enhance your prestige.

Unfortunately, the distinction between intrinsic and extrinsic reinforcers is not always clear. For example, is candy an intrinsic or extrinsic reinforcer? In one sense, candy seems like an intrinsic reinforcer because eating it is an enjoyable activity; yet the candy exists external to the behavior that is being reinforced. In such cases, it often helps to focus on the behavior that is being strengthened. Imagine, for example, that we offer candy to a child to strengthen the behavior of being quiet in the supermarket. The candy is clearly an extrinsic reinforcer for the behavior of being quiet, but with respect to the behavior of eating candy, the candy is the critical component in an intrinsically reinforcing activity.
In any case, do not fret too much if you encounter an example that seems confusing. The most important thing is to be able to distinguish situations in which the motivation is clearly intrinsic (taking a bath for the pleasure of it) from those in which the motivation is clearly extrinsic (taking a bath because you have been paid to do so).

Question: What happens if you are given an extrinsic reinforcer for an activity that is already intrinsically reinforcing? What if, for example, you love rollerblading and are fortunate enough to be hired one weekend to blade around an amusement park while displaying a new line of sportswear? Will the experience of receiving payment for rollerblading increase, decrease, or have no effect on your subsequent enjoyment of the activity?

Although you might think that it would increase your enjoyment of rollerblading (since the activity is not only enjoyable but also associated with money), many researchers claim that experiences like this can decrease intrinsic interest. For example, Lepper, Greene, and Nisbett (1973) found that children who enjoyed drawing with Magic Markers became less interested following a session in which they had been promised, and then received, a "good player" award for drawing with the markers. In contrast, children who did not receive an award or who received the award unexpectedly after playing with the markers did not show a loss of interest. Similar results have been reported by other investigators (e.g., Deci & Ryan, 1985). However, some researchers have

found that extrinsic rewards have no effect on intrinsic interest (e.g., Amabile, Hennessey, & Grossman, 1986) or actually produce an increase in intrinsic interest (e.g., Harackiewicz, Manderlink, & Sansone, 1984). Unfortunately, despite these mixed findings, it is the damaging effects of extrinsic rewards on intrinsic motivation that are often presented to the public (e.g., Kohn, 1993). But is this a fair assessment of the evidence? Are the harmful effects of reinforcement the rule or the exception?

And Furthermore
Positive Reinforcement of Artistic Appreciation

B. F. Skinner (1983) once described how two students used positive reinforcement to instill in their new roommate an appreciation of modern art. These students had several items of modern art in their apartment, but the roommate had shown little interest in them and was instead proceeding to "change the character" of the space. As a counterploy, the students first decided to pay attention to the roommate only when they saw him looking at one of the works of art. Next, they threw a party and arranged for an attractive young woman to engage him in a discussion about the art. They also arranged for him to receive announcements from local art galleries about upcoming art shows. After about a month, the roommate himself suggested attending a local art museum. Interestingly, while there, he just "happened" to find a five-dollar bill lying at his feet while he was looking at a painting. According to Skinner, "It was not long before [the two students] came again in great excitement—to show me his first painting" (p. 48).

Cameron and Pierce (1994) attempted to answer this question by conducting a meta-analysis of 96 well-controlled experiments that examined the effects of extrinsic rewards on intrinsic motivation.
(A meta-analysis is a statistical procedure that combines the results of several separate studies, thereby producing a more reliable overall assessment of the variable being studied.) The meta-analysis by Cameron and Pierce indicates that extrinsic rewards usually have little or no effect on intrinsic motivation. External rewards can occasionally undermine intrinsic motivation, but only when the reward is expected (i.e., the person has been instructed beforehand that she will receive a reward), the reward is tangible (e.g., it consists of money rather than praise), and the reward is given for simply performing the activity (and not for how well it is performed). It also turns out that verbal rewards, such as praise, often produce an increase in intrinsic motivation, as do tangible rewards given for high-quality performance (see Deci & Ryan, 1985). Cameron and Pierce (1994) conclude that extrinsic rewards can be safely applied in most circumstances and that the limited circumstances in which they decrease intrinsic motivation are easily avoided. Bandura (1997) likewise has argued that the dangers of extrinsic rewards on intrinsic motivation have been greatly overstated. (See Cameron, 2001; Cameron, Banko, & Pierce, 2001; Cameron & Pierce, 2002;

and Deci, Koestner, & Ryan, 2001a, 2001b, for further contributions to this debate.)6 (See also "Positive Reinforcement of Artistic Appreciation" in the And Furthermore box.)

QUICK QUIZ K

1. An __________________ motivated activity is one in which the activity is itself reinforcing; an ________________ motivated activity is one in which the reinforcer for the activity consists of some type of additional consequence that is not inherent to the activity.
2. Running to lose weight is an example of an ______________ motivated activity; running because it "feels good" is an example of an ______________ motivated activity.
3. In their meta-analysis of relevant research, Cameron and Pierce (1994) found that extrinsic rewards decrease intrinsic motivation only when they are (expected/unexpected) _____________________, (tangible/verbal) ___________________, and given for (performing well/merely engaging in the behavior) __________________ _____________________________________.
4. They also found that extrinsic rewards generally increased intrinsic motivation when the rewards were (tangible/verbal) ______________, and that tangible rewards increased intrinsic motivation when they were delivered contingent upon (high/low) ______________ quality performance.

Natural and Contrived Reinforcers

The distinction between intrinsic and extrinsic reinforcers is closely related to the distinction between natural and contrived reinforcers. Natural reinforcers are reinforcers that are naturally provided for a certain behavior; that is, they are a typical consequence of the behavior within that setting. Money is a natural consequence of selling merchandise; gold medals are a natural consequence of hard training and a great performance. Contrived reinforcers are reinforcers that have been deliberately arranged to modify a behavior; they are not a typical consequence of the behavior in that setting.
For example, although television is the natural reinforcer for the behavior of turning on the set, it is a contrived reinforcer for the behavior

6 It is also the case that some consequences that appear to function as positive reinforcers might in reality be more aversive. For example, many years ago, a player for the Pittsburgh Pirates told me (Russ Powell) that he hated baseball because there were so many young players trying to replace him. It seemed like the consequence that motivated his playing was no longer the love of baseball, nor even the desire to obtain a good salary; rather, it was the threatened loss of a good salary if he didn't play well. According to Skinner (1987), human behavior is too often controlled by these types of negative consequences—working to avoid the loss of a paycheck and studying to avoid failure (especially prevalent in students who procrastinate until they are in serious danger of failing). It is therefore not surprising that these activities often seem less than intrinsically interesting.

of, say, accomplishing a certain amount of studying. In the latter case, we have created a contrived contingency in an attempt to modify the person's study behavior.

Note that intrinsic reinforcers are always natural reinforcers, while extrinsic reinforcers can be either natural or contrived. For example, an actor's "feeling of satisfaction" is an intrinsic, natural reinforcer for a good performance. By contrast, compliments by customers are extrinsic, natural reinforcers for a chef's behavior of creating a wonderful meal; and candy is an extrinsic, contrived reinforcer for a child's behavior of sitting quietly in a doctor's office. In other words, some extrinsic reinforcers are part of the typical contingencies in our environment, while others have been artificially imposed to modify a particular behavior. Needless to say, concerns about the effects of extrinsic reinforcement on intrinsic motivation have focused mostly on situations in which extrinsic reinforcers have been artificially manipulated, or contrived.

Although contrived reinforcers are often seen as a hallmark of behaviorism, behaviorists strive to utilize natural reinforcers whenever possible (Sulzer-Azaroff & Mayer, 1991). When contrived reinforcers are used, the ultimate intention is to let the "natural contingencies" eventually take over if at all possible. For example, although we might initially use tokens to motivate a patient with schizophrenia to socialize with others, our hope is that the behavior will soon become "trapped" by the natural consequences of socializing (e.g., smiles and pleasant comments from others) such that the tokens can eventually be withdrawn. Similarly, although we might initially use praise to increase the frequency with which a child reads, the natural (and intrinsic) reinforcers associated with reading will hopefully take over so that the child will begin reading even in the absence of praise.
Note, too, that natural contingencies tend to produce more efficient behavior patterns than do contrived contingencies (Skinner, 1987). Although a coach might use praise to reinforce correct throwing actions by a young quarterback, the most important factor in producing correct throws will be the natural consequence of where the ball goes.

To distinguish between intrinsic versus extrinsic reinforcers and natural versus contrived reinforcers, just remember that the former is concerned with the extent to which the behavior itself is reinforcing while the latter is concerned with the extent to which a reinforcer has been artificially imposed so as to manipulate a behavior. Note, too, that the extent to which a reinforcer has been artificially imposed is not always clear; hence, it is always possible to find examples in which it is ambiguous as to whether the reinforcer is contrived or natural. Are grades in school a natural reinforcer or a contrived reinforcer? It depends on whether one's grades are a typical aspect of the learning environment, at least within the school system, or a contrived aspect. In any event, as with intrinsic versus extrinsic motivation, the important thing is to be able to distinguish those situations in which the reinforcers are clearly contrived—as often occurs in a behavior modification program—from those in which the reinforcers are more natural.

QUICK QUIZ L

1. A(n) ______________ reinforcer is a reinforcer that typically occurs for that behavior in that setting; a(n) ______________ reinforcer is one that typically does not occur for that behavior in that setting.
2. You flip the switch and the light comes on. The light coming on is an example of a(n) (contrived/natural) ______________ reinforcer; in general, it is also an example of an (intrinsic/extrinsic) ______________ reinforcer.
3. You thank your roommate for helping out with the housework in an attempt to motivate her to help out more often. To the extent that this works, the thank-you is an example of a(n) (contrived/natural) ______________ reinforcer; it is also an example of an (intrinsic/extrinsic) ______________ reinforcer.
4. In applied behavior analysis, although one might initially use (contrived/natural) ______________ consequences to first develop a behavior, the hope is that, if possible, the behavior will become tr______________ by the n______________ c______________ associated with that behavior.
5. In most cases, the most important consequence in developing highly effective forms of behavior will be the (contrived/natural) ______________ consequences of that behavior.
6. (Intrinsic/Extrinsic) ______________ reinforcers are always natural reinforcers, while ______________ reinforcers can be either natural or contrived.

Shaping

Positive reinforcement is clearly a great way to strengthen a behavior, but what if the behavior that we wish to reinforce never occurs? For example, what if you want to reinforce a rat's behavior of pressing a lever but are unable to do so because the rat never presses the lever? What can you do? The solution is to use a procedure called shaping.

Shaping is the gradual creation of new operant behavior through reinforcement of successive approximations to that behavior. With our rat, we could begin by delivering food whenever it stands near the lever.
As a result, it begins standing near the lever more often. We then deliver food only when it is facing the lever, at which point it starts engaging in that behavior more often. In a similar manner, step-by-step, we reinforce touching the lever, then placing a paw on the lever, and then pressing down on the lever. When the rat finally presses down on the lever with enough force, it closes the microswitch that activates the food magazine. The rat has now earned a reinforcer on its own. After a few more experiences like this, the rat begins to reliably press the lever whenever it is hungry. By reinforcing successive approximations to the target behavior, we have managed to teach the rat an entirely new behavior. Another example of shaping: How do you teach a dog to catch a Frisbee? Many people simply throw the Frisbee at the dog, at which point the dog probably wonders what on earth has gotten into its owner as the Frisbee sails

over its head. Or possibly the dog runs after the Frisbee, picks it up after it falls on the ground, and then makes the owner chase after him to get the Frisbee back. Karen Pryor (1999), a professional animal trainer, recommends the following procedure. First, reinforce the dog's behavior of taking the Frisbee from your hand and immediately returning it. Next, raise the criterion by holding the Frisbee in the air to make the dog jump for it. When this is well established, toss the Frisbee slightly so the dog jumps and catches it in midair. Then toss it a couple of feet so he has to run after it to catch it. Now gradually throw it further and further so the dog has to run farther and farther to get it. Remember to provide lots of praise each time the dog catches the Frisbee and returns it.

Shaping is obviously a fundamental procedure for teaching animals to perform tricks. During such training, the trainers often use a sound, such as a click from a handheld clicker, to reinforce the behavior. The sound has been repeatedly paired with food so that it has become a secondary reinforcer. The benefit of using a sound as a reinforcer is that it can be presented immediately upon the occurrence of the behavior, even if the animal is some distance away. Also, if food were presented each time the correct behavior occurred, the animal would quickly satiate, at which point the food would become ineffective as a reinforcer. By using a secondary reinforcer such as a click, with food delivered only intermittently, satiation will take longer to occur, thereby allowing for longer training sessions.

Most of our behaviors have, to some extent, been learned or modified through shaping. For example, when children first learn to eat with a knife and fork, parents might praise even very poor attempts. Over time, though, they expect better and better performance before offering praise.
In a similar manner, with children we gradually shape the behavior of dressing appropriately, speaking politely, and writing legibly. And shaping is not confined merely to childhood. All of us are in the position of receiving constant feedback about our performance—be it ironing clothes, cooking a meal, or slam-dunking a basketball—thus allowing us to continually modify our behaviors and improve our skills. In such circumstances, it is usually the natural consequences of the behavior—the extent to which we are successful or unsuccessful—that provide the necessary reinforcement for gradual modifications of the behavior.

For further information on shaping as applied to both animals and humans, you might wish to obtain a copy of Karen Pryor's (1999) highly readable book, Don't Shoot the Dog. Pryor also has a Web site on "clicker training" (shaping through the use of clicks as secondary reinforcers) that can be accessed via the Internet (just search for "clicker training"). Clicker training has become increasingly popular with dog owners and is being used to shape behavior in everything from birds to horses and even llamas and elephants. Interestingly, Pryor observes that many animals greatly enjoy the "game" of clicker training. (See also "Training Ishmael" in the And Furthermore box.)
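For readers with some programming background, the step-by-step logic of shaping can be sketched as a short simulation. This is a hypothetical illustration only: the emission probabilities, the strengthening increment, and the reliability criterion below are invented for the sketch and are not drawn from any actual experiment. The structure, however, mirrors the procedure described above: each time the current approximation occurs it is reinforced, reinforcement strengthens it, and once the response is reliable the criterion is raised to the next approximation.

```python
import random

def shape(criteria, trials_per_step=50, seed=42):
    """Simulate shaping by reinforcing successive approximations.

    `criteria` is an ordered list of approximations to the target behavior
    (e.g., "stand near lever", ..., "press lever"). All numeric values are
    arbitrary illustrative choices, not empirical parameters.
    """
    rng = random.Random(seed)
    history = []
    for approximation in criteria:
        p = 0.2          # chance the animal emits this approximation at first
        reinforced = 0
        for _ in range(trials_per_step):
            if rng.random() < p:       # the approximation occurs...
                reinforced += 1        # ...and is immediately reinforced,
                p = min(1.0, p + 0.1)  # which strengthens the response
            if p >= 0.9:               # response is now reliable:
                break                  # raise the criterion to the next step
        history.append((approximation, reinforced))
    return history

steps = ["stand near lever", "face lever", "touch lever",
         "place paw on lever", "press lever"]
for step, n in shape(steps):
    print(f"{step}: reinforced {n} times before the criterion was raised")
```

In the real procedure, of course, the animal's response probability is not a number the trainer can read off; reliability is judged by eye. The sketch captures only the structure of the method: reinforce the current approximation, let it strengthen, then raise the criterion.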

[Photograph] An excellent demonstration of the power of shaping. © Joseph Sohm/Visions of America/Corbis

ADVICE FOR THE LOVELORN

Dear Dr. Dee,

My boyfriend has a terrible tendency to boss me around. I have tried to counter this tendency by being especially nice to him, but the problem seems to be getting worse. He also refuses to discuss it or see a counselor. He says I am too sensitive and that I am making a mountain out of a molehill. What should I do?

Just About Hadenough

Dear Just,

You should first recognize that some people have a long history of reinforcement for being dominant or aggressive, and that it can sometimes be difficult to alter such tendencies. In fact, you might eventually have to bail out of this relationship, particularly because he refuses to discuss what seems to be an obvious problem.

Nevertheless, you might also wish to consider the possibility that you are inadvertently reinforcing his aggressiveness. Remember how, in the opening vignette to this chapter, the young woman reacted to her partner's angry demands by a show of affection. While this might reduce his anger in the short run, it might also reinforce his tendency to be aggressive. After all, not only was his anger effective in getting her to hurry up, it also resulted in a hug. The next time he wants her to hurry up or desires affection, what better way than to get angry?

As a first step, you might wish to take careful note of the situations in which your boyfriend becomes bossy. If it appears that you might be reinforcing his bossiness by being particularly nice to him when he acts that way, you could try offering him little or no attention when he behaves like that and lots of attention when he behaves more appropriately. Can this work? In her book Don't Shoot the Dog, Karen Pryor (1999) relates the following story about a woman who implemented just such a program:

A young woman married a man who turned out to be very bossy and demanding.
Worse yet, his father, who lived with them, was equally given to ordering his daughter-in-law about. It was the girl's mother who told me this story. On her first visit she was horrified at what her daughter was going through. "Don't worry, Mother," the daughter said. "Wait and see." The daughter formed the practice of responding minimally to commands and harsh remarks, while reinforcing with approval and affection any tendency by either man to be pleasant and thoughtful. In a year, she had turned them into decent human beings. Now they greet her with smiles when she comes home and leap up—both of them—to help with the groceries. (p. 30)

By reinforcing successive approximations toward decent behavior and not reinforcing bossy behavior (yet still responding minimally to their requests), this woman was apparently able to shape more appropriate behavior in her husband and father-in-law. Remember, though, such problems are often difficult to manage and may require professional help.

Behaviorally yours,

And Furthermore
Training Ishmael

Although the principles of reinforcement and shaping are easy enough to learn, applying those principles can be another matter. In this case, there is no substitute for the experience of shaping behavior in a live animal. Dogs are ideal subjects for this, with cats and birds also being quite suitable. However, many people live in apartments where such pets are not allowed. Fortunately, apartment dwellers are often allowed to keep fish, and some fish are in fact surprisingly trainable. Goldfish, for example, have been trained to swim through hoops, push ping pong balls around, and (according to one report from an acquaintance who swore she saw it on television) pull a string to ring a tiny bell for food.

To illustrate the process of using reinforcement to train a fish, let us consider some training that I (Russ Powell) conducted with Ishmael, a 2-inch long, dark blue, male Betta splendens (Siamese fighting fish). Ishmael lives by himself in a 1-gallon acrylic tank with gravel and a few plants. It might seem to you that a 1-gallon tank is awfully small, but bettas have evolved to survive in small pools of water in their native Thailand. Isolation from other fish is also a natural state of affairs for a betta because the sight of another male, or something similar to it, often elicits the fixed action pattern of aggression that we discussed in Chapter 3. As it turns out, this natural proclivity for small living quarters and relative isolation is an advantage when it comes to training bettas, because this setup mimics some of the features of an operant conditioning chamber.

The interesting thing about training a male betta is that two types of reinforcers are available. One, of course, is food, and this generally works as well with bettas as it does with other animals (though bettas are sometimes fussy eaters).
Unfortunately, being such small fish, they can be given only a few bites of food per day, which means that each training session must be kept quite short to prevent overfeeding. The other type of reinforcer is the presentation of a mirror that allows them to see a mirror image of themselves. This mirror image is often perceived as another male, which then elicits the fixed action pattern of aggression. Interestingly, the opportunity to aggress like this can serve as a positive reinforcer that can be used to strengthen some other behavior (Melvin, 1985; T. Thompson, 1963). Note that this is an excellent example of how positive reinforcers are not necessarily the kinds of events that one would classify as pleasant. If bettas had human-like feelings, one could only assume from their behavior that they hate the sight of another male; nevertheless, they will learn to perform a response in order to see that male.

As an informal demonstration of the effectiveness of mirror presentation as a reinforcer with Ishmael, mirror presentations were made contingent upon the behavior of turning a half-circle, first in a clockwise and then in a counterclockwise direction. A clockwise turn was defined as a clockwise movement from, at minimum, a left-facing position (from the

observer's perspective) to a right-facing position in the tank. A counterclockwise turn was defined as a counterclockwise movement from, at minimum, a right-facing position to a left-facing position. Each training session lasted 10 minutes, which was measured with a kitchen timer.

During an initial baseline period, Ishmael's clockwise and counterclockwise circling habits were recorded throughout the session with 10 mirror presentations presented noncontingently (independent of any behavior) at random points in time. (Question: Why include mirror presentations in the baseline period?) During this period, Ishmael showed a slight preference for turning in a counterclockwise direction (see Figure 6.4). A clockwise turn was selected for initial training beginning in session 5. Rather than simply waiting for a clockwise turn and then presenting the mirror, past experience with another betta suggested that the turning behavior could be established more rapidly by using a shaping procedure. Thus, mirror presentations were initially made contingent upon successive approximations to the required behavior (i.e., slight turns in the correct direction were initially reinforced, and progressively complete turns were subsequently reinforced). Shaping proceeded rapidly, with Ishmael quickly establishing a pattern of clockwise turns. For the remainder of session 5 and for the following three sessions, he exhibited a clear preference for such turns.

Beginning in session 9, the contingency was reversed with counterclockwise turns reinforced and clockwise turns extinguished. Possibly due to the short length of each training session, counterclockwise turns did not become well established until session 13 (even with shaping), which was maintained for the following three sessions.
A reversal was then attempted in which clockwise turns were again reinforced and counterclockwise turns were extinguished (hence, overall, this was an ABCB design).

FIGURE 6.4 Number of clockwise and counterclockwise turns made by Ishmael across different phases of the demonstration. [Line graph: number of turns (0 to 30) plotted across sessions 1 to 20, with separate lines for clockwise and counterclockwise turns and successive phases labeled baseline, clockwise reinforced, counterclockwise reinforced, and clockwise reinforced.]

This time, three sessions were required before Ishmael developed a preference for turning in the newly reinforced direction. In session 20, however, the number of turns in either direction—as well as, it seemed, his general activity level—dropped sharply, especially toward the end of the session. During the first 5 minutes of the next session, he mostly sat at the bottom of the tank reacting minimally to the mirror. No circling behavior occurred. It appeared as though long-term habituation had set in such that the mirror was no longer sufficiently reinforcing to motivate the target behavior, and the session was therefore terminated. Ishmael showed a similar lack of interest in the mirror over the following 2 days as well. Despite this less-than-ideal conclusion, the results generally confirmed that mirror presentation was an effective reinforcer for Ishmael's circling behavior. Most impressive was the initial learning of a clockwise turn, which occurred very rapidly.

With food as a reinforcer, Ishmael also learned to bump, but not push, a ping pong ball (it was easy to get him to hover near the ball, but difficult to get him to contact it); to swim through a wire hoop (relatively easy to accomplish); and to nip at the bent end of a plastic-coated paper clip (extremely easy to accomplish). (As for Ishmael, he easily trained his owner [me] to give him extra food by staring longingly at me and acting very excited when I came home each evening.) Further information on betta training can be found in the Chapter 6: Additional Information section of the book companion Web site (the URL is at the end of the chapter).

Now for the answer to the question about including mirror presentations in the baseline period: Intermittent presentation of the mirror by itself generates a lot of excitement and movement.
Hence, noncontingent presentation of the mirror during the baseline period controls for the increase in circling that will likely occur simply due to the increased movement caused by mirror presentation alone.

QUICK QUIZ M

1. Shaping is the creation of ___________________ operant behavior through the reinforcement of s______________ a______________ to that behavior.
2. In clicker training with dogs, the click is a s______________ reinforcer that has been established by first pairing it with f______________.
3. An advantage of using the click as a reinforcer is that it can be delivered i__________. It can also prevent the animal from becoming s______________.

SUMMARY

In contrast to elicited behaviors that are automatically evoked by the stimuli that precede them, operant behaviors are controlled by their consequences. Thus, in operant (or instrumental) conditioning, the future probability of a response is affected by its consequence. Reinforcers are

consequences that increase the probability of (or strengthen) a response, whereas punishers decrease the probability of (or weaken) a response. In positive reinforcement and positive punishment, the consequence involves the presentation of a stimulus, whereas in negative reinforcement and negative punishment, the consequence involves the removal of a stimulus.

When a behavior has been consistently reinforced or punished in the presence of certain stimuli, those stimuli will begin to influence the occurrence of the behavior. A discriminative stimulus is a stimulus in the presence of which a response has been reinforced and in the absence of which it has not been reinforced.

Immediate reinforcers have a much stronger effect on behavior than do delayed reinforcers. Primary reinforcers are events that are innately reinforcing; secondary reinforcers are events that become reinforcing because they have been associated with other reinforcers. A generalized secondary reinforcer is a secondary reinforcer that has been associated with many other reinforcers.

Intrinsic reinforcement occurs when performing a behavior is inherently reinforcing; extrinsic reinforcement occurs when the effective reinforcer is some consequence that is external to the behavior. Extrinsic reinforcement can undermine intrinsic interest in a task when the reinforcer is expected, tangible, or is made contingent on mere performance of the task. Extrinsic reinforcement can strengthen intrinsic interest when the reinforcer consists of verbal praise or is made contingent on high-quality performance.

Shaping is the creation of novel behavior through the reinforcement of gradual approximations to that behavior. Effective shaping is often carried out with the use of a secondary reinforcer, such as the sound of a whistle or a click that can be delivered immediately following the occurrence of the appropriate behavior.

SUGGESTED READINGS

Thorndike, E. L.
(1898). Animal intelligence: An experimental study of the associative processes in animals. Psychological Review Monograph Supplement, 2, 1–109. A classic work in the field.

Kohn, A. (1993). Punished by rewards. Boston: Houghton Mifflin. One of the harshest criticisms of the use of rewards to motivate people.

Pryor, K. (1975). Lads before the wind: Adventures in porpoise training. New York: Harper & Row. An engaging account of Pryor's experiences in becoming a dolphin trainer.

Pryor, K. (1999). Don't shoot the dog: The new art of teaching and training (Rev. ed.). New York: Bantam Books. Pryor's most popular book on the art of shaping behavior as applied to everything from dogs to horses to humans.

Cameron, J., & Pierce, W. D. (2002). Rewards and intrinsic motivation: Resolving the controversy. New York: Greenwood Publishing. An ardent defense of the use of rewards to motivate people.

STUDY QUESTIONS

1. State Thorndike's law of effect. What is operant conditioning (as defined by Skinner), and how does this definition differ from Thorndike's law of effect?
2. Explain why operant behaviors are said to be emitted and why they are defined as a "class" of responses.
3. Define the terms reinforcer and punisher.
4. What is the difference between the terms reinforcement and reinforcer?
5. What is a discriminative stimulus? Define the three-term contingency and diagram an example.
6. Define positive reinforcement and diagram an example. Define negative reinforcement and diagram an example. (For each example, include the appropriate symbols.)
7. Define positive punishment and diagram an example. Define negative punishment and diagram an example. (For each example, include the appropriate symbols.)
8. What are the similarities and differences between negative reinforcement and positive punishment?
9. How does immediacy affect the strength of a reinforcer? How does this often lead to difficulties for students in their academic studies?
10. Distinguish between primary and secondary reinforcers, and give an example of each.
11. What is a generalized reinforcer? What are two examples of such reinforcers?
12. Define intrinsic and extrinsic reinforcement, and provide an example of each.
13. Under what three conditions does extrinsic reinforcement undermine intrinsic interest? Under what two conditions does extrinsic reinforcement enhance intrinsic interest?
14. Define natural and contrived reinforcers, and provide an example of each.
15. Define shaping. What are two advantages of using a secondary reinforcer, such as a sound, as an aid to shaping?

CONCEPT REVIEW

avoidance behavior.
Behavior that occurs before the aversive stimulus is presented and therefore prevents its delivery. contrived reinforcers. Reinforcers that have been deliberately arranged to modify a behavior; they are not a typical consequence of the behavior in that setting.

discriminative stimulus (SD). A stimulus in the presence of which responses are reinforced and in the absence of which they are not reinforced.

discriminative stimulus for extinction (SΔ). A stimulus that signals the absence of reinforcement.

discriminative stimulus for punishment. A stimulus that signals that a response will be punished.

escape behavior. A behavior that results in the termination of an aversive stimulus.

extrinsic reinforcement. The reinforcement provided by a consequence that is external to the behavior, that is, an extrinsic reinforcer.

generalized (or generalized secondary) reinforcer. A type of secondary reinforcer that has been associated with several other reinforcers.

intrinsic reinforcement. Reinforcement provided by the mere act of performing the behavior; the performance of the behavior is inherently reinforcing.

law of effect. As stated by Thorndike, the proposition that behaviors that lead to a satisfying state of affairs are strengthened or "stamped in," while behaviors that lead to an unsatisfying or annoying state of affairs are weakened or "stamped out."

natural reinforcers. Reinforcers that are naturally provided for a certain behavior; that is, they are a typical consequence of the behavior within that setting.

negative punishment. The removal of a stimulus (one that is usually considered pleasant or rewarding) following a response, which then leads to a decrease in the future strength of that response.

negative reinforcement. The removal of a stimulus (one that is usually considered unpleasant or aversive) following a response, which then leads to an increase in the future strength of that response.

operant behavior. A class of emitted responses that result in certain consequences; these consequences, in turn, affect the future probability or strength of those responses.

operant conditioning. A type of learning in which the future probability of a behavior is affected by its consequences.

positive punishment. The presentation of a stimulus (one that is usually considered unpleasant or aversive) following a response, which then leads to a decrease in the future strength of that response.

positive reinforcement. The presentation of a stimulus (one that is usually considered pleasant or rewarding) following a response, which then leads to an increase in the future strength of that response.

primary reinforcer (or unconditioned reinforcer). An event that is innately reinforcing.

punisher. An event that (1) follows a behavior and (2) decreases the future probability of that behavior.

reinforcer. An event that (1) follows a behavior and (2) increases the future probability of that behavior.

secondary reinforcer (or conditioned reinforcer). An event that is reinforcing because it has been associated with some other reinforcer.

shaping. The gradual creation of new operant behavior through reinforcement of successive approximations to that behavior.

three-term contingency. The relationship between a discriminative stimulus, an operant behavior, and a reinforcer or punisher.

CHAPTER TEST

31. Shaping is (A) the reinforcement of a new operant behavior, (B) the gradual reinforcement of a new operant behavior, (C) the reinforcement of successive approximations to a new operant behavior, (D) the creation of new operant behavior through successive approximations to reinforcement, (E) none of the preceding. ____
20. A positive reinforcer is a stimulus, (A) the presentation of which increases the strength of a response, (B) the presentation of which follows a response and increases the strength of that response, (C) the presentation of which decreases the strength of a response, (D) the presentation of which follows a response and decreases the strength of that response. ____
2. Elicited behaviors are controlled by the events that (precede/follow) ____________ their occurrence, while operant behaviors are controlled by the events that (precede/follow) ____________ their occurrence.
14. An easy way to remember the three-term contingency is that you ____________ something, ____________ something, and ____________ something.
25. Behaviors that are performed for their own sake are said to be ____________ motivated; behaviors that are performed to achieve some additional incentive are said to be ____________ motivated.
11. Reinforcers and punishers are defined entirely by their ____________ on behavior.
8. An event is a punisher if it ____________ a behavior and the future probability of that behavior ____________.
23. Money and praise are common examples of ____________ reinforcers.
12. If the rat does not press the lever, then it does not receive a shock. As a result, the rat is more likely not to press the lever. This is an example of (A) negative reinforcement, (B) negative punishment, (C) positive reinforcement, (D) positive punishment. ____________ (Think carefully about this.)
28. At the zoo one day, you notice that a zookeeper is leading a rhinoceros into a pen by repeatedly whistling at it as the animal moves. It is probably the case that the whistle has been paired with ____________ and is now functioning as a ____________.
1. Compared to most elicited behaviors, operant behaviors seem (more/less) ____________ automatic and reflexive.

15. The three-term contingency can be thought of as an ABC sequence in which A stands for ____________, B stands for ____________, and C stands for ____________.
27. The gradual development of new operant behavior through reinforcement of ____________ to that behavior is called ____________.
6. Operant responses are sometimes simply called ____________.
21. Each time a student studies at home, she is praised by her parents. As a result, she no longer studies at home. This is an example of what type of contingency? ____________
17. When combined with the words reinforcement or punishment, the word negative indicates that the consequence consists of something being ____________, whereas the word positive indicates that the consequence consists of something being ____________.
10. The terms reinforcer or punisher refer to the specific ____________ that follows a behavior, whereas the terms reinforcement or punishment refer to the ____________ or ____________ whereby the probability of a behavior is altered by its consequences.
24. Harpreet very much enjoys hard work and often volunteers for projects that are quite demanding. According to ____________ theory, it is likely the case that, for Harpreet, the act of expending a lot of effort has often been ____________.
3. According to Thorndike's ____________, behaviors that lead to a ____________ state of affairs are strengthened, whereas behaviors that lead to an ____________ state of affairs are weakened.
30. A generalized secondary reinforcer is one that has become a reinforcer because it has been associated with (A) a primary reinforcer, (B) a secondary reinforcer, (C) several secondary reinforcers, (D) several primary reinforcers, or (E) several reinforcers (either primary or secondary). ____________
19. When Beth tried to pull the tail of her dog, he bared his teeth and growled threateningly. Beth quickly pulled her hand back. The dog growled even more threateningly the next time Beth reached for his tail, and she again pulled her hand away. Eventually Beth gave up, and no longer tries to pull the dog's tail. The dog's behavior of baring his teeth and growling served to (positively/negatively) ____________ (punish/reinforce) ____________ Beth's behavior of trying to pull his tail. Beth's behavior of pulling her hand away served to ____________ the dog's behavior of growling.
32. Achieving a record number of strikeouts in a game would be a(n) (natural/contrived) ____________ reinforcer for pitching well; receiving a bonus for throwing that many strikeouts would be a(n) ____________ reinforcer.
5. Operant behaviors are usually defined as a ____________ of responses, all of which are capable of producing a certain ____________.
16. A stimulus that signals that a response will be punished is called a ____________ for punishment.

22. Events that are innately reinforcing are called ____________ reinforcers; events that become reinforcers through experience are called ____________ reinforcers.
9. A reinforcer is usually given the symbol ____________, while a punisher is usually given the symbol ____________. The operant response is given the symbol ____________, while a discriminative stimulus is given the symbol ____________.
26. Steven has fond memories of his mother reading fairy tales to him when he was a child, and as a result he now enjoys reading fairy tales as an adult. For Steven, the act of reading fairy tales is functioning as what type of reinforcer? (A) primary, (B) secondary, (C) intrinsic, (D) extrinsic, (E) both (B) and (C). ____________
4. Classically conditioned behaviors are said to be ____________ by stimuli; operant behaviors are said to be ____________ by the organism.
18. Referring to this chapter's opening vignette, among the four types of contingencies described in this chapter, Sally's actions toward Joe probably best illustrate the process of ____________. In other words, Joe's abusive behavior will likely (increase/decrease) ____________ in the future as a result of Sally's actions.
7. An event is a reinforcer if it ____________ a behavior and the future probability of that behavior ____________.
29. Major advantages of using the sound of a click for shaping are that the click can be delivered ____________ and the animal is unlikely to ____________ upon it.
13. A discriminative stimulus is a stimulus that signals that a ____________ is available. It is said to "____________" for the behavior.

Visit the book companion Web site at <http://www.academic.cengage.com/psychology/powell> for additional practice questions, answers to the Quick Quizzes, practice review exams, and additional exercises and information.

ANSWERS TO CHAPTER TEST

1. less
2. precede; follow
3. law of effect; satisfying; unsatisfying (or annoying)
4. elicited; emitted
5. class; consequence
6. operants
7. follows; increases
8. follows; decreases
9. SR; SP; R; SD
10. consequence (event); process; procedure
11. effect
12. D (because "lever press → shock" is the effective contingency)
13. reinforcer; set the occasion
14. notice; do; get
15. antecedent; behavior; consequence
16. discriminative stimulus

17. removed (or subtracted); presented (or added)
18. positive reinforcement; increase
19. positively; punish; negatively reinforce
20. B
21. positive punishment
22. primary; secondary (or conditioned)
23. generalized (or generalized secondary)
24. learned industriousness; positively reinforced
25. intrinsically; extrinsically
26. E
27. successive (or gradual) approximations; shaping
28. food; secondary reinforcer
29. immediately; satiate
30. E
31. C
32. natural; contrived

CHAPTER 7

Schedules and Theories of Reinforcement

CHAPTER OUTLINE

Schedules of Reinforcement
  Continuous Versus Intermittent Schedules
  Four Basic Intermittent Schedules
  Other Simple Schedules of Reinforcement
  Complex Schedules of Reinforcement
Theories of Reinforcement
  Drive Reduction Theory
  The Premack Principle
  Response Deprivation Hypothesis
  Behavioral Bliss Point Approach

"I don't understand why Alvin is so distant," Mandy commented. "He was great when we first started going out. Now it's like pulling teeth to get him to pay attention to me."

"So why do you put up with it?" her sister asked.

"I guess I'm in love with him. Why else would I be so persistent?"

Schedules of Reinforcement

In this section, we discuss schedules of reinforcement. A schedule of reinforcement is the response requirement that must be met to obtain reinforcement. In other words, a schedule indicates what exactly has to be done for the reinforcer to be delivered. For example, does each lever press by the rat result in a food pellet, or are several lever presses required? Did your mom give you a cookie each time you asked for one, or only some of the time? And just how persistent does Mandy have to be before Alvin will pay attention to her? As you will discover in this section, different response requirements can have dramatically different effects on behavior. Many of these effects (known as schedule effects) were first observed in experiments with pigeons (Ferster & Skinner, 1957), but they also help to explain some puzzling aspects of human behavior that are often attributed to internal traits or desires.

Continuous Versus Intermittent Schedules

A continuous reinforcement schedule is one in which each specified response is reinforced. For example, each time a rat presses the lever, it obtains a food pellet; each time the dog rolls over on command, it gets a treat; and each time Karen turns the ignition in her car, the motor starts. Continuous reinforcement (abbreviated CRF) is very useful when a behavior is first being shaped or strengthened. For example, when using a shaping procedure to train a rat to press a lever, reinforcement should be delivered for each approximation to the target behavior. Similarly, if we wish to encourage a child to always brush her teeth before bed, we would do well to initially praise her each time she does so.

An intermittent (or partial) reinforcement schedule is one in which only some responses are reinforced. For example, perhaps only some of the rat's lever presses result in a food pellet, and perhaps only occasionally did your mother give you a cookie when you asked for one. Intermittent reinforcement obviously characterizes much of everyday life. Not all concerts we attend are enjoyable, not every person we invite out on a date accepts, and not every date that we go out on leads to an enjoyable evening. And although we might initially praise a child each time she properly completes her homework, we might soon praise her only occasionally in the belief that such behavior should persist in the absence of praise.
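The difference between the two schedule types can be illustrated with a small simulation. This is a hypothetical sketch (not from the text); the function names and the 30% reinforcement probability are illustrative choices, not standard values.

```python
import random

def crf(response_number):
    """Continuous reinforcement (CRF): every response is reinforced."""
    return True

def partial(response_number, probability=0.3):
    """Intermittent (partial) reinforcement: only some responses
    are reinforced (here, a randomly chosen 30% of them)."""
    return random.random() < probability

# Ten responses on CRF earn ten reinforcers; on the partial
# schedule they earn only a fraction of that, on average.
crf_earned = sum(crf(n) for n in range(10))
partial_earned = sum(partial(n) for n in range(10))
print(crf_earned)      # 10
print(partial_earned)  # varies from run to run
```

Note that a probabilistic rule like `partial` is only one way of being intermittent; the four basic schedules described next make the response requirement depend on counts or elapsed time instead.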

There are four basic (or simple) types of intermittent schedules: fixed ratio, variable ratio, fixed interval, and variable interval. We will describe each one along with the characteristic response pattern produced by each. Note that this characteristic response pattern is the stable pattern that emerges once the organism has had considerable exposure to the schedule. Such stable patterns are known as steady-state behaviors, in contrast to the more variable patterns of behavior that are evident when an organism is first exposed to a schedule.

QUICK QUIZ A

1. A s_____________ of reinforcement is the r_____________ requirement that must be met to obtain reinforcement.
2. On a c_____________ reinforcement schedule (abbreviated ______), each response is reinforced, whereas on an i_____________ reinforcement schedule, only some responses are reinforced. The latter is also called a p_____________ reinforcement schedule.
3. Each time you flick the light switch, the light comes on. The behavior of flicking the light switch is on a(n) _____________ schedule of reinforcement.
4. When the weather is very cold, you are sometimes unable to start your car. The behavior of starting your car in very cold weather is on a(n) _____________ schedule of reinforcement.
5. S_____________ e_____________ are the different effects on behavior produced by different response requirements. These are the stable patterns of behavior that emerge once the organism has had sufficient exposure to the schedule. Such stable patterns are known as st_____________-st_____________ behaviors.

Four Basic Intermittent Schedules

Fixed Ratio Schedules On a fixed ratio (FR) schedule, reinforcement is contingent upon a fixed, predictable number of responses. For example, on a fixed ratio 5 schedule (abbreviated FR 5), a rat has to press the lever 5 times to obtain a food pellet. On an FR 50 schedule, it has to press the lever 50 times to obtain a food pellet.
Similarly, earning a dollar for every 10 carburetors assembled on an assembly line is an example of an FR 10 schedule, while earning a dollar for each carburetor assembled is an example of an FR 1 schedule. Note that an FR 1 schedule is the same as a CRF (continuous reinforcement) schedule in which each response is reinforced (thus, such a schedule can be correctly labeled as either an FR 1 or a CRF). FR schedules generally produce a high rate of response along with a short pause following the attainment of each reinforcer (see Figure 7.1). This short pause is known as a postreinforcement pause. For example, a rat on an FR 25 schedule will rapidly emit 25 lever presses, munch down the food pellet it receives, and then snoop around the chamber for a few seconds before rapidly emitting another 25 lever presses. In other words, it will take a short break following each reinforcer, just as you might take a short break after reading each chapter in a textbook or completing a particular assignment. Note, too, that each pause is followed by a quick return to a high rate of response.

[FIGURE 7.1 Response patterns for fixed ratio (FR), variable ratio (VR), fixed interval (FI), and variable interval (VI) schedules, plotted as cumulative responses over time. This figure shows the characteristic pattern of responding on the four basic schedules. Notice the high response rate on the fixed and variable ratio schedules, moderate response rate on the variable interval schedule, and scalloped response pattern on the fixed interval schedule. Also, both the fixed ratio and fixed interval schedules are accompanied by postreinforcement pauses. (Source: Modified from Nairne, 2000.)]

Thus, the typical FR pattern is described as a "break-and-run" pattern—a short break followed by a steady run of responses. Similarly, students sometimes find that when they finally sit down to start work on the next chapter or assignment, they quickly become involved in it. Perhaps this is why just starting a task is often the most important step in overcoming procrastination; once you start, the work often flows naturally. For this reason, it is sometimes helpful to use certain tricks to get started, such as beginning with a short, easy task before progressing to a more difficult task. Alternatively, you might promise yourself that you will work for only 5 or 10 minutes and then quit for the evening if you really do not feel like carrying on. What often happens is that once the promised time period has passed, it is actually quite easy to carry on.
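The FR contingency itself is simple to state precisely. As a hypothetical sketch (not part of the original text), an FR n schedule just counts responses and delivers a reinforcer on every nth one:

```python
def fixed_ratio(n):
    """Return a response function for an FR n schedule:
    every nth response produces a reinforcer."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0    # reset the counter after reinforcement
            return True  # reinforcer delivered
        return False
    return respond

press = fixed_ratio(5)                # an FR 5 schedule
outcomes = [press() for _ in range(15)]
# Reinforcers arrive on the 5th, 10th, and 15th presses.
print(outcomes.count(True))           # 3
```

Note that the postreinforcement pause is a property of the organism's behavior under this contingency, not of the schedule itself, so it does not appear anywhere in the code.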

In general, higher ratio requirements produce longer postreinforcement pauses. This means that you will probably take a longer break after completing a long assignment than after completing a short one. Similarly, a rat will show longer pauses on an FR 100 schedule than on an FR 30 schedule. With very low ratios, such as FR 1 (CRF) or FR 2, there may be little or no pausing other than the time it takes for the rat to munch down the food pellet. In such cases, the next reinforcer is so close—only a few lever presses away—that the rat is tempted to immediately go back to work. (If only the reinforcers for studying were so immediate!)

Schedules in which the reinforcer is easily obtained are said to be very dense or rich, while schedules in which the reinforcer is difficult to obtain are said to be very lean. Thus, an FR 5 schedule is considered a very dense schedule of reinforcement compared to an FR 100. During a 1-hour session, a rat can earn many more food pellets on an FR 5 schedule than it can on an FR 100. Similarly, an assembly line worker who earns a dollar for each carburetor assembled (a CRF schedule) can earn considerably more during an 8-hour shift than can a worker who earns a dollar for every 10 carburetors assembled (an FR 10 schedule).

In general, "stretching the ratio"—moving from a low ratio requirement (a dense schedule) to a high ratio requirement (a lean schedule)—should be done gradually. For example, once lever pressing is well established on a CRF schedule, the requirement can be gradually increased to FR 2, FR 5, FR 10, and so on. If the requirement is increased too quickly—for example, CRF to FR 2 and then a sudden jump to FR 20—the rat's behavior may become erratic and even die out altogether. Likewise, if you try to raise the requirement too high—say, to FR 2000—there may be a similar breakdown in the rat's behavior. Such breakdowns in behavior are technically known as ratio strain, a disruption in responding due to an overly demanding response requirement.

Ratio strain is what most people would refer to as burnout, and it can be a big problem for students faced with a heavy workload. Some students, especially those who have a history of getting by with minimal work, may find it increasingly difficult to study under such circumstances and may even choose to drop out of college. If they had instead experienced a gradual increase in workload over a period of several months or years, they might have been able to put forth the needed effort to succeed.

QUICK QUIZ B

1. On a(n) _____________ _____________ schedule, reinforcement is contingent upon a fixed number of responses.
2. A schedule in which 15 responses are required for each reinforcer is abbreviated _____________.
3. A mother finds that she always has to make the same request three times before her child complies. The mother's behavior of making requests is on an _____________ schedule of reinforcement.
4. An FR 1 schedule of reinforcement can also be called a _____________ schedule.

5. A fixed ratio schedule tends to produce a (high/low) _____________ rate of response, along with a p_____________ p_____________.
6. An FR 200 schedule of reinforcement will result in a (longer/shorter) _____________ pause than an FR 50 schedule.
7. The typical FR pattern is sometimes called a b_____________-and-r_____________ pattern, with a _____________ pause that is followed immediately by a (high/low) _____________ rate of response.
8. An FR 12 schedule of reinforcement is (denser/leaner) _____________ than an FR 100 schedule.
9. A very dense schedule of reinforcement can also be referred to as a very r_____________ schedule.
10. Over a period of a few months, Aaron changed from complying with each of his mother's requests to complying with every other request, then with every third request, and so on. The mother's behavior of making requests has been subjected to a procedure known as "s_____________ the r_____________."
11. Graduate students often have to complete an enormous amount of work in the initial year of their program. For some students, the workload involved is far beyond anything they have previously encountered. As a result, their study behavior may become increasingly (erratic/stereotyped) _____________ throughout the year, a process known as r_____________ s_____________.

Variable Ratio Schedules On a variable ratio (VR) schedule, reinforcement is contingent upon a varying, unpredictable number of responses. For example, on a variable ratio 5 (VR 5) schedule, a rat has to emit an average of 5 lever presses for each food pellet, with the number of lever responses on any particular trial varying between, say, 1 and 10. Thus, the number of required lever presses might be 3 for the first pellet, 6 for the second pellet, 1 for the third pellet, 7 for the fourth pellet, and so on, with the overall average being 5 lever presses for each reinforcer.
Similarly, on a VR 50 schedule, the number of required lever presses may vary between 1 and 100, with the average being 50. VR schedules generally produce a high and steady rate of response with little or no postreinforcement pause (see Figure 7.1). The lack of a postreinforcement pause is understandable if you consider that each response on a VR schedule has the potential of resulting in a reinforcer. For example, on a VR 50 schedule in which the response requirement for each reinforcer varies between 1 and 100, it is possible that the very next lever press will produce another food pellet, even if the rat has just obtained a food pellet.

The real world is filled with examples of VR schedules. Some predatory behaviors, such as that shown by cheetahs, are on VR schedules in that only some attempts at chasing down prey are successful. In humans, only some acts of politeness receive an acknowledgment, only some residents who are called upon by canvassers will make a contribution, and only some CDs that we buy are enjoyable. Many sports activities, such as shooting baskets in basketball and shots on goal in hockey, are also reinforced largely on a VR schedule. A colleague just stopped by and joked that his golf drive is probably on a VR 200 schedule. In other words, he figures that an average of about one in every 200 drives is a good one. I (Russ Powell) replied that my own drives are probably on a much leaner schedule with the result that ratio strain has set in, which is fancy behaviorist talk for "I so rarely hit the ball straight that I have just about given up playing."

Variable ratio schedules help to account for the persistence with which some people display certain maladaptive behaviors. Gambling is a prime example in this regard: The unpredictable nature of these activities results in a very high rate of behavior. In fact, the behavior of a gambler playing a slot machine is the classic example of human behavior controlled by a VR schedule. Certain forms of aberrant social behavior may also be accounted for by VR schedules. For example, why do some men persist in using cute, flippant remarks to introduce themselves to women when the vast majority of women view such remarks negatively? One reason is that a small minority of women actually respond favorably, thereby intermittently reinforcing the use of such remarks. For example, Kleinke, Meeker, and Staneske (1986) found that although 84% of women surveyed rated the opening line "I'm easy. Are you?" as poor to terrible, 14% rated it as either very good or excellent!

Variable ratio schedules of reinforcement may also facilitate the development of an abusive relationship. At the start of a relationship, the individuals involved typically provide each other with an enormous amount of positive reinforcement (a very dense schedule).
This strengthens the relationship and increases each partner's attraction to the other. As the relationship progresses, such reinforcement naturally becomes somewhat more intermittent. In some situations, however, this process becomes malignant, with one person (let us call this person the victimizer) providing reinforcement on an extremely intermittent basis, and the other person (the victim) working incredibly hard to obtain that reinforcement. Because the process evolves gradually (a process of slowly "stretching the ratio"), the victim may have little awareness of what is happening until the abusive pattern is well established. What would motivate such an unbalanced process? One source of motivation is that the less often the victimizer reinforces the victim, the more attention (reinforcement) he or she receives from the victim. In other words, the victim works so hard to get the partner's attention that he or she actually reinforces the very process of being largely ignored by that partner. Of course, it does not necessarily have to be a one-way process, and there may be relationships in which the partners alternate the role of victim and victimizer. The result may be a volatile relationship that both partners find exciting but that is constantly on the verge of collapse due to frequent periods in which each partner experiences "ratio strain."
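The VR contingency can be sketched the same way as the FR one. In this hypothetical sketch (not from the text), the response requirement is redrawn at random after each reinforcer; drawing it uniformly from 1 to 9 gives an average requirement of 5, roughly matching the VR 5 example above.

```python
import random

def variable_ratio(mean, spread):
    """Return a response function for a VR schedule whose requirement
    is drawn uniformly from [mean - spread, mean + spread] after each
    reinforcer, giving an average requirement of `mean`."""
    required = random.randint(mean - spread, mean + spread)
    count = 0
    def respond():
        nonlocal required, count
        count += 1
        if count >= required:
            count = 0
            required = random.randint(mean - spread, mean + spread)
            return True  # reinforcer delivered; requirement redrawn
        return False
    return respond

press = variable_ratio(5, 4)  # VR 5: requirement varies from 1 to 9
earned = sum(press() for _ in range(5000))
print(earned)                 # close to 5000 / 5 = 1000
```

Because the requirement is unpredictable, the very next response might be the reinforced one at any moment, which is one way of seeing why VR schedules produce steady responding with no postreinforcement pause.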

QUICK QUIZ C

1. On a variable ratio schedule, reinforcement is contingent upon a _____________ un_____________ _____________ of responses.
2. A variable ratio schedule typically produces a (high/low) _____________ rate of behavior (with/without) _____________ a postreinforcement pause.
3. An average of 1 in 10 people approached by a panhandler actually gives him money. His behavior of panhandling is on a _____________ schedule of reinforcement.
4. As with an FR schedule, an extremely lean VR schedule can result in r_____________ s_____________.

Fixed Interval Schedules On a fixed interval (FI) schedule, reinforcement is contingent upon the first response after a fixed, predictable period of time. For a rat on a fixed interval 30-second (FI 30-sec) schedule, the first lever press after a 30-second interval has elapsed results in a food pellet. Following that, another 30 seconds must elapse before a lever press will again produce a food pellet. Any lever pressing that occurs during the interval, before the 30-second period has elapsed, is ineffective. Similarly, trying to phone a friend who is due to arrive home in exactly 30 minutes will be effective only after the 30 minutes have elapsed, with any phone calls before that being ineffective.

FI schedules often produce a "scalloped" (upwardly curved) pattern of responding, consisting of a postreinforcement pause followed by a gradually increasing rate of response as the interval draws to a close (see Figure 7.1). For example, a rat on an FI 30-sec schedule will likely emit no lever presses at the start of the 30-second interval. This will be followed by a few tentative lever presses perhaps midway through the interval, with a gradually increasing rate of response thereafter.
By the time the interval draws to a close and the reinforcer is imminent, the rat will be emitting a high rate of response, with the result that the reinforcer will be attained as soon as it becomes available. Would the behavior of trying to phone someone who is due to arrive home in 30 minutes also follow a scalloped pattern (assuming they do not have a cell phone)? If we have a watch available, it probably would not. We would simply look at our watch to determine when the 30 minutes have elapsed and then make our phone call. The indicated time would be a discriminative stimulus (SD) for when the reinforcer is available (i.e., the person is home), and we would wait until the appropriate time before phoning. But what about the behavior of looking at your watch during the 30 minutes (the reinforcer for which would be noticing that the interval has elapsed)? You are unlikely to spend much time looking at your watch at the start of the interval. As time progresses, however, you will begin looking at it more and more frequently. In other words, your behavior will follow the typical scalloped pattern of responding. The distribution of study sessions throughout the term can also show characteristics of an FI scallop. At the start of a course, many students engage in little or no studying. This is followed by a gradual increase in studying as the first exam approaches. The completion of the exam is again followed by little or no studying until the next exam approaches. Unfortunately, these postreinforcement pauses are often too long, with the result that many

students obtain much poorer marks than they would have if they had studied at a steadier pace throughout. (Note, however, that studying for exams is not a pure example of an FI schedule because a certain amount of work must be accomplished during the interval to obtain the reinforcer of a good mark. On a pure FI schedule, any responding that happens during the interval is essentially irrelevant.)

QUICK QUIZ D
1. On a fixed interval schedule, reinforcement is contingent upon the _____________ response following a _____________, pr____________ period of _____________.
2. If I have just missed the bus when I get to the bus stop, I know that I have to wait 15 minutes for the next one to come along. Given that it is absolutely freezing out, I snuggle into my parka as best I can and grimly wait out the interval. Every once in a while, though, I emerge from my cocoon to take a quick glance down the street to see if the bus is coming. My behavior of looking for the bus is on a(n) __________ (use the abbreviation) schedule of reinforcement.
3. In the example in question 2, I will probably engage in (few/frequent) _____________ glances at the start of the interval, followed by a gradually (increasing/decreasing) _____________ rate of glancing as time passes.
4. Responding on an FI schedule is often characterized by a sc_____________ pattern of responding consisting of a p__________________ p________ followed by a gradually (increasing/decreasing) ____________ rate of behavior as the interval draws to a close.
5. On a pure FI schedule, any response that occurs (during/following) _____________ the interval is irrelevant.

Variable Interval Schedules
On a variable interval (VI) schedule, reinforcement is contingent upon the first response after a varying, unpredictable period of time.
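For readers who find a computational sketch helpful, the VI contingency just defined can be illustrated in a few lines of Python. The function names and the uniform sampling of intervals are illustrative assumptions only; real VI tapes are usually built from a fixed series of intervals rather than sampled this way.

```python
import random

def make_vi_intervals(mean_s, n, rng):
    """Draw n inter-reinforcer intervals averaging roughly mean_s seconds.
    Uniform sampling between 1 s and (2 * mean_s - 1) s is a simplifying
    assumption, not a standard laboratory procedure."""
    return [rng.uniform(1, 2 * mean_s - 1) for _ in range(n)]

def first_response_after(interval_start, interval_s, response_times):
    """Return the time of the first response at or after the interval's end,
    the only response the VI contingency reinforces, or None."""
    deadline = interval_start + interval_s
    for t in response_times:
        if t >= deadline:
            return t
    return None

rng = random.Random(0)
intervals = make_vi_intervals(30, 5, rng)  # e.g., five unpredictable intervals
# Responses at 5 s and 12 s fall within the 30-s interval and are wasted;
# the response at 31.5 s is the first one after the interval and is reinforced.
print(first_response_after(0.0, 30.0, [5.0, 12.0, 31.5, 40.0]))  # 31.5
```

Note that responses emitted before the interval elapses simply pass without effect, which is why a steady, moderate rate of responding is an efficient strategy on VI schedules.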
For a rat on a variable interval 30-second (VI 30-sec) schedule, the first lever press after an average interval of 30 seconds will result in a food pellet, with the actual interval on any particular trial varying between, say, 1 and 60 seconds. Thus, the number of seconds that must pass before a lever press will produce a food pellet could be 8 seconds for the first food pellet, 55 seconds for the second pellet, 24 seconds for the third, and so on, the average of which is 30 seconds. Similarly, if each day you are waiting for a bus and have no idea when it will arrive, then looking down the street for the bus will be reinforced after a varying, unpredictable period of time—for example, 2 minutes the first day, 12 minutes the next day, 9 minutes the third day, and so on, with an average interval of, say, 10 minutes (VI 10-min). VI schedules usually produce a moderate, steady rate of response with little or no postreinforcement pause (see Figure 7.1). By responding at a relatively steady rate throughout the interval, the rat on a VI 30-sec schedule will attain the reinforcer almost as soon as it becomes available. Similarly, if you need to contact a friend about some emergency and know that she always arrives home sometime between 6:00 p.m. and 6:30 p.m., a good strategy would be to phone

every few minutes throughout that time period. By doing so, you will almost certainly contact her within a few minutes of her arrival. Because VI schedules produce predictable response rates, as well as predictable rates of reinforcement, they are often used to investigate other aspects of operant conditioning, such as those involving matters of choice between alternative sources of reinforcement. You will encounter examples of this when we discuss choice behavior in Chapter 10.

QUICK QUIZ E
1. On a variable interval schedule, reinforcement is contingent upon the _____________ response following a _____________, un__________ period of _____________.
2. You find that by frequently switching stations on your radio, you are able to hear your favorite song an average of once every 20 minutes. Your behavior of switching stations is thus being reinforced on a _____________ schedule.
3. In general, variable interval schedules produce a (low/moderate/high) _____________, (steady/fluctuating) ___________________________ rate of response with little or no ___________________________________________________.

Comparing the Four Basic Schedules
The four basic schedules produce quite different patterns of behavior, which vary in both the rate of response and in the presence or absence of a postreinforcement pause. These characteristics are summarized in Table 7.1. As can be seen, ratio schedules (FR and VR) produce higher rates of response than do interval schedules (FI and VI). This makes sense because the reinforcer in such schedules is entirely “response contingent”; that is, it depends entirely on the number of responses emitted. For this reason, a rat on a VR 100 schedule can double the number of food pellets earned in a 1-hour session by doubling its rate of lever pressing.
Similarly, a door-to-door salesman can double the number of sales he makes during a day by doubling the number of customers he calls on (assuming that he continues to give an adequate sales pitch to each customer).

TABLE 7.1 Characteristic response rates and postreinforcement pauses for each of the four basic intermittent schedules. These are only general characteristics; they are not found under all circumstances. For example, an FR schedule with a very low response requirement, such as FR 2, is unlikely to produce a postreinforcement pause. By contrast, an FR schedule with a very high response requirement, such as FR 2000, may result in a ratio strain and a complete cessation of responding.

                          FR     VR     FI          VI
Response rate             High   High   Increasing  Moderate
Postreinforcement pause   Yes    No     Yes         No

Compare this to an interval schedule

in which reinforcement is mostly time contingent. For example, on an FI 1-minute schedule, no more than 50 reinforcers can be earned in a 50-minute session. Under such circumstances, responding at a high rate throughout each interval does not pay off and is essentially a waste of energy. Instead, it makes more sense to respond in a way that will maximize the possibility of attaining the reinforcer soon after it becomes available. On an FI schedule, this means responding at a gradually increasing rate as the interval draws to a close; on a VI schedule, this means responding at a moderate, steady pace throughout the interval.

It can also be seen that fixed schedules (FR and FI) tend to produce postreinforcement pauses, whereas variable schedules (VR and VI) do not. On a variable schedule, there is always the possibility of a relatively immediate reinforcer, even if one has just attained a reinforcer, which tempts one to immediately resume responding. By comparison, on a fixed schedule, attaining one reinforcer means that the next reinforcer is necessarily some distance away. On an FR schedule, this results in a short postreinforcement pause before grinding out another set of responses; on an FI schedule, the postreinforcement pause is followed by a gradually increasing rate of response as the interval draws to a close and the reinforcer becomes imminent.

QUICK QUIZ F
1. In general, (ratio/interval) __________________ schedules tend to produce a high rate of response. This is because the reinforcer in such schedules is entirely r_____________ contingent, meaning that the rapidity with which responses are emitted (does/does not) _____________ greatly affect how soon the reinforcer is obtained.
2. On ________________ schedules, the reinforcer is largely time contingent, meaning that the rapidity with which responses are emitted has (little/considerable) _______________ effect on how quickly the reinforcer is obtained.
3.
In general, (variable/fixed) ________________ schedules produce little or no postreinforcement pausing because such schedules provide the possibility of relatively i_____________ reinforcement, even if one has just obtained a reinforcer.
4. In general, ______________ schedules produce postreinforcement pauses because obtaining one reinforcer means that the next reinforcer is necessarily quite (distant/close) ________________.

Other Simple Schedules of Reinforcement

Duration Schedules
On a duration schedule, reinforcement is contingent on performing a behavior continuously throughout a period of time. On a fixed duration (FD) schedule, the behavior must be performed continuously for a fixed, predictable period of time. For example, the rat must run in the wheel for 60 seconds to earn one pellet of food (an FD 60-sec schedule). Likewise, Julie may decide that her son can watch television each evening only after he completes 2 hours of studying (an FD 2-hr schedule).
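The FD contingency can also be sketched computationally. This hypothetical helper simply asks whether any continuous bout of the behavior lasted long enough; the function name and the (start, stop) time format are assumptions made for illustration.

```python
def fd_reinforced(behavior_spans, duration_s):
    """Check a fixed duration (FD) contingency: reinforcement requires the
    behavior to be performed continuously for duration_s seconds.
    behavior_spans is a list of (start, stop) times for each bout."""
    return any(stop - start >= duration_s for start, stop in behavior_spans)

# FD 60-sec: the rat ran for 40 s, paused, then ran for 65 s straight.
print(fd_reinforced([(0, 40), (50, 115)], 60))  # True: the 65-s bout qualifies
print(fd_reinforced([(0, 40), (50, 100)], 60))  # False: no continuous 60-s bout
```

Notice that the check says nothing about how vigorously the behavior was performed during the bout, which anticipates the imprecision of duration schedules discussed next.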

On a variable duration (VD) schedule, the behavior must be performed continuously for a varying, unpredictable period of time. For example, the rat must run in the wheel for an average of 60 seconds to earn one pellet of food, with the required time varying between 1 second and 120 seconds on any particular trial (a VD 60-sec schedule). And Julie may decide to reinforce her son’s studying with cookies and other treats at varying points in time that happen to average out to about one treat every 30 minutes (a VD 30-min schedule). (Question: How do FD and VD schedules differ from FI and VI schedules?)

Although duration schedules are sometimes useful in modifying certain human behaviors, such as studying, they are in some ways rather imprecise compared to the four basic schedules discussed earlier. With FR schedules, for example, one knows precisely what was done to achieve the reinforcer, namely, a certain number of responses. On an FD schedule, however, what constitutes “continuous performance of behavior” during the interval could vary widely. With respect to wheel running, for example, a “lazy” rat could dawdle along at barely a walk, while an “energetic” rat might rotate the wheel at a tremendous pace. Both would receive the reinforcer. Similarly, Julie’s son might read only a few pages during his 2-hour study session or charge through several chapters; in either case, he would receive the reinforcer of being allowed to watch television. Remember too, from Chapter 6, how reinforcing the mere performance of an activity with no regard to level of performance can undermine a person’s intrinsic interest in that activity. This danger obviously applies to duration schedules. One therefore needs to be cautious in their use.

Response-Rate Schedules
As we have seen, different types of intermittent schedules produce different rates of response (i.e., they have different schedule effects).
These different rates are essentially by-products of the schedule. However, in a response-rate schedule, reinforcement is directly contingent upon the organism’s rate of response. Let’s examine three types of response-rate schedules.

(Cartoon: Which of these workers is on a ratio schedule of reinforcement? © Scott Adams/Dist. by United Feature Syndicate, Inc.)

In differential reinforcement of high rates (DRH), reinforcement is contingent upon emitting at least a certain number of responses in a certain period of time—or, more generally, reinforcement is provided for responding at a fast rate. The term differential reinforcement means simply that one type of response is reinforced while another is not. In a DRH schedule, reinforcement is provided for a high rate of response and not for a low rate. For example, a rat might receive a food pellet only if it emits at least 30 lever presses within a period of a minute. Similarly, a worker on an assembly line may be told that she can keep her job only if she assembles a minimum of 20 carburetors per hour. By requiring so many responses in a short period of time, DRH schedules ensure a high rate of responding. Athletic events such as running and swimming are prime examples of DRH schedules in that winning is directly contingent on a rapid series of responses.

In differential reinforcement of low rates (DRL), a minimum amount of time must pass between each response before the reinforcer will be delivered—or, more generally, reinforcement is provided for responding at a slow rate. For example, a rat might receive a food pellet only if it waits at least 10 seconds between lever presses. So how is this different from an FI 10-sec schedule? Remember that on an FI schedule, responses that occur during the interval have no effect; on a DRL schedule, however, responses that occur during the interval do have an effect—an adverse effect in that they prevent reinforcement from occurring. In other words, responding during the interval must not occur in order for a response following the interval to produce a reinforcer. Human examples of DRL schedules consist of situations in which a person is required to perform some action slowly.
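Before turning to those human examples, the DRL contingency itself can be sketched in code. This toy function is an assumption-laden illustration: it treats the very first response as reinforceable and, as in the text, lets every response (reinforced or not) restart the required waiting period.

```python
def drl_reinforced_responses(response_times, min_irt_s):
    """Sketch of a DRL contingency: a response is reinforced only if at
    least min_irt_s seconds have passed since the previous response.
    Treating the first response as reinforceable is a simplifying assumption."""
    reinforced = []
    last = None
    for t in response_times:
        if last is None or t - last >= min_irt_s:
            reinforced.append(t)
        last = t  # every response, reinforced or not, restarts the timer
    return reinforced

# DRL 10-sec: the response at 14 s comes only 4 s after the previous one,
# so it goes unreinforced and restarts the required waiting period.
print(drl_reinforced_responses([0, 10, 14, 25], 10))  # [0, 10, 25]
```

The adverse effect of premature responding is visible in the example: the ill-timed response at 14 s costs the animal a reinforcer it would otherwise have earned.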
For example, a parent might praise a child for brushing her teeth slowly or completing her homework slowly, given that going too fast generally results in sloppy performance. Once the quality of performance improves, reinforcement can then be made contingent on responding at a normal speed.

In differential reinforcement of paced responding (DRP), reinforcement is contingent upon emitting a series of responses at a set rate—or, more generally, reinforcement is provided for responding neither too fast nor too slow. For example, a rat might receive a food pellet if it emits 10 consecutive responses, with each response separated by an interval of no less than 1.5 and no more than 2.5 seconds. Similarly, musical activities, such as playing in a band or dancing to music, require that the relevant actions be performed at a specific pace. People who are very good at this are said to have a good sense of timing or rhythm. Further examples of DRP schedules can be found in noncompetitive swimming or running. People often perform these activities at a pace that is fast enough to ensure benefits to health and a feeling of well-being, yet not so fast as to result in exhaustion and possible injury. In fact, even competitive swimmers and runners, especially those who compete over long distances, will often set a specific pace throughout much of the race. Doing so ensures that they have sufficient energy at the

end for a last-minute sprint (DRH) to the finish line, thereby maximizing their chances of clocking a good time.

QUICK QUIZ G
1. On a (VD/VI) ___________ schedule, reinforcement is contingent upon responding continuously for a varying period of time; on an (FI/FD) ____________ schedule, reinforcement is contingent upon the first response after a fixed period of time.
2. As Tessa sits quietly, her mother occasionally gives her a hug as a reward. This is an example of a ______________ _____________ schedule.
3. In practicing the slow-motion form of exercise known as tai chi, Yang noticed that the more slowly he moved, the more thoroughly his muscles relaxed. This is an example of d______________ reinforcement of _____________ _____________ behavior (abbreviated ________).
4. On a video game, the faster you destroy all the targets, the more bonus points you obtain. This is an example of _____________ reinforcement of _______________ ___________ behavior (abbreviated ________).
5. Frank discovers that his golf shots are much more accurate when he swings the club with a nice, even rhythm that is neither too fast nor too slow. This is an example of _____________ reinforcement of _____________ behavior (abbreviated ________).

Noncontingent Schedules
On a noncontingent schedule of reinforcement, the reinforcer is delivered independently of any response. In other words, a response is not required for the reinforcer to be obtained. Such schedules are also called response-independent schedules. There are two types of noncontingent schedules: fixed time and variable time. On a fixed time (FT) schedule, the reinforcer is delivered following a fixed, predictable period of time, regardless of the organism’s behavior. For example, on a fixed time 30-second (FT 30-sec) schedule, a pigeon receives access to food every 30 seconds regardless of its behavior.
Likewise, many people receive Christmas gifts each year, independently of whether they have been naughty or nice—an FT 1-year schedule. FT schedules therefore involve the delivery of a “free” reinforcer following a predictable period of time. On a variable time (VT) schedule, the reinforcer is delivered following a varying, unpredictable period of time, regardless of the organism’s behavior. For example, on a variable time 30-second (VT 30-sec) schedule, a pigeon receives access to food after an average interval of 30 seconds, with the actual interval on any particular trial ranging from, say, 1 second to 60 seconds. Similarly, you may coincidentally run into an old high school chum about every 3 months on average (a VT 3-month schedule). VT schedules therefore involve the delivery of a free reinforcer following an unpredictable period of time. (Question: How do FT and VT schedules differ from FI and VI schedules?)
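One way to see the contrast the Question points toward is to sketch both contingencies in code. The helper names are hypothetical, and the assumption that each new FI interval is timed from the moment of delivery is a simplification made for illustration.

```python
def ft_delivery_times(session_len_s, period_s):
    """Fixed time (FT): reinforcers arrive on a clock.  Note that no
    record of responding is needed at all -- delivery is response-independent."""
    return list(range(period_s, session_len_s + 1, period_s))

def fi_delivery_times(session_len_s, period_s, response_times):
    """Fixed interval (FI), for contrast: each delivery must wait for the
    first response after the interval elapses.  Timing the next interval
    from the moment of delivery is an illustrative simplification."""
    deliveries, next_available = [], period_s
    for t in response_times:
        if t <= session_len_s and t >= next_available:
            deliveries.append(t)
            next_available = t + period_s
    return deliveries

print(ft_delivery_times(90, 30))                # [30, 60, 90]: behavior irrelevant
print(fi_delivery_times(90, 30, [10, 35, 70]))  # [35, 70]: a response is required
```

The difference is visible in the function signatures themselves: the FT sketch never consults the organism's responses, whereas the FI sketch cannot deliver anything without them.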

QUICK QUIZ H
1. On a non_____________ schedule of reinforcement, a response is not required to obtain a reinforcer. Such a schedule is also called a response i____________ schedule of reinforcement.
2. Every morning at 7:00 A.M. a robin perches outside Marilyn’s bedroom window and begins singing. Given that Marilyn very much enjoys the robin’s song, this is an example of a ______________ ___________ 24-hour schedule of reinforcement (abbreviated __________).
3. For farmers, rainfall is an example of a noncontingent reinforcer that is typically delivered on a ______________________ __________________________ schedule (abbreviated _________________).

Noncontingent reinforcement may account for some forms of superstitious behavior. In the first investigation of this possibility, Skinner (1948b) presented pigeons with food every 15 seconds (FT 15-sec) regardless of their behavior. Although you might think that such free reinforcers would have little effect on the pigeons’ behavior (other than encouraging them to stay close to the feeder), quite the opposite occurred. Six of the eight pigeons began to display ritualistic patterns of behavior. For example, one bird began turning counterclockwise circles, while another repeatedly thrust its head into an upper corner of the chamber. Two other pigeons displayed a swaying pendulum motion of the head and body. Skinner believed these behaviors evolved because they had been accidentally reinforced by the coincidental presentation of food. For example, if a pigeon just happened to turn a counterclockwise circle before food delivery, that behavior would be accidentally reinforced and increase in frequency. This would increase the likelihood of the same behavior occurring the next time food was delivered, which would further strengthen it. The eventual result would be a well-established pattern of turning circles, as though turning circles somehow caused the food to appear.
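Skinner's proposed mechanism, the accidental strengthening of whatever behavior happens to precede a free reinforcer, lends itself to a toy simulation. Everything below (the behavior names, the multiplicative boost, the tick-based timing) is an assumption made for illustration, not a model drawn from the research literature.

```python
import random

def superstition_sim(behaviors, ft_period, n_ticks, boost, rng):
    """Toy model of adventitious reinforcement on an FT schedule.  Each tick
    one behavior is emitted in proportion to its current weight; whatever
    behavior happens to precede a free reinforcer has its weight multiplied
    by boost, even though no behavior causes delivery."""
    weights = {b: 1.0 for b in behaviors}
    for tick in range(1, n_ticks + 1):
        total = sum(weights.values())
        draw, acc, emitted = rng.uniform(0, total), 0.0, behaviors[-1]
        for b in behaviors:
            acc += weights[b]
            if draw <= acc:
                emitted = b
                break
        if tick % ft_period == 0:      # free food, independent of behavior
            weights[emitted] *= boost  # accidental strengthening
    return weights

final = superstition_sim(["peck", "turn", "bob"], ft_period=15,
                         n_ticks=600, boost=1.5, rng=random.Random(42))
# Typically one behavior snowballs: early accidents make it more likely to
# precede the next free reinforcer, which strengthens it further.
print(max(final, key=final.get))
```

The positive feedback loop in the last comment is the heart of Skinner's account: the "ritual" needs no causal link to food, only an initial coincidence.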
Some researchers have argued that Skinner’s evidence for superstitious behavior in the pigeon may not be as clear-cut as he believed. They claim that at least some of the ritualistic behaviors he observed may have consisted of innate tendencies, almost like fidgeting behaviors, that are often elicited during a period of waiting (Staddon & Simmelhag, 1971). These tendencies, which are discussed in Chapter 11, are known as adjunctive behaviors. Nevertheless, other experiments have replicated the effect of noncontingent reinforcement on the development of superstitious behavior. Ono (1987), for example, placed students in a booth that contained three levers and a counter. The students were told that “if you do something, you may get points on the counter” (p. 263). They were also told to get as many points as possible. In reality, the points were delivered on either an FT or VT schedule, so the students’ behavior actually had no effect on point delivery. Nevertheless, most students developed at least temporary patterns of superstitious lever pulling; that is, they pulled the lever as though it were effective in producing points. Interestingly, one student started with lever pulling but then coincidentally received a point after simply touching the counter. This led to a

superstitious pattern of climbing on the counter and touching different parts of the apparatus, apparently in the belief that this action produced the points. She then jumped off the apparatus at just the time that she received another point, which led to a superstitious pattern of repeatedly jumping in the air and touching the ceiling! After several minutes of this, she finally quit, apparently as a result of fatigue.

Professional athletes and gamblers are particularly prone to the development of superstitions, some of which may evolve in the manner that Skinner suggests. Under constant threat of losing their position to an eager newcomer, professional athletes are constantly on the lookout for anything that might enhance their performance. As a result, unusual events that precede a fine performance, such as humming a certain tune or wearing an unusual article of clothing, may be quickly identified and then deliberately reproduced in the hopes of reproducing that performance. Gamblers display even stronger tendencies toward the development of superstitions, probably because the activity in which they are engaged is even more uncertain in its outcome. Bingo players, for example, commonly carry lucky pendants, stuffed animals, or pieces of jewelry to each game, and they are often adamant (almost pathologically so) about obtaining cards that contain certain patterns or are drawn from the top or bottom of the stack. Many of these rituals probably evolved because they were at one time associated with a big win.

Herrnstein (1966) noted that superstitious behaviors can sometimes develop as by-products of contingent reinforcement for some other behavior. For example, a businessman might believe it is important to impress customers with a firm handshake—when in fact it is merely the handshake, and not the firmness of the handshake, that is the critical factor.
(Unfortunately, such a superstition could have serious consequences if the businessman then attempts to branch out into the Asian market, where a firm handshake is often regarded as a sign of disrespect.) Similarly, some managers might come to believe that “pushing the panic button” is an effective way to deal with crises, simply because it is usually followed by a successful outcome. What they fail to realize is that a low-key approach might have been equally if not more effective—and certainly a lot less stressful.

Question: Although Skinner’s (1948b) original demonstration of superstitious behavior involved the use of a fixed time schedule, you might wish to consider whether superstitious behavior in humans is more likely to develop under a fixed or variable time schedule. To answer this, think about the types of situations in which you are particularly likely to find superstitious behavior in humans. Is it in situations that involve predictable events or unpredictable events? Obviously, it is unpredictable events, such as games of chance, performance in sports, fishing (“Jana’s lucky lure”), and so forth. In this sense, at least from a human perspective, superstitious behavior can be seen as an attempt to make an unpredictable situation more predictable.

QUICK QUIZ I
1. When noncontingent reinforcement happens to follow a particular behavior, that behavior may (increase/decrease) _____________ in strength. Such behavior is referred to as s_____________ behavior.
2. Herrnstein (1966) noted that superstitious behaviors can sometimes develop as a by-product of c_______________ reinforcement for some other behavior.
3. As shown by the kinds of situations in which superstitious behaviors develop in humans, such behaviors seem most likely to develop on a(n) (VT/FT) _____ schedule of reinforcement.

What happens if a noncontingent schedule of reinforcement is superimposed on a regular, contingent schedule of reinforcement? What if, for example, a pigeon responding on a VI schedule of food reinforcement also receives extra reinforcers for free? Will the pigeon’s rate of response on the VI schedule increase or decrease? In fact, the pigeon’s rate of response on the response-dependent schedule will decrease (Rachlin & Baum, 1972). Just as people on welfare sometimes become less inclined to look for work, the pigeon that receives free reinforcers will work less vigorously for the contingent reinforcers. Suggestive evidence of this effect can also be found among professional athletes. One study, conducted several years ago, found that major league pitchers who had signed long-term contracts showed a significant decline in number of innings pitched relative to pitchers who only signed a 1-year contract (O’Brien, Figlerski, Howard, & Caggiano, 1981) (see Figure 7.2). Insofar as a long-term contract virtually guarantees a hefty salary regardless of performance, these results suggest that athletic performance may suffer when the money earned is no longer contingent on performance. (Question: Can you think of alternative explanations for this finding?)
At this point, you might be thinking that noncontingent reinforcement is all bad, given that it leads to superstitious behavior in some situations and to poor performance in others. In fact, noncontingent reinforcement is sometimes quite beneficial. More specifically, it can be an effective means of reducing the frequency of maladaptive behaviors. For example, children who act out often do so to obtain attention. If, however, they are given a sufficient amount of attention on a noncontingent basis, they will no longer have to act out to obtain it. Noncontingent reinforcement has even been shown to reduce the frequency of self-injurious behavior. Such behavior, which can consist of head-banging or biting chunks of flesh out of one’s arm, is sometimes displayed by people who suffer from retardation or autism; it can be notoriously difficult to treat. In many cases, the behavior appears to be maintained by the attention it elicits from caretakers. Research has shown, however, that if the caretakers provide the individual with plenty of attention on a noncontingent basis, then the frequency of their self-injurious behavior may be greatly reduced (e.g., Hagopian, Fisher, & Legacy, 1994). In a sense, such individuals no longer have to injure themselves to receive attention because they are now receiving lots of attention for free.

FIGURE 7.2 Average number of innings pitched by major league pitchers in the years before and after signing long-term contracts. (Source: Coon, 1998. Data from O’Brien et al., 1981.) (Line graph: mean innings pitched plotted across years, with the point at which the guaranteed contract was signed marked on the curve.)

Interestingly, the beneficial effects of noncontingent reinforcement can be seen as providing empirical support for the value of what Carl Rogers (1959), the famous humanistic psychologist, called “unconditional positive regard.” Unconditional positive regard refers to the love, respect, and acceptance that one receives from significant others, regardless of one’s behavior. Rogers assumed that such regard is a necessary precondition for the development of a healthy personality. From a behavioral perspective, unconditional positive regard can be viewed as a form of noncontingent social reinforcement, which can indeed have beneficial effects. In fact, it seems likely that proper child rearing requires healthy doses of both noncontingent reinforcement, which gives the child a secure base from which to explore the world and take risks, and contingent reinforcement, which helps to shape the child’s behavior in appropriate ways, maximize skill development, and prevent the development of passivity. Thus, Abraham Maslow (1971), another famous humanistic psychologist, argued that child rearing should be neither too restrictive nor too lenient, which in behavioral terms

can be taken to imply that the social reinforcement children receive should be neither excessively contingent nor excessively noncontingent.

QUICK QUIZ J
1. During the time that a rat is responding for food on a VR 100 schedule, we begin delivering additional food on a VT 60-second schedule. As a result, the rate of response on the VR schedule is likely to (increase/decrease/remain unchanged) _____________.
2. In many mixed martial arts matches, each fighter typically receives a guaranteed purse, regardless of the outcome. In the Ultimate Fighter series, the winner of the final match is awarded a major contract in the UFC while the loser receives nothing. As a result, Karo is not surprised when he notices fighters in the latter event (more/less) _____________ often fighting to the point of complete exhaustion, since the monetary reinforcer tied to the match is (contingent/not contingent) _____________ upon winning the match.
3. A child who is often hugged during the course of the day, regardless of what he is doing, is in humanistic terms receiving unconditional positive regard. In behavioral terms, he is receiving a form of non______________ social reinforcement. As a result, this child may be (more/less) ___________ likely to act out in order to receive attention.

Complex Schedules of Reinforcement
All of the schedules previously described are relatively simple in that there is only one basic requirement. On the other hand, a complex schedule consists of a combination of two or more simple schedules. There are a wide variety of such schedules, three of which are described here. Two other types of complex schedules—multiple schedules and concurrent schedules—are discussed in later chapters.

Conjunctive Schedules
A conjunctive schedule is a type of complex schedule in which the requirements of two or more simple schedules must be met before a reinforcer is delivered.
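In computational terms, a conjunctive requirement is simply a logical AND across the component schedules' criteria. The sketch below is a hypothetical illustration that simplifies the FI component to "enough time has elapsed"; a real conjunctive FI also requires a response after the interval.

```python
def conjunctive_met(n_responses, elapsed_s, fr_requirement, fi_interval_s):
    """Sketch of a conjunctive FI + FR schedule: BOTH criteria must be
    satisfied before the reinforcer is delivered."""
    return n_responses >= fr_requirement and elapsed_s >= fi_interval_s

# Conjunctive FI 2-min (120 s) FR 100:
print(conjunctive_met(100, 90, 100, 120))   # False: enough presses, too little time
print(conjunctive_met(100, 125, 100, 120))  # True: both requirements met
```

The single `and` is what distinguishes this from the chained schedules described shortly, where the components must also be completed in a particular order.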
For example, on a conjunctive FI 2-minute FR 100 schedule, reinforcement is contingent upon completing 100 lever presses and completing at least one lever press following a 2-minute interval. Many of the contingencies that we encounter in everyday life are examples of conjunctive schedules. The wages you earn on a job are contingent upon working a certain number of hours each week and doing a sufficient amount of work so that you will not be fired. Likewise, Jon’s fiancée might have chosen to marry him because he is kind and humorous and interesting and drives a Porsche. With any one of these components missing, he would not have received the reinforcer of being engaged to her.

Adjusting Schedules
In an adjusting schedule, the response requirement changes as a function of the organism’s performance while responding for the previous reinforcer. For example, on an FR 100 schedule, if the rat completes all 100 responses within a 5-minute interval, we may then increase the requirement

to 110 responses (FR 110). In other words, because it has performed so well, we expect even better performance in the future. In a similar fashion, when Seema displayed excellent ability in mastering her violin lessons, she and her parents decided to increase the amount she had to learn each week. And when Lily’s high school students performed poorly on their exams, she gradually decreased the amount of material they had to learn each week. (It is, of course, in this manner that standards in school become gradually lowered, often to the detriment of the students.)

Note that the process of shaping also involves an adjusting schedule insofar as the criterion for reinforcement is raised depending on the animal’s performance. As soon as the rat has learned to stand near the lever to get food, one raises the criterion to touching the lever, placing a paw on the lever, and so forth. The requirement for reinforcement changes as soon as the rat has successfully met the previous requirement.

QUICK QUIZ K
1. A complex schedule is one that consists of _______________________________.
2. In a(n) _____________ schedule, the response requirement changes as a function of the organism’s performance while responding for the previous reinforcer, while in a(n) _____________ schedule, the requirements of two or more simple schedules must be met before the reinforcer is delivered.
3. To the extent that a gymnast is trying to improve his performance, he is likely on a(n) _____________ schedule of reinforcement; to the extent that his performance is judged according to both the form and quickness of his moves, he is on a(n) _____________ schedule.

Chained Schedules
A chained schedule consists of a sequence of two or more simple schedules, each of which has its own SD and the last of which results in a terminal reinforcer.
In other words, the person or animal must work through a series of component schedules to obtain the sought-after reinforcer. A chained schedule differs from a conjunctive schedule in that the two component schedules must be completed in a particular order, which is not required in a conjunctive schedule.

As an example of a chained schedule, a pigeon in a standard operant conditioning chamber is presented with a VR 20 schedule on a green key, followed by an FI 10-sec schedule on a red key, which then leads to the terminal reinforcer of food. Thus, an average of 20 responses on the green key will result in a change in key color to red, following which the first response on the red key after a 10-second interval will be reinforced by food. The food is the terminal reinforcer that supports the entire chain. This chain can be diagrammed as follows:

        VR 20                  FI 10-sec
Green key: Peck   →   Red key: Peck   →   Food
   SD        R        SR/SD      R         SR
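The two-link chain diagrammed above can be simulated in a few lines. This is an illustrative sketch only: the class and method names are ours, and for simplicity the VR requirement is drawn uniformly from 1 to 39 (mean 20).

```python
import random

# Two-link chain: VR 20 on the green key, then FI 10-sec on the red key,
# ending in food, as diagrammed above. Names are illustrative assumptions.

class TwoLinkChain:
    def __init__(self, mean_ratio=20, interval=10.0, seed=0):
        self.mean_ratio = mean_ratio
        self.interval = interval
        self.rng = random.Random(seed)
        self.key = "green"                 # SD for the initial VR link
        self.presses = 0
        self.link_start = 0.0
        self._new_ratio()

    def _new_ratio(self):
        # VR requirement: uniform on 1..(2*mean - 1), so the mean is mean_ratio
        self.ratio = self.rng.randint(1, 2 * self.mean_ratio - 1)

    def peck(self, time):
        """Record one peck at `time` (seconds); return True if food is delivered."""
        if self.key == "green":
            self.presses += 1
            if self.presses >= self.ratio:
                self.key = "red"           # key change: secondary reinforcer + SD
                self.link_start = time
            return False
        # Red key: the first peck at least `interval` after onset produces food.
        if time - self.link_start >= self.interval:
            self.key = "green"             # terminal reinforcer; chain resets
            self.presses = 0
            self._new_ratio()
            return True
        return False
```

Pecking the green key never produces food directly; its only programmed consequence is the change to the red key, which then supports the FI link that ends in the terminal reinforcer.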

Note that the presentation of the red key is both a secondary reinforcer for completing the preceding VR 20 schedule and an SD for responding on the subsequent FI 10-sec schedule. Note, too, that this is an example of a two-link chain, with the VR 20 schedule constituting the first, or initial, link and the FI 10-sec schedule constituting the second, or terminal, link. By adding yet another schedule to the start of the chain, we can create a three-link chain, for example:

   VI 30-sec              VR 20                  FI 10-sec
White key: Peck   →   Green key: Peck   →   Red key: Peck   →   Food
   SD        R        SR/SD      R         SR/SD      R          SR

In this case, both the green and red keys function as secondary reinforcers that help maintain behavior throughout the chain.

QUICK QUIZ L

1. A chained schedule consists of a sequence of two or more simple schedules, each of which has its own _____________ and the last of which results in a t____________ r___________________.
2. Within a chain, completion of each of the early links ends in a(n) s______________ reinforcer, which also functions as the _________________ for the next link of the chain.

Once pigeons learn which schedule is associated with which key, they generally show the appropriate response patterns for those schedules. In the preceding example, this would be a moderate, steady rate of response on the white key, a high rate of response on the green key, and a scalloped pattern of responding on the red key. Nevertheless, responding tends to be somewhat weaker in the earlier links of a chain than in the later links. This can be seen most clearly when each link consists of the same schedule.
For example, Kelleher and Fry (1962) presented pigeons with a three-link chained schedule with each link consisting of an FI 60-sec schedule:

   FI 60-sec             FI 60-sec              FI 60-sec
White key: Peck   →   Green key: Peck   →   Red key: Peck   →   Food
   SD        R        SR/SD      R         SR/SD      R          SR

The pigeons displayed very long pauses and a slow rate of response on the white key compared to the other two keys. The greatest amount of responding occurred on the red key.

Why would the earlier links of the chain be associated with weaker responding? One way of looking at it is that in the later links, the terminal reinforcer is more immediate and hence more influential, while in the early links, the terminal reinforcer is more distant and hence less influential (remember that delayed reinforcement is less effective than immediate reinforcement). Another way of looking at it is that the secondary reinforcers supporting behavior in the early links are less directly associated with food and are therefore relatively weak (e.g., the green key is associated with food only indirectly through its association

