Saturday, February 24, 2007

Learned imitative reinforcers

This is a difficult concept for some students to get, so I'll see if I can supplement Malott's discussion a little bit.

If you do something and I do the same thing, I can tell that I'm doing the same thing you're doing because I can see that we're doing the same thing. If you say something and I say the same thing, then I can hear that I said the same thing. In other words, I know when my actions match yours because of the perceptual feedback I get from you and from myself. But it goes beyond seeing and hearing. If you raise your arm in the air and I do the same, even if my eyes are closed I'm getting perceptual (proprioceptive) feedback informing me that my arm is raised. All of these types of perceptual feedback are stimuli.

If a child imitates someone else's behavior and the imitation is reinforced, then those reinforcers are paired with the stimuli that inform the child that their behavior matches the model's. When this has happened a sufficient number of times, and in a variety of imitative situations, the stimuli showing us that our behavior matches a model's become learned reinforcers. From that point on, whenever we perceive that our behavior matches someone else's, that matching (imitative) behavior is automatically reinforced by those learned imitative reinforcers.

That's how generalized imitation happens.
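
If it helps to see that pairing history laid out step by step, here's a toy Python sketch. Everything in it – the class, the value counter, the threshold of 10 pairings – is an illustrative assumption of mine, not anything from Malott; it just shows how repeated pairings could turn neutral matching stimuli into a learned reinforcer.

```python
class MatchStimulus:
    """Perceptual feedback that 'my behavior matches the model's.'"""
    def __init__(self):
        self.reinforcing_value = 0.0   # starts out neutral

    def pair_with_reinforcer(self, strength=1.0):
        # Each time an imitation is externally reinforced, the matching
        # stimuli are paired with that reinforcer and gain value.
        self.reinforcing_value += strength

    def is_learned_reinforcer(self, threshold=10.0):
        # Hypothetical criterion for "a sufficient number of pairings."
        return self.reinforcing_value >= threshold

match_feedback = MatchStimulus()

# A caregiver reinforces the child's imitations across varied situations.
for situation in range(12):
    match_feedback.pair_with_reinforcer()

# From now on, even a NOVEL imitative response produces the matching
# stimuli, which reinforce the imitation automatically.
print(match_feedback.is_learned_reinforcer())   # True: generalized imitation
```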

Wednesday, February 21, 2007

What is a response class?

Revised on 3/29/14

In Ch. 7 (p. 128) Malott discusses the three ways to define a response class. Skinner pointed out that no one ever performs a response/behavior the same way twice. Opening the refrigerator with your right hand is basically the same behavior as opening it with your left hand, even though they're obviously different too. But because they're more similar in important ways than they are different, they're considered members of the same response class. So a response class is a collection of similar behaviors.

What Malott does for us in Ch. 7 is to explain the three ways in which the members of a response class can be similar to each other. If two or more responses are similar in one or more of these ways, then they're members of the same response class.

(1) Behaviors can be similar on one or more response dimensions. A response dimension is a physical property of a response, such as its force or duration. So responses may be members of the same response class because they're physically similar.

(2) Behaviors can also be similar because they share the effects of reinforcement or punishment. That means that if one member of a response class is reinforced or punished, and its frequency subsequently changes, the frequency of the other members of the response class will also change in the same direction, even though they haven't been directly reinforced or punished. An implication of this is that if reinforcing or punishing a behavior changes its frequency, and the frequency of another behavior also changes in the same way, that's an indication that the two behaviors are members of the same response class.

(3) Behaviors can also be similar because they serve the same function or produce the same outcome. That means that if a behavior is followed by a particular reinforcer or aversive stimulus (punisher), then other members of the same response class will also be followed by that reinforcer or punisher. So if two behaviors produce the same reinforcing or punishing consequence, that's an indication that they're members of the same response class. This doesn't prove that they're members of the same response class, but it may suggest that you should investigate further to determine if, in fact, they are.
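
To pull the three criteria together, here's a rough Python sketch that treats (1) and (3) as simple yes/no tests on a pair of responses. The fields and the tolerance are illustrative assumptions of mine; notice that criterion (2) can't be written this way, because it's an experimental test rather than something you can read off a description.

```python
from dataclasses import dataclass

@dataclass
class Response:
    topography: str      # e.g., "open fridge, right hand"
    force: float         # a response dimension (physical property)
    duration: float      # another response dimension
    consequence: str     # the outcome the response produced

def similar_dimensions(a: Response, b: Response, tol: float = 0.25) -> bool:
    # (1) Physically similar on one or more response dimensions.
    return abs(a.force - b.force) < tol or abs(a.duration - b.duration) < tol

def same_outcome(a: Response, b: Response) -> bool:
    # (3) Serve the same function / produce the same outcome.
    return a.consequence == b.consequence

right = Response("open fridge, right hand", force=1.0, duration=2.0,
                 consequence="access to food")
left = Response("open fridge, left hand", force=1.1, duration=2.1,
                consequence="access to food")

# Criterion (2) -- shared effects of reinforcement -- is an experimental
# test (reinforce one member, watch the other's frequency), so it can't
# be reduced to a static check like the two above.
print(similar_dimensions(right, left), same_outcome(right, left))  # True True
```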

Sunday, February 18, 2007

Dipper training CyberRat: What's going on

What's really going on when you conduct the procedure that Malott calls "dipper training" and that is often called "magazine training"?

As you'll learn in Ch. 20, a stimulus can have more than one function. In the case of the sound that Malott calls the "dipper click," this stimulus takes on more than one function as a result of dipper training. We start the process by pairing the sound with a stimulus that's already an effective reinforcer for a thirsty rat – a drop of water. As we learn in Ch. 11, this pairing of a neutral stimulus with a reinforcer causes the neutral stimulus to become a reinforcer too – a learned reinforcer. It's important for this sound to become a learned reinforcer so that we can then use it to provide immediate reinforcement for other behaviors in the Skinner box. Imagine that you wanted to train the rat to stand on its hind legs in the left front corner of the Skinner box. If a drop of water were the only reinforcer available, could you present it immediately following that behavior? No way. The water is available only at the dipper, so by the time the rat traveled across the box and drank, seconds would have passed, and you'd risk reinforcing whatever it happened to be doing on the way instead.

But if the sound of the dipper click has become a reinforcer, then you can present that sound immediately following whatever behavior you choose. That's why dipper training is so important, and why it has to happen before any other training can take place in the Skinner box.
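
Here's a toy Python sketch of that immediacy problem, using assumed travel times. The point is just that water can only be consumed at the dipper, so its delivery is delayed by however long the rat takes to get there, while the click reaches the rat instantly anywhere in the box.

```python
import random

def water_delay(rat_position, dipper_position="dipper area"):
    # Water can only be consumed AT the dipper, so reinforcement is
    # delayed by however long the rat takes to travel there.
    if rat_position == dipper_position:
        return 0.0
    return random.uniform(2.0, 6.0)   # assumed travel time, in seconds

def click_delay():
    # The dipper click reaches the rat instantly, wherever it is.
    return 0.0

# Target behavior: rearing in the left front corner of the box.
print("water delay:", round(water_delay("left front corner"), 1))  # seconds late
print("click delay:", click_delay())                               # immediate
```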

Now, as we learn in Ch. 12, a discriminative stimulus (SD) is a stimulus that functions as if it's a signal that if the target behavior is performed, it will be reinforced. When we conduct dipper training properly, in addition to the dipper click becoming a learned reinforcer, it also becomes an SD for the behavior of going to the dipper. When the dipper click sounds, if the rat goes to the dipper, that behavior is reinforced by the drop of water that's there. But remember that in order for a stimulus to be an SD, there also has to be an SΔ. In this case, the SΔ is the absence of the dipper click. If the rat goes to the dipper in the SΔ condition, that behavior is not reinforced.

In the dipper training exercise with CyberRat, the indicator that we've been successful is that the rat will go to the dipper from anywhere in the Skinner box upon hearing the click. In other words, the sound functions as an SD. When we see this, we can also be confident that the dipper click has become a learned reinforcer, which is what we are trying to accomplish in this lab.

Remember, though, that when we do the exercise properly, the dipper click is not being used to reinforce any behavior. In a reinforcement contingency, the reinforcer is presented after the target behavior, but during dipper training the dipper click is presented before the behavior we want to see – going to the dipper. Still, because the click quickly becomes a learned reinforcer, it inevitably follows whatever the rat happened to be doing at that moment. This is why the instructions warn you not to present the dipper click consistently when the rat is in a particular place or doing a particular thing: that would reinforce that behavior, or being in that spot. Instead, it's important to allow the rat to go different places in the box and to do different things before presenting the dipper click.
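
And here's a toy sketch of that warning, with an assumed list of activities and an assumed 40 click–water pairings. If the moments you pick for the click are spread across many different activities and locations, no single behavior is consistently followed by the click.

```python
import random

activities = ["grooming", "sniffing a corner", "rearing", "walking", "resting"]
click_counts = {a: 0 for a in activities}

# Good dipper training: let the rat wander, and click at varied moments,
# so no one activity is consistently followed by the click-water pairing.
for pairing in range(40):
    current_activity = random.choice(activities)   # whatever the rat is doing
    click_counts[current_activity] += 1

# Roughly even counts = no behavior accidentally shaped up. If one
# activity dominated these counts, you'd be reinforcing it.
print(click_counts)
```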

When we get to Ch. 20 on behavioral chains, we'll revisit this idea of a particular stimulus functioning as both a reinforcer and an SD.

Your discrimination CA: What to watch for

When students don't get the discrimination CA right, very often it's because of what they write in the SΔ box. In a discrimination situation, the behaver must be able to perceive (through seeing, hearing, or any of the other senses) the difference between the SD and the SΔ BEFORE the target behavior is performed. So make sure you don't describe an SΔ that the behaver can't perceive as different from the SD until after the behavior has already been performed. An example of this mistake might be saying that the SD is an ATM that has money in it, the SΔ is an ATM that's empty, and the target behavior is punching in your code and requesting money. In this situation, the target behavior has to be performed before the behaver knows that the ATM is empty. So an empty ATM can't be an SΔ in this case.

Think of it this way: In the case of reinforcement-based discrimination, the SD functions like a signal that if the target behavior is performed, it will be reinforced, and the SΔ functions like a signal that it won't be. In the case of punishment-based discrimination, the SD signals that the target behavior will be punished and the SΔ signals that it won't. And in order for these stimuli to function like signals, the behaver has to be able to perceive the difference between them BEFORE performing the target behavior.
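
If it helps, here's the ATM example boiled down to a Python sketch. The flag name and the "OUT OF SERVICE" fix are illustrative assumptions of mine; the check just encodes the rule that both stimuli must be distinguishable before the response.

```python
def usable_pair(sd: dict, s_delta: dict) -> bool:
    # A legitimate SD/S-delta pair: the behaver can tell them apart
    # BEFORE performing the target behavior.
    return sd["seen_before_response"] and s_delta["seen_before_response"]

# The flawed CA: a full ATM and an empty ATM look identical until AFTER
# you've punched in your code and requested money.
flawed = usable_pair({"label": "ATM with money", "seen_before_response": False},
                     {"label": "empty ATM", "seen_before_response": False})

# One possible fix: an "OUT OF SERVICE" message is visible before you respond.
fixed = usable_pair({"label": "normal screen", "seen_before_response": True},
                    {"label": "'OUT OF SERVICE' screen", "seen_before_response": True})

print(flawed, fixed)   # False True
```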

Thursday, February 8, 2007

Your shaping CA: What to watch for

Revised on 9/30/14

In order to get credit for your shaping CA, there are a few details that you have to be careful about. Shaping is a natural extension of differential reinforcement (or punishment) because, in effect, it's a connected series of differential reinforcement (or punishment) contingencies. That means that the behaviors in each phase have to be members of the same response class. And that means they're pretty similar to each other, except for the differences along the chosen dimension that get reinforced (or punished).

In the case of shaping using reinforcement, as the behavior becomes more frequent during a particular phase, there are naturally some variations in how it's performed. Once the frequency stabilizes at a new, higher level, the performance manager watches for occurrences of the behavior that are a little bit closer to the desired terminal behavior. The next phase begins when these variations become the only occurrences that are reinforced. After a while, these once-rare variations become the most common form of the behavior, and occasionally new variations appear that are even closer to the terminal behavior.

So the occurrences of the behavior during successive phases must overlap with each other. Another way to think about it is that the behavior in the earlier phase naturally "flows into" the behavior in the next phase. Still another way to think about it is that the behavior that's going to be reinforced in the next phase must occur occasionally during the present phase. Think about this as you put together and revise your CAs. The behavior you're thinking of describing in the next phase ... Is it a behavior that you see occasionally in the present phase? If not, then change the behavior you're thinking of describing in the next phase.
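
Here's a toy shaping loop in Python that captures that overlap requirement. All the numbers – the 0-to-1 scale for the behavior's form, the variability, the criterion step – are illustrative assumptions: only variations at or beyond the current criterion get reinforced, and the criterion advances only once those variations have become typical.

```python
import random

terminal = 1.0     # the desired terminal form of the behavior (assumed scale)
criterion = 0.1    # current approximation being reinforced
typical = 0.1      # where the behavior's form currently centers

while criterion < terminal:
    # Responses vary naturally around the typical form -- shaping
    # depends on this variation.
    response = random.gauss(typical, 0.15)
    if response >= criterion:
        # Reinforced variations become more typical of the class.
        typical = 0.8 * typical + 0.2 * response
    # Advance the criterion only when the next form already occurs --
    # i.e., when successive phases overlap.
    if typical >= criterion:
        criterion = min(terminal, criterion + 0.1)

print("terminal behavior reached; typical form:", round(typical, 2))
```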

One more detail to be careful of: Your description of the behavior that gets extinguished in the 2nd phase should be identical to your description of the behavior that was reinforced in the 1st phase, and your description of the behavior that gets extinguished in the 3rd phase should be identical to your description of the behavior that was reinforced in the 2nd phase. You won't see that in all of the examples in the textbook, but this is the way you should do it.

Note that it's often appropriate to indicate "N/A" as the behavior being extinguished in the 1st phase. That's because during the 1st phase, there was no previous phase during which there was a behavior that was being reinforced. Therefore, you can't describe a behavior, so you say "N/A" instead.

Sunday, February 4, 2007

Your differential reinforcement CA: What to watch for

Revised on 3/29/14

Some of you have begun turning in your differential reinforcement CAs for Ch. 7. That reminds me that it's time to say something about the most common stumbling block for students doing this CA. And if you choose one of the other differential contingencies for extra credit, what I'm going to say here will apply to that one too.

In the differential contingencies, the two behaviors start out (that is, before the contingency is imposed) being members of the same response class. In simple terms, this means that they're similar to each other. Look at the diagram on p. 124. Hitting a tennis ball with some skill and hitting it without any skill are members of the response class that we could call "hitting a tennis ball." We could say that they're two subclasses of the larger response class. They're similar, though there's a key difference between them.

When a differential contingency is applied, members of one subclass are reinforced or punished (depending on the particular contingency) while members of the other subclass are not. As a result, the frequencies of the two subclasses diverge. The frequency of one subclass ends up being higher than the other. At this point we can say that the two subclasses have each become new response classes in their own right. In sum, the result of a differential contingency is to divide a response class into two different response classes.
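
A toy simulation may make the divergence easier to picture. The strengths and increments below are illustrative assumptions of mine, not a quantitative model: we start the two subclasses at equal strength, reinforce one, extinguish the other, and watch the relative frequencies pull apart.

```python
import random

# Two subclasses of "hitting a tennis ball," equal in strength to start.
strength = {"skilled swing": 1.0, "unskilled swing": 1.0}

for trial in range(300):
    total = sum(strength.values())
    # The player emits a swing in proportion to current strengths.
    swing = ("skilled swing"
             if random.random() < strength["skilled swing"] / total
             else "unskilled swing")
    if swing == "skilled swing":
        strength["skilled swing"] += 0.05                       # reinforced
    else:
        strength["unskilled swing"] = max(
            0.1, strength["unskilled swing"] - 0.02)            # extinguished

# The frequencies have diverged: two response classes in their own right.
p_skilled = strength["skilled swing"] / sum(strength.values())
print("P(skilled swing) after the contingency:", round(p_skilled, 2))
```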

But back to the point I started with. In an example of a differential contingency, the two behaviors have to be similar to each other in at least one of the ways that Malott says members of a response class are similar to each other (p. 128). So when you're preparing your differential reinforcement CA, be sure the two behaviors are similar.

Doing that will also help you avoid another common mistake made by students on the differential contingencies – describing a non-behavior in one of the behavior boxes. If you're reinforcing the behavior described in the upper behavior box, and not reinforcing (extinguishing) the "not doing" of that behavior as described in the lower behavior box, then that's no different from simple reinforcement, in which you reinforce the behavior described in the upper behavior box. Make sure what's described in both behavior boxes is something a dead man can't do.

Thursday, February 1, 2007

Extinction distinctions

Important stuff starting in the middle of the 1st column on p. 106. It's easy to get extinction mixed up with the two kinds of penalty contingencies because all of them involve the target behavior ending up without a reinforcer. Read that short section carefully so you won't get them mixed up. Also note that in response cost, the reinforcer that gets removed is NOT the one that was reinforcing the target behavior, whereas in extinction, the reinforcer we're concerned with IS the one that's reinforcing the target behavior.

Other important distinctions between extinction and the penalty contingencies are summarized in the table on p. 107.
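
If a compact restatement helps, here's that distinction as a Python sketch. The tantrum/attention/tokens example is an assumption of mine: in extinction, you withhold the very reinforcer that's been maintaining the behavior; in response cost, you remove some other reinforcer the behaver already has.

```python
# Assume a functional assessment found that attention maintains tantrums.
maintaining_reinforcer = "attention"

def extinction(behavior_occurred: bool):
    # Withhold THE reinforcer that has been maintaining the behavior.
    return {"withheld": maintaining_reinforcer} if behavior_occurred else None

def response_cost(behavior_occurred: bool):
    # Remove some OTHER reinforcer the behaver already has (e.g., tokens);
    # it is NOT the reinforcer that was maintaining the behavior.
    return {"removed": "tokens"} if behavior_occurred else None

print(extinction(True))      # {'withheld': 'attention'}
print(response_cost(True))   # {'removed': 'tokens'}
```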

Spontaneous recovery and CyberRat

In one of your CyberRat exercises you'll extinguish lever pressing after having strengthened it through reinforcement. The lab report will ask you to study your graphs and report evidence of spontaneous recovery. Remember what Malott tells us in the middle of the 2nd column on p. 105: Spontaneous recovery can't occur during the first extinction session. Why? He explains that too.

Ignorance is no excuse.

It's sometimes thought that the extinction procedure simply amounts to "ignoring the behavior." If you want to use everyday terms, then yes, that's sometimes what's going on when a behavior gets extinguished. But this is too simple; extinction is NOT simply ignoring a behavior that you want to decrease in frequency. Extinction requires you to figure out what's reinforcing the undesirable behavior. Functional assessment, right? Then you have to figure out how to arrange conditions so that when that behavior happens, it's not followed by the reinforcer. There are all kinds of ways to do this, and ignoring the behavior is only one of them. Ignoring only works when the reinforcer is your attention. In those cases, if, by "ignoring," we mean not giving attention when the target behavior occurs, then this is an example of extinction.
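
Here's the same point as a Python sketch, with a made-up functional-assessment table. "Ignore it" turns out to be the right extinction move only for the attention-maintained behavior; for the others, ignoring wouldn't cut off the reinforcer at all.

```python
# Hypothetical functional-assessment results: behavior -> its reinforcer.
maintains = {
    "whining": "adult attention",
    "bolting from the desk": "escape from the task",
    "flipping the light switch": "the light turning on",
}

def extinction_plan(behavior: str) -> str:
    reinforcer = maintains[behavior]
    # Arrange conditions so the behavior no longer produces its reinforcer.
    if reinforcer == "adult attention":
        return "ignore the behavior"         # here, ignoring IS extinction
    if reinforcer == "escape from the task":
        return "keep the task in place"      # ignoring alone would not work
    return "block access to: " + reinforcer  # e.g., disable the switch

for behavior in maintains:
    print(behavior, "->", extinction_plan(behavior))
```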