Instrumental, or operant, conditioning (that is, the shaping of behavior by pairing desired actions with reinforcement) has been fundamental to psychological research with animals for over a century, and it is an essential part of more recent attempts to link neural activity to animal behavior. Importantly, elementary economic decision making by humans can be characterized in the same theoretical terms as instrumental conditioning, and investigated with similar empirical methods, with money in place of food pellets.


It seems clear, then, that since perceptual decision making encompasses features of both psychophysics and conditioning, a single model should be able to explain experimental findings in conditioning, decision making and psychophysics. Neural network implementations of drift-diffusion or random walk decision making models that adapt their parameters during performance have shown promise as a route to this unification. As shown in Simen & Cohen (2009), a diffusion model of decision making (e.g., Ratcliff, 1978) reduces exactly to Herrnstein’s classic ‘melioration’ model of conditioning (Herrnstein, 1997). The reduction holds when upward and downward steps are equally likely, and when the decision threshold for each of the two choices is set inversely proportional to the estimated rate of reward earned for that choice. Such reduced models go further than melioration, however, in making detailed predictions about response time distributions in conditioning experiments. They may also pave the way to understanding the neural basis of processes like melioration, since stochastic neural activity can be efficiently represented in terms of diffusion/random-walk processes.
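The two conditions above can be illustrated with a minimal simulation sketch. It is not the model of Simen & Cohen (2009): the reward rates are fixed by hand rather than estimated adaptively, and the names `run_trial`, `choice_proportion` and the scaling constant `k` are assumptions introduced here for illustration. For a symmetric random walk with thresholds at k/r1 and -k/r2, the classical first-passage result gives a probability of r1/(r1 + r2) of reaching the first threshold, so choices should be allocated in proportion to the reward rates.

```python
import random


def run_trial(rate_1, rate_2, k, rng):
    """Run one unbiased +/-1 random walk until it crosses a threshold.

    Thresholds are inversely proportional to the (assumed, fixed) reward
    rates: choice 1 at +k/rate_1, choice 2 at -k/rate_2.
    Returns (choice, number_of_steps).
    """
    upper = k / rate_1
    lower = -k / rate_2
    position, steps = 0, 0
    while lower < position < upper:
        # Upward and downward steps are equally likely (zero drift).
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
    return (1 if position >= upper else 2), steps


def choice_proportion(rate_1, rate_2, k=10.0, n_trials=4000, seed=0):
    """Fraction of simulated trials on which choice 1 is made."""
    rng = random.Random(seed)
    wins = sum(run_trial(rate_1, rate_2, k, rng)[0] == 1
               for _ in range(n_trials))
    return wins / n_trials


if __name__ == "__main__":
    r1, r2 = 2.0, 1.0
    print("simulated P(choice 1):", choice_proportion(r1, r2))
    print("r1 / (r1 + r2):       ", r1 / (r1 + r2))
```

The step counts returned by `run_trial` play the role of response times, which is where a random-walk account makes predictions that a choice-proportion account alone does not.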


My colleagues and I are now examining more closely the processes underlying reward/cost monitoring and its effects on choice, in experiments that parametrically vary the penalties and rewards for perceptual decisions. Optimal (reward maximizing/loss minimizing) performance in these tasks requires unusual changes in behavior as reward conditions vary. For example, optimal models predict dramatic slowing of response times in visual motion discrimination, from the order of half a second to the order of 10 seconds, when monetary penalties for errors exceed the rewards for correct responses. Behavior consistent with this prediction would be strong evidence in favor of an adaptive drift-diffusion model of decision making, conditioning and psychophysical discrimination. Preliminary findings support this prediction. Ten seconds is also slow enough that the continuous, real-time neural dynamics of decision making could theoretically be studied with functional magnetic resonance imaging (fMRI), despite that technique’s inherent inability to resolve high frequency changes in neural activity. Others in the lab of Jonathan D. Cohen and I are currently using this prediction in fMRI research to search for neural correlates of perceptual decision making and adaptive control mechanisms.
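The qualitative prediction, that heavier error penalties call for higher thresholds and therefore slower responses, can be sketched numerically. The sketch below assumes the standard closed-form error rate and mean decision time of an unbiased pure drift-diffusion process, a reward of one unit per correct response, and arbitrary parameter values (`drift`, `noise`, `delay`); none of these are taken from the experiments described above, and the function names are hypothetical. It does not attempt to reproduce the half-second and 10-second figures.

```python
import math


def error_rate(z, drift=1.0, noise=1.0):
    """Error rate of an unbiased drift-diffusion process with thresholds +/-z."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * z / noise ** 2))


def decision_time(z, drift=1.0, noise=1.0):
    """Mean decision time of the same process."""
    return (z / drift) * math.tanh(drift * z / noise ** 2)


def net_reward_rate(z, penalty, delay=1.0):
    """Net reward per unit time: +1 per correct response, -penalty per error.

    `delay` lumps together all non-decision time between trials (assumed).
    """
    er = error_rate(z)
    return ((1.0 - er) - penalty * er) / (decision_time(z) + delay)


def best_threshold(penalty, z_max=10.0, step=0.01):
    """Grid search for the threshold that maximizes net reward rate."""
    grid = [i * step for i in range(1, int(z_max / step) + 1)]
    return max(grid, key=lambda z: net_reward_rate(z, penalty))


if __name__ == "__main__":
    for penalty in (0.0, 1.0, 2.0, 4.0):
        z = best_threshold(penalty)
        print(f"penalty={penalty}: threshold={z:.2f}, "
              f"mean decision time={decision_time(z):.2f}")
```

Under these assumptions the reward-maximizing threshold, and with it the mean decision time, grows as the penalty increases relative to the reward.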

 
Animation of operant conditioning / economic choice in action (QuickTime)

Animation of operant conditioning / economic choice in action (Flash)