3.1 Neural Decoding and Signal Detection Theory

Introductory Prompt

Britten et al. '92

Decoding

Accumulated Evidence

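from math import log

def is_tiger(tiger_threshold, breeze_threshold, p_dist_tiger, p_dist_breeze):
    # Running sum of log likelihood ratios (the accumulated evidence)
    counter = 0.0
    while True:
        # getStimulus() is assumed to be defined elsewhere and to return the next observation
        s = getStimulus()
        # Likelihood ratio of the observation under "tiger" versus "breeze"
        likelihood_ratio = p_dist_tiger(s) / p_dist_breeze(s)
        counter += log(likelihood_ratio)
        if counter >= tiger_threshold:
            # Enough evidence for "tiger"
            return True
        elif counter <= breeze_threshold:
            # Enough evidence for "breeze"
            return False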
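As a rough illustration of the accumulation rule above, the following sketch runs it on simulated observations. The Gaussian likelihoods, their means and widths, the thresholds, and the helper names gaussian_pdf and accumulate_evidence are assumptions made for this example and are not taken from the lecture. Unlike the loop above, this version takes a finite list of samples and returns None if neither threshold is reached.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def accumulate_evidence(samples, upper, lower, p_tiger, p_breeze):
    """Sum log likelihood ratios over samples until a threshold is crossed.

    Returns (decision, n_samples_used). decision is True for "tiger",
    False for "breeze", and None if the samples run out first.
    """
    total = 0.0
    n = 0
    for n, s in enumerate(samples, start=1):
        total += math.log(p_tiger(s) / p_breeze(s))
        if total >= upper:
            return True, n
        if total <= lower:
            return False, n
    return None, n

# Hypothetical likelihoods: "tiger" sounds centered at 1.0, "breeze" at 0.0.
p_tiger = lambda s: gaussian_pdf(s, 1.0, 1.0)
p_breeze = lambda s: gaussian_pdf(s, 0.0, 1.0)

# Simulate noisy observations that really do come from a tiger.
rng = random.Random(0)
tiger_samples = [rng.gauss(1.0, 1.0) for _ in range(1000)]
decision, n_used = accumulate_evidence(tiger_samples, 3.0, -3.0, p_tiger, p_breeze)
print("decision:", decision, "after", n_used, "samples")
```

Moving the thresholds further from zero makes wrong decisions less likely but requires more samples on average, which is the speed versus accuracy trade-off built into this kind of rule.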

Kiani, Hanks, Shadlen (2006)

Scaling by Priors

Evidence for Scaling by Priors

This cover of Nate Silver's book neatly summarizes what's true for many important decisions. There's a small amount of signal in the world, as in the case of the photoreceptor current, and an awful lot of noise relative to any particular decision, for the same reasons we discussed in our last lecture. A given choice establishes a certain set of relevant stimulus aspects, and all other information, which may be very useful for other purposes, becomes noise. In deciding whether or not to invest energy in reacting (running away from the tiger, calling in the bomb squad to detonate a shopping bag, asking a girl for a date), the prior probability isn't the only factor. One might also want to take into account the cost of acting or not acting. So now let's assume there is a cost, or a penalty, for getting it wrong: you get eaten, the shopping bag explodes. And there is a cost for getting it wrong in the other direction: your photo gets spoiled, you miss meeting the love of your life.

So how do we additionally take these costs into account in our decision? Let's assign a cost to each kind of mistake. For answering plus when it is in fact minus, we get a loss that we'll call L+, a penalty weight, and for the opposite mistake, answering minus when it is in fact plus, we get L-. Our goal is to cut our losses and make the plus choice when the average loss for that choice is less than for the other choice. The average, or expected, loss for a choice is the weight for making the wrong decision multiplied by the probability that the decision is wrong: for choosing plus it is Loss+ = L+ P(-|r), and for choosing minus it is Loss- = L- P(+|r). So we answer plus when L+ P(-|r) < L- P(+|r).

Now let's use Bayes' rule to write these out. We have L+ P(r|-) P(-) / P(r) on one side, and we want that to be less than the opposite case, L- P(r|+) P(+) / P(r). When we cancel the common factor, the probability of the response P(r), and rearrange, the likelihoods P(r|-) and P(r|+) can be pulled out as the likelihood ratio: answer plus when P(r|+) / P(r|-) > L+ P(-) / (L- P(+)). This is a new criterion for our likelihood ratio test, one that takes these loss factors into account.
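The loss-weighted criterion can be sketched in a few lines. This is a minimal illustration, assuming the convention that loss_plus is the penalty for answering plus when the truth is minus and loss_minus is the penalty for answering minus when the truth is plus. The function names and all numbers are hypothetical.

```python
def expected_losses(lik_plus, lik_minus, prior_plus, loss_plus, loss_minus):
    """Return (Loss+, Loss-), the expected loss of answering plus or minus.

    lik_plus = P(r|+), lik_minus = P(r|-), prior_plus = P(+).
    """
    prior_minus = 1.0 - prior_plus
    p_r = lik_plus * prior_plus + lik_minus * prior_minus  # P(r)
    post_plus = lik_plus * prior_plus / p_r                # P(+|r) by Bayes' rule
    post_minus = lik_minus * prior_minus / p_r             # P(-|r) by Bayes' rule
    return loss_plus * post_minus, loss_minus * post_plus

def choose_plus(lik_plus, lik_minus, prior_plus, loss_plus, loss_minus):
    """Likelihood ratio test with a threshold set by the priors and losses."""
    prior_minus = 1.0 - prior_plus
    likelihood_ratio = lik_plus / lik_minus
    threshold = (loss_plus * prior_minus) / (loss_minus * prior_plus)
    return likelihood_ratio > threshold

# Hypothetical case: plus = tiger, which is rare (prior 0.1).
# With equal losses the weak evidence is not enough to answer "tiger".
print(choose_plus(0.3, 0.2, 0.1, 1.0, 1.0))
# If missing a tiger is 100 times more costly, the same evidence suffices.
print(choose_plus(0.3, 0.2, 0.1, 1.0, 100.0))
```

Comparing the two expected losses directly and comparing the likelihood ratio to the threshold give the same answer, since the second form is just the first with P(r) cancelled and the terms rearranged.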

o A false alarm is choosing the answer upward when the true answer was downward. Because your rule labels everything above the threshold z as upward, its probability is the probability that the firing rate is above z given that the stimulus was downward: P(r > z | downward).

- Slide: Likelihood ratio

o The likelihood of a model (e.g., the stimulus being upward or downward moving) is equal to the probability of seeing the data (e.g., the firing rate) given that model. This is not the same as the probability of the model given the data, although the two are related through Bayes’ rule.

o In the plots of P(I|signal) and P(I|noise), the y-axis follows a logarithmic scale. This means that Gaussian probability distributions will appear as inverted parabolas. Can you think of why this is?

o Prior probabilities are very important in decision making, and they can be expressed mathematically using the chain rule of probabilities. Bayes’ rule tells us, for example, that P(tiger|sound) is proportional to P(sound|tiger)P(tiger). We don’t need to worry about the proportionality constant if we are only comparing P(breeze|sound) to P(tiger|sound), which we usually are.

o In other words, the three important pieces of information you need when making a binary decision are: the evidence (the value of the sound), the prior P(tiger) (the two combine to give the posterior P(tiger|sound)), and the cost or loss associated with a wrong decision.
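Two of the quantities in these notes are easy to compute directly. The sketch below assumes Gaussian firing-rate distributions for the false alarm rate and uses made-up likelihood values for the posterior. The function names and every number are illustrative assumptions, not values from the lecture.

```python
import math

def gaussian_tail(z, mean, std):
    """P(r > z) when r is normally distributed with the given mean and std."""
    return 0.5 * math.erfc((z - mean) / (std * math.sqrt(2.0)))

def posterior_tiger(lik_tiger, lik_breeze, prior_tiger):
    """P(tiger|sound) from P(sound|tiger), P(sound|breeze) and P(tiger)."""
    joint_tiger = lik_tiger * prior_tiger
    joint_breeze = lik_breeze * (1.0 - prior_tiger)
    # Dividing by the sum supplies the proportionality constant 1 / P(sound).
    return joint_tiger / (joint_tiger + joint_breeze)

# Hypothetical firing rates: downward ~ N(10, 3), upward ~ N(16, 3), threshold z = 13.
false_alarm = gaussian_tail(13.0, 10.0, 3.0)  # P(r > z | downward)
hit = gaussian_tail(13.0, 16.0, 3.0)          # P(r > z | upward)
print("false alarm rate:", round(false_alarm, 3), "hit rate:", round(hit, 3))

# The same evidence leads to very different posteriors under different priors.
print(posterior_tiger(0.8, 0.2, 0.5))
print(posterior_tiger(0.8, 0.2, 0.01))
```

Raising z lowers the false alarm rate but also lowers the hit rate, and a small prior P(tiger) can outweigh evidence that on its own favors the tiger.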
