Interview — No. 497

Adipisci Nemo Eius


Yikes. Let’s skip straight to the neuroscience this year, shall we? Even this one small area of human endeavour has not escaped the forces unleashed by tech bros who think ethics are something that happens to other people.

A once thriving neuroscience Twitter community dissolved thanks to He Who Shall Not Be Named, the battle-hardened remaining to rail against the dying of the light, others scattered to other platforms, sadly disconnected from one another.

ChatGPT and its ilk, tools of such potential, also brought with them a wave of garbage science, including tranches of grammatically correct, woefully poor student essays, full of fun facts about studies that did not happen: did you know, for example, about Geoff Schoenbaum’s primate studies on decision making?

And yet, science prevails. This machine for creating knowledge moves at such pace and fury that even Musk is but a pebble causing a scant ripple in its flow. As you’ll see, we’ve learnt a lot about the brain this year, about what the brain doesn’t have, what it does have, and about expert game players. But what we learnt most about was dopamine.

Error Error Error

You wouldn’t have thought we’d have much new to learn about dopamine. Or, more accurately, about how dopamine conveys information about reward. It’s been 25 years since Schultz, Dayan & Montague’s classic paper in Science laid out how the firing of dopamine neurons in a monkey’s midbrain looked remarkably like it was signalling the error in predicting a reward: firing more to unexpected rewards, not changing to expected rewards, and firing less when an expected reward didn’t turn up. Indeed, this is just the error expected in reinforcement-learning models that seek to learn the future value of things in the world.
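To make that three-part signature concrete, here is a minimal sketch of a trial-by-trial prediction error. It uses a simple Rescorla-Wagner-style update, which is my simplification and not the temporal-difference model of the classic paper; the function name, the learning rate of 0.2, and the reward sequence are all illustrative assumptions.

```python
def prediction_errors(rewards, learning_rate=0.2, value=0.0):
    """Return the prediction error on each trial of a simple value learner.

    Assumption: a Rescorla-Wagner-style rule, used here only as an illustration.
    """
    errors = []
    for reward in rewards:
        error = reward - value          # positive when the outcome beats the prediction
        value += learning_rate * error  # nudge the prediction toward the outcome
        errors.append(error)
    return errors


# Hypothetical session: 50 rewarded trials, then one trial with the reward omitted.
errors = prediction_errors([1.0] * 50 + [0.0])
first, learned, omitted = errors[0], errors[49], errors[50]
```

Here `first` is large and positive (an unexpected reward), `learned` has shrunk to almost nothing (the reward is now expected), and `omitted` is negative (an expected reward didn't turn up).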

After 25 years, most research fields in science are either abandoned to the historians as embarrassing dead-ends, stagnate from lack of new ideas, or become mainstream, their facts regurgitated in dull textbooks, reaching that Kuhnian “business as usual” stage. But not dopamine.

This year started with a flurry of head-scratching, high-profile papers on what that pesky handful of midbrain dopamine neurons was conveying to the rest of the brain. Jeong and colleagues kicked us off by claiming dopamine isn’t a signal for prediction error at all, but a signal for unexpected sequences of events in the world. Roughly speaking.

Proposing a new model for how the brain learns causality between events in the world, they gave the firing of dopamine neurons the job of conveying a term in that model that, well, was so hard to explain they didn’t bother, but is roughly how unexpected it was that one event followed another given how well that event is usually predicted.

Frankly, they didn’t manage to explain why dopamine should have that job and not one of the other, at least two, error terms the model needed.

Nonetheless, by assuming dopamine was this “adjusted net contingency for causal relations”, the new model did an impressive job of replicating the firing of dopamine neurons in a range of classical conditioning tasks, tasks where the animal just sits there and things happen around it: bells predict food; beeps predict water. The idea that dopamine is crucial to learning causality is not new, but Jeong and co make a good point that conceptually it’s easier to learn backwards, from stuff you’ve already experienced, than forwards, by predicting the future values of stuff you’re going to experience.

Mere weeks later, Markowitz and (many) colleagues took a look at the release of dopamine in mice running free, doing whatever they wanted in an open field (well, a 40cm diameter bucket). Dividing the mouse’s behaviour up into syllables, discrete bits of action like rearing or turning or scratching, they found the release of dopamine dips just before the end of one syllable and peaks just after the start of the next one. The data suggested that the bigger the peak of dopamine, the more likely the syllable was to occur again. From this Markowitz and co speculated that dopamine signals were thus being used internally to promote behaviours, just as they would be if those dopamine signals were evoked externally by reward. Between them, these first two papers were arguing there was much to dopamine beyond the error in predicting a reward.

A week later, Coddington, Lindo, and Dudman offered us a rather different take by pointing out that reinforcement learning still had much to offer. What, they asked, if we were looking at the wrong kind of reinforcement learning? You see, reinforcement learning models come in two flavours. In one, they learn about the value of things in the world, then decide what to do based on those learnt values. That’s where the classic dopamine-as-prediction-error comes in, as an error in those predicted values. In the other, they just learn directly what to do in each situation; they learn a “policy”.

Coddington and co. offered some (pretty impressive) evidence that the firing of dopamine neurons acts like the learning rate of something that’s directly learning a policy. That is, high firing rates would be big updates to the policy, low firing rates would be small updates. They offer us a double departure from the canonical theory: not only is dopamine not a prediction error for value, it’s not a prediction error at all.
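The two flavours, and the idea of dopamine as a learning rate, can be sketched in a few lines. This is a toy under loud assumptions: the policy rule is a generic REINFORCE-style update for a two-choice policy that I have swapped in for illustration, not the model from the paper, and every name and number below is hypothetical.

```python
import math


def value_update(value, reward, learning_rate=0.1):
    """Flavour one: learn a value, driven by a prediction error."""
    prediction_error = reward - value
    return value + learning_rate * prediction_error


def policy_update(preference, action, reward, dopamine):
    """Flavour two: learn the policy directly.

    `preference` is the log-odds of choosing action 1 over action 0.
    `dopamine` plays the role of the learning rate (illustrative assumption):
    it scales the size of the update rather than carrying an error itself.
    """
    p_action_1 = 1.0 / (1.0 + math.exp(-preference))
    gradient = action - p_action_1  # d log-probability / d preference
    return preference + dopamine * reward * gradient


# Same experience (chose action 1, got reward 1), different dopamine levels.
small_step = policy_update(0.0, action=1, reward=1.0, dopamine=0.1)
big_step = policy_update(0.0, action=1, reward=1.0, dopamine=1.0)
```

With identical experience, `big_step` moves the policy ten times further than `small_step`: high firing means a big update, low firing a small one.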

AND THEN (yes, there’s more) late summer brought two bombshells. Jesse Goldberg’s team casually dropped into conversation that actually the dopamine neurons’ firing is not just fixed to a prediction error for reward. Rather, it is assigned to sending an error about whatever is most important right now. They showed this in male birds singing. When a bird sang on his own, his dopamine neurons fired in response to the unexpected errors he made in his song (this much we already knew). But when he sang to a female, the dopamine neurons fired after unexpected response calls from the female, when the male didn’t get the reaction he was predicting, whether that was good (she replied!) or not so good (she’s ignoring me). Ah, dopamine, now we can add awkward adolescent male courting to your list of responsibilities.

The second late summer bombshell was potentially the biggest. Tritsch and friends reported that the release of dopamine into the striatum oscillates at between 0.5 and 4 Hz, going up and down at least every two seconds and at most four times a second. Oscillating all the time, whether resting, moving, or getting reward. And on such a scale that the peaks of release were as big as those evoked by unexpected reward. This could be a problem.
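For anyone wanting to check the arithmetic, the conversion from frequency to cycle length is just a reciprocal. The helper name below is mine, purely for illustration.

```python
def period_seconds(frequency_hz):
    """Length of one full up-and-down cycle, in seconds."""
    return 1.0 / frequency_hz


slowest_cycle = period_seconds(0.5)  # one cycle every two seconds
fastest_cycle = period_seconds(4.0)  # four cycles a second
```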

For, you see, all current theories of what dopamine tells the brain still rest on the idea that there is a baseline from which the changes in dopamine convey information. That baseline defines what is “expected”. This is true of reward prediction error theories; it’s also true of the causality theory (it’s not true if we want to believe dopamine is a learning rate, but then it’s unclear why we would want that to be oscillating a few times a second). Tritsch and friends say there is no baseline.

What now? If neuroscience were like physics then tens of papers would have been posted to arXiv before Tritsch’s paper had even hit the stands, as swarms of otherwise underemployed theorists descended on the latest anomaly to propose new, exotic theories that explained it away. Only to then discover that the data were actually due to a loose cable, and all was for nought. But neuroscience is not like physics. Sometimes that’s a good thing.
