Although it is widely believed that reinforcement learning is a suitable tool for describing behavioral learning, the mechanisms by which it can be implemented in networks of spiking neurons are not fully understood. Here, we show that different learning rules emerge from a policy gradient approach depending on which features of the spike trains are assumed to influence the reward signals, i.e., depending on which neural code is in effect. We use the framework of Williams (1992) to derive learning rules for arbitrary neural codes. For illustration, we present policy-gradient rules for three example codes (a spike count code, a spike timing code, and the most general "full spike train" code) and test them on simple model problems. In addition to classical synaptic learning, we derive learning rules for intrinsic parameters that control the excitability of the neuron. The spike count learning rule bears structural similarities to established Bienenstock-Cooper-Munro rules. If the distribution of the relevant spike train features belongs to the natural exponential family, the learning rules take a characteristic shape that raises interesting prediction problems.
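For intuition, rules derived in the Williams (1992) framework share the generic REINFORCE form: the parameter change is proportional to the reward times an eligibility term, the gradient of the log-probability of the emitted spike train with respect to the parameter. The sketch below is a minimal illustration of this idea for a spike-count code, assuming a single Bernoulli escape-noise neuron and a reward that depends only on the spike count; the constants and the reward function are hypothetical choices for this sketch, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_bins, n_trials, eta = 10, 50, 2000, 0.05
target_count = 10                        # hypothetical: reward peaks at this count
x = rng.normal(size=n_inputs)            # one fixed input pattern
w = np.zeros(n_inputs)                   # synaptic weights to be learned

for _ in range(n_trials):
    p = 1.0 / (1.0 + np.exp(-(w @ x)))   # per-bin spike probability (escape noise)
    count = (rng.random(n_bins) < p).sum()   # Bernoulli spike train -> spike count
    # Eligibility: d/dw log P(spike train) for a Bernoulli logistic unit,
    # summed over bins: sum_t (s_t - p) * x = (count - n_bins * p) * x
    eligibility = (count - n_bins * p) * x
    reward = -abs(count - target_count)  # reward depends on the spike count only
    w += eta * reward * eligibility / n_bins   # REINFORCE-style update
```

Because the eligibility has zero mean under the current policy, multiplying it by any count-dependent reward yields an unbiased estimate of the gradient of the expected reward; subtracting a reward baseline would reduce the variance of the updates.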
Author Information
Henning Sprekeler (École Polytechnique Fédérale de Lausanne)
Guillaume Hennequin (EPFL)
Wulfram Gerstner (EPFL)
More from the Same Authors
- 2022 Poster: Mesoscopic modeling of hidden spiking neurons »
  Shuqi Wang · Valentin Schmutz · Guillaume Bellec · Wulfram Gerstner
- 2022 Poster: Kernel Memory Networks: A Unifying Framework for Memory Modeling »
  Georgios Iatropoulos · Johanni Brea · Wulfram Gerstner
- 2021 Poster: Local plasticity rules can learn deep representations using self-supervised contrastive predictions »
  Bernd Illing · Jean Ventura · Guillaume Bellec · Wulfram Gerstner
- 2021 Poster: Fitting summary statistics of neural data with a differentiable spiking network simulator »
  Guillaume Bellec · Shuqi Wang · Alireza Modirshanechi · Johanni Brea · Wulfram Gerstner
- 2019 Poster Session »
  Pravish Sainath · Mohamed Akrout · Charles Delahunt · Nathan Kutz · Guangyu Robert Yang · Joseph Marino · L F Abbott · Nicolas Vecoven · Damien Ernst · andrew warrington · Michael Kagan · Kyunghyun Cho · Kameron Harris · Leopold Grinberg · John J. Hopfield · Dmitry Krotov · Taliah Muhammad · Erick Cobos · Edgar Walker · Jacob Reimer · Andreas Tolias · Alexander Ecker · Janaki Sheth · Yu Zhang · Maciej Wołczyk · Jacek Tabor · Szymon Maszke · Roman Pogodin · Dane Corneil · Wulfram Gerstner · Baihan Lin · Guillermo Cecchi · Jenna M Reinen · Irina Rish · Guillaume Bellec · Darjan Salaj · Anand Subramoney · Wolfgang Maass · Yueqi Wang · Ari Pakman · Jin Hyung Lee · Liam Paninski · Bryan Tripp · Colin Graber · Alex Schwing · Luke Prince · Gabriel Ocker · Michael Buice · Benjamin Lansdell · Konrad Kording · Jack Lindsey · Terrence Sejnowski · Matthew Farrell · Eric Shea-Brown · Nicolas Farrugia · Victor Nepveu · Jiwoong Im · Kristin Branson · Brian Hu · Ramakrishnan Iyer · Stefan Mihalas · Sneha Aenugu · Hananel Hazan · Sihui Dai · Tan Nguyen · Doris Tsao · Richard Baraniuk · Anima Anandkumar · Hidenori Tanaka · Aran Nayebi · Stephen Baccus · Surya Ganguli · Dean Pospisil · Eilif Muller · Jeffrey S Cheng · Gaël Varoquaux · Kamalaker Dadi · Dimitrios C Gklezakos · Rajesh PN Rao · Anand Louis · Christos Papadimitriou · Santosh Vempala · Naganand Yadati · Daniel Zdeblick · Daniela M Witten · Nicholas Roberts · Vinay Prabhu · Pierre Bellec · Poornima Ramesh · Jakob H Macke · Santiago Cadena · Guillaume Bellec · Franz Scherr · Owen Marschall · Robert Kim · Hannes Rapp · Marcio Fonseca · Oliver Armitage · Jiwoong Im · Thomas Hardcastle · Abhishek Sharma · Wyeth Bair · Adrian Valente · Shane Shang · Merav Stern · Rutuja Patil · Peter Wang · Sruthi Gorantla · Peter Stratton · Tristan Edwards · Jialin Lu · Martin Ester · Yurii Vlasov · Siavash Golkar
- 2015 Poster: Attractor Network Dynamics Enable Preplay and Rapid Path Planning in Maze-like Environments »
  Dane Corneil · Wulfram Gerstner
- 2015 Oral: Attractor Network Dynamics Enable Preplay and Rapid Path Planning in Maze-like Environments »
  Dane Corneil · Wulfram Gerstner
- 2011 Poster: Variational Learning for Recurrent Spiking Networks »
  Danilo J Rezende · Daan Wierstra · Wulfram Gerstner
- 2011 Poster: From Stochastic Nonlinear Integrate-and-Fire to Generalized Linear Models »
  Skander Mensi · Richard Naud · Wulfram Gerstner
- 2010 Poster: Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models »
  Felipe Gerhard · Wulfram Gerstner
- 2008 Poster: Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning »
  Gediminas Luksys · Carmen Sandi · Wulfram Gerstner
- 2008 Oral: Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning »
  Gediminas Luksys · Carmen Sandi · Wulfram Gerstner
- 2007 Poster: An online Hebbian learning rule that performs Independent Component Analysis »
  Claudia Clopath · André Longtin · Wulfram Gerstner
- 2006 Poster: Effects of Stress and Genotype on Meta-parameter Dynamics in Reinforcement Learning »
  Gediminas Luksys · Jeremie Knuesel · Denis Sheynikhovich · Carmen Sandi · Wulfram Gerstner