Coherent noise enables probabilistic sequence replay in spiking neuronal networks


Younes Bouhadjar, et al.

A spiking neural network recalls sequences in response to ambiguous cues

In this section, we provide a brief overview of the model and the task, illustrate how the network learns overlapping sequences occurring with different frequencies during the training, and show how these occurrence frequencies are encoded in the network. We then study the network responses to ambiguous cues and the influence of the occurrence frequencies on the recall behavior in the absence or presence of noise.

Similar to [5], the model consists of a randomly and sparsely connected network of NE excitatory neurons and a single inhibitory neuron (Fig 1A). Each excitatory neuron receives KEE excitatory inputs from other randomly chosen excitatory neurons. Excitatory neurons are subdivided into M subpopulations, each containing neurons with identical stimulus preference: in the absence of any additional connections, all neurons in a given subpopulation fire a spike upon the presentation of a specific sequence element. The inhibitory neuron is recurrently connected to the excitatory neurons. In contrast to [5], where each excitatory subpopulation is equipped with its own inhibitory neuron, we here use a single inhibitory neuron to implement a winner-take-all (WTA) competition between the subpopulations of excitatory neurons. At the same time, the inhibitory neuron mediates the competition between neurons within subpopulations and thereby leads to sparse activity and context sensitivity, as described in [5] and below. The network is driven by external inputs. Each external input represents a specific sequence element (“A”, “B”, …) and feeds all neurons in the subpopulation with the corresponding stimulus preference. Neurons are modeled as point neurons with the membrane potential evolving according to the leaky integrate-and-fire dynamics [31]. The total synaptic input current of excitatory neurons is composed of currents in distal dendritic branches, inhibitory currents, and currents from external sources, see Eq (5). The inhibitory neuron receives inputs only from excitatory neurons. The dynamics of dendritic currents include a nonlinearity describing the generation of dendritic action potentials (dAPs), see Eq (10). Synapses between excitatory neurons are plastic and subject to spike-timing-dependent plasticity and homeostatic control. Details on the network model are given in Materials and methods.


Fig 1. Network structure.

A) The architecture constitutes a recurrent network of subpopulations of excitatory neurons (filled gray circles) and a single inhibitory neuron (Inh). Each excitatory subpopulation contains neurons with identical stimulus preferences. Excitatory neurons are stimulated by external sources providing sequence-element specific inputs “A”,“F”, “B”, etc. Connections between and within the excitatory subpopulations are random and sparse. The inhibitory neuron is recurrently connected to all excitatory neurons. In the depicted example, the network is repetitively presented with two sequences {A,F,B,D} (brown) and {A,F,C,E} (blue) during learning. The sequence {A,F,C,E} occurs twice as often as {A,F,B,D}. B) During learning, the network forms sequence specific subnetworks (blue and brown arrows representing {A, F, B, D} and {A,F,C,E}, respectively) as a result of the synaptic plasticity dynamics. The connections between subpopulations representing the sequence shown more often are stronger (thick arrows). C) The network can be configured into a replay mode by increasing the neuronal excitability. During the replay mode, the network is presented with a cue stimulus representing the first sequence element “A”. In addition, the excitatory subpopulations receive input from distinct sources of background noise (gray traces) which is not present during learning. In the replay mode, the synaptic plasticity is switched off.

During learning, the network is exposed to repeated presentations of S sequences s1, …, sS, such that each sequence si occurs with a specific frequency pi (for details on the learning protocol, see Materials and methods). For illustration, we focus here on a simple set of two sequences {A,F,B,D} and {A,F,C,E}, where the first sequence is shown with a relative frequency p1 = p and the second with p2 = 1 − p (e.g., p = 1/3 in Fig 2A). In the following, we refer to {A,F,B,D} as sequence 1 and to {A,F,C,E} as sequence 2. Before learning, presenting a sequence element causes all neurons in the respective subpopulation to fire. During the learning process, the repetitive sequential presentation of sequence elements increases the strength of connections between the corresponding subpopulations to a point where the activation of a certain subpopulation by an external input generates dAPs in a specific subset of neurons in the subpopulation representing the subsequent element. The generation of the dAPs results in a long-lasting depolarization (∼ 50 − 500 ms) of the soma. We refer to neurons that generate a dAP as predictive neurons. When receiving an external input, predictive neurons fire earlier than non-predictive neurons. If a group of at least ρ neurons are predictive within a certain subpopulation, their advanced spikes initiate a fast and strong inhibitory feedback to all excitatory neurons, ultimately suppressing the firing of non-predictive neurons. After learning, the model develops specific subnetworks representing the learned sequences (Fig 1B), such that the presentation of a sequence element leads to a context dependent prediction of the subsequent element [5]. As a result of Hebbian learning, the synaptic weights in the subnetwork corresponding to the most frequent sequence during learning are on average stronger than those for the less frequent sequence (Figs 1B, 3A and 4A).
In the prediction mode, this asymmetry in synaptic weights plays no role. For ambiguous stimuli, all potential outcomes are predicted, i.e., the network predicts both “C” and “B” simultaneously in response to stimuli “A” and “F”, irrespective of the training frequencies.


Fig 2. Task.

A) During learning, the model is exposed to two (or more) competing sequences with different frequencies. Here, sequence 2 ({A,F,C,E}; blue) is shown twice as often as sequence 1 ({A,F,B,D}; brown). The respective normalized training frequencies p1 = 1/3 and p2 = 2/3 are depicted by the histogram. B) During replay, the network autonomously recalls the sequences in response to an ambiguous cue (first sequence element “A”; open black squares) according to different strategies. Maximum probability (max-prob): only the sequence with the highest training frequency is replayed. Probability matching (prob. matching): the replay frequency of a sequence matches its training frequency. Full exploration: all sequences are randomly replayed with the same frequency, irrespective of the training frequency. Histograms represent the replay frequencies of sequences 1 and 2, respectively.


Fig 3. Correlated noise enhances exploratory behavior.

A) Sketch of subpopulations of excitatory neurons (boxes) representing the elements of the two sequences {A,F,C,E} (seq. 2) and {A,F,B,D} (seq. 1). The subpopulations “C” and “B” are unfolded showing their respective neurons. The arrows depict the connections after learning the task shown in Fig 2A. The line thickness represents the population averaged synaptic weight. The presentation of the character “A” constitutes an ambiguous cue during replay. The inhibitory neuron (Inh) mediates competition between subpopulations through the winner-take-all (WTA) mechanism. B,C,D) Spiking activity in the subpopulations depicted in panel A in response to three repetitions of the ambiguous cue “A” (black triangles at the top and vertical dotted lines) for three different noise configurations σ = 0 pA, c = 0 (B), σ = 26 pA, c = 0 (C), and σ = 26 pA, c = 1 (D). Brown, blue, and silver dots mark somatic spikes of excitatory neurons corresponding to sequence 1, sequence 2, and both, respectively. For clarity, only the sparse subsets of active neurons in each population are shown. Red dots mark spikes of the inhibitory neuron. Panels C and D depict representative recall behavior. See Fig 4 for detailed statistics across trials and network realizations. See Table 9 for model parameters.


Fig 4. Uncorrelated noise averages out in population based encoding.

Dependence of A) the compound weights (PSC amplitudes) wBF (brown) and wCF (blue; see Fig 3A), B–D) the population averaged response latencies tB and tC (subpopulation averaged time of first spike after the cue “A”; see Eq (1)) for subpopulations “B” (brown) and “C” (blue), and E–G) the relative replay frequencies of sequences 1 (brown) and 2 (blue), the failure rate f (gray), and the joint probability of replaying both sequences (silver) on the training frequency p1 = p of sequence 1. Note that the inhibition is disabled when measuring the latencies to ensure that both competing populations “B” and “C” elicit spikes. Panels B–G depict results for three different noise configurations σ = 0 pA, c = 0 (B,E), σ = 26 pA, c = 0 (C,F), and σ = 26 pA, c = 1 (D,G). In panel A, circles and error bars depict the mean and the standard deviation across different network realizations. In panels B–D, circles and error bars represent the mean and the standard deviation across Nt = 151 trials (cue repetitions), averaged across 5 different network realizations. In panels E–G, circles represent the mean across Nt = 151 trials, averaged across 5 different network realizations. See Table 9 for remaining parameters. Same task as described in Fig 2.

The model can be configured into a replay mode, where the network autonomously replays learned sequences in response to a cue stimulus. This is achieved by changing the excitability of the neurons such that the activation of a dAP alone can cause the neurons to fire [5]. In addition, the synaptic plasticity is disabled during replay to preserve the encoding of the training frequencies in the synaptic weights (Fig 4A; see also Discussion). In the replay mode, we present ambiguous cues and study whether the network can replay sequences following different strategies (Fig 2B). We refer to the “maximum probability” strategy (Fig 2B, left) as the case where the network exclusively replays the sequence with the highest occurrence frequency during training. When adopting the “probability matching” strategy, the network replays sequences with a frequency that matches the training frequency (Fig 2B, middle). The “full exploration” strategy refers to the case where all sequences are randomly replayed with the same frequency, irrespective of the training frequency (Fig 2B, right). In Fig 3, we illustrate the network’s decision behavior by presenting the ambiguous cue stimulus “A” three times. In the absence of noise, the network adopts the maximum probability strategy (Fig 3B): as a result of the higher weights between the neurons representing the more frequent sequence, the dAPs are activated earlier in these neurons, which advances their somatic firing times with respect to the neurons representing the less frequent sequence. This advanced response time quickly activates the inhibitory neuron, which suppresses the activity of the other neurons.
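The three replay strategies defined above can be read as idealized mappings from training frequencies to replay frequencies. The following minimal sketch makes these target distributions explicit; the function name `replay_distribution` and the strategy labels are illustrative assumptions, not part of the model.

```python
def replay_distribution(train_freqs, strategy):
    """Idealized replay frequencies for a list of training frequencies.

    Hypothetical helper for illustration only: it states the target
    behavior of each strategy, not how the network achieves it.
    """
    if not train_freqs or any(p < 0 for p in train_freqs):
        raise ValueError("train_freqs must be a non-empty list of non-negative numbers")
    total = sum(train_freqs)
    if total <= 0:
        raise ValueError("at least one training frequency must be positive")
    n = len(train_freqs)

    if strategy == "max_prob":
        # Only the most frequent sequence is replayed. For ties, the first
        # maximum wins (an arbitrary choice made for this sketch).
        best = max(range(n), key=lambda i: train_freqs[i])
        return [1.0 if i == best else 0.0 for i in range(n)]
    if strategy == "prob_matching":
        # Replay frequency equals the normalized training frequency.
        return [p / total for p in train_freqs]
    if strategy == "full_exploration":
        # All sequences are replayed equally often.
        return [1.0 / n] * n
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    p = [1 / 3, 2 / 3]  # training frequencies of sequences 1 and 2 (Fig 2A)
    for name in ("max_prob", "prob_matching", "full_exploration"):
        print(name, replay_distribution(p, name))
```

For the two-sequence task of Fig 2A, max-prob puts all replay mass on sequence 2, probability matching reproduces (1/3, 2/3), and full exploration yields (1/2, 1/2).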

To assess the replay performance, we present the ambiguous cue “A” for Nt trials and examine the replay frequencies of the two sequences s1 = {A,F,B,D} and s2 = {A,F,C,E} as a function of their relative occurrence frequencies pi during training. We define the sequences {A,F,B,D} or {A,F,C,E} to be successfully replayed if more than 0.5ρ = 10 neurons in the last subpopulations “D” or “E” have fired, respectively (for details on the assessment of the replay statistics, see Materials and methods). In the absence of noise, the network replays only the sequence with the highest training frequency (Fig 4E). To understand this behavior, we inspect the response latencies tB/C of the subpopulations “B” and “C” as a function of the training frequencies (Fig 4B). Here, the response latency

$$t_x = \frac{1}{\rho} \sum_{i} t_i \qquad (1)$$

of the subpopulation representing sequence element x ∈ {B,C} corresponds to the population average of the single-neuron response latencies ti (time of first spike after the cue), where the sum runs over the ρ active neurons i in this subpopulation. Averaged across trials, the response latency is smaller for the subpopulation participating in the sequence with the higher frequency. The response latencies tB and tC decrease with increasing respective training frequencies. In the absence of noise, the distribution of the response latencies tB/C across trials is very narrow (Fig 4B). Consequently, neurons representing the most frequent sequence fire earlier in all trials. For training frequencies between 0.4 and 0.6, the difference between tB and tC in some network realizations is small compared to the response latency of the WTA circuit. Hence, both sequences are occasionally replayed simultaneously (Fig 4E).
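The replay criterion and the population latency described above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, silent neurons are marked with `None`, ρ = 20 follows from 0.5ρ = 10, and the four trial outcomes are treated as mutually exclusive categories, which may differ from the bookkeeping used for Fig 4.

```python
def population_latency(first_spike_times, cue_time):
    """Mean time of first spike after the cue across active neurons (cf. Eq (1)).

    first_spike_times: one entry per neuron; None marks a silent neuron.
    Returns None if no neuron in the subpopulation fired.
    """
    latencies = [t - cue_time for t in first_spike_times if t is not None]
    if not latencies:
        return None
    return sum(latencies) / len(latencies)


def classify_trial(n_fired_d, n_fired_e, rho=20):
    """Classify one trial from the number of neurons that fired in the
    last subpopulations "D" (sequence 1) and "E" (sequence 2)."""
    threshold = 0.5 * rho  # a sequence counts as replayed if MORE than 0.5*rho fired
    seq1 = n_fired_d > threshold
    seq2 = n_fired_e > threshold
    if seq1 and seq2:
        return "both"
    if seq1:
        return "seq1"
    if seq2:
        return "seq2"
    return "failure"


def replay_statistics(trials, rho=20):
    """Relative frequency of each outcome over a list of (n_fired_d, n_fired_e)."""
    if not trials:
        raise ValueError("need at least one trial")
    counts = {"seq1": 0, "seq2": 0, "both": 0, "failure": 0}
    for n_d, n_e in trials:
        counts[classify_trial(n_d, n_e, rho)] += 1
    return {outcome: n / len(trials) for outcome, n in counts.items()}


if __name__ == "__main__":
    print(population_latency([12.0, 14.0, None], cue_time=10.0))
    print(replay_statistics([(20, 0), (0, 20), (0, 20), (0, 0)]))
```

Note that `population_latency` divides by the number of neurons that actually fired, which coincides with the 1/ρ normalization when exactly ρ neurons are active.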

To foster exploratory behavior, i.e., to enable occasional replay of the low-frequency sequence, we equip the excitatory neurons with background noise. For simplicity, this background noise is added only during replay, but not during learning (see Discussion). In this work, we investigate two different forms of background noise. Here, we first consider noise provided in the form of stationary synaptic background input (see below for an alternative form of noise). To this end, each subpopulation of excitatory neurons receives input from its private pool of independent excitatory and inhibitory Poissonian spike sources (Fig 1C). The background noise is parameterized by the noise amplitude σ (standard deviation of the synaptic input current arising from these background inputs) and the noise correlation c (see Fig 1C and Materials and methods). Inputs to neurons of the same subpopulation are correlated to an extent parameterized by c. Neurons in different subpopulations receive uncorrelated inputs. The noise amplitude σ is chosen such that the subthreshold membrane potentials of the excitatory neurons fluctuate without eliciting additional spikes. As a consequence, the distributions of response latencies tB/C across trials may be broadened and partly overlap (Fig 4C and 4D). As we will show in the following, the network can adopt different replay strategies (Fig 2B) depending on the amount of this overlap. Adding the weak noise employed here during training would hardly affect the network behavior, because the external inputs (stimuli) are strong and lead to reliable, immediate responses.

With uncorrelated noise (c = 0), the replay behavior remains effectively non-explorative, i.e., only the high-frequency sequence is replayed in response to the cue (Fig 3C). This is explained by the fact that each sequence element is represented by a subset of ρ neurons, or in other words, that the response latency tx in Eq (1) is a population averaged quantity. Its across-trial variance

$$v = \frac{v_s}{\rho} \left[ 1 + (\rho - 1)\, c_s \right] \qquad (2)$$

is determined by the population size ρ, the population averaged spike-time variance $v_s = \frac{1}{\rho} \sum_i \mathrm{Var}(t_i)$, and the population averaged spike-time correlation coefficient $c_s = \frac{1}{\rho (\rho - 1)\, v_s} \sum_i \sum_{j \neq i} \mathrm{Cov}(t_i, t_j)$, with Cov(ti, tj) denoting the spike-time covariance for two neurons i and j. Here, we use the subscript “s” to indicate that vs and cs refer to the (co-)variability in the (first) “spike” times. The spike-time statistics vs and cs depend on the input noise statistics σ and c in a unique and monotonic manner [32, 33]. In the absence of correlations (c = cs = 0), the across-trial variance v of tx vanishes for large population sizes ρ. For finite population sizes, v is non-zero but small (Fig 4C). The effect of the synaptic background noise on the variability of response latencies largely averages out. Hence, the average advance in the response of the population representing the high-frequency sequence cannot be overcome by noise; the network typically replays only the sequence with the higher occurrence frequency during training (Fig 4F). For small differences in the training frequencies (p ≈ 0.5), the network occasionally fails to replay any sequence or replays both sequences. The mechanism underlying this behavior is explained below.
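The averaging argument above can be checked numerically. The sketch below assumes the standard expression for the variance of a mean of ρ equally correlated variables, v = (vs/ρ)[1 + (ρ − 1)cs], and compares it with a Monte Carlo estimate. It draws Gaussian first-spike times directly from a shared and a private component; this is a simplifying assumption that bypasses the mapping from input noise (σ, c) to spike-time statistics (vs, cs). All function names and parameter values are illustrative.

```python
import math
import random


def latency_variance(vs, cs, rho):
    """Across-trial variance of the population averaged latency,
    v = vs / rho * (1 + (rho - 1) * cs)."""
    return vs / rho * (1.0 + (rho - 1) * cs)


def simulate_latency_variance(vs, cs, rho, n_trials=5000, seed=1):
    """Monte Carlo estimate of the same quantity.

    Each neuron's latency fluctuation is sqrt(cs) * shared + sqrt(1 - cs) * private,
    scaled to variance vs, so that any two neurons have correlation cs.
    """
    rng = random.Random(seed)
    sd = math.sqrt(vs)
    w_shared = math.sqrt(cs)
    w_private = math.sqrt(1.0 - cs)
    means = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # common to all neurons in this trial
        total = 0.0
        for _ in range(rho):
            private = rng.gauss(0.0, 1.0)  # independent for each neuron
            total += sd * (w_shared * shared + w_private * private)
        means.append(total / rho)
    grand_mean = sum(means) / n_trials
    return sum((m - grand_mean) ** 2 for m in means) / n_trials


if __name__ == "__main__":
    for cs in (0.0, 0.3, 1.0):
        exact = latency_variance(1.0, cs, 20)
        estimate = simulate_latency_variance(1.0, cs, 20)
        print(f"cs={cs}: formula={exact:.3f}, simulation={estimate:.3f}")
```

With cs = 0 the variance shrinks as 1/ρ, whereas with cs = 1 it stays at vs regardless of the population size.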

Noise averaging is efficiently avoided by introducing noise correlations. For perfectly correlated noise and, hence, perfectly synchronous spike responses (c = cs = 1), the across-trial variance v of the response latency tx is identical to the across-trial variance vs of the individual spike responses, i.e., v = vs, irrespective of the population size ρ; see Eq (2). For smaller but non-zero spike correlations (0 < cs < 1), the latency variance v is reduced but does not vanish as ρ becomes large. Hence, in the presence of correlated noise, the across-trial response latency distributions for two competing populations have a finite width and may overlap (Fig 4D), thereby permitting an occasional replay of the sequence observed less often during training (Figs 3D and 4G and S6 Fig). Replay, therefore, becomes more exploratory, such that the occurrence frequencies during training are gradually mapped to the frequencies of sequence replay. With an appropriate choice of the noise amplitude and correlation, even an almost perfect match between training and replay frequencies can be achieved (probability matching; Fig 4G). For a training frequency p = 0.2, the replay frequency matches p already after about 20 training episodes (S5 Fig).
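The consequence of overlapping latency distributions can be illustrated with a toy race between two populations. This is a deliberately reduced sketch, not the spiking network: it assumes Gaussian, independent population latencies for “B” and “C” with a width given by the variance formula for averaged correlated latencies, lets the earlier population win every trial, and ignores the finite WTA latency, failures, and joint replay. The mean latencies (11 ms and 10 ms), vs = 1 ms², and all names are hypothetical values chosen for illustration.

```python
import math
import random


def latency_std(vs, cs, rho):
    """Across-trial standard deviation of the population averaged latency."""
    return math.sqrt(vs / rho * (1.0 + (rho - 1) * cs))


def fraction_b_wins(mean_b, mean_c, vs, cs, rho, n_trials=2000, seed=0):
    """Fraction of trials in which population "B" responds before "C"."""
    rng = random.Random(seed)
    sd = latency_std(vs, cs, rho)
    wins_b = 0
    for _ in range(n_trials):
        t_b = rng.gauss(mean_b, sd)  # latency of the low-frequency population
        t_c = rng.gauss(mean_c, sd)  # latency of the high-frequency population
        if t_b < t_c:
            wins_b += 1
    return wins_b / n_trials


if __name__ == "__main__":
    # "B" belongs to the less frequent sequence and is slower on average.
    for cs in (0.0, 1.0):
        frac = fraction_b_wins(mean_b=11.0, mean_c=10.0, vs=1.0, cs=cs, rho=20)
        print(f"cs={cs}: low-frequency sequence wins in {frac:.1%} of trials")
```

In this toy setting, uncorrelated spike-time variability leaves the slower population almost no chance of winning, while fully correlated variability lets it win in a sizable fraction of trials. How closely the replay frequency tracks the training frequency depends on the latency difference and the noise amplitude, which is the tuning described in the text.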

The results presented so far can be extended towards more than two competing sequences. As a demonstration, we train the network using five sequences {A,F,B,D}, {A,F,C,E}, {A,F,G,H}, {A,F,I,J}, and {A,F,K,L} presented with different relative frequencies. By adjusting the noise amplitude σ and correlation c, the replay frequencies can approximate the training frequencies (Fig 5).


Fig 5. Multiple competing sequences are learned and replayed according to their occurrence frequencies (probability matching).

During learning, five competing, partly overlapping sequences s1 = {A, F, B, D}, s2 = {A,F,C,E}, s3 = {A,F,G,H}, s4 = {A,F,I,J}, and s5 = {A,F,K,L} are repetitively presented with relative training frequencies p1 = 0.1, p2 = 0.14, p3 = 0.2, p4 = 0.23, p5 = 0.33, respectively (dotted red lines). After learning, the network autonomously replays the learned sequences in response to the ambiguous cue “A” with frequencies depicted by the blue bars. Parameters: σ = 12 pA, c = 1, τh = 4620 ms, z* = 21, Ne = 101, M = 12. See Table 9 for remaining parameters.

Random stimulus locking to spatiotemporal oscillations as natural form of noise

In vivo cortical activity is rarely stationary. Usually, it is characterized by substantial temporal and spatial fluctuations, often occurring in the form of transient spatiotemporal oscillations, i.e., cortical waves [27, 40–42]. In the presence of traveling cortical waves, nearby neurons share the same oscillation phase, whereas distant neurons experience different phases (Fig 8). At the time of stimulus arrival, neurons in the up phase are more excitable and tend to fire earlier than neurons in a down phase. Cortical waves can be locked to external stimuli or events such as saccades [43], but they also occur spontaneously without locking to external cues [44]. Here, we exploit this finding and assume that the cue onset times are random with respect to the oscillation phase, thereby introducing a locally coherent form of trial-to-trial variability during replay.


Fig 8. Random locking of stimulus to global oscillations as a form of noise.

A) Snapshot of a wave of activity traveling across a cortical region at time t1 of the 1st stimulus onset. Grayscale depicts wave amplitudes in different regions. Brown and blue rectangles mark populations of neurons with stimulus preferences “B” and “C”, respectively. B) Background inputs to neurons in populations “B” and “C” at different times. Background inputs to neurons within each population are in phase due to their spatial proximity. Background inputs to different populations are phase shifted. Arrows on the top depict stimulus onset times. The times t1, t2, … of input arrival to populations “B” and “C” (dashed vertical lines) are random, i.e., not locked to the background activity.

To investigate the effect of this type of variability on the replay performance, we first train the network in the absence of any background input using the same two-sequence task and training setup discussed in earlier sections. During replay, we inject an oscillating background current with amplitude a and frequency f into all excitatory neurons (see Materials and methods). Neurons within a given subpopulation share the same oscillation phase. Phases for different subpopulations are randomly drawn from a uniform distribution between 0 and 2π. The replay performance of the network is assessed by monitoring the network responses to repetitive presentations of an external cue “A” with random, uniformly distributed inter-cue intervals. The analysis is repeated for a range of training frequencies p, oscillation amplitudes a, and frequencies f.
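The protocol above can be sketched as follows, to show why random cue timing turns a deterministic oscillation into trial-to-trial variability that is shared within a subpopulation. The sketch assumes a sinusoidal background current a·sin(2πft + φ) with one phase per subpopulation; the inter-cue interval range of 0.3 to 0.7 s, the amplitude units, and all function names are hypothetical choices for illustration.

```python
import math
import random
import statistics


def background_current(t, amplitude, freq_hz, phase):
    """Oscillating background current at time t (in seconds)."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t + phase)


def sample_background_at_cues(amplitude, freq_hz, n_cues,
                              interval_range=(0.3, 0.7), seed=0):
    """Background current seen by subpopulations "B" and "C" at each cue onset.

    All neurons of a subpopulation share one phase, so the sampled value is
    common to the whole subpopulation and is not reduced by averaging.
    """
    rng = random.Random(seed)
    phase_b = rng.uniform(0.0, 2.0 * math.pi)
    phase_c = rng.uniform(0.0, 2.0 * math.pi)
    t = 0.0
    samples = []
    for _ in range(n_cues):
        t += rng.uniform(*interval_range)  # random inter-cue interval
        samples.append((background_current(t, amplitude, freq_hz, phase_b),
                        background_current(t, amplitude, freq_hz, phase_c)))
    return samples


if __name__ == "__main__":
    samples = sample_background_at_cues(amplitude=20.0, freq_hz=10.0, n_cues=5000)
    var_b = statistics.pvariance([b for b, _ in samples])
    b_higher = sum(1 for b, c in samples if b > c) / len(samples)
    print(f"variance of background at cue onset: {var_b:.1f} (a^2/2 = 200.0)")
    print(f"fraction of cues where 'B' is more excited than 'C': {b_higher:.2f}")
```

Because the cue times are not locked to the oscillation, the background value at cue onset varies across trials with a variance close to a²/2, independent of the subpopulation size, and the population that is momentarily more excited changes from trial to trial.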

Depending on the choice of the oscillation amplitude a and frequency f, the network replicates different replay strategies (Fig 9). For low-amplitude oscillations, the model replays only the sequence with the higher training frequency (max-prob). With increasing oscillation amplitude, it becomes more explorative and occasionally replays the less frequent sequence. By adjusting the oscillation amplitude, the replay frequency can be closely matched to the training frequency. This behavior of the model is observed for a range of physiological frequency bands such as alpha (∼ 10 Hz), beta (∼30 Hz), and gamma (∼ 70 Hz) [45, 46]. Higher oscillation frequencies are less effective due to the low-pass characteristics of neuronal membranes and synapses. Consequently, increasing the oscillation frequency leads to a more reliable replay of the most frequent sequence. For slow oscillations with long periods that are large compared to the average inter-cue interval, the network responses in subsequent trials are more correlated. For sufficiently many trials, however, the network can still explore different solutions.


Fig 9. Changing replay strategy by modulation of spatiotemporal background oscillations.

Dependence of the relative replay frequencies of sequences 1 (brown) and 2 (blue), the failure rate f (gray), and the joint probability of replaying both sequences (silver) on the relative training frequency p1 = p of sequence 1 for different amplitudes a ∈ {0, 10, 20} and frequencies of the background oscillations: f = 10 Hz (B,C), f = 30 Hz (A,D,E), and f = 70 Hz (F,G). Circles represent the mean across Nt = 181 trials, averaged across 5 network realizations. See Table 9 for remaining parameters. Same task as described in Fig 2.

To conclude: cortical waves in a range of physiological frequencies represent a form of highly fluctuating and locally correlated background activity. The absence of a systematic stimulus locking to this activity constitutes a natural source of randomness that does not average out and is hence well suited to generate robust exploratory behavior. The degree of exploratoriness, i.e., the decision strategy, can be adjusted in a biologically plausible manner by controlling the wave amplitude or frequency.
