When and why noise correlations are important in neural decoding

Hugo Gabriel Eyherabide et al.

J Neurosci. 2013 Nov 6;33(45):17921-36. doi: 10.1523/JNEUROSCI.0357-13.2013.

Abstract

Information may be encoded both in the individual activity of neurons and in the correlations between their activities. Understanding whether knowledge of noise correlations is required to decode all the encoded information is fundamental for constructing computational models, brain-machine interfaces, and neuroprosthetics. If correlations can be ignored with tolerable losses of information, the readout of neural signals is simplified dramatically. To assess this, previous studies constructed decoders that assume neurons fire independently and then derived bounds on the information that is lost. However, here we show that previous bounds were not tight and overestimated the importance of noise correlations. In this study, we quantify the exact loss of information induced by ignoring noise correlations and show why previous estimates were not tight. Further, by studying the elementary parts of the decoding process, we determine when and why information is lost on a single-response basis. We introduce the minimum decoding error to assess the distinctive role of noise correlations under natural conditions. We conclude that all of the encoded information can be decoded without knowledge of noise correlations in many more situations than previously thought.
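The decoding comparison described above can be illustrated with a small numerical sketch. The example below (in Python, with hypothetical discrete response distributions and a plain MAP rule, not the paper's exact estimators) builds a noise-independent surrogate from the product of the single-neuron marginals and compares the decoding-error probability of the two decoders; in this particular toy case the NI decoder loses nothing, in line with the paper's conclusion that correlations can often be ignored.

```python
import numpy as np

# Hypothetical joint response distributions P(r1, r2 | s) for two stimuli,
# on a 3 x 3 grid of response levels (rows: r1, columns: r2).
p_r_given_s = np.array([
    [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.0, 0.0]],   # stimulus S1: negatively correlated responses
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]],   # stimulus S2: positively correlated responses
])
p_s = np.array([0.5, 0.5])   # equally likely stimuli

def ni_surrogate(joint):
    """Noise-independent approximation: product of the single-neuron marginals."""
    return joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)

def map_decode(likelihoods, priors):
    """Maximum a posteriori stimulus index for every response bin."""
    return (likelihoods * priors[:, None, None]).argmax(axis=0)

def error_prob(decoded, p_r_given_s, p_s):
    """Probability of misclassifying the stimulus, averaged over responses."""
    return sum(p_s[k] * p_r_given_s[k][decoded != k].sum() for k in range(len(p_s)))

ni = np.stack([ni_surrogate(p_r_given_s[k]) for k in range(2)])
print("error, correlation-aware decoder:", error_prob(map_decode(p_r_given_s, p_s), p_r_given_s, p_s))
print("error, NI decoder:", error_prob(map_decode(ni, p_s), p_r_given_s, p_s))
```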


Figures

Figure 1.
Previous estimates of ΔI_NI^Min are tighter or looser depending on the context. A–C, Examples of the simultaneous activity of two neurons elicited by two stimuli: S1 (black) and S2 (gray). A, B, Two population responses per stimulus. Responses to S1 are negatively correlated and responses to S2 are positively correlated. C, Population responses have Gaussian distributions with mean values μ1 = [4,4] and μ2 = [6,6], variance equal to 1, and correlation coefficients ρ1 and ρ2 (for stimuli S1 and S2, respectively). D–F, Surrogate NI population responses (see Materials and Methods). Because of the NI assumption, response distributions associated with different stimuli overlap. However, here we show that overlaps do not necessarily imply that information is lost (see text). For each example, ΔI_NI^Min was estimated using four estimators: ΔI_NI, ΔI_NI^LS, ΔI_NI^D, and ΔI_NI^DL (criteria 8a–8d). G–L, Variation of the estimates with the stimulus probabilities (G–I), the response probabilities given S2 (J, K), and the correlation coefficient ρ2 (L). Only the parameters specified on the x-axis are varied; the remaining parameters are held constant. None of the estimators consistently lies below the others for all stimulus and response probabilities. Therefore, none of them constitutes a universal limit on the inefficiency of NI decoders and, depending on the context, they all overestimate, to a lesser or greater extent, the importance of noise correlations in neural decoding. Remaining parameters are as follows: in G, P(M,L|S1) = 0.5 and P(H,H|S2) = 0.5; in H, P(H,L|S1) = 0.8 and P(H,H|S2) = 0.5; in I, ρ1 = −0.9 and ρ2 = 0.7; in J, P(M,L|S1) = 0.5 and P(S1) = 0.25; in K, P(H,L|S1) = 0.5 and P(S1) = 0.5; in L, P(S1) = 0.2 and ρ1 = −0.9.
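One standard way to obtain surrogate NI responses like those in D–F is to shuffle trials within each stimulus, which removes the noise correlations while preserving the single-neuron marginals; the paper defines its surrogate in Materials and Methods, so the shuffling below is only an illustrative stand-in. The correlation coefficients used here are the panel I values (ρ1 = −0.9, ρ2 = 0.7), and the means and variances are those of panel C.

```python
import numpy as np

# Sketch: build surrogate NI responses for the Figure 1C Gaussian example by
# shuffling trials within each stimulus, then check that the noise correlation
# disappears while each neuron's marginal distribution is untouched.
rng = np.random.default_rng(0)
n_trials = 100_000
params = {"S1": ([4.0, 4.0], -0.9), "S2": ([6.0, 6.0], 0.7)}   # (mean, rho)

for stim, (mu, rho) in params.items():
    cov = np.array([[1.0, rho], [rho, 1.0]])
    r = rng.multivariate_normal(mu, cov, size=n_trials)

    # Surrogate NI responses: independently permute neuron 2's trials.
    r_ni = r.copy()
    r_ni[:, 1] = rng.permutation(r_ni[:, 1])

    print(stim,
          "correlation before:", np.corrcoef(r.T)[0, 1].round(3),
          "after shuffling:", np.corrcoef(r_ni.T)[0, 1].round(3))
```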
Figure 2.
Comparison of different strategies for constructing decoders that ignore noise correlations. Each panel shows the simultaneous responses of two neurons, R1 and R2, elicited by two stimuli, S1 and S2. In both examples, stimuli and responses are equally likely. A, Linear decoders trained with surrogate NI population responses (dashed line) extract all the encoded information, whereas no probabilistic NI decoder can do so. Specifically, a probabilistic NI decoder is inefficient for the range of probabilities P(R1,R2|Sk) complying with the following two conditions: P(2,2|S1)^2 = P(1,3|S1) P(3,1|S1) and P(3,3|S2)^2 = P(2,4|S2) P(4,2|S2). B, Although the neurons are noise independent, no linear decoder is capable of extracting all of the encoded information.
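The two product conditions quoted in A are straightforward to check for a candidate response table. The helper below is a minimal sketch; the example probabilities are hypothetical, and only the entries appearing in the conditions are listed (the remaining probability mass is assumed to sit on other responses).

```python
# Minimal helper to test the two conditions quoted in the Figure 2A caption,
# P(2,2|S1)^2 = P(1,3|S1) P(3,1|S1) and P(3,3|S2)^2 = P(2,4|S2) P(4,2|S2),
# for a candidate response table p[s][(r1, r2)].

def satisfies_figure2_conditions(p, tol=1e-9):
    c1 = abs(p["S1"][(2, 2)] ** 2 - p["S1"][(1, 3)] * p["S1"][(3, 1)]) < tol
    c2 = abs(p["S2"][(3, 3)] ** 2 - p["S2"][(2, 4)] * p["S2"][(4, 2)]) < tol
    return c1 and c2

# Hypothetical example: a geometric-mean structure satisfies both conditions.
p = {
    "S1": {(2, 2): 0.2, (1, 3): 0.4, (3, 1): 0.1},
    "S2": {(3, 3): 0.2, (2, 4): 0.1, (4, 2): 0.4},
}
print(satisfies_figure2_conditions(p))   # True
```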
Figure 3.
The canonical NI decoder is modeled as a sequence of transformations of the population response. The first stage involves the NI assumption, transforming the population response R into a vector R_NIL of NI likelihoods. The second stage involves the stimulus estimation, transforming R_NIL into the decoded stimulus S_NI. At each stage, information may be lost.
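Read as code, the canonical NI decoder is a composition of two functions: one that maps the response R to the vector of NI likelihoods R_NIL, and one that estimates the stimulus from R_NIL alone. The sketch below uses hypothetical single-neuron response tables and a MAP estimation stage; the paper allows any estimation algorithm in the second stage.

```python
import numpy as np

def ni_likelihoods(r, marginals):
    """Stage 1 (NI assumption): P_NI(R|S_k) = prod_n P(R_n|S_k)."""
    return np.array([np.prod([marginals[k][n][r[n]] for n in range(len(r))])
                     for k in range(len(marginals))])

def estimate_stimulus(r_nil, priors):
    """Stage 2 (stimulus estimation): here, maximum a posteriori on R_NIL."""
    return int(np.argmax(r_nil * priors))

# Hypothetical single-neuron response distributions P(R_n = r | S_k),
# with r in {0, 1, 2} playing the role of the L/M/H levels in the figures.
marginals = [
    [np.array([0.6, 0.3, 0.1]), np.array([0.5, 0.4, 0.1])],  # stimulus S1
    [np.array([0.1, 0.3, 0.6]), np.array([0.1, 0.4, 0.5])],  # stimulus S2
]
priors = np.array([0.5, 0.5])

r = (2, 1)                              # observed population response
r_nil = ni_likelihoods(r, marginals)    # vector of NI likelihoods
print("R_NIL =", r_nil, "-> decoded stimulus index:", estimate_stimulus(r_nil, priors))
```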
Figure 4.
Examples of population activities decoded using the canonical NI decoder. The arrows show the transformation of population responses into vectors of NI likelihoods (R_NIL) induced by the NI assumption (left to middle) and by optimal estimation algorithms (middle to right). In A, P(S1) is set to 0.75, and P(M,L|S1) and P(H,H|S2) are both set to 0.5. In B and C, stimuli are equally likely, and P(H,L|S1) and P(H,H|S2) are both set to 0.5 in B and to 0.66 in C. A, After the NI assumption, the distinction between responses elicited by different stimuli is preserved (middle panel). Therefore, ΔI_NI^NIL is zero and noise correlations are unimportant for decoding. B, After the NI assumption, all responses are identical. No information about the stimulus remains and noise correlations are crucial for decoding. C, However, whenever the population responses are not equally likely given each stimulus, the NI assumption preserves all the encoded information and noise correlations are unimportant for decoding. The case shown in B, in which noise correlations are important, therefore constitutes an isolated example.
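The contrast between B and C can be reproduced by computing the R_NIL vector of every response and checking whether responses elicited by different stimuli remain distinguishable. The sketch below follows the two-response-per-stimulus structure of the caption, with p standing for P(H,L|S1) = P(H,H|S2): at p = 0.5 all responses collapse onto a single R_NIL vector (panel B), whereas at p = 0.66 the vectors stay distinct (panel C).

```python
from itertools import product

def nil_vectors(p):
    """Map each response (r1, r2), with levels 'L'/'H', to its R_NIL vector."""
    # Joint tables P(r1, r2 | s) for the two stimuli of Figure 4B/4C.
    joint = {
        "S1": {("H", "L"): p, ("L", "H"): 1 - p},
        "S2": {("H", "H"): p, ("L", "L"): 1 - p},
    }
    vecs = {}
    for r1, r2 in product("LH", repeat=2):
        nil = []
        for s, table in joint.items():
            m1 = sum(v for (a, _), v in table.items() if a == r1)  # P(R1=r1|s)
            m2 = sum(v for (_, b), v in table.items() if b == r2)  # P(R2=r2|s)
            nil.append(m1 * m2)
        vecs[(r1, r2)] = tuple(round(x, 4) for x in nil)
    return vecs

print("p = 0.50:", nil_vectors(0.50))   # every response maps to the same R_NIL
print("p = 0.66:", nil_vectors(0.66))   # the R_NIL vectors stay distinct
```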
Figure 5.
The impact of the choice of an NI decoder. A, The classical NI decoder is modeled as a three-stage process involving the NI assumption (first stage), in which the population response is transformed into a vector of NI likelihoods (R_NIL); Bayes' rule (second stage), in which R_NIL is transformed into a vector of NI posteriors (R_NIP); and the estimation criterion (third stage), in which the decoded stimulus is inferred from R_NIP. Each stage may induce an information loss. B, C, Population responses of Figure 4, A and C, decoded with the classical NI decoder. B, Both R_NIL and R_NIP keep responses elicited by different stimuli segregated, and therefore all information is preserved. However, the stimulus cannot always be correctly inferred by simply choosing the one corresponding to the maximum NI posterior (argmax criterion; dotted line and arrows; right). Nevertheless, there is an estimation criterion capable of correctly estimating the stimulus (continuous lines and arrows; right). C, Although the NI assumption preserves all the encoded information, after Bayes' rule responses associated with different stimuli are merged, and thus some (but not all) information is lost (ΔI_NI^NIB is greater than zero). As a result, no estimation criterion is capable of perfectly decoding the stimulus. However, other NI decoders may still be optimal for decoding (Fig. 4C).
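The classical NI decoder differs from the canonical one of Figure 3 by its explicit Bayes stage. A minimal sketch of the second and third stages is given below, with hypothetical NI likelihoods and priors; as the caption notes, the argmax criterion used here is only one possible estimation criterion and is not always the best choice.

```python
import numpy as np

def bayes_ni_posteriors(r_nil, priors):
    """Stage 2 (Bayes' rule): R_NIP[k] proportional to P_NI(R|S_k) P(S_k)."""
    unnorm = r_nil * priors
    return unnorm / unnorm.sum()

def argmax_criterion(r_nip):
    """Stage 3 (estimation criterion): pick the maximum-posterior stimulus."""
    return int(np.argmax(r_nip))

r_nil = np.array([0.04, 0.24])   # hypothetical NI likelihoods for two stimuli
priors = np.array([0.75, 0.25])  # hypothetical stimulus probabilities

r_nip = bayes_ni_posteriors(r_nil, priors)
print("R_NIP =", r_nip, "-> decoded stimulus index:", argmax_criterion(r_nip))
```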
Figure 6.
Difference between assessing the role of noise correlations using mutual information or decoding error. The population response shown in Figure 5C is decoded using the classical NI decoder. Response probabilities are set according to P(H,L|S1) = P(L,L|S2). For different stimulus probabilities P(S1), A shows the variation of the minimum information loss ΔI_NI^NIP relative to the encoded information I(R;S), and B shows the variation of the increment in the minimum decoding error Δξ_NI^NIP relative to the minimum decoding error ξ_Min(R;S). The decoding error is measured here as the decoding-error probability. The curves for P(S1) = p are identical to the curves for P(S1) = 1 − p (0 ≤ p ≤ 1). A, Unlike the case shown in Figure 4B, here information is only partially lost and the loss depends on the stimulus and response probabilities. The maximum loss, however, occurs only when P(H,L|S1) reaches 0.5, regardless of the stimulus probability. B, Unlike ΔI_NI^NIP, Δξ_NI^NIP approaches its maximum value when P(H,L|S1) is greater than or equal to P(S1).
Figure 7.
Impact of ignoring noise correlations when decoding population responses with Gaussian distributions. In both examples, black and gray ellipses represent the contour curves of the response distributions elicited by two stimuli, S1 and S2, respectively. Stimuli are equally likely. Response parameters are as follows: μ_n^k = 5 + (−1)^k (A–C) and μ_n^k = 5 + 2(1 − n)k (D–F); σ_n^k = 1, ρ1 = −0.5, and ρ2 = 0.5 (Eq. 42). A, D, Optimal decoding strategy (minimization of the decoding-error probability) when correlations are known (Eq. 13). White regions are decoded as S1 and gray regions as S2. B, E, Optimal decoding strategy when correlations are ignored (Eq. 31). The arrows show the transformation of the population response R throughout the decoding process, as described in Figure 3. Population responses (left) are first transformed into vectors of NI likelihoods R_NIL (Eq. 19; middle). The distinction between regions filled with different gray levels is preserved. B, The optimal NI decoder maps the white region in the middle panel onto S1 and the gray region onto S2, thus decoding population responses in the same way as in A. E, The first transformation merges population responses that are decoded differently in D (compare the regions), thereby increasing the minimum decoding error. The optimal NI decoder maps the white and light-gray regions in the middle panel onto S1 and the gray and dark-gray regions onto S2. In both B and E, the optimal estimation criterion differs from the maximum-likelihood (or maximum-posterior) criterion, which maps regions above and below the diagonal (middle, dashed line) onto S2 and S1, respectively. C, F, Analysis of the responses R that can be merged, after the NI assumption, without information loss. Left, Contour curves of the NI likelihoods P_NI(R|S1) (black) and P_NI(R|S2) (gray). Black dots represent two responses that, for each stimulus, have the same NI likelihoods (and are thus mapped onto the same R_NIL). Right, Contour curves of P(S1|R) (black) and P(S2|R) (gray) passing through the responses denoted in the left panel (continuous and dashed lines, respectively). C, The NI assumption merges pairs of responses R that are symmetric with respect to the diagonal (left, dashed line). Because these pairs also have the same posterior probabilities (right), Equation 23 is fulfilled and no information is lost. F, The NI assumption merges pairs of responses R that are symmetric with respect to the line R1 = 5 (left, dashed line). These pairs have different posterior probabilities (right). Therefore, Equation 23 is not fulfilled and some information is lost.
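The two Gaussian examples can be explored with a Monte Carlo sketch. The code below compares a MAP decoder that knows the correlations with a maximum-likelihood NI decoder built from the single-neuron marginals; the A–C parameters follow the caption, the D–F means follow my reading of its formula, and, as the caption points out, the maximum-likelihood criterion on the NI likelihoods is not the optimal estimation criterion, so any gap it shows in the A–C case reflects the criterion rather than the NI assumption itself.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Monte Carlo sketch of the two Gaussian examples of Figure 7. "full" is a MAP
# decoder that knows the correlations; "NI" is a maximum-likelihood decoder on
# the product of the single-neuron marginals. These are plain decoding rules,
# not the paper's Equations 13 and 31 verbatim.
rng = np.random.default_rng(1)
n_trials = 100_000
rhos = (-0.5, 0.5)

def error_rates(means):
    """Decoding-error probability with and without knowledge of correlations."""
    joint = [multivariate_normal(m, [[1.0, rho], [rho, 1.0]])
             for m, rho in zip(means, rhos)]
    err_full = err_ni = 0.0
    for k in range(2):                      # stimuli are equally likely
        r = joint[k].rvs(n_trials, random_state=rng)
        ll_full = np.stack([joint[j].logpdf(r) for j in range(2)])
        ll_ni = np.stack([norm(means[j][0], 1).logpdf(r[:, 0]) +
                          norm(means[j][1], 1).logpdf(r[:, 1]) for j in range(2)])
        err_full += 0.5 * np.mean(ll_full.argmax(axis=0) != k)
        err_ni += 0.5 * np.mean(ll_ni.argmax(axis=0) != k)
    return err_full, err_ni

print("A-C means (4,4)/(6,6): full, NI =", error_rates([[4.0, 4.0], [6.0, 6.0]]))
print("D-F means (5,3)/(5,1): full, NI =", error_rates([[5.0, 3.0], [5.0, 1.0]]))
```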
Figure 8.
The NI decoder can paradoxically extract more information than that encoded by individual neurons. A, Example showing the responses of two neurons R1 and R2 elicited by three stimuli S1, S2, and S3. All stimuli are equally likely. Response probabilities P(L,L|S3), P(M,M|S1), and P(H,H|S2) are equal to α, and response probabilities P(L,M|S1), P(M,L|S1), P(L,H|S3), P(H,L|S3), P(M,H|S2), and P(H,M|S2) are equal to 0.5 − α/2, with α varying between 0 and 1. B, For a wide range of response probabilities, the NI decoder is capable of extracting more information than the sum of the information encoded by the individual neurons. This effect is enhanced by the fact that the latter sum is only an upper bound on the information conveyed individually by the neurons in the population.
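The effect in B can be checked numerically for a single value of α: build the response table of A, decode every response with a MAP rule on the product-of-marginals likelihoods, and compare the information between the true and decoded stimulus with the sum of the two single-neuron informations. The sketch below uses the hypothetical value α = 0.3 and a MAP estimation stage; the paper's NI decoder is not necessarily this exact one.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def mutual_info(p_joint):
    """I(X;Y) in bits from a joint probability table p_joint[x, y]."""
    return entropy(p_joint.sum(axis=1)) + entropy(p_joint.sum(axis=0)) \
        - entropy(p_joint.ravel())

L, M, H = 0, 1, 2
alpha = 0.3   # hypothetical value inside the range explored in the figure

# P(r1, r2 | s) for the three stimuli of Figure 8A.
p_r_given_s = np.zeros((3, 3, 3))
p_r_given_s[0, M, M] = alpha
p_r_given_s[0, L, M] = p_r_given_s[0, M, L] = 0.5 - alpha / 2
p_r_given_s[1, H, H] = alpha
p_r_given_s[1, M, H] = p_r_given_s[1, H, M] = 0.5 - alpha / 2
p_r_given_s[2, L, L] = alpha
p_r_given_s[2, L, H] = p_r_given_s[2, H, L] = 0.5 - alpha / 2
p_s = np.full(3, 1 / 3)

# Information encoded by each neuron alone: I(R1;S) + I(R2;S).
p_s_r1 = p_s[:, None] * p_r_given_s.sum(axis=2)   # joint over (s, r1)
p_s_r2 = p_s[:, None] * p_r_given_s.sum(axis=1)   # joint over (s, r2)
single_sum = mutual_info(p_s_r1) + mutual_info(p_s_r2)

# NI decoder: MAP on the product-of-marginals likelihoods, then measure the
# information between the true and the decoded stimulus.
ni = np.stack([p_r_given_s[k].sum(axis=1, keepdims=True) *
               p_r_given_s[k].sum(axis=0, keepdims=True) for k in range(3)])
decoded = (ni * p_s[:, None, None]).argmax(axis=0)
p_s_shat = np.zeros((3, 3))
for k in range(3):
    for shat in range(3):
        p_s_shat[k, shat] = p_s[k] * p_r_given_s[k][decoded == shat].sum()

print("sum of single-neuron informations:", single_sum)
print("information extracted by the NI decoder:", mutual_info(p_s_shat))
```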
