Sci Adv. 2020 Oct 23;6(43):eabb3984. doi: 10.1126/sciadv.abb3984. Print 2020 Oct.

A shared neural substrate for action verbs and observed actions in human posterior parietal cortex

T Aflalo et al.

Abstract

High-level sensory and motor cortical areas are activated when processing the meaning of language, but it is unknown whether, and how, words share a neural substrate with corresponding sensorimotor representations. We recorded from single neurons in human posterior parietal cortex (PPC) while participants viewed action verbs and corresponding action videos from multiple views. We find that PPC neurons exhibit a common neural substrate for action verbs and observed actions. Further, videos were encoded with mixtures of invariant and idiosyncratic responses across views. Action verbs elicited selective responses from a fraction of these invariant and idiosyncratic neurons, with no preference for either type, thus associating each verb with a statistical sample of the diverse sensory representations related to the corresponding action concept. Controls indicated that the results are not the product of visual imagery or arbitrary learned associations. Our results suggest that language may activate the consolidated visual experience of the reader.


Figures

Fig. 1. Human parietal neurons are selective for observed actions and action verbs.
(A) Example neurons illustrating diverse selectivity patterns (SPs) across formats. Left: Sample still frames depicting stimuli for one of the five action exemplars (“grasp”) in each format (see fig. S3 for all action exemplars). Right: Representative units illustrating diverse neural responses to the five tested actions (color-coded) across the four tested formats. Each panel shows the firing rate (means ± SEM) through time for each action for a single format. Each column illustrates the responses of the same unit to the four formats. See fig. S1 for recording locations. Photo credit: Guy Orban, Department of Medicine and Surgery, Parma University. (B) Percentage of units with significant action selectivity split by format [means ± 95% confidence interval (CI), one-way ANOVA, P < 0.05 FDR-corrected]. Zero units were selective in each format during the 1-s window before stimulus onset (one-way ANOVA, P < 0.05 FDR-corrected). (C) Cross-validated R2 of units with significant selectivity [units significant in (B)] split by format (means ± 95% CI). (D) Sliding-window within-format classification accuracy for manipulative actions. Sliding window = overlapping 300-ms windows with 10-ms increments. Classification applied to data pooled across sessions. Black horizontal dashed line = chance classification performance. Blue horizontal dashed line = 97.5th percentile of prestimulus classification accuracy for the text condition. Horizontal colored bars indicate time of significant classification. Inset displays color code for format and associated latency estimate for onset of significant decoding (see fig. S7).
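The selectivity test in (B) lends itself to a compact illustration. Below is a minimal sketch of a per-unit one-way ANOVA across the five actions with Benjamini-Hochberg FDR correction, run on synthetic data; the array shapes, variable names, and Poisson firing rates are assumptions for illustration, not the authors' code.

    # Sketch: per-unit action selectivity (one-way ANOVA, FDR-corrected),
    # in the style of Fig. 1B. Synthetic data; names and shapes are assumed.
    import numpy as np
    from scipy.stats import f_oneway
    from statsmodels.stats.multitest import multipletests

    rng = np.random.default_rng(0)
    n_units, n_actions, n_trials = 100, 5, 8
    # firing_rates[u, a, t]: rate of unit u on trial t of action a
    firing_rates = rng.poisson(5.0, size=(n_units, n_actions, n_trials)).astype(float)

    # One ANOVA per unit, comparing its trial rates across the five actions
    pvals = np.array([
        f_oneway(*[firing_rates[u, a] for a in range(n_actions)]).pvalue
        for u in range(n_units)
    ])
    selective = multipletests(pvals, alpha=0.05, method="fdr_bh")[0]
    print(f"{selective.mean():.1%} of units action-selective (FDR-corrected)")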
Fig. 2. Action verbs link with observed actions.
(A) Across-format and within-format classification of manipulative actions. x-axis labels indicate the formats used for classifier training and testing (e.g., for across format, train→test). Dots = single-session result. Rectangle = 95% bootstrapped CI over sessions. Gray (red): values for matched (mismatched) labels across formats (see inset for definitions). Dashed horizontal lines show within-format cross-validated accuracy (mean across single-session results). All comparisons with chance performance (dashed line) or shuffled alignment reached significance (Wilcoxon rank-sum test, P < 0.05). (B) Similar to (A) but for EGS. Cross-format classification significant between all visual formats and between visual and text formats when pooling visual formats (see bar with asterisk). (C) Correlation of neural population responses across pairs of formats. Conventions as in (A). (D) Same as (C) for participant EGS (black horizontal bar indicates data that were pooled for statistical testing). (E) Pairwise population correlation while controlling for additional formats using partial correlation. Resulting correlations are above chance (part corr = 0) but below standard correlation values (mean = red diamonds). (F) Same as (E) for participant EGS.
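The across-format decoding in (A) reduces to training a classifier on population responses in one format and testing it on another. A minimal sketch on synthetic trial-by-unit matrices follows; the variable names and the choice of linear discriminant analysis are assumptions, and the authors' classifier and preprocessing may differ.

    # Sketch: train on text responses, test on video responses, and compare
    # to within-format cross-validated accuracy, as in Fig. 2A. Synthetic data.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n_trials, n_units, n_actions = 40, 60, 5
    labels = np.tile(np.arange(n_actions), n_trials // n_actions)
    shared = rng.normal(size=(n_actions, n_units))    # format-invariant tuning
    X_text = shared[labels] + rng.normal(scale=0.5, size=(n_trials, n_units))
    X_video = shared[labels] + rng.normal(scale=0.5, size=(n_trials, n_units))

    clf = LinearDiscriminantAnalysis().fit(X_text, labels)   # train on text
    across = clf.score(X_video, labels)                      # test on video
    within = cross_val_score(LinearDiscriminantAnalysis(), X_video, labels, cv=5).mean()
    print(f"across-format {across:.2f} vs within-format {within:.2f} (chance 0.20)")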
Fig. 3. Single-neuron SPs link action verbs and observed actions.
(A) Schematic illustrating the four possible ways the SP can compare across two formats (see fig. S9 for expanded description). (B) Summary of SPs across pairs of formats for participant NS (see fig. S9). Red = matched SP; gray = mismatched SP; cyan and light green = selectivity for a single format only [see title colors in (A)]. Photo credit: Guy Orban, Department of Medicine and Surgery, Parma University. (C) Same as (B) for participant EGS. “=” indicates matched SP, and “&” denotes mismatched SP.
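One simple way to picture the matched/mismatched distinction in (A) is to correlate a unit's per-action tuning across two formats. The paper's actual criterion is a model selection analysis (see Fig. 4 and fig. S9); the correlation below is a simplified, hypothetical stand-in on synthetic data.

    # Sketch: call a unit's selectivity pattern "matched" across formats if
    # its per-action tuning curves agree. Simplified stand-in, synthetic data.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(2)
    n_actions = 5
    tuning_text = rng.normal(size=n_actions)              # mean rate per action, text
    tuning_video = tuning_text + rng.normal(scale=0.3, size=n_actions)

    r, p = pearsonr(tuning_text, tuning_video)            # same action preference?
    print("matched SP" if (r > 0 and p < 0.05) else "mismatched SP",
          f"(r = {r:.2f}, p = {p:.3f})")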
Fig. 4. Text links with all available visually selective cells.
(A) Histogram characterizing how the population of neurons links action representations across the three visual formats (F,L0,L1). “=” indicates matched SP, and “&” denotes mismatched SP. Exclusion of a format indicates no selectivity. Three schematic SPs (right, color-coded) across the visual formats are shown to illustrate how the SPs compare across formats. (B) Schematic models illustrating different architectures of how text relates to three visual representations of the corresponding action. Each oval contains the population of neurons that are selective for a particular visual format. Overlap between ovals indicates matching selectivity across formats. The actual patterns of overlap between ovals may be more complicated (e.g., more overlap between two of the three ovals) but are simplified here for schematic purposes. Yellow neurons are selective for text with matching selectivity, while gray neurons are not. Underneath each schematic is a prediction for how the distribution in (A) will change when the model selection analysis filters the full distribution of (A) for units with matching text selectivity. (C) Similar to (A), but limited to the subset of visually selective units with a matched SP to text [blue subpopulation in (D)]. In cases where the units have mismatched visual SPs (e.g., L0 & F), text can have a matched SP with one or several of the visual formats. Colored segments of the histogram indicate which format has a matched SP with text (see x-axis labels for color code). (D) Percentage of visually selective units with a matched SP to text. (E) Percentage of text-selective units with a matched SP to at least one visual format, mismatched SP to visual formats, or without visual format selectivity.
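The model selection step referenced in (C) can be sketched as a BIC comparison between a shared-tuning model (one mean per action, common to both formats) and a format-specific model (one mean per action-format pair). The Gaussian likelihood, parameter counts, and names below are assumptions; the paper combines BIC with cross-validated R2 (see Fig. 6B).

    # Sketch: BIC model selection for one unit across two formats.
    # Lower BIC for the shared model -> "matched SP". Synthetic data.
    import numpy as np

    rng = np.random.default_rng(3)
    n_actions, n_trials = 5, 10
    mu = rng.normal(size=n_actions)
    # rates[f, a, t]: one unit's rate on trial t of action a in format f
    rates = mu[None, :, None] + rng.normal(scale=0.5, size=(2, n_actions, n_trials))

    def bic(resid, k, n):
        # Gaussian BIC: n * log(RSS / n) + k * log(n)
        return n * np.log((resid ** 2).sum() / n) + k * np.log(n)

    n = rates.size
    shared = rates - rates.mean(axis=(0, 2), keepdims=True)    # 5 action means
    separate = rates - rates.mean(axis=2, keepdims=True)       # 10 action-format means
    b_sh, b_se = bic(shared, 5, n), bic(separate, 10, n)
    print("matched SP" if b_sh < b_se else "mismatched SP",
          f"(BIC shared {b_sh:.1f} vs separate {b_se:.1f})")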
Fig. 5. Temporal features support a semantic link between verbs and observed actions.
(A and B) Cross-modal match between text and visual formats occurs at low latency. (A) Dynamic cross-validated cross-correlation matrices demonstrating how the neural population response during stimulus presentation at one slice of time compares to all other slices of time, both within and across formats. Format comparisons as shown in x- and y-axis labels. Correlation magnitude as indicated by the color bar. Inset: The diagonal elements of the within- and across-format matrices were averaged into three logical groupings [(i) within-format visual, (ii) within-format text, and (iii) across-format text to visual] and normalized to a peak amplitude of 1 for comparison purposes. The temporal profile of the averaged correlations (means ± SE across sessions) is plotted to emphasize the similarity of onset timing for the within-format text and across-format text to visual population correlations. (B) Similar to (A) but for participant EGS. To compensate for the smaller number of sessions, we grouped correlation matrices for cross-modal comparisons. (C and D) Stable relationship between text and observed actions through experimental sessions. (C) Cross-format correlations for subject NS shown for text and the visual formats on a per-session basis (mean with 95% bootstrapped CI). Color code shows whether the subject was passively viewing stimuli or asked to actively imagine from the lateral or frontal perspective (see inset; Vis F = visualize from frontal perspective; Vis L = visualize from the lateral 0 perspective). (D) Same as (C) except for participant EGS (only silent reading).
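The dynamic matrices in (A) amount to correlating, across units, the population response vector at each time bin of one format with every time bin of another. A minimal sketch on synthetic trial-averaged data follows; the paper additionally cross-validates these correlations across trial splits, and the names and shapes here are assumed.

    # Sketch: time-by-time cross-format population correlation, Fig. 5A style.
    import numpy as np

    rng = np.random.default_rng(4)
    n_units, n_bins = 80, 30
    # resp[u, t]: trial-averaged response of unit u at time bin t
    latent = rng.normal(size=(n_units, n_bins))
    resp_text = latent + rng.normal(size=(n_units, n_bins))
    resp_video = latent + rng.normal(size=(n_units, n_bins))

    def zscore_units(x):
        # z-score each time bin across the unit dimension
        return (x - x.mean(axis=0)) / x.std(axis=0)

    # xc[t1, t2]: correlation across units, text bin t1 vs video bin t2
    xc = zscore_units(resp_text).T @ zscore_units(resp_video) / n_units
    print("matrix", xc.shape, "- mean diagonal r:", np.diag(xc).mean().round(2))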
Fig. 6. The effect of explicit instruction on cross-format invariance.
During the initial seven sessions, subject NS silently read action verbs. In the six subsequent runs, she explicitly visualized the frontal (F, three runs) or lateral standing (L0, three runs) perspective in response to the action verb. (A) The percentage of units with a significant effect of action or action-format interaction for the format-by-action ANOVA applied to the triplet of formats pertinent to task instruction (T,F,L0). “Sig” = significant at P < 0.05 FDR-corrected (“NS” otherwise). Results are split by task instruction. Total number of sorted units shown in title. (B) Results of the combined (BIC + cvR2) model selection analyses for the same triplet of formats, split by task instruction. T=L0 units were twice as prevalent as T=F units during passive viewing as well as in the two instructed conditions. (C) Mean dynamic cross-correlation between the visual formats and text, split by passive viewing and active imagery in participant NS. Blue lines indicate video offset. (D) Pixel coordinates demonstrating a significant difference between passive viewing and active imagery (significant pixels in white, paired t test, P < 0.05). Blue lines indicate video offset. (E) Cross-correlation value between text and the visual formats for the set of significant pixels shown in (D) as a function of session number. The blue line shows the split between passive and active imagery sessions.
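The pixelwise contrast in (D) reduces to a paired t-test across matched passive/imagery session pairs at each entry of the text-to-visual correlation matrix. A hypothetical sketch on synthetic matrices; the pairing scheme, matrix sizes, and uncorrected threshold are assumptions.

    # Sketch: pixelwise paired t-test between passive and imagery sessions,
    # in the style of Fig. 6D. Synthetic correlation matrices stand in for data.
    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(5)
    n_pairs, n_bins = 6, 30
    # one text-to-visual correlation matrix per session, per condition
    passive = rng.normal(0.10, 0.05, size=(n_pairs, n_bins, n_bins))
    imagery = passive + rng.normal(0.05, 0.05, size=(n_pairs, n_bins, n_bins))

    t, p = ttest_rel(imagery, passive, axis=0)    # pixelwise over session pairs
    print(f"{(p < 0.05).mean():.1%} of pixels significant (uncorrected)")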
