Cell. 2021 Oct 14;184(21):5482-5496.e28.
doi: 10.1016/j.cell.2021.09.014. Epub 2021 Sep 30.

Atlas of clinically distinct cell states and ecosystems across human solid tumors

Bogdan A Luca et al. Cell.

Abstract

Determining how cells vary with their local signaling environment and organize into distinct cellular communities is critical for understanding processes as diverse as development, aging, and cancer. Here we introduce EcoTyper, a machine learning framework for large-scale identification and validation of cell states and multicellular communities from bulk, single-cell, and spatially resolved gene expression data. When applied to 12 major cell lineages across 16 types of human carcinoma, EcoTyper identified 69 transcriptionally defined cell states. Most states were specific to neoplastic tissue, ubiquitous across tumor types, and significantly prognostic. By analyzing cell-state co-occurrence patterns, we discovered ten clinically distinct multicellular communities with unexpectedly strong conservation, including three with myeloid and stromal elements linked to adverse survival, one enriched in normal tissue, and two associated with early cancer development. This study elucidates fundamental units of cellular organization in human carcinoma and provides a framework for large-scale profiling of cellular ecosystems in any tissue.

Keywords: CIBERSORTx; EcoTyper; cancer genomics; cell states; cellular communities; ecosystems; ecotypes; expression deconvolution; tumor immunology; tumor microenvironment.

Conflict of interest statement

Declaration of interests: M.D. reports research funding from Varian Medical Systems and Illumina; ownership interest in CiberMed and Foresight Diagnostics; patent filings related to cancer biomarkers; and paid consultancy from Roche, AstraZeneca, RefleXion and BioNTech. A.A.A. is a member of the Cell advisory board and reports research support from Bristol Myers Squibb; ownership interest in CiberMed, FortySeven Inc., and Foresight Diagnostics; patent filings related to cancer biomarkers; and paid consultancy from Genentech, Roche, Chugai, Gilead, and Celgene. A.M.N. reports ownership interest in CiberMed and patent filings related to cancer biomarkers. B.A.L., C.B.S., A.A.A., A.J.G., and A.M.N. have filed patent application PCT/US2020/059196. The remaining authors declare no potential conflicts of interest.

Figures

Figure 1. High-Throughput Characterization of Tumor Cell States and Ecosystems.
Schematic depicting the EcoTyper framework and its application to 16 types of human carcinoma (TCGA discovery cohort, Table S1). In this study, EcoTyper was applied within a multi-phase workflow, consisting of purification of cell type-specific gene expression profiles from bulk tissue transcriptomic data, identification of transcriptional states for each purified cell type, and determination of co-occurrence patterns between cell states that define multicellular communities, termed ecotypes. Once cell states and ecotypes are defined, they can be queried in external expression datasets, including bulk transcriptomes, scRNA-seq data, and spatial transcriptomic arrays, allowing validation and integrative characterization. See also Figure S1.
Figure 2. The Cell State Landscape Across 16 Carcinomas.
(A) Heat maps showing digitally-purified expression profiles of 12 cell types decoded from 16 bulk epithelial tumor types, with genes as rows and tumor/adjacent normal tissue samples as columns. Heat maps are organized by the most abundant cell state per sample. (B) UMAP projection of cell state heterogeneity across tumor and adjacent normal specimens in the discovery cohort. Points are colored by the most abundant cell state per sample, with states colored identically to panel A (gray denotes S9). (C) Expression of cell state-specific marker genes (rows) across seven scRNA-seq datasets (columns) spanning four types of carcinoma (Tables S1 and S4). Asterisks indicate cell states omitted from further analysis that were not distinguishable from potential doublets in scRNA-seq data. (D) Enrichment of EcoTyper states in normal adjacent tissue, comparing the discovery cohort to an scRNA-seq tumor atlas (Lambrechts et al., 2018). In both cases, tumor and adjacent normal tissues from NSCLC were analyzed. Concordance was determined as the fraction of states with significant normal enrichment in both datasets, with significance determined by Fisher’s exact test. (E) Top: H&E staining of colorectal cancer (CRC) specimens with high (arrows, left) vs. low (right) levels of foam cell macrophages. Bottom: Analysis of monocyte/macrophage marker genes (EcoTyper) in bulk RNA-seq profiles of laser micro-dissected stroma from CRC 393 and 380 (above) as well as another foam cell-depleted CRC tumor (CRC 406). Enrichment was calculated by pre-ranked gene set enrichment analysis applied to the log2 fold change of foam cell-high (n = 3) vs. foam cell-low (n = 3) RNA-seq profiles (Table S1). The scale bar (100μm) is identical for both images. See also Figures S2 and S3.
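The Figure 2D caption reports significance by Fisher's exact test. As a minimal illustration of that test on a generic 2x2 contingency table, the sketch below computes a one-sided p-value from the hypergeometric distribution. The function name, the choice of a one-sided alternative, and the layout of the table are assumptions made for illustration; the caption does not specify how the authors constructed their table.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of an overlap at least as large as the one observed.
    Hypothetical helper for illustration; not the authors' code.
    """
    row1 = a + b          # size of the first row
    col1 = a + c          # size of the first column
    col2 = b + d          # size of the second column
    n = row1 + c + d      # total count
    k_max = min(row1, col1)
    total = comb(n, row1)
    # Sum the probabilities of all tables with top-left cell >= a.
    tail = sum(comb(col1, k) * comb(col2, row1 - k) for k in range(a, k_max + 1))
    return tail / total


# Hypothetical counts: 3 states enriched in both datasets, 3 enriched in neither.
p_value = fisher_exact_greater(3, 0, 0, 3)
print(p_value)
```
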
Figure 3. Cell State-Specific Survival Associations Across 15,008 Tumors.
(A) Survival associations of 69 cell states in 5,946 tumors (discovery cohort), stratified by cell type and aggregated across malignancies. Marker genes for the most significant adverse and favorable states are indicated. See also Figure S4A and Table S5. (B) State-specific survival associations in the discovery cohort (TCGA) and an independent cohort of 9,062 epithelial tumor transcriptomes (PRECOG). Concordance and statistical significance were assessed by Pearson correlation (see also Figure S4D). (C) Kaplan-Meier plots showing differences in overall survival between patients with high levels of M1-like macrophages (state 3) or M2 foamy-like macrophages (state 6) in three carcinomas. TCGA patients were stratified by the median difference between M1 and M2 foamy-like macrophages; thresholds determined in TCGA were applied to PRECOG. Statistical significance was calculated by a two-sided log-rank test. HR, hazard ratio. 95% HR confidence intervals are shown in brackets. See also Figure S4.
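The stratification described for Figure 3C (split patients by the median difference between two macrophage state abundances in one cohort, then reuse that threshold in another cohort) can be sketched as follows. This is a simplified illustration: the function names, group labels, example values, and the handling of ties at the threshold are all assumptions, not details given in the caption.

```python
from statistics import median


def median_difference_threshold(state_a, state_b):
    """Median of per-patient differences (state_a - state_b) in a discovery cohort."""
    return median(a - b for a, b in zip(state_a, state_b))


def stratify(state_a, state_b, threshold):
    """Assign each patient to a group using a previously fixed threshold.

    Patients exactly at the threshold fall in the second group (an assumption).
    """
    return [
        "A-high" if (a - b) > threshold else "B-high"
        for a, b in zip(state_a, state_b)
    ]


# Hypothetical abundances for four discovery-cohort patients.
discovery_a = [0.5, 0.2, 0.4, 0.1]
discovery_b = [0.1, 0.3, 0.1, 0.4]
threshold = median_difference_threshold(discovery_a, discovery_b)

# The threshold learned in the discovery cohort is applied unchanged elsewhere.
validation_groups = stratify([0.6, 0.1], [0.2, 0.5], threshold)
print(threshold, validation_groups)
```

Fixing the threshold in the discovery cohort before touching the validation cohort keeps the second cohort's grouping independent of its own outcome data.
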
Figure 4. Large-Scale Reconstruction of Multicellular Communities In Vivo.
(A) Cell state abundance profiles across 16 carcinomas, organized into 10 carcinoma ecotypes (CEs). Only cell states and tumor samples assigned to CEs are shown (related to Figure S5A,B). Tumor samples are ordered by the most abundant CE class per specimen. (B) CE composition depicted as network diagrams. The width of each edge represents the Jaccard index across tumor samples (STAR Methods). (C–E) Validation of CEs in scRNA-seq profiles. (C) Overview of the approach. (D) Heat maps portraying co-occurrence relationships among cell state abundance profiles, both in the discovery cohort (left) and in six scRNA-seq atlases spanning BRCA, CRC, HNSCC, and NSCLC (right; Table S1). Only tumor types matching those analyzed by scRNA-seq are shown. Cell state fractions were analyzed to assess co-occurrence relationships. All states are grouped into predefined CEs (panel B) and only states assigned to CEs are shown (n = 58). Significantly recoverable CEs are indicated above the heat maps (*P < 0.05). ‘Co-occurrence index’ is a measure of covariance that accounts for noise (STAR Methods). (E) Composition of selected CEs in a subset of samples profiled by scRNA-seq for which each CE is highest. Cell types within each CE are distinguished by color; cell states can be distinguished by matching each CE and cell type with the corresponding node in panel B. See also Figure S5.
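In Figure 4B, edge widths represent the Jaccard index across tumor samples. The sketch below shows the standard Jaccard index between two sets of sample identifiers, assuming each cell state is represented by the set of samples assigned to it. That representation, the function name, and the sample identifiers are assumptions for illustration; the exact procedure is described in the paper's STAR Methods.

```python
def jaccard_index(samples_a, samples_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(samples_a), set(samples_b)
    union = a | b
    if not union:
        # Convention chosen here: two empty sets have no measurable overlap.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical sample assignments for two cell states.
state_x_samples = {"s1", "s2", "s3"}
state_y_samples = {"s2", "s3", "s4"}
edge_weight = jaccard_index(state_x_samples, state_y_samples)
print(edge_weight)
```
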
Figure 5. Carcinoma Ecotype Characteristics and Association with Immunotherapy Response.
(A) Characteristics of carcinoma ecotypes in the discovery cohort. Top: CE-specific survival associations across 16 carcinomas, colored by favorable (blue) or adverse (red) survival (color scale identical to Figure S4A). Center: CIBERSORTx-inferred proportions of 12 major cell types (averaged and scaled), grouped by the most abundant CE per tumor. Bottom: Key features of each CE. Enrichment statistics were calculated by dividing tumors into classes for which the indicated CE is highest (Table S6). (B) CE composition in normal tissues (GTEx), adjacent normal samples (discovery cohort), and primary tumor specimens (discovery cohort). Pan-carcinoma survival associations are also indicated. ns, not significant. (C) Association of 122 features with overall survival and ICI response in 571 patients with advanced melanoma (Mel.) or bladder cancer (BLCA). Results are ordered top to bottom by performance across therapies and outcome measures (Table S6). See also Figure S5.
Figure 6. Proinflammatory Communities are Spatially Distinct and Predictive of Early Lung Cancer Development.
(A) Heat maps displaying differentially expressed genes between CE9 and CE10 in seven scRNA-seq tumor datasets (Table S1), shown for cell types present in both CEs. For each dataset and cell state, mean expression is shown. (B) Immunofluorescence imaging of CE9 and CE10-specific T cell states (DAPI, CD3, and GZMB or GZMK) and monocyte/macrophage states (DAPI, CD68, and APOE or CCR2) in NSCLC specimens (T29, T36) with paired bulk RNA-seq data (Figure S6A,C and Table S6). CE9 and CE10-specific marker genes are highlighted in panel A. Images correspond to boxed regions in Figure S6A,C. Scale bar of 20μm is identical for all images. ‘Center’ refers to the tumor core; ‘Edge’ refers to the periphery of the tumor mass. (C) Left: Distribution of CE9 and CE10 in breast tumor and melanoma sections profiled by spatial transcriptomics. Tumor regions are demarcated by a dashed line. Right: Relative distance of CE9- and CE10-positive spots from tumor regions. (D) Schema for quantifying spatial colocalization of CE-specific cell states. (E) Significance of cell state colocalization within individual CEs, as measured across four tumor types (Table S1). (F) Left: Schema illustrating clinical outcomes of 33 subjects for whom premalignant lung lesions were profiled by microarray (Teixeira et al., 2019) and assessed for CE9 and CE10 by EcoTyper. Right: Relative abundance of CE9 versus CE10 in premalignant lung lesions, stratified by clinical outcome. Group comparisons in panels C and F were performed using a two-sided unpaired Wilcoxon rank sum test. See also Figure S6.

References

    1. Abdelaal T, Michielsen L, Cats D, Hoogduin D, Mei H, Reinders MJT, and Mahfouz A (2019). A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol 20, 194.
    2. Aiello NM, Maddipati R, Norgard RJ, Balli D, Li J, Yuan S, Yamazoe T, Black T, Sahmoud A, and Furth EE (2018). EMT subtype influences epithelial plasticity and mode of cell migration. Dev Cell 45, 681–695. e684.
    3. Aran D, Camarda R, Odegaard J, Paik H, Oskotsky B, Krings G, Goga A, Sirota M, and Butte AJ (2017). Comprehensive analysis of normal adjacent to tumor transcriptomes. Nature Communications 8, 1077.
    4. Armingol E, Officer A, Harismendy O, and Lewis NE (2021). Deciphering cell–cell interactions and communication from gene expression. Nature Reviews Genetics 22, 71–88.
    5. Azizi E, Carr AJ, Plitas G, Cornish AE, Konopacki C, Prabhakaran S, Nainys J, Wu K, Kiseliovas V, Setty M, et al. (2018). Single-Cell Map of Diverse Immune Phenotypes in the Breast Tumor Microenvironment. Cell 174, 1293–1308 e1236.
