NPJ Syst Biol Appl. 2024 May 28;10(1):58.
doi: 10.1038/s41540-024-00386-w.

Flexible modeling of regulatory networks improves transcription factor activity estimation

Chen Chen et al. NPJ Syst Biol Appl. 2024.

Abstract

Transcriptional regulation plays a crucial role in determining cell fate and disease, yet inferring the key regulators from gene expression data remains a significant challenge. Existing methods for estimating transcription factor (TF) activity often rely on static TF-gene interaction databases and cannot adapt to changes in regulatory mechanisms across different cell types and disease conditions. Here, we present a new algorithm, Transcriptional Inference using Gene Expression and Regulatory data (TIGER), that overcomes these limitations by flexibly modeling activation and inhibition events, up-weighting essential edges, shrinking irrelevant edges towards zero through a sparse Bayesian prior, and simultaneously estimating both TF activity levels and changes in the underlying regulatory network. When applied to yeast and cancer TF knockout datasets, TIGER outperforms comparable methods in terms of prediction accuracy. Moreover, our application of TIGER to tissue- and cell-type-specific RNA-seq data demonstrates its ability to uncover differences in regulatory mechanisms. Collectively, our findings highlight the utility of modeling context-specific regulation when inferring transcription factor activities.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Overview of TIGER.
a TIGER uses a matrix decomposition method to estimate regulatory network W and TF activity Z. The inputs of TIGER are normalized gene expression matrix X and prior TF binding network W0 curated from the literature. Prior binding information is incorporated as different prior distributions in matrix W. Matrix Z is constrained to be strictly non-negative for identifiability and biological interpretation. Variational Inference is used to estimate parameters. b Description of the validation strategy. Test 1 (T1) uses yeast TF Knockout (TFKO), TF Overexpression (TFOE), and ChIP-seq data to assess the impact of sign-flipping. Test 2 (T2) uses A375 and MCF7 cancer cell line TFKO datasets and the DoRothEA prior network to assess the importance of edge weights. Tests 3 and 4 (T3 and T4) explore TIGER’s potential in tissue-specific and cell-type-specific data analysis.
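The decomposition described in panel a can be written compactly. What follows is a sketch under assumed conventions, not the authors' exact formulation: the matrix orientation (genes by samples for X, genes by TFs for W, TFs by samples for Z), the residual term E, and the dimension symbols G, K, and N are assumptions introduced here for illustration. The last line shows why a constraint on Z helps identifiability: without it, flipping the sign of a column of W together with the matching row of Z leaves the product unchanged.

```latex
\begin{aligned}
X &= WZ + E, \\
X &\in \mathbb{R}^{G \times N}, \quad
W \in \mathbb{R}^{G \times K}, \quad
Z \in \mathbb{R}_{\ge 0}^{K \times N}, \\
WZ &= (WD)(DZ) \quad \text{for any diagonal } D \text{ with entries } \pm 1 .
\end{aligned}
```

When Z is required to be non-negative and has no all-zero row, DZ stays non-negative only for D = I, so the sign of each edge in W is pinned down by the data rather than by an arbitrary convention.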
Fig. 2. TIGER improves TF activity estimation by improving edge sign accuracy.
a Boxplot depicting the TF activity (TFA) results estimated using TIGER and other methods, with pairwise comparisons to TIGER using Wilcoxon rank sum test p-values indicated above each box. Significance levels are indicated as ns (not significant), ** (p < 0.01), and **** (p < 0.0001). b Correlation between TFA estimates from TIGER and VIPER, with a Spearman correlation coefficient of R = 0.65 (p < 0.05). The red rectangular region highlights the 9 TFs that TIGER successfully identifies but VIPER does not. The blue rectangular region marks the 1 TF that VIPER successfully identifies but TIGER does not. c Boxplot displaying the association between edge sign accuracy improvement (Y-axis) and TIGER's performance (X-axis), where "Better," "Equal," and "Worse" correspond to TFs for which TIGER outperforms, equals, or underperforms VIPER, respectively. The Kruskal-Wallis test gave p < 0.05. The figure highlights the two TFs with the maximum and minimum improvement. d Edge sign dot plot of YER169W (left) and YOR358W (right), with dot color indicating posterior edge sign, prior edge sign, and true edge sign. e Boxplot comparing TIGER and VIPER performance on the prior network with randomly flipped edge signs. The X-axis denotes the percentage of flipped edge signs, ranging from 10 to 100%, with the dashed vertical line representing the actual rate (37%) of wrong edge signs in the yeast TFKO data. f Boxplot comparison of context-specific and generic prior networks using VIPER. VIPER_A stands for VIPER using the context-specific ARACNe network. VIPER_T stands for VIPER using the TIGER-estimated network. TIGER is the reference, and pairwise Wilcoxon rank sum test p-values are indicated as ns (not significant) or **** (p < 0.0001).
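The sign-flipping perturbation in panel e can be illustrated with a short sketch. This is not the authors' code: the function name `flip_edge_signs`, the dictionary representation of the prior network, and the target gene labels are all hypothetical, chosen only to show how a fixed fraction of signed edges might be flipped at random.

```python
import random


def flip_edge_signs(prior, fraction, seed=0):
    """Return a copy of `prior` with the sign of a random subset of edges flipped.

    prior:    dict mapping (tf, gene) -> +1 (activation) or -1 (inhibition)
    fraction: share of edges to flip, between 0.0 and 1.0
    seed:     seed for the random generator, so the result is reproducible
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    rng = random.Random(seed)
    edges = sorted(prior)                      # fixed order for reproducibility
    n_flip = round(fraction * len(edges))      # number of edges to perturb
    to_flip = set(rng.sample(edges, n_flip))
    return {edge: -sign if edge in to_flip else sign
            for edge, sign in prior.items()}


# Hypothetical toy prior: 10 signed edges for one TF (target names are made up).
toy_prior = {("YER169W", f"gene_{i}"): (1 if i % 2 == 0 else -1) for i in range(10)}
perturbed = flip_edge_signs(toy_prior, fraction=0.3)
n_changed = sum(toy_prior[e] != perturbed[e] for e in toy_prior)
```

Sweeping `fraction` from 0.1 to 1.0 and re-running an activity estimator on each perturbed network would reproduce the shape of the experiment described in the caption, under the assumptions stated above.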
Fig. 3. TIGER improves TF activity estimation by re-weighting edges.
a, b Boxplots show the TFA results estimated from TIGER and other methods. Wilcoxon rank sum tests were used to compare all methods against TIGER, with p values indicated at the top of each box (ns = not significant, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001). c Boxplot displays the association between the regulon Quality Score (QS) improvement (Y-axis) and TIGER's performance (X-axis), with the red box denoting the 7 TFs that TIGER successfully identifies (rank ≤ 5) and the blue box indicating the 4 TFs that TIGER fails to identify (rank > 5). The Wilcoxon rank sum test between these two groups is labeled. d Boxplot illustrates the relationship between the prior regulon size (Y-axis) and TIGER's performance (X-axis), with the red box representing the 7 TFs that TIGER successfully identifies (rank ≤ 5) and the blue box depicting the 4 TFs that TIGER fails to identify (rank > 5). The Wilcoxon rank sum test between these two groups is labeled. e Genome browser tracks present the ChIP-seq signal of TF RELA binding to the promoter region of gene AR in MCF7 cells. f Network visualization of TIGER prior and posterior edges. g Genome browser tracks present the ChIP-seq signal of TF RELA binding to the promoter region of gene EGR1 in MCF7 cells.
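The "rank ≤ 5" criterion in panels c and d can be sketched as follows. The caption does not say how ranks are computed, so this example assumes that TFs are ranked by ascending estimated activity in the knockout sample (the knocked-out TF is expected to have low activity). The function names, TF labels, and activity values are hypothetical.

```python
def knockout_rank(activities, knocked_out_tf):
    """1-based rank of the knocked-out TF when TFs are sorted by ascending activity.

    activities: dict mapping TF name -> estimated activity in the knockout sample
    Ties are broken alphabetically by TF name so the result is deterministic.
    """
    ordered = sorted(activities, key=lambda tf: (activities[tf], tf))
    return ordered.index(knocked_out_tf) + 1


def is_identified(activities, knocked_out_tf, cutoff=5):
    """True when the knocked-out TF ranks within the top `cutoff` TFs."""
    return knockout_rank(activities, knocked_out_tf) <= cutoff


# Hypothetical activity estimates for one knockout experiment.
example = {"TF_A": 0.8, "TF_B": -1.5, "TF_C": 0.1, "TF_D": -0.2,
           "TF_E": 1.2, "TF_F": 0.4, "TF_G": 2.0}
rank_b = knockout_rank(example, "TF_B")   # lowest activity
rank_g = knockout_rank(example, "TF_G")   # highest activity
```

Under this assumed convention, a method "successfully identifies" a knockout when the perturbed TF appears among the five lowest-activity TFs.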
Fig. 4. TIGER reveals tissue- and cell-type-specific regulation.
a Volcano plot depicting differentially activated TFs in male vs. female breast tissue identified using the Wilcoxon rank sum test. Dashed lines indicate the cutoffs at adjusted p value = 0.001 and mean rank difference = 0. b TIGER network illustrating 5 active female TFs and their top 15 target genes. Activation events are denoted by red edges, while inhibition events are indicated by cyan edges. Edge width is proportional to regulation strength. c UMAP plot depicting 19 cell types identified by the weighted nearest neighbor algorithm. d Boxplots showing the RBO scores of TIGER and other methods, where a higher RBO score indicates a greater similarity to the multi-modality “gold standard.” Wilcoxon rank sum test p-values against TIGER are indicated. e Heatmap presenting the activity scores of the top 5 TIGER TFs in each cell type.
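The RBO (rank-biased overlap) score in panel d compares two ranked lists while weighting agreement near the top more heavily. The sketch below computes a truncated form of RBO on the shared prefix of two rankings. It is an illustration only: the caption does not state which RBO variant or persistence parameter was used, so the function name `rbo_truncated`, the default `p=0.9`, and the TF labels are assumptions.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap of two rankings (no duplicate items assumed).

    At each depth d, the agreement is the size of the overlap between the top-d
    items of both lists divided by d. Agreements are combined with geometric
    weights (1 - p) * p**(d - 1), so shallow depths count more. Because the sum
    stops at the shorter list's length, identical lists of length k score
    1 - p**k rather than exactly 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        score += p ** (d - 1) * agreement
    return (1 - p) * score


# Hypothetical TF rankings from two methods.
same = rbo_truncated(["TF_A", "TF_B", "TF_C"], ["TF_A", "TF_B", "TF_C"])
disjoint = rbo_truncated(["TF_A", "TF_B"], ["TF_X", "TF_Y"])
swapped = rbo_truncated(["TF_A", "TF_B"], ["TF_B", "TF_A"], p=0.5)
```

A higher score means the two rankings agree more, especially among their top-ranked TFs, which matches the interpretation given in the caption.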

