Proc Natl Acad Sci U S A. 2018 Aug 7;115(32):E7550-E7558.
doi: 10.1073/pnas.1804015115. Epub 2018 Jul 23.

Inferring the shape of global epistasis

Jakub Otwinowski et al. Proc Natl Acad Sci U S A.

Abstract

Genotype-phenotype relationships are notoriously complicated. Idiosyncratic interactions between specific combinations of mutations occur and are difficult to predict. Yet it is increasingly clear that many interactions can be understood in terms of global epistasis. That is, mutations may act additively on some underlying, unobserved trait, and this trait is then transformed via a nonlinear function to the observed phenotype as a result of subsequent biophysical and cellular processes. Here we infer the shape of such global epistasis in three proteins, based on published high-throughput mutagenesis data. To do so, we develop a maximum-likelihood inference procedure using a flexible family of monotonic nonlinear functions spanned by an I-spline basis. Our analysis uncovers dramatic nonlinearities in all three proteins; in some proteins a model with global epistasis accounts for virtually all of the measured variation, whereas in others we find substantial local epistasis as well. This method allows us to test hypotheses about the form of global epistasis and to distinguish variance components attributable to global epistasis, local epistasis, and measurement error.
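
The global-epistasis idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the effect values, the function names (`latent_trait`, `monotone_g`, `observed_scale_epistasis`), and the knot and weight settings are all hypothetical, and the monotone function is built from non-negatively weighted logistic ramps as a simple stand-in for an I-spline basis. The sketch only shows how mutations that are exactly additive on a latent trait can still appear epistatic on the observed phenotype.

```python
import math

# Hypothetical additive effects on the latent trait phi, keyed by
# (position, amino acid). Values are invented for illustration only.
ADDITIVE_EFFECTS = {
    (0, "A"): -1.5,
    (1, "G"): -2.0,
    (2, "W"): 0.5,
}


def latent_trait(mutations, effects=ADDITIVE_EFFECTS, wild_type_trait=0.0):
    """Sum the additive effects of the given mutations (wild type sits at 0)."""
    return wild_type_trait + sum(effects[m] for m in mutations)


def monotone_g(phi, knots=(-3.0, -1.0, 1.0), weights=(0.6, 0.3, 0.1), steepness=4.0):
    """Monotone nondecreasing map from latent trait to observed phenotype.

    Stand-in for an I-spline expansion (an assumption of this sketch): a sum
    of increasing logistic ramps with non-negative weights is itself monotone.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep g monotone")
    return sum(
        w / (1.0 + math.exp(-steepness * (phi - k)))
        for w, k in zip(weights, knots)
    )


def phenotype(mutations):
    """Observed phenotype: the nonlinear function applied to the additive trait."""
    return monotone_g(latent_trait(mutations))


def observed_scale_epistasis(mut_a, mut_b):
    """Deviation of the double mutant from additivity on the observed scale."""
    wild_type = phenotype([])
    return (
        phenotype([mut_a, mut_b])
        - phenotype([mut_a])
        - phenotype([mut_b])
        + wild_type
    )


if __name__ == "__main__":
    a, b = (0, "A"), (1, "G")
    print("latent trait of double mutant:", latent_trait([a, b]))
    print("observed-scale epistasis:", round(observed_scale_epistasis(a, b), 3))
```

In this toy setting the latent trait of the double mutant is exactly the sum of the two single-mutant effects, yet the observed-scale deviation is nonzero because the two mutations together push the trait across the steep part of the curve.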

Keywords: deep mutational scanning; evolution; fitness landscape; genotype–phenotype map; protein.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
(A) Global epistasis as a nonlinear function of an intermediate additive trait (Eq. 1) for binding of protein GB1 to an immunoglobulin fragment (IgGFC) (37). The solid black line indicates the nonlinear function g(ϕ); its 95% confidence interval, drawn in light gray, is too small to be visible. Based on the inferred magnitude of nonglobal epistasis, we estimate that the true phenotypes for 95% of genotypes lie between the two dashed lines (2σ_HOC = 0.7). Colors show a histogram of observed binding. (B and C) The distribution of binding values is bimodal (B; measured, dashed line; inferred, solid line), while the distribution of inferred additive phenotypes is unimodal (C). (D) The impact of each possible amino acid substitution on the underlying additive trait. Open squares mark the wild-type amino acid. Additive effects and the underlying trait are scaled so that the wild type has trait value 0 and the average mutation changes the trait by distance 1 (Materials and Methods).
Fig. 2.
The shape of global epistasis in a DMS study of GFP (31) shows a sharp threshold in the additive trait. Below the threshold fluorescence is low; above it, fluorescence is high and the slope of the nonlinear mapping is positive (likelihood-ratio test, P < 0.003). Data consist of 51,715 protein variants, with 3.7 mutations each on average. HOC epistasis has magnitude σ_HOC = 0.073, and 10-fold cross-validated r² = 0.931 ± 0.003. The black solid line indicates g(ϕ); its 95% bootstrap CI, drawn in light gray, is too small to be visible. Our analysis suggests that the true values for 95% of genotypes lie between the dashed lines. The underlying additive trait is scaled so that the wild type has trait value 0 and the mean absolute effect of a mutation on the trait is 1 (Materials and Methods). Additive effects are shown in SI Appendix, Fig. S4.
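
The scaling convention mentioned in this caption can be sketched in Python. The Materials and Methods are not reproduced here, so this is an assumed reading of the convention rather than the authors' procedure: effects are taken relative to the wild type (so the wild type sits at 0), and all effects are divided by their mean absolute value. The function names and the mutation labels are hypothetical.

```python
import math


def rescale_effects(effects):
    """Rescale additive effects so that their mean absolute value is 1.

    `effects` maps a (hypothetical) mutation label to its additive effect
    relative to the wild type, so the wild type is already at trait value 0.
    Returns the rescaled effects and the scale factor that was divided out.
    """
    scale = sum(abs(v) for v in effects.values()) / len(effects)
    if scale == 0:
        raise ValueError("all effects are zero; the scale is undefined")
    return {k: v / scale for k, v in effects.items()}, scale


def rescale_g(g, scale):
    """Return g' such that g'(phi / scale) equals g(phi).

    Rescaling the trait axis must be compensated in the nonlinear function,
    otherwise the predicted phenotypes would change.
    """
    return lambda phi_scaled: g(phi_scaled * scale)


if __name__ == "__main__":
    raw = {"A1G": -0.2, "L2P": -0.6, "K3R": 0.1}  # invented values
    scaled, s = rescale_effects(raw)
    g_scaled = rescale_g(math.tanh, s)  # tanh is only a placeholder for g
    print(scaled, s)
    print(g_scaled(scaled["L2P"]), math.tanh(raw["L2P"]))
```

The rescaling leaves every predicted phenotype unchanged; it only fixes the otherwise arbitrary units of the latent trait so that effects are comparable across datasets.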
Fig. 3.
Gene-by-environment interactions in a DMS study of β-lactamase (45). Data consist of 4,997 single mutants, with growth rate measured at different concentrations of the antibiotic ampicillin in two replicates (45). Log relative growth rate is plotted against the inferred protein activity for each substitution. The mapping from protein activity to log relative growth rate at each antibiotic concentration is given by the colored curves, each surrounded by a gray region showing its 95% CI. The two biological replicates are shown as two slightly different hues for each antibiotic concentration. Cross-validated r² = 0.923 ± 0.002.

References

    1. Kauffman S, Levin S. Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol. 1987;128:11–45.
    2. Kauffman SA. The Origins of Order: Self Organization and Selection in Evolution. Oxford Univ Press; New York: 1993.
    3. Huynen MA, Stadler PF, Fontana W. Smoothness within ruggedness: The role of neutrality in adaptation. Proc Natl Acad Sci USA. 1996;93:397–401.
    4. Fontana W. Modelling ‘evo-devo’ with RNA. Bioessays. 2002;24:1164–1177.
    5. Fowler DM, Fields S. Deep mutational scanning: A new style of protein science. Nat Methods. 2014;11:801–807.
