Mol Biol Evol. 2023 May 2;40(5):msad100.
doi: 10.1093/molbev/msad100.

Developing an Evolutionary Baseline Model for Humans: Jointly Inferring Purifying Selection with Population History

Parul Johri et al. Mol Biol Evol. 2023.

Abstract

Building evolutionarily appropriate baseline models for natural populations is important for answering fundamental questions in population genetics, including quantifying the relative contributions of adaptive versus nonadaptive processes, and is also essential for identifying candidate loci experiencing relatively rare and episodic forms of selection (e.g., positive or balancing selection). Here, a baseline model was developed for a human population of West African ancestry, the Yoruba, comprising processes constantly operating on the genome (i.e., purifying and background selection, population size changes, recombination rate heterogeneity, and gene conversion). Specifically, to jointly infer selective effects and demography, an approximate Bayesian approach was employed that uses the decay of background selection effects around functional elements while taking genomic architecture into account. This approach inferred a recent 6-fold population growth together with a distribution of fitness effects that is skewed towards effectively neutral mutations. Importantly, these results further suggest that, although strong and/or frequent recurrent positive selection is inconsistent with the observed data, weak to moderate positive selection is consistent with the data but unidentifiable if rare.

Keywords: approximate Bayesian computation; background selection; demographic inference; distribution of fitness effects; human population genomics.
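
The joint inference described in the abstract can be illustrated with a minimal rejection-sampling sketch of approximate Bayesian computation. This is not the authors' pipeline: `simulate_summaries` is a hypothetical toy simulator (not a population-genetic model), the two parameters `growth` and `neutral_frac` and their uniform priors are illustrative assumptions, and `tolerance` is assumed to mean the fraction of simulations retained.

```python
import math
import random


def simulate_summaries(growth, neutral_frac, rng, n_loci=50):
    """Hypothetical toy simulator (NOT a population-genetic model).

    Returns the means of two made-up summary statistics whose
    expectations depend on the two parameters.
    """
    # Toy "diversity": rises with growth and with the neutral fraction.
    div = [rng.gauss(growth * neutral_frac, 0.5) for _ in range(n_loci)]
    # Toy "skew": depends on growth only.
    skew = [rng.gauss(-math.log(growth), 0.5) for _ in range(n_loci)]
    return (sum(div) / n_loci, sum(skew) / n_loci)


def abc_rejection(observed, n_draws=2000, tolerance=0.08, seed=1):
    """Keep the `tolerance` fraction of prior draws whose simulated
    summaries lie closest (Euclidean distance) to the observed ones."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Assumed uniform priors; the ranges are illustrative only.
        growth = rng.uniform(1.0, 10.0)
        neutral_frac = rng.uniform(0.0, 1.0)
        sim = simulate_summaries(growth, neutral_frac, rng)
        draws.append((math.dist(sim, observed), growth, neutral_frac))
    draws.sort(key=lambda d: d[0])
    n_keep = max(1, round(tolerance * n_draws))
    accepted = draws[:n_keep]
    # Posterior means of the accepted parameter values.
    post_growth = sum(d[1] for d in accepted) / n_keep
    post_neutral = sum(d[2] for d in accepted) / n_keep
    return post_growth, post_neutral, n_keep


if __name__ == "__main__":
    # Pseudo-observed data generated from known toy parameters.
    obs = simulate_summaries(6.0, 0.5, random.Random(42))
    print(abc_rejection(obs))
```

Plain rejection is used here only because it is the simplest ABC variant; regression-adjusted or other ABC schemes would replace the final averaging step.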


Figures

Fig. 1.
(a) Model and parameters inferred by the ABC method. The left panel shows the DFE while the right panel shows the single, recent size change demographic model fit to the data. All inferred parameters are indicated in blue font. (b) Schematic of the expected number of bases (π50) to reach a 50% recovery of nucleotide diversity due to BGS around single exons. The three windows in which statistics were calculated are shown in green font. (c) Accuracy of joint inference of demography and the DFE. Cross-validation was performed on 100 randomly selected parameter combinations for all size parameters with tolerance = 0.08. The black line represents the y = x line. All statistics were used for inference and were calculated after removing sites inaccessible to next-generation sequencing in both the simulated and empirical data.
Fig. 2.
Inference of (a) recent population history and (b) the DFE of deleterious mutations in the Yoruba population. Inferences from the current study (using the 5′ intergenic/intronic regions) are shown in grey/black while those from previous studies are shown in the colored bars/lines. Note that the current population size predicted by Terhorst et al. (2017) is 356,990 and is not visible due to truncation of the y-axis. Also note that 2N_e s for the purpose of the current study corresponds to 2N_anc s, as the scaling was performed with respect to the ancestral population size. 2 hap: refers to inference performed using a single diploid individual; 4 hap: refers to inference performed using two diploid individuals; EGP: Environmental Genome Project (https://egp.gs.washington.edu/); PGA: Programs for Genomic Applications (https://pga.gs.washington.edu/).
Fig. 3.
Fit of the best model inferred by our method to the empirical data, as shown by the distribution of (a) nucleotide diversity, (b) Tajima's D, (c) r², and (d) divergence per site, across the 465 exons, for each of the three windows: functional, linked, and less linked intergenic/intronic regions. The simulated best model (with 10 replicates) is shown in red, while the observed empirical distributions of the same statistics in the YRI population are shown in the white distributions.
Fig. 4.
Fit of the estimated best model to the empirical data in the presence of positive selection. (a–c) Distribution of Tajima's D, r², and divergence per site across the 465 exons (only in the “functional” windows) for the best-fitting model (in red), the best-fitting model with positive selection (in blue), and their overlap (in purple). The distribution of the empirical data is shown in the white distributions. Examples of varying extents of positive selection are shown: (a) infrequent (f_pos = 0.1%) and weak (2N_e s_b = 10), (b) moderately frequent (f_pos = 1%) and moderately strong (2N_e s_b = 100), and (c) common (f_pos = 5%) and strong (2N_e s_b = 1000). (d) A grid depicting the fit of varying extents of positive selection to the data with a check mark indicating that the addition of positive selection does not worsen the fit of the model to the data, and the number of “×” marks indicating the severity of the misfit to the calculated statistics generated by the addition of positive selection.
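
Two of the summary statistics named in the captions above, nucleotide diversity and Tajima's D, can be computed per window with the standard textbook formulas. The sketch below assumes haplotypes are given as equal-length strings of `0` (ancestral) and `1` (derived) alleles; this input format and the function names are illustrative assumptions, not the study's code, and the diversity returned is a window total rather than a per-site value.

```python
import math


def site_counts(haplotypes):
    """Number of derived ('1') alleles at each site."""
    length = len(haplotypes[0])
    return [sum(h[j] == "1" for h in haplotypes) for j in range(length)]


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences, summed over the window."""
    n = len(haplotypes)
    pairs = n * (n - 1) / 2
    return sum(k * (n - k) for k in site_counts(haplotypes)) / pairs


def tajimas_d(haplotypes):
    """Tajima's D (Tajima 1989); None if the window has no segregating sites."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("this sketch requires at least 4 haplotypes")
    s = sum(1 for k in site_counts(haplotypes) if 0 < k < n)
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between pairwise diversity and Watterson's estimator,
    # scaled by its estimated standard deviation.
    pi = nucleotide_diversity(haplotypes)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

A window containing only singletons gives a negative D, whereas an excess of intermediate-frequency variants gives a positive D.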


