Biol Methods Protoc. 2021 Sep 16;6(1):bpab017.
doi: 10.1093/biomethods/bpab017. eCollection 2021.

OptM: estimating the optimal number of migration edges on population trees using Treemix

Robert R Fitak. Biol Methods Protoc.

Abstract

The software Treemix is widely used to estimate the number of migration events, or edges (m), on population trees from genome-wide allele frequency data. However, the appropriate number of edges to include remains unclear. Here, I show that an optimal value of m can be inferred from the second-order rate of change in likelihood (Δm) across incremental values of m. The approach is repurposed from its original use to estimate the number of population clusters in the software Structure (ΔK). Using simulated populations, I show that Δm performs as well as current recommendations for Treemix. A demonstration on an empirical dataset from domestic dogs indicates that this method may be preferable in large, complex population histories and can prioritize migration events for subsequent investigation. The method has been implemented in a freely available R package called "OptM" and as a web application (https://rfitak.shinyapps.io/OptM/) that interfaces directly with the output files of Treemix.

Keywords: SNPs; likelihood; population genomics; structure.
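
The abstract does not give the formula for Δm. As a minimal sketch, assuming Δm takes the same form as the ΔK statistic used with Structure (the absolute second-order difference of the mean likelihood across iterations, divided by the standard deviation of the likelihood at m), it could be computed as shown here. The function name `delta_m` and the toy likelihood values are hypothetical; they are not taken from the OptM package or from Treemix output.

```python
from statistics import mean, stdev


def delta_m(likelihoods):
    """Return {m: delta_m} for every m that has both neighbours m-1 and m+1.

    `likelihoods` maps each number of migration edges m to a list of
    composite likelihoods, one per independent iteration (assumed layout).
    """
    avg = {m: mean(vals) for m, vals in likelihoods.items()}
    sd = {m: stdev(vals) for m, vals in likelihoods.items()}

    result = {}
    for m in sorted(likelihoods):
        # The second-order difference needs L(m-1) and L(m+1).
        if m - 1 not in avg or m + 1 not in avg:
            continue
        # The ratio is undefined when the iterations do not vary at m.
        if sd[m] == 0:
            continue
        second_diff = abs(avg[m + 1] - 2 * avg[m] + avg[m - 1])
        result[m] = second_diff / sd[m]
    return result


# Toy input: three iterations for each m = 0..4 (made-up numbers).
toy = {
    0: [100.0, 101.0, 99.0],
    1: [200.0, 202.0, 198.0],
    2: [400.0, 401.0, 399.0],
    3: [410.0, 412.0, 408.0],
    4: [415.0, 416.0, 414.0],
}

scores = delta_m(toy)
best_m = max(scores, key=scores.get)
print(scores, best_m)
```

Under this assumed definition, Δm is undefined at the smallest and largest m tested, and also wherever all iterations return the same likelihood, so the sketch skips those values rather than dividing by zero.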

Figures

Figure 1:
The output produced by OptM for the simulated dataset with m = 3 migration edges. (a) The mean and standard deviation (SD) across 10 iterations for the composite likelihood L(m) (left axis, black circles) and proportion of variance explained (right axis, red “x”s). The 99.8% threshold (horizontal dotted line) is that recommended by Pickrell and Pritchard [2]. (b) The second-order rate of change (Δm) across values of m.
Figure 2:
The output produced by OptM for an empirical dataset of domestic dogs. A total of 10 iterations were run for each possible number of migration edges, m = 1–40. (a) The mean and standard deviation (SD) for the composite likelihood L(m) (left axis, black circles) and proportion of variance explained (right axis, red “x”s). The 99.8% threshold is that recommended by Pickrell and Pritchard [2], but it is not visible here because the threshold is still not met at m = 40 edges. OptM produces a warning to notify the user that this threshold is not visible. (b) The second-order rate of change (Δm) across values of m. The arrow indicates the peak in Δm at m = 5 edges.
Figure 3:
The tree structure of the graph inferred by Treemix for the 34 dog breeds and gray wolf populations. Five migration edges were allowed as inferred by OptM. The migration edges are colored according to their weight (ŵ). The scale bar indicates ten times the average standard error of the values in the covariance matrix.

References

    1. Ellegren H. Genome sequencing and population genomics in non-model organisms. Trends Ecol Evol 2014;29:51–63.
    2. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet 2012;8:e1002967.
    3. Teixeira MM, Barker BM. Use of population genetics to assess the ecology, evolution, and population structure of Coccidioides. Emerg Infect Dis 2016;22:1022–30.
    4. von Wettberg EJB, Chang PL, Basdemir F. et al. Ecology and genomics of an important crop wild relative as a prelude to agricultural innovation. Nat Commun 2018;9:649.
    5. Card DC, Schield DR, Adams RH. et al. Phylogeographic and population genetic analyses reveal multiple species of Boa and independent origins of insular dwarfism. Mol Phylogenet Evol 2016;102:104–16.