PLoS Genet. 2010 Feb 12;6(2):e1000843.
doi: 10.1371/journal.pgen.1000843.

The scale of population structure in Arabidopsis thaliana

Alexander Platt et al. PLoS Genet. 2010.

Abstract

The population structure of an organism reflects its evolutionary history and influences its evolutionary trajectory. It constrains the combination of genetic diversity and reveals patterns of past gene flow. Understanding it is a prerequisite for detecting genomic regions under selection, predicting the effect of population disturbances, or modeling gene flow. This paper examines the detailed global population structure of Arabidopsis thaliana. Using a set of 5,707 plants collected from around the globe and genotyped at 149 SNPs, we show that while A. thaliana as a species self-fertilizes 97% of the time, there is considerable variation among local groups. This level of outcrossing greatly limits observed heterozygosity but is sufficient to generate considerable local haplotypic diversity. We also find that in its native Eurasian range A. thaliana exhibits continuous isolation by distance at every geographic scale without natural breaks corresponding to classical notions of populations. By contrast, in North America, where it exists as an exotic species, A. thaliana exhibits little or no population structure at a continental scale but local isolation by distance that extends hundreds of km. This suggests a pattern for the development of isolation by distance that can establish itself shortly after an organism fills a new habitat range. It also raises questions about the general applicability of many standard population genetics models. Any model based on discrete clusters of interchangeable individuals will be an uneasy fit to organisms like A. thaliana which exhibit continuous isolation by distance on many scales.
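To illustrate why a 97% selfing rate "greatly limits observed heterozygosity", the sketch below uses the standard equilibrium result for a partially selfing population, F = s / (2 - s), and the corresponding expected heterozygosity at a biallelic locus, H = 2pq(1 - F). This is a textbook relationship offered as an assumption for illustration; the abstract does not say how the authors estimated selfing rates, and the function names are hypothetical.

```python
def inbreeding_coefficient(selfing_rate):
    """Equilibrium inbreeding coefficient F = s / (2 - s) under partial selfing."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must lie in [0, 1]")
    return selfing_rate / (2.0 - selfing_rate)


def selfing_rate_from_f(f):
    """Inverse of the relation above: s = 2F / (1 + F)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    return 2.0 * f / (1.0 + f)


def expected_heterozygosity(p, selfing_rate):
    """Expected heterozygote frequency 2pq(1 - F) at a biallelic locus.

    p is the frequency of one allele (an illustrative input, not a value
    taken from the paper).
    """
    f = inbreeding_coefficient(selfing_rate)
    return 2.0 * p * (1.0 - p) * (1.0 - f)


if __name__ == "__main__":
    s = 0.97  # species-wide selfing rate quoted in the abstract
    print(f"F = {inbreeding_coefficient(s):.4f}")
    print(f"H at p = 0.5: {expected_heterozygosity(0.5, s):.4f}")
```

Under these assumptions, heterozygosity at s = 0.97 falls to roughly 6% of its random-mating value, while the remaining outcrossing still allows recombination between distinct lineages.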

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Map of collection sites around the world.
Red dots indicate sample sites.
Figure 2. Fraction of non-matching alleles between all pairs of plants.
Solid bars are observed values from the data. The stacked bars show pairs within Eurasia (blue), pairs within North America (red), and intercontinental pairs (black). The green line is the distribution from a simulation assuming panmixia. The yellow line is from a simulation assuming global random mating but measuring differences only between unique haplotypes.
Figure 3. Estimated selfing rate per field site.
Individual dots are specific field sites. North American sites are in red. The curve is a smoothed kernel density.
Figure 4. Distribution of haplogroup diversity by field site.
Colors show the probability that two plants in a field site belong to different haplogroups. Low values (red) indicate monomorphic field sites. High values (light) indicate diverse field sites. A dynamic map will be available online at http://arabidopsis.usc.edu/Accession/.
Figure 5. Probability of finding two members of a haplogroup as a function of distance and continent.
Dot size shows the relative (within-panel) number of observations per bin. The blue line is the line of the form y = mx + b that best fits the binned data. The red line is the exponential decay model of the form y = C exp(−λx) that best fits the binned data. Panels (A,B) use 150 km bins, (C,D) use 10 km bins, and (E,F) use 0.5 km bins.
Figure 6. Pairwise distribution of non-shared alleles as a function of geographic distance and continent.
Boxes show the median and the 25th and 75th percentiles; whiskers show the 9th and 91st percentiles. Shading shows the relative (within-panel) number of observations per bin. The blue line is the line of the form y = mx + b that best fits the binned data. The red line is the exponential model of the form y = K − C exp(−λx) that best fits the binned data. Panels (A,B) use 150 km bins, (C,D) use 10 km bins, and (E,F) use 0.5 km bins. The exponential fit did not converge for the data in (A,E).
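The quantities plotted in Figures 2, 4, 5, and 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: all function names are hypothetical, genotypes are assumed to be equal-length sequences of allele calls with None for missing data, and the exponential decay y = C exp(−λx) is fitted by ordinary least squares on log(y), a simpler stand-in for whatever nonlinear fitting the paper used. The offset form y = K − C exp(−λx) of Figure 6 is not covered here.

```python
import math
from collections import Counter


def mismatch_fraction(a, b):
    """Fraction of SNPs at which two plants differ, skipping missing calls (None)."""
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no SNPs typed in both plants")
    return sum(x != y for x, y in compared) / len(compared)


def haplogroup_diversity(labels):
    """Probability that two distinct plants drawn from one site differ in haplogroup."""
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two plants")
    same = sum(c * (c - 1) for c in Counter(labels).values())
    return 1.0 - same / (n * (n - 1))


def binned_means(pairs, bin_width):
    """Average (distance, value) pairs within fixed-width distance bins.

    Returns a list of (bin_midpoint, mean_value), sorted by distance.
    """
    sums = {}
    for distance, value in pairs:
        k = int(distance // bin_width)
        total, count = sums.get(k, (0.0, 0))
        sums[k] = (total + value, count + 1)
    return [((k + 0.5) * bin_width, total / count)
            for k, (total, count) in sorted(sums.items())]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def fit_exponential_decay(xs, ys):
    """Fit y = C*exp(-lam*x) by least squares on log(y); returns (C, lam).

    Requires every y to be strictly positive.
    """
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), -slope
```

A typical use would compute `mismatch_fraction` for every pair of plants, pair each value with the geographic distance between the two collection sites, pass the result through `binned_means` with a bin width such as 150, 10, or 0.5 km, and then fit both models to the binned points.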
