Build Summary

Release Version: 20230706150541

New: We are pleased to present NCBI ALFA Release 3 (version 20230706150541), one of the largest and most comprehensive aggregated variant databases that includes allele and genotype frequencies. This version adds genotype frequencies and Hardy-Weinberg equilibrium (HWE) probabilities to support more effective analysis. All of this information is now available in our FTP files. The new fields are described by the following FORMAT header lines:

##FORMAT=<ID=HWEP,Number=1,Type=Integer,Description="int(-log(HWE score test p-value)); -1 indicates that the HWE score test p-value could not be computed">
##FORMAT=<ID=GR,Number=1,Type=Integer,Description="Genotype homozygous reference allele (AA) count; in rare cases may not be the GRCh reference allele">
##FORMAT=<ID=GV,Number=1,Type=Integer,Description="Genotype heterozygous ref/alt (A/B) count; reported for the most common two alleles that may or may not include the reference allele">
##FORMAT=<ID=GA,Number=1,Type=Integer,Description="Genotype homozygous alternate allele (B/B) count; could be any of the non-biallelic variant alleles.">
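
As a rough illustration of how the GR, GV and GA counts relate to genotype frequencies and to Hardy-Weinberg expectations, here is a minimal Python sketch. The function name `genotype_summary` and its return keys are hypothetical and are not part of any ALFA tool. It treats the two reported alleles (A and B) as a simple biallelic site, which is an assumption. It does not try to reproduce the HWEP value, because the header does not state the logarithm base or the exact form of the score test.

```python
from typing import Dict, Optional


def genotype_summary(gr: int, gv: int, ga: int) -> Optional[Dict[str, float]]:
    """Summarize one population column from its GR, GV and GA counts.

    gr: homozygous A/A count, gv: heterozygous A/B count, ga: homozygous B/B count.
    Assumes (for illustration only) that the site is biallelic in A and B.
    """
    n = gr + gv + ga  # number of genotyped subjects
    if n == 0:
        return None  # nothing to summarize

    p = (2 * gr + gv) / (2 * n)  # frequency of allele A
    q = 1.0 - p                  # frequency of allele B

    return {
        "freq_AA": gr / n,
        "freq_AB": gv / n,
        "freq_BB": ga / n,
        "allele_A": p,
        "allele_B": q,
        # Genotype counts expected under Hardy-Weinberg equilibrium
        "expected_AA": p * p * n,
        "expected_AB": 2 * p * q * n,
        "expected_BB": q * q * n,
    }


def hwep_is_computed(hwep: int) -> bool:
    """Per the header, -1 means the HWE p-value could not be computed."""
    return hwep != -1
```

Comparing the observed counts with the `expected_*` values gives a quick sense of how far a site departs from equilibrium; the HWEP field in the files remains the authoritative test result.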

The RefSNP website already reports allele frequencies from this release. We plan to update the RefSNP page to include genotype frequencies and Hardy-Weinberg equilibrium (HWE) probabilities.

Input and Output Counts


Input Count
  Studies: 82
  Subjects: 204,108
  Genotypes: 5,773,169,974,362
Output Count
  Total RefSNPs: 904,666,942
  Exist in dbSNP 156: 904,113,309
  Novel: 553,633

* Subject counts from different assay sources can overlap.

Population | BioSample ID | Subjects | Total Site Count | MAF = 0 | MAF >= 0.01 | 0.01 > MAF >= 0.001 | MAF < 0.001 | Singleton
European | SAMN10492695 | 170,432 | 897,795,726 | 790,467,870 | 12,698,121 | 10,217,165 | 874,880,440 | 55,468,196
AfricanOthers | SAMN10492696 | 330 | 889,789,877 | 867,263,040 | 16,189,977 | 6,336,860 | 867,263,040 | 6,693,257
EastAsian | SAMN10492697 | 2,515 | 889,359,780 | 877,818,106 | 11,382,903 | 133,481 | 877,843,396 | 3,530,766
AfricanAmerican | SAMN10492698 | 8,860 | 890,733,488 | 823,801,666 | 17,221,278 | 17,452,425 | 856,059,785 | 25,341,328
LatinAmerican1 | SAMN10492699 | 817 | 889,296,648 | 869,911,354 | 12,614,612 | 6,770,380 | 869,911,656 | 6,683,777
LatinAmerican2 | SAMN10492700 | 4,703 | 889,338,588 | 862,580,163 | 9,603,034 | 17,148,712 | 862,586,842 | 11,064,663
OtherAsian | SAMN10492701 | 1,000 | 889,157,433 | 880,613,029 | 8,495,071 | 41,203 | 880,621,159 | 2,585,443
SouthAsian | SAMN10492702 | 2,619 | 889,137,055 | 875,447,223 | 13,542,288 | 139,843 | 875,454,924 | 4,210,232
Other | SAMN11605645 | 12,832 | 897,815,333 | 859,256,511 | 14,990,800 | 22,453,126 | 860,371,407 | 14,080,922
African (note 1) | SAMN10492703 | 9,190 | 890,733,974 | 822,797,281 | 17,256,003 | 17,786,640 | 855,691,331 | 25,853,998
Asian (note 2) | SAMN10492704 | 3,515 | 889,379,475 | 876,472,062 | 9,016,798 | 3,858,978 | 876,503,699 | 4,097,403
Total (note 3) | SAMN10492705 | 204,108 | 897,855,544 | 736,676,981 | 15,200,441 | 17,592,488 | 865,062,615 | 81,123,968

Notes:

  1. Total of African American and African Others; see population descriptions.

  2. Total of East Asian and Other Asian; see population descriptions.

  3. Total of unique subjects, excluding the redundant African and Asian aggregate counts above.
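
The three notes can be checked against the Subjects column with a few lines of Python. This is only a sanity check on the published subject counts; the dictionary name `subjects` is illustrative.

```python
# Subject counts for the non-aggregate populations, copied from the table.
subjects = {
    "European": 170_432,
    "AfricanOthers": 330,
    "EastAsian": 2_515,
    "AfricanAmerican": 8_860,
    "LatinAmerican1": 817,
    "LatinAmerican2": 4_703,
    "OtherAsian": 1_000,
    "SouthAsian": 2_619,
    "Other": 12_832,
}

# Note 1: African = African American + African Others
african = subjects["AfricanAmerican"] + subjects["AfricanOthers"]

# Note 2: Asian = East Asian + Other Asian
asian = subjects["EastAsian"] + subjects["OtherAsian"]

# Note 3: Total counts each subject once, so the African and Asian
# aggregate rows are left out of the sum.
total = sum(subjects.values())

print(african, asian, total)  # 9190 3515 204108
```

The results match the African, Asian and Total rows of the table (9,190, 3,515 and 204,108 subjects).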

Column descriptions:

Population - output population; see ALFA computed populations

BioSample ID - population BioSample accession ID

Subjects - unique subject count by population

Total Site Count - total unique variant sites reported

MAF = 0 - sites homozygous for the reference allele, with no variant allele detected at the current subject sample size; the variant is possibly rare if the subject count is > 100

MAF >= 0.01 - common variants

0.01 > MAF >= 0.001 - rare variants

MAF < 0.001 - ultra rare variants

Singleton - minor allele is found once
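
The MAF thresholds above can be expressed as a small classifier. The sketch below is illustrative only: the function names `maf_bin` and `is_singleton` are hypothetical, and treating the bins as mutually exclusive is an assumption, since this page does not state exactly how sites are assigned to the published columns.

```python
def maf_bin(minor_allele_count: int, total_allele_count: int) -> str:
    """Return the MAF column label for a site, using the thresholds above.

    Assumes (for illustration) that each site falls into exactly one bin.
    """
    if total_allele_count <= 0:
        raise ValueError("total_allele_count must be positive")
    if minor_allele_count < 0:
        raise ValueError("minor_allele_count must not be negative")

    if minor_allele_count == 0:
        return "MAF = 0"  # no variant allele observed

    maf = minor_allele_count / total_allele_count
    if maf >= 0.01:
        return "MAF >= 0.01"          # common
    if maf >= 0.001:
        return "0.01 > MAF >= 0.001"  # rare
    return "MAF < 0.001"              # ultra rare


def is_singleton(minor_allele_count: int) -> bool:
    """A singleton is a site whose minor allele is found exactly once."""
    return minor_allele_count == 1
```

For example, a minor allele seen 5 times among 1,000 alleles has MAF 0.005 and lands in the rare bin, while one seen once among 2,000 alleles is both ultra rare and a singleton.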

Last updated: 2023-08-11T15:20:29Z
