Explore Population Genetics in dbSNP with NCBI’s Allele Frequency Aggregator (ALFA)

Access to comprehensive and accurate allele frequency data is essential to understanding the impact of genetic variations on human health and disease. The Allele Frequency Aggregator (ALFA) provides the Database of Single Nucleotide Polymorphisms (dbSNP) with allele frequency data for 200K subjects from the Database of Genotypes and Phenotypes (dbGaP). ALFA adheres to the Findable, Accessible, Interoperable, and Reusable (FAIR) Data Principles, providing open access to valuable allele frequency data.

Searching Allele Frequencies

Search dbSNP for minor allele frequencies (MAF) within ALFA’s populations to see genetic differences and get insights into human population history. This may also shed light on genetic variations and diseases that affect different populations and guide research on phenotype-genotype associations and possible treatments.  

Here are some examples that showcase the versatility and usefulness of ALFA searching in the SNP database. 

  • A specific minor allele frequency or a range of MAFs in the ALFA populations: 

Variants with MAF 0.01 in the European population [ALFA_EUR]:

"00000.0100"[ALFA_EUR]

All common variants (MAF 0.01 to 0.5) in the African population [ALFA_AFR]:

"00000.0100":"00000.5000"[ALFA_AFR]
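The search terms above all use the same zero-padded, fixed-width frequency string. As a minimal sketch, assuming that format is five integer digits, a decimal point, and four decimal places (inferred from the examples, not from a published specification), the terms can be generated in Python. The helper names `format_maf` and `alfa_term` are hypothetical and are not part of any NCBI library.

```python
from typing import Optional


def format_maf(maf: float) -> str:
    """Render a frequency as a zero-padded string such as "00000.0100".

    Assumption: the width (10 characters, 4 decimal places) is inferred
    from the example queries in this post.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return f"{maf:010.4f}"


def alfa_term(population: str, low: float, high: Optional[float] = None) -> str:
    """Build a single-value or range search term for one ALFA population field."""
    field = f"[ALFA_{population.upper()}]"
    if high is None:
        # Single frequency, e.g. "00000.0100"[ALFA_EUR]
        return f'"{format_maf(low)}"{field}'
    if high < low:
        raise ValueError("high must not be smaller than low")
    # Range, e.g. "00000.0100":"00000.5000"[ALFA_AFR]
    return f'"{format_maf(low)}":"{format_maf(high)}"{field}'


print(alfa_term("EUR", 0.01))
print(alfa_term("AFR", 0.01, 0.5))
```

The two printed terms match the European and African examples above and can be pasted into the dbSNP search box.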

  • A minor allele frequency or a range of MAFs combined in Boolean searches with other dbSNP filters or search terms:

Common variants in European populations [ALFA_EUR] that are rare in African populations [ALFA_AFR]:

"00000.0100":"00000.5000"[ALFA_EUR] AND "00000.0010":"00000.0110"[ALFA_AFR]

Rare missense variants in the South Asian population [ALFA_SAS] that are identified as pathogenic alleles:

"00000.0001":"00000.0100"[ALFA_SAS] AND pathogenic[Clinical_Significance] AND missense variant[Function_Class]

Figure 1: dbSNP results and ALFA frequencies for a missense variant (rs17580) in the SERPINA1 gene. This is a globally common variant that can be associated with alpha-1-antitrypsin deficiency. It shows as a common variant in the ALFA European population but is much rarer in the African population.
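The Boolean searches shown earlier in this section can also be assembled programmatically. The following is a rough sketch, not an official tool. The helpers `maf_range` and `all_of` are hypothetical names, and the frequency format is assumed from the examples in this post. The field names are the ones used in the queries above.

```python
def maf_range(population: str, low: float, high: float) -> str:
    """Range term for one ALFA population (assumed fixed-width format)."""
    return f'"{low:010.4f}":"{high:010.4f}"[ALFA_{population}]'


def all_of(*terms: str) -> str:
    """Join non-empty search terms with the Boolean AND operator."""
    return " AND ".join(term for term in terms if term)


# Common in the European population, rare in the African population.
eur_vs_afr = all_of(
    maf_range("EUR", 0.01, 0.5),
    maf_range("AFR", 0.001, 0.011),
)

# Rare pathogenic missense variants in the South Asian population.
rare_pathogenic_sas = all_of(
    maf_range("SAS", 0.0001, 0.01),
    "pathogenic[Clinical_Significance]",
    "missense variant[Function_Class]",
)

print(eur_vs_afr)
print(rare_pathogenic_sas)
```

Keeping each filter as a separate term makes it easy to add or drop a condition without retyping the padded frequency strings.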

Advanced Search

Use additional search terms and attributes such as gene name, other function classes (e.g., synonymous, frameshift, etc.), variant types, and others for more precise results. The SNP Advanced Search Builder can help with setting up complex Boolean queries.  

API Access

You can build automated workflows that search and retrieve ALFA data through the E-Utilities API. Visit GitHub for tutorials and demonstration code that show how to use the E-Utilities to access these data.
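As an illustration of such a workflow, here is a minimal sketch that sends an ALFA search term to the E-Utilities ESearch endpoint using only the Python standard library. It assumes the `snp` database name and the JSON layout (`esearchresult`, `count`, `idlist`) that ESearch returns for `retmode=json`. The function names are hypothetical, and this is not the demonstration code from the GitHub tutorials.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that queries dbSNP (db=snp) for the given term."""
    params = {"db": "snp", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def search_snp(term: str, retmax: int = 20):
    """Run the search and return (total hit count, list of record IDs).

    Assumption: the response is JSON with an "esearchresult" object that
    contains "count" and "idlist".
    """
    with urlopen(build_esearch_url(term, retmax), timeout=30) as response:
        payload = json.load(response)
    result = payload["esearchresult"]
    return int(result["count"]), result["idlist"]


# Building the URL needs no network access; search_snp() does.
url = build_esearch_url('"00000.0100":"00000.5000"[ALFA_EUR]', retmax=5)
print(url)
```

Calling `search_snp('"00000.0100":"00000.5000"[ALFA_EUR]', retmax=5)` would perform the request. For larger jobs, consult the E-Utilities usage guidelines on request rates and API keys.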

Stay up to date

Follow us on social media @NCBI and join our mailing list to keep up to date with dbSNP and other NCBI news. 

Questions?

Please reach out to us with questions or feedback.  