BMC Genomics. 2009 Dec 15;10:606.
doi: 10.1186/1471-2164-10-606.

Manual annotation and analysis of the defensin gene cluster in the C57BL/6J mouse reference genome

Clara Amid et al. BMC Genomics.

Abstract

Background: Host defense peptides are a critical component of the innate immune system. Human alpha- and beta-defensin genes are subject to copy number variation (CNV) and historically the organization of mouse alpha-defensin genes has been poorly defined. Here we present the first full manual genomic annotation of the mouse defensin region on Chromosome 8 of the reference strain C57BL/6J, and the analysis of the orthologous regions of the human and rat genomes. Problems were identified with the reference assemblies of all three genomes. Defensins have been studied for over two decades and their naming has become a critical issue due to incorrect identification of defensin genes derived from different mouse strains and the duplicated nature of this region.

Results: The defensin gene cluster region on mouse Chromosome 8 A2 contains 98 gene loci: 53 are likely active defensin genes and 22 are defensin pseudogenes. Several TATA box motifs that likely impact gene expression were found for human and mouse defensin genes. Three novel defensin genes belonging to the Cryptdin Related Sequences (CRS) family were identified. All additional mouse defensin loci on Chromosomes 1, 2 and 14 were annotated, and unusual splice variants were identified. Comparison of the mouse alpha-defensins in the three main mouse reference gene sets, Ensembl, Mouse Genome Informatics (MGI), and NCBI RefSeq, reveals significant inconsistencies in annotation and nomenclature. We are collaborating with the Mouse Genome Nomenclature Committee (MGNC) to establish a standardized naming scheme for alpha-defensins.

Conclusions: Prior to this analysis, there was no reliable reference gene set available for the mouse strain C57BL/6J defensin genes, demonstrating that manual intervention is still critical for the annotation of complex gene families and heavily duplicated regions. Accurate gene annotation is facilitated by the annotation of pseudogenes and regulatory elements. Manually curated gene models will be incorporated into the Ensembl and Consensus Coding Sequence (CCDS) reference sets. Elucidation of the genomic structure of this complex gene cluster on the mouse reference sequence, and adoption of a clear and unambiguous naming scheme, will provide a valuable tool to support studies on the evolution, regulatory mechanisms and biological functions of defensins in vivo.

Figures

Figure 1
Overview of the defensin gene cluster region in mouse (top) and human (bottom). A clone tiling path is shown for the corresponding regions in mouse (top) and human (bottom). Clones are displayed in yellow but regions overlapping with adjacent clones are shown in black. Genes are indicated by arrows. Genes in shadowed boxes are duplicated and the color indicates the pairs; A -'- highlights all potential Defcr5 genes (see color legend for more details). The mouse assembly is based on NCBIM37, in which three gaps currently exist; two gaps are indicated by grey bars and the biggest gap between the two clusters is joined by a 'V'.
Figure 2
Multiple protein alignment of defensin peptides. Most defensin peptides contain six canonical cysteine residues (A). Members of the CRS1C family contain eleven cysteines with a different spacing; CRS4C-6 belongs to the CRS4C family but contains ten cysteines instead of the nine usual for this group (B). A novel sequence (OTTMUSG00000019857) annotated within the defensin gene cluster region lacks the canonical cysteines in any known number and spacing. It contains four cysteine residues, but they do not align with any of the known cysteines in other peptides (C). All cysteine residues are highlighted in yellow. Genes identified for the first time in this study are tagged as noveldef.
Figure 3
The polymorphic Defcr5 locus. A protein alignment between all potential Defcr5 copies and P28312.2. Variation in amino acids is highlighted in red.
Figure 4
Novel coding and non-coding variants. Vega view of the region for Defb30 and Defb42, where three new variants per locus were annotated. Defb30: variant 1 is a known variant with known CDS, variant 2 is a novel variant with the same CDS as variant 1 but an alternative 3' UTR, and variants 3 and 4 are novel variants with putative CDS and different 3' UTRs. Defb42: variant 1 represents a non-coding transcript, variant 2 is a novel variant with the same CDS as the known transcript (3) but with an alternative 5' UTR, variant 3 is a known variant with known CDS, and variant 4 is an NMD candidate.
Figure 5
Potential promoter region for some defensin genes. (A) A weak TATA box motif could be identified for two genes, OTTMUSG00000019784 and Defcr26. (B) A strong TATA box motif was found for 27 defensin genes; an example is shown for Defcr3 and OTTMUSG00000019785, a novel defensin gene. TATA box motifs are shown in red/blue and start codons in green.
