A large-scale phylogeny-guided analysis of pseudogenes in Pseudomonas aeruginosa bacterium
- PMID: 37750703
- PMCID: PMC10580986
- DOI: 10.1128/spectrum.01704-23
Abstract
Pseudogenes, once considered "junk DNA" based on the incorrect assumption that the absence of full coding potential means a complete lack of functionality, have recently become a subject of significant interest in the scientific community. Concurrently, it is widely assumed that bacterial genomes are compact and have a high density of coding genes with little room for non-coding genes, including pseudogenes. A key aspect of genome annotation is the correct identification of genes and the distinction between coding genes and pseudogenes, as it directly impacts functional and comparative genomics studies. In this study, we analyzed the genomic data of 4,699 strains of the bacterium Pseudomonas aeruginosa (P. aeruginosa), as they exhibit high variability in the number of annotated pseudogenes. In particular, we looked for correlations between the number of pseudogenes and other genomic and meta-features of the strains. We identified clusters of orthologous genes and pseudogenes and compared cluster size distributions and length homogeneity within clusters. We then mapped and examined orthology relationships between genes and pseudogenes. Additionally, we generated a phylogenetic tree of the strains and found that phylogenetically related strains are more homogeneous in the number of pseudogenes and share a significant number of pseudogenes. Finally, we delved into clusters of orthologous genes and pseudogenes and quantified their phylogenetic neighborhood, classifying pseudogenes into evolutionarily preserved pseudogenes, mis-annotated pseudogenes, or pseudogenes formed by failed horizontal transfer events. This in-depth study provides important insights that can be incorporated into pseudogene annotation pipelines in the future.
IMPORTANCE
Accurate annotation of genes and pseudogenes is vital for comparative genomics analysis. Recent studies have shown that bacterial pseudogenes have an important role in regulatory processes and can provide insight into the evolutionary history of homologous genes or the genome as a whole. Due to pseudogenes' nature as non-functional genes, there is no commonly accepted definition of a pseudogene, which poses difficulties in verifying the annotation through experimental methods and resolving discrepancies among different annotation techniques. Our study introduces an in-depth analysis of annotated genes and pseudogenes and insights that can be incorporated into improved pseudogene annotation pipelines in the future.
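As an illustration of the cluster-level comparison mentioned in the abstract (cluster size and length homogeneity within clusters of orthologous genes and pseudogenes), the following is a minimal sketch and not the authors' pipeline. The data layout, the function name `summarize_cluster`, the toy values, and the choice of the coefficient of variation as a homogeneity measure are all assumptions made for this example.

```python
from statistics import mean, pstdev


def summarize_cluster(members):
    """Summarize one ortholog cluster.

    members: list of (strain_id, length_bp, is_pseudogene) tuples.
    This layout is hypothetical; it is not taken from the study.
    """
    lengths = [length for _, length, _ in members]
    avg = mean(lengths)
    # Coefficient of variation: 0.0 means all members have the same length.
    length_cv = pstdev(lengths) / avg if avg else 0.0
    # Share of cluster members that are annotated as pseudogenes.
    pseudo_fraction = sum(1 for _, _, is_pseudo in members if is_pseudo) / len(members)
    return {
        "size": len(members),
        "length_cv": length_cv,
        "pseudo_fraction": pseudo_fraction,
    }


# Toy clusters (invented values, for illustration only).
clusters = {
    "cluster_A": [("s1", 900, False), ("s2", 900, False), ("s3", 900, False)],
    "cluster_B": [
        ("s1", 1200, False),
        ("s2", 600, True),
        ("s3", 1200, False),
        ("s4", 300, True),
    ],
}

summary = {cid: summarize_cluster(members) for cid, members in clusters.items()}
for cid, stats in summary.items():
    print(cid, stats)
```

Under these assumptions, a cluster with a high length variation and a mixed gene/pseudogene composition (like `cluster_B`) would be a candidate for closer inspection, whereas a length-homogeneous cluster (like `cluster_A`) would not.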
Keywords: Pseudomonas aeruginosa; bacteria; comparative genomics; phylogenetics; pseudogenes.
Conflict of interest statement
The authors declare no conflict of interest.
Figures
![Fig 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f001.gif)
![Fig 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f002.gif)
![Fig 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f003.gif)
![Fig 4](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f004.gif)
![Fig 5](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f005.gif)
![Fig 6](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10580986/bin/spectrum.01704-23.f006.gif)