A probabilistic similarity metric for Medline records: a model for author name disambiguation
- PMID: 14728536
- PMCID: PMC1480109
Abstract
We present a model for automatically generating training sets and estimating the probability that a pair of Medline records sharing the same last name and first initial was authored by the same individual. The estimate is based on shared title words, journal name, co-authors, medical subject headings, language, and affiliation, as well as distinctive features of the name itself (i.e., presence of a middle initial, presence of a suffix, and prevalence of the name in Medline).
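As a rough illustration of the pairwise comparison described in the abstract, the sketch below builds a similarity profile for two records from the listed attributes and then converts a likelihood ratio into a match probability using Bayes' rule in odds form. The `Record` class, its field names, `similarity_profile`, and `match_probability` are all hypothetical, and the Bayes step is a generic assumption about how such a profile could be turned into a probability; the abstract does not specify the actual estimation procedure.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified stand-in for a Medline record."""
    title_words: set = field(default_factory=set)
    journal: str = ""
    coauthors: set = field(default_factory=set)
    mesh: set = field(default_factory=set)
    language: str = ""
    affiliation: str = ""


def similarity_profile(a: Record, b: Record) -> dict:
    """Compare two records on the attributes named in the abstract."""
    return {
        "shared_title_words": len(a.title_words & b.title_words),
        "same_journal": a.journal == b.journal,
        "shared_coauthors": len(a.coauthors & b.coauthors),
        "shared_mesh": len(a.mesh & b.mesh),
        "same_language": a.language == b.language,
        # Treat an empty affiliation as "unknown", not as a match.
        "same_affiliation": bool(a.affiliation) and a.affiliation == b.affiliation,
    }


def match_probability(likelihood_ratio: float, prior: float) -> float:
    """Assumed Bayes step: posterior odds = prior odds * likelihood ratio.

    likelihood_ratio: how much more often the observed profile occurs in
    same-author pairs than in different-author pairs (assumed to be
    estimated elsewhere, e.g. from training sets).
    prior: prior probability that the pair shares an author (0 < prior < 1).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)
```

In this sketch the prior is where name-specific features such as the prevalence of the name in Medline could enter: a common name would plausibly warrant a lower prior than a rare one, though that mapping is also an assumption here.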
Similar articles
- Correction to: Treatment of Immature Scars: Manual Massages. 2021 Mar 18. In: Téot L, Mustoe TA, Middelkoop E, Gauglitz GG, editors. Textbook on Scar Management: State of the Art Management and Emerging Technologies [Internet]. Cham (CH): Springer; 2020. Chapter 65. PMID: 36351125 Free Books & Documents. Review.
- Three journal similarity metrics and their application to biomedical journals. PLoS One. 2014 Dec 23;9(12):e115681. doi: 10.1371/journal.pone.0115681. eCollection 2014. PMID: 25536326 Free PMC article.
- [A unique author ID is a possible solution to the name ambiguity problem]. Ugeskr Laeger. 2014 Sep 29;176(40):V04140239. PMID: 25294516 Review. Danish.
- Author Name Disambiguation in MEDLINE. ACM Trans Knowl Discov Data. 2009 Jul 1;3(3):11. doi: 10.1145/1552303.1552304. PMID: 20072710 Free PMC article.
- The strength of co-authorship in gene name disambiguation. BMC Bioinformatics. 2008 Jan 29;9:69. doi: 10.1186/1471-2105-9-69. PMID: 18230174 Free PMC article.
Cited by
- Notes on the data quality of bibliographic records from the MEDLINE database. Database (Oxford). 2023 Nov 4;2023:baad070. doi: 10.1093/database/baad070. PMID: 37935584 Free PMC article.
- Exploring high scientific productivity in international co-authorship of a small developing country based on collaboration patterns. J Big Data. 2023;10(1):64. doi: 10.1186/s40537-023-00744-1. Epub 2023 May 15. PMID: 37215244 Free PMC article.
- Scientific rewards for biomedical specialization are large and persistent. BMC Biol. 2022 Sep 30;20(1):211. doi: 10.1186/s12915-022-01400-5. PMID: 36175953 Free PMC article.
- The ripple effects of funding on researchers and output. Sci Adv. 2022 Apr 22;8(16):eabb7348. doi: 10.1126/sciadv.abb7348. Epub 2022 Apr 22. PMID: 35452287 Free PMC article.
- Man versus machine? Self-reports versus algorithmic measurement of publications. PLoS One. 2021 Sep 29;16(9):e0257309. doi: 10.1371/journal.pone.0257309. eCollection 2021. PMID: 34587169 Free PMC article.