Analysis of Primary Citations (References) of PDB Deposits

Conference: 2022: 72nd ACA Annual Meeting
Joanna Lenkiewicz Poster Author
University of Virginia
Charlottesville, VA 
 
Michal Gucwa Additional Author
University of Virginia
Charlottesville, VA 
 
David Cooper Additional Author
University of Virginia
Charlottesville, VA 
 
Wladek Minor Additional Author
University of Virginia
Charlottesville, VA 
 
07/30/2022: 5:30 PM - 7:30 PM
Poster Session 
Portland Marriott Downtown Waterfront 
Room: Exhibit Hall 

Description

Researchers worldwide from almost every biomedical discipline perform basic searches of the Protein Data Bank (PDB), so the core metadata of a PDB deposit must be as informative as possible. On a larger scale, inaccurate or misleading metadata can skew data-mining efforts. The title and keywords of a PDB deposit may play an essential role in data mining of the PDB. The title of the primary citation (reference) may also help in such searches, yet many deposits show notable discrepancies between the structure title and the primary reference title. Moreover, we have observed that the fraction of deposits with the status "To be published" has grown in recent years. We also analyze the similarity of titles, the number of citations for various classes of structures, and the primary reference keywords. Finally, we compare the crystallization conditions reported in the PDB deposit with those described in the methods section of the primary citation. Several noteworthy examples are presented.
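
As an illustration of how a structure title might be compared with its primary reference title, here is a minimal Python sketch. It is not the method used in the poster, which the abstract does not specify. The function names (`tokens`, `jaccard`, `char_ratio`) and the example titles are hypothetical, and the two measures (token-set Jaccard overlap and the character-level ratio from the standard library's `difflib`) are assumptions chosen for simplicity.

```python
import re
from difflib import SequenceMatcher


def tokens(title):
    """Lowercase a title and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(title_a, title_b):
    """Token-set Jaccard similarity: shared tokens / all distinct tokens."""
    ta, tb = tokens(title_a), tokens(title_b)
    if not ta and not tb:
        return 1.0  # two empty titles are treated as identical (assumption)
    return len(ta & tb) / len(ta | tb)


def char_ratio(title_a, title_b):
    """Case-insensitive character-level similarity in [0, 1] via difflib."""
    return SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    structure_title = "Crystal structure of lysozyme in complex with ligand X"
    citation_title = "Structural basis of ligand X recognition by lysozyme"
    print("Jaccard:", round(jaccard(structure_title, citation_title), 2))
    print("Char ratio:", round(char_ratio(structure_title, citation_title), 2))
```

A low score on both measures would flag a deposit whose structure title and primary reference title diverge. Any threshold for "notable discrepancy" would have to be chosen empirically.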