
Methods and software tools for automating the management of scientific publication metadata

Meshcheryakov Laboratory of Information Technologies

Joint Laboratory Seminar

Date and Time: Thursday, 4 June 2026, at 15:00

Venue: room 310, Meshcheryakov Laboratory of Information Technologies; online via Webinar

Seminar topic: «Methods and software tools for automating the management of scientific publication metadata»

Speaker: Andrey Kondratyev

Abstract:

The talk presents methods and software tools for automating the management of scientific publication metadata, enabling end-to-end processing of bibliographic data from aggregation across distributed sources to verified import into institutional digital repositories. A cascaded metadata aggregation methodology is proposed, utilising application programming interfaces of external scientific databases to generate unified publication records through sequential matching based on digital identifiers and bibliographic attributes. To resolve author ambiguity, a multifactor algorithm has been developed that integrates deterministic matching via global researcher identifiers, fuzzy comparison of surnames and initials using string distance metrics, and content-based analysis of thematic profiles employing a statistical term weighting model. A hybrid verification mechanism has been implemented, wherein algorithmic filtering is complemented by routing complex cases to an expert review system, thereby minimising the proportion of manual processing. The software suite is designed based on a modular architecture with standardised adapters for open-source digital repository platforms, ensuring that performance remains independent of the target storage system. Experimental evaluation in a production environment confirmed high author identification accuracy, effective duplicate elimination, and a substantial reduction in processing time for bibliographic records.
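The multifactor author matching described in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the field names (`researcher_id`, `name`, `terms`), the weights, the use of `difflib.SequenceMatcher` as the string similarity measure, and TF-IDF as the statistical term weighting model are all assumptions made for illustration.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1] between two 'Surname Initials' strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per token list."""
    n = len(docs)
    # Document frequency: in how many token lists each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF; an assumed variant, not taken from the talk.
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def match_score(record, candidate, corpus, w_name=0.6, w_topic=0.4):
    """Combine the three factors into one score in [0, 1].

    The weights are hypothetical and would need tuning on real data.
    """
    # Factor 1: deterministic match on a global researcher identifier.
    rid = record.get("researcher_id")
    if rid and rid == candidate.get("researcher_id"):
        return 1.0
    # Factor 2: fuzzy comparison of surname and initials.
    name = name_similarity(record["name"], candidate["name"])
    # Factor 3: similarity of thematic profiles under TF-IDF weighting.
    vecs = tfidf_vectors(corpus + [record["terms"], candidate["terms"]])
    topic = cosine(vecs[-2], vecs[-1])
    return w_name * name + w_topic * topic
```

In a hybrid verification scheme of the kind the abstract mentions, a score above an upper threshold could be accepted automatically, a score below a lower threshold rejected, and anything in between routed to expert review; the thresholds themselves are likewise assumptions.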
