Automation of accounting of publications using the ORCID application programming interface
DOI: https://doi.org/10.17721/1812-5409.2024/1.26
Keywords: APIs, accounting of publications, duplicate search, ORCID, Python, MatLab, algorithm, XML
Abstract
A procedure for the automated accounting of publications based on the REST API of the ORCID database is proposed. The relevance of publication accounting is described, and the importance of using various technologies to build bibliographic data repositories is substantiated. The availability of API technology in the best-known publication databases (Web of Science, SCOPUS, Crossref, Google Scholar, and ORCID) is analyzed, and the choice of the ORCID database is substantiated. A scheme is given for downloading publications from the ORCID database by specified registration numbers, using services implemented in the Python and MatLab programming languages. The data received in JSON or XML is then parsed. MatLab functions for obtaining a structure from the XML (JSON) data formats are provided.

In addition, an algorithm for finding duplicate publications during accounting is considered. Approaches to avoiding duplicate publications in databases are formulated, based on the Levenshtein algorithm for similarity assessment. It is proposed to transliterate Cyrillic text into the Latin alphabet to ensure clear and correct comparison of textual data. A MySQL database was developed to collect and update data on publishing activity. The publication table of the database is supplemented with a special attribute that stores the result of converting Cyrillic titles into their Latin equivalents. Indexing the database table fields (INDEX) by various attributes is recommended; it significantly increased the efficiency of searching, processing, and comparing data. It is proposed to use the Soundex() function of the MySQL DBMS to determine the degree of phonetic consonance of publication topics by additional parameters.
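The download-and-parse step can be sketched in Python as follows. This is a minimal illustration, not the authors' service: the base URL and API version, the `group` / `work-summary` field layout, and the function names `works_request`, `parse_works`, and `fetch_works` are assumptions based on the ORCID public API and should be checked against the current ORCID documentation.

```python
import json
import urllib.request

# Assumed base URL and version of the ORCID public API.
ORCID_API = "https://pub.orcid.org/v3.0"


def works_request(orcid_id):
    """Build a GET request for the works summary of one ORCID iD (JSON)."""
    return urllib.request.Request(
        f"{ORCID_API}/{orcid_id}/works",
        headers={"Accept": "application/json"},
    )


def parse_works(payload):
    """Flatten the assumed 'group' -> 'work-summary' layout into plain dicts."""
    records = []
    for group in payload.get("group") or []:
        summaries = group.get("work-summary") or []
        if not summaries:
            continue
        summary = summaries[0]  # take the first summary of each group
        title = ((summary.get("title") or {}).get("title") or {}).get("value")
        date = summary.get("publication-date") or {}
        year = (date.get("year") or {}).get("value")
        ext_ids = (summary.get("external-ids") or {}).get("external-id") or []
        doi = next(
            (e.get("external-id-value") for e in ext_ids
             if e.get("external-id-type") == "doi"),
            None,
        )
        records.append({"title": title, "year": year, "doi": doi})
    return records


def fetch_works(orcid_id):
    """Download and parse the works of one researcher (needs network access)."""
    with urllib.request.urlopen(works_request(orcid_id), timeout=30) as resp:
        return parse_works(json.load(resp))
```

Keeping `parse_works` separate from the network call means the parsing logic can be checked against a saved response before any records are written to the database.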
The practical implementation of the algorithm for finding duplicate publications and numbering them, carried out while filling the database, confirmed the viability of the proposed approach. This article is of interest to software developers.
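The duplicate check described in the abstract (transliteration followed by a Levenshtein-based similarity score) can be sketched as follows. The transliteration table is a simplified illustrative mapping for Ukrainian letters, not the table used by the authors, and the normalization by the longer title and the 0.9 threshold are assumptions chosen only for the example.

```python
# Simplified, illustrative Ukrainian-to-Latin mapping (an assumption,
# not the authors' conversion table).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "є": "ie", "ж": "zh", "з": "z", "и": "y", "і": "i", "ї": "i", "й": "i",
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "kh", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "shch", "ю": "iu", "я": "ia", "ь": "",
}


def transliterate(text):
    """Lower-case the text and replace Cyrillic letters with Latin ones."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text.lower())


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(title_a, title_b):
    """Similarity in [0, 1] of two titles after transliteration."""
    a, b = transliterate(title_a), transliterate(title_b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def is_duplicate(title_a, title_b, threshold=0.9):
    """Flag two titles as probable duplicates (threshold is an assumption)."""
    return similarity(title_a, title_b) >= threshold
```

For example, `is_duplicate("Облік публікацій", "oblik publikatsii")` compares the two titles in the same Latin form, so a record entered in Cyrillic can be matched against one entered in transliteration.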
Pages of the article in the issue: 141 - 146
Language of the article: Ukrainian
License
Copyright (c) 2024 Serhii Ivanov, Eugene Ivohin, Mykhailo Makhno
This work is licensed under a Creative Commons Attribution 4.0 International License.