Research Paper

Mining Related Articles for Automatic Journal Cataloging

  • Yuqing Mao & Zhiyong Lu
  • 1 School of Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China;
    2 National Center for Biotechnology Information, National Library of Medicine, MD 20894, USA

Received date: 2015-12-14

Revised date: 2016-01-05

Online published: 2016-06-15

Supported by

We would like to thank Dr. John Wilbur for his helpful discussions on this project. This research is supported by the NIH Intramural Research Program, National Library of Medicine.

Abstract

Purpose: This paper investigates the effectiveness of clustering biomedical journals by mining the content similarity of their articles.
Design/methodology/approach: A total of 3,265 journals in PubMed are analyzed based on article content similarity and Web usage, respectively. The two analysis approaches and a citation-based approach are then compared.
Findings: Our results suggest that article content similarity is useful for clustering biomedical journals, and that the content-similarity-based journal clustering method is more robust and less subject to human factors than the usage-based and citation-based approaches.
Research limitations: Our paper currently focuses on clustering journals in the biomedical domain because a large volume of freely available resources, such as PubMed and MeSH, exists in this field. Further investigation is needed to adapt this approach to journals in other domains.
Practical implications: Our results show that it is feasible to catalog biomedical journals by mining article content similarity. This work also serves practical needs in research portfolio analysis.
Originality/value: To the best of our knowledge, we are among the first to report on clustering journals in the biomedical field by mining article content similarity. This method can be integrated with existing approaches to create a new paradigm for future studies of journal clustering.


http://ir.las.ac.cn/handle/12502/8597

Cite this article

Yuqing Mao & Zhiyong Lu. Mining Related Articles for Automatic Journal Cataloging[J]. Journal of Data and Information Science, 2016, 1(2): 45-59. DOI: 10.20309/jdis.201613

