Purpose: Journal Impact Factors and other citation-based indicators are widely used and abused to help select journals to publish in or to estimate the value of a published article. Nevertheless, citation rates primarily reflect scholarly impact rather than other quality dimensions, such as societal impact, originality, and rigour. In response to this deficit, Journal Quality Factors (JQFs) are defined and evaluated: a journal's JQF is the average of the quality scores that ChatGPT estimates for its articles.
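As a minimal formalisation (the notation below is introduced here for illustration and is not taken from the paper), a journal's JQF is simply the mean of the per-article scores:

```latex
% Sketch of the JQF definition; n_j and q_{ij} are illustrative symbols.
\[
  \mathrm{JQF}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} q_{ij}
\]
% n_j   : number of scored articles in journal j
% q_{ij}: ChatGPT's quality score estimate for article i of journal j
```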
Design/methodology/approach: JQFs were compared with Polish, Norwegian, and Finnish journal ranks and with journal citation rates for 1,300 large monodisciplinary journals containing 130,000 articles from 2021, covering the 25 of the 27 Scopus broad fields of research for which this was possible. Outliers were also examined.
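For concreteness, the aggregation and comparison step can be sketched in a few lines of Python. This is a minimal illustration only: the flat per-article table, its column names, and the use of Spearman correlation are assumptions, since the abstract does not specify the paper's actual pipeline or correlation measure.

```python
# Minimal sketch, not the paper's code: aggregate per-article ChatGPT
# quality scores into Journal Quality Factors (JQFs) and correlate them
# with national journal ranks. Column names are illustrative assumptions.
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical input: one row per article, with the journal it appeared
# in, ChatGPT's quality score for it, and the journal's national rank.
articles = pd.DataFrame({
    "journal": ["J1", "J1", "J2", "J2", "J3", "J3"],
    "chatgpt_score": [3.2, 2.8, 1.9, 2.1, 3.8, 3.6],
    "national_rank": [2, 2, 1, 1, 3, 3],  # e.g. a national journal tier
})

# JQF = mean ChatGPT quality score over a journal's articles.
jqf = articles.groupby("journal").agg(
    jqf=("chatgpt_score", "mean"),
    rank=("national_rank", "first"),
)

# Rank-based correlation between JQFs and journal ranks, which suits
# the ordinal nature of national journal tiers.
rho, p = spearmanr(jqf["jqf"], jqf["rank"])
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")
```

Run separately within each Scopus broad field, a procedure of this shape would yield one correlation per field, whose median corresponds to the summary figure reported in the findings below.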
Findings: JQFs correlated positively and mostly strongly (median correlation: 0.641) with journal ranks in 24 of the 25 broad fields examined, indicating a nearly science-wide ability of ChatGPT to estimate journal quality. However, journal citation rates had similarly high correlations with national journal ranks, so JQFs are not a universally better indicator. An examination of journals whose JQFs did not match their journal ranks suggested that abstract style may affect the result, such as whether the societal context of the research is mentioned.
Research limitations: Different journal rankings might have given different findings, because there is no agreed definition of journal quality.
Practical implications: The results suggest that JQFs are plausible journal quality indicators in all fields and may be useful in the (few) research and evaluation contexts where journal quality is an acceptable proxy for article quality, especially in fields, such as mathematics, where citations are not strong indicators of quality.
Originality/value: This is the first attempt to estimate academic journal value with a Large Language Model.