Is Google Gemini better than ChatGPT at evaluating research quality?

  • Information School, University of Sheffield, UK
† Mike Thelwall (E-mail: m.a.thelwall@sheffield.ac.uk; ORCID: https://orcid.org/0000-0001-6065-205X)

Received date: 2024-12-09

Revised date: 2024-12-24

Accepted date: 2024-12-25

Online published: 2025-01-08

Abstract

Google Gemini 1.5 Flash scores were compared with those of ChatGPT 4o-mini on evaluations of (a) 51 of the author’s journal articles and (b) up to 200 articles in each of 34 field-based Units of Assessment (UoAs) from the UK Research Excellence Framework (REF) 2021. From (a), the results suggest that Gemini 1.5 Flash, unlike ChatGPT 4o-mini, may work better when given an article’s PDF or full text rather than just the title and abstract. From (b), Gemini 1.5 Flash seems to be marginally less able than ChatGPT 4o-mini to predict an article’s research quality (using a departmental quality proxy indicator), although the differences are small and both models show similar disciplinary variations in this ability. Averaging multiple runs of Gemini 1.5 Flash improves the scores.
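The averaging step mentioned above can be illustrated with a minimal sketch. It assumes that each run yields one numeric score per article and that agreement with the quality proxy is measured with a Spearman rank correlation; the abstract does not name the statistic, so that choice, the function names, and all the numbers below are hypothetical and for illustration only.

```python
from statistics import mean


def average_runs(runs):
    """Average per-article scores across repeated runs.

    runs: list of runs, each a list of scores in the same article order.
    """
    return [mean(scores) for scores in zip(*runs)]


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: three runs scoring the same four articles,
# plus a made-up departmental quality proxy for each article.
runs = [
    [3, 2, 4, 1],
    [2, 3, 4, 2],
    [3, 3, 3, 1],
]
proxy = [2.9, 2.5, 3.4, 2.0]

averaged = average_runs(runs)
print("single run vs proxy:", spearman(runs[0], proxy))
print("averaged   vs proxy:", spearman(averaged, proxy))
```

With real data, the single-run and averaged correlations would be compared across many articles; the toy values here only show the mechanics and say nothing about the size of any improvement.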

Cite this article

Mike Thelwall. Is Google Gemini better than ChatGPT at evaluating research quality?[J]. Journal of Data and Information Science, 0: 1-1. DOI: 10.2478/jdis-2025-0014
