An explorative study on document type assignment of review articles in Web of Science, Scopus and journals’ websites

Manman Zhu; Xinyue Lu; Fuyou Chen; Liying Yang; Zhesi Shen

doi:10.2478/jdis-2024-0003

Journal of Data and Information Science >

2024 , Vol. 9 >Issue 1: 11 - 36

DOI: https://doi.org/10.2478/jdis-2024-0003

Research Papers

An explorative study on document type assignment of review articles in Web of Science, Scopus and journals’ websites

Manman Zhu ¹^,²^,³^,^* ,
Xinyue Lu ¹^,²^,^* ,
Fuyou Chen ¹ ,
Liying Yang ¹^,² ,
Zhesi Shen ^,¹^,^†

Expand

¹National Science Library, Chinese Academy of Sciences, Beijing 100190, China
²Department of Information Resource Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
³Institute of Science and Technology Information, Jiangsu University, Zhenjiang 212013, China

†Zhesi Shen (Email: shenzhs@mail.las.ac.cn).

^*These authors contribute equally.

Received date: 2024-01-01

Revised date: 2024-01-21

Accepted date: 2024-01-29

Online published: 2024-01-31

Fold

Abstract

Purpose: Accurately assigning the document type of review articles in citation index databases like Web of Science(WoS) and Scopus is important. This study aims to investigate the document type assignation of review articles in Web of Science, Scopus and Publisher’s websites on a large scale.

Design/methodology/approach: 27,616 papers from 160 journals from 10 review journal series indexed in SCI are analyzed. The document types of these papers labeled on journals’ websites, and assigned by WoS and Scopus are retrieved and compared to determine the assigning accuracy and identify the possible reasons for wrongly assigning. For the document type labeled on the website, we further differentiate them into explicit review and implicit review based on whether the website directly indicates it is a review or not.

Findings: Overall, WoS and Scopus performed similarly, with an average precision of about 99% and recall of about 80%. However, there were some differences between WoS and Scopus across different journal series and within the same journal series. The assigning accuracy of WoS and Scopus for implicit reviews dropped significantly, especially for Scopus.

Research limitations: The document types we used as the gold standard were based on the journal websites’ labeling which were not manually validated one by one. We only studied the labeling performance for review articles published during 2017-2018 in review journals. Whether this conclusion can be extended to review articles published in non-review journals and most current situation is not very clear.

Practical implications: This study provides a reference for the accuracy of document type assigning of review articles in WoS and Scopus, and the identified pattern for assigning implicit reviews may be helpful to better labeling on websites, WoS and Scopus.

Originality/value: This study investigated the assigning accuracy of document type of reviews and identified the some patterns of wrong assignments.

Key words： Document type; Web of Science; Scopus; Review article

Cite this article

Manman Zhu , Xinyue Lu , Fuyou Chen , Liying Yang , Zhesi Shen . An explorative study on document type assignment of review articles in Web of Science, Scopus and journals’ websites[J]. Journal of Data and Information Science, 2024 , 9(1) : 11 -36 . DOI: 10.2478/jdis-2024-0003

1 Introduction

1.1 Reviews play an importance role in science and science communication

As Garfield (1996) described,

“Reviews play an essential role in scientific communication and understanding. In terms of the inherent characteristics of the review, they can provide a synthesis of the proliferating fragmented knowledge appearing in the plethora of foreign and domestic journals in a specialty or subspecialty. As such, they can elucidate trends in research and point to unanswered questions that provide opportunities for future study. Reviews also give science policymakers as well as researchers a clearer insight into the potential importance of emerging knowledge.”

In addition, reviews provide excellent and stimulating reading for the general reader and researcher dedicated to cross-disciplinary study, because they advance our perceptions of the relationships between different research efforts. The value of a review does not exist solely in the author’s synthesis of previously published papers; the bibliography in a review usually is a high quality list of core articles about the subject. In all, writing a review will certainly do as much for the advancement of science as those who do the original research.

1.2 The necessity of accurately assigning document type of reviews in databases

Despite the importance of reviews in science and science communication, the effect of reviews on scientometrics analysis is also significant. Reviews tend to be more frequently cited (Aksnes, 2003; Moed, 2010; Teixeira et al., 2013). Correlated with this overcitation, there is an overrepresentation of reviews in the highly cited papers, and this overrepresentation becomes greater when the most highly cited papers are considered (Miranda et al., 2018). Moreover, 20% reviews can increase the average citations of an individual researcher by 40%-80%. Consequently, researchers boost their citations by publishing reviews, and journals increase their Impact Factor by publishing reviews (Ketcham et al., 2007; Lei et al., 2020; Teixeira et al., 2013).

Review a will also affect the citation of the articles reviewed. An alarming trend within the biological/biomedical science has been noted that the authors prefer to cite review articles rather than the original article when writing literature review(Ketcham et al., 2007; Teixeira et al., 2013). It is more efficient to cite reviews than all the individual studies, but the scientific credit to the time-consuming original studies will be absorbed by the reviews and the review authors (Ketcham et al., 2007; Lachance et al., 2014; Teixeira et al., 2013). Ho et al. (2017) out pointed that review papers will affect the main path analysis and clustering analysis. When conducting bibliometrics research and evaluating scientific research achievements, we should decide which document type to be included and whether to treat articles of different document types separately(Lei et al., 2020). To facilitate the above process, highly accurate assignment of review articles in databases is required. Wrongly assigned document type has great impact on the citation-based evaluation(Donner, 2017; Zhu et al., 2022).

1.3 Definition of Review in databases and related studies of the document type assignment of reviews in databases

WoS describes the Review^①(① http://webofscience.help.clarivate.com/en-us/Content/document-types.html (accessed at 2023.08.30)) as

“Detailed, critical surveys of published research. A review article may summarize previously published studies and draw some conclusions but will not present new information on the subject. Includes Reviews, Review of Literature, Mini-reviews, and Systematic reviews. If an article is listed under the review section in a journal and/or Review of Literature appears in the title it will be assigned a review.

If an article is not assigned a review by the journal but Review, Systematic Review or Mini-review appears in the title, it must also appear someplace else in the article (abstract/summary or introduction) in order to be assigned the document type review.

NOTE: If the article(s) meet the above criteria - they must have References in order to be tagged as a Review item.

Review articles that were presented at a Symposium or Conference will be processed as Proceedings Papers.”

This description has been accessed recently. Several years before, the description of “Review” in WoS is a renewed study or survey of previously published literature providing new analysis or summarization of the research topic(WoS, 2023). And several criteria are used to determine whether a paper is a review. such as the following:

“In the JCR system any article containing more than 100 references is coded as a review. Articles in ‘Review’ sections of research or clinical journals are also coded as reviews, as are articles whose titles contain the word ‘Review’ or ‘overview’”(Garfield, 1994).

The “more than 100 references” criteria had been removed in 2010. The change of criteria may lead to some changes to the statistics. For example, in a 1987 paper, Eugene Garfield pointed out that there were 625,432 articles indexed in the 1986 SCI and of which approximately 32,000 are reviews (Garfield, 1987). But when we search the SCI in June 2021, total articles indexed in 1986 are 709,136 and 13,197 are reviews.

Scopus describes Review as

“A significant review of original research also includes conference papers. Reviews typically have an extensive bibliography. Educational items that review specific issues within the literature are also considered to be reviews. As non-original articles, reviews lack the most typical sections of original articles such as materials & methods and results”(McCullough, 2023).

In Scopus, there is another document type related to review called “Short survey”, which is described as

“Short or mini-review of original research. Short surveys are similar to reviews, but usually are shorter (not more than a few pages) and with a less extensive bibliography.”

We can see that the description of reviews in Scopus and WoS is mainly related to the words used in titles and abstracts, the length of the reference list and article structure. Colebunders et al. (2013) compared the number of records related to reviews retrieved in WoS via different strategies, i.e., (1) based on the WoS document type, (2) having either the word review or the word overview in the title, and (3) a topic search (TS=) for the words review or reviews. It is found that the absolute and relative numbers of reviews differ depending on which of the three definitions are used. Harzing (2013) reported a comprehensive analysis of document categories for 27 journals in nine Social Science and Science disciplines and showed that WoS may misclassify social science journal articles containing original research into the “review” or “proceedings paper” category. The possible reason is length of references of in social science articles is larger than 100.

Several studies compared the document type assignment accuracy of citation index database with other sources(e.g., manually coded, publisher’s websites). Hayashi et al. (2013) compared the records’ document type of 18 research journals of Nature Publishing Group in WoS, Scopus, and the website and found that all “Review” items in the website were labeled as “Review” in both WoS and Scopus, and some papers of other types are also labeled as “Review” in WoS and Scopus. As the authors didn’t further report the details of these reviews labeled in WoS and Scopus, we cannot infer the real accuracy. Donner (2017) reported a study on the document type assignment accuracy of 791 randomly selected papers in WoS and Scopus. When only focusing on these selected papers, the accuracy of WoS (83%) is higher than Scopus(76%). The study also statistically inferred the WoS overall proportion of correctly assigned DT is 0.94, but for the reviews, the precision is 0.87 and recall is 0.57. Yeung (2019) examined the DT assignment accuracy of 400 top-cited publications defined by Scopus as ‘article’ in the field of food and nutritional sciences. Among these 400 publications, 117 were manually coded as reviews. Further, for these 117 reviews, 111 is indexed in WoS and 55/111 were wrongly labeled. Another interesting observation is that, the publisher' website labeled 52/117 reviews as articles.

Reviews have always been a very important research object in the field of Scientometrics and Informetrics, and the research directions about reviews had been discussed at a workshop at the Conference of the International Society of Informetrics in September 2019, in which participants identified six realms of study. One of the themes is “the study of methodological caveats resulting from the usage of scholarly databases”, such as the lack of accuracy of document assignation in scholarly databases (Blümel et al., 2020). In this work, we’d like to analyze the accuracy of document type assignment of review articles in WoS and Scopus on a large scale and identify the possible reasons for wrongly assigning.

2 Data and methods

2.1 Data collection

In the publishing ecosystem, there are several journal series mainly (or only) publishing review articles, e.g., the Annual Reviews series, Nature Reviews series. These journals can be treated as appropriate data sources for us to investigate the correctness of document type assignment of reviews in databases. For example, as shown in Figure 1, a paper published by Nature Reviews Cancer, has a document type annotation on its official website, and can be further compared with the corresponding document type provided in WoS and Scopus. In this study, we selected 160 SCI journals included in Journal Citation Report 2019 from five series of pure review journals (only publishing review articles) and four series of mixed review journals (mainly publishing review articles) as shown in Table 1.

Series of review journal	Type of Review Journal	NO. of journals	NO. of papers
Annual Reviews	Pure	39	1,842
Cell Trends In series	Pure	15	3,206
Wolters Kluwer Current Opinion	Pure	24	4,333
Reviews of Modern Physics	Pure	1	86
WIREs-Wiley Interdisciplinary Reviews	Pure	9	755
Elsevier Current Opinion	Mixed	20	4,983
Nature Reviews	Mixed	18	5,975
Taylor & Francis Expert Opinion	Mixed	11	2,519
Taylor & Francis Expert Review	Mixed	13	2,737
Taylor & Francis Critical Review	Mixed	10	1,180
Total	-	160	27,616

Type	Annual Reviews	Total	Web of Science				Scopus
Type	Annual Reviews	Total	review	article	others	not indexed	review	article	short survey	others	not indexed
Review	Explicit	1,795	1,501	292	2	0	1,285	505	0	5	0
Other paper		47	2	0	22	23	4	2	1	31	9
Total		1,842	1,503	292	24	23	1,289	507	1	36	9

Type	Cell Trends	Total	Web of Science				Scopus
Type	Cell Trends	Total	review	article	other	not indexed	review	article	short survey	other	not indexed
Review	All	2,876	2,121	16	739	0	2,139	2	728	7	0
	Explicit	2,139	2,121	16	2	0	2,137	2	0	0	0
	Mini Review	737	0	0	737	0	2	0	728	7	0
Other paper		331	0	0	310	21	0	1	0	330	0
Total		3,207	2,121	16	1,049	21	2,139	3	728	337	0

Website Type	Website Description	Mapping Type
Advanced Review	These articles review key areas of research in a citation-rich format similar to that of leading review journals.	explicit review
Focus Article	These articles are mini-reviews, and which therefore illustrate aspects of larger ideas covered in Overviews and Advanced Reviews.	implicit review
Primer	Meant to be understood by a very general audience. These articles should provide orientation to the key theories, knowledge, uncertainties, and controversies in the field.	implicit review
Overview	Broad and relatively non-technical treatment of important topics at a level. These articles must refer to the key articles/books in the field (not exhaustive but comprehensive).	implicit review
Software Focus	These articles should review the capabilities of the software and how it has been and can be applied.	implicit review
Perspective	A forum for thought-leaders, hand-picked. They should cite literature which authenticates their argument(s), but without the need to be exhaustive or comprehensive.	implicit review

Type		Total	Web of Science				Scopus
Type		Total	review	article	others	not indexed	review	article	others	not indexed
Review	All	731	487	241	3	0	466	150	0	115
	Explicit	386	383	3	0	0	336	14	0	36
	Implicit	345	104	238	3	0	130	136	0	79
Other paper		201	1	12	10	178	6	8	9	178
Total		932	488	253	13	178	472	158	9	293

Type		Total	Web of Science				Scopus
Type		Total	review	article	others	not indexed	review	article	short survey	others	not indexed
Review	All	1,799	1,501	207	1	90	1,589	123	0	30	57
	Explicit	1,788	1,496	201	1	90	1,586	115	0	30	57
	Implicit	11	5	6	0	0	3	8	0	0	0
Other paper		4,178	3	11	,3346	818	127	784	301	2,770	196
Total		5,977	1,504	218	3,347	908	1,716	907	301	2,800	253

模态框（Modal）标题

Abstract

Cite this article

1 Introduction

1.1 Reviews play an importance role in science and science communication

1.2 The necessity of accurately assigning document type of reviews in databases

1.3 Definition of Review in databases and related studies of the document type assignment of reviews in databases

2 Data and methods

2.1 Data collection

Figure 1. An example of the document type annotation of a review paper on its official Website, Web of Science and Scopus.

Table 1. List of review journal series investigated.

2.2 Measurement of assignment accuracy

Figure 2. Examples of section headings and document type annotations for mixed review journals.

3 Result

3.1 Descriptive results of review mark for pure review journals

3.1.1 Annual Review journal series

Table 2. Assignment matrix for Annual Reviews series.

3.1.2 Cell Trends In journal series

Figure 3. Examples of document type annotation on the website of Cell Trends In series.

Table 3. Assignment matrix for Cell Trends series.

3.1.3 Wolters Kluwer Current Opinion journal series

Figure 4. Document Type annotation on the website of Current Opinion series.

Table 4. Assignment matrix for Wolters Kluwer Current Opinion series.

3.1.4 Review of Modern Physics

Table 5. Assignment matrix for Reviews of Modern Physics

3.1.5 WIREs journal series

Table 6. Official website descriptions of the main types for WIREs series.

Table 7. Assignment matrix for WIREs series.

3.2 Descriptive results of Review assignment for mixed review journals

3.2.1 Elsevier Current Opinion series

Table 8. Assignment matrix for Elsevier Current Opinion series.

3.2.2 Nature Reviews series

Table 9. Assignment matrix for Nature Reviews series.

Figure 5. Distribution of document types for other papers on websites, WoS and Scopus.

Figure 6. Example of PrimerViews and Primer in Nature Reviews Disease Primers.

3.2.3 Taylor & Francis Expert Opinion series

Figure 7. The annotation in the website of Taylor & Francis Expert Opinion series.

Table 10. Assignment matrix for Taylor & Francis Expert Opinion.

3.2.4 Taylor & Francis Expert Review series

Table 11. Assignment matrix for Taylor & Francis Expert Review.

3.2.5 Taylor & Francis Critical Reviews

Table 12. Assignment matrix for Taylor & Francis Critical Reviews.

3.3 Overview of assignment performance for these review journal series

Figure 8. Assignment precision and recall of review articles. (a)-(d) respectively show the total precision, total recall, explicit review recall and implicit review recall. (d) just represent the results of 6 mixed review journal series.

4 Conclusion and discussion

Acknowledgements

Author contributions

Data availability

Appendix

Figure A1. Correspondence of document types on websites, WoS and Scopus for Cell Trends In journal series.

Figure A2. Correspondence of document types on websites, WoS and Scopus for Review of Modern Physics.

Figure A3. Correspondence of document types on websites, WoS and Scopus for WIREs journal series.

Figure A4. Correspondence of document types on websites, WoS and Scopus for Elsevier Current Opinion series.

Figure A5. Correspondence of document types on websites, WoS and Scopus for Nature Reviews series.

Figure A6. Correspondence of document types on websites, WoS and Scopus for Taylor & Francis Expert Opinion Series.

Figure A7. Correspondence of document types on websites, WoS and Scopus for Taylor & Francis Expert Review Series.

Figure A8. Correspondence of document types on websites, WoS and Scopus for Taylor & Francis Critical Reviews.

References