Research Papers

A comparative study on characteristics of retracted publications across different open access levels

  • Er-Te Zheng 1, 2
  • Hui-Zhen Fu 1, †
  • 1Department of Information Resources Management, School of Public Affairs, Zhejiang University, Hangzhou 310058, China
  • 2School of Information Resource Management, Renmin University of China, Beijing 100872, China
†Hui-Zhen Fu (Email: ; ORCID: 0000-0002-1534-9374).

Received date: 2023-08-20
Revised date: 2024-02-20
Accepted date: 2024-02-21
Online published: 2024-03-26

Abstract

Purpose: Global science has recently become increasingly open; however, the research integrity characteristics of open access (OA) publications have rarely been studied. The aim of this study is to compare the characteristics of retracted articles across different OA levels and to determine whether OA level influences those characteristics.

Design/methodology/approach: The research conducted an analysis of 6,005 retracted publications between 2001 and 2020 from the Web of Science and Retraction Watch databases. These publications were categorized based on their OA levels, including Gold OA, Green OA, and non-OA. The study explored retraction rates, time lags and reasons within these categories.

Findings: The research revealed distinct patterns in retraction rates among different OA levels. Publications with Gold OA had the highest retraction rate, followed by Green OA and non-OA. A comparison of retraction reasons showed similar proportions for Gold OA and non-OA, while Green OA had a higher proportion of retractions due to falsification and manipulation, along with a lower proportion due to plagiarism and authorship issues. The retraction time lag was shortest for Gold OA, followed by non-OA, and longest for Green OA. The prolonged retraction time for Green OA could be attributed to an atypical distribution of retraction reasons.

Research limitations: There is no exploration of a wider range of OA levels, such as Hybrid OA and Bronze OA.

Practical implications: The outcomes of this study suggest the need for increased attention to research integrity within the OA publications. The occurrences of falsification, manipulation, and ethical concerns within Green OA publications warrant attention from the scientific community.

Originality/value: This study contributes to the understanding of research integrity in the realm of OA publications, shedding light on retraction patterns and reasons across different OA levels.

Cite this article

Er-Te Zheng, Hui-Zhen Fu. A comparative study on characteristics of retracted publications across different open access levels[J]. Journal of Data and Information Science, 2024, 9(2): 22-40. DOI: 10.2478/jdis-2024-0010

1 Introduction

Scientific publications are regarded as the cornerstone reflecting the development of the scientific community (Shah et al., 2021). Research integrity is important because trust characterizes science and its relationship with society (Olson & Griffiths, 1995). Misconduct and errors in publications undermine academic development and public trust in science; the retraction of dubious publications therefore represents the fulfilment of a social responsibility (Vuong et al., 2020). Retraction is a vital means of self-purification in the scientific community, and it can reduce the negative influence of flawed research.
Open access (OA) aims to promote the transparency of results and widen the diffusion of knowledge (European Commission, 2020). Post-publication scrutiny of OA publications by a large number of readers may accelerate the detection of misconduct and errors in flawed publications (Shah et al., 2021). A connection between OA and research integrity has been identified: openness is increasingly recognized as a driver of responsible research practices (Tijdink et al., 2021). Openness in research is more than access to research; it also brings equality to the research process (Nosek et al., 2018).
However, several challenges in OA need to be addressed. OA journals are often criticized for high article processing charges (Björk & Solomon, 2015) and for a lack of transparency in the review process (Butler, 2013; Bohannon, 2013). Moreover, fake science (such as “predatory journals”) that exploits the OA publishing business model is emerging (Shen & Björk, 2015) and dilutes high-quality research (European Commission, 2020). Therefore, despite its advantages, OA also poses potential threats to research integrity.
Previous studies have analyzed the relationship between OA and research integrity. However, there are few empirical studies on OA retracted publications based on a comprehensive literature dataset, especially on retracted publications across different OA levels. The aim of this study is to compare the characteristics of retracted articles across different OA levels and discover whether OA level influences the characteristics of retracted articles.

2 Literature review

2.1 Study on characteristics of retraction

Trends in retraction and reasons for retraction are the aspects that have received the most attention in previous empirical studies of retracted publications. In terms of trends, studies have revealed that the number of retracted publications grew rapidly over the past 20 years (Steen et al., 2013; Vuong et al., 2020; Zhang & Grieneisen, 2013). Specifically, the number of retracted publications started rising in 2006-2010 (Bar-Ilan & Halevi, 2018; He, 2013; Shuai et al., 2017), fluctuated during 2010-2012, and kept climbing rapidly in 2016-2021 (Sharma, 2021). The increase in retractions was due not only to rising misconduct and errors in research, but also to researchers and editors becoming more skilled at identifying flawed publications (Fanelli, 2013). In addition, the reasons for retraction expanded over time, prompting more post-publication scrutiny of articles (Steen et al., 2013).
The reasons for retraction, which are closely related to the misbehavior of scientists, have attracted wide attention. Plagiarism and data falsification were the most common reasons for retraction in obstetrics and gynecology (Chambers et al., 2019) and in biomedical research in India (Elango, 2021). For biomedical studies in China, the common reasons for retraction were plagiarism, errors, self-plagiarism, and fake peer review (Chen et al., 2018). Most retractions in Iran were due to fake peer review and plagiarism (Ghorbi et al., 2021). A large-scale study of over 18,000 retracted articles covering 127 research fields found fake peer review to be the most common reason for retraction (Vuong et al., 2020). The reason for retraction significantly affects the retraction time lag of an article. Falsification and errors usually take longer to surface through post-publication scrutiny than plagiarism because they are harder to detect (Dal-Ré, 2019; Trikalinos, 2008). Publications with falsification issues were found to take the longest time to be retracted among all retraction reasons (Elango, 2021). In this study, we compared the trends and reasons for retraction among articles with different OA levels, and tried to explain their differences in retraction time lag from the perspective of reasons for retraction.

2.2 Study on OA retracted publications

Only a few studies have focused on retracted publications from the perspective of OA. Comparing OA and non-OA retracted publications, Peterson (2013) found that OA literature did not differ from non-OA literature in impact factor, detection of errors, or change in post-retraction citation rates. However, the dataset covered only biomedicine, which limits the applicability of the findings to other disciplines. Shah et al. (2021) reported that the retraction rate for OA articles was 62% higher than that for non-OA articles, and that non-OA publications were retracted earlier. However, that study did not investigate the reasons for retraction, and it reduced the OA types to only two, which might miss distinctive types of OA.
Generally, most studies discussed only the characteristics of OA retracted publications themselves, without comparison to non-OA retracted publications. Moreover, characteristics such as reasons for retraction and retraction time lag, which are closely related to research integrity, have rarely been discussed. In addition, studies mostly focused on specific disciplines, such as the biomedical field (Elango, 2021; Freedman & Inglese, 2014; Stojanovski, 2015; Wang et al., 2019) or obstetrics and gynecology (Chambers et al., 2019). Few studies have examined the characteristics of different OA levels based on a comprehensive dataset covering all disciplines. This study examines the characteristics of retracted publications at different OA levels in more depth and from multiple perspectives, including trends of retraction, reasons for retraction, and retraction time lag.

2.3 Research questions

Attention is a key predictor of retraction (Furman et al., 2012), and OA can increase the attention an article receives (Vadhera et al., 2022; Wang et al., 2015). With free online accessibility, OA articles might be exposed more quickly to more readers, and post-publication scrutiny by more readers can lead to earlier detection of fraudulent publications (Foo, 2011; Shah et al., 2021). This study further explores the characteristics of retracted publications across different OA levels, to find out whether OA level makes a difference in the retraction of scientific publications. Specifically, our research questions are as follows:
(1) What are the characteristics of retracted publications across different OA levels from the perspectives of trends and reasons for retraction?
(2) Do scientific publications with higher OA levels have higher retraction rates?
(3) Do scientific publications with higher OA levels get retracted faster?

3 Data and methodology

3.1 Data collection

This study collected information on retracted publications from two comprehensive databases, namely Web of Science (WoS) and Retraction Watch. The Science Citation Index Expanded (SCIE) and Social Sciences Citation Index (SSCI) of the WoS Core Collection were selected, which mitigates the influence of low-quality OA journals (such as “predatory journals”). The Retraction Watch database assembles information on retracted publications identified by several databases and contains abstracts dating back to the 1970s. It is currently the largest and most visible database of retracted publications (Brainard & You, 2018; Oransky, 2018; Retraction Watch, 2018). By collecting and cross-checking publications from these two databases, we ensured the richness of the data and the credibility of the research results.
This study adopted the same search strategy as our previous study (Zhang et al., 2020). The strategy mainly contains three steps: (i) searching retracted publications and retraction notices in SCIE and SSCI databases in WoS between 2001 and 2020; (ii) searching the corresponding reasons for retraction in Retraction Watch database; (iii) combining records of retracted publications, retraction notices and reasons for retraction by matching them with titles. The publication year, author, title, journal, total and annual citations, OA level, retraction year, full texts of retraction notice, detailed and classified reason for retraction, and other information related to retraction were extracted. Finally, 6,005 retracted publications were obtained for further analysis.
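Step (iii) of the strategy, combining records by title, can be illustrated with a minimal sketch. It assumes each database export has already been loaded as a list of dictionaries; the field names (`title`, `pub_year`, `retraction_year`, `reason`) and the `normalize_title` helper are hypothetical and do not come from WoS or Retraction Watch. Lowercasing and stripping punctuation is one plausible normalization, not necessarily the procedure used in the study.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation/whitespace (assumed normalization)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_by_title(wos_records, rw_records):
    """Attach Retraction Watch reasons to WoS records that share a normalized title."""
    reasons = {normalize_title(r["title"]): r["reason"] for r in rw_records}
    merged = []
    for rec in wos_records:
        key = normalize_title(rec["title"])
        if key in reasons:  # keep only records found in both sources
            merged.append({**rec, "reason": reasons[key]})
    return merged


# Hypothetical toy records, for illustration only.
wos = [
    {"title": "A Study of X: Results", "pub_year": 2010, "retraction_year": 2013},
    {"title": "Unmatched paper", "pub_year": 2012, "retraction_year": 2014},
]
rw = [{"title": "a study of x - results", "reason": "Plagiarism of text"}]

matched = merge_by_title(wos, rw)
print(matched)
```

In practice, exact matching on normalized titles would likely leave some records unmatched (for example, titles carrying a "RETRACTED:" prefix), so a manual check of the remainder would still be needed.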

3.2 Classification of OA level

In this study, we selected two different types of OA, namely Gold OA and Green OA, and used the results of non-OA as a reference for these two OA categories. The primary difference between Gold OA and Green OA is explained in Chan et al. (2002), which recommends two complementary strategies for achieving open access to scholarly journal literature. The first strategy involves self-archiving, where scholars deposit their articles in open electronic archives, now known as Green OA. The second strategy pertains to open-access journals, where scholars publish their articles in fully open access journals, referred to today as Gold OA.
(1) High OA level - Gold OA: A freely accessible final version of an article, including articles published in journals listed in the Directory of Open Access Journals (DOAJ) and articles identified by ImpactStory’s Unpaywall Database as having a Creative Commons license but not published in journals listed in the DOAJ (WoS, 2021). In this study, we assigned “Gold” and “Gold Hybrid” publications to the Gold OA group. If a publication was tagged as both Gold OA and Green OA in WoS, we assigned it to Gold OA, because the OA level of Gold OA is higher than that of Green OA.
(2) Low OA level - Green OA: A freely accessible version of an article located in an institutional or discipline-based OA repository. We assigned “Green accepted” and “Green published” publications to the Green OA group.
(3) non-OA: Only subscribers can access the full text of the article. We assigned publications without the “Open Access” tag in WoS to the non-OA group.
Of the 6,005 retracted publications, 1,454 (24%) were Gold OA, 329 (6%) were Green OA, and 4,222 (70%) were non-OA, the lowest OA level.
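The priority rule described above (Gold over Green, and non-OA when no OA tag is present) can be sketched as follows. The tag strings are the ones quoted in the text; representing a record's OA status as a list of tag strings, and returning "unclassified" for any other tag, are assumptions of this sketch, since the text does not say how other tags were handled.

```python
GOLD_TAGS = {"Gold", "Gold Hybrid"}
GREEN_TAGS = {"Green accepted", "Green published"}


def classify_oa(tags):
    """Return the highest OA level implied by a record's OA tags."""
    tags = set(tags)
    if tags & GOLD_TAGS:   # Gold takes priority when both Gold and Green are present
        return "Gold OA"
    if tags & GREEN_TAGS:
        return "Green OA"
    if not tags:           # no "Open Access" tag at all
        return "non-OA"
    return "unclassified"  # assumption: any other tag is left out of the three groups


print(classify_oa(["Green published", "Gold"]))  # Gold OA
print(classify_oa(["Green accepted"]))           # Green OA
print(classify_oa([]))                           # non-OA

# The group sizes reported in the text add up to the full dataset.
group_sizes = {"Gold OA": 1454, "Green OA": 329, "non-OA": 4222}
print(sum(group_sizes.values()))                 # 6005
```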

3.3 Reasons for retraction

The reasons for retraction classified by the Retraction Watch database can be grouped into eight broad categories. The classification is the same as in our previous study (Zhang et al., 2020) and is shown in Table 1.
Table 1. Category of reasons for retraction.
Major reason | Original reasons identified by Retraction Watch database
Error and concern | Error in image/data/text/results and/or conclusions/methods/materials (general)/cell lines/tissues/analyses; Concerns/issues about image/data/results/referencing/attributions; Contamination of reagents/materials (general)/cell lines/tissues; Unreliable data/image/results; Results not reproducible
Plagiarism | Plagiarism of text/image/data/article; Euphemisms for plagiarism
Self-plagiarism | Self-plagiarism of text/image/data/article; Euphemisms for self-plagiarism; Salami slicing
Falsification and manipulation | Falsification/fabrication of results/image/data; Manipulation of results/images; Hoax publication; Paper mill; Fake peer review; Sabotage of materials/methods
Authorship issues | Forged authorship; Concerns/Issues about authorship
Ethical issues | Legal reasons/legal threats; Civil/criminal proceedings; Ethical violations; Lack of ethical approval; Informed/patient consent-none/retracted; Infringement of patient privacy; Lack of balance/bias issues; Conflict of interest; Copyright claims
Others | Other reasons for retraction
Not available/ lack of information

3.4 Indicators

3.4.1 Retraction rate

The retraction rate of articles of one OA type is the ratio of the number of retracted publications to the total number of publications of that OA type, i.e., $\text{Retraction rate} = n_{\text{retraction}} / n_{\text{publication}}$. We used the retraction rate, combined with the number of retracted publications, to describe the trends of retraction and the reasons for retraction across different OA levels. The value of the retraction rate is related to the proportion of potentially flawed articles. To examine the impact of this proportion on the results, we investigated the proportion of retracted articles across Journal Impact Factor quartiles in WoS (see Figure A1 in Appendix 1). The analysis revealed that no OA type has a disproportionately high percentage of journals in low quartiles. This suggests that the comparison of retraction rates is not significantly influenced by differences in the proportion of potentially flawed articles across OA types.
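As a minimal sketch of this indicator, the function below computes the ratio and scales it to retractions per ten thousand publications (‱), the unit in which the rates are reported in Section 4.1.1. The function name, the `per` parameter, and the example counts are hypothetical; the counts are chosen only to make the arithmetic easy to follow and are not the study's data.

```python
def retraction_rate(n_retraction, n_publication, per=10_000):
    """Retracted publications per `per` publications (default: per ten thousand)."""
    if n_publication <= 0:
        raise ValueError("n_publication must be positive")
    return per * n_retraction / n_publication


# Hypothetical counts: 19 retractions among 100,000 publications.
example = retraction_rate(n_retraction=19, n_publication=100_000)
print(f"{example:.2f} per 10,000")  # 1.90 per 10,000
```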

3.4.2 Retraction time lag

Retraction speed is another vital indicator, and it is usually measured by the retraction time lag. The retraction time lag is the interval between the publication time and the retraction time of an article, which characterizes how long it takes the scientific community to detect and retract flawed publications.
We used the publication year of the article as the publication time, and the publication year of the retraction notice as the retraction time. The difference between them was the retraction time lag. Based on the retraction time lag, we drew the survival rate curve of retracted publications to reveal the retraction speed of each OA level. We further examined the retraction time lag of the different OA levels for each reason for retraction, in order to find out how and why the retraction time lag varied across OA levels.
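The two quantities defined above can be sketched in a few lines. The lag follows the definition in the text (retraction notice year minus publication year). For the survival curve, this sketch assumes that a retracted publication counts as still "alive" after t years when its lag is greater than t; the exact convention behind the published curves is not stated, and the year pairs below are made up for illustration.

```python
def retraction_time_lag(publication_year, retraction_year):
    """Years between publication and the retraction notice."""
    return retraction_year - publication_year


def survival_curve(lags, max_lag=None):
    """Share of retracted publications not yet retracted after t years, t = 0..max_lag."""
    if not lags:
        raise ValueError("lags must not be empty")
    if max_lag is None:
        max_lag = max(lags)
    n = len(lags)
    return [sum(lag > t for lag in lags) / n for t in range(max_lag + 1)]


# Hypothetical (publication year, retraction year) pairs.
records = [(2010, 2010), (2011, 2012), (2012, 2014), (2009, 2012), (2008, 2013)]
lags = [retraction_time_lag(p, r) for p, r in records]
mean_lag = sum(lags) / len(lags)
curve = survival_curve(lags)

print(lags)      # [0, 1, 2, 3, 5]
print(mean_lag)  # 2.2
print(curve)     # [0.8, 0.6, 0.4, 0.2, 0.2, 0.0]
```

Because only years are used, a lag of 0 means that publication and retraction fell in the same calendar year, so lags are resolved to the year rather than to the month.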

4 Results

4.1 Retraction rate of OA level

4.1.1 Trends of retraction

The trends in the number and retraction rate of retracted publications across different OA levels are shown in Figure 1. The number of OA journals increased quickly, from 80 to 4,672, between 2001 and 2020. Similarly, the number of OA retracted publications increased from 2001 to 2014 (Figure 1a). Non-OA had the largest number of retracted publications among the OA levels. Gold OA took second place; its number surged from 2009 and peaked in 2014. Green OA had the fewest retracted publications, with relatively gentle fluctuation during the investigated period.
Figure 1. Trends of number of retracted publications (a) and retraction rate of different OA levels (b). The horizontal axis is the publication year.
Figure 1b shows that the gap in retraction rate among different OA levels was small before 2010. After 2010, the retraction rate of Gold OA was significantly higher than that of Green OA and non-OA, but it declined substantially from 2015 and approached the levels of Green OA and non-OA by 2020. The retraction rates of Green OA and non-OA stayed low over the 20 years studied, and their curves followed similar trends. The turning point after 2014 can primarily be attributed to the retraction time lag: a publisher or journal needs time to examine, investigate, and respond before reaching a final decision on a flawed publication. We calculated the average retraction rate of different OA levels (Figure 2). The overall retraction rate is 1.90‱, which is lower than that of Gold OA and Green OA but higher than that of non-OA. The retraction rate of Gold OA was higher than that of Green OA, and that of Green OA was higher than that of non-OA, consistent with a previous finding that the retraction rate of OA articles was higher than that of non-OA articles (Shah et al., 2021).
Figure 2. Average retraction rate of different OA levels.

Note: *p < 0.05; **p < 0.01; ***p < 0.001; the same below.

Generally, the gap between the number of OA and non-OA publications narrowed year by year (Figure 1a). The number of Gold OA retracted publications increased quickly after 2010, which implies that retraction was particularly active among articles of high OA level.

4.1.2 Reasons for retraction

The six main categories of reasons for retraction, ranked in descending order of frequency, were error and concern, self-plagiarism, ethical issues, falsification and manipulation, plagiarism, and authorship issues. The proportion of reasons for retraction among different types of OA articles can be seen in Table 2. The difference in the proportions of reasons between Gold OA and non-OA was small, while the proportions for Green OA differed considerably from the other two. The proportions of falsification and manipulation and of error and concern in Green OA were much higher than those in Gold OA and non-OA, while Green OA had much lower shares of plagiarism and authorship issues.
Table 2. The proportion and retraction time lag of different reasons for retraction.
Figure 3 shows the retraction rate for each reason for retraction across different OA levels. The retraction rate of Gold OA was the highest for every reason. The retraction rate of non-OA was the lowest for error and concern, self-plagiarism, and falsification and manipulation, and higher than that of Green OA for plagiarism and authorship issues. Green OA had a retraction rate for ethical issues similar to that of non-OA.
Figure 3. Retraction rate of reasons for retraction across different OA levels.
Besides error and concern, Green OA had the highest proportion of falsification and manipulation among the three OA levels, while it unexpectedly had the lowest proportions of plagiarism and authorship issues. This could be partially explained by the author “Selection Bias” hypothesis (Craig et al., 2007). Green OA articles are uploaded to OA repositories by the authors themselves, and authors tend to avoid uploading publications with easily detected issues, such as plagiarism. Therefore, the proportion of Green OA retractions with these issues was relatively low. In contrast, to enhance their scientific impact, authors may take the risk of uploading publications with issues that are difficult to detect, such as falsification and manipulation, which are considered harder to uncover (Dal-Ré, 2019; Gerber, 2006; Trikalinos, 2008). This behavior may be one of the reasons for the higher proportion of these issues in Green OA.

4.2 Retraction time lag of OA level

4.2.1 The retraction time lag of different OA level

Figure 4a shows the average retraction time lag of publications of different OA levels. The overall retraction time lag is 2.95 years, exceeding that of Gold OA, yet remaining lower than those observed for both non-OA and Green OA. The retraction time lag of Gold OA was the shortest, followed by non-OA, while that of Green OA was the longest.
Figure 4. Retraction time lag (a) and survival rate curve (b) of different OA levels.
We drew the survival rate curve of retracted publications across different OA levels (Figure 4b) to reveal the retraction speed of each OA level in more detail. The horizontal axis is the retraction time lag, and the vertical axis is the percentage of problematic publications that were still “alive” (not yet retracted) in that year. Among all retracted publications in Green OA, 56% were still “alive” in the first three years after publication, while the survival rate was about 41% for non-OA and only 36% for Gold OA. To ensure the robustness of our findings, we also selected retracted articles from Q1 journals across different OA types. The results indicate that the conclusions regarding their retraction time lags are consistent with the overall results (see Figure A2 in Appendix 2).
Gold OA publications were clearly retracted faster than Green OA and non-OA publications. This result supports the claimed added value of Gold OA publications in being more forthcoming about errors when they are detected (Peterson, 2013; Shah et al., 2021). Specifically, Gold OA literature benefits from greater post-publication scrutiny: because it is read by more people, potential unethical behavior is easier to identify (Fox & Beall, 2014; Lin & McPhee, 2007). The retraction time lag of Green OA was much longer than that of non-OA, which seems unusual. Therefore, we analyzed the retraction time lag by reason for retraction to explore why Green OA has the longest retraction time lag.

4.2.2 The reasons why Green OA has the longest retraction time lag

This study suggests that the atypical proportions of reasons for retraction in Green OA may be a potential explanation for its longest retraction time lag. Table 2 presents the proportion and retraction time lag of reasons for retraction across articles of different OA levels. Different reasons for retraction clearly led to different retraction time lags. Green OA articles had higher proportions of falsification and manipulation and of error and concern, and lower proportions of plagiarism and authorship issues. Publications with falsification and error issues usually take longer to be retracted than plagiarized publications because they are harder to detect and require more time for post-publication scrutiny (Dal-Ré, 2019; Gerber, 2006; Trikalinos, 2008). The retraction time lags of plagiarism and authorship issues were comparatively shorter, mainly because these issues are easier to raise and identify.
Furthermore, we drew the cumulative survival rate curve of each reason for retraction in Figure 5, in order to shed light on the detailed survival characteristics of the reasons for retraction at different OA levels. The survival rate of Green OA decreased the slowest for most reasons for retraction, followed by non-OA. The survival rate of Gold OA decreased the fastest, which corresponds to the result for retraction time lag.
Figure 5. Survival rate curve of different reasons for retraction.
Therefore, in Green OA publications, the reasons for retraction with longer retraction time lag accounted for the highest shares, while the reasons with shorter retraction time lag took the lowest shares. This fact could be partly responsible for the longest retraction time lag of Green OA.
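The composition argument can be made concrete with a small sketch: if each retracted publication is assigned one main reason, the mean retraction time lag of an OA level equals the proportion-weighted average of its per-reason mean lags. The reason names are borrowed from Table 1, but the shares and lags below are hypothetical and are not the values in Table 2; holding the per-reason lags identical across the two mixes is also an assumption made to isolate the effect of composition.

```python
def overall_lag(shares, lags):
    """Proportion-weighted mean lag: sum over reasons of share * mean lag."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[reason] * lags[reason] for reason in shares)


# Hypothetical per-reason mean lags (years), held identical for both mixes.
lags_by_reason = {"falsification": 4.0, "plagiarism": 2.0}

# Two hypothetical mixes of reasons.
mix_slow_heavy = {"falsification": 0.75, "plagiarism": 0.25}
mix_fast_heavy = {"falsification": 0.25, "plagiarism": 0.75}

print(overall_lag(mix_slow_heavy, lags_by_reason))  # 3.5
print(overall_lag(mix_fast_heavy, lags_by_reason))  # 2.5
```

With identical per-reason lags, the mix weighted toward the slow-to-detect reason ends up with a mean lag one year longer, which is the mechanism proposed here for Green OA.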

5 Discussion and conclusion

After collecting the data of retracted publications from Web of Science and Retraction Watch database, this study compared the differences in the characteristics of retracted publications across different OA levels, and drew the following conclusions.
(1) The retraction rate of Gold OA was much higher than that of Green OA and non-OA. A higher OA level tended to have a higher retraction rate. Non-OA retracted articles accounted for the largest proportion of total retracted publications during the past 20 years, followed by Gold OA and Green OA. The numbers of Gold OA and non-OA retracted articles both increased during 2001-2014, and Gold OA has gradually narrowed the gap with non-OA, especially in the most recent years. The number of Green OA retracted publications remained stable over the past 20 years.
(2) The reasons for retraction, ranked in descending order of frequency, were error and concern, self-plagiarism, ethical issues, falsification and manipulation, plagiarism, and authorship issues. The proportions of retraction reasons were similar for Gold OA and non-OA, but the retraction rate of Gold OA was much higher than that of non-OA for all reasons. For Green OA, the proportion of falsification and manipulation was higher than in the other two OA types, which contributed to its long retraction time lag.
(3) The retraction time lag of Gold OA was the shortest, and that of Green OA, rather than non-OA as expected, was the longest. The long retraction time lag of Green OA could be partly explained by its atypical proportions of reasons for retraction.
Generally, the high OA level (Gold OA) has the highest retraction rate and the fastest retraction speed among all OA levels; the low OA level (Green OA) has the second-highest retraction rate, and non-OA has the lowest. However, this does not necessarily mean that a higher OA level makes retraction more effective, because many factors can affect the retraction rate and retraction speed of an OA type, such as the proportion of potentially flawed articles, the attention that articles receive, and the difficulty of detecting problems. Two main potential reasons may explain why articles of a higher OA level have a higher retraction rate: (1) articles of a higher OA level attract more readers, which makes problems in the articles easier to identify; (2) articles of a higher OA level contain more potential problems (or more easily detected problems), which leads to a higher retraction rate. Therefore, whether a high OA level enhances the effectiveness of retraction needs further discussion.
More activities can be recommended and supported to promote the OA movement from the perspective of research integrity in the scientific community. Peer reviewers, editors, readers, and publishers should work together to promote the detection of problematic OA publications. Editors and peer reviewers need to pay more attention to the reliability of images and data in OA publications. Although OA journals should be promoted in scholarly communication (Shah et al., 2021), quality control currently varies widely among OA journals (Erfanmanesh & Teixeira, 2019). The use of duplication checking technology in article submission systems could be recommended for OA journals (Liu & Lei, 2021), so that publications with image or data issues can be detected and corrected at the review stage. Readers could use PubPeer or other online platforms to help build an early warning system that makes the scientific community aware of problems in publications (Haunschild & Bornmann, 2021).
We have analyzed the retraction rate and time lag of retracted articles across different OA types. Citation impact is a further feature that may distinguish retracted articles in different OA categories. An exploration of citation patterns before and after retraction across OA types could illuminate the relationship between openness and changes in post-retraction citations, and could lead to a better understanding of how open access affects retractions in different ways. It is also important for future studies to consider the impact of other factors on retraction rate and retraction speed, such as discipline and publication date. Disciplines receive varying levels of attention; in those under greater scrutiny, readers are more likely to detect issues in articles, potentially leading to higher retraction rates and quicker retraction (Yeo-Teh & Tang, 2022). Articles published earlier might have faced less rigorous review from readers and editors, resulting in a lower retraction rate and speed. Conversely, articles published more recently may also exhibit a lower retraction rate and speed, since it takes time for issues to be identified and articles to be retracted. Therefore, articles with publication dates that are neither too early nor too recent may have a relatively higher retraction rate and retraction speed (Fang et al., 2012; Richard, 2011). Future research that controls for these variables when exploring the impact of OA on retraction could lead to more precise and insightful conclusions.
It should be noted that this study has several limitations. (1) OA level in this study is limited to Gold OA, Green OA, and non-OA. There is no exploration of a wider range of OA levels, such as Hybrid OA and Bronze OA. (2) This study classifies the OA level of all retracted publications as its highest OA level. For the literature published in both high OA level and low OA level, we underestimate low OA level and overlook the multiplicity of the OA levels of publications to a certain extent.
Despite these limitations, our bibliometric analysis of the OA level of retracted publications provides a broader overview of the relationship between open access and research integrity, and may serve as a reference for readers, researchers, and policymakers who pay attention to the development of the OA publishing model and to the problem of research integrity in OA literature.

Acknowledgment

We would like to thank Dr. Zhichao Fang at Renmin University of China for his constructive suggestions and comments, which helped improve the quality of the original manuscript.

Disclosure statement

No potential conflicts of interest were reported by the authors.

Data availability statement

Data available on request from the authors.

Funding information

This study was supported by the National Social Science Foundation of China (No. 22CTQ032).

Author contributions

Er-Te Zheng (zhengerte@ruc.edu.cn): Data curation (Lead), Formal analysis (Lead), Methodology (Equal), Visualization (Lead), Writing - original draft (Lead).
Hui-Zhen Fu (fuhuizhen@zju.edu.cn): Conceptualization (Lead), Methodology (Equal), Project administration (Lead), Supervision (Lead), Writing - original draft (Supporting), Writing - review & editing (Equal).

Appendix

Appendix 1

The retraction rate was chosen to compare the probability of problems being found in articles of different OA levels. However, a high retraction rate does not necessarily reflect a high rate of problem detection; it may also result from a high proportion of articles with potential problems. Therefore, the study examined the proportion of retracted articles in different Journal Impact Factor (JIF) quartiles in WoS (Figure A1) to see whether the retracted articles of any OA type are disproportionately concentrated in low-quartile journals, which would imply a higher proportion of potentially problematic articles.
Figure A1. Proportion of retracted publications in different JIF quartiles.
Figure A1 shows that no OA type has a disproportionately high percentage of retracted articles in low-quartile journals, which suggests that the proportion of potentially problematic articles does not differ much across OA levels. Therefore, it is reasonable to assume that the retraction rate reflects the probability of problem detection in articles of different OA levels.
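The comparison behind Figure A1 can be sketched as follows: for each OA level, count the retracted articles in each JIF quartile and divide by that level's total. This is an illustrative sketch only; the function name and the toy input are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def quartile_shares(articles):
    """Share of retracted articles per JIF quartile within each OA level.

    `articles` is a list of (oa_level, jif_quartile) pairs, one per
    retracted article. Shares sum to 1 within each OA level.
    """
    pair_counts = Counter(articles)                   # (oa, quartile) -> n
    level_totals = Counter(oa for oa, _ in articles)  # oa -> n
    shares = {}
    for (oa, quartile), n in pair_counts.items():
        shares.setdefault(oa, {})[quartile] = n / level_totals[oa]
    return shares


# Hypothetical toy input, for illustration only.
articles = (
    [("Gold OA", "Q1")] * 2
    + [("Gold OA", "Q4")] * 2
    + [("Green OA", "Q1")] * 3
    + [("Green OA", "Q4")] * 1
)
shares = quartile_shares(articles)
for oa in sorted(shares):
    print(oa, shares[oa])
```

If one OA level showed a much larger low-quartile share than the others, its retraction rate could reflect a larger pool of problematic articles rather than better problem detection.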

Appendix 2

To validate the robustness of the retraction time lag results, we selected retracted articles from Q1 journals across different OA types and examined whether their retraction time lags were consistent with the overall findings. The results show that, in Q1 journals, the order of retraction time lag from shortest to longest is Gold OA, non-OA, and Green OA, in line with the overall results. This concordance supports the robustness of our results.
Figure A2. Retraction time lag (a) and survival rate curve (b) of different OA levels (Q1 articles).
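A robustness check of this kind can be sketched as below, assuming that the time lag is measured in months from publication to retraction and that the survival rate at time t is the share of eventually retracted articles not yet retracted at t. Both assumptions, the function name, and the toy lags are hypothetical; the toy values are chosen only to illustrate the computation and are not the study's data.

```python
from statistics import median


def survival_curve(lags):
    """Empirical survival function for a list of retraction time lags.

    Returns (t, share) pairs, where `share` is the fraction of articles
    whose time lag is strictly greater than t, i.e. not yet retracted at t.
    """
    n = len(lags)
    curve = []
    for t in sorted(set(lags)):
        remaining = sum(1 for lag in lags if lag > t)
        curve.append((t, remaining / n))
    return curve


# Hypothetical time lags in months for Q1 articles, for illustration only.
q1_lags = {
    "Gold OA": [6, 12, 18],
    "non-OA": [12, 24, 36],
    "Green OA": [24, 36, 60],
}
medians = {oa: median(lags) for oa, lags in q1_lags.items()}
ranking = sorted(medians, key=medians.get)  # shortest median lag first
print(ranking)
print(survival_curve(q1_lags["Gold OA"]))
```

Running the same two steps on the full sample and on the Q1 subset, then comparing the resulting orderings, is one simple way to check whether the subset agrees with the overall pattern.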
References

[1]
Bar-Ilan J., & Halevi G. (2018). Temporal characteristics of retracted articles. Scientometrics, 116(3), 1771-1783.

[2]
Björk B. C., & Solomon D. (2015). Article processing charges in OA journals: relationship between price and quality. Scientometrics, 103(2), 373-385.

[3]
Bohannon J. (2013). Who’s Afraid of Peer Review? Science, 342(6154), 60-65.

[4]
Brainard J., & You J. (2018). What a massive database of retracted papers reveals about science publishing’s ‘death penalty’. Science, 25(1), 1-5.

[5]
Butler D. (2013). Investigating journals: The dark side of publishing. Nature News, 495(7442), 433.

[6]
Chambers L. M., Michener C. M., & Falcone T. (2019). Plagiarism and data falsification are the most common reasons for retraction publications in obstetrics and gynaecology. BJOG: An International Journal of Obstetrics & Gynaecology, 126(9), 1134-1140.

[7]
Chan L., Cuplinskas D., Eisen M., et al. (2002). Budapest open access initiative. ARL Bimonthly, 48.

[8]
Chen W., Xing Q. R., Wang H., & Wang T. (2018). Retracted publications in the biomedical literature with authors from mainland China. Scientometrics, 114(1), 217-227.

[9]
Craig I. D., Plume A. M., McVeigh M. E., Pringle J., & Amin M. (2007). Do open access articles have greater citation impact?: a critical review of the literature. Journal of Informetrics, 1(3), 239-248.

[10]
Dal-Ré R., & Ayuso C. (2019). Reasons for and time to retraction of genetics articles published between 1970 and 2018. Journal of Medical Genetics, 56(11), 734-740.

[11]
European Commission. (2020). Responsible Open Science: an ethics and integrity perspective. Retrieved June 1, 2021 from: https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/topic-details/swafs-30-2020.

[12]
European Commission. (2021). Open Innovation, Open Science, Open to the World. Retrieved June 1, 2021 from: https://ec.europa.eu/commission/presscorner/detail/en/speech_15_5243.

[13]
Elango B. (2021). Retracted articles in the biomedical literature from Indian authors. Scientometrics, 126(5), 3965-3981.

[14]
Elia N., Wager E., & Tramèr M. R. (2014). Fate of articles that warranted retraction due to ethical concerns: a descriptive cross-sectional study. PLoS ONE, 9(1), e85846.

[15]
Erfanmanesh M., & Teixeira Da Silva J. A. (2019). Is the soundness-only quality control policy of open access mega journals linked to a higher rate of published errors? Scientometrics, 120(2), 917-923.

[16]
Fang F. C., Steen R. G., & Casadevall A. (2012). Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences, 109(42), 17028-17033.

[17]
Foo J. Y. A. (2011). A retrospective analysis of the trend of retracted publications in the field of biomedical and life sciences. Science and Engineering Ethics, 17(3), 459-468.

[18]
Fox M., & Beall J. (2014). Advice for plagiarism whistleblowers. Ethics & Behavior, 24(5), 341-349.

[19]
Freedman L. P., & Inglese J. (2014). The increasing urgency for standards in basic biologic research. Cancer Research, 74(15), 4024-4029.

[20]
Furman J. L., Jensen K., & Murray F. (2012). Governing knowledge in the scientific community: Exploring the role of retractions in biomedicine. Research Policy, 41(2), 276-290.

[21]
Gerber P. (2006). What can we learn from the Hwang and Sudbø affairs? Medical Journal of Australia, 184(12), 632-635.

[22]
Ghorbi A., Fazeli-Varzaneh M., Ghaderi-Azad E., Ausloos M., & Kozak M. (2021). Retracted papers by Iranian authors: causes, journals, time lags, affiliations, collaborations. Scientometrics, 126(9), 7351-7371.

[23]
Haunschild R., & Bornmann L. (2021). Can tweets be used to detect problems early with scientific papers? A case study of three retracted COVID-19/SARS-CoV-2 papers. Scientometrics, 126(6), 5181-5199.

[24]
He T. (2013). Retraction of global scientific publications from 2001 to 2010. Scientometrics, 96(2), 555-561.

[25]
Liu W., & Lei L. (2021). Retractions in the Middle East from 1999 to 2018: a bibliometric analysis. Scientometrics, 126(6), 4687-4700.

[26]
Nosek B. A., Ebersole C. R., DeHaven A. C., & Mellor D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600-2606.

[27]
Olson S., & Griffiths P. A. (1995). On Being a Scientist: Responsible Conduct in Research. Retrieved June 1, 2021 from: http://www.sunstar-solutions.com/AOP/SOW/being_scientist.htm.

[28]
Oransky I. (2018). We’re officially launching our database today. Here’s what you need to know. Retrieved June 1, 2021 from: https://retractionwatch.com/2018/10/25/were-officially-launching-our-database-today-heres-what-you-need-to-know/.

[29]
Peterson G. M. (2013). Characteristics of retracted open access biomedical literature: A bibliographic analysis. Journal of the American Society for Information Science and Technology, 64(12), 2428-2436.

[30]
Rai R., & Sabharwal S. (2017). Retracted publications in orthopaedics: prevalence, characteristics, and trends. The Journal of Bone and Joint Surgery, 99(9), e44.

[31]
Retraction Watch. (2018). Retraction Watch database user guide. Retrieved June 1, 2021 from: https://retractionwatch.com/retraction-watch-database-user-guide/.

[32]
Richard V. N. (2011). Science publishing: The trouble with retractions. Nature, 478(7367), 26-28.

[33]
Shah T. A., Gul S., Bashir S., Ahmad S., Huertas A., Oliveira A., ... & Chakraborty K. (2021). Influence of accessibility (open and toll-based) of scholarly publications on retractions. Scientometrics, 126(6), 4589-4606.

[34]
Sharma K. (2021). Team size and retracted citations reveal the patterns of retractions from 1981 to 2020. Scientometrics, 126(10), 8363-8374.

[35]
Shen C., & Björk B. C. (2015). ‘Predatory’ open access: a longitudinal study of article volumes and market characteristics. BMC Medicine, 13(1), 1-15.

[36]
Shuai X., Rollins J., Moulinier I., Custis T., Edmunds M., & Schilder F. (2017). A multidimensional investigation of the effects of publication retraction on scholarly impact. Journal of the Association for Information Science and Technology, 68(9), 2225-2236.

[37]
Steen R. G., Casadevall A., & Fang F. C. (2013). Why Has the Number of Scientific Retractions Increased? PLoS ONE, 8(7), e68397.

[38]
Stojanovski J. (2015). Do Croatian open access journals support ethical research? Content analysis of instructions to authors. Biochemia Medica, 25(1), 12-21.

[39]
Tijdink J. K., Horbach S. P., Nuijten M. B., & O’Neill G. (2021). Towards a research agenda for promoting responsible research practices. Journal of Empirical Research on Human Research Ethics, 16(4), 450-460.

[40]
Trikalinos N. A., Evangelou E., & Ioannidis J. P. (2008). Falsified papers in high-impact journals were slow to retract and indistinguishable from nonfraudulent articles. Journal of Clinical Epidemiology, 61(5), 464-470.

[41]
Vadhera A. S., Lee J. S., Veloso I. L., Khan Z. A., Trasolini N. A., Gursoy S., ... & Verma N. N. (2022). Open access articles garner increased social media attention and citation rates compared with subscription access research articles: an altmetrics-based analysis. The American Journal of Sports Medicine, 50(13), 3690-3697.

[42]
Vuong Q. H., La V. P., Hồ M. T., Vuong T. T., & Ho M. T. (2020). Characteristics of retracted articles based on retraction data from online sources through February 2019. Science Editing, 7(1), 34-44.

[43]
Wang T., Xing Q. R., Wang H., & Chen W. (2019). Retracted publications in the biomedical literature from open access journals. Science and Engineering Ethics, 25, 855-868.

[44]
Wang X., Liu C., Mao W., & Fang Z. (2015). The open access advantage considering citation, article usage and social media attention. Scientometrics, 103(2), 555-564.

[45]
Web of Science. (2021). Web of Science Core Collection Help. Retrieved June 1, 2021 from: http://images.webofknowledge.com//WOKRS535R111/help/WOS/hp_whatsnew_wos.html

[46]
Yeo-Teh N. S. L., & Tang B. L. (2022). Sustained Rise in Retractions in the Life Sciences Literature during the Pandemic Years 2020 and 2021. Publications, 10(3).

[47]
Zhang M., & Grieneisen M. L. (2013). The impact of misconduct on the published medical and non-medical literature, and the news media. Scientometrics, 96(2), 573-587.

[48]
Zhang Q., Abraham J., & Fu H. Z. (2020). Collaboration and its influence on retraction based on retracted publications during 1978-2017. Scientometrics, 125(1), 213-232.

Copyright © 2023 All rights reserved Journal of Data and Information Science