Search result

Total: 506 results found
  • Corrigendum
    William H. Walters
    Journal of Data and Information Science. 2025, 10(2): 179-179. https://doi.org/10.2478/jdis-2025-0018
  • Editorials
    The JDIS Editors
    Journal of Data and Information Science. 2025, 10(2): 176-178. https://doi.org/10.2478/jdis-2025-0019
  • Editorials
    Journal of Data and Information Science. 2025, 10(2): 174-175. https://doi.org/10.2478/jdis-2025-0027
  • Research Papers
    Jinzhong Guo, Jianan Liu, Moxin Li, Xiaoling Liu, Chengyong Liu
    Journal of Data and Information Science. 2025, 10(2): 152-173. https://doi.org/10.2478/jdis-2025-0025

    Purpose: Currently, different research conclusions exist about the relationship between relational capital and corporate innovation. The research aims to (1) reveal the actual relationship between executive alumni relations and firm innovation performance, (2) examine the moderating role of executive academic backgrounds, (3) analyze the paths for firms to leverage knowledge spillovers from regional universities to promote firm innovation by their geographic location.
    Design/methodology/approach: A social network approach is used to construct alumni relationship networks of A-share listed companies in Shanghai and Shenzhen, China. A two-way fixed effects model is used to assess the impact of firms’ structural position in executive alumni networks on firms’ innovation performance. In addition, the research also delves into the interactions between knowledge spillovers from geographic locations and executives’ alumni networks, aiming to elucidate their combined effects on firms’ innovation performance.
    Findings: This paper explores the curvilinear relationship between executive alumni networks’ centrality and firm innovation within the Chinese context. It also finds that in the positive effect interval on the right side of the “U-shaped” curve, the industry with the highest number of occurrences is the high-tech industry. Moreover, it elucidates the moderating influence of executives’ academic experience on the alumni networks-innovation nexus, offering a nuanced understanding of these dynamics. Lastly, we provide novel insights into optimizing resource allocation to leverage geographic knowledge spillovers for innovation.
    Research limitations: The study may not fully represent the broader population of firms, particularly small and medium-sized enterprises (SMEs) or unlisted companies. Future research could expand the sample to include a more diverse range of firms to enhance the generalizability of the findings.
    Practical implications: Firstly, companies can give due consideration to the alumni resources of executives in their personnel decisions, but they should pay attention to the rational use of resources. Secondly, universities should actively work with companies to promote knowledge transfer and collaboration.
    Originality/value: The findings help clarify the influence mechanism of firms’ innovation performance, providing theoretical support and empirical evidence for firms to drive innovation at the executive alumni relationship network level.

  • Research Papers
    Murtuza Shahzad, Hamed Barzamini, Joseph Wilson, Hamed Alhoori, Mona Rahimi
    Journal of Data and Information Science. 2025, 10(2): 124-151. https://doi.org/10.2478/jdis-2025-0020

    Purpose: This research addresses the challenge of concept drift in AI-enabled software, particularly within autonomous vehicle systems where concept drift in object recognition (like pedestrian detection) can lead to misclassifications and safety risks. This study introduces a proactive framework to detect early signs of domain-specific concept drift by leveraging domain analysis and natural language processing techniques. This method is designed to help maintain the relevance of domain knowledge and prevent potential failures in AI systems due to evolving concept definitions.
    Design/methodology/approach: The proposed framework integrates natural language processing and image analysis to continuously update and monitor key domain concepts against evolving external data sources, such as social media and news. By identifying terms and features closely associated with core concepts, the system anticipates and flags significant changes. This was tested in the automotive domain on the pedestrian concept, where the framework was evaluated for its capacity to detect shifts in the recognition of pedestrians, particularly during events like Halloween and specific car accidents.
    Findings: The framework demonstrated an ability to detect shifts in the domain concept of pedestrians, as evidenced by contextual changes around major events. While it successfully identified pedestrian-related drift, the system’s accuracy varied when overlapping with larger social events. The results indicate the model’s potential to foresee relevant shifts before they impact autonomous systems, although further refinement is needed to handle high-impact concurrent events.
    Research limitations: This study focused on detecting concept drift in the pedestrian domain within autonomous vehicles, with results varying across domains. To assess generalizability, we tested the framework for airplane-related incidents and demonstrated adaptability. However, unpredictable events and data biases from social media and news may obscure domain-specific drifts. Further evaluation across diverse applications is needed to enhance robustness in evolving AI environments.
    Practical implications: The proactive detection of concept drift has significant implications for AI-driven domains, especially in safety-critical applications like autonomous driving. By identifying early signs of drift, this framework provides actionable insights for AI system updates, potentially reducing misclassification risks and enhancing public safety. Moreover, it enables timely interventions, reducing costly and labor-intensive retraining requirements by focusing only on the relevant aspects of evolving concepts. This method offers a streamlined approach for maintaining AI system performance in environments where domain knowledge rapidly changes.
    Originality/value: This study contributes a novel domain-agnostic framework that combines natural language processing with image analysis to predict concept drift early. This unique approach, which is focused on real-time data sources, offers an effective and scalable solution for addressing the evolving nature of domain-specific concepts in AI applications.

  • Research Papers
    Mike Thelwall, Kayvan Kousha
    Journal of Data and Information Science. 2025, 10(2): 106-123. https://doi.org/10.2478/jdis-2025-0016

    Purpose: Journal Impact Factors and other citation-based indicators are widely used and abused to help select journals to publish in or to estimate the value of a published article. Nevertheless, citation rates primarily reflect scholarly impact rather than other quality dimensions, including societal impact, originality, and rigour. In response to this deficit, Journal Quality Factors (JQFs) are defined and evaluated. These are average quality score estimates given to a journal’s articles by ChatGPT.
    Design/methodology/approach: JQFs were compared with Polish, Norwegian and Finnish journal ranks and with journal citation rates for 1,300 journals with 130,000 articles from 2021 in large monodisciplinary journals in the 25 out of 27 Scopus broad fields of research for which it was possible. Outliers were also examined.
    Findings: JQFs correlated positively and mostly strongly (median correlation: 0.641) with journal ranks in 24 out of the 25 broad fields examined, indicating a nearly science-wide ability for ChatGPT to estimate journal quality. Journal citation rates had similarly high correlations with national journal ranks, however, so JQFs are not a universally better indicator. An examination of journals with JQFs not matching their journal ranks suggested that abstract styles may affect the result, such as whether the societal contexts of research are mentioned.
    Research limitations: Different journal rankings may have given different findings because there is no agreed meaning for journal quality.
    Practical implications: The results suggest that JQFs are plausible as journal quality indicators in all fields and may be useful for the (few) research and evaluation contexts where journal quality is an acceptable proxy for article quality, and especially for fields like mathematics for which citations are not strong indicators of quality.
    Originality/value: This is the first attempt to estimate academic journal value with a Large Language Model.
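
    The JQF idea described above (averaging per-article quality scores for each journal, then correlating the averages with an external journal ranking) can be sketched as follows. This is a minimal illustration, not the authors' code: the article scores and ranks are hypothetical, and the use of Spearman's rank correlation is an assumption, since the abstract does not name the correlation coefficient.

```python
from collections import defaultdict
from statistics import mean


def journal_quality_factors(article_scores):
    """Average the per-article scores of each journal.

    article_scores: iterable of (journal, score) pairs (hypothetical input).
    """
    by_journal = defaultdict(list)
    for journal, score in article_scores:
        by_journal[journal].append(score)
    return {journal: mean(scores) for journal, scores in by_journal.items()}


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for three journals and a hypothetical national rank.
scores = [("A", 3.0), ("A", 2.0), ("B", 1.0), ("B", 2.0), ("C", 3.5), ("C", 3.5)]
national_rank = {"A": 2, "B": 1, "C": 3}

jqf = journal_quality_factors(scores)
journals = sorted(jqf)
rho = spearman([jqf[j] for j in journals], [national_rank[j] for j in journals])
print(jqf, rho)
```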

  • Research Papers
    Zhiwei Zhang, Wenhao Zhou, Hailin Li
    Journal of Data and Information Science. 2025, 10(2): 80-105. https://doi.org/10.2478/jdis-2025-0008

    Purpose: This study explores the combined effects of structural and relational embeddedness within alliance networks on firm innovation. By focusing on the interplay between network structures and relationships, this study provides a nonlinear framework to unravel the complex dynamics between alliance networks and firm innovation performance within the manufacturing industry.
    Design/methodology/approach: Using social network analysis, this study examines the topological structure of firms’ alliance networks. An exploratory approach involving K-Means clustering and decision tree methods is employed to identify heterogeneous network types within the alliance networks. The analysis further explores the nonlinear relationships between network characteristics, including closeness centrality, betweenness centrality, clustering coefficient, and relational attributes, including collaboration intensity and breadth, and their combined influence on firm innovation.
    Findings: The study identified four distinct heterogeneous network types: dyadic, star, ringlike, and complex networks. Each type reveals unique network characteristics and their impact on innovation performance. Key decision rules were extracted, showing that strong relational embeddedness can hinder innovation in dyadic networks, while a greater distance from the central firm correlates with higher innovation performance in star alliance networks. For ringlike alliance networks, moderate cooperation intensity is beneficial for innovation when the clustering coefficient is not high. In complex alliance networks, the combined effects of cooperation intensity, breadth, and clustering coefficient significantly influence innovation.
    Research limitations: The research presented in this study, while offering valuable insights into the relationship between alliance networks and firm innovation within the manufacturing sector, is subject to several limitations. A focus on the manufacturing industry may restrict the generalizability of our findings to other sectors, where the dynamics of innovation and collaboration might differ significantly. Additionally, our reliance on patent data, while providing a quantifiable measure of innovation, may overlook other forms of innovation that are equally critical in different contexts, such as service innovations or business model transformations.
    Practical implications: This research offers significant insights into how firms can leverage both network structure and relational aspects to enhance innovation outcomes. By revealing the nonlinear and complex interactions between network embeddedness dimensions, this study makes a valuable contribution to both theory and practice. This highlights that strategic management of both structural and relational embeddedness can foster superior innovation performance, offering firms a competitive advantage by optimizing their alliance network configurations.
    Originality/value: This study’s originality lies in its examination of the combined effects of structural and relational network embeddings on innovation performance. By identifying distinct network types and their impact on innovation, this study advances the theoretical understanding of how network characteristics interact to shape firm innovation. It contributes to the literature by offering a novel, multidimensional framework that integrates social network theory and resource-based view, providing new insights for firms to leverage their network positions and relationships for competitive advantage.
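
    One of the structural measures named above, the clustering coefficient, can be computed directly from an adjacency list. The sketch below uses the standard local clustering coefficient on two tiny hypothetical alliance networks (a star and a three-firm ring); the firm names are invented, and this is not the authors' pipeline.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; use 0 by convention
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


# Hypothetical star network: partners are linked only through the hub.
star = {
    "hub": {"p1", "p2", "p3"},
    "p1": {"hub"},
    "p2": {"hub"},
    "p3": {"hub"},
}

# Hypothetical ring of three firms: every pair of partners is linked.
ring = {
    "f1": {"f2", "f3"},
    "f2": {"f1", "f3"},
    "f3": {"f1", "f2"},
}

print(local_clustering(star, "hub"), local_clustering(ring, "f1"))
```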

  • Research Papers
    Jiandong Zhang, Sonia Gruber, Rainer Frietsch
    Journal of Data and Information Science. 2025, 10(2): 61-79. https://doi.org/10.2478/jdis-2025-0028

    Purpose: Interdisciplinary research has become a critical approach to addressing complex societal, economic, technological, and environmental challenges, driving innovation and integrating scientific knowledge. While interdisciplinarity indicators are widely used to evaluate research performance, the impact of classification granularity on these assessments remains underexplored.
    Design/methodology/approach: This study investigates how different levels of classification granularity—macro, meso, and micro—affect the evaluation of interdisciplinarity in research institutes. Using a dataset of 262 institutes from four major German non-university organizations (FHG, HGF, MPG, WGL) from 2018 to 2022, we examine inconsistencies in interdisciplinarity across levels, analyze ranking changes, and explore the influence of institutional fields and research focus (applied vs. basic).
    Findings: Our findings reveal significant inconsistencies in interdisciplinarity across classification levels, with rankings varying substantially. Notably, the Fraunhofer Society (FHG), which performs well at the macro level, experiences significant ranking declines at meso and micro levels. Normalizing interdisciplinarity by research field confirmed that these declines persist. The research focus of institutes, whether applied, basic, or mixed, does not significantly explain the observed ranking dynamics.
    Research limitations: This study has only considered the publication-based dimension of institutional interdisciplinarity and has not explored other aspects.
    Practical implications: The findings provide insights for policymakers, research managers, and scholars to better interpret interdisciplinarity metrics and support interdisciplinary research effectively.
    Originality/value: This study underscores the critical role of classification granularity in interdisciplinarity assessment and emphasizes the need for standardized approaches to ensure robust and fair evaluations.

  • Research Papers
    Anand Bihari, Sudhakar Tripathi, Akshay Deepak, P. Mohan Kumar
    Journal of Data and Information Science. 2025, 10(2): 40-60. https://doi.org/10.2478/jdis-2025-0013

    Purpose: Scholars are generally compared by their overall scientific impact. Although such a comparison is easy, how can we assess the scientific impact of scholars whose research careers differ in length? Scholars with more research experience, or who have spent more time in research (measured as career length in years), may naturally gain a higher impact, so two scholars with careers of different lengths cannot be compared directly. Many bibliometric indicators address the time span of scholars’ careers. Among these, the h-index sequence and the EM/EM’-index sequences have been introduced for assessing and comparing the scientific impact of scholars. The h-index sequence, EM-index sequence, and EM’-index sequence consider the yearly impact of scholars, and comparison is done by the index value along with its component values. These time-series indicators fail to give a comparative analysis between senior and junior scholars when there is a large difference in the lengths of their research careers.
    Design/methodology/approach: We propose a cumulative index calculation method to appraise the scientific impact of scholars up to a given career age and test it with data on 89 scholars.
    Findings: The proposed mechanism is implemented and tested on 89 scholars’ publication data, providing a clear difference between the scientific impact of two scholars. This also helps in predicting future prominent scholars based on their research impact.
    Research limitations: This study adopts a simplistic approach by assigning equal credit to all authors, regardless of their individual contributions. Further, the potential impact of career breaks on research productivity is not taken into account. These assumptions may limit the generalizability of our findings.
    Practical implications: The proposed method can be used by respected institutions to compare their scholars’ impact. Funding agencies can also use it for similar purposes.
    Originality/value: This research adds to the existing literature by introducing a novel methodology for comparing the scientific impact of scholars. The outcomes of this research have notable implications for the development of more precise and unbiased research assessment frameworks, enabling a more equitable evaluation of scholarly contributions.
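
    To illustrate comparing scholars at the same career age, the sketch below computes the standard h-index and a cumulative h-index for each career year. It is only an illustration under stated assumptions: the abstract does not specify the proposed cumulative index, so the function `cumulative_h_by_career_year` and the input layout (one dict of yearly citation counts per paper) are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def cumulative_h_by_career_year(papers, first_year, last_year):
    """Map career year (1, 2, ...) to the h-index accumulated so far.

    papers: one dict per paper, {calendar_year: citations received that year}.
    Only citations received up to the given year are counted, so papers that
    have not yet been published or cited contribute zero.
    """
    result = {}
    for year in range(first_year, last_year + 1):
        totals = [
            sum(cites for y, cites in paper.items() if y <= year)
            for paper in papers
        ]
        result[year - first_year + 1] = h_index(totals)
    return result


# Hypothetical scholar with two papers and a career starting in 2000.
papers = [{2000: 1, 2001: 2}, {2001: 1, 2002: 3}]
print(cumulative_h_by_career_year(papers, 2000, 2002))
```

    Two scholars can then be compared at the same career year (for example, year 3) rather than by their lifetime totals.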

  • Research Papers
    Giovanni Abramo, Ciriaco Andrea D’Angelo, Leonardo Grilli
    Journal of Data and Information Science. 2025, 10(2): 13-39. https://doi.org/10.2478/jdis-2025-0010

    Purpose: Scholars face an unprecedented, ever-increasing demand to act as reviewers for journals, recruitment and promotion committees, granting agencies, and research assessment agencies. Consequently, journal editors face an ever-increasing scarcity of experts willing to act as reviewers. Reviews often diverge, which forces editors to resort to additional reviewers or to make a final decision on their own. The purpose of the proposed bibliometric system is to support editors’ accept/reject decisions in such situations.
    Design/methodology/approach: We analyse nearly two million 2017 publications and their scholarly impact, measured by normalized citations. Based on theory and previous literature, we extrapolated the publication traits of text, byline, and bibliographic references expected to be associated with future citations. We then fitted a regression model with the outcome variable as the scholarly impact of the publication and the independent variables as the above non-scientific traits, controlling for fixed effects at the journal level.
    Findings: Non-scientific factors explained more than 26% of the paper’s impact, with slight variation across disciplines. On average, OA articles have a 7% greater impact than non-OA articles. A 1% increase in the number of references was associated with an average increase of 0.27% in impact. Higher-impact articles in the reference list, the number of authors and of countries in the byline, the article length, and the average impact of co-authors’ past publications all show a positive association with the article’s impact. Female authors, authors from English-speaking countries, and the average age of the article’s references show instead a negative association.
    Research limitations: The selected non-scientific factors are the only ones observable and measurable to us, but we cannot rule out the presence of significant omitted variables. Using citations as a measure of impact has well-known limitations and overlooks other forms of scholarly influence. Additionally, the large dataset constrained us to one year’s global publications, preventing us from capturing and accounting for time effects.
    Practical implications: This study provides journal editors with a quantitative model that complements peer reviews, particularly when reviewer evaluations diverge. By incorporating non-scientific factors that significantly predict a paper’s future impact, editors can make more informed decisions, reduce reliance on additional reviewers, and improve the efficiency and fairness of the manuscript selection process.
    Originality/value: To the best of our knowledge, this study is the first one to specifically address the problem of supporting editors in any field in their decisions on submitted manuscripts with a quantitative model. Previous works have generally investigated the relationship between a few of the above publication traits and their impact or the agreement between peer-review and bibliometric evaluations of publications.
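
    The reported association between references and impact (a 1% increase in references going with a 0.27% increase in impact) can be read as an elasticity. The sketch below assumes a constant elasticity of 0.27 over the whole range, which is an assumption made here for illustration; the abstract does not state the functional form of the model, and the function name is hypothetical.

```python
def percent_change_in_impact(percent_change_in_references, elasticity=0.27):
    """Percent change in impact under a constant-elasticity reading.

    The elasticity value 0.27 comes from the abstract; treating it as
    constant for large changes is an assumption of this sketch.
    """
    ratio = 1 + percent_change_in_references / 100
    return (ratio ** elasticity - 1) * 100


print(percent_change_in_impact(1))    # 1% more references
print(percent_change_in_impact(100))  # doubling the reference list
```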

  • Research Notes
    Vasile Cernat, Jaime A. Teixeira da Silva
    Journal of Data and Information Science. 2025, 10(2): 6-12. https://doi.org/10.2478/jdis-2025-0022

    Purpose: This study examines the impact of research policy changes on scientific retractions of publications authored by Romanian authors, focusing on national trends and the interplay between policy reforms and publishing practices.
    Design/methodology/approach: Using data from the Retraction Watch Database and Web of Science (WoS), 188 unique retractions involving Romanian authors (2000-2022) were analyzed. The study compared retraction patterns before and after the 2016 reforms, which prioritized the publication of articles in WoS-indexed journals over non-WoS outputs.
    Findings: The analysis identified two key trends: (1) before the 2016 reforms, retractions predominantly involved non-WoS journals (99 non-WoS retractions to 38 WoS retractions), a trend that reversed post-reform (16 non-WoS to 35 WoS), and (2) while the total number of WoS-indexed retractions increased after the reforms, the retraction rates for WoS articles remained stable. Post-reform reliance on MDPI journals, which have low retraction rates, partially explains this stability. Excluding MDPI publications, retraction rates for articles and reviews increase by 14.91%, aligning with patterns seen elsewhere.
    Research limitations: The study focuses on retractions involving Romanian authors, limiting its generalizability. Furthermore, reliance on database records may not fully capture all retractions.
    Practical implications: These findings underscore the need for research policy reforms to consider a broader range of effects, and the need for nuanced interpretations of retraction data, which are influenced by a complex range of factors, including specific publisher practices.
    Originality/value: This research is the first to investigate the complex relationship between research policy reforms, publisher behavior, and retraction trends.
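
    The reversal reported in the findings can be checked with simple arithmetic on the counts given in the abstract. The sketch below uses only those four counts; the dictionary layout and function name are illustrative.

```python
# Retraction counts as reported in the abstract.
counts = {
    "pre-reform": {"non_wos": 99, "wos": 38},
    "post-reform": {"non_wos": 16, "wos": 35},
}


def wos_share(period):
    """Share of a period's retractions that involved WoS-indexed journals."""
    c = counts[period]
    return c["wos"] / (c["wos"] + c["non_wos"])


total = sum(c["wos"] + c["non_wos"] for c in counts.values())
print(total)  # matches the 188 unique retractions analyzed
for period in counts:
    print(period, f"{wos_share(period):.1%}")
```

    The WoS share of retractions rises from roughly 28% before the reforms to roughly 69% afterwards.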

  • Research Notes
    Mike Thelwall
    Journal of Data and Information Science. 2025, 10(2): 1-5. https://doi.org/10.2478/jdis-2025-0014

    Google Gemini 1.5 Flash scores were compared with ChatGPT 4o-mini on evaluations of (a) 51 of the author’s journal articles and (b) up to 200 articles in each of 34 field-based Units of Assessment (UoAs) from the UK Research Excellence Framework (REF) 2021. From (a), the results suggest that Gemini 1.5 Flash, unlike ChatGPT 4o-mini, may work better when fed with a PDF or article full text, rather than just the title and abstract. From (b), Gemini 1.5 Flash seems to be marginally less able to predict an article’s research quality (using a departmental quality proxy indicator) than ChatGPT 4o-mini, although the differences are small, and both have similar disciplinary variations in this ability. Averaging multiple runs of Gemini 1.5 Flash improves the scores.

  • Research Papers
    Mudassar Hassan Arsalan, Omar Mubin, Abdullah Al Mahmud, Sajida Perveen
    Journal of Data and Information Science. 2025, 10(1): 188-227. https://doi.org/10.2478/jdis-2025-0001

    Purpose: This study investigates key factors contributing to research impact and their interactions with the Research Impact Quintuple Helix Model by Arsalan et al. (2024).

    Design/methodology/approach: Using data from a global survey of 630 scientists across diverse disciplines, genders, regions, and experience levels, Structural Equation Modelling (SEM) was employed to assess the influence of 29 factors related to researcher characteristics, research attributes, publication strategies, institutional support, and national roles.

    Findings: The study validated the Quintuple Helix Model, uncovering complex interdependencies. Institutional support significantly affects research impact by covering leadership, resources, recognition, and funding. Researcher attributes, including academic experience and domain knowledge, also play a crucial role. National socioeconomic conditions indirectly influence research impact by supporting institutions, underscoring the importance of conducive national frameworks.

    Research limitations: While the study offers valuable insights, it has limitations. Although statistically sufficient, the response rate was below 10%, suggesting that the findings may not fully represent the entire global research community. The reliance on self-reported data may also introduce bias, as perceptions of impact can be subjective.

    Practical implications: The findings have a significant impact on researchers aiming to enhance their work’s societal, economic, and cultural significance, institutions seeking supportive environments, and policymakers interested in creating favourable national conditions for impactful research. The study advocates for a strategic alignment among national policies, institutional practices, and individual researcher efforts to maximise research impact and effectively address global challenges.

    Originality/value: By empirically validating the Research Impact Quintuple Helix Model, this study offers a holistic framework for understanding the synergy of factors that drive impactful research.

  • Research Papers
    Jun Zhang, Jianhua Liu, Haihong E, Tianyi Hu, Xiaodong Qiao, ZiChen Tang
    Journal of Data and Information Science. 2025, 10(1): 167-187. https://doi.org/10.2478/jdis-2025-0003

    Purpose: In this paper, we develop a heterogeneous graph network using citation relations between papers and their basic information centered around the “Paper mills” papers under withdrawal observation, and we train graph neural network models and classifiers on these heterogeneous graphs to classify paper nodes.

    Design/methodology/approach: Our proposed citation network-based “Paper mills” detection model (PDCN model for short) integrates textual features extracted from the paper titles using the BERT model with structural features obtained from analyzing the heterogeneous graph through the heterogeneous graph attention network model. Subsequently, these features are classified using LGBM classifiers to identify “Paper mills” papers.

    Findings: On our custom dataset, the PDCN model achieves an accuracy of 81.85% and an F1-score of 80.49% in the “Paper mills” detection task, representing a significant improvement in performance compared to several baseline models.

    Research limitations: We considered only the title of the article as a text feature and did not obtain features for the entire article.

    Practical implications: The PDCN model we developed can effectively identify “Paper mills” papers and is suitable for the automated detection of “Paper mills” during the review process.

    Originality/value: We incorporated both text and citation detection into the “Paper mills” identification process. Additionally, the PDCN model offers a basis for judgment and scientific guidance in recognizing “Paper mills” papers.

  • Research Papers
    William H. Walters
    Journal of Data and Information Science. 2025, 10(1): 151-166. https://doi.org/10.2478/jdis-2025-0002

    Purpose: For a set of 1,561 Open Access (OA) and non-OA journals in business and economics, this study evaluates the relationships between four citation metrics—five-year Impact Factor (5IF), CiteScore, Article Influence (AI) score, and SCImago Journal Rank (SJR)—and the journal ratings assigned by expert reviewers. We expect that the OA journals will have especially high citation impact relative to their perceived quality (reputation).

    Design/methodology/approach: Regression is used to estimate the ratings assigned by expert reviewers for the 2021 CABS (Chartered Association of Business Schools) journal assessment exercise. The independent variables are the four citation metrics, evaluated separately, and a dummy variable representing the OA/non-OA status of each journal.

    Findings: Regardless of the citation metric used, OA journals in business and economics have especially high citation impact relative to their perceived quality (reputation). That is, they have especially low perceived quality (reputation) relative to their citation impact.

    Research limitations: These results are specific to the CABS journal ratings and the four citation metrics. However, there is strong evidence that CABS is closely related to several other expert ratings, and that 5IF, CiteScore, AI, and SJR are representative of the other citation metrics that might have been chosen.

    Practical implications: There are at least two possible explanations for these results: (1) expert evaluators are biased against OA journals, and (2) OA journals have especially high citation impact due to their increased accessibility. Although this study does not allow us to determine which of these explanations is supported, the results suggest that authors should consider publishing in OA journals whenever overall readership and citation impact are more important than journal reputation within a particular field. Moreover, the OA coefficients provide a useful indicator of the extent to which anti-OA bias (or the citation advantage of OA journals) is diminishing over time.

    Originality/value: This is apparently the first study to investigate the impact of OA status on the relationships between expert journal ratings and journal citation metrics.

  • Research Papers
    Alan J. Giacomin, Martin Zatloukal, Mona A. Kanso, Nhan Phan-Thien
    Journal of Data and Information Science. 2025, 10(1): 132-150. https://doi.org/10.2478/jdis-2025-0004

    Purpose: This study investigates the physics of annual fractional citation growth and its impact on journal bibliographic metrics, focusing on the interplay between journal publication growth and citation dynamics.

    Design/methodology/approach: We analyze bibliometric data from three prominent fluids journals—Physics of Fluids, Journal of Fluid Mechanics, and Physical Review Fluids—over the period 1999-2023. The analysis examines the relations among annual fractional journal publication growth, citation growth, and bibliographic metric suppressions.

    Findings: Our findings reveal that the suppression of impact factor growth is significantly influenced by annual fractional journal publication growth rather than citation growth. All three journals exhibit similar responses to publication growth with minimal scatter, following a consistent functional relation. We also identify narrow, nearly Gaussian distributions for annual fractional journal publication growth. Furthermore, we introduce a new growth-independent dimensionless bibliometric metric, journal urgency, the ratio of annual fractional citation growth to the 4-year running average immediacy index. This metric effectively captures the dependency of citation growth on urgency and reveals consistent distributions across the journals analyzed.

    Research limitations: The study is limited to three major fluids journals and to the availability of bibliometric data from 1999 to 2023. Future work could extend the analysis to other disciplines and journals.

    Practical implications: Understanding the relation between publication growth and bibliometric suppressions can inform editorial and strategic decisions in journal management. The proposed journal urgency metric offers a novel tool for assessing and comparing journal performance independent of growth rates.

    Originality/value: This study introduces a new bibliometric metric—journal urgency—that provides fresh insights into citation dynamics and bibliographic metric behavior. It highlights the critical role of publication growth in shaping journal impact factors and CiteScores, offering a unified framework applicable across multiple journals.
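
    The journal urgency ratio described in the findings can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that fractional growth is the year-over-year change divided by the previous year's value, and that the 4-year running average is a trailing window ending in the current year. The function names and the sample numbers are invented for the example.

```python
from statistics import mean


def fractional_growth(series):
    """Year-over-year fractional growth, (x[t] - x[t-1]) / x[t-1] (assumed definition)."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


def running_average(series, window=4):
    """Trailing running average over `window` consecutive years (assumed alignment)."""
    return [mean(series[i - window + 1:i + 1]) for i in range(window - 1, len(series))]


def journal_urgency(citations, immediacy, window=4):
    """Ratio of annual fractional citation growth to the running-average immediacy index.

    Both inputs are yearly series of equal length. One value is returned for each
    year that has both a growth value and a full averaging window.
    """
    if len(citations) != len(immediacy):
        raise ValueError("series must cover the same years")
    growth = fractional_growth(citations)          # growth[k] belongs to year k + 1
    average = running_average(immediacy, window)   # average[j] belongs to year j + window - 1
    first_year = max(window - 1, 1)                # year 0 has no growth value
    return [growth[year - 1] / average[year - window + 1]
            for year in range(first_year, len(citations))]


# Invented yearly data: citations grow 10% per year, immediacy index is flat at 0.5.
citations = [100.0, 110.0, 121.0, 133.1, 146.41]
immediacy = [0.5, 0.5, 0.5, 0.5, 0.5]
urgency = journal_urgency(citations, immediacy)
```

    Because both the numerator and the denominator are dimensionless, the ratio is dimensionless as well, which matches the description of the metric in the abstract.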

  • Research Papers
    Jose A. Garcia, Rosa Rodriguez-Sanchez, J. Fdez-Valdivia
    Journal of Data and Information Science. 2025, 10(1): 101-131. https://doi.org/10.2478/jdis-2025-0006

    Purpose: In this paper, we use author clustering based on journal coupling (i.e., shared academic journals) to determine researchers who have the same scientific interests and similar conceptual frameworks. The basic assumption is that authors who publish in the same academic journals are more likely to share similar conceptual frameworks and interests than those who never publish in the same venues. Therefore, they are more likely to be part of the same invisible college (i.e., authors in this subgroup contribute materially to research on the same topic and often publish their work in similar publication venues).

    Design/methodology/approach: In a controlled exercise, we test the grouping of authors based on journal coupling to determine invisible colleges in a research field, using a case study of 302 authors who had published in the Information Science and Library Science (IS&LS) category of the Web of Science Core Collection. For each author, we retrieved all the scientific journals in which the author had published articles. We then used the cosine measure to calculate the similarity between authors (both first and second order).

    Findings: In this paper, using journal coupling of IS&LS authors, we found four main invisible colleges: “Information Systems”, “Business and Information Management”, “Quantitative Information Science” and “Library Science.” The main journals that determine the existence of these invisible colleges were Inform Syst Res, Inform Syst J, J Bus Res, J Knowl Manage, J Informetr, Pro Int Conf Sci Inf, Int J Geogr Inf Sci, J Am Med Inform Assn, and Learn Publ. However, the main journals that demonstrate that IS&LS determine a field were J Am Soc Inf Sci Tec/J Assoc Inf Sci Tech, Scientometrics, Inform Process Manag, and J Inf Sci.

    Research limitations: The results shown in this article are from a controlled exercise. The analysis performed using journal coupling excludes books, book chapters, and conference papers. In this article, only academic journals were used for the representation of research results.

    Practical implications: Our results may be of interest to IS&LS scholars. This is because these results provide a new lens for grouping authors, making use of the authors’ journal publication profile and journal coupling. Furthermore, extending our approach to the study of the structure of other disciplines would possibly be of interest to historians of science as well as scientometricians.

    Originality/value: This is a novel approach based on journal coupling to determine authors who are most likely to be part of the same invisible college.
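
    The cosine measure on journal publication profiles can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: each author is represented by a vector of article counts per journal, first-order similarity is the cosine between two such vectors, and second-order similarity is assumed here to be the cosine between two authors' rows of the first-order similarity matrix. The author labels and counts are invented.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def first_order_similarity(profiles):
    """profiles maps author -> {journal: number of articles}."""
    journals = sorted({j for profile in profiles.values() for j in profile})
    authors = sorted(profiles)
    vectors = [[profiles[a].get(j, 0) for j in journals] for a in authors]
    matrix = [[cosine(u, v) for v in vectors] for u in vectors]
    return authors, matrix


def second_order_similarity(matrix):
    """Assumed definition: cosine between rows of the first-order matrix."""
    return [[cosine(row_a, row_b) for row_b in matrix] for row_a in matrix]


# Invented publication profiles for three hypothetical authors.
profiles = {
    "author_A": {"Scientometrics": 2, "J Informetr": 1},
    "author_B": {"Scientometrics": 4, "J Informetr": 2},
    "author_C": {"Inform Syst Res": 3},
}
authors, first = first_order_similarity(profiles)
second = second_order_similarity(first)
```

    Authors A and B publish in the same journals in the same proportions, so their similarity is 1, while author C shares no journal with either and scores 0. Clustering on such a matrix is what would group authors into candidate invisible colleges.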

  • Research Papers
    Ciriaco Andrea D’Angelo
    Journal of Data and Information Science. 2025, 10(1): 74-100. https://doi.org/10.2478/jdis-2025-0005

    Purpose: This study investigates whether publication-centric incentive systems, introduced through the National Scientific Accreditation (ASN: Abilitazione Scientifica Nazionale) for professorships in Italy in 2012, contribute to adopting “salami publishing” strategies among Italian academics.

    Design/methodology/approach: A longitudinal bibliometric analysis was conducted on the publication records of over 25,000 Italian science professors to examine changes in publication output and the originality of their work following the implementation of the ASN.

    Findings: The analysis revealed a significant increase in publication output after the ASN’s introduction, along with a concurrent decline in the originality of publications. However, no evidence was found linking these trends to increased salami slicing practices among the observed researchers.

    Research limitations: Given the size of our observation field, we propose an innovative indirect approach based on the degree of originality of publications’ bibliographies. We know that bibliographic coupling cannot capture salami publications per se, but only topically-related records. On the other hand, controlling for the author’s specialization level in the period, we believe that a higher level of bibliographic coupling in an author’s scientific output can signal a change in the strategy used to disseminate research results. The relatively low R-squared values in our models (0.3-0.4) reflect the complexity of the phenomenon under investigation and reveal the presence of unmeasured factors influencing the outcomes. Future research should explore additional variables or alternative models that might account for a greater proportion of the variability. Despite this limitation, the significant predictors identified in our analysis provide valuable insights into the key factors driving the observed outcomes.

    Practical implications: The results of the study support those who argue that quantitative research assessment frameworks have had very positive effects and should not be dismissed, contrary to the claims of those evoking the occurrence of side effects that do not appear in the empirical analyses.

    Originality/value: This study provides empirical evidence on the impact of the ASN on publication behaviors in a huge micro-level dataset, contributing to the broader discourse on the effects of quantitative research assessments on academic publishing practices.

  • Research Papers
    Sandra Rousseau, Cinzia Daraio
    Journal of Data and Information Science. 2025, 10(1): 47-73. https://doi.org/10.2478/jdis-2025-0012

    Purpose: We aimed to measure the variation in researchers’ knowledge and attitudes towards bibliometric indicators. The focus is on mapping the heterogeneity of this metric-wiseness within and between disciplines.

    Design/methodology/approach: An exploratory survey is administered to researchers at the Sapienza University of Rome, one of Europe’s oldest and largest generalist universities. To measure metric-wiseness, we use attitude statements that are evaluated by a 5-point Likert scale. Moreover, we analyze documents of recent initiatives on assessment reform to shed light on how researchers’ heterogeneous attitudes regarding and knowledge of bibliometric indicators are taken into account.

    Findings: We found great heterogeneity in researchers’ metric-wiseness across scientific disciplines. In addition, within each discipline, we observed both supporters and critics of bibliometric indicators. From the document analysis, we found no reference to individual heterogeneity concerning researchers’ metric-wiseness.

    Research limitations: We used a self-selected sample of researchers from one Italian university as an exploratory case. Further research is needed to check the generalizability of our findings.

    Practical implications: To gain sufficient support for research evaluation practices, it is key to consider researchers’ diverse attitudes towards indicators.

    Originality/value: We contribute to the current debate on reforming research assessment by providing a novel empirical measurement of researchers’ knowledge and attitudes towards bibliometric indicators and discussing the importance of the obtained results for improving current research evaluation systems.

  • Research Papers
    Lutz Bornmann, Charles Crothers, Robin Haunschild
    Journal of Data and Information Science. 2025, 10(1): 26-46. https://doi.org/10.2478/jdis-2025-0009

    Purpose: Citations can be used in evaluative bibliometrics to measure the impact of papers. However, citation analysis can be extended by a multi-dimensional perspective on citation impact, which is intended to yield more specific information about the kind of impact received.

    Design/methodology/approach: Bornmann, Wray, and Haunschild (2019) introduced citation concept analysis (CCA) for capturing the importance and usefulness certain concepts have in subsequent research. The method is based on the analysis of citances (the contexts of citations in citing papers). This study applies the method by investigating the impact of various concepts introduced in the oeuvre of the world-leading French sociologist Pierre Bourdieu.

    Findings: We found that the most cited concepts are ‘social capital’ (with about 34% of the citances in the citing papers), ‘cultural capital’, and ‘habitus’ (both with about 24%). On the other hand, the concepts ‘doxa’ and ‘reflexivity’ score only about 1% each.

    Research limitations: The formulation of search terms for identifying the concepts in the data and the citation context coverage are the most important limitations of the study.

    Practical implications: The results of this explorative study reflect the historical development of Bourdieu’s thought and its interface with different fields of study.

    Originality/value: The study demonstrates the high explanatory power of the CCA method.

  • Research Papers
    Mike Thelwall
    Journal of Data and Information Science. 2025, 10(1): 7-25. https://doi.org/10.2478/jdis-2025-0011

    Purpose: Evaluating the quality of academic journal articles is a time-consuming but critical task for national research evaluation exercises, appointments, and promotion. It is therefore important to investigate whether Large Language Models (LLMs) can play a role in this process.

    Design/methodology/approach: This article assesses which ChatGPT inputs (full text without tables, figures, and references; title and abstract; title only) produce better quality score estimates, and the extent to which scores are affected by ChatGPT models and system prompts.

    Findings: The optimal input is the article title and abstract, with average ChatGPT scores based on these (30 iterations on a dataset of 51 papers) correlating at 0.67 with human scores, the highest ever reported. ChatGPT 4o is slightly better than 3.5-turbo (0.66) and 4o-mini (0.66).

    Research limitations: The data is a convenience sample of the work of a single author; it includes only one field, and the scores are self-evaluations.

    Practical implications: The results suggest that article full texts might confuse LLM research quality evaluations, even though complex system instructions for the task are more effective than simple ones. Thus, whilst abstracts contain insufficient information for a thorough assessment of rigour, they may contain strong pointers about originality and significance. Finally, linear regression can be used to convert the model scores into the human scale scores, which is 31% more accurate than guessing.

    Originality/value: This is the first systematic comparison of the impact of different prompts, parameters and inputs for ChatGPT research quality evaluations.
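
    The scoring pipeline summarized above (average repeated model scores per paper, correlate the averages with human scores, then map them onto the human scale by linear regression) can be sketched as follows. This is a hedged illustration only: the paper identifiers and all scores are invented, three runs stand in for the 30 iterations, and ordinary least squares is assumed as the regression method.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Invented data: repeated model scores per paper and one human score per paper.
model_runs = {
    "paper1": [2.0, 2.2, 1.8],
    "paper2": [3.0, 3.0, 3.0],
    "paper3": [3.9, 4.1, 4.0],
}
human = {"paper1": 1.0, "paper2": 2.0, "paper3": 3.0}

papers = sorted(model_runs)
averaged = [mean(model_runs[p]) for p in papers]    # average over repeated runs
human_scores = [human[p] for p in papers]

r = pearson(averaged, human_scores)                 # agreement with human scores
slope, intercept = fit_line(averaged, human_scores)
predicted = [slope * a + intercept for a in averaged]  # model scores on the human scale
```

    Averaging over runs smooths out the variation between individual model outputs, and the fitted line corrects for the model scoring on a different scale from the human assessor.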

  • Editorial
    Yu Liao, Jiandong Zhang, Liying Yang, Zhesi Shen
    Journal of Data and Information Science. 2025, 10(1): 4-6. https://doi.org/10.2478/jdis-2025-0015
  • Editorial
    Liying Yang, Ronald Rousseau, Ping Meng
    Journal of Data and Information Science. 2025, 10(1): 1-3. https://doi.org/10.2478/jdis-2025-0017
  • Review Papers
    Mohd Hafizul Afifi Abdullah, Norshakirah Aziz, Said Jadid Abdulkadir, Kashif Hussain, Hitham Alhussian, Noureen Talpur
    Journal of Data and Information Science. 2024, 9(4): 196-238. https://doi.org/10.2478/jdis-2024-0029

    Purpose: This study provides a comprehensive review of the existing annotated corpora for event extraction, which are limited but essential for training and improving event extraction algorithms. Beyond this primary goal, it provides guidelines for preparing an annotated corpus and suggests suitable tools for the annotation task.

    Design/methodology/approach: This study employs an analytical approach to examine available corpora that are suitable for event extraction tasks. It offers an in-depth analysis of existing event extraction corpora and provides systematic guidelines for researchers to develop accurate, high-quality corpora. This ensures the reliability of the created corpus and its suitability for training machine learning algorithms.

    Findings: Our exploration reveals a scarcity of annotated corpora for event extraction tasks. In particular, the English corpora are mainly focused on the biomedical and general domains. Despite the issue of annotated corpora scarcity, there are several high-quality corpora available and widely used as benchmark datasets. However, access to some of these corpora might be limited owing to closed-access policies or discontinued maintenance after being initially released, rendering them inaccessible owing to broken links. Therefore, this study documents the available corpora for event extraction tasks.

    Research limitations: Our study focuses only on well-known corpora available in English and Chinese. Nevertheless, this study places a strong emphasis on the English corpora because English is a global lingua franca and is more widely understood than other languages.

    Practical implications: We genuinely believe that this study provides valuable knowledge that can serve as a guiding framework for preparing and accurately annotating events from text corpora. It provides comprehensive guidelines for researchers to improve the quality of corpus annotations, especially for event extraction tasks across various domains.

    Originality/value: This study comprehensively compiled information on the existing annotated corpora for event extraction tasks and provided preparation guidelines.

  • Research Papers
    Yifan Wang, Xiaoping Liu, Xiang-Li Zhu
    Journal of Data and Information Science. 2024, 9(4): 155-195. https://doi.org/10.2478/jdis-2024-0031

    Purpose: Nanomedicine has significant potential to revolutionize biomedicine and healthcare through innovations in diagnostics, therapeutics, and regenerative medicine. This study aims to develop a novel framework that integrates advanced natural language processing, noise-free topic modeling, and multidimensional bibliometrics to systematically identify emerging nanomedicine technology topics from scientific literature.

    Design/methodology/approach: The framework involves collecting full-text articles from PubMed Central and nanomedicine-related metrics from the Web of Science for the period 2013-2023. A fine-tuned BERT model is employed to extract key informative sentences. Noiseless Latent Dirichlet Allocation (NLDA) is applied to model interpretable topics from the cleaned corpus. Additionally, we develop and apply metrics for novelty, innovation, growth, impact, and intensity to quantify the emergence of novel technological topics.

    Findings: By applying this methodology to nanomedical publications, we identify an increasing emphasis on research aligned with global health priorities, particularly inflammation and biomaterial interactions in disease research. This methodology provides deeper insights through full-text analysis and leads to a more robust discovery of emerging technologies.

    Research limitations: One limitation of this study is its reliance on the existing scientific literature, which may introduce publication biases and language constraints. Additionally, manual annotation of the dataset, while thorough, is subject to subjectivity and can be time-consuming. Future research could address these limitations by incorporating more diverse data sources and automating the annotation process.

    Practical implications: The methodology presented can be adapted to explore emerging technologies in other scientific domains. It allows for tailored assessment criteria based on specific contexts and objectives, enabling more precise analysis and decision-making in various fields.

    Originality/value: This study offers a comprehensive framework for identifying emerging technologies in nanomedicine, combining theoretical insights and practical applications. Its potential for adaptation across scientific disciplines enhances its value for future research and decision-making in technology discovery.

  • Research Papers
    Daria Maltseva, Vladimir Batagelj
    Journal of Data and Information Science. 2024, 9(4): 110-154. https://doi.org/10.2478/jdis-2024-0028

    Purpose: We analyzed the structure of a community of authors working in the field of social network analysis (SNA) based on citation indicators: direct citation and bibliographic coupling metrics. We observed patterns at the micro, meso, and macro levels of analysis.

    Design/methodology/approach: We used bibliometric network analysis, including the “temporal quantities” approach proposed to study temporal networks. Using a two-mode network linking publications with authors and a one-mode network of citations between the works, we constructed and analyzed the networks of citation and bibliographic coupling among authors. We used an iterated saturation data collection approach.

    Findings: At the macro-level, we observed the global structural features of citations between authors, showing that 80% of authors have no more than 15 citations from other works. At the meso-level, we extracted the groups of authors citing each other and similar to each other according to their citation patterns. We have seen a division of authors in SNA into groups of social scientists and physicists, as well as into other groups of authors from different disciplines. We found some examples of brokerage between different groups that maintained the common identity of the field. At the micro-level, we extracted authors with extremely high values of received citations, who can be considered as the most prominent authors in the field. We examined the temporal properties of the most popular authors.

    Research limitations: The main challenge in this approach is the resolution of the author’s name (synonyms and homonyms). We faced the author disambiguation problem, also called the “multiple personalities” problem (Harzing, 2015). To remain consistent and comparable with our previously published articles, we used the same SNA data collected up to 2018. The analysis and conclusions on the activity, productivity, and visibility of the authors are relative only to the field of SNA.

    Practical implications: The proposed approach can be utilized for similar objectives and for identifying key structures and characteristics in other disciplines. This may inspire the application of network approaches in other research areas and increase the number of authors collaborating in the field of SNA.

    Originality/value: We identified and applied an innovative approach and methods to study the structure of scientific communities, which allowed us to obtain findings that go beyond those obtained with other methods. We used a new approach to temporal network analysis, which is an important addition to the analysis as it provides detailed information on different measures for the authors and pairs of authors over time.

  • Research Papers
    Yang Zhao, Mengting Zhang, Xiaoli Chen, Zhixiong Zhang
    Journal of Data and Information Science. 2024, 9(4): 90-109. https://doi.org/10.2478/jdis-2024-0027

    Purpose: To address the “anomalies” that occur when scientific breakthroughs emerge, this study focuses on identifying early signs and nascent stages of breakthrough innovations from the perspective of outliers, aiming to achieve early identification of scientific breakthroughs in papers.

    Design/methodology/approach: This study utilizes semantic technology to extract research entities from the titles and abstracts of papers to represent each paper’s research content. Outlier detection methods are then employed to measure and analyze the anomalies in breakthrough papers during their early stages. The development and evolution process are traced using literature time tags. Finally, a case study is conducted using the key publications of the 2021 Nobel Prize laureates in Physiology or Medicine.

    Findings: Through manual analysis of all identified outlier papers, the effectiveness of the proposed method for the early identification of potential scientific breakthroughs is verified.

    Research limitations: The study’s applicability has only been empirically tested in the biomedical field. More data from various fields are needed to validate the robustness and generalizability of the method.

    Practical implications: This study provides a valuable supplement to current methods for early identification of scientific breakthroughs, effectively supporting technological intelligence decision-making and services.

    Originality/value: The study introduces a novel approach to early identification of scientific breakthroughs by leveraging outlier analysis of research entities, offering a more sensitive, precise, and fine-grained alternative method compared to traditional citation-based evaluations, which enhances the ability to identify nascent breakthrough innovations.

  • Research Papers
    Mario Coccia, Saeed Roshani
    Journal of Data and Information Science. 2024, 9(4): 71-89. https://doi.org/10.2478/jdis-2024-0005

    Purpose: The goal of this study is to analyze the relationship between funded and unfunded papers and their citations in both basic and applied sciences.

    Design/methodology/approach: A power law model analyzes the relationship between research funding and citations of papers using 831,337 documents recorded in the Web of Science database.

    Findings: The original results reveal general characteristics of the diffusion of science in research fields: a) Funded articles receive higher citations compared to unfunded papers in journals; b) Funded articles exhibit a super-linear growth in citations, surpassing the increase seen in unfunded articles. This finding reveals a higher diffusion of scientific knowledge in funded articles. Moreover, c) funded articles in both basic and applied sciences demonstrate a similar expected change in citations, equivalent to about 1.23%, when the number of funded papers increases by 1% in journals. This result suggests, for the first time, that the funding effect of scientific research is an invariant driver, irrespective of the nature of the basic or applied sciences.

    Originality/value: This evidence suggests empirical laws of funding for scientific citations that explain the importance of robust funding mechanisms for achieving impactful research outcomes in science and society. These findings also highlight that funding for scientific research is a critical driving force in supporting citations and the dissemination of scientific knowledge in recorded documents in both basic and applied sciences.

    Practical implications: This comprehensive result provides a holistic view of the relationship between funding and citation performance in science. It can guide policymakers and R&D managers in directing funding to research that promotes scientific development and a wider diffusion of results for the progress of human society.
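
    The reading of the power law exponent as an elasticity (a 1% increase in funded papers going with roughly a 1.23% increase in citations) can be illustrated with a log-log least squares fit. This is a sketch under stated assumptions, not the authors' estimation procedure: the model form y = a * x**b and the fitting method are assumed, and the data points are synthetic, generated with the exponent 1.23 quoted in the abstract.

```python
from math import exp, log
from statistics import mean


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on log(y) = log(a) + b * log(x)."""
    log_x = [log(v) for v in x]
    log_y = [log(v) for v in y]
    mx, my = mean(log_x), mean(log_y)
    b = (sum((u - mx) * (v - my) for u, v in zip(log_x, log_y))
         / sum((u - mx) ** 2 for u in log_x))
    a = exp(my - b * mx)
    return a, b


# Synthetic journal-level data generated from y = 2 * x**1.23 (no noise).
funded_papers = [10, 20, 40, 80, 160]
citations = [2.0 * n ** 1.23 for n in funded_papers]

a, b = fit_power_law(funded_papers, citations)

# Percentage change in citations when funded papers increase by 1%.
pct_change = (1.01 ** b - 1) * 100
```

    An exponent above 1 is what the abstract calls super-linear growth: citations rise proportionally faster than the number of funded papers.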

  • Research Papers
    Zhiwei Zhang, Wenhao Zhou
    Journal of Data and Information Science. 2024, 9(4): 49-70. https://doi.org/10.2478/jdis-2024-0030

    Purpose: This study aims to explore how network intermediaries influence collaborative innovation performance within inter-organizational technological collaboration networks.

    Design/methodology/approach: This study employs a mixed-method approach, combining quantitative social network analysis with regression techniques to investigate the role of network intermediaries in collaborative innovation performance. Using a patent dataset of Chinese industrial enterprises, the research constructs the collaboration networks and analyzes their structural positions, particularly focusing on their role as intermediaries, characterized by betweenness centrality. Negative binomial regression analysis is employed to assess how these network characteristics shape innovation outcomes.

    Findings: The study reveals that firms in intermediary positions enhance collaborative innovation performance, but this effect is nuanced. A key finding is that network clustering negatively moderates the intermediary-innovation relationship. Highly clustered networks, while fostering local collaboration, may limit the innovation potential of intermediaries. On the other hand, relationship strength, measured by collaboration intensity and trust among firms, positively moderates the intermediary-innovation link.

    Research limitations: This study has several limitations that present opportunities for further research. The reliance on quantitative social network analysis may overlook the complexity of intermediaries’ roles, and future studies could benefit from incorporating qualitative methods to better understand cultural and institutional factors. Additionally, cross-country comparisons are needed to assess the consistency of these dynamics in different contexts.

    Practical implications: The study offers practical insights for firms and policymakers. Organizations should strategically position themselves as network intermediaries to access diverse information and resources, thereby improving innovation performance. Building strong trust helps firms exploit the advantages of an intermediary position. For firms in highly clustered networks, it is important to seek external partners to avoid limiting their exposure to new ideas and technologies. This research emphasizes the need to balance network diversity with relationship strength for sustained innovation.

    Originality/value: This research contributes to the literature by offering new insights into the role of network intermediaries, presenting a comprehensive framework for understanding the interaction between network dynamics and firm innovation.
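
    Betweenness centrality, the measure used above to characterize intermediary positions, counts how often a node lies on shortest paths between other pairs of nodes. The sketch below uses Brandes' algorithm on an unweighted, undirected graph. It is an illustration only: the firm labels and collaboration ties are invented, no normalization is applied, and the study's own computation may differ.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness centrality for an undirected, unweighted graph.

    graph maps each node to a list of its neighbours (every edge listed both ways).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        order = []                                  # nodes in order of discovery
        predecessors = {v: [] for v in graph}
        paths = dict.fromkeys(graph, 0)             # number of shortest paths from source
        paths[source] = 1
        distance = dict.fromkeys(graph, -1)
        distance[source] = 0
        queue = deque([source])
        while queue:                                # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        dependency = dict.fromkeys(graph, 0.0)
        while order:                                # accumulate from the farthest nodes back
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: value / 2 for v, value in centrality.items()}


# Invented collaboration network: firm_B is the only link between firm_A and firm_C.
network = {
    "firm_A": ["firm_B"],
    "firm_B": ["firm_A", "firm_C"],
    "firm_C": ["firm_B"],
}
scores = betweenness(network)
```

    Here firm_B sits on the single shortest path between firm_A and firm_C and scores 1, while the two end firms score 0. In a regression such as the one described above, a score like this would enter as the explanatory variable for the intermediary role.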

  • Research Papers
    Alonso Rodríguez-Navarro
    Journal of Data and Information Science. 2024, 9(4): 24-48. https://doi.org/10.2478/jdis-2024-0032

    Purpose: To analyze the diversity of citation distributions to publications in different research topics to investigate the accuracy of size-independent, rank-based indicators. The top percentile-based indicators are the most common indicators of this type, and the evaluations of Japan are the most evident misjudgments.

    Design/methodology/approach: The distributions of citations to publications from countries and journals in several research topics were analyzed along with the corresponding global publications using histograms with logarithmic binning, double rank plots, and normal probability plots of log-transformed numbers of citations.

    Findings: Size-independent, top percentile-based indicators are accurate when the global ranks of local publications fit a power law, but deviations in the least cited papers are frequent in countries and occur in all journals with high impact factors. In these cases, a single indicator is misleading. Comparisons of the proportions of uncited papers are the best way to predict these deviations.

    Research limitations: This study is fundamentally analytical, and its results describe mathematical facts that are self-evident.

    Practical implications: Respectable institutions, such as the OECD, the European Commission, and the U.S. National Science Board, produce research country rankings and individual evaluations using size-independent percentile indicators that are misleading in many countries. These misleading evaluations should be discontinued because they can cause confusion among research policymakers and lead to incorrect research policies.

    Originality/value: Studies linking the lower tail of citation distribution, including uncited papers, to percentile research indicators have not been performed previously. The present results demonstrate that studies of this type are necessary to find reliable procedures for research assessments.
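
    Histograms with logarithmic binning, mentioned in the methodology above, group papers into citation bins whose widths double at each step, which keeps the sparse upper tail readable. The sketch below is an assumption-laden illustration rather than the author's procedure: it bins on citations + 1 so that uncited papers get a bin of their own, it uses base-2 bins, and the citation counts are invented.

```python
def log_binned_histogram(citations):
    """Count papers per base-2 logarithmic bin of (citations + 1).

    Bin k holds papers with 2**k <= citations + 1 < 2**(k + 1), that is,
    citation counts from 2**k - 1 to 2**(k + 1) - 2 inclusive.
    Returns (lowest citations, highest citations, number of papers) per non-empty bin.
    """
    counts = {}
    for c in citations:
        k = (c + 1).bit_length() - 1     # exact floor(log2(c + 1)) for integers
        counts[k] = counts.get(k, 0) + 1
    return [(2 ** k - 1, 2 ** (k + 1) - 2, counts[k]) for k in sorted(counts)]


# Invented citation counts for a small set of papers.
sample = [0, 0, 1, 2, 3, 6, 7, 100]
histogram = log_binned_histogram(sample)
uncited_share = sample.count(0) / len(sample)
```

    The first bin contains only uncited papers, so its share of the total is the proportion of uncited papers that the findings single out for comparison.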