Research Paper

Methods and Practices for Institutional Benchmarking based on Research Impact and Competitiveness: A Case Study of ShanghaiTech University

  • Jiang Chang 1,†
  • Jianhua Liu 1,2
  • 1 Library and Information Services, ShanghaiTech University, Shanghai, 201210, China
  • 2 Beijing Wanfang Data Ltd., Beijing, 100038, China
Corresponding author: Jiang Chang (Email: changjiang@shanghaitech.edu.cn).

Received date: 2019-05-24

Revised date: 2019-07-29

  Accepted date: 2019-08-06

  Online published: 2019-09-02

Copyright: Open Access

Abstract

Purpose: To develop and test a mission-oriented and multi-dimensional benchmarking method for a small-scale university aiming for internationally first-class basic research.

Design/methodology/approach: An individualized evidence-based assessment scheme was employed to benchmark ShanghaiTech University against selected top research institutions, focusing on research impact and competitiveness at the institutional and disciplinary levels. Topic maps contrasting ShanghaiTech with corresponding top institutions were produced for ShanghaiTech's main research disciplines, providing opportunities for further exploration of strengths and weaknesses.

Findings: This study establishes a preliminary framework for assessing how well the university fulfils its mission. It further provides assessment principles, assessment questions, and indicators. The analytical methods and data sources were tested and proved to be applicable and efficient.

Research limitations: To better fit the selective research focuses of this university, its schema of research disciplines needs to be re-organized and benchmarking targets should include disciplinary top institutions and not necessarily those universities leading overall rankings. Current reliance on research articles and certain databases may neglect important research output types.

Practical implications: This study provides a working framework and practical methods for mission-oriented, individualized, and multi-dimensional benchmarking, which ShanghaiTech has decided to use for periodic assessments. It also offers a working reference for other institutions to adapt. Further needs are identified for ShanghaiTech to address in future benchmarking.

Originality/value: This is an effort to develop a mission-oriented, individually designed, systematically structured, and multi-dimensional assessment methodology, which differs from the composite indices often used.

Cite this article

Jiang Chang, Jianhua Liu. Methods and Practices for Institutional Benchmarking based on Research Impact and Competitiveness: A Case Study of ShanghaiTech University[J]. Journal of Data and Information Science, 2019, 4(3): 55-72. DOI: 10.2478/jdis-2019-0015

1 Introduction

University benchmarking and assessment has long been an active area of research and practice, with many approaches and ranking systems. However, universities differ in mission and size, resulting in varied emphases on research, teaching, or industrial/regional development. Even among research-oriented universities, some focus more on basic research while others aim more at applications. Most general ranking systems cannot show the degree to which ShanghaiTech University (hereafter: ShanghaiTech), a small-scale research university with a mission for cutting-edge basic research in internationally competitive areas, fulfils its mission. These ranking systems fail to pinpoint the university's strengths and weaknesses and are therefore of little use for development planning. A new, individualized, and differentiating assessment method is thus needed, with special emphasis on research impact and competitiveness.
The past decades have witnessed a worldwide growth of research activities and an increasing demand for return on research investment. Considerable effort has been devoted to benchmarking research performance at the international, domestic, and intramural levels (Abramo & D’Angelo, 2011). Yet the assessment of research activities is complex and contentious, and researchers, funders, and decision makers try to keep up with ever-improving and expanding methods and indicators (James et al., 2015). Measuring research performance nevertheless remains a challenge all over the world (Huang et al., 2017). Traditional assessments are usually based on composite indices that combine many dimensions, such as research investment (in labor force and in funding), output, quality, impact, and social impact (Moed & Plume, 2011). Most indicators are calculated through bibliometric measurements such as publication and citation counts, patent counts, translational contracts, and even alternative metrics. Established citation databases, such as Web of Science (WoS) and Scopus, extended with specialized analytic tools such as InCites and SciVal, are the main sources providing commonly used indicators. Waltman (2016) presented an in-depth overview of the main bibliographic databases and indicators, and made a distinction between basic size-dependent and size-independent indicators. Size-dependent indicators, such as the total number of citations, the number of highly cited publications, and the h-index, provide an overall performance measurement; size-independent indicators, such as the average number of citations per publication and the proportion of highly cited publications, reflect research quality independently of the size of the target entities (James et al., 2015; Waltman, 2016). Besides these basic indicators, variations and extensions such as the normalized citation impact and the number of documents in first-quartile (Q1) journals provide supplementary options for comparison. These indicators and metrics have been widely applied to evaluate research entities (e.g. scientists, research groups, departments, institutions, research-intensive enterprises) (Ding et al., 2013; Ibrahim, 2018; Kumar & Pandit, 2018), with funding allocation, strategic planning, and government policy-making as their goals.
These approaches are used in university rankings to benchmark academic institutions. Modern rankings began in 1983 with US News and World Report (Lukman et al., 2010), and today more than 80 different university rankings exist (Jia, 2018). These rankings draw attention from a broad range of users such as prospective students or faculty, employers, government agencies, and institutions themselves. As a result, many universities list their rankings in promotional materials, and some even set ranking targets in their strategic plans (The University of Manchester, 2015). Each ranking is based on a variety of indicators, many but not all of which are linked to research performance. Even for research activities, rankings bundle together multiple indicators that differ in nature and intended use. For example, the Academic Ranking of World Universities (ARWU) includes three WoS-related indicators, namely highly cited researchers, papers published in Nature and Science, and total publications in the Science Citation Index (SCI) and Social Science Citation Index (SSCI), each weighted at 20% of the final score (ARWU, 2018). Research productivity (publications, 6%) and research impact (citations, 30%), both based on Scopus, account for a total of 36% of the final weight in the THE World University Rankings (Times Higher Education, 2019). In the QS World University Rankings, the indicator of citations per faculty (total citations in Scopus) accounts for 20% of the overall score (QS, 2019). The Best Global Universities Rankings by U.S. News uses several WoS/InCites indicators, including total publication counts (10%), publications among the 10% most cited (22.5%), international collaboration (10%), highly cited papers (10%), total citations (7.5%), and normalized citation impact (10%), amounting to 70% of the final score (U.S. News, 2018).
Although these and other ranking systems have continuously improved, they still face a number of intrinsic challenges. First, they do not differentiate institutions with different missions, sizes, and research emphases, leading to a bias toward large and more comprehensive ones. Second, many of them rely on composite indices, so that universities focusing on and excelling in only some aspects are disadvantaged in the overall rankings; only recently have some ranking systems begun to offer individually faceted rankings such as research quality or translational capability (Shanghai Ranking, 2018). Third, many indicator-based schemes fail to provide the diagnostic power institutions could use to improve their performance. Relying on limited types of documents or impact measures also affects the applicability of rankings to different types of institutions. For example, books and policy studies play an important role in the social sciences and humanities, but are not or only partially covered in the usual source databases, creating special problems with funders or assessment agencies in those fields. Some indicators are criticized for other reasons; examples are the well-known abuse of the journal impact factor (JIF) (Garfield, 1972) and the frequent failure to account for disciplinary differences. These and other problems contribute to tendencies to simply increase the number of publications or the range of research (Amin & Mabe, 2000). Efforts to correct some of these problems include the “Norwegian Model” designed in 2005 (Sivertsen, 2018). However, the Norwegian model still tries to balance the number and the quality of publications at the same time, so it may not be suitable in all cases and for all purposes.
With increasing applications of scientometric methods, the literature on research indicators is now more inclined towards studying new indicators (Thelwall et al., 2015). However, over-reliance on single or composite indices, however good they may be, easily leads to biases and may hinder the development of research entities with specially focused missions (Vieira & Gomes, 2010). Furthermore, rankings and indicators can only present differences as numerical values, while the causes of these differences remain hidden. In addition, institutions have their own preponderant disciplines, so benchmarking the research performance of entire institutions while ignoring disciplinary or internal unit-based performance can easily fail to serve the diagnostic and planning purposes of institutions.
The San Francisco Declaration on Research Assessment (DORA) (ASCB, 2016) and The Leiden Manifesto (Hicks et al., 2015) discussed the problems and outlined good principles in research evaluation. In China, the General Office of the State Council issued “Opinions on deepening project, talent and institutional evaluation reform” emphasizing the establishment of a faceted assessment structure (Xinhuanet, 2018). Hence, flexible benchmarking methods geared to the specific missions and research emphases of individual and different institutions are needed. This paper describes the methods and practices ShanghaiTech employed for this purpose.

2 Methodology

2.1 Background

ShanghaiTech is a research university established jointly by the Shanghai Municipal Government and the Chinese Academy of Sciences (CAS) in 2013. It is set up as a small-scale, internationalized, first-class research institution aiming at tackling globally advanced and hard-fought scientific challenges. It focuses its research on a few selected subject fields in physical science and technology, biological science and technology, and information science and technology, and avoids trying to cover complete disciplines. As of June 2019, it has fewer than 200 faculty members and enrolls no more than 500 undergraduate and 1,000 graduate students yearly. The university evaluates the research performance of itself, its schools and institutes, and its researchers not by numbers of papers or average citations, but by research quality, impact, competitiveness, and the power or potential to lead internationally. The common ranking schemes therefore do not serve its mission, and may even be unfair to it. Consequently, ShanghaiTech has strived to design an individualized, mission-oriented assessment scheme to see how well it is fulfilling its mission and how each subject area is performing in this regard. The university also needs to be compared, on specifically chosen measurements, with the institutions it aims to surpass. It takes this effort seriously and invests in it continuously, with results reported directly to the president of ShanghaiTech University.

2.2 Principles

The following principles are established to guide the design of the scheme:
(1) Individually designed to suit the mission of ShanghaiTech, not pretending to be widely applicable without individualized adaptation.
(2) Focusing on research impact and competitiveness.
(3) Striving for the highest possible diagnostic power.
(4) Using different indicators to provide different analysis angles, instead of composite indexes.
(5) Enabling drilling down into sub-units, subject areas and even individual researchers to help development planning and faculty recruitment.
(6) Specifically comparing with a selected group of top research universities.

2.3 Assessment questions

In this study, an evidence-based benchmarking approach is proposed and tested in practice, and future improvements are suggested. These efforts are guided by the following assessment questions:
(1) What are the high quality and high impact research outputs using commonly accepted indicators as available from reliable data sources, and how do their values compare to other institutions?
(2) What is the relative competitiveness of research quality and impact when comparing ShanghaiTech with benchmarking institutions?
(3) The same question as question (2) but now studied at the disciplinary level.
(4) The same question as question (2) but now studied at the area level.
(5) The same question as question (2) but now at the level of individual units and researchers (the latter was carried out in another study not reported here).
Many further questions can be explored to diagnose in-depth the how and why of reported strengths and weaknesses. However, it is felt that these explorations are better dealt with by research experts and decision-makers exploring the data interactively together with information analysts, including data outside the standard assessment databases.

2.4 Selection of assessment indicators

Evidence-based assessment relies on objective indicators; these indicators must serve the principles and assessment questions, following a chain of analysis from the mission, through the principles and assessment questions, to the indicators, and finally to analytic results or “assessment answers” (Figure 1).
Figure 1. Flow diagram of assessment process.
The key underlying indicators are those representing research quality and impact, focusing on the number of high quality papers and their citation impact. A set of indicators is selected to represent research quality and impact, such as the number of papers published in Q1 journals of the SCI, highly cited papers according to ESI, or papers published in Cell, Nature, and Science (CNS, a category often used in China). Research impact is measured using citation-based impact indicators, including the Category Normalized Citation Impact (CNCI) (Clarivate Analytics, 2018). Competitiveness is demonstrated by the percentages of high quality papers and high impact papers of each institution in the benchmarking line-up. Total numbers of publications are still used, but only as background reference. For some analyses, international and domestic medians are calculated and displayed for easy comparison with ShanghaiTech’s values.
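For reference, the CNCI calculation can be sketched as follows (based on the cited InCites handbook definition; the notation is ours): for a paper p with c_p citations, CNCI_p = c_p / e_(f,t,d), where e_(f,t,d) is the expected (average) number of citations of papers in the same subject category f, publication year t, and document type d. The CNCI of a paper set or an institution is the arithmetic mean of its papers' CNCI_p values, so a value above 1 indicates citation impact above the world average for comparable papers.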
It is fully recognized that publishing in high quality journals does not necessarily imply high quality of any single paper, but, when averaged across institutions or disciplines, indicators such as the average CNCI may give an informative picture of the overall quality of an institution. It is also recognized that in some fields, especially computer science and IT, proceedings papers are important research outputs, and that in other fields patents need to be included to present a full picture of research performance. Because this study aims to evaluate ShanghaiTech’s research in terms of its basic-research mission, the current effort employs only data on research papers. Future iterations intend to include data on proceedings papers and patents, although these present challenges as to what to include and how to evaluate their “impact”.
Table 1 gives the set of indicators following the framework in Figure 1. Nearly all of the analytic angles can be applied to different research units (schools or institutes, disciplines, or groups) to enable drill-down analysis. The sizes of the entities may vary, but the majority of the indicators employed are size-independent, enabling assessment of relative competitiveness.
Table 1. Establishment of analytic dimensions and indicators for institutional benchmarking.
Dimension | Category | Indicator | Description | Data Source
Research Output | Publication counts | Number of publications | Total number of research papers published | InCites, SciVal
Research Impact | Counts of high quality publications | % papers in Q1 journals | Percentage of publications in Q1 journals of the JIF quartiles | InCites
Research Impact | Counts of high quality publications | % highly cited publications | Percentage of papers ranked in the top 1% by citations within a period of time | InCites, SciVal
Research Impact | Counts of high quality publications | % hot publications | Percentage of papers (articles and reviews) ranked in the top 1‰ by citations within a period of time | ESI
Research Impact | Counts of high quality publications | % publications in CNS | Percentage of papers published in Cell, Nature, and Science | WoS, Scopus
Research Impact | Citation impact | Times cited | Number of citations of the total publications | InCites, SciVal
Research Impact | Citation impact | h-index | h-index of the set of publications | WoS
Research Impact | Citation impact | Citation impact | Average (mean) number of citations per publication | InCites, SciVal
Research Impact | Citation impact | Normalized citation impact | Citation impact normalized for subject, year, and document type | InCites, SciVal
Research Topics (content analysis) | Knowledge maps | Content analysis based on research topics | Comparison of map structures built from keyword co-occurrence (reflecting research topics) between ShanghaiTech and benchmarking institutions | WoS, Scopus
Research Topics (content analysis) | Contribution to research fronts | Publications as core papers in research fronts, and publication-year comparison | Number of publications that are core papers in research fronts, and comparison of their publication years with the average publication year of all core papers in each research front | ESI
Research Connection | Collaboration | Main collaborators and research areas | Top collaborators by number of co-authored papers, and the main collaborative research areas with each collaborator | WoS, Scopus
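As an illustration of how the size-independent indicators in Table 1 can be derived from a paper-level export, the following sketch (not the authors' actual pipeline) uses pandas with hypothetical column names such as "journal_quartile", "is_highly_cited", and "expected_citations" (the category/year/document-type baseline needed for CNCI):

```python
# Sketch (not the authors' actual pipeline): deriving the size-independent
# indicators of Table 1 from a paper-level export. Column names are hypothetical.
import pandas as pd

papers = pd.DataFrame({
    "institution":        ["SHTech-A", "SHTech-A", "Univ E", "Univ E", "Univ E"],
    "citations":          [12, 3, 40, 8, 0],
    "journal_quartile":   ["Q1", "Q2", "Q1", "Q1", "Q3"],
    "is_highly_cited":    [True, False, True, False, False],   # top 1% in category/year
    "expected_citations": [10.0, 6.0, 15.0, 9.0, 4.0],          # category/year/doc-type baseline
})

# Paper-level CNCI: actual citations divided by the expected citations of
# comparable papers; the institutional CNCI is the mean over its papers.
papers["cnci"] = papers["citations"] / papers["expected_citations"]

indicators = papers.groupby("institution").agg(
    publications=("citations", "size"),                               # size-dependent reference
    pct_q1=("journal_quartile", lambda q: (q == "Q1").mean() * 100),  # % papers in Q1 journals
    pct_highly_cited=("is_highly_cited", lambda h: h.mean() * 100),   # % highly cited papers
    citation_impact=("citations", "mean"),                            # mean citations per paper
    cnci=("cnci", "mean"),
)
print(indicators.round(2))
```

Grouping by institution can be replaced by grouping by school, discipline, or PI, which yields the same indicators at any drill-down level.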

2.5 Selection of benchmarking institutions

As benchmarking is mission-oriented, the selection of benchmarking institutions is of the utmost importance. This selection must consider many characteristics of the targets, such as their nature, mission, scale, and research emphases. In line with the mission of ShanghaiTech to be a first-class research institution aiming at solving globally advanced and hard-fought scientific challenges, internationally top research universities must be chosen for benchmarking. Since ShanghaiTech is a Chinese university that will inevitably also be compared with top research universities in the country, a few internationally competitive top Chinese universities are selected too. Another consideration is that ShanghaiTech is a brand new and experimental university that is small in scale, internationalized, and selectively focused on certain research areas, so the selected institutions must be as comparable as possible. Finally, to limit the workload of evaluators, only a handful of institutions can be considered. A total of 8 international universities (Univ A-Univ H) are chosen from among the top 10 of the main worldwide rankings (e.g. U.S. News, ARWU), and 5 domestic universities (Univ I-Univ M) are selected from the top 5 Chinese institutions in the same rankings. Two newly established research-intensive universities are also added to the list (Univ N and Univ O). A research-area-guided selection strategy, targeting institutions that excel in individual research areas similar to those of ShanghaiTech but do not necessarily rank high overall, was considered but is left to the second phase of this work for workload reasons; nevertheless, some on-request analyses of research competitiveness with selected niche institutions as benchmarking targets have been conducted for individual schools/institutes and even labs/teams.

3 Data and Results

3.1 Data sources

Since research at ShanghaiTech is mostly in scientific fields and only high quality output is included in the assessment, the SCI-Expanded index of the WoS Core Collection was chosen as the data source, further limited to articles and reviews in ten research areas of the ESI schema, namely physics, chemistry, biology & biochemistry, materials science, engineering, computer science, molecular biology & genetics, immunology, microbiology, and neuroscience & behavior. The timespan covers the period from 2014 to the retrieval date (September 21, 2018), coinciding with the development period of ShanghaiTech. Some of the indicators can be obtained directly from the data sources, others are produced by InCites, and still others, such as the research content analysis producing subject-area knowledge maps, are produced with analytic and visualization tools such as VOSviewer and Derwent Data Analyzer.
A large number of researchers from CAS institutes work as adjunct professors at ShanghaiTech and publish papers with affiliations that include ShanghaiTech. This makes it difficult to determine the contributions of different types of faculty members. For this reason, three datasets of ShanghaiTech publications were constructed. SHTech-A contains all papers in which any co-author is affiliated with ShanghaiTech, whether full-time or adjunct. SHTech-B is the subset of SHTech-A in which a ShanghaiTech author is the first or corresponding author. SHTech-C is the subset of SHTech-B in which that author is a full-time faculty member of the university.
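A minimal sketch of how the three datasets could be derived from an author-level record table; the field names ("affiliations", "author_role", "is_fulltime") are hypothetical and stand in for whatever affiliation and faculty-status data are actually available:

```python
# Sketch of the SHTech-A/B/C dataset construction from a paper-author table.
# Field names ("affiliations", "author_role", "is_fulltime") are hypothetical.
import pandas as pd

records = pd.DataFrame({
    "wos_id":       ["W1", "W1", "W2", "W3"],
    "author":       ["Zhang", "Li", "Wang", "Chen"],
    "affiliations": [["ShanghaiTech", "CAS"], ["CAS"], ["ShanghaiTech"], ["ShanghaiTech"]],
    "author_role":  ["co-author", "corresponding", "corresponding", "first"],
    "is_fulltime":  [False, False, True, True],   # full-time ShanghaiTech faculty?
})

is_shtech = records["affiliations"].apply(lambda affs: "ShanghaiTech" in affs)
is_lead = records["author_role"].isin(["first", "corresponding"])

# SHTech-A: any co-author is affiliated with ShanghaiTech (full-time or adjunct).
shtech_a = set(records.loc[is_shtech, "wos_id"])
# SHTech-B: a ShanghaiTech author is first or corresponding author.
shtech_b = set(records.loc[is_shtech & is_lead, "wos_id"])
# SHTech-C: that first/corresponding author is a full-time faculty member.
shtech_c = set(records.loc[is_shtech & is_lead & records["is_fulltime"], "wos_id"])

print(len(shtech_a), len(shtech_b), len(shtech_c))   # 3 2 2 for this toy table
```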

3.2 General research performance of ShanghaiTech compared with benchmarking institutions

First, the numbers of high quality papers and high impact papers, and their percentages of each institution’s total output, were calculated and are presented in Table 2. Compared with the international leading universities, ShanghaiTech had lower percentages of high quality papers, received fewer citations, and showed less international collaboration. Yet, on many of the indicators it performed better than most domestic universities. Interestingly, SHTech-C had higher percentages of highly cited papers and higher CNCI values, indicating that the university’s full-time faculty performs quite well.
Table 2. Overall differences in research performance between ShanghaiTech and benchmarking institutions.
Region | Institution | Publications | % documents in Q1 journals | % highly cited papers | % hot papers | % international collaboration | h-index | Citation impact | CNCI
International | Univ A | 19,508 | 63.50% | 3.60% | 0.20% | 69.30% | 160 | 14.7 | 2.03
International | Univ B | 6,898 | 65.00% | 4.90% | 0.50% | 50.60% | 116 | 16.9 | 2.41
International | Univ C | 19,034 | 64.80% | 5.40% | 0.40% | 51.50% | 190 | 17.8 | 2.47
International | Univ D | 19,834 | 62.50% | 3.60% | 0.30% | 68.60% | 156 | 13.9 | 2.01
International | Univ E | 22,013 | 66.00% | 5.70% | 0.50% | 55.90% | 198 | 17.9 | 2.54
International | Univ F | 35,249 | 64.80% | 4.70% | 0.40% | 55.60% | 221 | 17.4 | 2.39
International | Univ G | 14,252 | 63.30% | 2.80% | 0.20% | 66.40% | 112 | 12.5 | 1.82
International | Univ H | 16,885 | 65.30% | 4.40% | 0.30% | 52.90% | 165 | 17.1 | 2.15
Domestic | Univ I | 31,673 | 50.40% | 2.40% | 0.10% | 32.70% | 139 | 8.9 | 1.45
Domestic | Univ J | 20,278 | 52.00% | 2.30% | 0.20% | 33.00% | 128 | 9.9 | 1.42
Domestic | Univ K | 27,699 | 45.90% | 1.40% | 0.10% | 31.70% | 108 | 7.2 | 1.19
Domestic | Univ L | 15,927 | 48.50% | 1.60% | 0.00% | 31.30% | 102 | 9.0 | 1.24
Domestic | Univ M | 19,123 | 52.10% | 2.60% | 0.10% | 30.10% | 121 | 9.7 | 1.43
Domestic | Univ N | 32,958 | 47.60% | 1.60% | 0.10% | 18.30% | 123 | 7.3 | 1.21
Domestic | Univ O | 2,074 | 53.10% | 2.30% | 0.10% | 41.50% | 48 | 7.6 | 1.6
ShanghaiTech | SHTech-A | 1,555 | 55.00% | 3.80% | 0.10% | 41.00% | 48 | 9.0 | 1.75
ShanghaiTech | SHTech-B | 660 | 52.00% | 3.00% | 0.00% | 37.30% | 27 | 5.6 | 1.58
ShanghaiTech | SHTech-C | 225 | 50.70% | 5.30% | 0.00% | 46.70% | 21 | 6.2 | 1.9

3.3 Research performance of ShanghaiTech compared with benchmarking institutions at discipline level

An analysis was conducted to see whether these differences change across research areas. The ratio of papers in each discipline to the total publications of each institution was calculated; knowing this ratio helps put the numbers in perspective and avoids comparisons with institutions that do not really focus on the chosen discipline. The same indicators as in the overall comparison were then employed at the disciplinary level. The international leading universities still had higher percentages of high quality papers, higher citation counts, and higher CNCI values, but in some disciplines ShanghaiTech performed better than the top domestic institutions and even came close to some international leading universities. Table 3 shows the results for molecular biology & genetics as an example; a sketch of the underlying calculation follows.
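A sketch of the discipline-share calculation and the median comparison used in Table 3; the counts below are illustrative values back-calculated from Tables 2 and 3, not a new dataset, and the actual analysis was run on the WoS/InCites export described in Section 3.1:

```python
# Sketch of the discipline-share and median comparison in Table 3. The counts are
# illustrative values back-calculated from Tables 2 and 3, not a new dataset.
import pandas as pd

counts = pd.DataFrame({
    "institution":     ["Univ A", "Univ E", "Univ I", "SHTech-A"],
    "region":          ["international", "international", "domestic", "shanghaitech"],
    "discipline_pubs": [2427, 2855, 896, 232],      # Molecular Biology & Genetics papers
    "total_pubs":      [19508, 22013, 31673, 1555],
})

# Share of the discipline in each institution's total output ("% of total publications").
counts["pct_of_total"] = counts["discipline_pubs"] / counts["total_pubs"] * 100

# Medians are taken over the benchmarking institutions, per region and overall,
# and then compared with ShanghaiTech's own values.
bench = counts[counts["region"] != "shanghaitech"]
print(counts[["institution", "pct_of_total"]].round(2))
print(bench.groupby("region")["pct_of_total"].median().round(2))
print(round(bench["pct_of_total"].median(), 2))
```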
Table 3. Disciplinary differences (Molecular Biology & Genetics) in research performance between ShanghaiTech and benchmarking institutions.
Region | Institution | % of total publications | % documents in Q1 journals | % highly cited papers | % hot papers | % international collaboration | h-index | Citation impact | CNCI
International | Univ A | 12.44% | 70.90% | 3.70% | 0.41% | 74.20% | 90 | 23.5 | 2.5
International | Univ B | 5.55% | 72.85% | 7.00% | 0.78% | 38.90% | 55 | 29.1 | 2.75
International | Univ C | 14.88% | 69.98% | 5.30% | 0.35% | 51.70% | 111 | 25.7 | 2.71
International | Univ D | 13.77% | 70.90% | 4.80% | 0.33% | 73.40% | 104 | 23.3 | 2.52
International | Univ E | 12.97% | 76.91% | 10.40% | 0.54% | 57.80% | 148 | 40.7 | 4.14
International | Univ F | 22.79% | 72.08% | 5.80% | 0.29% | 59.00% | 178 | 28.2 | 2.81
International | Univ G | 6.41% | 68.13% | 2.60% | 0.11% | 75.00% | 58 | 17.9 | 1.98
International | Univ H | 8.92% | 69.39% | 4.90% | 0.00% | 47.10% | 88 | 29.2 | 2.55
International | Median | 12.70% | 70.90% | 5.10% | 0.34% | 58.40% | 97 | 27 | 2.63
Domestic | Univ I | 2.83% | 61.61% | 2.70% | 0.00% | 58.40% | 53 | 15.9 | 1.84
Domestic | Univ J | 8.38% | 55.39% | 1.20% | 0.00% | 46.40% | 56 | 12.6 | 1.35
Domestic | Univ K | 10.68% | 50.51% | 0.70% | 0.03% | 37.10% | 57 | 9.4 | 1.1
Domestic | Univ L | 14.93% | 53.26% | 0.50% | 0.04% | 30.40% | 55 | 9.2 | 1.08
Domestic | Univ M | 1.82% | 54.60% | 1.10% | 0.00% | 32.90% | 31 | 14.6 | 1.33
Domestic | Univ N | 4.91% | 55.69% | 1.50% | 0.00% | 37.10% | 51 | 9.2 | 1.21
Domestic | Univ O | 3.66% | 57.89% | 0.00% | 0.00% | 33.50% | 11 | 5.6 | 1.12
Domestic | Median | 4.91% | 55.39% | 1.10% | 0.00% | 37.10% | 53 | 9.4 | 1.21
Total | Median | 8.92% | 68.13% | 2.70% | 0.04% | 47.10% | 57 | 17.9 | 1.98
ShanghaiTech | SHTech-A | 14.92% | 71.55% | 2.20% | 0.00% | 72.40% | 23 | 9.4 | 1.64
ShanghaiTech | SHTech-B | 15.15% | 71.00% | 3.00% | 0.00% | 49.10% | 15 | 7.9 | 1.68
ShanghaiTech | SHTech-C | 10.00% | 63.63% | 10.00% | 0.00% | 45.00% | 7 | 10.6 | 2.6

3.4 Difference between ShanghaiTech and benchmarking institutions on research topics in each discipline

A content analysis was conducted to further explore the differences in research topics, using co-occurrence analysis of keywords. Here the aim was to compare ShanghaiTech with the strongest institution in the chosen field: one research topic map was produced for ShanghaiTech and another for the institution with the highest CNCI in the discipline. In the WoS data, Keywords Plus terms, extracted from the titles of cited references, supplement the Author Keywords (Chen, 2017; Garfield, 1990). Based on all keywords and using the full counting method, the maps were generated with VOSviewer (version 1.6.10).
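The maps themselves were produced with VOSviewer; the toy sketch below only illustrates the full-counting co-occurrence rule that underlies them (each unordered keyword pair appearing in a paper adds one to that pair's link weight), with made-up keyword lists:

```python
# Toy sketch of full-counting keyword co-occurrence (the study used VOSviewer):
# every unordered keyword pair in a paper adds 1 to that pair's link weight.
from collections import Counter
from itertools import combinations

papers_keywords = [
    ["crystal structure", "gene expression", "embryonic stem cell"],
    ["gene expression", "cell proliferation"],
    ["crystal structure", "gene expression"],
]

cooccurrence = Counter()
for keywords in papers_keywords:
    for pair in combinations(sorted(set(keywords)), 2):   # full counting: 1 per paper
        cooccurrence[pair] += 1

for (kw1, kw2), weight in cooccurrence.most_common():
    print(f"{kw1} -- {kw2}: {weight}")
```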
Univ E was selected as the leading university in the area of molecular biology & genetics. Its keyword co-occurrence map led to six main clusters (indicated with colored nodes and links, Figure 2a). The themes of these clusters are stem cell and genes (blue cluster, main keywords: gene expression, DNA methylation, human genome, embryonic stem-cells), genetic mutation (green cluster, main keywords: cell lung cancer, T cell, somatic mutations, copy-number alteration), cancer therapy mechanism (pink cluster, main keywords: cancer, stem cell, drug delivery), protein (sky blue cluster, main keywords: protein, saccharomyces-cerevisiae, crystal structure), genome research (red cluster, main keywords: cardiovascular disease, susceptibility loci, genetic architecture), medical research (yellow cluster, main keywords: human brain, Alzheimer disease, skeletal muscle). The main co-occurring (linked) keywords of each cluster indicate the main research topics in each theme.
In the corresponding map for ShanghaiTech, there were also six major thematic clusters, built around the following principal keywords: crystal structure, embryonic stem cell, gene expression, cell proliferation, cell lung cancer, and hippo pathway. These six clusters are strongly interlinked (Figure 2b). Compared to Univ E, ShanghaiTech appears to have narrower research coverage and sparser linkage connections, which might be one reason for its relatively lower citations and impact.
The distributions themselves should be compared cautiously. Each institution has its own selected focuses in research topics, so the maps will naturally differ. Moreover, such a map by itself does not provide information on quality, impact, or leading capacity. Hence, further information, such as citation impact or the median publication year of each subject, may need to be superimposed onto the respective maps for a thorough comparison. Nevertheless, looking at the overlaps and gaps in topic distribution serves as a first step to guide attention to the actual research subjects, helps identify the strengths or uniqueness of an institution, alerts to missing trends or topics, and indicates where new researchers are needed to fill critical gaps.
Figure 2. Keyword maps for molecular biology & genetics of Univ E (a, left) and ShanghaiTech (b, right); mapped parameter: co-occurrence; unit of analysis: all keywords; threshold for (a) = 10, threshold for (b) = 2; full counting.
Another way to reveal the development of subjects is illustrated in Table 4. The most frequent keywords in the field of molecular biology & genetics across all benchmarking universities are listed, together with the year (during the period 2014-2018) in which each keyword first appeared in any of the institutions’ papers, indicated by √. Keywords that ShanghaiTech papers used for the first time in the same year as the benchmarking universities are indicated by a grey background. For a few keywords, ShanghaiTech published papers almost at the same time as the other institutions, but for most keywords ShanghaiTech papers appeared later than those of others. However, the time range of the data, starting in 2014, complicates the interpretation, because we do not know whether keywords that ShanghaiTech appeared to use concurrently with others had in fact occurred before 2014. A much longer time range is needed.
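A sketch of the first-appearance comparison behind Table 4, assuming a hypothetical keyword-usage table with institution, keyword, and publication year:

```python
# Sketch of the first-appearance analysis behind Table 4 (hypothetical fields):
# the first year each frequent keyword appears in an institution's papers, and
# ShanghaiTech's lag relative to the earliest benchmarking institution.
import pandas as pd

usage = pd.DataFrame({
    "institution": ["Univ E", "Univ E", "SHTech-A", "SHTech-A"],
    "keyword":     ["crystal structures", "gene expression",
                    "crystal structures", "gene expression"],
    "year":        [2014, 2014, 2014, 2016],
})

first_use = usage.pivot_table(index="keyword", columns="institution",
                              values="year", aggfunc="min")
benchmark_first = first_use.drop(columns="SHTech-A").min(axis=1)
first_use["lag_shtech"] = first_use["SHTech-A"] - benchmark_first
print(first_use)   # lag 0 = concurrent first use within the 2014-2018 window
```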
Table 4. The most frequent keywords in Molecular Biology & Genetics across benchmarking institutions, with the year of first appearance (2014-2018) in ShanghaiTech’s papers.
Frequent keywords (1-30): alzheimer disease, cancer cells, cell cycle, cell lung cancer, cell proliferation, colorectal-cancer, crystal structures, dna methylation, embryonic stem-cells, epithelial-mesenchymal transition, escherichia coli, gene expression, human genome, inflammation, lung cancer, mammalian-cells, mesenchymal stem cells, molecular mechanism, mouse model, nf-kappa-b, oxidative stress, pluripotent stem-cells, progenitor cells, saccharomyces-cerevisiae, signaling pathway, skeletal muscle, stem cell, susceptibility loci, tumor growth, tumor suppression.

3.5 Degree of leading research: comparing ShanghaiTech and benchmarking institutions

Analyzing the contribution to research fronts can help distinguish whether an institution is leading or following. A Research Front (RF) is a cluster of co-cited core papers together with the group of current papers citing them. Based on the core papers of each RF provided in the Essential Science Indicators (ESI) database, the publication data of each university were matched by WoS ID. A total of 9,662 research fronts and their corresponding core papers were retrieved, and the citation counts of each institution’s core papers were then obtained from WoS. ESI provides the average publication year of the core papers in each RF, and it can be assumed, to some extent, that the earlier an institution’s core papers were published within an RF, the more likely it was leading that research topic. On this basis, the publication years of each university’s core papers were compared with the average publication year of the corresponding RF, and we recorded whether papers were published earlier than, or no later than, this average.
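A sketch of this matching and year comparison, assuming a core-paper table with the per-front average publication year supplied by ESI (field names are ours, values are illustrative):

```python
# Sketch of the research-front (RF) comparison (field names are ours): institution
# papers are matched to ESI core papers by WoS ID and classified as published
# earlier than, or no later than, the RF's average publication year.
import pandas as pd

core = pd.DataFrame({
    "wos_id":       ["W1", "W2", "W3", "W4"],
    "rf_id":        [101, 101, 102, 102],
    "pub_year":     [2015, 2017, 2016, 2018],
    "rf_mean_year": [2016.0, 2016.0, 2017.0, 2017.0],   # average year supplied by ESI
})
institution_papers = {"Univ E": {"W1", "W3"}, "SHTech-A": {"W4"}}

for inst, wos_ids in institution_papers.items():
    hits = core[core["wos_id"].isin(wos_ids)]
    no_later = int((hits["pub_year"] <= hits["rf_mean_year"]).sum())
    earlier = int((hits["pub_year"] < hits["rf_mean_year"]).sum())
    print(inst, "core papers:", len(hits), "no later:", no_later, "earlier:", earlier)
```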
Generally, the international leading universities performed better than the domestic universities. Although ShanghaiTech and Univ O had a few core papers in research fronts, and published some of them no later than the average publication year (Table 5), neither had papers published earlier than the average publication year, which shows that these two institutions are not yet leading in these research fronts.
Table 5. Number of papers of each institution that are core papers in research fronts.
Institution | Papers in research fronts | Papers no later than average publication year | % no later | Papers earlier than average publication year | % earlier
Univ A | 382 | 180 | 47.12% | 42 | 10.99%
Univ B | 221 | 85 | 38.46% | 15 | 6.79%
Univ C | 570 | 319 | 55.96% | 77 | 13.51%
Univ D | 399 | 192 | 48.12% | 40 | 10.03%
Univ E | 713 | 364 | 51.05% | 65 | 9.12%
Univ F | 921 | 470 | 51.03% | 86 | 9.34%
Univ G | 260 | 123 | 47.31% | 22 | 8.46%
Univ H | 436 | 234 | 53.67% | 53 | 12.16%
Univ I | 434 | 221 | 50.92% | 47 | 10.83%
Univ J | 275 | 148 | 53.82% | 23 | 8.36%
Univ K | 213 | 104 | 48.83% | 21 | 9.86%
Univ L | 134 | 61 | 45.52% | 10 | 7.46%
Univ M | 256 | 142 | 55.47% | 26 | 10.16%
Univ N | 272 | 131 | 48.16% | 27 | 9.93%
Univ O | 27 | 6 | 22.22% | 0 | 0.00%
SHTech-A | 32 | 13 | 40.63% | 0 | 0.00%

3.6 Research performance of the internal units in ShanghaiTech

In order to assess the research performance of the internal units of ShanghaiTech (e.g. a school or a Principal Investigator (PI)), a graph of the number of publications, times cited, and CNCI of each PI was produced. The data are presented in Figure 3, with the X axis representing the number of publications of each PI and the Y axis the total number of citations, while node radii reflect CNCI values and node colors indicate the schools of the PIs. This figure immediately benchmarks productivity and impact for each PI of each internal unit: those in the upper-right corner published more papers and gained more citations, while those in the lower-left corner published fewer papers and received fewer citations. Some PIs published only a few highly cited papers, while others published more papers that attracted fewer citations. Admittedly, this might be an overly simplified description of research productivity, but further analysis based on it may provide interesting insights.
Figure 3. Map of PIs’ research productivity and competitiveness at ShanghaiTech.
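A minimal matplotlib sketch of such a chart, with purely illustrative PI values and placeholder school labels (not ShanghaiTech's actual data or school names):

```python
# Minimal sketch of the PI-level chart in Figure 3 with illustrative values:
# x = publications, y = total citations, marker area scales with CNCI,
# color = school (the school labels here are placeholders).
import matplotlib.pyplot as plt

pis = [
    # (school, publications, citations, CNCI)
    ("School A", 35, 900, 2.1),
    ("School A", 8, 150, 1.2),
    ("School B", 20, 350, 1.6),
    ("School C", 12, 600, 3.0),
]
colors = {"School A": "tab:blue", "School B": "tab:orange", "School C": "tab:green"}

labeled = set()
for school, n_pub, n_cit, cnci in pis:
    plt.scatter(n_pub, n_cit, s=200 * cnci, c=colors[school], alpha=0.6,
                label=school if school not in labeled else None)
    labeled.add(school)

plt.xlabel("Number of publications")
plt.ylabel("Times cited")
plt.legend(title="School")
plt.tight_layout()
plt.savefig("pi_benchmark.png", dpi=150)
```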

4 Discussion

This study is a preliminary exploration of a mission-oriented benchmarking effort. Different from common ranking methods, our approach focuses on research impact and competitiveness by comparing with top research universities, first at the university level, then at the level of disciplines, and further at the level of topics. The results were reported to the faculty and accepted by the university as one of the analytic angles in its self-assessment, complementary to international expert reviews, which remain the main evaluation approach. The university has commissioned improved studies using the same approach at yearly intervals. The basic approach and methodology can also be adapted by other universities.
Clarifying the principles and assessment questions is crucial, as different indicators and analytic methods must be selected accordingly. Traditional composite-index-based assessment is usually biased toward comprehensive institutions with large numbers of researchers and broad subject coverage (Borgman, 2015; Lopez-Illescas et al., 2009; Waltman & Eck, 2013), whereas our effort takes the nature and purpose of each institution into consideration and aims to design individual assessment schemes. The selection of benchmarking targets is also in line with this mission-oriented approach in the case of ShanghaiTech.
Multiple indicators are used to provide multiple angles and to avoid the bias caused by excessive reliance on a single category of indicators. We recall that Rousseau et al. (2018) noted that citation measurement may indicate, but is not equal to, research quality. Even though citation impact indicators are widely accepted, they can be influenced and biased by many factors, such as language and document type. Researchers, and whoever uses indicators in assessments, must be aware of the consequences of misusing them (Rousseau et al., 2018).
Faculty make-up needs to be considered when comparing institutions with obvious differences; in the case of ShanghaiTech, this means its large number of adjunct faculty. For this reason we defined multiple datasets according to employment status, so that their performance could be revealed separately. However, we were unable to distinguish the detailed categories of faculty at the benchmarking universities. We would argue that at these more established and bigger institutions adjunct faculty may not play a role as large as at ShanghaiTech. Moreover, comparing against a bigger pool of researchers in the target institutions is not necessarily a drawback, because it works against over-estimating one’s own performance.
As a mission-oriented model, future improvements are needed to provide better diagnostic power and to support research planning. Top institutions in common composite rankings may not be the real leaders in the specific research areas ShanghaiTech focuses on or plans to develop; finding those niche leaders for research-area-based benchmarking may provide a better understanding of one’s standing in the field. Using the ESI disciplinary categories is a choice of convenience only; constructing disciplinary categories according to an individual institution’s research focuses would produce more meaningful and diagnostically more helpful results, but with obvious challenges. Other quality or impact indicators may also be needed to avoid the influence of journal-centric ones. The topic distribution maps as used in this study are not yet able to reveal impact, degree of leading, or evolution. Limiting the analysis to research articles and to a particular database creates its own intrinsic problems, especially for institutions with a strong engineering research focus and corresponding investment, like ShanghaiTech. We understand that any analytic result like the one presented here is only a beginning for further exploration. Consequently, new techniques and tools are needed to integrate the original data and results, and to drill down, expand, connect, fuse, or otherwise analyze the data, leading to reports that researchers or decision-makers can use to explore new questions stimulated by the results. The authors are planning further improvements in the second and future phases of benchmarking.

Author Contributions

Jiang Chang (changjiang@shanghaitech.edu.cn) and Jianhua Liu (liujh@wanfangdata.com.cn) designed and produced the research. Jiang Chang wrote the paper.

The authors have declared that no competing interests exist.

References

1. Abramo, G., & D’Angelo, C. (2011). Evaluating research: From informed peer review to bibliometrics. Scientometrics, 87(3), 499-514. doi:10.1007/s11192-011-0352-7
2. Amin, M., & Mabe, M. (2000). Impact factors: Use and abuse. Perspectives in Publishing, 1(1).
3. ARWU. (2018). Academic ranking of world universities. Retrieved from
4. ASCB. (2016). San Francisco Declaration on Research Assessment (DORA). Retrieved from
5. Borgman, C. (2015). Big data, little data, no data: Scholarship in the networked world. Journal of the Association for Information Science & Technology, 67(3), 751-753.
6. Chen, C. (2017). Science mapping: A systematic review of the literature. Journal of Data and Information Science, 2(2), 1-40. doi:10.1515/jdis-2017-0006
7. Clarivate Analytics. (2018). InCites indicators handbook: Category Normalized Citation Impact. Retrieved from
8. Ding, Z.Q., Ge, J.P., Wu, X.M., & Zheng, X.N. (2013). Bibliometrics evaluation of research performance in pharmacology/pharmacy: China relative to ten representative countries. Scientometrics, 96(3), 829-844. doi:10.1007/s11192-013-0968-x
9. Garfield, E. (1972). Citation analysis as a tool in journal evaluation: Journals can be ranked by frequency and impact of citations for science policy studies. Science, 178(4060), 471-479.
10. Garfield, E. (1990). Keywords Plus - ISI’s breakthrough retrieval method. 1. Expanding your searching power on Current Contents on Diskette. Current Contents, 32, 5-9.
11. Hicks, D., Wouters, P., Waltman, L., de Rijcke, S., & Rafols, I. (2015). The Leiden Manifesto for research metrics. Nature, 520(7548), 429-431. doi:10.1038/520429a
12. Huang, Y., Zhu, D.H., Lv, Q., Porter, A.L., Robinson, D.K.R., & Wang, X.F. (2017). Early insights on the Emerging Sources Citation Index (ESCI): An overlay map-based bibliometric study. Scientometrics, 111(3), 2041-2057. doi:10.1007/s11192-017-2349-3
13. Ibrahim, B. (2018). Arab Spring’s effect on scientific productivity and research performance in Arab countries. Scientometrics, 117(3), 1555-1586. doi:10.1007/s11192-018-2935-z
14. James, W., Liz, A., Eleonora, B., Philip, C., Stephen, C., Steven, H., ..., & Ben, J. (2015). The metric tide: Report of the independent review of the role of metrics in research assessment and management.
15. Jia, T. (2018). China’s research works are underrepresented. Paper presented at the 2nd International Conference on Data-driven Knowledge Discovery, Beijing.
16. Kumar, A.J., & Pandit, R. (2018). Science and engineering research in India (1985-2016): Insights from two scientometric databases. Current Science, 115(3), 399-409. doi:10.18520/cs/v115/i3/399-409
17. Lopez-Illescas, C., de Moya Anegon, F., & Moed, H. (2009). Comparing bibliometric country-by-country rankings derived from the Web of Science and Scopus: The effect of poorly cited journals in oncology. Journal of Information Science, 35(2), 244-256. doi:10.1177/0165551508098603
18. Lukman, R., Krajnc, D., & Glavic, P. (2010). University ranking using research, educational and environmental indicators. Journal of Cleaner Production, 18(7), 619-628. doi:10.1016/j.jclepro.2009.09.015
19. Moed, H., & Plume, A. (2011). The multi-dimensional research assessment matrix. Research Trends, (23).
20. QS. (2019). QS World University Rankings.
21. Rousseau, R., Egghe, L., & Guns, R. (2018). Becoming metric-wise: A bibliometric guide for researchers. Chandos-Elsevier.
22. Shanghai Ranking. (2018). The best universities in China. Retrieved from
23. Sivertsen, G. (2018). The Norwegian model in Norway. Journal of Data and Information Science, 3, 3-19. doi:10.2478/jdis-2018-0017
24. The University of Manchester. (2015). Manchester 2020: The University of Manchester’s strategic plan.
25. Thelwall, M., Kayvan, K., Paul, W., Ludo, W., Sarah de, R., Alex, R., & Thomas, F. (2015). The metric tide: Literature review (Supplementary report to the independent review of the role of metrics in research assessment and management).
26. Times Higher Education. (2019). The World University Rankings.
27. U.S. News. (2018). Best Global Universities Rankings.
28. Vieira, E.S., & Gomes, J.A.N.F. (2010). Citations to scientific articles: Its distribution and dependence on the article features. Journal of Informetrics, 4(1), 1-13. doi:10.1016/j.joi.2009.06.002
29. Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365-391. doi:10.1016/j.joi.2016.02.007
30. Waltman, L., & Eck, N. (2013). Source normalized indicators of citation impact: An overview of different approaches and an empirical comparison. Scientometrics, 96(3), 699-716. doi:10.1007/s11192-012-0913-4
31. Xinhuanet. (2018). The General Office of the Central Committee of the Communist Party of China and the General Office of the State Council issued “Opinions on Deepening the Reform of Project, Talent and Institutional Assessment”. Retrieved from
