Research Paper

A Metric Approach to Hot Topics in Biomedicine via Keyword Co-occurrence

  • Jane H. Qin 1, 2,
  • Jean J. Wang 1, 2,
  • Fred Y. Ye 1
  • 1Jiangsu Key Laboratory of Data Engineering and Knowledge Service, School of Information Management, Nanjing University, Nanjing 210023, China
  • 2International Joint Informatics Laboratory (IJIL), Nanjing University - University of Illinois, Nanjing - Champaign, China - USA
Fred Y. Ye (E-mail: yye@nju.edu.cn).

Received date: 2019-09-24

  Request revised date: 2019-11-19

  Accepted date: 2019-11-24

  Online published: 2019-12-19

Open Access

Abstract

Purpose: To reveal the research hotspots of, and the relationships among, three hot research topics in biomedicine, namely CRISPR, iPS (induced pluripotent stem) cells, and synthetic biology.

Design/methodology/approach: We set up their keyword co-occurrence networks, using three indicators and information visualization for metric analysis.

Findings: The results reveal that the main research hotspots of the three topics differ, but the overlapping keywords among the three topics indicate that they are mutually integrated and interact with each other.

Research limitations: All analyses use keywords only, without any other forms of data.

Practical implications: We try to find the information distribution and structure of these three hot topics in order to reveal their research status and interactions and to promote biomedical developments.

Originality/value: We chose the core keywords of three hot research topics in biomedicine by using the h-index.

Cite this article

Jane H. Qin, Jean J. Wang, Fred Y. Ye. A Metric Approach to Hot Topics in Biomedicine via Keyword Co-occurrence[J]. Journal of Data and Information Science, 2019, 4(4): 13-25. DOI: 10.2478/jdis-2019-0018

1 Introduction

Since 2000, biomedicine has made great progress at both the scientific and technical levels. From the annual “Breakthrough of the Year” selections in Science, we identify three important hot topics in biomedicine.
The first topic is the genome editing technique CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated system). As a genome editing method, CRISPR/Cas was the top breakthrough in 2015. The CRISPR/Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements, such as those present in plasmids and phages, and thus provides a form of acquired immunity. Initially, CRISPR referred to segments of prokaryotic DNA containing short, repetitive base sequences in bacteria and archaea (Horvath & Barrangou, 2010). Later, the group of Jennifer Doudna introduced CRISPR/Cas9 as a tool to cut DNA guided by crRNAs in 2012 (Jinek et al., 2012), and the group of Feng Zhang applied CRISPR/Cas9 to eukaryotic cells in 2013 (Cong et al., 2013). Ma et al. (2017) described the correction of a pathogenic gene mutation in human embryos with CRISPR/Cas9. Cox et al. (2017) showed that RNA can be edited with CRISPR-Cas13 to correct disease-relevant human mutations and proposed an RNA-editing platform named REPAIR. Another nuclease, Cpf1, was discovered in 2015, and CRISPR/Cpf1 became another CRISPR system (Zetsche et al., 2015). Yan et al. (2019) systematically discovered additional subtypes of type V CRISPR-Cas systems. The diversity, modularity, and efficacy of CRISPR-Cas systems are driving a biotechnological revolution, and CRISPR-Cas guides the future of genetic engineering (Knott & Doudna, 2018).
The second topic is the stem cell technique of iPS cells. As a type of pluripotent stem cell, the iPS cell was selected as a breakthrough in both 2012 and 2016. The iPS cell technique was pioneered by Shinya Yamanaka’s lab in Kyoto, which showed in 2006 that the introduction of four specific genes encoding transcription factors could convert adult cells into pluripotent stem cells (Takahashi, 2006). For this work, Yamanaka was awarded the 2012 Nobel Prize along with Sir John Gurdon for the discovery that mature cells can be reprogrammed to become pluripotent. Since then, researchers have found a variety of improved induction methods (Anokye-Danso et al., 2011; Ma, Kong, & Zhu, 2017). Meanwhile, researchers have turned to introducing disease-associated mutations into iPS cells through gene editing. Paquet et al. (2016) generated cells with precise combinations of Alzheimer’s-associated mutations by introducing specific point mutations into iPS cells using CRISPR. iPS cells have broad application prospects in drug discovery and disease modelling (Scudellari, 2016).
The last topic is synthetic biology and artificial life, an interdisciplinary branch of biology and engineering that was selected as a breakthrough in 2010. Synthetic biologists come in two broad classes. One uses unnatural molecules to reproduce emergent behaviors from natural biology, with the goal of creating artificial life. The other seeks interchangeable parts from natural biology to assemble into systems that function unnaturally (Benner & Sismour, 2005). Gibson et al. (2010) reported the creation of a bacterial cell controlled by a chemically synthesized genome. Esvelt and Wang (2013) argued that genome-modification technologies, such as CRISPR/Cas, enable the rational engineering and perturbation of biological systems. Cameron, Bashor, and Collins (2014) reviewed the history of synthetic biology and pointed out that the field has achieved many notable results and is poised to transform biotechnology and medicine.
In this article, based on biomedical documents and data analysis, we try to find the information distribution and structure of these three hot topics by analyzing their keyword co-occurrence networks and visualizing their cores, in order to reveal their research status and interactions and to promote biomedical developments.

2 Methodology

We construct and analyze core keyword co-occurrence networks in this study. The methods focus on network analysis (Friedkin, 1991; Newman, 2004; Wolfe, 1997) and information visualization (Chen, 2006), and the data come from a scientific document database. For information visualization, VOSviewer (Eck & Waltman, 2010) is used to draw the network maps. For network analysis, Gephi (Bastian, Heymann, & Jacomy, 2009) and UCINET (Borgatti, Everett, & Freeman, 2002) are used to compute network parameters. In addition, the open software SATI (Liu & Ye, 2012), Excel, and R are used for data processing.

2.1 Methods

In this article, we chose the core keywords of three hot research topics in biomedicine by using the h-index. Hirsch (2005) proposed the h-index, defined as the largest number h such that h papers each have at least h citations, as a useful index to characterize the scientific output of a researcher. If all papers published by one researcher are arranged in descending order of citation frequency, and $c_i$ is the total number of citations of the i-th paper, the h-index can be expressed as $h=\max\{i : i \le c_i\}$.
Because the h-index takes into account both the number and the quality of the papers published by one researcher, it overcomes the shortcomings of earlier single-dimension measures, such as the number of papers or the number of citations (Bornmann & Daniel, 2005). Therefore, the h-index can evaluate the academic achievements of researchers more objectively, which has attracted widespread attention in scientific circles (Bornmann & Daniel, 2007). Braun, Glänzel, and Schubert (2006) were the first to propose that the h-index can be applied to evaluating the impact of journals, and believed that the h-index was a powerful supplement to the journal impact factor. Banks (2006) applied the h-index to identify the main research topics of compounds. Since then, the h-index has been used in many academic research areas. Its definition is extended as follows: the h-index of an academic information source is h if at least h of its articles have each been cited at least h times (Ye, 2014).
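The definition above can be illustrated with a minimal Python sketch. This is not the authors' own program (they report using R); the function name `h_index` and the example citation counts are assumptions chosen only for illustration.

```python
def h_index(citations):
    """Return the largest h such that at least h items have >= h citations each."""
    ranked = sorted(citations, reverse=True)  # descending order of citation count
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:   # the i-th ranked item still has at least i citations
            h = i
        else:
            break
    return h


# Hypothetical citation counts for six papers.
example_citations = [10, 8, 5, 4, 3, 0]
print(h_index(example_citations))  # 4: four papers have at least 4 citations each
```

The same function applies unchanged to any information source (a journal, a topic, or a keyword), as long as the list holds the citation counts of the articles belonging to that source.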
Keyword co-occurrence is considered a main means of identifying research themes (Wang et al., 2017). A keyword co-occurrence network reflects the knowledge structure and knowledge kernel of a field and displays the relationships between keywords (Su & Lee, 2010). Lee and Su (2010) believed that research hotspots can be evaluated by the centrality of the nodes in a keyword co-occurrence network. In a keyword co-occurrence network, nodes represent keywords, while edges represent co-occurrence relationships among them. By applying social network analysis to keyword co-occurrence networks, we can analyze the knowledge structure and hotspots of a research field.
This article evaluates the influence of nodes in the network based on degree centrality, betweenness centrality, and eigenvector centrality. Degree centrality reflects the importance of a node in the network: the higher the degree centrality of a node, the more important the node is, which means that the keyword represented by the node is more likely to be a research hotspot. Betweenness centrality measures the ability of one keyword in a network to affect the other keywords that appear together with it. Eigenvector centrality takes into account both the number of adjacent nodes and the influence of those adjacent nodes.
(1) Degree Centrality
In network analysis, centrality indices are real-valued functions on the vertices of a graph, and the values they produce are expected to provide a ranking that identifies the most important nodes. For a given graph G = (V, E) with vertex set V, edge set E, and n = |V| vertices, let $A=(a_{u,v})$ be the adjacency matrix, i.e. $a_{u,v}=1$ if vertex u is linked to vertex v and $a_{u,v}=0$ otherwise. The degree centrality score of vertex u can be defined as
$x_{u}=\sum_{v \in V}a_{u,v}$ (1)
The relative centrality of vertex u, normalized by the maximum possible degree n - 1, can be defined as
$x'_{u}=\frac{1}{n-1}\sum_{v \in V}a_{u,v}$ (2)
(2) Betweenness Centrality
Betweenness centrality is based on the shortest paths in a network and is used to evaluate the role of nodes in information integration in social networks. The higher the betweenness centrality of a node, the greater the role it plays in information integration. Let $G_{st}$ denote the number of shortest paths from vertex s to vertex t, and $G_{st}(v)$ the number of those paths that pass through vertex v. The betweenness centrality of vertex v can be defined as follows:
$x_{v}=\sum_{s \neq v \neq t}\frac{G_{st}(v)}{G_{st}}$ (3)
(3) Eigenvector Centrality
Since the entries in the adjacency matrix are non-negative, there is a unique largest eigenvalue, which is real and positive. The eigenvector associated with this largest eigenvalue gives the desired centrality measure, called eigenvector centrality or eigencentrality, which reveals the core importance of a vertex in a network. Because the adjacency matrix of an undirected co-occurrence network is symmetric, it is diagonalizable and its eigenvectors are orthogonal. The centrality of a vertex is proportional to the sum of the centralities of the vertices to which it is connected. The eigenvector centrality x can be described in two equivalent ways, in matrix form and as a sum:
$Ax=λx$ (4)
$λx_{i}=\sum_{j=1}^{n}a_{ij}x_{j},\ i=1,…,n$ (5)
where λ is the maximum eigenvalue of A and n is the number of vertices.
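To make the three measures concrete, the following pure-Python sketch computes them on a five-keyword toy network. It is an illustration under stated assumptions, not the procedure of Gephi or UCINET: the edge list is hypothetical, the graph is treated as unweighted and undirected, betweenness is computed with Brandes' algorithm, and eigenvector centrality is approximated by power iteration on A + I (a shift that leaves the eigenvectors of A unchanged but avoids oscillation on bipartite graphs).

```python
from collections import deque


def degree_centrality(adj):
    """Number of neighbours of each vertex (row sum of the adjacency matrix)."""
    return {u: len(neighbours) for u, neighbours in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair (s, t) was visited from both ends.
    return {v: b / 2 for v, b in score.items()}


def eigenvector_centrality(adj, iterations=1000, tol=1e-12):
    """Power iteration on A + I, scaled so that the largest entry equals 1."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        top = max(new.values())
        new = {v: value / top for v, value in new.items()}
        converged = max(abs(new[v] - x[v]) for v in adj) < tol
        x = new
        if converged:
            break
    return x


# Hypothetical co-occurrence edges, for illustration only.
toy_edges = [
    ("CRISPR", "GENOME EDITING"),
    ("CRISPR", "TALEN"),
    ("CRISPR", "ZEBRAFISH"),
    ("GENOME EDITING", "TALEN"),
    ("ZEBRAFISH", "CANCER"),
]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

deg = degree_centrality(adj)
btw = betweenness_centrality(adj)
eig = eigenvector_centrality(adj)
for keyword in sorted(adj, key=deg.get, reverse=True):
    print(keyword, deg[keyword], btw[keyword], round(eig[keyword], 3))
```

In this toy network "CRISPR" ranks first on all three measures, while "ZEBRAFISH" has a modest degree but a high betweenness because it is the only bridge to "CANCER". This is the kind of difference between the degree and betweenness rankings that the two keyword columns of a centrality table can show.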

2.2 Data

In our empirical study, we searched the Web of Science (WoS) database for articles published from 1900 to 2018. We collected the data in January 2019. The retrieval strategies were as follows:
(1) H1-CRISPR
TS=“clustered regularly interspaced short palindromic repeats” OR CRISPR
(2) H2-iPS cell
TS=“induce* pluripotent stem cell” OR “induce* pluripotent stem cells” OR “IPS cell” OR “IPS cells”
(3) H3-Synthetic biology
TS=“synthetic biology” OR “gene circuit” OR “gene circuits” OR “genetic circuit” OR “genetic circuits” OR “genetic device” OR “genetic devices” OR “synthetic life” OR “synthetic lives” OR “synthetic tissue” OR “synthetic tissues” OR “synthetic cell” OR “synthetic cells” OR “synthetic genome” OR “synthetic genomes” OR “synthetic gene” OR “synthetic genes” OR “minimal genome” OR “minimal genomes” OR “biology, synthetic”
The retrieved data are used in the next section to find core keywords and to set up keyword co-occurrence networks.

3 Keyword co-occurrence results

High-frequency keywords can reflect research hotspots and research directions to some extent, but a simple ranking of keywords by frequency conveys limited information. The h-index can comprehensively reflect both the occurrence frequency of keywords and their numbers of citations. In this article, the 100 keywords with the highest h-index are defined as core keywords. The keyword co-occurrence relationship can reflect the internal connections between keywords. In this section, we construct the co-occurrence matrix and co-word network with the help of the R language. In addition, we analyze the research hotspots of the three hot topics in biomedicine by using social network analysis.
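The selection of core keywords and the counting of co-occurrences can be sketched as follows. This is a hedged Python illustration rather than the authors' R program: it assumes that the h-index of a keyword is computed over the citation counts of the papers carrying that keyword, and the records, the function names, and the cut-off of 3 (instead of 100) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def h_index(citations):
    """Largest h such that at least h items have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)


def keyword_h_indices(papers):
    """Map each keyword to the h-index of the papers that carry it."""
    cites_by_keyword = defaultdict(list)
    for keywords, cites in papers:
        for kw in set(keywords):
            cites_by_keyword[kw].append(cites)
    return {kw: h_index(c) for kw, c in cites_by_keyword.items()}


def core_keywords(h_by_keyword, top_n):
    """Top-n keywords by h-index; ties are broken alphabetically."""
    ranked = sorted(h_by_keyword, key=lambda kw: (-h_by_keyword[kw], kw))
    return set(ranked[:top_n])


def cooccurrence_counts(papers, core):
    """Count how many papers contain each unordered pair of core keywords."""
    counts = Counter()
    for keywords, _ in papers:
        present = sorted(set(keywords) & core)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical records: (author keywords, times cited).
papers = [
    (["CRISPR", "GENOME EDITING", "ZEBRAFISH"], 12),
    (["CRISPR", "GENOME EDITING"], 7),
    (["CRISPR", "TALEN"], 3),
    (["GENOME EDITING", "TALEN"], 1),
]
h_by_keyword = keyword_h_indices(papers)
core = core_keywords(h_by_keyword, top_n=3)
pairs = cooccurrence_counts(papers, core)
print(h_by_keyword)
print(sorted(core))
print(dict(pairs))
```

The resulting pair counts form the upper triangle of a symmetric co-occurrence matrix, which can then be loaded into a network tool as a weighted edge list.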
In this study, the R language is used to extract the keywords and the numbers of citations corresponding to them. The h-index is calculated with our own program. The extracted keywords have the following problems.
(1) Case differences, such as “Induced pluripotent stem cell” and “induced pluripotent stem cell”, or “CRISPR/Cas9” and “CRISPR/cas9”.
(2) Inconsistent connectors, such as “CRISPR/Cas9”, and “CRISPR-Cas9”.
(3) Synonymous variants, such as abbreviations and full names. “iPSCs”, “Human-induced pluripotent stem cells”, and “Induced pluripotent stem cells (iPSCs)” have the same meaning.
Many keywords are special terms in the field of biomedicine, so current general dictionaries are not applicable to this study. The following two methods are therefore used for data processing. The first method is case conversion, which unifies all keywords into capital letters. The second method is a self-compiled dictionary, which solves the problems of inconsistent connectors and synonymous variants.
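The two cleaning steps can be sketched in a few lines of Python. The dictionary entries below are hypothetical examples built from the problems listed above; the authors' actual self-compiled dictionary is not given in the article, and the function name `normalize_keyword` is an assumption.

```python
import re

# Hypothetical excerpt of a self-compiled dictionary: variant -> preferred form.
# Keys are already upper case because case conversion is applied first.
VARIANTS = {
    "CRISPR-CAS9": "CRISPR/CAS9",
    "IPSCS": "IPSC",
    "INDUCED PLURIPOTENT STEM CELL": "IPSC",
    "INDUCED PLURIPOTENT STEM CELLS": "IPSC",
    "INDUCED PLURIPOTENT STEM CELLS (IPSCS)": "IPSC",
}


def normalize_keyword(raw, variants=VARIANTS):
    """Step 1: trim, collapse whitespace, and upper-case. Step 2: dictionary lookup."""
    key = re.sub(r"\s+", " ", raw.strip()).upper()
    return variants.get(key, key)


for raw in ["CRISPR-Cas9", "CRISPR/Cas9", "iPSCs", "induced pluripotent stem cell"]:
    print(raw, "->", normalize_keyword(raw))
```

Applying case conversion before the lookup keeps the dictionary small, because each variant needs only one upper-case entry.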

3.1 Keyword co-occurrence with CRISPR/Cas9

Fig. 1 shows the co-occurrence network of 100 keywords with the highest h-index in the field of CRISPR/Cas9. Node color represents matrix-based clustering and node size represents h-index. Table 1 shows the 20 nodes with the highest degree centrality and betweenness centrality.
Table 1 Co-occurrence network centrality of the keywords with the highest h-index in the field of CRISPR/Cas9.
| Rank | Keyword | Degree centrality | Keyword | Betweenness centrality |
|---|---|---|---|---|
| 1 | CRISPR | 188 | CRISPR | 1,378.00 |
| 2 | GENOME EDITING | 156 | GENOME EDITING | 664.00 |
| 3 | CRISPR/CAS | 132 | CRISPR/CAS | 441.37 |
| 4 | GENOME ENGINEERING | 94 | GENOME ENGINEERING | 157.52 |
| 5 | HOMOLOGOUS RECOMBINATION | 84 | GENES | 135.44 |
| 6 | GENES | 76 | HOMOLOGOUS RECOMBINATION | 111.44 |
| 7 | ZEBRAFISH | 70 | ZEBRAFISH | 110.97 |
| 8 | GENE TARGET | 66 | GENE REGULATION | 60.97 |
| 9 | TALEN | 64 | CRISPRI | 53.39 |
| 10 | SYNTHETIC BIOLOGY | 58 | APOPTOSIS | 49.33 |
| 11 | GENE REGULATION | 56 | DNA REPAIR | 48.34 |
| 12 | GENE THERAPY | 56 | SYNTHETIC BIOLOGY | 47.97 |
| 13 | DNA REPAIR | 54 | GENE TARGET | 47.13 |
| 14 | GENE KNOCKOUT | 54 | IPSC | 42.64 |
| 15 | IPSC | 50 | GENE KNOCKOUT | 41.15 |
| 16 | ZFN | 50 | TALEN | 41.09 |
| 17 | SGRNA | 46 | GENE THERAPY | 39.40 |
| 18 | CANCER | 42 | EVOLUTION | 37.89 |
| 19 | CRRNA | 42 | SGRNA | 34.15 |
| 20 | EVOLUTION | 42 | CANCER | 34.14 |
In the network, “CRISPR”, “GENOME EDITING”, “CRISPR/CAS”, “GENOME ENGINEERING”, “HOMOLOGOUS RECOMBINATION”, and “GENE TARGET” are located in the center of the network and have high degree centrality and betweenness centrality. Thus, they are the core research topics. “ZEBRAFISH”, “MOUSE”, “ZFN”, “TALEN”, “GENE THERAPY”, and “CANCER” are also important research subjects.
Interestingly, the keywords “IPSC”, “HUMAN IPSC”, “STEM CELL”, “SYNTHETIC BIOLOGY”, and “METABOLIC ENGINEERING” are conspicuous in the co-occurrence network. Moreover, these keywords have high degree centrality and betweenness centrality. This means that CRISPR/Cas9 research overlaps substantially with the other two hot topics.
Figure 1. Co-occurrence network with the highest h-index keywords in the field of CRISPR/Cas9.

3.2 Keyword co-occurrence with iPS cell

Fig. 2 shows the co-occurrence network of 100 keywords with the highest h-index in the field of iPS cell. Node color represents matrix-based clustering and node size represents h-index. Table 2 shows the 20 nodes with the highest degree centrality and betweenness centrality.
Figure 2. Co-occurrence network with the highest h-index keywords in the field of iPS cell.
Table 2 Co-occurrence network centrality of the keywords with the highest h-index in the field of iPS cell.
| Rank | Keyword | Degree centrality | Keyword | Betweenness centrality |
|---|---|---|---|---|
| 1 | STEM CELL | 184 | STEM CELL | 359.39 |
| 2 | EMBRYONIC STEM CELL | 170 | EMBRYONIC STEM CELL | 267.07 |
| 3 | REPROGRAMMING | 166 | HUMAN IPSC | 254.20 |
| 4 | HUMAN IPSC | 162 | REPROGRAMMING | 246.40 |
| 5 | PLURIPOTENT STEM CELL | 160 | DIFFERENTIATION | 225.41 |
| 6 | DIFFERENTIATION | 158 | PLURIPOTENT STEM CELL | 221.89 |
| 7 | REGENERATIVE MEDICINE | 118 | HUMAN EMBRYONIC STEM CELL | 110.40 |
| 8 | HUMAN EMBRYONIC STEM CELL | 116 | REGENERATIVE MEDICINE | 100.92 |
| 9 | NEURAL STEM CELL | 116 | MESENCHYMAL STEM CELL | 94.19 |
| 10 | MESENCHYMAL STEM CELL | 114 | NEURAL STEM CELL | 83.66 |
| 11 | CELL THERAPY | 102 | TRANSPLANTATION | 70.11 |
| 12 | PLURIPOTENCY | 102 | PLURIPOTENCY | 67.90 |
| 13 | TRANSPLANTATION | 102 | CELL THERAPY | 60.74 |
| 14 | TISSUE ENGINEERING | 98 | CARDIOMYOCYTES | 58.25 |
| 15 | CARDIOMYOCYTES | 90 | NEURON | 56.58 |
| 16 | NEURON | 90 | TISSUE ENGINEERING | 54.92 |
| 17 | DISEASE MODELING | 82 | HIPSC | 43.86 |
| 18 | HIPSC | 82 | DRUG SCREENING | 42.39 |
| 19 | PARKINSON’S DISEASE | 82 | GENE EXPRESSION | 40.25 |
| 20 | GENE EXPRESSION | 80 | DISEASE MODELING | 37.50 |
In the network, “STEM CELL”, “EMBRYONIC STEM CELL”, “REPROGRAMMING”, “HUMAN IPSC”, “PLURIPOTENT STEM CELL”, “DIFFERENTIATION”, “REGENERATIVE MEDICINE”, and “MESENCHYMAL STEM CELL” are located in the center of the network and have high degree centrality and betweenness centrality. Thus, they are the core research topics.
In the field of iPS cells, “CARDIOMYOCYTES”, “NEURON”, “PARKINSON’S DISEASE”, “CELL THERAPY”, and “DISEASE MODELING” are the hotspots.

3.3 Keyword co-occurrence with synthetic biology

Fig. 3 shows the co-occurrence network of 100 keywords with the highest h-index in the field of synthetic biology. Node color represents matrix-based clustering and node size represents h-index. Table 3 shows the 20 nodes with the highest degree centrality and betweenness centrality. In the network, “METABOLIC ENGINEERING”, “GENE CIRCUIT”, “GENE EXPRESSION”, “SYSTEMS BIOLOGY”, and “PROTEIN ENGINEERING” are located in the center of the network and have high degree centrality and betweenness centrality. Thus, they are the core research topics. “YEAST”, “ESCHERICHIA COLI”, and “SACCHAROMYCES CEREVISIAE” are experimental host organisms used in synthetic biology research. These keywords are located in the network center and have high degree centrality and betweenness centrality, which indicates that they are widely used in research.
Figure 3. Co-occurrence network with the highest h-index keywords in the field of synthetic biology.
In the keyword co-occurrence network, CRISPR/CAS9 also has high degree centrality and is an important node in its keyword cluster. This indicates that synthetic biology and CRISPR/CAS9 are correlated. In addition, both the frequency of cross-topic research and the number of citations are relatively high.
Table 3 Co-occurrence network centrality of the keywords with the highest h-index in the field of synthetic biology.
| Rank | Keyword | Degree centrality | Keyword | Betweenness centrality |
|---|---|---|---|---|
| 1 | METABOLIC ENGINEERING | 100 | METABOLIC ENGINEERING | 378.42 |
| 2 | YEAST | 90 | GENE EXPRESSION | 312.93 |
| 3 | GENE CIRCUIT | 82 | YEAST | 286.57 |
| 4 | ESCHERICHIA COLI | 80 | GENE CIRCUIT | 281.48 |
| 5 | GENE EXPRESSION | 80 | ESCHERICHIA COLI | 249.28 |
| 6 | SACCHAROMYCES CEREVISIAE | 80 | SACCHAROMYCES CEREVISIAE | 229.17 |
| 7 | PROTEIN ENGINEERING | 66 | SYNTHETIC GENE | 180.28 |
| 8 | DIRECTED EVOLUTION | 62 | PROTEIN ENGINEERING | 171.01 |
| 9 | GENE REGULATION | 60 | CELL CYCLE | 169.32 |
| 10 | SYNTHETIC GENE | 60 | GENE THERAPY | 158.88 |
| 11 | SYSTEMS BIOLOGY | 60 | TRANSCRIPTION | 146.23 |
| 12 | TRANSCRIPTION | 60 | DIRECTED EVOLUTION | 143.33 |
| 13 | GENE ENGINEERING | 56 | SYSTEMS BIOLOGY | 130.95 |
| 14 | BIOTECHNOLOGY | 50 | GENE REGULATION | 108.56 |
| 15 | GENE THERAPY | 50 | GENE ENGINEERING | 90.24 |
| 16 | CRISPR/CAS9 | 48 | ESSENTIAL GENE | 87.67 |
| 17 | CYANOBACTERIA | 46 | CELL-FREE PROTEIN SYNTHESIS | 76.44 |
| 18 | ESSENTIAL GENE | 46 | EVOLUTION | 71.90 |
| 19 | EVOLUTION | 44 | TRANSCRIPTION FACTOR | 70.90 |
| 20 | E. COLI | 42 | BIOTECHNOLOGY | 66.57 |

4 Discussion and conclusion

The results above are obtained by constructing core keyword co-occurrence networks, visualizing them, and calculating the centralities of the keywords. The three hot research topics in biomedicine are characterized as follows.
(1) The research hotspots of CRISPR/Cas9 include the comparison of this gene editing technology with the two previous generations (ZFN and TALEN), the discovery of new CRISPR/Cas9 systems, the improvement of gene editing techniques and methods, and the application of CRISPR/Cas9 in gene therapy and cancer therapy.
(2) The research hotspots of synthetic biology include “METABOLIC ENGINEERING”, “GENE CIRCUIT”, “GENE EXPRESSION”, “SYSTEMS BIOLOGY”, and “PROTEIN ENGINEERING”.
(3) The research hotspots of iPS cells include HUMAN IPSC, the comparison between iPS cells and EMBRYONIC STEM CELL, and the application of iPS cells in research on CARDIOMYOCYTES, NEURON, PARKINSON’S DISEASE, etc.
(4) There are overlapping keywords among the three biomedical topics, and the overlap between synthetic biology and CRISPR/Cas9 is the most obvious. Research on the three topics therefore overlaps.
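The keyword overlap described in point (4) can be checked with simple set intersections. The Python sketch below uses only the top-20 degree-centrality keywords listed in Tables 1 to 3 (with the garbled "Escherichia coli" entry repaired), so it is a partial illustration: the conclusion in the text rests on all 100 core keywords per topic, which are not listed in the article. The variable and function names are assumptions.

```python
from itertools import combinations

# Top-20 keywords by degree centrality, transcribed from Tables 1, 2, and 3.
top20 = {
    "CRISPR": {
        "CRISPR", "GENOME EDITING", "CRISPR/CAS", "GENOME ENGINEERING",
        "HOMOLOGOUS RECOMBINATION", "GENES", "ZEBRAFISH", "GENE TARGET",
        "TALEN", "SYNTHETIC BIOLOGY", "GENE REGULATION", "GENE THERAPY",
        "DNA REPAIR", "GENE KNOCKOUT", "IPSC", "ZFN", "SGRNA", "CANCER",
        "CRRNA", "EVOLUTION",
    },
    "IPS CELL": {
        "STEM CELL", "EMBRYONIC STEM CELL", "REPROGRAMMING", "HUMAN IPSC",
        "PLURIPOTENT STEM CELL", "DIFFERENTIATION", "REGENERATIVE MEDICINE",
        "HUMAN EMBRYONIC STEM CELL", "NEURAL STEM CELL",
        "MESENCHYMAL STEM CELL", "CELL THERAPY", "PLURIPOTENCY",
        "TRANSPLANTATION", "TISSUE ENGINEERING", "CARDIOMYOCYTES", "NEURON",
        "DISEASE MODELING", "HIPSC", "PARKINSON'S DISEASE", "GENE EXPRESSION",
    },
    "SYNTHETIC BIOLOGY": {
        "METABOLIC ENGINEERING", "YEAST", "GENE CIRCUIT", "ESCHERICHIA COLI",
        "GENE EXPRESSION", "SACCHAROMYCES CEREVISIAE", "PROTEIN ENGINEERING",
        "DIRECTED EVOLUTION", "GENE REGULATION", "SYNTHETIC GENE",
        "SYSTEMS BIOLOGY", "TRANSCRIPTION", "GENE ENGINEERING",
        "BIOTECHNOLOGY", "GENE THERAPY", "CRISPR/CAS9", "CYANOBACTERIA",
        "ESSENTIAL GENE", "EVOLUTION", "E. COLI",
    },
}


def pairwise_overlaps(keyword_sets):
    """Return the shared keywords for every unordered pair of topics."""
    return {
        (a, b): keyword_sets[a] & keyword_sets[b]
        for a, b in combinations(sorted(keyword_sets), 2)
    }


overlaps = pairwise_overlaps(top20)
for pair, shared in overlaps.items():
    print(pair, len(shared), sorted(shared))
```

Exact string matching understates the overlap, because related forms such as "IPSC" and "HUMAN IPSC", or "CRISPR/CAS" and "CRISPR/CAS9", are counted as different keywords. Even so, within these top-20 lists the CRISPR and synthetic biology pair shares the most keywords.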
Since all analyses use only keywords, without any other forms of data, this article has limitations, which may be addressed in future studies.

Author Contributions

Jane H. Qin (jhqin@smail.nju.edu.cn) collected and processed the data and wrote the Chinese version, Jean J. Wang (wangjingjing236@sina.com) re-organized the paper and wrote the English version, and Fred Y. Ye (yye@nju.edu.cn) initiated the research and revised the paper.

The authors have declared that no competing interests exist.

[1]
Anokye-Danso F., Trivedi C.M., Juhr D., Gupta M., Cui Z., Tian Y., Zhang Y., Yang W., Gruber P.J., & Epstein J.A. (2011). Highly efficient miRNA-mediated reprogramming of mouse and human somatic cells to pluripotency. Cell Stem Cell, 8(4), 376-388.Transcription factor-based cellular reprogramming has opened the way to converting somatic cells to a pluripotent state, but has faced limitations resulting from the requirement for transcription factors and the relative inefficiency of the process. We show here that expression of the miR302/367 cluster rapidly and efficiently reprograms mouse and human somatic cells to an iPSC state without a requirement for exogenous transcription factors. This miRNA-based reprogramming approach is two orders of magnitude more efficient than standard Oct4/Sox2/K1f4/Myc-mediated methods. Mouse and human miR302/367 iPSCs display similar characteristics to Oct4/Sox2/K1f4/Myc-iPSCs, including pluripotency marker expression, teratoma formation, and, for mouse cells, chimera contribution and germline contribution. We found that miR367 expression is required for miR3021367-mediated reprogramming and activates Oct4 gene expression, and that suppression of Hdac2 is also required. Thus, our data show that miRNA and Hdac-mediated pathways can cooperate in a powerful way to reprogram somatic cells to pluripotency.

DOI

[2]
Banks , M.G. (2006). An extension of the hirsch index: Indexing scientific topics and compounds. Scientometrics, 69(1), 161-168.An interesting twist of the Hirsch index is given, in terms of an index for topics and compounds. By comparing both the hb index and m for a number of compounds and topics, it can be used to differentiate between a new so-called hot topic with older topics. This quick method is shown to help new comers to identify how much interest and work has already been achieved in their chosen area of research.]]>

DOI

[3]
Bastian M., Heymann S., & Jacomy M. (2009). Gephi: An open source software for exploring and manipulating networks. In Proceedings of the Third International Conference on Weblogs and Social Media (361-362). San Jose, California, USA. ICWSM.

[4]
Benner , S.A., &Sismour, A.M. (2005). Synthetic biology. Nat Rev Genet, 6, 533-543.Synthetic biologists come in two broad classes. One uses unnatural molecules to reproduce emergent behaviours from natural biology, with the goal of creating artificial life. The other seeks interchangeable parts from natural biology to assemble into systems that function unnaturally. Either way, a synthetic goal forces scientists to cross uncharted ground to encounter and solve problems that are not easily encountered through analysis. This drives the emergence of new paradigms in ways that analysis cannot easily do. Synthetic biology has generated diagnostic tools that improve the care of patients with infectious diseases, as well as devices that oscillate, creep and play tic-tac-toe.

DOI PMID

[5]
Borgatti S.P., Everett M.G., & Freeman L.C. (2002). Ucinet for windows: Software for social network analysis, Harvard, MA: Analytic Technologies.

[6]
Bornmann , L., &Daniel , H.D. (2005). Does the h-index for ranking of scientists really work? Scientometrics, 65(3), 391-392.Hirsch (2005) has proposed the h-index as a single-number criterion to evaluate the scientific output of a researcher (Ball, 2005): A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Nph) papers have fewer than h citations each. In a study on committee peer review (Bornmann & Daniel, 2005) we found that on average the h-index for successful applicants for post-doctoral research fellowships was consistently higher than for non-successful applicants.]]>

DOI

[7]
Bornmann , L., &Daniel ,H. (2007). What do we know about the h index? Journal of the American Society for Information Science and Technology, 58(9), 1381-1385.Recommendations for anesthetic care are often difficult to implement in the intraoperative setting because of the requirement for continuous attention WHAT THIS ARTICLE TELLS US THAT IS NEW: Closed-loop, automated management of anesthetic, analgesic, fluid, and ventilation parameters was superior to manual control and might influence postoperative outcomes BACKGROUND:: Cognitive changes after anesthesia and surgery represent a significant public health concern. We tested the hypothesis that, in patients 60 yr or older scheduled for noncardiac surgery, automated management of anesthetic depth, cardiac blood flow, and protective lung ventilation using three independent controllers would outperform manual control of these variables. Additionally, as a result of the improved management, patients in the automated group would experience less postoperative neurocognitive impairment compared to patients having standard, manually adjusted anesthesia.

DOI PMID

[8]
Braun T., Glänzel W., & Schubert A. (2006). A hirsch-type index for journals. Scientometrics, 69(1), 169-173.We suggest that a h-type index - equal to h if you have published h papers, each of which has at least h citations - would be a useful supplement to journal impact factors.]]>

DOI

[9]
Cameron D.E., Bashor C.J., & Collins J.J. (2014). A brief history of synthetic biology. Nature Reviews Microbiology, 12, 381-390.The ability to rationally engineer microorganisms has been a long-envisioned goal dating back more than a half-century. With the genomics revolution and rise of systems biology in the 1990s came the development of a rigorous engineering discipline to create, control and programme cellular behaviour. The resulting field, known as synthetic biology, has undergone dramatic growth throughout the past decade and is poised to transform biotechnology and medicine. This Timeline article charts the technological and cultural lifetime of synthetic biology, with an emphasis on key breakthroughs and future challenges.

DOI

[10]
Chen , C. (2006). Information visualization: Beyond the horizon, Berlin: Springer.

[11]
Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., Hsu P.D., Wu X., Jiang W., & Marraffini L.A. (2013). Multiplex genome engineering using CRISPR/Cassystems. Science, 339, 819-823.Allostery is well documented for proteins but less recognized for DNA-protein interactions. Here, we report that specific binding of a protein on DNA is substantially stabilized or destabilized by another protein bound nearby. The ternary complex's free energy oscillates as a function of the separation between the two proteins with a periodicity of similar to 10 base pairs, the helical pitch of B-form DNA, and a decay length of similar to 15 base pairs. The binding affinity of a protein near a DNA hairpin is similarly dependent on their separation, which-together with molecular dynamics simulations-suggests that deformation of the double-helical structure is the origin of DNA allostery. The physiological relevance of this phenomenon is illustrated by its effect on gene expression in live bacteria and on a transcription factor's affinity near nucleosomes.

DOI

[12]
Cox D.B.T., Gootenberg J.S., Abudayyeh O.O., Franklin B., Kellner M.J., Joung J., & Zhang F. (2017). Rna editing with crispr-cas13. Science, 358, 1019-1027.Nucleic acid editing holds promise for treating genetic disease, particularly at the RNA level, where disease-relevant sequences can be rescued to yield functional protein products. Type VI CRISPR-Cas systems contain the programmable single-effector RNA-guided ribonuclease Cas13. We profiled type VI systems in order to engineer a Cas13 ortholog capable of robust knockdown and demonstrated RNA editing by using catalytically inactive Cas13 (dCas13) to direct adenosine-to-inosine deaminase activity by ADAR2 (adenosine deaminase acting on RNA type 2) to transcripts in mammalian cells. This system, referred to as RNA Editing for Programmable A to I Replacement (REPAIR), which has no strict sequence constraints, can be used to edit full-length transcripts containing pathogenic mutations. We further engineered this system to create a high-specificity variant and minimized the system to facilitate viral delivery. REPAIR presents a promising RNA-editing platform with broad applicability for research, therapeutics, and biotechnology.

DOI PMID

[13]
Eck , N.J.V., &Waltman ,L. (2010). Software survey: Vosviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523-538.We present VOSviewer, a freely available computer program that we have developed for constructing and viewing bibliometric maps. Unlike most computer programs that are used for bibliometric mapping, VOSviewer pays special attention to the graphical representation of bibliometric maps. The functionality of VOSviewer is especially useful for displaying large bibliometric maps in an easy-to-interpret way. The paper consists of three parts. In the first part, an overview of VOSviewer舗s functionality for displaying bibliometric maps is provided. In the second part, the technical implementation of specific parts of the program is discussed. Finally, in the third part, VOSviewer舗s ability to handle large maps is demonstrated by using the program to construct and display a co-citation map of 5,000 major scientific journals.

DOI

[14]
Esvelt , K.M., &Wang ,H.H. (2013). Genome-scale engineering for systems and synthetic biology. Molecular Systems Biology, 9, 641-641.Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering.


[15]
Friedkin, N.E. (1991). Theoretical foundations for centrality measures. American Journal of Sociology, 96, 1478-1504.


[16]
Gibson, D.G., Glass, J.I., Lartigue, C., Noskov, V.N., Chuang, R.Y., Algire, M.A., Benders, G.A., Montague, M.G., Ma, L., & Moodie, M.M. (2010). Creation of a bacterial cell controlled by a chemically synthesized genome. Science, 329(5987), 52-56.

[17]
Hirsch, J.E. (2005). An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102, 16569-16572.

[18]
Horvath, P., & Barrangou, R. (2010). CRISPR/Cas, the immune system of bacteria and archaea. Science, 327, 167-170.

[19]
Knott, G.A., & Doudna, J.A. (2018). CRISPR-Cas guides the future of genetic engineering. Science, 361, 866-869. The diversity, modularity, and efficacy of CRISPR-Cas systems are driving a biotechnological revolution. RNA-guided Cas enzymes have been adopted as tools to manipulate the genomes of cultured cells, animals, and plants, accelerating the pace of fundamental research and enabling clinical and agricultural breakthroughs. We describe the basic mechanisms that set the CRISPR-Cas toolkit apart from other programmable gene-editing technologies, highlighting the diverse and naturally evolved systems now functionalized as biotechnologies. We discuss the rapidly evolving landscape of CRISPR-Cas applications, from gene editing to transcriptional regulation, imaging, and diagnostics. Continuing functional dissection and an expanding landscape of applications position CRISPR-Cas tools at the cutting edge of nucleic acid manipulation that is rewriting biology.


[20]
Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., & Charpentier, E. (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science, 337, 816-821. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems provide bacteria and archaea with adaptive immunity against viruses and plasmids by using CRISPR RNAs (crRNAs) to guide the silencing of invading nucleic acids. We show here that in a subset of these systems, the mature crRNA that is base-paired to trans-activating crRNA (tracrRNA) forms a two-RNA structure that directs the CRISPR-associated protein Cas9 to introduce double-stranded (ds) breaks in target DNA. At sites complementary to the crRNA-guide sequence, the Cas9 HNH nuclease domain cleaves the complementary strand, whereas the Cas9 RuvC-like domain cleaves the noncomplementary strand. The dual-tracrRNA:crRNA, when engineered as a single RNA chimera, also directs sequence-specific Cas9 dsDNA cleavage. Our study reveals a family of endonucleases that use dual-RNAs for site-specific DNA cleavage and highlights the potential to exploit the system for RNA-programmable genome editing.


[21]
Lee, P.C., & Su, H.N. (2010). Investigating the structure of regional innovation system research through keyword co-occurrence and social network analysis. Innovation: Management, Policy & Practice, 12(1), 26-40.


[22]
Liu, Q., & Ye, Y. (2012). A study on mining bibliographic records by designed software SATI: Case study on library and information science. Journal of Information Resources Management, 22(1), 50-58.


[23]
Ma, H., Marti-Gutierrez, N., Park, S.W., Wu, J., & Mitalipov, S. (2017). Correction of a pathogenic gene mutation in human embryos. Nature, 548, 413-419. Genome editing has potential for the targeted correction of germline mutations. Here we describe the correction of the heterozygous MYBPC3 mutation in human preimplantation embryos with precise CRISPR-Cas9-based targeting accuracy and high homology-directed repair efficiency by activating an endogenous, germline-specific DNA repair response. Induced double-strand breaks (DSBs) at the mutant paternal allele were predominantly repaired using the homologous wild-type maternal gene instead of a synthetic DNA template. By modulating the cell cycle stage at which the DSB was induced, we were able to avoid mosaicism in cleaving embryos and achieve a high yield of homozygous embryos carrying the wild-type MYBPC3 gene without evidence of off-target mutations. The efficiency, accuracy and safety of the approach presented suggest that it has potential to be used for the correction of heritable mutations in human embryos by complementing preimplantation genetic diagnosis. However, much remains to be considered before clinical applications, including the reproducibility of the technique with other heterozygous mutations.


[24]
Ma, X., Kong, L., & Zhu, S. (2017). Reprogramming cell fates by small molecules. Protein & Cell, 8(5), 328-348. Reprogramming cell fates towards pluripotent stem cells and other cell types has revolutionized our understanding of cellular plasticity. During the last decade, transcription factors and microRNAs have become powerful reprogramming factors for modulating cell fates. Recently, many efforts are focused on reprogramming cell fates by non-viral and non-integrating chemical approaches. Small molecules not only are useful in generating desired cell types in vitro for various applications, such as disease modeling and cell-based transplantation, but also hold great promise to be further developed as drugs to stimulate patients' endogenous cells to repair and regenerate in vivo. Here we will focus on chemical approaches for generating induced pluripotent stem cells, neurons, cardiomyocytes, hepatocytes and pancreatic β cells. Significantly, the rapid and exciting advances in cellular reprogramming by small molecules will help us to achieve the long-term goal of curing devastating diseases, injuries, cancers and aging.


[25]
Newman, M. (2004). Coauthorship networks and patterns of scientific collaboration. Proceedings of the National Academy of Sciences, 101, 5200-5205.

[26]
Paquet, D., Kwart, D., Chen, A., Sproul, A., Jacob, S., Teo, S., Olsen, K.M., Gregg, A., Noggle, S., & Tessier-Lavigne, M. (2016). Efficient introduction of specific homozygous and heterozygous mutations using CRISPR/Cas9. Nature, 533, 125-129. The bacterial CRISPR/Cas9 system allows sequence-specific gene editing in many organisms and holds promise as a tool to generate models of human diseases, for example, in human pluripotent stem cells. CRISPR/Cas9 introduces targeted double-stranded breaks (DSBs) with high efficiency, which are typically repaired by non-homologous end-joining (NHEJ) resulting in nonspecific insertions, deletions or other mutations (indels). DSBs may also be repaired by homology-directed repair (HDR) using a DNA repair template, such as an introduced single-stranded oligo DNA nucleotide (ssODN), allowing knock-in of specific mutations. Although CRISPR/Cas9 is used extensively to engineer gene knockouts through NHEJ, editing by HDR remains inefficient and can be corrupted by additional indels, preventing its widespread use for modelling genetic disorders through introducing disease-associated mutations. Furthermore, targeted mutational knock-in at single alleles to model diseases caused by heterozygous mutations has not been reported. Here we describe a CRISPR/Cas9-based genome-editing framework that allows selective introduction of mono- and bi-allelic sequence changes with high efficiency and accuracy. We show that HDR accuracy is increased dramatically by incorporating silent CRISPR/Cas-blocking mutations along with pathogenic mutations, and establish a method termed 'CORRECT' for scarless genome editing. By characterizing and exploiting a stereotyped inverse relationship between a mutation's incorporation rate and its distance to the DSB, we achieve predictable control of zygosity. Homozygous introduction requires a guide RNA targeting close to the intended mutation, whereas heterozygous introduction can be accomplished by distance-dependent suboptimal mutation incorporation or by use of mixed repair templates. Using this approach, we generated human induced pluripotent stem cells with heterozygous and homozygous dominant early onset Alzheimer's disease-causing mutations in amyloid precursor protein (APP(Swe)) and presenilin 1 (PSEN1(M146V)) and derived cortical neurons, which displayed genotype-dependent disease-associated phenotypes. Our findings enable efficient introduction of specific sequence changes with CRISPR/Cas9, facilitating study of human disease.


[27]
Scudellari, M. (2016). A decade of iPS cells. Nature, 534, 310-312.


[28]
Su, H.N., & Lee, P.-C. (2010). Mapping knowledge structure by keyword co-occurrence: A first look at journal papers in technology foresight. Scientometrics, 85(1), 65-79. This study proposes an approach for visualizing a knowledge structure, the proposed approach creates a three-dimensional "Research focused parallelship network", a "Keyword Co-occurrence Network", and a two-dimensional knowledge map to facilitate visualization of the knowledge structure created by journal papers from different perspectives. The networks and knowledge maps can be depicted differently by choosing different information as the network actor, e.g. author, institute or country keyword, to reflect knowledge structures in micro-, meso-, and macro-levels, respectively. Technology Foresight is selected as an example to illustrate the method proposed in this study. A total of 556 author keywords contained in 181 Technology Foresight related papers have been analyzed. European countries, China, India and Brazil are located at the core of Technology Foresight research. Quantitative ways of mapping journal papers are investigated in this study to unveil emerging elements as well as to demonstrate dynamics and visualization of knowledge. The quantitative method provided in this paper shows a possible way of visualizing and evaluating knowledge structure; thus a computerized calculation is possible for potential quantitative applications, e.g. R&D resource allocation, research performance evaluation, science map, etc.


[29]
Takahashi, K. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell, 126, 663-676. Differentiated cells can be reprogrammed to an embryonic-like state by transfer of nuclear contents into oocytes or by fusion with embryonic stem (ES) cells. Little is known about factors that induce this reprogramming. Here, we demonstrate induction of pluripotent stem cells from mouse embryonic or adult fibroblasts by introducing four factors, Oct3/4, Sox2, c-Myc, and Klf4, under ES cell culture conditions. Unexpectedly, Nanog was dispensable. These cells, which we designated iPS (induced pluripotent stem) cells, exhibit the morphology and growth properties of ES cells and express ES cell marker genes. Subcutaneous transplantation of iPS cells into nude mice resulted in tumors containing a variety of tissues from all three germ layers. Following injection into blastocysts, iPS cells contributed to mouse embryonic development. These data demonstrate that pluripotent stem cells can be directly generated from fibroblast cultures by the addition of only a few defined factors.


[30]
Wang, P., Zhu, F., Song, H., & Hou, J. (2017). A bibliometric profile of Current Science between 1961 and 2015. Current Science, 113(3), 386-392.


[31]
Wolfe, A.W. (1997). Social network analysis: Methods and applications. American Ethnologist, 24, 136-137.


[32]
Yan, W.X., Hunnewell, P., Alfonse, L.E., Carte, J.M., Keston-Smith, E., Sothiselvam, S., Garrity, A.J., Chong, S., Makarova, K.S., Koonin, E.V., Cheng, D.R., & Scott, D.A. (2019). Functionally diverse type V CRISPR-Cas systems. Science, 363(6422), 88-91. Type V CRISPR-Cas systems are distinguished by a single RNA-guided RuvC domain-containing effector, Cas12. Although effectors of subtypes V-A (Cas12a) and V-B (Cas12b) have been studied in detail, the distinct domain architectures and diverged RuvC sequences of uncharacterized Cas12 proteins suggest unexplored functional diversity. Here, we identify and characterize Cas12c, -g, -h, and -i. Cas12c, -h, and -i demonstrate RNA-guided double-stranded DNA (dsDNA) interference activity. Cas12i exhibits markedly different efficiencies of CRISPR RNA spacer complementary and noncomplementary strand cleavage resulting in predominant dsDNA nicking. Cas12g is an RNA-guided ribonuclease (RNase) with collateral RNase and single-strand DNase activities. Our study reveals the functional diversity emerging along different routes of type V CRISPR-Cas evolution and expands the CRISPR toolbox.


[33]
Ye, F.Y. (2014). The research progress and developing perspective of assessment indicators. Journal of the China Society for Scientific and Technical Information, 33, 215-224.


[34]
Zetsche, B., Gootenberg, J.S., Abudayyeh, O.O., Slaymaker, I.M., Makarova, K.S., Essletzbichler, P., Volz, S.E., Joung, J., Van Der Oost, J., Regev, A., Koonin, E.V., & Zhang, F. (2015). Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell, 163, 759-771. The microbial adaptive immune system CRISPR mediates defense against foreign genetic elements through two classes of RNA-guided nuclease effectors. Class 1 effectors utilize multi-protein complexes, whereas class 2 effectors rely on single-component effector proteins such as the well-characterized Cas9. Here, we report characterization of Cpf1, a putative class 2 CRISPR effector. We demonstrate that Cpf1 mediates robust DNA interference with features distinct from Cas9. Cpf1 is a single RNA-guided endonuclease lacking tracrRNA, and it utilizes a T-rich protospacer-adjacent motif. Moreover, Cpf1 cleaves DNA via a staggered DNA double-stranded break. Out of 16 Cpf1-family proteins, we identified two candidate enzymes from Acidaminococcus and Lachnospiraceae, with efficient genome-editing activity in human cells. Identifying this mechanism of interference broadens our understanding of CRISPR-Cas systems and advances their genome editing applications.



Copyright © 2023 All rights reserved Journal of Data and Information Science

E-mail: jdis@mail.las.ac.cn Add:No.33, Beisihuan Xilu, Haidian District, Beijing 100190, China
