1 Introduction
2 Related works
3 Methods
3.1 Basic principles for identifying multidisciplinary problems
Table 1. Text pattern of abstracts and titles of scientific papers.
Research objective | Abstract features | Abstractive title |
---|---|---|
US | Study/investigate/test + individual object + structure/state/performance | Research/analysis of the performance/characteristics of problem |
SO | To address/tackle + problem + based on/utilizing + method + construct/propose/build | Study of problem based on method |
EXP-S | Summarize/review/introduce + individual object + current status/progress | The current status/overview of research on problem |
EXP-RG | Investigate/explore/analyze/discuss + the relationship/interaction mechanism/influence + multiple objects | The impact/mechanism of the problem |
3.2 Overall framework
Figure 1. Flowchart of the entire process.
3.2.1 Identifying the research objectives and disciplinary codes of papers
3.2.2 Generating abstractive titles
3.2.3 Extracting problem phrases from abstractive titles
3.2.4 Determining same problems
Figure 2. Process of identifying the same problems.
3.2.5 Identifying multidisciplinary research problems
4 Experiments and discussions
4.1 Data
Table 2. Discipline distribution of the number of papers in the CPCN dataset.
Main category | Data volume of main category | First-level category | Data volume of first-level category |
---|---|---|---|
07 Science | 1,917 | 0703 Chemistry | 1,334 |
 | | 0706 Atmospheric Sciences | 583 |
08 Engineering | 15,322 | 0805 Materials Science and Engineering | 736 |
 | | 0807 Power Engineering and Engineering Thermophysics | 1,008 |
 | | 0813 Architecture | 638 |
 | | 0817 Chemical Engineering and Technology | 5,309 |
 | | 0819 Mining Engineering | 767 |
 | | 0820 Oil and Gas Engineering | 1,008 |
 | | 0823 Transportation Engineering | 750 |
 | | 0828 Agricultural Engineering | 2,055 |
 | | 0830 Environmental Science and Engineering | 3,051 |
4.2 Experimental results on text classification of research objectives and disciplinary codes
4.2.1 The research objective classification
Table 3. Comparison of different methods for research objective classification.
Algorithm | Macro-Precision | Macro-Recall | Macro-F1 |
---|---|---|---|
SVM | 0.85 | 0.84 | 0.84 |
NB | 0.81 | 0.81 | 0.81 |
Random forest | 0.77 | 0.75 | 0.75 |
LSTM | 0.69 | 0.62 | 0.65 |
FastText | 0.71 | 0.67 | 0.68 |
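The macro-averaged metrics reported in Table 3 can be illustrated with a minimal sketch. It assumes the standard definitions: precision, recall, and F1 are computed per class and then averaged with equal weight per class, so macro-F1 is the mean of the per-class F1 values rather than the harmonic mean of macro-precision and macro-recall. The function name `macro_scores` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro-precision, macro-recall, macro-F1).

    Each class contributes equally, regardless of how many samples it has.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels (illustrative only) using two of the research-objective classes.
y_true = ["US", "US", "SO", "SO"]
y_pred = ["US", "SO", "SO", "SO"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Because every class carries the same weight, a classifier that does poorly on a rare class is penalized as much as one that does poorly on a frequent class.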
4.2.2 The disciplinary classification
Table 4. Comparison of stacking method and other methods in disciplinary classification.
Algorithm | Macro-Precision | Macro-Recall | Macro-F1 |
---|---|---|---|
SVM | 0.81 | 0.69 | 0.74 |
NB | 0.64 | 0.77 | 0.68 |
LSTM | 0.67 | 0.65 | 0.66 |
Stacking | 0.81 | 0.79 | 0.80 |
4.3 Experimental results on multidisciplinary research problems identification
4.3.1 Abstractive title generation
Table 5. Comparison of abstractive title generation between BART and ChatGLM.
Research Objective | Model | 1-Gram | 2-Gram | 3-Gram | BLEU | Exact Match | Unigram |
---|---|---|---|---|---|---|---|
US | ChatGLM | 0.560 | 0.462 | 0.371 | 0.402 | 0.182 | 0.417 |
 | BART | 0.582 | 0.474 | 0.376 | 0.411 | 0.145 | 0.369 |
SO | ChatGLM | 0.612 | 0.494 | 0.387 | 0.440 | 0.299 | 0.441 |
 | BART | 0.631 | 0.498 | 0.374 | 0.437 | 0.356 | 0.438 |
EXP-S | ChatGLM | 0.501 | 0.436 | 0.359 | 0.351 | 0.186 | 0.346 |
 | BART | 0.597 | 0.502 | 0.413 | 0.436 | 0.233 | 0.422 |
EXP-RG | ChatGLM | 0.588 | 0.487 | 0.401 | 0.422 | 0.197 | 0.441 |
 | BART | 0.610 | 0.509 | 0.422 | 0.428 | 0.201 | 0.434 |
ALL | BART | 0.577 | 0.463 | 0.372 | 0.408 | 0.203 | 0.367 |
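As a rough illustration of the n-gram columns in Table 5, the sketch below computes clipped (modified) n-gram precision as used in BLEU. It assumes those columns report n-gram precision in the BLEU sense, which the table itself does not state; the tokenization, the helper names `ngrams` and `modified_precision`, and the example titles are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


# Hypothetical generated title vs. hypothetical reference title.
generated = "study of desulfurization based on catalysis".split()
reference = "study of oxidation desulfurization based on catalysis".split()

p1 = modified_precision(generated, reference, 1)  # unigram precision
p2 = modified_precision(generated, reference, 2)  # bigram precision
```

A full BLEU score additionally combines the n-gram precisions geometrically and applies a brevity penalty; this sketch covers only the per-order precision.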
4.3.2 Multidisciplinary research problems identification
Table 6. Examples of multidisciplinary research problems.
Multidisciplinary research problems | The first-level disciplines involved |
---|---|
Catalytic, Cracking, Hydrogenation | 0703 Chemistry, 0817 Chemical Engineering and Technology, 0820 Oil and Gas Engineering |
Oxidation, Desulfurization, Catalytic | 0817 Chemical Engineering and Technology, 0820 Oil and Gas Engineering, 0830 Environmental Science and Engineering |
Rare earths, Catalysts, Environmentally friendly | 0805 Materials Science and Engineering, 0820 Oil and Gas Engineering |
Coal Combustion, Flue Gas, Distribution | 0817 Chemical Engineering and Technology, 0823 Transportation Engineering |
Communities, Microorganisms, Carbon Sources | 0828 Agricultural Engineering, 0830 Environmental Science and Engineering |
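The relationship shown in Table 6, where one research problem is linked to several first-level disciplines, can be sketched as a simple grouping step. This is only an illustration under stated assumptions: each paper is assumed to have already been reduced to a problem key and a first-level discipline code, and a problem is flagged when its papers span at least two codes. The function `find_multidisciplinary`, the `min_disciplines` parameter, and the single-discipline record are hypothetical and do not reproduce the paper's actual procedure.

```python
from collections import defaultdict


def find_multidisciplinary(records, min_disciplines=2):
    """Group (problem_key, discipline_code) pairs and keep problems
    whose papers span at least `min_disciplines` distinct codes."""
    codes_by_problem = defaultdict(set)
    for problem, code in records:
        codes_by_problem[problem].add(code)
    return {
        problem: sorted(codes)
        for problem, codes in codes_by_problem.items()
        if len(codes) >= min_disciplines
    }


# One pair per paper. The first problem mirrors a Table 6 row; the
# second is a made-up single-discipline case for contrast.
records = [
    ("Communities, Microorganisms, Carbon Sources", "0828"),
    ("Communities, Microorganisms, Carbon Sources", "0830"),
    ("Communities, Microorganisms, Carbon Sources", "0830"),
    ("Hypothetical single-discipline problem", "0813"),
]

multidisciplinary = find_multidisciplinary(records)
```

Using a set per problem means repeated papers from the same discipline do not inflate the discipline count.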
Table 7. Manual evaluation results.
Research problem | Quantities |
---|---|
Multidisciplinary research problems | 34 |
Single-discipline research problems | 16 |
4.3.3 Analysis of multidisciplinary research problems identification results
Figure 3. Discipline distribution chart of multidisciplinary research problems.