1 Introduction
Table 1. Linguistic complexity and reading experience of the four Chinese classic novels.
| Book title | Linguistic complexity | Reading experience | Reasoning |
|---|---|---|---|
| Journey to the West (西游记) | High difficulty | Low difficulty | Linguistic complexity: Rich in classical Chinese expressions, cultural references, and metaphors. Reading experience: Linear and episodic storyline, vivid characters, and engaging plot make it easy to follow and enjoyable. |
| Romance of the Three Kingdoms (三国演义) | High difficulty | High difficulty | Linguistic complexity: Complex sentence structures, historical terms, and strategic descriptions. Reading experience: Dense historical and military content, with intricate relationships and strategies that require significant understanding. |
| Water Margin (水浒传) | Medium difficulty | Medium difficulty | Linguistic complexity: More straightforward classical Chinese with less challenging syntax. Reading experience: Many characters and subplots demand attention, but the heroic themes and action sequences are engaging. |
| Dream of the Red Chamber (红楼梦) | High difficulty | High difficulty | Linguistic complexity: Elaborate classical Chinese with poetic and symbolic elements. Reading experience: Complex emotional depth, subtle cultural references, and numerous characters and relationships make it demanding. |
2 Literature review
2.1 The existing graded reading systems
Table 2. The existing graded reading systems.
| Graded reading system | Description | Limitation |
|---|---|---|
| A-Z System (Hiebert & Tortorelli, 2022; McNamara et al., 2014) | Categorizes books into 26 levels (A-Z), covering language difficulty and thematic content, with added factors like font and illustrations. | Primarily designed for English; not easily adaptable to languages with different script systems. |
| Oxford Reading System (Gorard & See, 2016; Smith & Doe, 2018) | Developed by Oxford University Press, uses vertical levels (based on age, cognitive and emotional development) and horizontal stages (e.g. phonics, comprehension). | Structured for English-speaking readers; lacks accommodation for cultural and linguistic differences in other regions. |
| Developmental Reading Assessment (DRA) (Beaver & Carter, 2024; Johnson & Lee, 2017) | A U.S. standard assessment evaluating reading comprehension, lexical knowledge, and reading strategies through progressive testing. | Primarily assesses English skills; limited in flexibility for application to other linguistic contexts. |
| Lexile System (Hiebert, 2005; McNamara et al., 2014; Smith et al., 2016; Zeng & Fan, 2017) | Uses semantic and grammatical complexity to determine reader levels and text difficulty, matching readers with suitable texts. | Focuses on English text; lacks cultural adaptation for non-English readers. |
| Chinese Southern Graded Reading Center (Nur, 2019; Qiang et al., 2020) | Based on the Lexile framework, divides grades 1-9 into four stages, considering text difficulty, narrative structure, and the integration of text and visuals. | Primarily focuses on linguistic complexity; limited attention to reader interest and emotional engagement. |
| Shanghai Graded Reading Ability Standards (Holzknecht et al., 2022; Kidwai et al., 2016; Zhao, 2020) | Adapts Lexile’s approach to measure reading attitudes, cognitive processes, and text difficulty in a Chinese context. | Relies on linguistic complexity and overlooks personalized reading interests and emotional dimensions. |
2.2 The limitations of current Chinese graded reading systems
2.3 Fuzzy evaluation of graded reading system
3 Methodology
3.1 The criteria of graded reading system
Figure 1. The graded reading evaluation criteria system. “Cost” and “Benefit” indicate the direction of evaluation for each criterion. For a “Cost” criterion, a lower value denotes lower reading difficulty (an easier text). For a “Benefit” criterion, a higher value denotes lower reading difficulty.
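The cost/benefit distinction can be illustrated with a direction-aware min-max rescaling, so that a score of 1 always means "easiest to read". This is only a minimal sketch: the function name `normalize`, the `direction` labels, and the sample sentence lengths are hypothetical, and the normalization actually used in the evaluation may differ.

```python
def normalize(values, direction):
    """Min-max scale raw scores to [0, 1] so that 1 always means 'easiest'.

    direction: "benefit" (higher raw value = easier) or
               "cost"    (lower raw value = easier).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread on this criterion: every book is treated as equivalent.
        return [1.0 for _ in values]
    if direction == "benefit":
        return [(v - lo) / (hi - lo) for v in values]
    if direction == "cost":
        return [(hi - v) / (hi - lo) for v in values]
    raise ValueError("direction must be 'benefit' or 'cost'")


# Hypothetical average sentence lengths for three books (a cost criterion).
sentence_length = [12, 20, 28]
scaled = normalize(sentence_length, "cost")
```

After rescaling, all criteria point in the same direction, which lets a later aggregation step compare them without tracking the sign of each criterion separately.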
3.2 The evaluation framework of graded reading system
Figure 2. The evaluation framework diagram. The process involves three steps, from left to right: collecting data, determining criterion weights, and calculating advantage degrees. First, data are collected and aggregated using Probabilistic Fuzzy Linguistic Terms (PFLT) and Probabilistic Linguistic Averaging (PLA). Qualitative data are assessed by experts using linguistic terms at five levels: Very Low (VL), Low (L), Medium (M), High (H), and Very High (VH), while quantitative data are directly measured values. Next, triangular and traditional entropy methods are used to calculate the weights of the two types of criteria. Finally, the TODIM method calculates the advantage degree of each book under each criterion and combines these degrees to rank the books.
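The final step of the framework can be sketched with the classical (crisp) TODIM procedure. This is an illustrative simplification, not the probabilistic linguistic variant described in the caption: it assumes the scores are already plain numbers in which higher is better, uses a simple difference as the distance between two books, and takes the weights as given. The function name `todim` and the loss-attenuation parameter `theta` are assumptions for illustration.

```python
import math


def todim(matrix, weights, theta=1.0):
    """Classical crisp TODIM (simplified sketch).

    matrix[i][c]: score of alternative i on criterion c, higher is better.
    weights[c]:   strictly positive criterion weights.
    theta:        attenuation factor applied to losses.
    Returns (overall dominance per alternative, dominance rescaled to [0, 1]).
    """
    n, m = len(matrix), len(weights)
    w_ref = max(weights)                  # reference criterion: largest weight
    rel = [w / w_ref for w in weights]    # weights relative to the reference
    rel_sum = sum(rel)

    def phi(i, j, c):
        """Dominance of alternative i over j on criterion c."""
        diff = matrix[i][c] - matrix[j][c]
        if diff > 0:   # gain
            return math.sqrt(rel[c] * diff / rel_sum)
        if diff < 0:   # loss, amplified relative to an equal-sized gain
            return -math.sqrt(rel_sum * (-diff) / rel[c]) / theta
        return 0.0

    dominance = [
        sum(phi(i, j, c) for j in range(n) for c in range(m))
        for i in range(n)
    ]
    lo, hi = min(dominance), max(dominance)
    eta = [(d - lo) / (hi - lo) if hi > lo else 1.0 for d in dominance]
    return dominance, eta


# Three hypothetical books scored on two criteria.
dominance, eta = todim([[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]], [0.6, 0.4])
```

The asymmetric treatment of gains and losses is what distinguishes TODIM from a plain weighted sum: a book that falls behind on one criterion is penalized more than an equal lead on another criterion rewards it.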
4 Results
4.1 Data encoding
4.2 Decision-making process
Table 3. The entropy value and weight of each criterion.
| Criteria | Quantitative criterion | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | C11 | C12 | C13 | C21 | C22 | C31 | C32 | C41 |
| Entropy value | 0.579 | 0.602 | 0.602 | 0.684 | 0.730 | 0.693 | 0.670 | 0.728 |
| Initial weight | 0.127 | 0.133 | 0.132 | 0.150 | 0.160 | 0.152 | 0.147 | 0.131 |
| Normalized weights | 0.076 | 0.080 | 0.079 | 0.090 | 0.096 | 0.091 | 0.088 | 0.072 |
| Ranking | 8th | 6th | 7th | 3rd | 1st | 2nd | 4th | 10th |
| Criteria | Qualitative criterion | | | | | | | |
| | C42 | C43 | C51 | C52 | C53 | C54 | C55 | |
| Entropy value | 0.628 | 0.601 | 0.667 | 0.667 | 0.762 | 0.689 | 0.812 | |
| Initial weight | 0.113 | 0.108 | 0.121 | 0.120 | 0.137 | 0.124 | 0.146 | |
| Normalized weights | 0.062 | 0.059 | 0.068 | 0.066 | 0.075 | 0.068 | 0.080 | |
| Ranking | 14th | 15th | 12th | 13th | 9th | 11th | 5th | |
Table 4. The comprehensive dominance, η(Bi), and ranking of all candidate books.
| Benchmark books | Comprehensive dominance | η(Bi) | Ranking |
|---|---|---|---|
| B1 | 6.714 | 1.000 | 1st |
| B2 | 2.325 | 0.953 | 3rd |
| B3 | 5.401 | 0.986 | 2nd |
| B4 | -14.815 | 0.769 | 5th |
| B5 | -11.959 | 0.799 | 4th |
| B6 | -22.123 | 0.690 | 6th |
| B7 | -37.006 | 0.530 | 7th |
| B8 | -41.461 | 0.482 | 9th |
| B9 | -40.341 | 0.494 | 8th |
| B10 | -75.107 | 0.121 | 10th |
| B11 | -81.314 | 0.054 | 11th |
| B12 | -86.359 | 0.000 | 12th |
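The η(Bi) column in Table 4 is numerically consistent with a min-max rescaling of the comprehensive dominance, η(Bi) = (δ(Bi) − min δ) / (max δ − min δ), which is the usual last step of TODIM. The short check below recomputes it from the tabulated dominance values; the variable and function names are illustrative only.

```python
# Comprehensive dominance values copied from Table 4.
dominance = {
    "B1": 6.714, "B2": 2.325, "B3": 5.401, "B4": -14.815,
    "B5": -11.959, "B6": -22.123, "B7": -37.006, "B8": -41.461,
    "B9": -40.341, "B10": -75.107, "B11": -81.314, "B12": -86.359,
}


def rescale(values):
    """Min-max rescale a {book: dominance} mapping to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    return {book: (d - lo) / (hi - lo) for book, d in values.items()}


eta = rescale(dominance)

# Books ordered from highest to lowest eta (rank 1 first).
ranking = sorted(eta, key=eta.get, reverse=True)

for position, book in enumerate(ranking, start=1):
    print(f"{position:>2}  {book:<4} {eta[book]:.3f}")
```

Because the rescaling is monotonic, the ranking depends only on the order of the comprehensive dominance values; η(Bi) simply expresses each book's position on a 0 to 1 scale between the lowest-ranked and highest-ranked book.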
5 Model validation
5.1 Internal comparison
Figure 3. The comparison of book rankings across various evaluation methods. Differently colored circles denote different books (B1, B2, …, B12). The vertical axis represents the ranking position, and the horizontal axis represents the evaluation method. Lines of the same style track the change in ranking of each book across methods, highlighting trends and variations.
Figure 4. The Kendall correlation between different ranking methods: WSTF’s Rank, Lexile’s Rank, Our Rank, Guide’s Rank, Exchange’s Rank, and Remove’s Rank. Significance levels are indicated as follows: *** p < 0.001 (two-tailed), ** p < 0.01 (two-tailed), and * p < 0.05 (two-tailed).
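As an illustration of the statistic reported in Figure 4, the sketch below computes Kendall's tau for two tie-free rankings by counting concordant and discordant pairs. `our_rank` is taken from Table 4; `other_rank` is a purely hypothetical comparison ranking, since the rankings of the other methods are not listed in this section. The sketch does not compute p-values; in practice a statistics library such as SciPy (`scipy.stats.kendalltau`) would typically be used for that.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a for two tie-free rankings of the same items."""
    if len(rank_a) != len(rank_b):
        raise ValueError("rankings must have the same length")
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1   # both methods order the pair the same way
        elif sign < 0:
            discordant += 1   # the methods disagree on this pair
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Rank positions of B1..B12 from Table 4.
our_rank = [1, 3, 2, 5, 4, 6, 7, 9, 8, 10, 11, 12]
# Hypothetical ranking from another method (NOT from the paper).
other_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

tau = kendall_tau(our_rank, other_rank)
```

With these inputs, three of the 66 book pairs are ordered differently by the two rankings, so tau = (63 − 3) / 66 ≈ 0.909.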
5.2 Cross-method comparison
Figure 5. The comparison of rankings under different decision methods.


