ChinaRxiv

A Simulation Study of Non-continuous Scoring under the Logistic Weighted Model

Jian Xiaozhu (简小珠), Dai Buyun

Submitted 2025-11-05 | ChinaXiv: chinaxiv-202511.00062 | Mixed source text

Note: Figures in this paper have not yet been translated.

Abstract

Through a test simulation study where polytomous items are scored non-continuously, the results show that the bias and root mean square error (RMSE) of item parameters under the Logistic weighted model are relatively small. This indicates that the Logistic weighted model can simulate non-continuous scoring scenarios and achieve accurate item parameter estimation when polytomous items are scored non-continuously. Based on the principle of the chi-square test, a new chi-square test index, Q5, is proposed. In the test simulation context, the fit statistical measures Q1, Q4, and Q5 under the Logistic weighted model are all smaller than the chi-square critical value, demonstrating that the test data of polytomous items and the model can achieve an effective fit under the Logistic weighted model.

Full Text

Preamble

Simulation Study Under the Logistic Weighted Model

School of Public Policy and Management, Department of Psychology

1. Introduction

In the field of psychological and educational measurement, the Logistic weighted model serves as a critical framework for understanding the relationship between latent traits and observed responses. This study aims to investigate the performance and stability of this model under specific conditions through a rigorous simulation design. By analyzing the parameter estimation accuracy and the model's robustness, we seek to provide empirical evidence for its application in public policy evaluation and psychological assessment.

2. Research Methodology

The simulation study employs a Monte Carlo approach to evaluate the Logistic weighted model. We focus on varying sample sizes and item parameters to observe their impact on the estimation of the latent variable $\theta$.

2.1 Model Specification

The probability of a correct response in the Logistic weighted model is defined by the following functional form:
$$P(X_{pi} = 1 | \theta_p) = \frac{\exp(w_i(\theta_p - \beta_i))}{1 + \exp(w_i(\theta_p - \beta_i))}$$
where $\theta_p$ represents the latent trait of individual $p$, $\beta_i$ denotes the difficulty parameter of item $i$, and $w_i$ represents the weight assigned to the item.
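As a minimal sketch of the response function above, the probability can be evaluated directly from $\theta_p$, $\beta_i$, and $w_i$. The function name `p_correct` and the example values are illustrative assumptions, not part of the original study.

```python
import math

def p_correct(theta, beta, w):
    """P(X = 1 | theta) for the weighted logistic form exp(z) / (1 + exp(z)),
    with z = w * (theta - beta)."""
    z = w * (theta - beta)
    # Algebraically identical to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values: ability equal to difficulty gives probability one half,
# and a larger weight sharpens the curve around the difficulty.
p_at_difficulty = p_correct(theta=0.0, beta=0.0, w=1.2)
p_low_weight = p_correct(theta=1.0, beta=0.0, w=1.0)
p_high_weight = p_correct(theta=1.0, beta=0.0, w=2.0)
```

When $\theta_p = \beta_i$ the exponent is zero, so the probability is 0.5 regardless of the weight.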

2.2 Data Generation

Data were generated using R software, simulating various conditions to reflect real-world psychological testing scenarios. We manipulated the following factors:
- Sample Size ($N$): Ranging from small ($N=200$) to large ($N=1000$) scales.
- Test Length ($L$): Fixed at 20 and 40 items to compare short and long assessment formats.
- Weight Distribution: Weights $w_i$ were sampled from a log-normal distribution to ensure positivity and theoretical consistency.
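The design factors above could be turned into a data generator along the following lines. This is only a sketch in Python (the study itself reports using R), and several details are assumptions rather than facts from the text: abilities and difficulties are drawn from a standard normal distribution, the log-normal weights use a log-scale standard deviation of 0.3, and the function name `simulate_responses` is hypothetical.

```python
import math
import random

def simulate_responses(n_persons, n_items, seed=0):
    """Simulate a 0/1 response matrix under the weighted logistic model."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]    # assumed N(0, 1)
    beta = [rng.gauss(0.0, 1.0) for _ in range(n_items)]       # assumed N(0, 1)
    # Log-normal weights are strictly positive; sigma = 0.3 is an assumption.
    w = [rng.lognormvariate(0.0, 0.3) for _ in range(n_items)]
    data = []
    for t in theta:
        row = []
        for b, wi in zip(beta, w):
            p = 1.0 / (1.0 + math.exp(-wi * (t - b)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return theta, beta, w, data

# Smallest condition described above: N = 200 examinees, 20 items.
theta, beta, w, data = simulate_responses(n_persons=200, n_items=20, seed=1)
```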

[TABLE:1]

3. Results and Analysis

The simulation results indicate that the Logistic weighted model maintains high recovery of parameters across most conditions. As shown in [FIGURE:1], the Root Mean Square Error (RMSE) for the latent trait $\theta$ decreases significantly as the sample size increases, confirming the asymptotic properties of the maximum likelihood estimators used in this study.

[FIGURE:1]

3.1 Parameter Estimation Accuracy

Through a simulation study of tests where polytomous items are scored non-continuously, the results demonstrate that the bias and Root Mean Square Error (RMSE) of item parameters under the Logistic Weighted Model are relatively small. This indicates that the Logistic Weighted Model is capable of effectively simulating non-continuous scoring scenarios and achieving accurate item parameter estimation when polytomous items follow a non-continuous scoring format.
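For reference, bias and RMSE are conventionally computed from the estimated and true (generating) parameter values as sketched below. The function names and the toy numbers are illustrative assumptions and are not results from this study.

```python
import math

def bias(estimates, true_values):
    """Mean signed error of the estimates relative to the generating values."""
    n = len(true_values)
    return sum(e - t for e, t in zip(estimates, true_values)) / n

def rmse(estimates, true_values):
    """Root mean square error of the estimates."""
    n = len(true_values)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, true_values)) / n)

# Toy example: errors of +0.1 and -0.1 cancel in the bias but not in the RMSE.
toy_bias = bias([1.1, 0.9], [1.0, 1.0])
toy_rmse = rmse([1.1, 0.9], [1.0, 1.0])
```

The toy case shows why both indices are reported: bias detects systematic over- or underestimation, while RMSE also reflects the spread of the errors.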

Furthermore, a new chi-square test index was proposed based on the traditional chi-square test. Under the test simulation conditions, the fit statistics for the Logistic Weighted Model were consistently lower than the critical chi-square values. These findings suggest that the Logistic Weighted Model achieves an effective fit between the model and the test data for polytomous items.

Estimation of Non-continuous Category-Scored Polytomous Weighted-Score Logistic Models: A Simulation Study

Jian Xiaozhu, Dai Buyun
(School of Public Policy and Administration, Nanchang University, Nanchang, China; School of Psychology, Jiangxi Normal University, Nanchang, China)

Abstract

In the field of psychometrics and educational measurement, the Logistic weighted-score model is frequently employed to analyze polytomous data. However, traditional models often assume continuous scoring categories. This study investigates the model fitting and parameter estimation of non-continuous category-scored polytomous weighted-score Logistic models. Through a series of simulation studies, we evaluate the robustness and accuracy of the estimation procedures under varying conditions of sample size and category distribution. The model extends the dichotomous logistic function by adding a weighted-score parameter, making it suitable for analyzing non-continuous-category-scored polytomous items. Discrimination and difficulty parameters were successfully estimated using marginal maximum likelihood estimation.

1. Introduction

Under the framework of Item Response Theory (IRT), researchers have proposed numerous polytomous scoring models, including the Graded Response Model (GRM), the Rating Scale Model (RSM), the Partial Credit Model (PCM), and the Generalized Partial Credit Model (GPCM). However, these models have historically focused on items scored with consecutive integers (e.g., 0, 1, 2, 3). Little research has examined whether they apply to non-consecutively scored items (e.g., 0, 2, 5).

Jian et al. (2011) suggested that the Logistic model could potentially fit polytomous items, but their study only reported results for consecutive scoring. This paper utilizes a test simulation design to evaluate the parameter recovery performance of polytomous items with non-consecutive scoring under the Logistic model and investigates the goodness-of-fit between simulated test data and the proposed model.

2. Model Specification and Goodness-of-Fit Testing

The probability of a respondent $j$ with latent trait $\theta_j$ selecting category $k$ for item $i$ in a weighted-score Logistic model can be expressed as:

$$P(X_{ij} = k | \theta_j) = \frac{\exp(w_{ik} \alpha_i \theta_j + \beta_{ik})}{\sum_{m=0}^{K_i} \exp(w_{im} \alpha_i \theta_j + \beta_{im})}$$

where $\alpha_i$ is the discrimination parameter, $w_{ik}$ is the weight for category $k$, and $\beta_{ik}$ is the difficulty parameter.
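The category probabilities in the expression above form a softmax over the terms $w_{ik}\alpha_i\theta_j + \beta_{ik}$, which can be sketched as follows. Using the allowed score values (0, 2, 5) as the category weights is an assumption made here for illustration, as are the function name `category_probs` and the parameter values.

```python
import math

def category_probs(theta, alpha, weights, betas):
    """Category probabilities: softmax of w_k * alpha * theta + beta_k."""
    logits = [w * alpha * theta + b for w, b in zip(weights, betas)]
    m = max(logits)                       # subtract the max for numerical stability
    expz = [math.exp(z - m) for z in logits]
    total = sum(expz)
    return [e / total for e in expz]

# Hypothetical item scored 0, 2, or 5 (non-consecutive scores used as weights).
probs_mid = category_probs(theta=0.5, alpha=1.2, weights=[0, 2, 5], betas=[0.0, -0.5, -2.0])
probs_high = category_probs(theta=5.0, alpha=1.2, weights=[0, 2, 5], betas=[0.0, -0.5, -2.0])
```

The probabilities sum to one for any $\theta_j$, and at high ability the mass concentrates on the category with the largest weight.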

2.1 Pearson Chi-Square Test and Deviance

The fundamental method for assessing fit is the Pearson chi-square statistic:
$$\chi^2 = \sum_{i=1}^{g} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed frequency and $E_i$ is the expected frequency. For the Elliott model and related IRT models, the item fit statistic is often calculated across ability groups ($G$):
$$\chi^2_i = \sum_{g=1}^{G} \frac{N_g (O_{ig} - E_{ig})^2}{E_{ig}(1 - E_{ig})}$$
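Both statistics above are direct sums and can be sketched as follows. The function names are hypothetical, and the second function assumes that $O_{ig}$ and $E_{ig}$ are proportions strictly between 0 and 1, which is what the $E_{ig}(1 - E_{ig})$ divisor requires.

```python
def pearson_chi2(observed, expected):
    """Pearson chi-square over cells of observed and expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def item_fit_chi2(n_per_group, observed_prop, expected_prop):
    """Item fit across ability groups: sum of N_g (O - E)^2 / (E (1 - E))."""
    return sum(
        n * (o - e) ** 2 / (e * (1.0 - e))
        for n, o, e in zip(n_per_group, observed_prop, expected_prop)
    )

# Toy values (not data from the study).
toy_pearson = pearson_chi2([10, 20], [15, 15])          # 25/15 + 25/15
toy_item_fit = item_fit_chi2([100], [0.6], [0.5])       # 100 * 0.01 / 0.25
```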

2.2 Improved Chi-Square Formula ($Q_5$)

Traditional chi-square statistics can become unstable or disproportionately large when expected values are small. To address this limitation in the formulas proposed by Yen (1981), this paper introduces a new chi-square index, designated $Q_5$ as in the abstract, which takes the square root of the divisor to reduce its degree:
$$Q_5 = \sum \frac{(O - E)^2}{\sqrt{E}}$$
This modification narrows the gap between the chi-square contributions at different points on the ability scale $\theta$, so that the index serves as a more robust indicator. For the Logistic Weighted Model, the chi-square formula for an item with multiple score categories is:
$$\chi^2_i = \sum_{g=1}^G \sum_{k=0}^M \frac{N_g(P_{igk} - E_{igk})^2}{E_{igk}}$$
where $M$ is the maximum score, $N_g$ is the number of examinees in ability group $g$, $P_{igk}$ is the observed proportion, and $E_{igk}$ is the expected proportion for category $k$ of item $i$ in group $g$.
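To illustrate the effect of the square-root divisor, the sketch below compares a single cell's contribution under the Pearson form and under the square-root form, and also implements the multi-category sum. It assumes the quantities are proportions (so that $\sqrt{E} > E$ for small $E$); the function names and numbers are illustrative only.

```python
import math

def pearson_term(o, e):
    """Single-cell contribution (O - E)^2 / E."""
    return (o - e) ** 2 / e

def sqrt_divisor_term(o, e):
    """Single-cell contribution with the square-root divisor, (O - E)^2 / sqrt(E)."""
    return (o - e) ** 2 / math.sqrt(e)

def multi_category_chi2(n_per_group, observed, expected):
    """Sum over groups g and categories k of N_g (P_gk - E_gk)^2 / E_gk."""
    return sum(
        n * (p - e) ** 2 / e
        for n, p_row, e_row in zip(n_per_group, observed, expected)
        for p, e in zip(p_row, e_row)
    )

# A sparse cell: expected proportion 0.01, observed proportion 0.05.
sparse_pearson = pearson_term(0.05, 0.01)
sparse_sqrt = sqrt_divisor_term(0.05, 0.01)
# One ability group of 50 examinees and a two-category item.
toy_chi2 = multi_category_chi2([50], [[0.5, 0.5]], [[0.4, 0.6]])
```

In the sparse cell the Pearson contribution is ten times the square-root contribution, which is the kind of inflation the modified divisor is meant to dampen.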

3. Simulation Study

3.1 Simulation Methodology

The simulation process for the examinee score matrix involves:
1. Simulating discrimination parameters $a \sim LN(0, 0.5)$, constrained to $[0.2, 2.5]$.
2. Simulating difficulty parameters $b \sim N(0, 1)$, constrained to $[-3, 3]$.
3. Simulating ability parameters $\theta \sim N(0, 1)$, constrained to $[-3, 3]$.
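The three sampling steps above might be sketched as follows. Two points are assumptions, since the text does not specify them: the constraint is implemented by redrawing values that fall outside the interval (clipping would be another reading), and the second argument of $LN(0, 0.5)$ is treated as the standard deviation on the log scale. The helper names are hypothetical.

```python
import random

def draw_truncated(draw, low, high):
    """Redraw until the value falls inside [low, high] (assumed constraint rule)."""
    while True:
        x = draw()
        if low <= x <= high:
            return x

def simulate_parameters(n_items, n_persons, seed=0):
    rng = random.Random(seed)
    # Step 1: discrimination a ~ LN(0, 0.5), constrained to [0.2, 2.5].
    a = [draw_truncated(lambda: rng.lognormvariate(0.0, 0.5), 0.2, 2.5)
         for _ in range(n_items)]
    # Step 2: difficulty b ~ N(0, 1), constrained to [-3, 3].
    b = [draw_truncated(lambda: rng.gauss(0.0, 1.0), -3.0, 3.0)
         for _ in range(n_items)]
    # Step 3: ability theta ~ N(0, 1), constrained to [-3, 3].
    theta = [draw_truncated(lambda: rng.gauss(0.0, 1.0), -3.0, 3.0)
             for _ in range(n_persons)]
    return a, b, theta

a, b, theta = simulate_parameters(n_items=20, n_persons=500, seed=42)
```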

For non-continuous scoring, a "score leaping" algorithm is applied. For example, if an item has a maximum score of 2 but only allows the scores $\{0, 2\}$, a simulated score of 1 is not permitted and is moved to 0 or 2 according to a random number $r$. If $r > (2-1)/2 = 0.5$, the score becomes 2; otherwise, it becomes 0. The same logic extends to score ranges with multiple leap points.
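The score-leaping rule can be sketched as below. The worked case in the text (score 1 with allowed scores 0 and 2) fixes the threshold at $(2-1)/2$; generalizing this to (upper - score) / (upper - lower) for other gaps is an assumption of this sketch, as are the function name `leap_score` and the choice of a uniform random number on $[0, 1)$. The sketch also assumes the allowed set contains both the minimum and the maximum score, so every disallowed score has a neighbor on each side.

```python
import random

def leap_score(score, allowed, rng=random):
    """Move a disallowed simulated score to a neighboring allowed score."""
    if score in allowed:
        return score
    lower = max(s for s in allowed if s < score)   # nearest allowed score below
    upper = min(s for s in allowed if s > score)   # nearest allowed score above
    # Assumed generalization of the (2 - 1) / 2 threshold from the text.
    threshold = (upper - score) / (upper - lower)
    return upper if rng.random() > threshold else lower

class FixedRandom:
    """Deterministic stand-in for a random source, used only for illustration."""
    def __init__(self, value):
        self.value = value
    def random(self):
        return self.value

leap_up = leap_score(1, [0, 2], FixedRandom(0.9))      # r > 0.5, so the score leaps to 2
leap_down = leap_score(1, [0, 2], FixedRandom(0.1))    # r <= 0.5, so the score drops to 0
```

Under this generalization a disallowed score is more likely to leap to the allowed score it is closer to.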

3.2 Results and Analysis

The simulation results for both continuous and non-continuous scoring scenarios indicate that the Logistic Weighted Model maintains high parameter recovery.
- Continuous Scoring: The $\chi^2$ values were significantly smaller than the critical values (e.g., $\chi^2_{0.05}(19) = 30.143$, $\chi^2_{0.05}(29) = 42.557$).
- Non-Continuous Scoring: Bias and RMSE were closely aligned with values found in dichotomous research. As shown in [TABLE:1], the total chi-square values remained below critical thresholds, indicating a good fit.
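The critical values quoted above can be checked numerically. The sketch below uses only the Python standard library: it evaluates the chi-square distribution function through the series expansion of the regularized lower incomplete gamma function and then finds the upper 5% point by bisection. The function names are hypothetical, and this is a verification aid rather than the procedure used in the study.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_critical(df, alpha=0.05):
    """Upper-alpha critical value found by bisection on the CDF."""
    lo, hi = 0.0, 10.0 * df + 50.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

crit_19 = chi2_critical(19)
crit_29 = chi2_critical(29)
```

The two results should come out close to 30.14 and 42.56, in line with the values for 19 and 29 degrees of freedom quoted above.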

[FIGURE:N] illustrates that the expected probabilities and observed score proportions across ability segments are consistent with the theoretical expectations of the weighted Logistic model.

4. Conclusion

This study demonstrates that the Logistic Weighted Model is robust and capable of accurate parameter estimation for polytomous items with non-continuous scoring. The proposed $Q_5$ fit statistic provides a more stable measure of model-data fit than traditional indices. Simulation results confirm that the model accommodates both continuous and non-continuous scoring structures, providing a flexible framework for educational and psychological assessments.

Submission history

[v1] 2025-11-05
