Year: 2020 Volume: 11 Issue: 2 Page Range: 131-146 Language: English DOI: 10.21031/epod.660273 Indexing Date: 08-10-2020

An Evaluation of 4PL IRT and DINA Models for Estimating Pseudo-Guessing and Slipping Parameters

Abstract:
In an achievement test, examinees with the knowledge and skill required by a test item are expected to answer the item correctly, whereas examinees who lack the necessary knowledge are expected to answer it incorrectly. However, an examinee may answer a multiple-choice item correctly by guessing, or may answer an easy item incorrectly because of anxiety or carelessness. Either case can bias the estimation of examinee abilities and item parameters. The four-parameter logistic item response theory (4PL IRT) model and the deterministic inputs, noisy 'and' gate (DINA) model can be used to mitigate these negative effects on parameter estimation. The current simulation study compares the pseudo-guessing and slipping parameters estimated with the 4PL IRT model and with the DINA model under several study conditions. The DINA model was used to simulate the datasets. The results showed that the bias of the estimated slipping and guessing parameters was reasonably small for both the 4PL IRT and DINA models, although the estimates were more biased when the datasets were analyzed with the 4PL IRT model than with the DINA model (the average bias of both the guessing and slipping parameters was .00 for the DINA model versus .08 for the 4PL IRT model). Accordingly, both the 4PL IRT and DINA models can be considered for analyzing datasets contaminated with guessing and slipping effects.
Keywords:

Document Type: Article Article Type: Research Article Access Type: Open Access
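
To make the pseudo-guessing and slipping parameters described in the abstract concrete, the two models can be written as follows. These are the standard formulations from the psychometric literature, in generic notation rather than necessarily the authors' own:

    4PL IRT:  P(X_{ij} = 1 \mid \theta_j) = c_i + \frac{d_i - c_i}{1 + \exp[-a_i(\theta_j - b_i)]}

    DINA:     P(X_{ij} = 1 \mid \alpha_j) = (1 - s_i)^{\eta_{ij}} \, g_i^{1 - \eta_{ij}}, \qquad \eta_{ij} = \prod_k \alpha_{jk}^{q_{ik}}

In the 4PL IRT model, the lower asymptote c_i captures pseudo-guessing and the upper asymptote d_i captures slipping (slipping corresponds to 1 - d_i). In the DINA model, \eta_{ij} equals 1 only when examinee j has mastered every attribute that the Q-matrix requires for item i; g_i is the probability of a correct answer without full mastery (guessing), and s_i is the probability of an incorrect answer despite full mastery (slipping).

The minimal sketch below is illustrative only, not the authors' simulation code: the Q-matrix, parameter values, and sample size are hypothetical. It generates responses from a DINA model with known guessing and slipping values and checks that they can be recovered, mirroring the bias criterion reported in the abstract. In the study itself, the two models would be fit to the responses alone, for example with R packages such as CDM/GDINA (for the DINA model) and mirt (for the 4PL IRT model).

    # Minimal illustrative sketch (assumed values; not the authors' simulation code):
    # generate item responses from a DINA model with known guessing (g) and slipping (s)
    # parameters, and define the 4PL item response function that could be fit to such data.
    import numpy as np

    rng = np.random.default_rng(0)

    Q = np.array([[1, 0, 0],      # hypothetical Q-matrix: 6 items x 3 attributes,
                  [0, 1, 0],      # 1 = the attribute is required by the item
                  [0, 0, 1],
                  [1, 1, 0],
                  [0, 1, 1],
                  [1, 0, 1]])
    g = np.full(6, 0.15)          # true pseudo-guessing: P(correct | a required attribute is missing)
    s = np.full(6, 0.10)          # true slipping: P(incorrect | all required attributes mastered)

    N = 1000
    alpha = rng.integers(0, 2, size=(N, 3))            # examinee attribute profiles (0/1 mastery)
    eta = (alpha @ Q.T == Q.sum(axis=1)).astype(int)   # 1 if all required attributes are mastered

    # DINA item response function: P(X=1) = (1-s)^eta * g^(1-eta) = g + (1 - s - g) * eta
    p = g + (1 - s - g) * eta
    X = rng.binomial(1, p)                             # simulated dichotomous responses (N x 6)

    def irf_4pl(theta, a, b, c, d):
        """4PL IRF: lower asymptote c ~ pseudo-guessing, upper asymptote d = 1 - slipping."""
        return c + (d - c) / (1 + np.exp(-a * (theta - b)))

    # With the latent mastery status known, simple proportions recover g and s; the article
    # instead estimates them by fitting the DINA and 4PL IRT models to the responses alone.
    g_hat = np.array([X[eta[:, j] == 0, j].mean() for j in range(6)])
    s_hat = np.array([1 - X[eta[:, j] == 1, j].mean() for j in range(6)])
    print(np.round(g_hat - g, 3))                      # per-item bias of guessing, near 0
    print(np.round(s_hat - s, 3))                      # per-item bias of slipping, near 0
    print(irf_4pl(np.array([-3.0, 0.0, 3.0]), a=1.5, b=0.0, c=0.15, d=0.90))

As in the abstract, bias here is the average difference between the estimated and true guessing or slipping values across items.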
  • Barton, M. A., & Lord, F. M. (1981). An upper asymptote for the three-parameter logistic item-response model (Research Report 18-21). Princeton, NJ: Educational Testing Service. doi: 10.1002/j.2333- 8504.1981.tb01255.x
  • Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1-29. doi: 10.18637/jss.v048.i06
  • Chiu, C. Y. (2008). Cluster analysis for cognitive diagnosis: Theory and applications (Doctoral dissertation). Retrieved from https://www.ideals.illinois.edu/handle/2142/80055
  • Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in organizational research. Organizational Research Methods, 6(2), 147-168. doi: 10.1177/1094428103251541
  • Culpepper, S. A. (2016). Revisiting the 4-parameter item response model: Bayesian estimation and application. Psychometrika, 81(4), 1142-1163. doi: 10.1007/s11336-015-9477-6
  • de Ayala, R. J. (2009). The theory and practice of item response theory. New York, NY: The Guilford Press.
  • DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8-26. doi: 10.1177/0146621610377081
  • de la Torre, J. (2008). An empirically based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45(4), 343-362. doi: 10.1111/j.1745- 3984.2008.00069.x
  • de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational And Behavioral Statistics, 34(1), 115-130. doi: 10.3102/1076998607309474
  • de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199. doi: 10.1007/s11336-011-9207-7
  • de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrica, 69(3), 333-353. doi: 10.1007/BF02295640
  • de la Torre, J., & Douglas, J. A. (2008). Model evaluation and multiple strategies in cognitive diagnosis: An analysis of fraction subtraction data. Psychometrika, 73(4), 595-624. doi: 10.1007/s11336-008-9063-2
  • de la Torre, J., Hong, Y., & Deng, W. (2010). Factors affecting the item parameter estimation and classification accuracy of the DINA model. Journal of Educational Measurement, 47(2), 227-249. doi: 10.1111/j.1745-3984.2010.00110.x
  • de la Torre, J., & Lee, Y. S. (2010). A note on the invariance of the dina model parameters. Journal of Educational Measurement, 47(1), 115-127. doi: 10.1111/j.1745-3984.2009.00102.x
  • de la Torre, J., & Lee, Y. S. (2013). Evaluating the Wald test for item‐ level comparison of saturated and reduced models in cognitive diagnosis. Journal of Educational Measurement, 50(4), 355-373. doi: 10.1111/jedm.12022
  • DeMars, C. E. (2007). “Guessing” parameter estimates for multidimensional item response theory models. Educational and Psychological Measurement, 67(3), 433-446. doi: 10.1177/0013164406294778
  • Doornik, J. A. (2018). An object-oriented matrix programming language Ox (Version 8.0) [Computer software]. London: Timberlake Consultants Press.
  • Finch, H. (2010). Item parameter estimation for the MIRT model: Bias and precision of confirmatory factor analysis-based models. Applied Psychological Measurement, 34(1), 10-26. doi: 10.1177/0146621609336112
  • Finch, H., Habing, B. T., & Huynh, H. (2003, April). Comparison of NOHARM and conditional covariance methods of dimensionality assessment. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
  • Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26(4), 301-321. doi: 10.1111/j.1745-3984.1989.tb00336.x
  • Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer.
  • Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnosis. Applied Psychological Measurement, 29(4), 262-277. doi: 10.1177/0146621604272623
  • Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological Measurement, 66(3), 393- 416. doi: 10.1177/0013164405282485
  • Hoijtink, H., & Molenaar, I. W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62(2), 171-189. doi: 10.1007/BF02295273
  • Huebner, A., & Wang, C. (2011). A note on comparing examinee classification methods for cognitive diagnosis models. Educational and Psychological Measurement, 71(2), 407-419. doi: 10.1177/0013164410388832
  • Hulin, C. L., Lissak, R. I., & Drasgow, F. (1982). Recovery of two- and three-parameter logistic item characteristic curves: A monte carlo study. Applied Psychological Measurement, 6(3), 249-260. doi: 10.1177/014662168200600301
  • Jackson, D. L., Gillaspy, J. A., & Purc-Stephenson, R. (2009). Reporting practices in confirmatory factor analysis: an overview and some recommendations. Psychological Methods, 14(1), 6-23. doi: 10.1037/a0014694
  • Junker, B. W. (2001). On the interplay between nonparametric and parametric IRT, with some thoughts about the future. In A. Boomsma, M. A. J. Van Duijn, &T. A. B. Snijders (Eds.), Essays on item response theory (pp. 274-276). New York, NY: Springer-Verlag.
  • Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258-272. doi: 10.1177/01466210122032064
  • Liao, W. W., Ho, R. G., Yen, Y. C., & Cheng, H. C. (2012). The four-parameter logistic item response theory model as a robust method of estimating ability despite aberrant responses. Social Behavior and Personality: An International Journal, 40(10), 1679-1694. doi: 10.2224/sbp.2012.40.10.1679
  • Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63(3), 509-525. doi: 10.1348/000711009X474502
  • Lorenzo-Seva, U., & Ferrando, P. J. (2006). FACTOR: A computer program to fit the exploratory factor analysis model. Behavior Research Methods, Instruments, & Computers, 38(1), 88-91. doi: 10.3758/BF03192753
  • Lord, F. M. (2012). Applications of item response theory to practical testing problems. New Jersey, NJ: Lawrence Erlbaum Associates.
  • Ma, W., & de la Torre, J. (2020). GDINA: The generalized DINA model framework: R package (Version 2.7.9). Retrieved from https://CRAN.R-project.org/package=GDINA
  • Magis, D. (2013). A note on the item information function of the four-parameter logistic model. Applied Psychological Measurement, 37(4), 304-315. doi: 10.1177/0146621613475471
  • Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64, 187-212. doi: 10.1007/BF02294535
  • Meng, X., Xu, G., Zhang, J., & Tao, J. (2019). Marginalized maximum a posteriori estimation for the fourparameter logistic model under a mixture modelling framework. British Journal of Mathematical and Statistical Psychology, Advanced online publication. doi: 10.1111/bmsp.12185
  • Muthén, L. K., & Muthén, B. O. (1998-2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
  • R Core Team. (2017). R: A language and environment for statistical computing [Computer Software]. Vienna, Austria: R Foundation for Statistical Computing.
  • Robitzsch, A., Kiefer, T., George, A. C., & Uenlue, A. (2019). Package ‘CDM’. Retrieved from https://cran.rproject. org/web/packages/CDM/CDM.pdf
  • Rowley, G. L., & Traub, R. E. (1977). Formula scoring, number-right scoring, and test-taking strategy. Journal of Educational Measurement, 14(1), 15-22. doi: 10.1111/j.1745-3984.1977.tb00024.x
  • Rulison, K. L., & Loken, E. (2009). I’ve fallen and i can’t get up: Can high-ability students recover from early mistakes in CAT? Applied Psychological Measurement, 33(2), 83-101. doi: 10.1177/0146621608324023
  • Svetina, D., Valdivia, A., Underhill, S., Dai, S., & Wang, X. (2017). Parameter recovery in multidimensional item response theory models under complexity and nonnormality. Applied Psychological Measurement, 41(7), 530-544. doi: 10.1177/0146621617707507
  • Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of educational measurement, 20(4), 345-354. doi: 10.1111/j.1745-3984.1983.tb00212.x
  • Vermunt, J. K., & Magidson, J. (2016). Upgrade manual for latent GOLD 5.1. Belmont, MA: Statistical Innovations Inc.
  • Waller, N. G., & Feuerstahler, L. (2017). Bayesian modal estimation of the four-parameter item response model in real, realistic, and idealized data sets. Multivariate behavioral research, 52(3), 350-370. doi: 10.1080/00273171.2017.1292893
  • Yakar, L. (2017). Bilişsel tanı ve çok boyutlu madde tepki kuramı modellerinin karşılıklı uyumlarının incelenmesi (Doctoral thesis). Retrieved from https://tez.yok.gov.tr/UlusalTezMerkezi/
  • Yen, Y. C., Ho, R. G., Laio, W. W., Chen, L. J., & Kuo, C. C. (2012). An empirical evaluation of the slip correction in the four parameter logistic models with computerized adaptive testing. Applied Psychological Measurement, 36(2), 75-87. doi: 10.1177/0146621611432862
APA: Kalkan, Ö. K., & Cuhadar, I. (2020). An Evaluation of 4PL IRT and DINA Models for Estimating Pseudo-Guessing and Slipping Parameters. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 11(2), 131-146. https://doi.org/10.21031/epod.660273