Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions

Year: 2019 Volume: 10 Issue: 4 Pages: 377 - 393 Language: English DOI: 10.21031/epod.525647 Index Date: 27-03-2020

Abstract:
The validity of individual test scores is an important issue in psychological and educational assessment. One important factor affecting this validity is aberrant item response behavior. Aberrant item scores may inflate or deflate individuals’ scores, so that their ability is estimated above or below its true level. Person-fit statistics (PFS) are useful tools for detecting aberrant behavior, and the literature contains a large number of parametric and nonparametric PFS. The general purpose of this study is to examine the effectiveness of parametric and nonparametric PFS in data sets consisting of polytomous items. It is a basic research study that evaluates the effectiveness of PFS using simulated data sets. According to the results, as expected, detection rates (power) increased as the Type I error rate (significance level, alpha) increased. In general, detection rates also increased as the number of misfitting item score vectors and the number of items increased. Nonparametric PFS (N-PFS), especially GP, generally detected more aberrant individuals than the parametric PFS (P-PFS) lzp. However, in some test conditions with longer tests, lzp detected more aberrant individuals than the N-PFS. Overall, the results indicate that N-PFS outperformed P-PFS in most test conditions.
Keywords:

Subjects: History
Document Type: Article Article Type: Research Article Access Type: Open Access
APA ŞENGÜL AVŞAR A (2019). Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 10(4), 377 - 393. 10.21031/epod.525647
Chicago ŞENGÜL AVŞAR ASİYE Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi 10, no.4 (2019): 377 - 393. 10.21031/epod.525647
MLA ŞENGÜL AVŞAR ASİYE Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, vol.10, no.4, 2019, pp.377 - 393. 10.21031/epod.525647
AMA ŞENGÜL AVŞAR A Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi. 2019; 10(4): 377 - 393. 10.21031/epod.525647
Vancouver ŞENGÜL AVŞAR A Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi. 2019; 10(4): 377 - 393. 10.21031/epod.525647
IEEE ŞENGÜL AVŞAR A "Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions." Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 10, pp.377 - 393, 2019. 10.21031/epod.525647
ISNAD ŞENGÜL AVŞAR, ASİYE. "Comparison of Person-Fit Statistics for Polytomous Items in Different Test Conditions". Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi 10/4 (2019), 377-393. https://doi.org/10.21031/epod.525647