Year: 2015 Volume: 15 Issue: 4 Page Range: 969 - 980 Text Language: English Index Date: 29-07-2022

The Effects of Testlets on Reliability and Differential Item Functioning

Abstract:
This study conducted reliability and differential item functioning (DIF) analyses on testlets displaying local item dependence. The data set consisted of the responses of 1500 students to 20 items grouped into six testlets in the English Proficiency Exam administered by the School of Foreign Languages of a state university in Turkey. One purpose of the study was to determine how tests composed of testlets affect reliability, so reliability coefficients estimated with the testlet effects taken into account were compared with those estimated without them. The generalizability (G) theory analyses showed that the G and Phi coefficients estimated without considering the testlet effects were higher than those estimated by considering them; that is, reliability was estimated to be relatively higher when the testlet effects were ignored. To determine the effects of testlets on DIF, two methods were used and their results were compared. In the DIF detection method that considers the testlet effect, both the number of items displaying significant DIF and the estimated DIF levels were higher than in the method that does not consider the testlet effect.
Keywords:
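The reliability comparison described in the abstract can be illustrated with a small sketch. It is not the authors' analysis: it assumes a balanced design with random testlets and items (the study's 20 items in six testlets cannot be balanced), uses simulated dichotomous responses with a built-in person-by-testlet effect, and estimates variance components from ANOVA mean squares, setting negative estimates to zero. The function names `crossed_p_by_i` and `nested_p_by_i_in_h` and all simulation settings are hypothetical.

```python
import random
from math import fsum

def avg(values):
    values = list(values)
    return fsum(values) / len(values)

def crossed_p_by_i(x):
    """p x i design: x[p][j] is person p's score on item j (testlets ignored)."""
    n_p, n_j = len(x), len(x[0])
    grand = avg(v for row in x for v in row)
    m_p = [avg(row) for row in x]
    m_j = [avg(row[j] for row in x) for j in range(n_j)]
    ms_p = n_j * fsum((m - grand) ** 2 for m in m_p) / (n_p - 1)
    ms_j = n_p * fsum((m - grand) ** 2 for m in m_j) / (n_j - 1)
    ms_res = fsum((x[p][j] - m_p[p] - m_j[j] + grand) ** 2
                  for p in range(n_p) for j in range(n_j)) / ((n_p - 1) * (n_j - 1))
    v_p = max(0.0, (ms_p - ms_res) / n_j)   # person variance
    v_j = max(0.0, (ms_j - ms_res) / n_p)   # item variance
    rel = ms_res / n_j                      # relative error variance
    return {"G": v_p / (v_p + rel), "Phi": v_p / (v_p + rel + v_j / n_j)}

def nested_p_by_i_in_h(x):
    """p x (i:h) design: x[p][h][i], items i nested in testlets h, balanced."""
    n_p, n_h, n_i = len(x), len(x[0]), len(x[0][0])
    P, H, I = range(n_p), range(n_h), range(n_i)
    grand = avg(x[p][h][i] for p in P for h in H for i in I)
    m_p = [avg(x[p][h][i] for h in H for i in I) for p in P]
    m_h = [avg(x[p][h][i] for p in P for i in I) for h in H]
    m_ph = [[avg(x[p][h]) for h in H] for p in P]
    m_ih = [[avg(x[p][h][i] for p in P) for i in I] for h in H]
    ms_p = n_h * n_i * fsum((m - grand) ** 2 for m in m_p) / (n_p - 1)
    ms_h = n_p * n_i * fsum((m - grand) ** 2 for m in m_h) / (n_h - 1)
    ms_ih = n_p * fsum((m_ih[h][i] - m_h[h]) ** 2
                       for h in H for i in I) / (n_h * (n_i - 1))
    ms_ph = n_i * fsum((m_ph[p][h] - m_p[p] - m_h[h] + grand) ** 2
                       for p in P for h in H) / ((n_p - 1) * (n_h - 1))
    ms_res = fsum((x[p][h][i] - m_ph[p][h] - m_ih[h][i] + m_h[h]) ** 2
                  for p in P for h in H for i in I) / (n_h * (n_p - 1) * (n_i - 1))
    v_p = max(0.0, (ms_p - ms_ph) / (n_h * n_i))                    # persons
    v_h = max(0.0, (ms_h - ms_ph - ms_ih + ms_res) / (n_p * n_i))   # testlets
    v_ih = max(0.0, (ms_ih - ms_res) / n_p)                         # items within testlets
    v_ph = max(0.0, (ms_ph - ms_res) / n_i)                         # person-by-testlet
    rel = v_ph / n_h + ms_res / (n_h * n_i)
    absolute = rel + v_h / n_h + v_ih / (n_h * n_i)
    return {"G": v_p / (v_p + rel), "Phi": v_p / (v_p + absolute)}

# Simulated 0/1 responses (assumed values): ability theta, a person-by-testlet
# effect gamma that induces local dependence, and item difficulty b.
random.seed(1)
n_p, n_h, n_i = 300, 4, 5
b = [[random.gauss(0, 0.5) for _ in range(n_i)] for _ in range(n_h)]
data = []
for _ in range(n_p):
    theta = random.gauss(0, 1)
    person = []
    for h in range(n_h):
        gamma = random.gauss(0, 1)
        person.append([1 if theta + gamma - b[h][i] + random.gauss(0, 1) > 0 else 0
                       for i in range(n_i)])
    data.append(person)

nested = nested_p_by_i_in_h(data)
crossed = crossed_p_by_i([[v for h in person for v in h] for person in data])
print("testlets considered:", nested)
print("testlets ignored:   ", crossed)
```

In this sketch the person-by-testlet variance enters the error term of the nested design with divisor equal to the number of testlets, whereas the crossed design pools it into a residual divided by the total number of items. That is the mechanism by which ignoring testlets can yield higher G and Phi estimates for the same responses.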

Subjects: Education, Educational Research
Document Type: Article Article Type: Research Article Access Type: Bibliographic
APA Tasdelen Teker G, DOĞAN N (2015). The Effects of Testlets on Reliability and Differential Item Functioning. Kuram ve Uygulamada Eğitim Bilimleri, 15(4), 969 - 980.
Chicago Tasdelen Teker Gulsen, DOĞAN Nuri The Effects of Testlets on Reliability and Differential Item Functioning. Kuram ve Uygulamada Eğitim Bilimleri 15, no. 4 (2015): 969 - 980.
MLA Tasdelen Teker Gulsen, DOĞAN Nuri The Effects of Testlets on Reliability and Differential Item Functioning. Kuram ve Uygulamada Eğitim Bilimleri, vol. 15, no. 4, 2015, pp. 969 - 980.
AMA Tasdelen Teker G, DOĞAN N The Effects of Testlets on Reliability and Differential Item Functioning. Kuram ve Uygulamada Eğitim Bilimleri. 2015; 15(4): 969 - 980.
Vancouver Tasdelen Teker G, DOĞAN N The Effects of Testlets on Reliability and Differential Item Functioning. Kuram ve Uygulamada Eğitim Bilimleri. 2015; 15(4): 969 - 980.
IEEE Tasdelen Teker G, DOĞAN N "The Effects of Testlets on Reliability and Differential Item Functioning." Kuram ve Uygulamada Eğitim Bilimleri, 15, pp. 969 - 980, 2015.
ISNAD Tasdelen Teker, Gulsen - DOĞAN, Nuri. "The Effects of Testlets on Reliability and Differential Item Functioning". Kuram ve Uygulamada Eğitim Bilimleri 15/4 (2015), 969-980.