Year: 2016 Volume: 16 Issue: 3 Pages: 715-734 Text Language: English Indexing Date: 29-07-2022

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Abstract:
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving the equity property in mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under the common-item nonequivalent groups design. Equity was evaluated in terms of the first-order equity (FOE) and second-order equity (SOE) properties. A simulation study was conducted based on actual item parameter estimates obtained from the TIMSS 2011 8th-grade mathematics assessment. The results showed that: (i) the FOE and SOE properties were best preserved under the unidimensional condition and poorly preserved when the degree of multidimensionality was severe; (ii) TSE and OSE results obtained with a mixed-format common-item set preserved FOE better than results obtained with a multiple-choice-only common-item set; and (iii) under both unidimensional and multidimensional test structures, characteristic curve methods performed significantly better than moment scale linking methods in preserving the FOE and SOE properties.
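The moment (mean/mean, mean/sigma) and characteristic curve (Haebara; Stocking-Lord) linking methods compared in the study place item parameters from two test forms on a common scale via a linear transformation θ_base = A·θ_new + B. As a rough illustration of the Stocking-Lord idea — not the authors' implementation, and using hypothetical 2PL common-item parameters — the sketch below recovers a known linking transformation by minimizing the squared difference between the common-item test characteristic curves over a grid of (A, B) values:

```python
import math

def p2pl(theta, a, b):
    """2PL item response probability with scaling constant D = 1.7."""
    return 1.0 / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Hypothetical common-item (a, b) parameters on the base-form scale.
base = [(1.0, -1.0), (0.8, 0.0), (1.4, 0.5), (1.1, 1.2)]

# The same items on the new-form scale, assuming the true linking is
# theta_base = A*theta_new + B with A = 1.2, B = 0.5, which implies
# a_new = A * a_base and b_new = (b_base - B) / A.
A_true, B_true = 1.2, 0.5
new = [(A_true * a, (b - B_true) / A_true) for a, b in base]

quad = [-3 + 0.25 * k for k in range(25)]  # evaluation points on theta

def stocking_lord(A, B):
    """Squared distance between the base-scale test characteristic curve
    and the new-form curve after rescaling its parameters by (A, B)."""
    total = 0.0
    for theta in quad:
        tcc_base = sum(p2pl(theta, a, b) for a, b in base)
        tcc_new = sum(p2pl(theta, a / A, A * b + B) for a, b in new)
        total += (tcc_base - tcc_new) ** 2
    return total

# Crude grid search standing in for the gradient-based optimizer
# normally used for characteristic curve linking.
best = min(((stocking_lord(A, B), A, B)
            for A in [0.8 + 0.01 * i for i in range(81)]
            for B in [0.0 + 0.01 * j for j in range(101)]),
           key=lambda t: t[0])
print(f"A = {best[1]:.2f}, B = {best[2]:.2f}")  # should recover A ≈ 1.2, B ≈ 0.5
```

Moment methods instead match only the mean (and standard deviation) of the common-item parameter estimates, which is why they tend to be less robust when the curves differ in shape, consistent with the abstract's finding that characteristic curve methods preserved equity better.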
Keywords:

Subjects: Education, Educational Research
Document Type: Article | Article Type: Research Article | Access Type: Bibliographic
  • Andrews, B. J. (2011). Assessing first- and second-order equity for the common item nonequivalent groups design using multidimensional IRT (Doctoral dissertation). Available from ProQuest Dissertation and Theses database. (UMI No. 3473138)
  • Baker, F. B., & Al-Karni, A. (1991). A comparison of two procedures for computing IRT equating coefficients. Journal of Educational Measurement, 28(2), 147-162.
  • Balch, J. (1964). The influence of the evaluating instrument on students' learning. American Educational Research Journal, 1(3), 169-182.
  • Bastari, B. (2000). Linking multiple-choice and constructed-response items to a common proficiency scale (Doctoral dissertation). Available from ProQuest Dissertation and Theses database. (UMI No. 9960735)
  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397- 479). Reading, MA: Addison-Wesley.
  • Bolt, D. M. (1999). Evaluating the effects of multidimensionality on IRT true-score equating. Applied Measurement in Education, 12(4), 383-407. http://dx.doi.org/10.1207/S15324818AME1204_4
  • Braun, H. I., & Holland, P. W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 9-49). New York, NY: Academic.
  • Brennan, R. L. (2010). First-order and second-order equity in equating (Report No. 30). Iowa City: Center for Advanced Studies in Measurement and Assessment.
  • Cao, Y. (2008). Mixed-format test equating: Effects of test dimensionality and common item sets (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 3341415)
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Dorans, N. J., Moses, T. P., & Eignor, D. R. (2011). Equating test scores: Toward best practices. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 21-42). New York, NY: Springer.
  • Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22(3), 144-149. Retrieved from https://www.jstage.jst.go.jp/article/psycholres1954/22/3/22_3_144/_pdf
  • Hagge, S. L. (2010). The impact of equating method and format representation of common items on the adequacy of mixed-format test equating using nonequivalent groups (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 3422144)
  • Marco, G. L. (1977). Item characteristic curve solutions to three intractable testing problems. Journal of Educational Measurement, 14(2), 139-160. http://dx.doi.org/10.1111/j.1745-3984.1977.tb00033.x
  • Martin, M. O., & Mullis, I. V. S. (Eds.). (2012). Methods and procedures in TIMSS and PIRLS 2011. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
  • Martinez, M. E. (1999). Cognition and the question of test item format. Educational Psychologist, 34(4), 207-218. http://dx.doi.org/10.1207/s15326985ep3404_2
  • Messick, S. (1993). Trait equivalence as construct validity of score interpretation across multiple methods of measurement. In R. E. Bennett & W. C. Ward (Eds.), Construction versus choice in cognitive measurement: Issues in constructed response, performance testing, and portfolio assessment (pp. 61-73). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Morris, C. N. (1982). On the foundations of test equating. In P. W. Holland & D. B. Rubin (Eds.) Test equating (pp. 169-191). New York, NY: Academic.
  • Mullis, I. V. S., & Martin, M. O. (2011). TIMSS 2011 item writing guidelines. Boston College.
  • Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159-176. http://dx.doi.org/10.1177/014662169201600206
  • Muraki, E., & Bock, R. D. (2003). PARSCALE 4.1 [Computer software]. Chicago, IL: Scientific Software International, Inc.
  • Ogasawara, H. (2001). Standard errors of item response theory equating/linking by response function methods. Applied Psychological Measurement, 25(1), 53-67. http://dx.doi.org/10.1177/01466216010251004
  • Ogasawara, H. (2002). Stable response functions with unstable item parameter estimates. Applied Psychological Measurement, 26(3), 239-254. http://dx.doi.org/10.1177/0146621602026003001
  • Petersen, N. S. (2007). Equating: Best practices and challenges to best practices. In N. J. Dorans, M. Pommerich, & P. W. Holland (Eds.), Linking and aligning scores and scales (pp. 31-55). New York, NY: Springer.
  • Sinharay, S., & Holland, P. W. (2007). Is it necessary to make anchor tests mini-versions of the tests being equated or can some restrictions be relaxed? Journal of Educational Measurement, 44(3), 249-275. http://dx.doi.org/10.1111/j.1745-3984.2007.00037.x
  • Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-210. http://dx.doi.org/10.1177/014662168300700208
  • Tong, Y., & Kolen, M. J. (2005). Assessing equating results on different equating criteria. Applied Psychological Measurement, 29(6), 418-432. http://dx.doi.org/10.1177/0146621606280071
  • Traub, R. E. (1993). On the equivalence of traits assessed by multiple-choice and constructed response tests. In R. E. Bennett & W. C. Ward (Eds.), Construction versus choice in cognitive measurement (pp. 29-44). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Wang, W. (2013). Mixed-format test score equating: Effect of item-type multidimensionality, length and composition of common-item set, and group ability difference (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 3608502)
  • Wolf, R. (2013). Assessing the impact of characteristics of the test, common items, and examinees on the preservation of equity properties in mixed-format test equating (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (UMI No. 3585536)
  • Yao, L. (2003). SimuMIRT [Computer software]. Monterey, CA: Defense Manpower Data Center.
  • Yao, L., & Schwarz, R. D. (2006). A multidimensional partial credit model with associated item and test statistics: An application to mixed-format tests. Applied Psychological Measurement, 30(6), 469-492. http://dx.doi.org/10.1177/0146621605284537
APA OZTÜRK GÜBES N, KELECİOĞLU H (2016). The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating. Kuram ve Uygulamada Eğitim Bilimleri, 16(3), 715-734.
Chicago OZTÜRK GÜBES NESE, KELECİOĞLU Hülya. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating. Kuram ve Uygulamada Eğitim Bilimleri 16, no. 3 (2016): 715-734.
MLA OZTÜRK GÜBES NESE, KELECİOĞLU Hülya. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating. Kuram ve Uygulamada Eğitim Bilimleri, vol. 16, no. 3, 2016, pp. 715-734.
AMA OZTÜRK GÜBES N, KELECİOĞLU H. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating. Kuram ve Uygulamada Eğitim Bilimleri. 2016; 16(3): 715-734.
Vancouver OZTÜRK GÜBES N, KELECİOĞLU H. The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating. Kuram ve Uygulamada Eğitim Bilimleri. 2016; 16(3): 715-734.
IEEE OZTÜRK GÜBES N, KELECİOĞLU H. "The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating." Kuram ve Uygulamada Eğitim Bilimleri, vol. 16, no. 3, pp. 715-734, 2016.
ISNAD OZTÜRK GÜBES, NESE - KELECİOĞLU, Hülya. "The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating". Kuram ve Uygulamada Eğitim Bilimleri 16/3 (2016), 715-734.