Year: 2020 Volume: 20 Issue: 87 Pages: 101-118 Language: English DOI: 10.14689/ejer.2020.87.5 Index Date: 26-11-2020

Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions

Abstract:
Purpose: The present study aims to evaluate how the reliabilities computed using α, Stratified α, Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and the ratio of dichotomous to polytomous items included in the scale (2:1, 1:1, 1:2) are varied.

Research Methods: In this study, Cronbach's α, Stratified α, Angoff-Feldt, and Feldt-Raju reliability coefficients were estimated on simulated datasets across sample sizes (500, 1000, 2000) and dichotomous-to-polytomous item ratios (2:1, 1:1, 1:2).

Findings: Under the simulation conditions of this research, in all sample-size conditions, the estimated Angoff-Feldt and Feldt-Raju reliability coefficients were higher when dichotomous items outnumbered polytomous items in the item-type ratio. The same was true of the estimated α and Stratified α reliability coefficients when the item-type ratio was reversed. While all of the reliability estimators gave similar results in the large samples (n ≥ 1000), reliability estimates differed somewhat depending on the item-type ratio in the small samples (n = 500).

Implications for Research and Practice: In light of the findings and conclusions obtained in this study, it may be advisable to use α and Stratified α for mixed-format scales when the number of polytomously scored items in the scale is higher than that of the dichotomously scored items. Conversely, the Angoff-Feldt and Feldt-Raju coefficients are recommended when the number of dichotomously scored items is higher.
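The abstract compares four internal-consistency estimators. As a minimal sketch of two of them, the snippet below computes Cronbach's α and Stratified α (using the standard textbook formulas, not the authors' own code) on simulated mixed-format data resembling one of the study's conditions (n = 500, a 2:1 dichotomous-to-polytomous item ratio). The data-generating model here is an illustrative assumption, not the one used in the article.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def stratified_alpha(strata, total):
    """Stratified alpha (Cronbach, Schonemann & McKie, 1965):
    1 - sum over strata of [stratum-score variance * (1 - stratum alpha)] / total variance.
    `strata` is a list of item-response matrices, one per item format."""
    total_var = np.asarray(total, dtype=float).var(ddof=1)
    penalty = sum(s.sum(axis=1).var(ddof=1) * (1 - cronbach_alpha(s)) for s in strata)
    return 1 - penalty / total_var

rng = np.random.default_rng(42)
n = 500                               # one of the study's sample-size conditions
theta = rng.normal(size=n)            # common latent trait (illustrative assumption)
# 10 dichotomous (0/1) and 5 polytomous (0-4) items: the 2:1 ratio condition
dich = (theta[:, None] + rng.normal(size=(n, 10)) > 0).astype(float)
poly = np.clip(np.round(2 + theta[:, None] + rng.normal(size=(n, 5))), 0, 4)

total = np.hstack([dich, poly]).sum(axis=1)
print(round(cronbach_alpha(np.hstack([dich, poly])), 3))
print(round(stratified_alpha([dich, poly], total), 3))
```

Stratified α treats each item format as its own stratum, which is why the abstract recommends it (alongside α) when polytomous items dominate: it does not force the two formats onto a single homogeneous scale the way the raw α formula does.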
Keywords:

Document Type: Article Article Type: Research Article Access Type: Open Access
APA GÜRDİL, H., & Demir, E. (2020). Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions. Eurasian Journal of Educational Research, 20(87), 101-118. https://doi.org/10.14689/ejer.2020.87.5
Chicago GÜRDİL, Hatice, and Ergul Demir. "Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions." Eurasian Journal of Educational Research 20, no. 87 (2020): 101-118. https://doi.org/10.14689/ejer.2020.87.5
MLA GÜRDİL, Hatice, and Ergul Demir. "Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions." Eurasian Journal of Educational Research, vol. 20, no. 87, 2020, pp. 101-118. https://doi.org/10.14689/ejer.2020.87.5
AMA GÜRDİL H, Demir E. Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions. Eurasian Journal of Educational Research. 2020;20(87):101-118. doi:10.14689/ejer.2020.87.5
Vancouver GÜRDİL H, Demir E. Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions. Eurasian Journal of Educational Research. 2020;20(87):101-118. doi:10.14689/ejer.2020.87.5
IEEE H. GÜRDİL and E. Demir, "Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions," Eurasian Journal of Educational Research, vol. 20, no. 87, pp. 101-118, 2020. doi:10.14689/ejer.2020.87.5
ISNAD GÜRDİL, Hatice - Demir, Ergul. "Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions". Eurasian Journal of Educational Research 20/87 (2020), 101-118. https://doi.org/10.14689/ejer.2020.87.5