Somers' D as an Alternative for the Item–Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings
- Pub. date: February 15, 2020
- Pages: 207‒221
The Pearson product–moment correlation coefficient between item g and test score X, known as the item–test or item–total correlation (Rit), and the item–rest correlation (Rir) are two of the most widely used classical estimators of item discrimination power (IDP). Both Rit and Rir underestimate IDP because of the mismatch between the scales of the item and the score. The underestimation may be drastic when the difficulty level of the item is extreme. Based on a simulation, Somers’ D could be a good alternative to Rit and Rir in a binary dataset: it reaches the extreme values +1 and –1, it underestimates IDP remarkably less than Rit and Rir do, and, being a robust statistic, it is more stable against changes in the data structure. Somers’ D has, however, one major disadvantage in the polytomous case: it tends to underestimate the magnitude of the association between item and score more than Rit does when the item scale has four or more categories.
Keywords: Item analysis, Pearson correlation, Somers' D, item–total correlation, item–rest correlation, item discrimination power.
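The contrast described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the ten-person dataset is invented, the helper names `pearson` and `somers_d` are hypothetical, and Somers' D is computed in the direction where the item is the independent variable, that is, (concordant − discordant) divided by the number of pairs that are not tied on the item. The article's own notation and direction may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def somers_d(item, score):
    """Somers' D with the item as the independent variable (assumed direction).

    Only pairs of test takers with different item scores are counted in the
    denominator; pairs tied on the score stay in the denominator but add
    nothing to the numerator.
    """
    concordant = discordant = untied_on_item = 0
    n = len(item)
    for i in range(n):
        for j in range(i + 1, n):
            if item[i] == item[j]:
                continue  # tied on the item: pair is ignored
            untied_on_item += 1
            direction = (item[i] - item[j]) * (score[i] - score[j])
            if direction > 0:
                concordant += 1
            elif direction < 0:
                discordant += 1
    return (concordant - discordant) / untied_on_item


# Invented data: a difficult binary item answered correctly only by the two
# highest-scoring test takers, so the item orders the test takers perfectly.
g = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]        # item scores
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]       # total scores (item included)
rest = [x - gi for x, gi in zip(X, g)]    # rest scores (item removed)

r_it = pearson(g, X)       # item-total correlation, Rit
r_ir = pearson(g, rest)    # item-rest correlation, Rir
d_total = somers_d(g, X)
d_rest = somers_d(g, rest)

print(f"Rit = {r_it:.3f}, Rir = {r_ir:.3f}")
print(f"Somers' D (total) = {d_total:.3f}, Somers' D (rest) = {d_rest:.3f}")
```

In this invented example the item separates the test takers without a single discordant pair, so Somers' D against the total score is exactly 1, while Rit and Rir stay well below 1 because a two-category item cannot be linearly related to a ten-point score. A single toy case like this only illustrates the direction of the effect; it does not reproduce the article's simulation.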