International Journal of Educational Methodology

IJEM is a leading, peer-reviewed, open access, research journal that provides an online forum for studies in education, by and for scholars and practitioners, worldwide.

Publisher (HQ)

RHAPSODE LTD
Eurasian Society of Educational Research
College House, 2nd Floor 17 King Edwards Road, Ruislip, London, UK. HA4 7AE

Somers' D as an Alternative for the Item–Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings

Jari Metsämuuronen

The Pearson product–moment correlation coefficient between item g and test score X, known as the item–test or item–total correlation (Rit), and the item–rest correlation (Rir) are two of the most widely used classical estimators of item discrimination power (IDP). Both Rit and Rir underestimate IDP because of the mismatch between the scales of the item and the score. The underestimation may be drastic when the difficulty level of the item is extreme. Based on a simulation, Somers’ D could be a good alternative to Rit and Rir in a binary dataset: it reaches the extreme values +1 and –1, it underestimates IDP remarkably less than Rit and Rir do, and, being a robust statistic, it is more stable against changes in the data structure. Somers’ D has, however, one major disadvantage in the polytomous case: it tends to underestimate the magnitude of the association between item and score more than Rit does when the item scale has four or more categories.

Keywords: Item analysis, Pearson correlation, Somers' D, item–total correlation, item–rest correlation, item discrimination power.
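
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the article's simulation: the five-examinee data are invented, the function names `pearson` and `somers_d` are hypothetical, and the direction of Somers' D (score treated as the dependent variable, so that only pairs of examinees who differ on the item are counted) is an assumption, since the abstract does not state which direction is used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def somers_d(item, score):
    """Somers' D with the score as the dependent variable (assumed direction).

    D = (concordant - discordant) / (number of pairs not tied on the item).
    Pairs tied on the score but not on the item stay in the denominator.
    """
    concordant = discordant = untied_on_item = 0
    for (g1, x1), (g2, x2) in combinations(zip(item, score), 2):
        if g1 == g2:
            continue  # pairs tied on the item are excluded
        untied_on_item += 1
        direction = (g1 - g2) * (x1 - x2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / untied_on_item


# Invented toy data: a difficult binary item solved only by the top scorer.
item = [0, 0, 0, 0, 1]
total = [1, 2, 3, 4, 6]                      # total score, item included
rest = [x - g for x, g in zip(total, item)]  # score without the item

rit = pearson(item, total)   # item-total correlation
rir = pearson(item, rest)    # item-rest correlation
d = somers_d(item, total)    # Somers' D, score dependent

print(f"Rit = {rit:.3f}, Rir = {rir:.3f}, Somers' D = {d:.3f}")
```

In this toy data the item separates the highest scorer from everyone else perfectly, so Somers' D equals 1, while Rit and Rir both stay below 1 and Rir is the lower of the two.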
