'synthetic data' Search Results
Synthetic Longitudinal Education Database: Linking National Datasets for K-16 Education and College Readiness
college readiness; longitudinal database; machine learning; multiple imputation; synthetic data...
Missing from the U.S. education policy of “college for all” are supporting data and indicators on K-16 education pathways, i.e., how well all students get ready and stay on track from kindergarten through college. This study creates a synthetic national longitudinal education database that helps track and support students’ educational pathways by combining two nationally representative U.S. sample datasets: the Early Childhood Longitudinal Study-Kindergarten (ECLS-K; kindergarten through 8th grade) and the National Education Longitudinal Study (NELS; 8th grade through age 25). Merging these national datasets, linked via statistical matching and imputation techniques, can help bridge the gap between the elementary and the secondary/postsecondary education data and research silos. Using this synthetic K-16 longitudinal education database, the study applies machine learning data analytics to search for early indicators of college readiness among kindergarten students. It shows the utility and limitations of linking preexisting national datasets to impute education pathways and assess college readiness. It also discusses implications for developing a more holistic and equitable educational assessment system in support of a K-16 longitudinal education database.
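The abstract does not spell out the matching procedure, so the following is only a minimal sketch of one common form of statistical matching: nearest-neighbor donor matching on variables that both datasets share at the 8th-grade overlap. The field names (`g8_read`, `g8_math`, `k_read`, `enrolled_college`), the toy records, and the helper functions `match_donors` and `link` are all hypothetical and are not taken from ECLS-K, NELS, or the study itself.

```python
import math

def match_donors(recipients, donors, keys):
    """For each recipient record, return the donor record closest on the shared keys."""
    matches = []
    for r in recipients:
        r_point = [r[k] for k in keys]
        # Euclidean distance over the shared (assumed standardized) variables.
        best = min(donors, key=lambda d: math.dist(r_point, [d[k] for k in keys]))
        matches.append(best)
    return matches

def link(recipients, donors, keys, donate):
    """Copy the donated fields from each matched donor onto a copy of the recipient."""
    linked = []
    for r, d in zip(recipients, match_donors(recipients, donors, keys)):
        row = dict(r)  # copy, so the input records are left unchanged
        for field in donate:
            row[field] = d[field]
        linked.append(row)
    return linked

# Toy, made-up records: an "early" file ending at grade 8 and a "later" file
# starting at grade 8. Scores are assumed to be on a common standardized scale.
early = [
    {"id": "E1", "k_read": 35.0, "g8_read": 0.9, "g8_math": 1.1},
    {"id": "E2", "k_read": 22.0, "g8_read": -0.8, "g8_math": -1.0},
]
later = [
    {"id": "N1", "g8_read": 1.0, "g8_math": 1.0, "enrolled_college": 1},
    {"id": "N2", "g8_read": -1.0, "g8_math": -0.9, "enrolled_college": 0},
    {"id": "N3", "g8_read": 0.0, "g8_math": 0.1, "enrolled_college": 1},
]

synthetic = link(early, later, keys=["g8_read", "g8_math"], donate=["enrolled_college"])
for row in synthetic:
    print(row["id"], row["k_read"], row["enrolled_college"])
```

Taking the single closest donor yields one imputed pathway per student. A multiple-imputation variant would instead draw several plausible donors per recipient and pool the results across the completed datasets, which is one way to reflect the uncertainty of the link.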