About Me
I am a Ph.D. candidate in Applied Statistics at Columbia University, bridging statistical modeling, psychometrics, and artificial intelligence. My research focuses on large language models (LLMs), exploring how retrieval-augmented generation (RAG) and causal inference can advance automated assessment, enhance model interpretability, and support data-driven decision making.
Recent News
Nov 2025
Received the Helen M. Walker Scholarship from Teachers College, Columbia University.
Nov 2025
Paper accepted to AAAI-26 (The 40th Annual AAAI Conference on Artificial Intelligence).
Oct 2025
Received the Provost's Grant for Doctoral Research from Teachers College, Columbia University.
Apr 2025
Paper "Extracting Latent Dimensions from Multidimensional Response Timing Data" accepted to CogSci 2025 (The 47th Annual Conference of the Cognitive Science Society).
Apr 2025
Presented research at the NCME 2025 Annual Meeting (National Council on Measurement in Education).
Education
2022–2026 (expected)
Ph.D. in Applied Statistics, Columbia University
Focus: Machine Learning, LLM-RAG, LLM-Causal Inference
Advisors: James E. Corter (primary), Lawrence T. DeCarlo
2020–2022
M.S. in Applied Statistics, Columbia University
2020
B.A. in Economics and Finance, University of Aberdeen
Publications
Journal Articles
Paper 2 under review
This paper is about the number 3. The number 4 is left for future work.
Paper under review
This paper is about the number 2. The number 3 is left for future work.
Conference Papers
Teaching
HUDM 5059: Psychological Measurement
Teaching Assistant
HUDM 4120: Introduction to Statistics
Teaching Assistant
HUDM 5123: Linear Models and Experimental Design
Teaching Assistant
HUDM 6055: Latent Structure Analysis
Teaching Assistant
CV
View Full CV →
Research Experience
Spring 2025 — Present: Graduate Research Assistant
- Institution: Teachers College, Columbia University
- Project: Education Leadership Data Analytics (ELDA)
- Research Focus: Conducted correspondence analysis on a large-scale dataset of state education records aligned with 16 NASEM equity indicators to uncover latent dimensions of equity representation across states.
Fall 2024 — Present: Ph.D. Researcher
- Institution: Columbia University, New York, NY
- Project: LLM-RAG for Automated Grading
- Research Focus: Developed a retrieval-augmented generation (RAG) framework for automated short-answer grading and feedback, integrating LLMs with psychometric and causal inference principles.
Jan 2023 — Jun 2024: Research Assistant
- Institution: Columbia University, New York, NY
- Project: NSF-Funded Course Recommendation-Causal Inference
- Research Focus: Processed large-scale NCES datasets to model student math course pathways across grade levels. Applied causal machine learning methods (TMLE, Causal Forests) to estimate heterogeneous intervention effects and design optimal course recommendation rules that promote fairness and maximize student outcomes.
Jan 2021 — Feb 2022: Master’s Researcher
- Institution: Columbia University
- Project: Machine and Deep Learning Research
- Research Focus: Built predictive models for crime data using principal component regression and model selection, improving predictive accuracy from 55% to 90%. Developed NLP pipelines for sentiment analysis and topic modeling, and analyzed classroom interaction networks using centrality and community measures to examine peer influence and group dynamics.
Skills
Tools
- Python
- R
- SQL
- SPSS
Machine Learning & AI
- Machine Learning
- LLM-RAG
- LLM-Causal Inference
Service and Leadership
- Reviewer, Cognitive Science Society Annual Meeting (CogSci 2025)
- Reviewer, AAAI Undergraduate Consortium (AAAI-UC 2026)
Awards
- Helen M. Walker Scholarship, 2025
- Provost's Grant, 2025
- Doctoral Fellowship, 2023
Courses
Machine Learning (Stanford Online)
Advanced seminar exploring supervised learning, unsupervised learning, learning theory, and reinforcement learning.
Applied Causal Inference for Data Science
This is an introductory, applied course in causal inference.
