Research
Causal Inference under Interference using Efficient Nonparametric Estimation
Causal Inference Research Lab (CIRL), UNC Chapel Hill, Jan 2022 ‑ Present
- Developed efficient nonparametric estimators of causal network effects under interference, building on semiparametric and empirical process theory.
- Used an ensemble of nonparametric and ML models (spline regression, GAM, boosting, Random Forest, neural network) via SuperLearner in R (see the sketch below this entry).
- Paper: Efficient Nonparametric Estimation of Stochastic Policy Effects with Clustered Interference.
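The ensemble above was fit with the SuperLearner package in R; the sketch below is only an illustrative Python analogue using scikit-learn's stacking, where the base learners, meta-learner, and simulated data are placeholders rather than the study's specification.

```python
# Illustrative sketch only: the actual analysis used SuperLearner in R.
# This shows an analogous cross-validated stacked ensemble in scikit-learn;
# base learners and data are placeholders, not the study's specification.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)

# Candidate learners are combined by a meta-learner fit on cross-validated predictions
# (SuperLearner typically uses non-negative weights; positive least squares is used here).
ensemble = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
        ("nnet", MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
    ],
    final_estimator=LinearRegression(positive=True),
    cv=5,
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```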
Drug Release Prediction for Reservoir‑Style Polymeric Drug Delivery Systems
Research Triangle Institute International, Mar 2022 ‑ Sep 2022
- Designed and built a drug release prediction model and visualized results in R, enabling rational drug implant design without extensive in vitro testing (see the sketch below this entry).
- Publication: Reservoir‑Style Polymeric Drug Delivery Systems: Empirical and Predictive Models for Implant Design.
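The publication's actual model form and data are not reproduced here, and the original work was done in R; the following is a hypothetical Python sketch of fitting an empirical cumulative-release curve (a simple power-law form) to placeholder in vitro measurements.

```python
# Hypothetical sketch only: the published model form and data may differ,
# and the original analysis was done in R. This illustrates fitting an
# empirical cumulative-release curve to placeholder in vitro data.
import numpy as np
from scipy.optimize import curve_fit

def power_law_release(t, k, n):
    """Cumulative fraction released at time t, assuming a simple power-law form."""
    return k * np.power(t, n)

# Placeholder time points (days) and cumulative fraction released.
t = np.array([1, 3, 7, 14, 21, 28], dtype=float)
released = np.array([0.05, 0.12, 0.22, 0.35, 0.45, 0.53])

params, _ = curve_fit(power_law_release, t, released, p0=[0.05, 0.5])
k, n = params
print(f"fitted k={k:.3f}, n={n:.3f}")
print("predicted release at day 60:", power_law_release(60.0, k, n))
```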
Fake News Detection using Machine Learning Methods
Machine Learning Course Project, UNC Chapel Hill, Aug 2021 ‑ Dec 2021
- Preprocessed fake news data with standard NLP procedures to generate Bag-of-Words, TF-IDF, and bigram features using Pandas.
- Trained ML (SVM, Random Forest, Logistic Regression) and DL (1D CNN, BERT, LSTM, Domain Adaptation) models for fake news detection using scikit-learn, PyTorch, and TensorFlow, achieving 91.4% test accuracy.
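A minimal sketch of the TF-IDF + Logistic Regression baseline named above; the texts, labels, and hyperparameters are placeholders, not the project's dataset or tuned settings.

```python
# Minimal sketch of the TF-IDF + Logistic Regression baseline named above;
# the dataset, preprocessing details, and settings here are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

texts = ["example real news article ...", "example fake news article ..."] * 50
labels = [0, 1] * 50  # 0 = real, 1 = fake (placeholder labels)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0
)

clf = Pipeline([
    # Unigrams + bigrams, mirroring the Bag-of-Words / bigram features above.
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), stop_words="english")),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```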
Women’s Health Initiative Proteome‑Wide Association Study
Yun Li Statistical Genetics Group, UNC Chapel Hill, Sep 2020 ‑ Aug 2021
- Analyzed levels of 552 proteins in 1,002 individuals from the Women's Health Initiative to identify protein quantitative trait loci (pQTLs) using EPACTS.
- Built protein-level prediction models in R using cross-validated Elastic Net and investigated cardiovascular disease-related proteins based on predicted levels.
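The original Elastic Net models were fit in R; the sketch below is an illustrative Python analogue of a cross-validated Elastic Net predicting a protein level from genotype-like features, with simulated placeholder data rather than WHI data.

```python
# Illustrative sketch only: the original models were fit in R. This shows a
# cross-validated Elastic Net predicting a protein level from genotype-like
# features, using simulated placeholder data rather than WHI data.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(1000, 200)).astype(float)  # placeholder SNP dosages (0/1/2)
beta = np.zeros(200)
beta[:5] = 0.5                                           # a few effect variants (placeholder)
y = X @ beta + rng.normal(scale=1.0, size=1000)          # placeholder protein levels

# Cross-validation jointly selects the penalty strength and the L1/L2 mix.
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9, 1.0], cv=5)
model.fit(X, y)
print("chosen alpha:", model.alpha_, "l1_ratio:", model.l1_ratio_)
```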
Growing Student Knowledge Distillation
Deep Learning Course Project, Seoul National University, Sep 2019 ‑ Dec 2019
- Proposed a novel knowledge distillation structure composed of a sequence of CNNs with an increasing number of layers, consecutively transferring knowledge from smaller to larger networks in PyTorch, resembling a student's cumulative learning process (see the sketch below this entry).
- Achieved 89.9% test accuracy on the CIFAR-10 dataset, a 0.2% improvement over the baseline ResNet26.
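A sketch of the distillation step described above, where an already-trained smaller network provides soft targets for the next, larger network in the sequence; the temperature, loss weighting, and toy architectures are placeholders, not the project's exact setup.

```python
# Sketch of one distillation step: a smaller, already-trained network provides
# soft targets for the next, larger network. Temperature, loss weighting, and
# the toy architectures are placeholders, not the project's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Weighted sum of soft-target KL loss and ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

# Toy stand-ins for consecutive networks of increasing capacity.
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 10))

x = torch.randn(8, 3, 32, 32)            # a CIFAR-10-sized batch
y = torch.randint(0, 10, (8,))
with torch.no_grad():
    t_logits = teacher(x)                # the smaller network is frozen at this stage
loss = distillation_loss(student(x), t_logits, y)
loss.backward()
print(loss.item())
```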