Research
Causal Inference under Interference using Efficient Nonparametric Estimation
Causal Inference Research Lab (CIRL), UNC Chapel Hill, Jan 2022 ‑ Present
- Developed efficient nonparametric estimators of causal network effects under interference, building on semiparametric and empirical process theory.
- Used an ensemble of nonparametric and ML models (spline regression, GAM, boosting, Random Forest, neural network) via SuperLearner in R (see the sketch below this entry).
- Paper: Efficient Nonparametric Estimation of Stochastic Policy Effects with Clustered Interference.
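The ensemble above was fit with the SuperLearner package in R; the sketch below is only an illustrative Python analogue using scikit-learn's stacking, where the base learners, meta-learner, and simulated data are placeholders rather than the study's specification.

```python
# Illustrative sketch only: the actual analysis used SuperLearner in R.
# This shows an analogous cross-validated stacked ensemble in scikit-learn;
# base learners and data are placeholders, not the study's specification.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)

# Candidate learners are combined by a meta-learner fit on cross-validated predictions
# (SuperLearner typically uses non-negative weights; positive least squares is used here).
ensemble = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
        ("nnet", MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
    ],
    final_estimator=LinearRegression(positive=True),
    cv=5,
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```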
Drug Release Prediction for Reservoir‑Style Polymeric Drug Delivery Systems
Research Triangle Institute International, Mar 2022 ‑ Sep 2022
- Designed and built a drug release prediction model and visualized results in R, enabling rational drug implant design without extensive in vitro testing (see the sketch below this entry).
- Publication: Reservoir‑Style Polymeric Drug Delivery Systems: Empirical and Predictive Models for Implant Design.
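The publication's actual model form and data are not reproduced here, and the original work was done in R; the following is a hypothetical Python sketch of fitting an empirical cumulative-release curve (a simple power-law form) to placeholder in vitro measurements.

```python
# Hypothetical sketch only: the published model form and data may differ,
# and the original analysis was done in R. This illustrates fitting an
# empirical cumulative-release curve to placeholder in vitro data.
import numpy as np
from scipy.optimize import curve_fit

def power_law_release(t, k, n):
    """Cumulative fraction released at time t, assuming a simple power-law form."""
    return k * np.power(t, n)

# Placeholder time points (days) and cumulative fraction released.
t = np.array([1, 3, 7, 14, 21, 28], dtype=float)
released = np.array([0.05, 0.12, 0.22, 0.35, 0.45, 0.53])

params, _ = curve_fit(power_law_release, t, released, p0=[0.05, 0.5])
k, n = params
print(f"fitted k={k:.3f}, n={n:.3f}")
print("predicted release at day 60:", power_law_release(60.0, k, n))
```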
Fake News Detection using Machine Learning Methods
Machine Learning Course Project, UNC Chapel Hill, Aug 2021 ‑ Dec 2021
- Preprocessed fake news data with standard NLP procedures to generate Bag-of-Words, TF-IDF, and bigram features using Pandas.
- Trained ML (SVM, Random Forest, Logistic Regression) and DL (1D CNN, BERT, LSTM, Domain Adaptation) models for fake news detection using scikit-learn, PyTorch, and TensorFlow, achieving 91.4% test accuracy.
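A minimal sketch of the TF-IDF + Logistic Regression baseline named above; the texts, labels, and hyperparameters are placeholders, not the project's dataset or tuned settings.

```python
# Minimal sketch of the TF-IDF + Logistic Regression baseline named above;
# the dataset, preprocessing details, and settings here are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

texts = ["example real news article ...", "example fake news article ..."] * 50
labels = [0, 1] * 50  # 0 = real, 1 = fake (placeholder labels)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0
)

clf = Pipeline([
    # Unigrams + bigrams, mirroring the Bag-of-Words / bigram features above.
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), stop_words="english")),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```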
Women’s Health Initiative Proteome‑Wide Association Study
Yun Li Statistical Genetics Group, UNC Chapel Hill, Sep 2020 ‑ Aug 2021
- Analyzed levels of 552 proteins in 1,002 individuals from the Women's Health Initiative to identify protein quantitative trait loci (pQTLs) using EPACTS.
- Built protein-level prediction models in R using cross-validated Elastic Net and investigated cardiovascular disease-related proteins based on predicted levels.
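The original Elastic Net models were fit in R; the sketch below is an illustrative Python analogue of a cross-validated Elastic Net predicting a protein level from genotype-like features, with simulated placeholder data rather than WHI data.

```python
# Illustrative sketch only: the original models were fit in R. This shows a
# cross-validated Elastic Net predicting a protein level from genotype-like
# features, using simulated placeholder data rather than WHI data.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(1000, 200)).astype(float)  # placeholder SNP dosages (0/1/2)
beta = np.zeros(200)
beta[:5] = 0.5                                           # a few effect variants (placeholder)
y = X @ beta + rng.normal(scale=1.0, size=1000)          # placeholder protein levels

# Cross-validation jointly selects the penalty strength and the L1/L2 mix.
model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9, 1.0], cv=5)
model.fit(X, y)
print("chosen alpha:", model.alpha_, "l1_ratio:", model.l1_ratio_)
```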
Growing Student Knowledge Distillation
Deep Learning Course Project, Seoul National University, Sep 2019 ‑ Dec 2019
- Proposed a novel knowledge distillation structure composed of a sequence of CNNs with an increasing number of layers, consecutively transferring knowledge from smaller to larger networks in PyTorch, resembling a student's cumulative learning process (see the sketch below this entry).
- Achieved 89.9% test accuracy on the CIFAR-10 dataset, a 0.2% improvement over the baseline ResNet26.
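A sketch of the distillation step described above, where an already-trained smaller network provides soft targets for the next, larger network in the sequence; the temperature, loss weighting, and toy architectures are placeholders, not the project's exact setup.

```python
# Sketch of one distillation step: a smaller, already-trained network provides
# soft targets for the next, larger network. Temperature, loss weighting, and
# the toy architectures are placeholders, not the project's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Weighted sum of soft-target KL loss and ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

# Toy stand-ins for consecutive networks of increasing capacity.
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 10))

x = torch.randn(8, 3, 32, 32)            # a CIFAR-10-sized batch
y = torch.randint(0, 10, (8,))
with torch.no_grad():
    t_logits = teacher(x)                # the smaller network is frozen at this stage
loss = distillation_loss(student(x), t_logits, y)
loss.backward()
print(loss.item())
```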