Multi-ancestry polygenic risk predictions

Polygenic risk scores (PRS) are useful for predicting various phenotypes. However, because most PRS are developed using predominantly European-ancestry data, their performance in non-European populations is often poorer. To improve PRS performance in non-European populations, we propose a new method that takes advantage of both existing large GWAS from European populations and smaller GWAS from non-European populations.

Robust Mendelian randomization methods incorporating weak and correlated instruments

Mendelian randomization (MR) is a major tool for testing causal associations between risk factors and disease using genetic variants as instrumental variables. MR makes several strong assumptions, which can be violated in practice and lead to biased estimates and invalid statistical inference. We develop a robust and powerful MR method requiring weaker and more realistic assumptions.

Testing for genetic association and building risk prediction models for cancer incorporating tumor characteristics

Breast cancer represents a heterogeneous group of diseases with different molecular and clinical features. Thus, clarifying potentially heterogeneous associations between genes and disease subtypes offers a tremendous opportunity to characterize distinct etiological pathways. I proposed several new computational and statistical approaches to develop powerful genetic association tests.

Scientific collaborations

As a statistician and data scientist, I sincerely believe collaborations across disciplines are critical to scientific discovery and methodology development. I have participated in broad research collaborations, including genetic association analysis for gallbladder cancer, effect-size distribution analysis for fourteen cancers, gene-environment interaction testing, and risk factor analysis for bladder cancer.