We are a group of highly motivated and creative people, discovering and innovating together to make precision medicine a reality through the catalytic power of big genomic data, statistical genetics, and artificial intelligence.
Genetic modifier discovery for common and rare diseases
Not all people carrying disease-causing mutations or high-risk genetic variants develop the disease. Identifying genetic and non-genetic modifiers is therefore important for understanding disease mechanisms and for building useful, accurate statistical models for disease risk prediction and prevention. Using admixture mapping and gene expression analysis, we identified the UBD gene as a genetic modifier of APOL1-related kidney disease (carrying two high-risk APOL1 alleles increases disease risk about tenfold under a recessive model). Follow-up cell-based experiments showed that the UBD and APOL1 proteins physically interact and that UBD expression mitigates APOL1-mediated cell death (Zhang*, Wang* et al. 2018, PNAS). In a separate line of work, using three diseases as examples (coronary artery disease, breast cancer, and colorectal cancer) and modeling rare monogenic disease-causing mutations against the polygenic risk background, we showed that polygenic risk has a significant impact on disease penetrance in individuals carrying monogenic mutations, in some cases bringing risk closer to the population average (Fahed*, Wang*, Homburger*, et al. 2020, Nat. Commun). Importantly, this model is generally applicable to many other conditions. Integrating the risk conferred by monogenic mutations with the polygenic background has the potential to provide more accurate risk prediction and to enable precise patient stratification and management.
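To make the idea of a monogenic mutation acting against a polygenic background concrete, the following is a minimal sketch, not the model fitted in the cited study. It assumes a logistic model in which carrier status and a standardized polygenic score contribute additively on the log-odds scale; the function name `disease_probability` and every coefficient value are hypothetical and chosen only for illustration.

```python
import math


def disease_probability(carrier, prs_z,
                        intercept=-3.0,
                        log_or_carrier=1.5,
                        log_or_per_sd=0.5):
    """Illustrative disease probability under an additive log-odds model.

    carrier: 1 if the person carries a monogenic mutation, else 0.
    prs_z:   polygenic score in standard-deviation units (0 = population mean).
    All coefficients are made-up placeholders, not estimates from any study.
    """
    logit = intercept + log_or_carrier * carrier + log_or_per_sd * prs_z
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Compare carriers across the polygenic range with an average non-carrier.
    baseline = disease_probability(carrier=0, prs_z=0.0)
    print(f"non-carrier, average polygenic score: {baseline:.3f}")
    for z in (-3.0, 0.0, 3.0):
        p = disease_probability(carrier=1, prs_z=z)
        print(f"carrier, polygenic score {z:+.0f} SD: {p:.3f}")
```

With these placeholder coefficients, a carrier whose polygenic score is three standard deviations below the mean has the same modeled risk as an average non-carrier, while a carrier at three standard deviations above the mean has a much higher risk. Real analyses would estimate the coefficients from cohort data and check whether an interaction term is needed.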
Method development and statistical modeling
Identifying genes that were positively selected during human evolution helps us understand not only human adaptation but also the mechanisms underlying disease risk. Rapid advances in next-generation sequencing and genotyping technology have generated a large amount of genetic data in recent years. However, methods that accurately and efficiently identify genes under positive selection from these data remain very limited, and pinpointing the driver mutation that was selected is even more challenging. We therefore designed a fast and accurate method for detecting genes under positive selection and locating the driver mutations. The algorithm is comparable in statistical power to iHS, the most effective and most commonly used algorithm at that time, but runs more than 10,000 times faster. More importantly, it locates positively selected sites significantly more accurately than iHS. In addition, when selection is relatively weak, its ability to localize the selected site far exceeds that of other traditional algorithms (Wang et al. 2014, Mol Biol Evol; He, Wang et al. 2015, Genome Research).