Abstract
Gene–environment interaction (GxE) has been emphasized as one potential source of the missing genetic variation in disease traits, and the ultimate goal of GxE research is the prediction of individual risk and the prevention of complex diseases. However, the statistical analysis of GxE poses various challenges. In this paper, we focus on three methodological challenges: (i) the high dimensionality of genes; (ii) the hierarchical structure between interaction effects and their corresponding main effects; and (iii) the correlation among subjects in family-based population studies. We propose an algorithm that addresses all three challenges simultaneously. This is the first penalized method focusing on an interaction search based on a linear mixed effect model. For verification, we compare the empirical performance of the new method with that of existing methods in a simulation study. The results show that our method outperforms the existing methods across the simulation settings, and its advantage grows as the correlation among subjects increases. In addition, the new method provides a robust estimate of the correlation among subjects. We also apply the new method to data from the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study.
Original language | English |
---|---|
Pages (from-to) | 3547-3559 |
Number of pages | 13 |
Journal | Statistics in Medicine |
Volume | 36 |
Issue number | 22 |
State | Published - 30 Sep 2017 |
Keywords
- GOLDN
- GxE
- SCAD
- family-based populations
- gene–environment interaction
- variable selection