Inference of Causal Relationships between Biomarkers and Outcomes in High Dimensions
Felix Agakov, Paul McKeigue, Jon Krohn, Jonathan Flint
We describe a unified computational framework for learning causal dependencies between genotypes, biomarkers, and phenotypic outcomes from large-scale data. In contrast to previous studies, our framework allows for noisy measurements, hidden confounders, missing data, and pleiotropic effects of genotypes on outcomes. The method uses genotypes as “instrumental variables” to infer causal associations between phenotypic biomarkers and outcomes, without requiring the assumption that genotypic effects are mediated only through the observed biomarkers. The framework builds on sparse linear methods developed in statistics and machine learning, modified here to infer the structure of richer networks with latent variables.
Where the biomarkers are gene transcripts, the method can be used for fine mapping of quantitative trait loci (QTLs) detected in genetic linkage studies. To demonstrate our method, we examined the effects of gene transcript levels in the liver on plasma HDL cholesterol levels in a sample of 260 mice from a heterogeneous stock.
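To make the instrumental-variable idea concrete, the following is a minimal sketch of the classical single-instrument estimator (the Wald ratio) on simulated data. It is not the sparse latent-variable framework described above: it assumes one genotype, one biomarker, a linear model, and no pleiotropy, which is exactly the assumption the framework is designed to relax. The function names (`simulate`, `ols_slope`, `iv_estimate`), the effect sizes, and the allele frequency are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=20000, beta=0.5, seed=0):
    """Simulate genotype g, biomarker x, and outcome y with a hidden confounder u.

    All coefficients here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # allele counts 0/1/2
    u = rng.normal(size=n)                          # unobserved confounder
    x = 0.8 * g + u + rng.normal(size=n)            # biomarker depends on g and u
    y = beta * x + 1.5 * u + rng.normal(size=n)     # outcome depends on x and u only
    return g, x, y


def ols_slope(a, b):
    """Slope of the least-squares regression of b on a (with intercept)."""
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float(a_c @ b_c / (a_c @ a_c))


def iv_estimate(g, x, y):
    """Wald ratio: effect of g on y divided by effect of g on x.

    Valid only if g affects y solely through x (no pleiotropy).
    """
    return ols_slope(g, y) / ols_slope(g, x)


g, x, y = simulate()
naive = ols_slope(x, y)      # biased upward by the hidden confounder
iv = iv_estimate(g, x, y)    # close to the simulated causal effect beta
print(f"naive regression slope: {naive:.3f}")
print(f"instrumental-variable estimate: {iv:.3f}")
```

In this simulation the naive regression of outcome on biomarker overstates the effect because the confounder drives both, while the genotype-based ratio recovers a value near the simulated effect. If the genotype also influenced the outcome directly, the ratio would be biased, which is the situation the pleiotropy-tolerant framework targets.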