Data from family studies are challenging to model for two reasons: the families are selected into the study, and variables measured within a family are correlated. Families are typically recruited on the basis of the primary phenotype (disease), for example by requiring at least two cases per family. To obtain correct parameter estimates from family data, it is essential to take the study design into account.
Modelling family structure is relevant because traits segregate within families due to shared genetic, environmental and lifestyle factors. For many complex diseases the effects of lifestyle and environment are not well understood. Moreover, not all genetic markers have been identified. Family history therefore carries information on individual disease risk.
This presentation focuses on the analysis of secondary phenotypes, i.e. other traits which are measured and modelled in addition to the primary phenotype. Examples are triglycerides in long-lived families and EEG measurements in families with social anxiety disorder. There is an extensive literature on the analysis of secondary phenotypes in case-control series, but methods for family data are lacking. Two approaches are typically used for the analysis of secondary phenotypes in families: either the selection of the families is ignored, or the likelihood conditional on the trait values of the cases (primary phenotypes) is used. We use directed acyclic graphs (DAGs) to show that these approaches may yield biased estimates, and we propose a secondary phenotype analysis method for family studies which is based on a joint model for the primary and secondary phenotypes. Parameter estimates are obtained by maximizing the retrospective log-likelihood.
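As a rough illustration of why ignoring the selection of families can bias a secondary phenotype analysis, the following minimal sketch simulates families with a shared random effect and compares the naive mean of a secondary phenotype in ascertained families with its population value. It is not the joint model or the retrospective likelihood proposed here. The data-generating model, the function names and all numerical values (family size, liability threshold, effect sizes) are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_families(n_families=20000, family_size=4, seed=0):
    """Simulate primary (binary) and secondary (continuous) phenotypes.

    Hypothetical model: a shared family effect u drives both phenotypes,
    so they are correlated within and across family members.
    """
    rng = np.random.default_rng(seed)
    # Shared family effect (genetic, environmental, lifestyle factors).
    u = rng.normal(0.0, 1.0, size=(n_families, 1))
    # Primary phenotype: case if an underlying liability exceeds a threshold.
    liability = u + rng.normal(0.0, 1.0, size=(n_families, family_size))
    primary = liability > 1.5
    # Secondary phenotype: population mean 0, linked to the primary via u.
    secondary = 0.5 * u + rng.normal(0.0, 1.0, size=(n_families, family_size))
    return primary, secondary


def naive_mean_in_ascertained(primary, secondary, min_cases=2):
    """Mean of the secondary phenotype in families with >= min_cases cases,
    computed as if the families were a random sample (selection ignored)."""
    selected = primary.sum(axis=1) >= min_cases
    return secondary[selected].mean(), int(selected.sum())


primary, secondary = simulate_families()
population_mean = secondary.mean()
ascertained_mean, n_selected = naive_mean_in_ascertained(primary, secondary)

print(f"population mean of secondary phenotype: {population_mean:.3f}")
print(f"naive mean in {n_selected} ascertained families: {ascertained_mean:.3f}")
```

Under these assumptions the naive estimate in the ascertained families is shifted away from the population mean, because selecting on the number of cases also selects on the shared family effect that drives the secondary phenotype.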
The performance of our approach is compared with that of the other approaches via simulations, and the methods are applied to data from family studies. We conclude that our method provides correct parameter estimates and should therefore be used for the analysis of secondary phenotypes in ascertained families.