Project I2: Prediction of phenotypical responses - from model systems to human diseases

Project I2 aims at an integrative analysis of different types of data sets from the liver to improve the prediction of the response of this organ to interventions with toxic model compounds (e.g. paracetamol).

The project focuses on comparing healthy liver with fatty liver, the most common liver disease in Western countries. It aims to understand how fat accumulation in the liver perturbs hepatic metabolic functions, including detoxification capacity. This might help to explain the wide range of complications associated with human fatty liver disease, including the development of cirrhosis and hepatocellular carcinoma.

Gene expression data and DNA methylation data from a mouse model of non-alcoholic fatty liver disease (NAFLD) (leptin-deficient ob/ob mice) are already available. Since sexual dimorphism in hepatotoxicity has been reported, data from both male and female mice have been collected. Proteome and metabolome data will be generated as well. For validation, a second model of NAFLD, namely Western diet-fed mice, is used. Phenotypic parameters such as plasma transaminases, inflammation and fibrosis are recorded as surrogates for liver damage after treatments.

In addition to the mouse data sets, transcriptomes and class I epigenomes (including DNA methylation, DNase I-seq and ChIP-seq) have been generated from primary human hepatocytes isolated from 12 donors (control and fatty liver, male and female, each n=3). Gene expression and DNA methylation data are also available from a human validation cohort (n=43). In this project, the functional relationship between gene expression and DNA methylation is of particular interest. Traditionally, the two have been regarded as inversely correlated: hypomethylated CpGs were frequently found in transcriptionally active promoter regions, and vice versa (Wagner et al., 2014). However, the relationship is more complex, and the contribution of DNA methylation to gene expression alterations in fatty liver and hepatotoxicity is far from understood.
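
The inverse relationship described above can be made concrete with a per-gene correlation between expression and methylation across samples. The following is only a minimal sketch: the function name `per_gene_correlation` and the toy values are hypothetical, and it assumes that both data types have already been summarised to one value per gene and sample (for example, mean promoter methylation), which is a choice not specified in the project description.

```python
import numpy as np

def per_gene_correlation(expression, methylation):
    """Pearson correlation between expression and methylation for each gene.

    Both inputs have shape (n_genes, n_samples) with matching rows and
    columns. Genes with zero variance in either data type yield NaN.
    """
    expression = np.asarray(expression, dtype=float)
    methylation = np.asarray(methylation, dtype=float)
    if expression.shape != methylation.shape:
        raise ValueError("inputs must have the same shape")
    # Centre each gene across samples.
    e = expression - expression.mean(axis=1, keepdims=True)
    m = methylation - methylation.mean(axis=1, keepdims=True)
    cov = (e * m).sum(axis=1)
    denom = np.sqrt((e ** 2).sum(axis=1) * (m ** 2).sum(axis=1))
    # 0/0 for constant genes is reported as NaN without a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return cov / denom

# Hypothetical toy data: 3 genes measured in 4 samples.
toy_expression = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4]]
toy_methylation = [[0.9, 0.7, 0.5, 0.3], [0.8, 0.6, 0.4, 0.2], [0.5, 0.5, 0.5, 0.5]]
r = per_gene_correlation(toy_expression, toy_methylation)
# Gene 0: methylation falls as expression rises (r close to -1).
# Gene 1: both fall together (r close to +1).
# Gene 2: constant methylation, so the correlation is undefined (NaN).
```

With only a handful of samples per group, as in the donor design above, such correlations are noisy, which is one motivation for modelling them jointly across genes rather than interpreting them one by one.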

For the integrative analysis of gene expression and epigenetic data, Bayesian mixture models for the correlations have proven useful (see, e.g., Klein et al., 2014). In this project, we will adapt this approach to the integrative analysis of gene expression and DNA methylation data. The integrative analysis of more than two data types has hardly been researched (cf. Ickstadt et al., 2018). One approach that we will also pursue employs a Bayesian hierarchical model that integrates multiple genomic variables to detect differences between biological conditions (Schäfer et al., 2017).
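
To illustrate the general idea of a mixture model on correlations, the sketch below fits a two-component Gaussian mixture to Fisher z-transformed correlations with a plain EM algorithm. This is a deliberately simplified, non-Bayesian stand-in and not the model of Klein et al. (2014); the function names, the number of components, the initialisation and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fisher_z(r, eps=1e-6):
    """Map correlations in (-1, 1) to the real line; clip to avoid infinities."""
    return np.arctanh(np.clip(np.asarray(r, dtype=float), -1 + eps, 1 - eps))

def _responsibilities(z, weights, means, sds):
    # Weighted normal densities, one column per mixture component.
    dens = (weights * np.exp(-0.5 * ((z[:, None] - means) / sds) ** 2)
            / (sds * np.sqrt(2.0 * np.pi)))
    return dens / dens.sum(axis=1, keepdims=True)

def fit_two_component_mixture(z, n_iter=200):
    """EM for a two-component Gaussian mixture on a 1-D array z.

    Returns (weights, means, sds, responsibilities); responsibilities[i, k]
    is the estimated probability that z[i] belongs to component k.
    """
    z = np.asarray(z, dtype=float)
    # Initialise the component means at the lower and upper quartiles.
    means = np.percentile(z, [25, 75])
    sds = np.full(2, z.std() + 1e-6)
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        resp = _responsibilities(z, weights, means, sds)       # E-step
        nk = resp.sum(axis=0)                                   # M-step
        weights = nk / z.size
        means = (resp * z[:, None]).sum(axis=0) / nk
        sds = np.sqrt((resp * (z[:, None] - means) ** 2).sum(axis=0) / nk) + 1e-6
    return weights, means, sds, _responsibilities(z, weights, means, sds)

# Simulated correlations (hypothetical): 30% of genes negatively correlated,
# 70% centred on zero.
rng = np.random.default_rng(0)
simulated_r = np.tanh(np.concatenate([rng.normal(-1.0, 0.3, 300),
                                      rng.normal(0.0, 0.3, 700)]))
weights, means, sds, resp = fit_two_component_mixture(fisher_z(simulated_r))
```

The responsibilities in `resp` play the role that posterior component probabilities would play in a Bayesian mixture: they indicate, for each gene, how plausibly its correlation belongs to the negatively correlated group rather than to the group centred on zero.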

Although the mouse data and the human hepatocyte data will be analysed separately, we will also explore whether the findings in the mouse model reflect the situation in human hepatocytes. If so, the results from the mouse model might be used to formulate informative priors for a Bayesian analysis of the human hepatocytes. The Bayesian modelling approach also allows external information from functional networks to be included as prior information.
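
How an informative prior from the mouse model could enter a human analysis can be sketched with the simplest conjugate case: a normal prior on an effect combined with a normally distributed estimate with known standard error. The function `normal_posterior` and all numbers are hypothetical, and the models actually planned in the project are considerably richer than this one-parameter example.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and sd of a normal mean with known standard error.

    The posterior mean is the precision-weighted average of the prior mean
    and the observed estimate.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / data_se ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, post_prec ** -0.5

# Hypothetical effect estimates for one gene (all values are made up).
mouse_effect, mouse_sd = -0.5, 0.2   # used as the informative prior
human_effect, human_se = -0.1, 0.2   # estimate from the human data

informative = normal_posterior(mouse_effect, mouse_sd, human_effect, human_se)
vague = normal_posterior(0.0, 10.0, human_effect, human_se)
```

With equal precisions, the informative posterior mean lies halfway between the mouse and the human estimate, whereas the vague prior leaves the human estimate almost unchanged. Whether borrowing from the mouse model is justified depends on the comparison between the two systems described above.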


  • Ickstadt K, Schäfer M, Zucknick M (2018). Toward integrative Bayesian analysis in molecular biology. Annual Review of Statistics and Its Application 5, 141–167, doi: 10.1146/annurev-statistics-031017-100438.
  • Klein HU, Schäfer M, Porse BT, Hasemann MS, Ickstadt K, Dugas M (2014). Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models. Bioinformatics 30(8), 1154–1162, doi: 10.1093/bioinformatics/btu003.
  • Schäfer M, Klein HU, Schwender H (2017). Integrative analysis of multiple genomic variables using a hierarchical Bayesian model. Bioinformatics 33, 3220–3227, doi: 10.1093/bioinformatics/btx356.
  • Wagner JR, Busche S, Ge B, Kwan T, Pastinen T, Blanchette M (2014). The relationship between DNA methylation, genetic and expression inter-individual variation in untransformed human fibroblasts. Genome Biol 15(2), R37, doi: 10.1186/gb-2014-15-2-r37.