Informatics

AG Hildebrandt

Bioinformatics

Andreas Hildebrandt's group focuses on developing and applying modern Bioinformatics techniques to research questions in the Life Sciences. Current projects of the group encompass studies of different human diseases and their therapies. Methodologically, the group's expertise ranges from Computational Proteomics, Transcriptomics, and Structural Bioinformatics to Visualization and Image and Volume Analysis.
Within the DIASyM Research Core, the group will develop tools for evaluating mass spectrometric raw data, including suitable strategies for storing and handling the enormous amount of raw data with methods such as locality-sensitive hashing. In addition to classical signal processing methods, state-of-the-art machine learning methods based on deep neural networks will be employed to analyse proteomic, metabolomic and lipidomic data.
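
To illustrate the hashing idea mentioned above, usually called locality-sensitive hashing (LSH), the following is a minimal sketch and not the group's implementation. It assumes that spectra have already been binned into fixed-length intensity vectors and uses the random-hyperplane variant of LSH, so that spectra with similar peak patterns receive similar bit signatures. The function names, the toy spectra, and the choice of 32 bits are all assumptions made for illustration.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(spectrum, hyperplanes):
    """Return one bit per hyperplane: which side of the plane the spectrum lies on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, spectrum))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def hamming(sig_a, sig_b):
    """Number of differing bits; small values suggest a small angle between spectra."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


# Toy binned spectra (intensity per m/z bin); the values are invented for illustration.
spec_a = [0.0, 5.0, 0.0, 9.0, 1.0, 0.0, 3.0, 0.0]
spec_b = [0.0, 5.2, 0.1, 8.7, 1.1, 0.0, 2.9, 0.0]  # near-duplicate of spec_a
spec_c = [7.0, 0.0, 4.0, 0.0, 0.0, 6.0, 0.0, 2.0]  # different peak pattern

planes = make_hyperplanes(n_bits=32, dim=8, seed=42)
sig_a = lsh_signature(spec_a, planes)
sig_b = lsh_signature(spec_b, planes)
sig_c = lsh_signature(spec_c, planes)

print("a vs b:", hamming(sig_a, sig_b))
print("a vs c:", hamming(sig_a, sig_c))
```

In such a scheme the short signatures, rather than the full spectra, would serve as keys for grouping or looking up similar spectra. Because the signature depends only on the sign of each projection, it is unaffected by uniformly rescaling a spectrum's intensities.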

AG Kramer

Data Mining

The machine learning subproject will analyze the preprocessed data using modern machine learning methods. In addition, biases need to be taken into account and reduced so that the results are comprehensible and, in principle, interpretable. The subproject focuses on the differential analysis of multi-OMICs data over time with respect to a specified target variable (e.g., an event such as heart failure, HF). The approach is based on multi-view learning, where proteomics, metabolomics, and lipidomics each represent a view of the data. Other views include demographic variables and medical record data. Across the work steps, the views and the temporal information should be handled flexibly, without making too many assumptions in advance about the depth of integration or the relevance of specific time points or time intervals.
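
The flexible handling of views and time points described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the nested-dictionary layout, the function names `early_fusion` and `late_fusion`, the feature values, and the placeholder scorers are all hypothetical and not taken from the subproject. The sketch contrasts two integration depths: concatenating features from several views (early fusion) and combining per-view scores (late fusion).

```python
from statistics import mean

# Hypothetical layout for one patient: views[view_name][time_point] -> features.
# All numbers are invented placeholders.
patient = {
    "proteomics": {0: [1.2, 0.4], 12: [1.5, 0.3]},
    "metabolomics": {0: [0.8], 12: [0.9]},
    "lipidomics": {0: [2.0, 2.1], 12: [2.4, 1.9]},
    "demographics": {0: [63.0]},  # static view, recorded once
}


def early_fusion(views, view_names, time_points):
    """Concatenate the features of the chosen views and time points into one vector."""
    fused = []
    for name in view_names:
        for t in time_points:
            # A view without data at time t simply contributes nothing.
            fused.extend(views[name].get(t, []))
    return fused


def late_fusion(views, view_names, time_points, scorers):
    """Score each view on its own, then average the per-view scores."""
    scores = []
    for name in view_names:
        vec = early_fusion(views, [name], time_points)
        if vec:  # skip views with no data in the chosen time window
            scores.append(scorers[name](vec))
    return mean(scores)


# Placeholder scorers; in practice each would be a model trained on that view.
scorers = {name: max for name in patient}

print(early_fusion(patient, ["proteomics", "metabolomics"], [0, 12]))
print(late_fusion(patient, ["proteomics", "demographics"], [0, 12], scorers))
```

Keeping the choice of views and time points as arguments, rather than fixing them in the data layout, is one way to avoid committing early to a particular integration depth or time window.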

AG Andrade

Computational Biology and Data Mining

Univ.-Prof. Miguel Andrade heads the working group Computational Biology and Data Mining at the Faculty of Biology of the Johannes Gutenberg-University in Mainz. In close collaboration with Dr. Laura Bindila’s group, text and data mining of the interactions, metabolism, and disease associations of lipids will be carried out, since only very few public resources exist. Dictionaries of lipid names will be generated based on a literature search in PubMed Central. Using co-occurrence analysis in grammatically parsed text, significant patterns identifying relations of lipids to other metabolites, their physical interactions, and their associations with human disease will be deduced. These curated lists of proteins and public resources of protein expression will be adapted to plasma and chronic cardiovascular diseases (CVD). Subsequently, genes associated with these phenotypes or mentioned significantly often in the CVD literature, as well as disease terms, will be identified and used to develop methods for studying gene and protein expression specific to heart failure.
This strategy will be used to develop a disease-module approach to the study of patient signatures, creating networks with multiple layers including lipids, metabolites, and gene/protein expression. The associations obtained from deep learning methods and from other relevant systems-medicine analyses will be integrated into a web-based database. Users will be able to submit marker data and sample data for evaluation and will obtain predictions of pathological states. During the time span of DIASyM, data from several follow-up studies will become available and will be used to optimize a method for predicting outcomes based on time-series analyses of networks of interacting molecules, taking disease-associated modules into account.
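
The co-occurrence analysis mentioned above can be illustrated with a simplified sketch. It makes several assumptions that are not in the project description: sentences are split with a regular expression instead of a grammatical parser, dictionary terms are matched as whole words, and significance is estimated with a one-sided hypergeometric test on sentence counts. The terms `lipidA`, `lipidB`, and `diseaseX` and the sentences are invented placeholders, not literature findings.

```python
import re
from math import comb


def split_sentences(text):
    """Naive sentence splitter; a real pipeline would use a grammatical parser."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def mentions(sentence, term):
    """True if the dictionary term occurs as a whole word in the sentence."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, sentence) is not None


def cooccurrence_pvalue(sentences, term_a, term_b):
    """Return (observed co-occurrences, P(at least that many) under independence)."""
    n = len(sentences)
    has_a = [mentions(s, term_a) for s in sentences]
    has_b = [mentions(s, term_b) for s in sentences]
    n_a, n_b = sum(has_a), sum(has_b)
    k = sum(a and b for a, b in zip(has_a, has_b))
    # Upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_a, i) * comb(n - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / comb(n, n_b)


# Invented placeholder corpus.
text = (
    "LipidA was elevated in samples with diseaseX. "
    "LipidA accumulation tracked diseaseX progression. "
    "LipidB was measured in plasma. "
    "The cohort included many participants. "
    "DiseaseX was confirmed by imaging. "
    "LipidB did not change over time."
)
sentences = split_sentences(text)

for lipid in ("lipidA", "lipidB"):
    count, p_value = cooccurrence_pvalue(sentences, lipid, "diseaseX")
    print(lipid, count, round(p_value, 3))
```

On a real corpus, the number of lipid-term pairs tested would be large, so the resulting p-values would need correction for multiple testing before any pair is called significant.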