Informatics | DIASyM

Bioinformatics

Prof. Dr. Andreas Hildebrandt

Prof. Dr. Andreas Hildebrandt's bioinformatics group, located at the Institute for Computer Science at the Johannes Gutenberg-University, focuses on the development and application of modern bioinformatics techniques for a range of research questions in the life sciences. Current projects of the group encompass studies of different human diseases and their therapies. Methodologically, the group's expertise ranges from Computational Proteomics, Transcriptomics, Structural Bioinformatics, and Visualization to Image and Volume Analysis.

Within the DIASyM Research Core, the group will develop tools for the evaluation of mass spectrometric raw data, including suitable strategies to store and handle the enormous amount of raw data using methods such as locality-sensitive hashing. In addition to classical signal processing methods, state-of-the-art machine learning methods based on deep neural networks will be employed to analyse proteomic, metabolomic and lipidomic data.
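
To illustrate the idea behind locality-sensitive hashing, the following is a minimal sketch, not the group's actual implementation. It assumes that each spectrum has already been binned into a fixed-length intensity vector, and it uses random hyperplanes (a common hashing family for cosine similarity). All function names and the toy vectors are hypothetical.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """Return one bit per hyperplane: the side of the plane the vector falls on."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0.0 else 0
        for plane in hyperplanes
    )


def build_index(spectra, hyperplanes):
    """Group spectrum names into buckets keyed by their signature."""
    buckets = {}
    for name, vector in spectra.items():
        buckets.setdefault(lsh_signature(vector, hyperplanes), []).append(name)
    return buckets


# Hypothetical toy data: binned intensity vectors of three spectra.
planes = make_hyperplanes(n_bits=8, dim=4, seed=1)
spectra = {
    "scan_1": [0.0, 3.0, 1.0, 0.5],
    "scan_2": [0.0, 6.0, 2.0, 1.0],  # same shape as scan_1, twice the intensity
    "scan_3": [5.0, 0.0, 0.0, 2.0],
}
index = build_index(spectra, planes)
```

Vectors that point in similar directions tend to share a signature, so a similarity search only has to compare spectra within the same bucket instead of against the whole data set. In this sketch, scan_1 and scan_2 differ only by a positive scale factor and therefore always receive the same signature.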

Data Mining

Prof. Dr. Stefan Kramer

Prof. Dr. Stefan Kramer heads the Data Mining Group at the Institute for Computer Science at the Johannes Gutenberg-University. The group will analyze the preprocessed data with the help of modern machine learning methods. In addition, biases need to be taken into account and reduced so that the results are comprehensible and, in principle, interpretable.

The subproject focuses on the differential analysis of multi-omics data over time with respect to a specified target variable (e.g., an event such as heart failure, HF). The approach is based on multi-view learning, where proteomics, metabolomics, and lipidomics each represent a view of the data. Other views include demographic variables and medical record data. Across the work steps, the views and the temporal information should be handled flexibly, without making too many assumptions in advance about the depth of integration or the relevance of specific time points or time intervals. For further information regarding the group, please visit the website.
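
As a rough illustration of what flexible handling of views and time points could look like, the sketch below contrasts two depths of integration: early fusion (concatenating features of selected views and time points) and late fusion (combining per-view scores). This is an assumed, simplified setup and not the subproject's method; the data layout, function names, and numbers are hypothetical.

```python
def early_fusion(sample, views, time_points):
    """Concatenate the chosen views at the chosen time points into one feature vector."""
    features = []
    for t in time_points:
        for view in views:
            features.extend(sample[view][t])
    return features


def late_fusion(view_scores, weights=None):
    """Combine per-view scores by a weighted average (equal weights by default)."""
    if weights is None:
        weights = {view: 1.0 for view in view_scores}
    total = sum(weights[view] for view in view_scores)
    return sum(weights[view] * score for view, score in view_scores.items()) / total


# Hypothetical toy sample: view -> time point -> measurements.
sample = {
    "proteomics": {0: [1.0, 2.0], 1: [1.5, 2.5]},
    "metabolomics": {0: [0.3], 1: [0.4]},
    "lipidomics": {0: [7.0, 8.0], 1: [7.5, 8.5]},
}

# Early integration of two views at the second time point only.
x = early_fusion(sample, views=["proteomics", "lipidomics"], time_points=[1])

# Late integration of scores that per-view models might have produced.
risk = late_fusion({"proteomics": 0.25, "lipidomics": 0.75})
```

Because views and time points are passed as parameters, neither the set of views nor the relevant time window has to be fixed in advance.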

Knowledge Mining

Prof. Dr. Miguel Andrade

Prof. Dr. Miguel Andrade heads the Computational Biology and Data Mining Group at the Faculty of Biology of the Johannes Gutenberg-University in Mainz. His group studies gene function by means of computational techniques including algorithms and databases to better understand underlying disease mechanisms and to develop novel treatment modalities.

In recent years it has become evident that lipids are altered under pathophysiological conditions and could thus be used as biomarkers. However, only very few public resources are available. Within the DIASyM consortium, Miguel Andrade and Laura Bindila collaborate closely to develop a resource that identifies disease associations of lipids by text and data mining. To achieve this goal, dictionaries of lipid names will be generated based on a literature search in PubMed Central. Using co-occurrence analysis in grammatically parsed text, significant patterns identifying relations of lipids to other metabolites, their physical interactions, and their associations with human disease will be deduced.

Curated lists of proteins and public resources of protein expression will be adapted to plasma and chronic cardiovascular diseases (CVD). Subsequently, genes associated with these phenotypes or mentioned significantly often in the CVD literature, as well as disease terms, will be identified and used to develop methods that study gene and protein expression specific to heart failure. This strategy will be used to develop a disease-module approach to the study of patient signatures, creating networks with multiple layers including lipids, metabolites and gene/protein expression. The associations obtained from deep learning methods and from other relevant systems-medicine analyses will be integrated into a web-based database. Users will be able to submit marker measurements and sample data and obtain predictions of pathological states. During the time span of DIASyM, data from several follow-up studies will become available and will be used to optimize a method that predicts outcomes based on time series analyses of networks of interacting molecules, taking disease-associated modules into account.
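
The dictionary-based co-occurrence step can be sketched as follows. This is a simplified illustration only: it counts co-mentions within a sentence using plain substring matching, whereas the approach described above works on grammatically parsed text and assesses statistical significance. The function name, term lists, and sentences are hypothetical placeholders, not real findings.

```python
from collections import Counter
from itertools import product


def cooccurrences(sentences, lipid_terms, disease_terms):
    """Count sentence-level co-mentions of each (lipid, disease) pair."""
    counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Naive substring matching; a real dictionary tagger would match
        # token boundaries and handle synonyms.
        lipids = [term for term in lipid_terms if term.lower() in text]
        diseases = [term for term in disease_terms if term.lower() in text]
        counts.update(product(lipids, diseases))
    return counts


# Placeholder sentences with invented term names.
sentences = [
    "LIPID_1 levels were altered in DISEASE_X patients.",
    "LIPID_2 was measured in plasma.",
    "Both LIPID_1 and LIPID_2 were associated with DISEASE_X.",
]
counts = cooccurrences(sentences, ["LIPID_1", "LIPID_2"], ["DISEASE_X"])
```

Pairs with high counts are candidates for an association; whether a count is significant would still have to be tested against how often each term occurs on its own.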