Prof. Dr. Miguel Andrade heads the Computational Biology and Data Mining Group at the Faculty of Biology of the Johannes Gutenberg-University in Mainz. His group studies gene function by means of computational techniques including algorithms and databases to better understand underlying disease mechanisms and to develop novel treatment modalities.
In recent years it has become evident that lipids are altered under pathophysiological conditions and could thus be used as biomarkers. However, very few public resources are available. Within the DIASyM consortium, Miguel Andrade and Laura Bindila collaborate closely to develop a resource that identifies disease associations of lipids by text and data mining. To achieve this goal, dictionaries of lipid names will be generated from a literature search in PubMed Central. Using co-occurrence analysis in grammatically parsed text, significant patterns identifying relations of lipids to other metabolites, their physical interactions, and their associations with human disease will be deduced.
These curated lists of proteins and public resources of protein expression will be adapted to plasma and chronic cardiovascular diseases (CVD). Subsequently, genes associated with these phenotypes or mentioned with significant frequency in the CVD literature, as well as disease terms, will be identified and used to develop methods that study gene and protein expression specific to heart failure. This strategy will be used to develop a disease-module approach to the study of patient signatures, creating networks with multiple layers including lipids, metabolites and gene/protein expression.
The associations obtained from deep learning methods and from other relevant systems-medicine analyses will be integrated into a web-based database. Users will input marker data and sample data to be evaluated and will obtain predictions of pathological states. During the time span of DIASyM, data from several follow-up studies will become available, which will be used to optimize a method to predict outcomes based on time-series analyses of networks of interacting molecules, considering modules associated with disease.
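The co-occurrence step mentioned above can be illustrated with a minimal sketch. It is not the group's actual pipeline: the two dictionaries, the function names and the example sentences are hypothetical, plain substring matching per sentence stands in for grammatical parsing, and a one-sided hypergeometric test is assumed as the significance measure.

```python
from itertools import product
from math import comb

# Hypothetical toy dictionaries; real ones would be derived from PubMed Central.
LIPIDS = {"ceramide", "sphingomyelin"}
DISEASES = {"heart failure", "atherosclerosis"}


def mentions(sentence, terms):
    """Return the dictionary terms found in a sentence (case-insensitive substring match)."""
    text = sentence.lower()
    return {term for term in terms if term in text}


def cooccurrence_pvalues(sentences):
    """Count sentence-level lipid-disease co-occurrences and score each pair.

    Returns {(lipid, disease): (count, p_value)} for pairs seen together at
    least once. The p-value is P(X >= count) under a hypergeometric null,
    i.e. the chance of seeing that many shared sentences if the two terms
    were distributed independently.
    """
    n = len(sentences)
    lipid_hits = [mentions(s, LIPIDS) for s in sentences]
    disease_hits = [mentions(s, DISEASES) for s in sentences]
    results = {}
    for lipid, disease in product(sorted(LIPIDS), sorted(DISEASES)):
        a = sum(lipid in hits for hits in lipid_hits)      # sentences naming the lipid
        b = sum(disease in hits for hits in disease_hits)  # sentences naming the disease
        k = sum(lipid in l and disease in d
                for l, d in zip(lipid_hits, disease_hits))  # sentences naming both
        if k == 0:
            continue
        tail = sum(comb(a, i) * comb(n - a, b - i) for i in range(k, min(a, b) + 1))
        results[(lipid, disease)] = (k, tail / comb(n, b))
    return results


if __name__ == "__main__":
    corpus = [
        "Ceramide levels rise in heart failure.",
        "Ceramide predicts heart failure outcome.",
        "Sphingomyelin was measured in plasma.",
        "Atherosclerosis is common.",
    ]
    for pair, (count, p) in cooccurrence_pvalues(corpus).items():
        print(pair, count, round(p, 4))
```

With a corpus this small the p-values are not meaningful; the sketch only shows how dictionary hits become pair counts and how those counts are compared against a chance expectation.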