Publications

2023

A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction

Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of the architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be early, intermediate, or late. The literature on integration methods is growing steadily; however, close to nothing is known about the relative performance of these methods under fair experimental conditions and across different use cases.
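
The notion of integration depth can be illustrated with a toy sketch. The code below is not taken from the paper: the function names, the constant-weight "encoder", and the exact working definitions of early, intermediate, and late integration are assumptions chosen only to show where the merge of omics blocks happens in each case.

```python
# Toy illustration of integration depth (hypothetical names, not the paper's code).
# An "encoder" here is just a fixed linear map, so that only the position of the
# merge step differs between the three variants.

def linear(vec, weights):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def const_weights(rows, cols, value=0.5):
    """Stand-in for learned weights: a rows x cols matrix of one constant."""
    return [[value] * cols for _ in range(rows)]

def early_integration(omics, latent_dim=2):
    """Assumed definition: concatenate raw features, then encode once."""
    joined = [x for block in omics for x in block]
    return linear(joined, const_weights(latent_dim, len(joined)))

def late_integration(omics, latent_dim=2):
    """Assumed definition: encode each omics block separately, then
    concatenate the per-omics latent vectors."""
    latent = []
    for block in omics:
        latent.extend(linear(block, const_weights(latent_dim, len(block))))
    return latent

def intermediate_integration(omics, latent_dim=2):
    """Assumed definition: encode each block separately, merge the latents,
    and pass the merged vector through a further joint layer."""
    merged = late_integration(omics, latent_dim)
    return linear(merged, const_weights(latent_dim, len(merged)))
```

Calling the three functions on the same input, for example `[[1.0, 2.0], [3.0, 4.0, 5.0]]` for two omics blocks of different width, shows that early and intermediate integration yield one joint latent vector while late integration keeps one latent vector per block side by side.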

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05166-7
2022

Cardiovascular profiling in the diabetic continuum: results from the population-based Gutenberg Health Study

The study sample comprised 15,010 individuals aged 35-74 years from the population-based Gutenberg Health Study. Subjects were classified into euglycaemia, prediabetes and type 2 diabetes mellitus (T2DM) according to clinical and metabolic (HbA1c) information. The prevalence of prediabetes was 9.5% (n = 1415) and of T2DM 8.9% (n = 1316). Prediabetes and T2DM showed a significantly increased prevalence ratio (PR) for age, obesity, active smoking, dyslipidemia, and arterial hypertension compared to euglycaemia (for all, P < 0.0001). In a robust Poisson regression analysis, prediabetes was established as an independent predictor of clinically prevalent cardiovascular disease (PR for prediabetes 1.20, 95% CI 1.07-1.35, P = 0.002) and represented a risk factor for asymptomatic cardiovascular organ damage independent of traditional risk factors (PR 1.04, 95% CI 1.01-1.08, P = 0.025). Prediabetes was associated with a 1.5-fold increased 10-year risk for cardiovascular disease compared to euglycaemia. In Cox regression analysis, prediabetes (HR 2.10, 95% CI 1.76-2.51, P < 0.0001) and T2DM (HR 4.28, 95% CI 3.73-4.92, P < 0.0001) indicated an increased risk of death. After adjustment for age, sex and traditional cardiovascular risk factors, only T2DM (HR 1.89, 95% CI 1.63-2.20, P < 0.0001) remained independently associated with increased all-cause mortality.

https://link.springer.com/article/10.1007/s00392-021-01879-y
2022

Protective behavior and SARS-CoV-2 infection risk in the population - Results from the Gutenberg COVID-19 study

During the SARS-CoV-2 pandemic, preventive measures like physical distancing, wearing face masks, and hand hygiene have been widely applied to mitigate viral transmission. Beyond increasing vaccination coverage, preventive measures remain urgently needed. The aim of the present project was to assess the effect of protective behavior on SARS-CoV-2 infection risk in the population.

https://pubmed.ncbi.nlm.nih.gov/36316662/
2022

Subtype-specific plasma signatures of platelet-related protein releasate in acute pulmonary embolism

There is evidence that plasma protein profiles differ in the two subtypes of pulmonary embolism (PE), isolated PE (iPE) and deep vein thrombosis (DVT)-associated PE (DVT-PE), in the acute phase. The aim of this study was to determine specific plasma signatures for proteins related to platelets in acute iPE and DVT-PE compared to isolated DVT (iDVT).

https://pubmed.ncbi.nlm.nih.gov/36274391/
2022

Locality-sensitive hashing enables efficient and scalable signal classification in high-throughput mass spectrometry raw data

Mass spectrometry is an important experimental technique in the field of proteomics. However, analysis of certain mass spectrometry data faces a combination of two challenges: first, even a single experiment produces a large amount of multi-dimensional raw data and, second, signals of interest are not single peaks but patterns of peaks that span the different dimensions. The rapidly growing amount of mass spectrometry data increases the demand for scalable solutions. Furthermore, existing approaches for signal detection usually rely on strong assumptions concerning the signals' properties.
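
As a general illustration of locality-sensitive hashing (not the specific hashing scheme or pipeline used in the paper), the sketch below uses random hyperplanes: each vector is reduced to a short bit signature, and vectors pointing in similar directions tend to share a signature and therefore a bucket. All function names and parameters are hypothetical.

```python
import random
from collections import defaultdict

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def bucket_by_signature(vectors, planes):
    """Group vector indices by signature; only vectors that share a bucket
    need to be compared in detail afterwards."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, planes)].append(idx)
    return dict(buckets)
```

Because the signature depends only on the direction of a vector, a pattern and a positively scaled copy of it (for example the same peak pattern at a higher intensity) always fall into the same bucket. Each signature costs `n_bits` dot products, so the bucketing step grows linearly with the number of vectors.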

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04833-5
2021

LipiDisease: associate lipids to diseases using literature mining

Lipids play an essential role in cellular assembly and signaling. Dysregulation of these functions has been linked with many complications including obesity, diabetes, metabolic disorders, cancer and more. Investigating lipid profiles in such conditions can provide insights into cellular functions and possible interventions. Hence, the field of lipidomics has been expanding in recent years. Even though the role of individual lipids in diseases has been investigated, there is no resource to perform disease enrichment analysis considering the cumulative association of a lipid set. To address this, we have implemented the LipiDisease web server. The tool analyzes millions of records from the PubMed biomedical literature database discussing lipids and diseases, predicts their association and ranks them according to false discovery rates generated by random simulations. The tool takes into account 4270 diseases and 4798 lipids. Since the tool extracts the information from PubMed records, the number of diseases and lipids will be expanded over time as the biomedical literature grows.
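
The idea of ranking associations by a false discovery rate derived from random simulations can be sketched generically. The function below is an assumption about how such an estimate might look, not the LipiDisease implementation: `observed` maps each disease to an association score for the query lipid set, and each entry of `null_runs` holds the scores obtained in one simulation with a random lipid set. Both inputs and all names are hypothetical.

```python
def empirical_fdr(observed, null_runs):
    """Estimate, for each observed score s, the ratio
    (average number of null scores >= s per simulation run)
    / (number of observed scores >= s), capped at 1.0."""
    fdr = {}
    for name, score in observed.items():
        n_observed = sum(1 for v in observed.values() if v >= score)
        n_null = sum(
            sum(1 for v in run if v >= score) for run in null_runs
        ) / len(null_runs)
        fdr[name] = min(1.0, n_null / n_observed)
    return fdr

def rank_by_fdr(observed, null_runs):
    """Return names ordered from lowest to highest estimated FDR."""
    fdr = empirical_fdr(observed, null_runs)
    return sorted(fdr, key=fdr.get)
```

A score that no random simulation reaches receives an estimated FDR of 0 and is ranked first, while scores that random lipid sets reach routinely move towards 1.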

https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab559/6343440
2021

MaxDIA enables library-based and library-free data-independent acquisition proteomics

MaxDIA is a software platform for analyzing data-independent acquisition (DIA) proteomics data within the MaxQuant software environment. Using spectral libraries, MaxDIA achieves deep proteome coverage with substantially better coefficients of variation in protein quantification than other software. MaxDIA is equipped with accurate false discovery rate (FDR) estimates on both library-to-DIA match and protein levels, including when using whole-proteome predicted spectral libraries. This is the foundation of discovery DIA: hypothesis-free analysis of DIA samples without a library and with reliable FDR control. MaxDIA performs three- or four-dimensional feature detection of fragment data, and scoring of matches is augmented by machine learning on the features of an identification. MaxDIA's bootstrap DIA workflow performs multiple rounds of matching with increasing quality of recalibration and stringency of matching to the library. Combining MaxDIA with two new technologies, BoxCar acquisition and trapped ion mobility spectrometry, leads in both cases to deep and accurate proteome quantification.

https://www.nature.com/articles/s41587-021-00968-7
2021

OpenTIMS, TimsPy, and TimsR: Open and Easy Access to timsTOF Raw Data

The Bruker timsTOF Pro is an instrument that couples trapped ion mobility spectrometry (TIMS) to high-resolution time-of-flight (TOF) mass spectrometry (MS). For proteomics, lipidomics, and metabolomics applications, the instrument is typically interfaced with a liquid chromatography (LC) system. The resulting LC-TIMS-MS data sets are, in general, several gigabytes in size and are stored in the proprietary Bruker Tims data format (TDF). The raw data can be accessed using proprietary binaries in C, C++, and Python on Windows and Linux operating systems. Here we introduce a suite of computer programs for data accession, including OpenTIMS, TimsR, and TimsPy. OpenTIMS is a C++ library capable of reading Bruker TDF files. It opens up Bruker's proprietary codebase. TimsPy and TimsR build on top of OpenTIMS, enabling swift and user-friendly data access to the raw data with Python and R. Both programs are available under a GPL3 license on all major platforms, extending the possibility to interact with timsTOF data to macOS. Additionally, OpenTIMS is capable of translating Bruker data into HDF5 files that can be easily analyzed from Python with the vaex module. OpenTIMS and TimsPy therefore provide easy and quick access to Bruker timsTOF raw data.

https://pubs.acs.org/doi/10.1021/acs.jproteome.0c00962