Penalised regression with multiple sources of prior effects
Armin Rauschenberger, Zied Landoulsi, Mark A. van de Wiel, and Enrico Glaab
Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg.
Department of Epidemiology and Data Science (EDS), Amsterdam University Medical Centers (Amsterdam UMC), Amsterdam, The Netherlands.
To whom correspondence should be addressed.
Mark A. van de Wiel and Enrico Glaab share senior authorship.
Abstract
In many high-dimensional prediction or classification tasks, complementary data on the features are available, e.g. prior biological knowledge on (epi)genetic markers. Here we consider tasks with numerical prior information that provides insight into the importance (weight) and the direction (sign) of the feature effects, e.g. regression coefficients from previous studies. We propose an approach for integrating multiple sources of such prior information into penalised regression. If suitable co-data are available, this improves the predictive performance, as shown by simulation and application. The proposed method is implemented in the R package ‘transreg’ (https://github.com/lcsb-bds/transreg, https://cran.r-project.org/package=transreg).
Full text (open access)
Rauschenberger et al. (2023). “Penalized regression with multiple sources of prior effects”. Bioinformatics 39(12):btad680. doi: 10.1093/bioinformatics/btad680.