GENBoostGPU
GPU-accelerated elastic net boosting for large-scale methylation and SNP studies. GENBoostGPU orchestrates feature preprocessing, Optuna-powered hyperparameter search, and elastic net boosting on top of RAPIDS, CuPy, and Dask so you can model thousands of genomic windows in parallel without leaving Python.
Key features
Adaptive window orchestration – distribute genboostgpu.orchestration jobs across one or many GPUs with auto-tuned max_in_flight concurrency.
Automated SNP curation – zero-variance filtering, missing data imputation, and LD clumping in genboostgpu.snp_processing.
Elastic net boosting core – reproducible variance decomposition and ridge refits from genboostgpu.enet_boosting.
Flexible I/O – load PLINK data, CuPy arrays, or parquet outputs with genboostgpu.data_io.
Tuning toolbox – global and per-window hyperparameter utilities in genboostgpu.tuning, including cohort-wide Optuna refits.
Reproducibility guardrails – documented seeding, metadata capture, and structured logging patterns for consistent reruns.
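To make the SNP curation steps listed above concrete, here is a minimal NumPy sketch of mean imputation followed by zero-variance filtering. It is an illustration only and does not use the genboostgpu.snp_processing API: the function name curate_genotypes, the choice to impute before filtering, and the toy genotype matrix are all assumptions, and LD clumping is left out.

```python
import numpy as np


def curate_genotypes(genotypes):
    """Mean-impute missing calls, then drop zero-variance SNP columns.

    Hypothetical helper for illustration; not part of GENBoostGPU.
    Rows are samples, columns are SNPs, NaN marks a missing call.
    """
    g = np.asarray(genotypes, dtype=float)
    # Per-SNP mean computed over observed (non-NaN) calls only.
    col_means = np.nanmean(g, axis=0)
    # Replace each missing call with the mean of its own column.
    g = np.where(np.isnan(g), col_means, g)
    # Keep only SNPs that still vary across samples.
    keep = g.var(axis=0) > 0
    return g[:, keep], keep


# Toy data (assumed): SNP 0 is constant, SNP 1 has one missing call.
genotypes = np.array([
    [0.0, 1.0, 2.0],
    [0.0, np.nan, 1.0],
    [0.0, 2.0, 0.0],
])
curated, keep = curate_genotypes(genotypes)
print(keep)
print(curated)
```

Imputing first means a SNP whose observed calls are all identical is still removed after its gaps are filled. On a GPU the same array operations could in principle be expressed with CuPy, but how GENBoostGPU actually does this is not shown here.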
Supported platforms
GENBoostGPU targets Linux with NVIDIA GPUs (Ampere or newer) and CUDA 12.x.
Multi-GPU orchestration requires RAPIDS cuDF/cuML 25.8 and dask-cuda 25.8
or newer. Development and documentation builds can run on CPU-only machines if
you install the mock/documentation requirements.
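When a script needs to run on both GPU and CPU-only machines, one common pattern is to select the array module at runtime. The sketch below is a generic illustration, not GENBoostGPU's own mechanism: the helper name pick_array_module is hypothetical, and it assumes that falling back to NumPy is acceptable for the code that follows.

```python
import numpy as np


def pick_array_module():
    """Return CuPy when a CUDA device is usable, otherwise NumPy.

    Hypothetical helper for illustration; not part of GENBoostGPU.
    """
    try:
        import cupy as cp

        # getDeviceCount raises if the CUDA runtime or driver is unusable.
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        # Covers a missing CuPy install as well as CUDA runtime errors.
        pass
    return np


xp = pick_array_module()
x = xp.arange(6, dtype=xp.float32).reshape(2, 3)
total = float(x.sum())
print(xp.__name__, total)
```

The same array code then runs on either backend, which is convenient for tests and documentation builds on machines without a GPU.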
Get started
Quick start – minimal pipeline example with saved outputs.
Installation – environment setup for CPU docs versus GPU production.
User guide – deep dives on data formats, workflow, tuning, scaling, and reproducibility.
Tutorials – walkthroughs based on the scripts in examples/.
API Reference – autogenerated API reference.
Troubleshooting – common fixes for CUDA, RAPIDS, and Dask issues.
Contributing – guidelines for development, style, and tests.
Changelog – highlights from each release.
Citation
If you use GENBoostGPU in academic or industrial work, please cite:
Alexis Bennett and Kynon J.M. Benjamin. GENBoostGPU: GPU-accelerated elastic net boosting for large-scale epigenomics. DOI: 10.5281/zenodo.17238798.