Quick start
The snippet below shows the fastest way to run a single-window analysis
with genboostgpu.orchestration.run_windows_with_dask(). It generates toy
genotype and phenotype data as CuPy arrays, seeds the NumPy and CuPy RNGs
with 42, and saves results to results/.
Note
Ensure you have followed the Installation guide and have a CUDA 12 GPU
visible to the process (CUDA_VISIBLE_DEVICES).
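As a pre-flight check for the note above, the sketch below reports which devices CUDA_VISIBLE_DEVICES exposes to the process. The helper name visible_gpu_ids is hypothetical and not part of genboostgpu. It only parses the environment variable, so it does not confirm that a CUDA 12 driver or GPU is actually present.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (CUDA then exposes every GPU)
    and an empty list when it is set but empty (no GPU is visible).
    Hypothetical helper for illustration; not part of genboostgpu.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Entries may be integer indices or GPU UUIDs, so keep them as strings.
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


ids = visible_gpu_ids()
if ids is None:
    print("CUDA_VISIBLE_DEVICES is unset: all GPUs are visible")
elif not ids:
    print("CUDA_VISIBLE_DEVICES is empty: no GPU is visible")
else:
    print(f"Visible GPU IDs: {ids}")
```

With CuPy installed, cp.cuda.runtime.getDeviceCount() additionally reports how many devices the CUDA runtime can see.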
import cupy as cp
import numpy as np
import pandas as pd

from genboostgpu.orchestration import run_windows_with_dask

# Seed the NumPy and CuPy RNGs for reproducibility.
np.random.seed(42)
cp.random.seed(42)

# Toy genotype matrix (samples x SNPs) on the GPU.
n_samples, n_snps = 256, 512
geno = cp.asarray(np.random.normal(size=(n_samples, n_snps)), dtype=cp.float32)

# SNP map: chromosome, SNP ID, and base-pair position.
bim = pd.DataFrame({
    "chrom": ["21"] * n_snps,
    "snp": [f"rs{i}" for i in range(n_snps)],
    "pos": np.arange(n_snps) * 100 + 150_000,
})

# Toy phenotype vector, one value per sample.
pheno = cp.asarray(np.random.normal(size=n_samples), dtype=cp.float32)

# A single window to analyse.
windows = [{
    "chrom": 21,
    "start": 150_000,
    "end": 150_000,
    "pheno": pheno,
    "pheno_id": "trait_1",
}]

results = run_windows_with_dask(
    windows,
    geno_arr=geno,
    bim=bim,
    outdir="results",
    window_size=200_000,
    n_iter=30,
    n_trials=5,
    batch_size=512,
    prefix="quickstart",
)

# Save the returned summary and preview it.
results.to_csv("results/trait_1_summary.csv", index=False)
print(results.head())
The call triggers genboostgpu.vmr_runner under the hood, which filters SNPs
in the cis-window, performs boosting iterations, and writes parquet/TSV files to
results/.
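To make the cis-window filtering step concrete, here is a minimal sketch in plain Python that runs without a GPU. It is not the library's implementation: the function name cis_window_indices is hypothetical, and it assumes that the region is padded by window_size on each side of start and end and that chromosome labels are compared as strings. The actual rules in genboostgpu.vmr_runner may differ.

```python
def cis_window_indices(bim_rows, chrom, start, end, window_size):
    """Return indices of SNPs on `chrom` that fall inside the padded region.

    Assumptions (not confirmed by the library): the region is padded by
    `window_size` on each side, bounds are inclusive, and chromosome
    labels are compared as strings.
    """
    lo = max(0, start - window_size)
    hi = end + window_size
    return [
        i for i, row in enumerate(bim_rows)
        if str(row["chrom"]) == str(chrom) and lo <= row["pos"] <= hi
    ]


# Toy SNP map mirroring the quick start layout: 512 SNPs spaced 100 bp apart.
bim_rows = [
    {"chrom": "21", "snp": f"rs{i}", "pos": 150_000 + i * 100}
    for i in range(512)
]
# One extra SNP on another chromosome, to show that it is excluded.
bim_rows.append({"chrom": "22", "snp": "rs_other", "pos": 150_000})

keep = cis_window_indices(
    bim_rows, chrom=21, start=150_000, end=150_000, window_size=200_000
)
print(f"{len(keep)} of {len(bim_rows)} SNPs fall in the cis-window")
```

Under these assumptions every chromosome 21 SNP in the toy data is kept, because the positions span only about 51 kb while the padded region is far wider.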
More to explore
Inspect the saved parquet at results/quickstart.summary_windows.parquet for window-level metrics.
Dive into Workflow to understand how each module contributes.
Try the richer Simulation tutorial and VMR caudate tutorial walkthroughs that reuse the scripts shipped in examples/.
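For the first item, a minimal sketch of loading the window-level summary with pandas.read_parquet, which needs a parquet engine such as pyarrow. The helper name load_window_summary is hypothetical, and the "<prefix>.summary_windows.parquet" naming is inferred from the path above rather than from documented behaviour. The columns are not listed here, so the sketch prints whatever it finds.

```python
from pathlib import Path


def load_window_summary(outdir="results", prefix="quickstart"):
    """Load the window-level summary if it exists, else return None.

    Hypothetical helper; the file name pattern is an assumption based on
    the path shown in this quick start.
    """
    path = Path(outdir) / f"{prefix}.summary_windows.parquet"
    if not path.exists():
        return None
    # Imported lazily so the existence check works without pandas installed.
    import pandas as pd
    return pd.read_parquet(path)


summary = load_window_summary()
if summary is None:
    print("No summary found; run the quick start first")
else:
    print(summary.columns.tolist())
    print(summary.head())
```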