Extended Data Fig. 1: An overview of UK Biobank spirograms used in this study.

Our initial dataset consists of all individuals of European genetically inferred ancestry (GIA) in UK Biobank (n = 435,766). Individuals with valid spirograms formed the ML modeling set (n = 325,027). The remaining European GIA individuals, who had invalid spirograms and were used in neither the ML modeling nor the GWASs, formed the PRS holdout set (n = 110,739). We split the ML modeling set into training (80%) and validation (20%) sets. We used all individuals in the modeling set for GWAS analysis and generated (R)SPINCs for individuals of all ancestries with valid spirometry.
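
As a minimal sketch of the partitioning described in this legend, the cohort sizes and the 80/20 split can be checked as follows. The counts are taken from the legend. The function name `split_modeling_set`, the random shuffle, the fixed seed, and the use of integer placeholders for participant identifiers are illustrative assumptions, not the procedure actually used in the study.

```python
import random

# Counts stated in the legend.
N_EUR_GIA = 435_766   # all European GIA individuals in UK Biobank
N_MODELING = 325_027  # valid spirograms: ML modeling set (also used for GWAS)

# Individuals with invalid spirograms form the PRS holdout set.
N_HOLDOUT = N_EUR_GIA - N_MODELING


def split_modeling_set(ids, train_frac=0.8, seed=0):
    """Hypothetical helper: shuffle IDs and split into training/validation.

    The shuffle and the seed are assumptions; the legend only gives the
    80%/20% proportions.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder integer IDs stand in for real participant identifiers.
train_ids, val_ids = split_modeling_set(range(N_MODELING))

print(f"PRS holdout: {N_HOLDOUT:,}")
print(f"training: {len(train_ids):,}, validation: {len(val_ids):,}")
```

The three groups (training, validation, PRS holdout) are disjoint by construction, so the holdout individuals contribute to neither model fitting nor the GWASs.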