The power of TOPMed imputation for the discovery of Latino-enriched rare variants associated with type 2 diabetes

Huerta-Chagoya, Alicia; Schroeder, Philip; Mandla, Ravi; Deutsch, Aaron J; Zhu, Wanying; Petty, Lauren; Yi, Xiaoyan; Cole, Joanne B; Udler, Miriam S; Dornbos, Peter; Porneala, Bianca; DiCorpo, Daniel; Liu, Ching-Ti; Li, Josephine H; Szczerbiński, Lukasz; Kaur, Varinderpal; Kim, Joohyun; Lu, Yingchang; Martin, Alicia; Eizirik, Decio L; Marchetti, Piero; Marselli, Lorella; Chen, Ling; Srinivasan, Shylaja; Todd, Jennifer; Flannick, Jason; Gubitosi-Klug, Rose; Levitsky, Lynne; Shah, Rachana; Kelsey, Megan; Burke, Brian; Dabelea, Dana M; Divers, Jasmin; Marcovina, Santica; Stalbow, Lauren; Loos, Ruth J F; Darst, Burcu F; Kooperberg, Charles; Raffield, Laura M; Haiman, Christopher; Sun, Quan; McCormick, Joseph B; Fisher-Hoch, Susan P; Ordoñez, Maria L; Meigs, James; Baier, Leslie J; González-Villalpando, Clicerio; González-Villalpando, Maria Elena; Orozco, Lorena; García-García, Lourdes; Moreno-Estrada, Andrés; Aguilar-Salinas, Carlos A; Tusié, Teresa; Dupuis, Josée; Ng, Maggie C Y; Manning, Alisa; Highland, Heather M; Cnop, Miriam; Hanson, Robert; Below, Jennifer; Florez, Jose C; Leong, Aaron; Mercader, Josep M

doi:10.1007/s00125-023-05912-9

Aims/hypothesis: The Latino population has been systematically underrepresented in large-scale genetic analyses, and previous studies have relied on the imputation of ungenotyped variants based on the 1000 Genomes (1000G) imputation panel, which results in suboptimal capture of low-frequency or Latino-enriched variants. The National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) programme released the largest multi-ancestry genotype reference panel, representing a unique opportunity to analyse rare genetic variation in the Latino population. We hypothesise that a more comprehensive analysis of low-frequency and rare variation using the TOPMed panel would improve our knowledge of the genetics of type 2 diabetes in the Latino population.
Methods: We evaluated the TOPMed imputation performance using genotyping array and whole-exome sequence data in six Latino cohorts. To evaluate the ability of TOPMed imputation to increase the number of identified loci, we performed a Latino type 2 diabetes genome-wide association study (GWAS) meta-analysis in 8150 individuals with type 2 diabetes and 10,735 control individuals and replicated the results in six additional cohorts, including whole-genome sequence data from the All of Us cohort.
Results: Compared with imputation with 1000G, the TOPMed panel improved the identification of rare and low-frequency variants. We identified 26 genome-wide significant signals, including a novel variant (minor allele frequency 1.7%; OR 1.37, p=3.4 × 10⁻⁹). A Latino-tailored polygenic score constructed from our data and GWAS data from East Asian and European populations improved the prediction accuracy in a Latino target dataset, explaining up to 7.6% of the type 2 diabetes risk variance.
Conclusions/interpretation: Our results demonstrate the utility of TOPMed imputation for identifying low-frequency variants in understudied populations, leading to the discovery of novel disease associations and the improvement of polygenic scores.
Data availability: Full summary statistics are available through the Common Metabolic Diseases Knowledge Portal (https://t2d.hugeamp.org/downloads.html) and through the GWAS catalog (https://www.ebi.ac.uk/gwas/, accession ID: GCST90255648). Polygenic score (PS) weights for each ancestry are available via the PGS catalog (https://www.pgscatalog.org, publication ID: PGP000445, score IDs: PGS003443, PGS003444 and PGS003445).
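
As an illustration of how published polygenic score weights are typically applied, the following is a minimal sketch that assumes the common additive model, in which a score is the weighted sum of effect-allele dosages. It is not the authors' pipeline. The toy scoring text is only loosely modelled on a PGS catalog scoring file; the column names used here (rsID, effect_allele, other_allele, effect_weight) should be checked against the downloaded files, and the variant IDs, weights, and helper names (load_weights, polygenic_score) are hypothetical.

```python
import csv

# Toy scoring text: '#'-prefixed header comments followed by tab-separated columns.
# Variant IDs and weights are invented for illustration; they are NOT taken from
# PGS003443, PGS003444 or PGS003445.
TOY_SCORING_FILE = (
    "#pgs_id=TOY_EXAMPLE\n"
    "rsID\teffect_allele\tother_allele\teffect_weight\n"
    "rs_toy1\tA\tG\t0.30\n"
    "rs_toy2\tT\tC\t-0.10\n"
    "rs_toy3\tG\tA\t0.05\n"
)


def load_weights(text):
    """Parse scoring text into {variant_id: (effect_allele, weight)}."""
    # Drop blank lines and '#' header comments before handing rows to the CSV reader.
    rows = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t")
    return {r["rsID"]: (r["effect_allele"], float(r["effect_weight"])) for r in reader}


def polygenic_score(dosages, weights):
    """Additive score: sum of weight * effect-allele dosage (0 to 2).

    Variants absent from `dosages` are skipped, which is an assumption of this
    sketch; real pipelines must also check allele alignment and strand.
    """
    total = 0.0
    for variant_id, (_effect_allele, weight) in weights.items():
        if variant_id in dosages:
            total += weight * dosages[variant_id]
    return total


weights = load_weights(TOY_SCORING_FILE)
# Hypothetical individual: dosages of the effect allele, as produced by imputation.
person = {"rs_toy1": 2, "rs_toy2": 1, "rs_toy3": 0.0}
score = polygenic_score(person, weights)
print(round(score, 3))
```

Imputed dosages can be fractional, so the sketch multiplies weights by dosage directly rather than rounding to hard genotype calls.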