To identify the parameters most strongly associated with cerebral white matter hyperintensities (WMH) while accounting for potential collinearity, we used a data-driven machine-learning approach. We analysed two independent cohorts (KORA and SHIP). WMH volumes were derived from cMRI images (FLAIR). We collected 90 (KORA) and 34 (SHIP) potential determinants of WMH, including measures of diabetes, blood pressure, medication intake, sociodemographics, lifestyle factors, somatic and depressive symptoms, and sleep. Elastic net regression was used to identify predictors associated with WMH volume. The ten most frequently selected variables in KORA were subsequently examined for robustness in SHIP. The final KORA sample consisted of 370 participants (58% male; age 55.7 ± 9.1 years); the SHIP sample comprised 854 participants (38% male; age 53.9 ± 9.3 years). The most frequently selected and highly replicable parameters associated with WMH volume were, in descending order, age, hypertension, components of the social environment (i.e. widowed, living alone) and prediabetes. This systematic machine-learning-based analysis of two independent, population-based cohorts showed that, besides age and hypertension, prediabetes and components of the social environment might play important roles in the development of WMH. Our results enable personal risk assessment for the development of WMH and inform prevention strategies tailored to the individual patient.
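
The selection step summarised above (elastic net regression, followed by ranking variables by how often they are selected) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the data are synthetic, the penalty settings `lam` and `alpha` are arbitrary, bootstrap resampling is used as one plausible way to obtain selection frequencies, and the function names (`elastic_net`, `selection_frequency`, `make_demo_data`) are hypothetical. The solver is a plain coordinate-descent implementation written so that the example has no dependencies; in practice an established library would normally be used.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def standardize_columns(X):
    """Return the columns of X as z-scores (population SD), one list per column."""
    n = len(X)
    cols = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / sd for v in col])
    return cols


def elastic_net(X, y, lam=0.6, alpha=0.5, n_iter=60):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2)
    on standardized predictors and a centred outcome."""
    n = len(y)
    cols = standardize_columns(X)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            # Covariance of x_j with the partial residual; for a non-constant
            # standardized column, (1/n) * x_j . x_j equals 1.
            rho = sum(a * r for a, r in zip(xj, resid)) / n + beta[j]
            new = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * a for a, r in zip(xj, resid)]
                beta[j] = new
    return beta


def selection_frequency(X, y, n_boot=20, lam=0.6, alpha=0.5, seed=0):
    """Fraction of bootstrap resamples in which each predictor gets a
    non-zero coefficient (assumed resampling scheme, for illustration)."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(X[0])
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        beta = elastic_net([X[i] for i in idx], [y[i] for i in idx], lam, alpha)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


def make_demo_data(n=150, seed=1):
    """Synthetic data: x0 and x1 are nearly collinear, x2 is an independent
    signal, x3..x7 are pure noise. Not related to KORA or SHIP."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        x1 = x0 + 0.3 * rng.gauss(0, 1)
        rest = [rng.gauss(0, 1) for _ in range(6)]
        X.append([x0, x1] + rest)
        y.append(2.0 * x0 + 1.0 * rest[0] + rng.gauss(0, 1))
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    freq = selection_frequency(X, y)
    # Rank predictors by how often they were selected.
    for j in sorted(range(len(freq)), key=lambda k: -freq[k]):
        print(f"x{j}: selected in {freq[j]:.0%} of resamples")
```

The mixed penalty is what makes this approach suitable when predictors are correlated: the L1 term sets weak coefficients exactly to zero, while the L2 term tends to keep correlated predictors together rather than arbitrarily dropping one of them.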