Green Open Access is possible once the postprint has been submitted to the ZB.
An evaluation of experimental design in QSAR modelling utilizing the k-medoid clustering.
J. Chemometr. 26, 509-517 (2012)
A reliable selection of a representative subset of chemical compounds has been reported to be crucial for numerous tasks in computational chemistry and chemoinformatics. We investigated the usability of an approach on the basis of the k-medoid algorithm for this task and in particular for experimental design and the split between training and validation set. We therefore compared the performance of models derived from such a selection to that of models derived using several other approaches, such as space-filling design and D-optimal design. We validated the performance on four datasets with different endpoints, representing toxicity, physicochemical properties and others. Compared with the models derived from the compounds selected by the other examined approaches, those derived with the k-medoid selection show a high reliability for experimental design, as their performance was constantly among the best for all examined datasets. Of all the models derived with all examined approaches, those derived with the k-medoid approach were the only ones that showed a significantly improved performance compared with a random selection, for all datasets, the whole examined range of selected compounds and for each dimensionality of the search space.
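The selection step described in the abstract can be illustrated with a minimal sketch. The record does not state which k-medoid variant, distance measure or descriptors the authors used, so everything here is an assumption: a simple alternating k-medoid scheme, Euclidean distance, and the hypothetical helper names `k_medoid_selection` and `split_by_medoids`. The k medoids serve as the selected (training) compounds and all remaining compounds form the validation set.

```python
import numpy as np


def k_medoid_selection(X, k, n_iter=100, seed=0):
    """Return the row indices of k medoids of X (alternating k-medoids).

    Illustrative only: the variant and the Euclidean distance are assumptions,
    not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between all compounds (descriptor vectors).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every compound to its nearest current medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # only possible with duplicate rows; keep old medoid
            # The new medoid minimizes the summed distance within its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # converged: no medoid changed
        medoids = new_medoids
    return np.sort(medoids)


def split_by_medoids(X, k, seed=0):
    """Use the k medoids as training set, all other rows as validation set."""
    train_idx = k_medoid_selection(X, k, seed=seed)
    valid_idx = np.setdiff1d(np.arange(len(X)), train_idx)
    return train_idx, valid_idx
```

Calling `split_by_medoids(X, k)` on a descriptor matrix `X` with one row per compound returns two disjoint index arrays. Because each medoid is an actual compound rather than a computed centroid, the selection always corresponds to real, measurable substances.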
Publication type Article: Journal article
Document type Scientific article
Keywords design of experiments; drug design; REACH; representative selection
ISSN (print) / ISBN 0886-9383
Journal Journal of Chemometrics
Source details Volume: 26, Issue: 10, Pages: 509-517
Review status Peer reviewed
Institute(s) Institute of Structural Biology (STB)