PuSH - Publication Server of Helmholtz Zentrum München

Maurus, S. ; Plant, C.*

Factorizing complex discrete data “with Finesse”.

In: (IEEE International Conference on Data Mining (ICDM), 13 December 2016, Barcelona). SPIE, 2016. 1-6 (Conf. Proc. IEE)
Can we mine latent patterns from discrete, nonnumeric heterogeneous data? Many modern data sets contain heterogeneous non-numerical information measured over Boolean, ordinal and ternary scales. Values for features like these are “mixable” in the sense that they have intuitive non-linear analogs to classical “addition” (e.g. logical OR for Boolean data). We present a novel, general and extensible matrix factorization framework for any such “mixable” features. The framework lets us support heterogeneous data and encourages us to deduce other interesting “mixable” features, like those which encapsulate sub-trees over an ontology. We present FINESSE, an algorithm with linear run-time complexity in the size of the data. FINESSE outperforms state-of-the-art techniques in the special cases in terms of effectiveness and efficiency, and yields insightful patterns from its novel application to large real-world heterogeneous data.
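The notion of a "mixable" feature can be illustrated with the Boolean case named in the abstract, where logical OR takes the place of addition. The following is only a minimal sketch of that idea, not the FINESSE algorithm: the function name `mixed_reconstruction`, its parameters and the toy matrices are hypothetical and chosen purely for illustration.

```python
from functools import reduce
from operator import or_


def mixed_reconstruction(usage, patterns, mix=or_, empty=False):
    """Rebuild a data matrix by mixing the patterns that each row uses.

    usage[i][k] is True when row i uses pattern k; patterns[k] is one
    feature vector. `mix` plays the role that '+' plays in ordinary
    matrix factorization, and `empty` is the value produced when a row
    uses no pattern at all. (Hypothetical helper, not from the paper.)
    """
    n_features = len(patterns[0])
    rows = []
    for row_usage in usage:
        # Keep only the patterns this row is marked as using.
        selected = [p for used, p in zip(row_usage, patterns) if used]
        # Mix the selected patterns feature by feature.
        row = [reduce(mix, (p[j] for p in selected), empty)
               for j in range(n_features)]
        rows.append(row)
    return rows


# Toy example: three rows, two latent patterns, three Boolean features.
usage = [[True, False], [False, True], [True, True]]
patterns = [[True, True, False], [False, True, True]]
data = mixed_reconstruction(usage, patterns)
```

The third row uses both patterns, which overlap on the middle feature. Ordinary addition would yield 2 there, a value outside the Boolean scale, whereas mixing with OR keeps the entry a valid Boolean. Passing a different `mix` function is how this sketch would be adapted to other scales, under the assumption that a suitable mixing operator exists for them.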
Publication type Article: Conference contribution
e-ISSN 2374-8486
Conference title IEEE International Conference on Data Mining (ICDM)
Conference date 13 December 2016
Conference location Barcelona
Source details Pages: 1-6
Publisher SPIE