PuSH - Publication Server of Helmholtz Zentrum München

Maurus, S. ; Plant, C.*

Factorizing complex discrete data “with Finesse”.

In: (IEEE International Conference on Data Mining (ICDM), 13 December 2016, Barcelona). SPIE, 2016. 1-6 (Conf. Proc. IEE)
Can we mine latent patterns from discrete, nonnumeric heterogeneous data? Many modern data sets contain heterogeneous non-numerical information measured over Boolean, ordinal and ternary scales. Values for features like these are “mixable” in the sense that they have intuitive non-linear analogs to classical “addition” (e.g. logical OR for Boolean data). We present a novel, general and extensible matrix factorization framework for any such “mixable” features. The framework lets us support heterogeneous data and encourages us to deduce other interesting “mixable” features, like those which encapsulate sub-trees over an ontology. We present FINESSE, an algorithm with linear run-time complexity in the size of the data. FINESSE outperforms state-of-the-art techniques in the special cases in terms of effectiveness and efficiency, and yields insightful patterns from its novel application to large real-world heterogeneous data.
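As an illustration of the "mixable" idea described in the abstract, the following is a minimal sketch of a matrix product in which classical addition is replaced by a feature-specific mixing operator, such as logical OR for Boolean data. It is not the FINESSE algorithm itself. The function name `mixed_product`, the `usage` and `basis` matrices, and the use of `max` as a mixing operator for ordinal values are all assumptions made for this example and do not come from the paper.

```python
from functools import reduce


def mixed_product(usage, basis, mix, zero):
    """Reconstruct data rows by mixing the basis rows each usage row selects.

    `mix` plays the role of '+' in an ordinary matrix product, and `zero`
    is its neutral element (hypothetical names, for illustration only).
    """
    n_features = len(basis[0])
    result = []
    for row in usage:
        # Basis rows (patterns) that this data row uses.
        selected = [basis[k] for k, used in enumerate(row) if used]
        # Mix the selected patterns feature by feature.
        combined = [
            reduce(mix, (pattern[j] for pattern in selected), zero)
            for j in range(n_features)
        ]
        result.append(combined)
    return result


def boolean_or(a, b):
    """Logical OR, the Boolean analog of addition named in the abstract."""
    return a | b


# Two Boolean patterns over three features.
basis = [[1, 1, 0],
         [0, 1, 1]]
# Three data rows: the first pattern, the second pattern, and both.
usage = [[1, 0],
         [0, 1],
         [1, 1]]

reconstruction = mixed_product(usage, basis, boolean_or, 0)

# Assumption: for ordinal features, taking the maximum is one plausible mix.
ordinal_reconstruction = mixed_product([[1, 1]], [[2, 0], [1, 3]], max, 0)
```

In the third data row both patterns are active, so the middle feature stays at 1 under OR, whereas ordinary addition would give 2, a value outside the Boolean scale. Passing the mixing operator as a parameter is one way to support several feature types with the same reconstruction routine.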
Publication type Article: Conference contribution
e-ISSN 2374-8486
Conference Title IEEE International Conference on Data Mining (ICDM)
Conference Date 13 December 2016
Conference Location Barcelona
Source details Pages: 1-6
Publisher SPIE