Factorizing complex discrete data “with Finesse”.
In: (IEEE International Conference on Data Mining (ICDM), 13 December 2016, Barcelona). SPIE, 2016. 1-6 (Conf. Proc. IEE)
Can we mine latent patterns from discrete, non-numerical, heterogeneous data? Many modern data sets contain heterogeneous non-numerical information measured over Boolean, ordinal and ternary scales. Values for features like these are “mixable” in the sense that they have intuitive non-linear analogs to classical “addition” (e.g. logical OR for Boolean data). We present a novel, general and extensible matrix factorization framework for any such “mixable” features. The framework lets us support heterogeneous data and encourages us to deduce other interesting “mixable” features, like those which encapsulate sub-trees over an ontology. We present FINESSE, an algorithm with linear run-time complexity in the size of the data. FINESSE outperforms state-of-the-art techniques on the special cases in both effectiveness and efficiency, and yields insightful patterns from its novel application to large real-world heterogeneous data.
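To make the idea of a “mixable” feature concrete, the following minimal sketch shows the Boolean special case only, in which logical OR replaces ordinary addition inside the matrix product. It is not the FINESSE algorithm; the function names `boolean_product` and `reconstruction_error` and the small example matrices are hypothetical and chosen purely for illustration.

```python
import numpy as np


def boolean_product(U, V):
    """Boolean matrix product: entry (i, j) is OR over k of (U[i, k] AND V[k, j])."""
    U = np.asarray(U, dtype=bool)
    V = np.asarray(V, dtype=bool)
    # The integer product counts the k for which both factors are 1;
    # thresholding at > 0 turns that sum into a logical OR.
    return (U.astype(int) @ V.astype(int)) > 0


def reconstruction_error(X, U, V):
    """Number of cells where the Boolean product of U and V disagrees with X."""
    return int(np.sum(np.asarray(X, dtype=bool) != boolean_product(U, V)))


# Hypothetical factors: 3 rows, 2 latent patterns, 3 columns.
U = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=bool)
V = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=bool)

X = boolean_product(U, V)
print(X.astype(int))
print(reconstruction_error(X, U, V))
```

In the middle row both patterns are active, so the ordinary product would give 2 in the middle column, while the OR-based “mixing” keeps the value at 1. Other scales (ordinal, ternary) would need their own mixing operators, which this sketch does not attempt to define.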
Publication type Article: Conference contribution
Conference title IEEE International Conference on Data Mining (ICDM)
Conference date 13 December 2016
Journal Conference Proceedings IEE
Source details Pages: 1-6
Institute(s) Institute of Computational Biology (ICB)