PuSH - Publication Server of Helmholtz Zentrum München

Mining massive vector data on single instruction multiple data microarchitectures.

In: (IEEE 15th International Conference on Data Mining Workshops, 14-17 November 2015). 2015.
as soon as is submitted to ZB.
Current microarchitectures are equipped with SIMD instruction sets enabling massive data parallelism within each core. Instruction sets like AVX or SSE operate on large reserved registers and support a wide range of parallel arithmetic or logical operations enabling up to 16 double precision floating point operations per clock cycle. Current data mining applications are usually far from fully exploiting the ubiquitous computing power of SIMD instruction sets. Auto-vectorization as performed by most compilers can only be effective if the logical flow of the algorithm supports data parallelism. We consider the wide-spread algorithm K-means as a simple but highly relevant use case for mining vector data. The algorithm SIMD-K-means is a completely re-engineered version of K-means for current microarchitectures with two key ingredients enabling effective vectorization: (1) data-sensitive strategies in order to minimize the information to be transferred to the registers and (2) jointly coding the minimum distance and the best cluster representative which enables a K-means algorithm without any branching. Combined with efficient collection of sufficient statistics for the update of the cluster centers and space-and time-efficient calculation of Euclidean distances by the scalar product, these two ideas provide a substantial speed-up factor of about 5 over standard K-means. Intra-core SIMD parallelism can be combined with approaches focusing on multi-core and distributed environments. Moreover, our general ideas are not limited to K-means but support a wide range of data mining tasks on vector data including classification and outlier detection.
Additional Metrics?
Edit extra informations Login
Publication type Article: Conference contribution
Keywords Registers, Data mining, Instruction sets, Parallel processing, Clocks, Algorithm design and analysis, Clustering algorithms
Reviewing status