HeAT -- a distributed and GPU-accelerated tensor framework for data analytics
To cope with the rapid growth in available data, the efficiency of data analysis and machine learning libraries has recently received increased attention. Although great advancements have been made in traditional array-based computations, most are limited by the resources available on a single computation node. Consequently, novel approaches must be developed to exploit distributed resources, e.g. distributed memory architectures. To this end, we introduce HeAT, an array-based numerical programming framework for large-scale parallel processing with an easy-to-use NumPy-like API. HeAT utilizes PyTorch as a node-local eager execution engine and distributes the workload on arbitrarily large high-performance computing systems via MPI. It provides both low-level array computations as well as assorted higher-level algorithms. With HeAT, a NumPy user can take full advantage of their available resources, significantly lowering the barrier to distributed data analysis. When compared to similar frameworks, HeAT achieves speedups of up to two orders of magnitude.
Helmholtz AI - FZJ (HAI - FZJ)