This talk presents cuTENSOR, a high-performance CUDA library for tensor operations. High-dimensional arrays (i.e., tensors) are ubiquitous in today's HPC and DL workloads, and cuTENSOR is designed to handle them efficiently. The library supports tensor contractions, tensor reductions, and element-wise tensor operations such as tensor permutations. Alongside high performance, cuTENSOR lets users express their tensor equations in a straightforward way, hiding the complexity of these high-dimensional objects behind an easy-to-use API.
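To make the three operation classes named in the abstract concrete, here is a minimal sketch written with NumPy's `einsum` on the CPU. It is only a stand-in for the index notation and does not use or reproduce the cuTENSOR API; the tensor names, mode labels (`m`, `k`, `n`, `p`), and shapes are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative tensors; shapes and mode labels are assumptions, not from the talk.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))  # modes (m, k, n)
B = rng.standard_normal((5, 7))     # modes (k, p)

# Tensor contraction: C[m,n,p] = sum_k A[m,k,n] * B[k,p]
C = np.einsum("mkn,kp->mnp", A, B)

# Tensor permutation (an element-wise operation): P[n,m,k] = A[m,k,n]
P = np.einsum("mkn->nmk", A)

# Tensor reduction: R[m] = sum over k and n of A[m,k,n]
R = np.einsum("mkn->m", A)

print(C.shape, P.shape, R.shape)
```

Each operation is specified entirely by which mode labels appear on the inputs and on the output: a label missing from the output is summed over, and reordering labels permutes the data. This is the sense in which an equation-like interface can hide the bookkeeping of strides and loop nests.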
Paul Springer (NVIDIA)