Tensaurus: A versatile accelerator for mixed sparse-dense tensor computations

N Srivastava, H Jin, S Smith, H Rong… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Tensor factorizations are powerful tools in many machine learning and data analytics
applications. Tensors are often sparse, which makes sparse tensor factorizations memory …

TuckerMPI: A parallel C++/MPI software package for large-scale data compression via the Tucker tensor decomposition

G Ballard, A Klinvex, TG Kolda - ACM Transactions on Mathematical …, 2020 - dl.acm.org
Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte
output of a high-fidelity computational simulation. For such data sets, we have developed a …

Learning nonnegative factors from tensor data: Probabilistic modeling and inference algorithm

L Cheng, X Tong, S Wang, YC Wu… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Tensor canonical polyadic decomposition (CPD) with nonnegative factor matrices, which
extracts useful latent information from multidimensional data, has found wide-spread …

Toward decoding the relationship between domain structure and functionality in ferroelectrics via hidden latent variables

SV Kalinin, K Kelley, RK Vasudevan… - ACS Applied Materials …, 2021 - ACS Publications
Polarization switching mechanisms in ferroelectric materials are fundamentally linked to
local domain structure and the presence of the structural defects, which both can act as …

Comparison of accuracy and scalability of Gauss-Newton and alternating least squares for CANDECOMP/PARAFAC decomposition

N Singh, L Ma, H Yang, E Solomonik - SIAM Journal on Scientific Computing, 2021 - SIAM
Alternating least squares is the most widely used algorithm for CANDECOMP/PARAFAC
(CP) tensor decomposition. However, alternating least squares may exhibit slow or no …

Accelerating alternating least squares for tensor decomposition by pairwise perturbation

L Ma, E Solomonik - Numerical Linear Algebra with …, 2022 - Wiley Online Library
The alternating least squares (ALS) algorithm for CP and Tucker decomposition is
dominated in cost by the tensor contractions necessary to set up the quadratic optimization …

Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree

L Ma, E Solomonik - 2021 IEEE International Parallel and …, 2021 - ieeexplore.ieee.org
The widely used alternating least squares (ALS) algorithm for the canonical polyadic (CP)
tensor decomposition is dominated in cost by the matricized-tensor times Khatri-Rao product …

PLANC: Parallel low-rank approximation with nonnegativity constraints

S Eswar, K Hayashi, G Ballard, R Kannan… - ACM Transactions on …, 2021 - dl.acm.org
We consider the problem of low-rank approximation of massive dense nonnegative tensor
data, for example, to discover latent patterns in video and imaging applications. As the size …

Alternating Mahalanobis Distance Minimization for Accurate and Well-Conditioned CP Decomposition

N Singh, E Solomonik - SIAM Journal on Scientific Computing, 2023 - SIAM
Canonical polyadic decomposition (CPD) is prevalent in chemometrics, signal processing,
data mining, and many more fields. While many algorithms have been proposed to compute …

General memory-independent lower bound for MTTKRP

G Ballard, K Rouse - Proceedings of the 2020 SIAM Conference on Parallel …, 2020 - SIAM
Our goal is to establish lower bounds on the communication required to perform the
Matricized-Tensor Times Khatri-Rao Product (MTTKRP) computation on a distributed …