Abstract: Sparse matrix-vector multiplication (SpMV) is a critical kernel in scientific computing and high-performance applications, yet it remains challenging to optimize due to irregular memory ...
Abstract: Sparse-Dense Matrix Multiplication (SpMM) on GPUs has gained significant attention because of its importance in modern applications and the increasing computing power of GPUs in the last ...
In industrial recommendation systems, Generative Retrieval (GR) is replacing traditional embedding-based nearest-neighbor search with Large Language Models (LLMs). These models ...