We took this version of HeCBench and are modifying it to build the CUDA and OMP codes to gather their roofline performance data. So far we have a large portion of the CUDA and OMP codes building ...
Abstract: Matrix-matrix multiplication is one of the most important kernels in linear algebra operations with a multitude of applications in scientific and engineering computing. Sparse matrix ...
Abstract: Sparse General Matrix-Matrix Multiplication (SpGEMM) is a core operation in high-performance computing applications such as algebraic multigrid solvers, machine learning, and graph ...
Sparse Autoencoders (SAEs) have recently gained attention as a means to improve the interpretability and steerability of Large Language Models (LLMs), both of which are essential for AI safety. In ...