Abstract: General matrix-matrix multiplication (GEMM) is a cornerstone of AI computation, which has made tensor processing engines (TPEs) increasingly critical components within existing ...