When a matrix is structured (e.g. a tridiagonal matrix with the same stencil in every row), using a convolution instead of a direct matrix-vector multiplication remarkably improves performance. But if the matrix is not structured, is there any algorithm (one that does not involve converting the sparse matrix to a format such as CSR or COO) for matrix-vector multiplication that gives better performance on a GPU?
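
To make the structured case concrete, here is a minimal sketch in plain Python (CPU only, no GPU code) of what I mean by replacing the matrix-vector product with a convolution. The function names and the zero-padding at the boundaries are my own assumptions for illustration; the point is only that the stencil version never stores the matrix and gives the same result as the dense product.

```python
def tridiagonal_matrix(n, lower, diag, upper):
    """Build a dense n x n tridiagonal matrix with a constant stencil.

    Hypothetical helper, used only to check the stencil version below.
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            A[i][i - 1] = lower
        A[i][i] = diag
        if i < n - 1:
            A[i][i + 1] = upper
    return A


def dense_matvec(A, x):
    """Direct matrix-vector product: O(n^2) work and O(n^2) storage."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def stencil_matvec(x, lower, diag, upper):
    """Same product written as a 3-point stencil applied to x.

    Only the three stencil coefficients are stored, so the work is O(n).
    Entries outside the vector are treated as zero (assumed boundary rule),
    which matches the truncated first and last rows of the matrix.
    """
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y[i] = lower * left + diag * x[i] + upper * right
    return y
```

For example, `stencil_matvec(x, 1.0, -2.0, 1.0)` should agree with `dense_matvec(tridiagonal_matrix(len(x), 1.0, -2.0, 1.0), x)` for any `x`. My question is about the case where no such constant stencil exists.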