This blog demonstrates optimization techniques for 2D depthwise convolution on GPUs using TVM, including block and thread organization, memory hierarchy exploitation, and dimension fusion.
This blog demonstrates optimization techniques for 1D GPU convolution using TVM, including thread organization, memory hierarchy exploitation, and low-level optimizations.