All Categories
Paper Summary (31)
Summary: Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
Summary: SARATHI: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills
Summary: SGLang: Efficient Execution of Structured Language Model Programs
Summary: MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models
Summary: EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
PyTorch (18)
GPU Puzzles
Tensor Puzzles
Deep Dive to Pytorch Contiguous Operator(4)
Distributed Training Strategy Introduction
Deep Dive into PyTorch Device Copy Operations
CSAPP (10)
ProxyLab
MallocLab
ShellLab
Cachelab
CSAPP Class Notes(4)
Server (7)
Ceph Learning Notes
Nginx Learning Notes
Kafka Learning Notes
Protobuf Learning Notes
Mysql Learning Notes
TVM (5)
TVM: 2D Depth Conv GPU Optimization
TVM: GEMM GPU Optimization
TVM: 1D convolution GPU Optimization
TVM: 1D convolution CPU Optimization
Summary: TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Technical Notes (4)
2025 Technical Notes(3)
2025 Technical Notes(2)
2025 Technical Notes(1)
2024 Technical Notes
Algorithm (2)
Understand Dynamic Programming
Understand Lightgbm
LLM (2)
Llama_index Source Code Analysis(1)
Llama_index Source Code Analysis(2)
vLLM (2)
Bi-weekly Journal: Contributions to vLLM
Summary: Efficient Memory Management for Large Language Model Serving with PagedAttention