vidur_mlsys24.pdf

LLM Inference

Challenges of LLM Inference Simulation

VIDUR Design

Decode Step - Attention + MLP