Speculative decoding is a common compound AI system that uses a small "drafter" model and a large..., Sonic AI
“Speculative decoding is a common compound AI system that uses a small "drafter" model and a large "verifier" model to increase latency and throughput without sacrificing output quality.”