Using the Speculative-Speculative Decoding (SSD) algorithm, it is possible to achieve a sampling ..., Sonic AI
“Using the Speculative-Speculative Decoding (SSD) algorithm, it is possible to achieve a sampling speed of 300 tokens per second for Llama 3 70B on a system with four H100 GPUs.”