“Chips from Groq and Cerebras achieve low latency by using SRAM for model weights, but this approach results in uncompetitive throughput.”
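
The claim rests on a memory-bound view of token-by-token decoding, which can be put into rough numbers. The following is a minimal sketch, not a vendor model. It assumes that every weight is read once per decode step, that weights are sharded perfectly across chips, and that each parameter costs about 2 FLOPs per generated token. The names `Chip`, `chips_for_weights`, `decode_step_time`, and `tokens_per_s_per_chip` are hypothetical, and the figures in the usage section are placeholders rather than specifications of any Groq or Cerebras product.

```python
import math
from dataclasses import dataclass


@dataclass
class Chip:
    """Hypothetical accelerator description; all fields are illustrative."""
    mem_bytes: float  # memory available for weights on one chip, in bytes
    mem_bw: float     # bandwidth of that memory, in bytes/s
    flops: float      # peak compute, in FLOP/s


def chips_for_weights(weight_bytes: float, chip: Chip) -> int:
    """Minimum number of chips whose combined memory holds all weights."""
    return math.ceil(weight_bytes / chip.mem_bytes)


def decode_step_time(weight_bytes: float, params: float, batch: int,
                     chip: Chip, n_chips: int) -> float:
    """Roofline-style lower bound on one decode step (one token per sequence)."""
    # Every weight is read once per step, with the read spread over all chips.
    mem_time = weight_bytes / (n_chips * chip.mem_bw)
    # Assumption: roughly 2 FLOPs per parameter per generated token.
    compute_time = 2.0 * params * batch / (n_chips * chip.flops)
    return max(mem_time, compute_time)


def tokens_per_s_per_chip(weight_bytes: float, params: float, batch: int,
                          chip: Chip) -> float:
    """Generated tokens per second, divided by the number of chips required."""
    n_chips = chips_for_weights(weight_bytes, chip)
    step = decode_step_time(weight_bytes, params, batch, chip, n_chips)
    return batch / (step * n_chips)


# Placeholder numbers only; substitute real figures before drawing conclusions.
params = 70e9
weight_bytes = params * 1.0  # assumes 1 byte per parameter
candidates = {
    "small-fast-memory chip": Chip(mem_bytes=0.5e9, mem_bw=50e12, flops=500e12),
    "large-slow-memory chip": Chip(mem_bytes=100e9, mem_bw=2e12, flops=500e12),
}
for name, chip in candidates.items():
    n = chips_for_weights(weight_bytes, chip)
    for batch in (1, 64):
        step = decode_step_time(weight_bytes, params, batch, chip, n)
        rate = tokens_per_s_per_chip(weight_bytes, params, batch, chip)
        print(f"{name}: chips={n} batch={batch} "
              f"step={step:.2e}s tokens/s/chip={rate:.1f}")
```

The sketch separates the two quantities the claim combines: step time (latency) depends on aggregate memory bandwidth, while per-chip throughput also depends on how many chips are needed to hold the weights and on the batch size. It leaves out KV-cache capacity, interconnect overhead, and cost per chip, so it cannot confirm or refute the claim without real figures for those.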