“It is very difficult to run large models with long context lengths on SRAM-based AI chips, such as those made by Cerebras and Groq.”
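A rough back-of-the-envelope sketch can make the memory pressure behind this claim concrete. It estimates the bytes needed for model weights plus the attention key/value cache, then divides by the on-chip SRAM per chip. Every number here is an illustrative assumption and not a vendor specification: the model shape (70 billion parameters, 80 layers, 8 KV heads, head dimension 128, 16-bit values), the 131,072-token context, and the 230 MB of SRAM per chip are all hypothetical inputs, and the helper names are made up for this example. The estimate also ignores activations, replication, and any off-chip memory.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):
    """Bytes for the key/value cache: one K and one V vector
    per layer, per KV head, per token, per sequence in the batch."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len * batch


def weight_bytes(n_params, bytes_per_param=2):
    """Bytes for the model weights at a fixed precision."""
    return n_params * bytes_per_param


def chips_needed(total_bytes, sram_bytes_per_chip):
    """Minimum chip count if all state must live in on-chip SRAM
    (integer ceiling division)."""
    return -(-total_bytes // sram_bytes_per_chip)


# Hypothetical model and hardware parameters (assumptions, not specs).
N_PARAMS = 70_000_000_000
N_LAYERS, N_KV_HEADS, HEAD_DIM = 80, 8, 128
SEQ_LEN = 131_072
SRAM_PER_CHIP = 230_000_000  # assumed bytes of SRAM per chip

weights = weight_bytes(N_PARAMS)
kv = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)
total = weights + kv

print(f"weights:  {weights / 1e9:.1f} GB")
print(f"KV cache: {kv / 1e9:.1f} GB for one {SEQ_LEN}-token sequence")
print(f"chips needed at {SRAM_PER_CHIP / 1e6:.0f} MB each: "
      f"{chips_needed(total, SRAM_PER_CHIP)}")
```

Because the KV cache grows linearly with both context length and batch size, doubling either one doubles that term, while the weight term stays fixed. Under these assumptions the total runs to hundreds of chips for a single long-context sequence, which is the kind of scaling the quote points at.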