Sonic AI
“Some papers from DeepSeek on their sparse attention mechanism show that it can change the memory fetch time calculation to scale with the square root of the context length, rather than linearly.”
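
To get a feel for what the quoted scaling change would mean, here is a minimal sketch that compares linear and square-root growth in abstract "fetch units". The function names and the unit constants are hypothetical, chosen only for illustration. They are not taken from the DeepSeek papers, and real fetch times would also depend on hardware bandwidth, cache layout, and constant factors that this toy model ignores.

```python
import math


def linear_fetch_units(context_len: int) -> float:
    """Relative memory traffic if fetch cost grows linearly with context length.

    Assumption: one abstract unit per token in the context.
    """
    return float(context_len)


def sqrt_fetch_units(context_len: int) -> float:
    """Relative memory traffic under the square-root scaling described in the quote.

    Assumption: unit constant factor, which real systems will not have.
    """
    return math.sqrt(context_len)


def linear_to_sqrt_ratio(context_len: int) -> float:
    """How many times more traffic the linear model predicts than the sqrt model."""
    return linear_fetch_units(context_len) / sqrt_fetch_units(context_len)


if __name__ == "__main__":
    for n in (4_096, 65_536, 1_048_576):
        print(f"{n:>9} tokens: linear/sqrt ratio = {linear_to_sqrt_ratio(n):.0f}x")
```

Under these assumptions the gap itself grows with the square root of the context length, so the claimed benefit would matter most at very long contexts.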