Dan Biderman, mentioned 14 times across podcast episodes and expert conversations analyzed by Sonic.
Dan Biderman previously believed that vision and action were the keys to AI progress, but the 'ChatGPT moment' and his work at MosaicML shifted his perspective to the effectiveness of language models.
Engram aims to use offline computation to compress large KV caches, potentially reducing an 80-gigabyte cache by a factor of 1,000x.
The entire set of weights for a 70-billion-parameter Llama model occupies approximately 100 gigabytes.