To synchronize model weights across globally distributed clusters for Composer 2 training, Cursor..., Sonic AI
“To synchronize model weights across globally distributed clusters for Composer 2 training, Cursor and Fireworks developed a compression algorithm that ships only the deltas, which can be up to 20 times smaller than the full 1 terabyte model.”
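The quote does not describe how the Cursor and Fireworks algorithm works, so the following is only a minimal sketch of the general idea of shipping deltas instead of full weights. It assumes, purely for illustration, that a checkpoint is a flat list of floats and that only a small fraction of entries change between syncs. The function names `encode_delta` and `apply_delta` are hypothetical and do not come from the source.

```python
import struct
import zlib


def encode_delta(old, new):
    """Pack (index, new_value) pairs for entries that changed, then compress.

    Illustrative only: this is NOT the algorithm from the quote, just one
    simple way to send less than the full weight list.
    """
    if len(old) != len(new):
        raise ValueError("weight lists must have the same length")
    changed = [(i, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]
    # Header: number of changed entries (little-endian unsigned 32-bit int).
    payload = struct.pack("<I", len(changed))
    for i, value in changed:
        # Each record: 4-byte index followed by an 8-byte double.
        payload += struct.pack("<Id", i, value)
    return zlib.compress(payload)


def apply_delta(old, blob):
    """Rebuild the new weights from the old weights plus a delta blob."""
    payload = zlib.decompress(blob)
    (count,) = struct.unpack_from("<I", payload, 0)
    weights = list(old)
    offset = 4
    for _ in range(count):
        i, value = struct.unpack_from("<Id", payload, offset)
        weights[i] = value
        offset += 12  # 4 bytes of index + 8 bytes of value
    return weights
```

A receiving cluster that already holds `old` would call `apply_delta(old, blob)` to obtain `new`. Storing the replacement value rather than an arithmetic difference keeps the reconstruction exact in this sketch. How much smaller the delta is than the full model depends entirely on how many entries change, so no particular ratio should be read into this example.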