Etched uses a "Prefill-Decode Disaggregation" architecture where one cluster of servers handles p..., Sonic AI
“Etched uses a "Prefill-Decode Disaggregation" architecture where one cluster of servers handles prefill tasks and then transfers the KV caches to a separate decode cluster.”
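As a rough illustration of the flow the quote describes, here is a minimal toy sketch in Python. It is not Etched's implementation: `PrefillWorker`, `DecodeWorker`, `transfer`, `toy_kv`, and `serve` are hypothetical names, the "KV cache" is a plain list of integer pairs, and the cross-cluster transfer is simulated by an in-memory copy. It only shows the split: one worker processes the whole prompt and builds the cache, the cache is handed off, and a separate worker generates tokens from it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KVCache:
    """Per-request cache: one (key, value) pair per processed token."""
    request_id: str
    entries: List[Tuple[int, int]] = field(default_factory=list)


def toy_kv(token: int) -> Tuple[int, int]:
    # Hypothetical placeholder for a real key/value projection of a token.
    return (token * 2, token * 3)


class PrefillWorker:
    """Runs on the (assumed) prefill cluster: processes the full prompt at once."""

    def prefill(self, request_id: str, prompt: List[int]) -> KVCache:
        return KVCache(request_id, [toy_kv(t) for t in prompt])


def transfer(cache: KVCache) -> KVCache:
    # Stands in for the network copy between clusters (assumption: a plain copy).
    return KVCache(cache.request_id, list(cache.entries))


class DecodeWorker:
    """Runs on the (assumed) decode cluster: generates one token per step."""

    def decode(self, cache: KVCache, max_new_tokens: int) -> List[int]:
        output: List[int] = []
        for _ in range(max_new_tokens):
            # Toy "attention": the next token depends on every cached key.
            next_token = sum(key for key, _ in cache.entries) % 100
            # Each new token extends the cache, as in real autoregressive decoding.
            cache.entries.append(toy_kv(next_token))
            output.append(next_token)
        return output


def serve(request_id: str, prompt: List[int], max_new_tokens: int) -> List[int]:
    cache = PrefillWorker().prefill(request_id, prompt)   # phase 1: prefill
    remote_cache = transfer(cache)                        # hand-off of the KV cache
    return DecodeWorker().decode(remote_cache, max_new_tokens)  # phase 2: decode


if __name__ == "__main__":
    print(serve("req-1", [1, 2, 3], max_new_tokens=2))
```

The point of separating the two phases is that prefill touches every prompt token in one pass while decode proceeds one token at a time, so each pool can be sized and scheduled independently; the cost is the extra cache transfer between them.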