“The energy efficiency of delivering an FP8 flop, including memory access, has seen only very incremental improvements in recent GPU generations.”