“Talos built an ASIC with the Llama 3.1 8B model hard-wired directly into the chip, reaching an inference speed of 16,000 tokens per second.”