“Using NVIDIA's NVFP4 precision format for model checkpoints provides optimal performance when combined with TensorRT LLM optimizations.”