“NVIDIA's TensorRT LLM is an open-source SDK that can optimize inference performance on NVIDIA hardware with only a few lines of code.”