“The StreamRag paper from Meta proposes running a RAG system concurrently while a user is speaking to reduce latency in conversational voice AI.”