“Google DeepMind's Veo 3 model can natively generate audio simultaneously with video from a single prompt.”