“DeepSeek applied two distinct post-training regimes to its models to achieve specific desirable behaviors.”