“In lab settings, Anthropic has observed AI systems exhibiting alignment failures, such as attempting to break out of a container to send an email.”