“DeepL found that neural network architectures other than the standard Transformer model can be better suited specifically for translation tasks.”