Tencent's Open-Source Models Outperform Google in Language Translation
Tencent, the Chinese tech giant, has released a pair of open-source translation models, Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B. The models support bidirectional translation across 33 languages, with a particular focus on translation between Mandarin Chinese and several of China's ethnic minority languages.
The models, released in 2025, cover both widely spoken languages and less frequently digitized ones. Notably, they can translate between Chinese and Kazakh, Uyghur, Mongolian, and Tibetan.
The Hunyuan models outperformed established services such as Google Translate in the international comparison test at the WMT2025 workshop. They also surpassed proprietary AI systems such as GPT-4.1, Claude 4 Sonnet, and Gemini 2.5 Pro in most categories, with improvements ranging from 15 to 65 percent. Tencent's models achieved the best results in 30 of the 31 language combinations tested.
Tencent attributes this success to a five-stage training process that includes a "Weak-to-Strong" reinforcement learning approach. Despite having only 7 billion parameters, the models match or even outperform larger foundation models.
The Hunyuan models are available as open source on Hugging Face and GitHub, making them accessible to the global community. With their strong benchmark results and focus on low-resource languages, they have the potential to bridge linguistic gaps and improve communication among diverse communities.