
Tencent Hunyuan Launches Offline Smartphone Translation Model Hy-MT1.5


📱 High-Performance Offline Translation for Smartphones

Most translation tools require internet connectivity, leaving users vulnerable in critical situations. Tencent Hunyuan addresses this limitation with Hy-MT1.5-1.8B-1.25bit, a highly compressed translation model supporting 33 languages. At just 440MB, it runs entirely offline on mobile devices, delivering translation quality surpassing Google Translate.

Demo tests on devices like the Qualcomm Snapdragon 865 with 8GB RAM demonstrate both speed and accuracy, enabling fast, reliable translations without an internet connection.

🔧 Built on Hy-MT1.5: Lightweight but Powerful

Hy-MT1.5 is a professional translation LLM with 1.8B parameters, natively supporting 33 languages, 5 dialects/minority languages, and 1,056 translation directions. Its performance rivals large commercial models and surpasses mainstream APIs in evaluation benchmarks, demonstrating that efficient optimization enables lightweight models to achieve high translation quality.
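The 1,056 translation directions line up exactly with the number of ordered source→target pairs among the 33 natively supported languages, which can be checked in one line:

```python
# Ordered source->target pairs among the 33 supported languages.
n_languages = 33
directions = n_languages * (n_languages - 1)
print(directions)  # 1056
```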

At FP16 precision, however, the full model still requires about 3.3GB of RAM, so aggressive quantization and compression are needed to run it smoothly on smartphones.
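The memory figures follow directly from the parameter count and bits per parameter. A minimal sketch (weights only; runtime buffers and activations add more, and the shipped packages also carry extra data such as embeddings or metadata, which is presumably why the reported 574MB and 440MB sizes exceed the raw weight footprint):

```python
def weight_footprint_mib(n_params: float, bits_per_param: float) -> float:
    """Raw weight storage in MiB at a given precision (weights only)."""
    return n_params * bits_per_param / 8 / 2**20

N = 1.8e9  # Hy-MT1.5 parameter count
for label, bits in [("FP16", 16), ("2-bit", 2), ("1.25-bit", 1.25)]:
    print(f"{label:>8}: {weight_footprint_mib(N, bits):,.0f} MiB")
```

At FP16 this gives roughly 3,433 MiB, i.e. about 3.3GB, matching the figure above.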

⚡ Extreme Quantization and Compression Techniques

Tencent introduced two quantization schemes for mobile adaptation:

2-bit Model: Balanced Performance

Using Stretched Elastic Quantization (SEQ) and Quantization-Aware Distillation (QAD), the 2-bit model compresses the original 1.8B model to 574MB while maintaining near-lossless translation quality. On mobile devices supporting Arm SME2, inference speed is significantly enhanced.
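The details of SEQ and QAD are in Tencent's technical report. As a generic illustration only (not Tencent's algorithm), a plain per-group 2-bit affine quantizer looks like this:

```python
import numpy as np

def quantize_2bit(w: np.ndarray, group_size: int = 64):
    """Per-group 2-bit affine quantization: each group of weights is
    mapped to one of 4 levels spanning its min..max range."""
    groups = w.reshape(-1, group_size)
    lo = groups.min(axis=1, keepdims=True)
    scale = (groups.max(axis=1, keepdims=True) - lo) / 3.0  # 4 levels: 0..3
    scale = np.maximum(scale, 1e-12)                        # guard constant groups
    q = np.clip(np.round((groups - lo) / scale), 0, 3).astype(np.uint8)
    return q, scale, lo

def dequantize_2bit(q, scale, lo):
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
q, scale, lo = quantize_2bit(w)
err = np.abs(w.reshape(-1, 64) - dequantize_2bit(q, scale, lo))
print("max abs error:", float(err.max()))
```

With only four levels per group, naive quantization like this loses noticeable quality; techniques such as quantization-aware distillation exist precisely to recover that gap during training.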

1.25-bit Model: Extreme Compression

The Sherry (Sparse Efficient Ternary Quantization) approach reduces the model to 1.25 bits per parameter by storing only the most critical parameters, combined with the STQ kernel for mobile CPUs. This compresses the original 3.3GB model to 440MB, enabling smooth performance on budget smartphones without sacrificing translation accuracy.
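Sherry itself is described in the technical report; the following is my own minimal sketch, not Tencent's algorithm, showing how sparsity pushes a ternary code toward roughly 1.25 bits per parameter: if most weights are zeroed and only the survivors keep a sign, the entropy of the resulting {-1, 0, +1} code drops well below 2 bits.

```python
import numpy as np

def ternarize_sparse(w: np.ndarray, keep_frac: float = 0.3):
    """Zero out the smallest weights, map survivors to {-1, +1}
    times a single per-tensor scale."""
    thresh = np.quantile(np.abs(w), 1.0 - keep_frac)
    mask = np.abs(w) >= thresh
    scale = float(np.abs(w[mask]).mean())
    return (np.sign(w) * mask).astype(np.int8), scale

def entropy_bits_per_param(t: np.ndarray) -> float:
    """Shannon entropy of the ternary code: the best achievable
    storage cost in bits per parameter for this value distribution."""
    _, counts = np.unique(t, return_counts=True)
    p = counts / t.size
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
w = rng.standard_normal(1 << 16).astype(np.float32)
t, scale = ternarize_sparse(w)
print(f"{entropy_bits_per_param(t):.2f} bits/param")
```

With 30% of weights kept, the entropy comes out near 1.2 bits per parameter, which is in the same regime as the 1.25 bits/parameter the article cites.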

🔒 Fully Offline, Zero Privacy Risk

Tencent Hunyuan provides a fully functional demo that supports translating text selected in other apps. All translation occurs locally, with no internet connection, subscription, or data upload required, ensuring complete privacy.

Demo tests on devices like Qualcomm Snapdragon 7+ Gen 2 with 16GB RAM confirm usability across a range of smartphones.

🚀 Open-Source Access

All models, code, and technical reports are open-source, with Android demos currently available. iOS support is planned for future releases.

Demo Links

Model Downloads

Technical Reports

Code Repository

Tencent Hunyuan Hy-MT1.5 demonstrates that offline, high-quality, mobile-friendly translation is now possible, combining cutting-edge quantization techniques with a compact, fully local LLM.
