
OpenBMB trains 8B model at 1.58-bit on Huawei Ascend NPU, retaining over 95% of full-precision performance

OpenBMB's BitCPM-CANN achieves native ternary quantization training on Huawei's Ascend NPU, retaining 95.7–97.2% of full-precision performance across models up to 8B parameters while reducing weight memory by up to 8×.

OpenBMB released BitCPM-CANN, a family of ternary-quantized large language models trained natively on Huawei's Ascend NPU platform. The work ports 1.58-bit quantization-aware training (QAT) from GPU-based pipelines to CANN, MindSpeed, and Megatron-LM, delivering four models ranging from 0.5B to 8B p...
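To make "ternary" and "1.58-bit" concrete, here is a minimal sketch of absmean ternary quantization in the style of BitNet b1.58. This is an assumption for illustration: the text above does not state which quantizer BitCPM-CANN uses, and the function names (`ternary_quantize`, `dequantize`, `compression_ratio`) are hypothetical. Each weight is mapped to one of three values, {-1, 0, +1}, which carries log2(3) ≈ 1.58 bits of information. The 8× figure is consistent with storing each ternary weight in 2 bits instead of 16, which is again an assumed storage layout.

```python
import math

def ternary_quantize(weights, eps=1e-8):
    """Map weights to {-1, 0, +1} using an absmean scale (assumed scheme)."""
    # Scale is the mean absolute value of the weights, floored to avoid division by zero.
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Divide by the scale, round to the nearest integer, then clip to [-1, 1].
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [v * scale for v in q]

def compression_ratio(full_bits=16, packed_bits=2):
    """Weight-memory reduction, assuming 16-bit weights packed into 2 bits each."""
    return full_bits / packed_bits

# Hypothetical example weights, not taken from any released model.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
q, scale = ternary_quantize(weights)
print("ternary codes:", q)
print("bits per ternary weight:", math.log2(3))
print("compression ratio:", compression_ratio())
```

In quantization-aware training, a step like this would run in the forward pass while gradients update full-precision latent weights, typically through a straight-through estimator. That training loop is omitted from this sketch.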

Method & sources
Source type: Primary publication (lab/vendor blog), with our analysis and implications
Source link: r/localllama
Published: UTC
Byline: By the gotcontext.ai team (editorial standards)
Corrections: corrections@gotcontext.ai