OpenBMB trains 8B model to 1.58-bit on Huawei Ascend NPU, retaining roughly 95–97% of full-precision performance

OpenBMB's BitCPM-CANN achieves native ternary quantization training on Huawei's Ascend NPU, retaining 95.7–97.2% of full-precision performance across models up to 8B parameters while reducing weight memory by up to 8×.

2026-05-25 · 1 min read

Source: r/localllama

OpenBMB released BitCPM-CANN, a family of ternary-quantized large language models trained natively on Huawei's Ascend NPU platform. The work ports 1.58-bit quantization-aware training (QAT) from GPU-based pipelines to CANN, MindSpeed, and Megatron-LM, delivering four models ranging from 0.5B to 8B p...
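For readers unfamiliar with the term, the sketch below illustrates what "ternary" or "1.58-bit" weights mean. It uses the absmean rounding scheme popularized by BitNet b1.58 as a stand-in; the source does not confirm which quantizer BitCPM-CANN uses, and the function names and sample weights are hypothetical. The memory figure assumes a 16-bit baseline and 2-bit packed storage, which is one way to arrive at an 8× reduction.

```python
import math

def ternarize(weights):
    """Map floats to codes in {-1, 0, +1} plus one shared scale (absmean scheme).

    Assumption: this mirrors the BitNet b1.58 recipe, not necessarily BitCPM-CANN.
    """
    # Shared scale = mean absolute weight; fall back to 1.0 if all weights are zero.
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate weights from ternary codes and the shared scale."""
    return [c * scale for c in codes]

# Hypothetical sample weights, for illustration only.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
codes, scale = ternarize(weights)
approx = dequantize(codes, scale)

# Three possible values carry log2(3) bits of information, hence "1.58-bit".
bits_per_ternary = math.log2(3)

# Assumed storage: 16-bit baseline weights vs. ternary codes packed at 2 bits each.
packed_ratio = 16 / 2

print(codes)
print(round(bits_per_ternary, 2), packed_ratio)
```

In quantization-aware training, a step like `ternarize` would typically run in the forward pass while full-precision weights are kept for the optimizer update; that detail is a general property of QAT and is not taken from the source.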


Method & sources

Source type: Primary publication (lab/vendor blog) — our analysis + implication
Source link: r/localllama
Published: 2026-05-25 23:08:53 UTC
Byline: By the gotcontext.ai team (editorial standards)
Corrections: corrections@gotcontext.ai
