Research
OpenBMB trains 8B model to 1.58-bit on Huawei Ascend NPU, retaining over 95% of full-precision performance
OpenBMB's BitCPM-CANN achieves native ternary quantization training on Huawei's Ascend NPU, retaining 95.7–97.2% of full-precision performance across models up to 8B parameters while reducing weight memory by up to 8×.
Source: r/localllama
OpenBMB released BitCPM-CANN, a family of ternary-quantized large language models trained natively on Huawei's Ascend NPU platform. The work ports 1.58-bit quantization-aware training (QAT) from GPU-based pipelines to CANN, MindSpeed, and Megatron-LM, delivering four models ranging from 0.5B to 8B p...
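The "1.58-bit" and "up to 8×" figures can be checked with a little arithmetic. The sketch below is illustrative only: it assumes an absmean ternary quantizer of the kind popularized by BitNet b1.58, which the summary above does not confirm as BitCPM-CANN's scheme, and it assumes the 8× figure compares 16-bit weights against ternary codes packed at 2 bits each. The function names are hypothetical.

```python
import math

# A ternary weight takes one of three values {-1, 0, +1}, so its
# information content is log2(3) bits, which is where "1.58-bit" comes from.
BITS_PER_TERNARY = math.log2(3)


def ternary_quantize(weights):
    """Map weights to {-1, 0, +1} plus one scale (absmean scheme).

    Assumption: this mirrors the BitNet b1.58 recipe, not necessarily
    the quantizer used in BitCPM-CANN.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    scale = max(scale, 1e-8)  # guard against an all-zero tensor
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate weights from ternary codes and the scale."""
    return [c * scale for c in codes]


def weight_memory_gib(n_params, bits_per_weight):
    """Weight storage in GiB for a given per-weight bit width."""
    return n_params * bits_per_weight / 8 / 2**30


weights = [0.9, -0.05, 0.4, -1.2, 0.02, 0.6]
codes, scale = ternary_quantize(weights)
print(codes)  # small weights collapse to 0, large ones to +1 or -1

fp16_gib = weight_memory_gib(8e9, 16)    # 8B parameters at 16 bits each
packed_gib = weight_memory_gib(8e9, 2)   # ternary codes stored in 2 bits each
print(f"{fp16_gib:.1f} GiB -> {packed_gib:.1f} GiB "
      f"({fp16_gib / packed_gib:.0f}x smaller)")
```

Under these assumptions the practical saving is 8× rather than the theoretical 16 / 1.58 ≈ 10×, because three states do not pack evenly into whole bits and a simple layout spends 2 bits per weight.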
Method & sources
- Source type: Primary publication (lab/vendor blog); our analysis + implication
- Source link: r/localllama
- Published: UTC
- Byline: By the gotcontext.ai team (editorial standards)
- Corrections: corrections@gotcontext.ai