⚠️ SUPERSEDED — use Outlier-Ai/Outlier-10B instead. These weights are retained live for reproducibility of earlier benchmark runs. All current research has moved to the successor.

Outlier 10B V3.2 (Superseded)

An earlier ternary mixture-of-experts (MoE) overlay on a frozen Qwen2.5-7B-Instruct base. Superseded by Outlier-10B (V3.3, alpha-fixed).

What changed

The MMLU score of 76.19% reported for V3.2 was a smoke-test artifact. The canonical V3.3 re-measurement (n=14,042, lm-eval-harness v0.4.9.1), with alpha-fix training applied, is 70.87% ±0.37%.
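As a rough sanity check (our arithmetic, not from the card), the reported ±0.37% is close to the one-sigma binomial standard error for an accuracy of 70.87% over n = 14,042 questions:

```python
import math

# Binomial standard error for accuracy p measured over n independent questions.
p = 0.7087   # canonical V3.3 MMLU accuracy
n = 14_042   # full MMLU test set size

se = math.sqrt(p * (1 - p) / n)
print(f"standard error ≈ {se:.2%}")  # ≈ 0.38%, consistent with the reported ±0.37%
```

The small gap between 0.38% and 0.37% is expected: harness-reported stderr is computed per-subject and aggregated, not as a single pooled binomial.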

Historical benchmark (reference only)

| Benchmark | Score | Notes |
|---|---|---|
| MMLU 5-shot | 76.19% (smoke-test, n<100) | Pre-V3.3 measurement |

Use the successor
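For new work, load the successor checkpoint instead of this one. A minimal sketch, assuming the successor loads through the standard transformers API; since the overlay is published as an adapter, loading via PEFT may also be required, so check the successor's card for the authoritative snippet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Successor checkpoint; the V3.2 weights on this card are kept only for
# reproducibility of earlier benchmark runs.
model_id = "Outlier-Ai/Outlier-10B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native dtypes (F16/F32 mix)
    device_map="auto",    # place layers on available devices
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```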

Why this is still public

Following ML research norms, earlier checkpoints stay live so that external benchmarks and papers citing this URL remain reproducible. These weights are not dead weight; they are the historical record.

Patents

The architecture is covered by US provisional patent applications 64/026,886, 64/030,368, and 64/034,028 (Kerr & Company LLC, 2026).
