Outlier Research
Collection
Ternary MoE overlays on Qwen2.5. 10B/40B/70B/150B scales. V3.3 active, V3.2/V2 archived. MMLU verified at n=14,042.
⚠️ SUPERSEDED — use Outlier-Ai/Outlier-10B instead. These weights are retained live for reproducibility of earlier benchmark runs. All current research has moved to the successor.
Earlier ternary MoE overlay on frozen Qwen2.5-7B-Instruct. Superseded by Outlier-10B (V3.3 alpha-fixed).
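For readers unfamiliar with the approach, a ternary overlay constrains adapter weights to {-1, 0, +1} times a per-tensor scale while the base model stays frozen. The sketch below shows a generic threshold-based ternarization (TWN-style) in NumPy; the threshold fraction `delta_frac` and the function name are illustrative assumptions, not the training procedure actually used for these checkpoints.

```python
import numpy as np

def ternarize(w, delta_frac=0.7):
    """Threshold-based ternarization sketch (TWN-style), not the repo's exact method.

    Maps weights to alpha * {-1, 0, +1}, where alpha is the mean magnitude
    of the weights that survive the threshold. delta_frac is a hypothetical knob.
    """
    delta = delta_frac * np.mean(np.abs(w))   # per-tensor threshold
    mask = np.abs(w) > delta                  # which weights stay nonzero
    t = np.sign(w) * mask                     # ternary codes in {-1, 0, +1}
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * t, alpha

w = np.array([0.9, -0.8, 0.05, -0.02, 0.7])
q, alpha = ternarize(w)
# q is the dequantized ternary tensor; alpha is the per-tensor scale.
```

The scale alpha is the quantity the "alpha-fix" in V3.3 refers to by name; how that fix is applied during training is not specified here.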
The 76.19% MMLU reported for V3.2 was a smoke-test artifact (n < 100). The canonical V3.3 re-measurement (n=14,042, lm-eval-harness v0.4.9.1, alpha-fix training applied) is 70.87% ± 0.37%.
| Benchmark | Score | Notes |
|---|---|---|
| MMLU 5-shot | 76.19% | Pre-V3.3 smoke test (n < 100); superseded |
| MMLU 5-shot | 70.87% ± 0.37% | V3.3 canonical (n=14,042, lm-eval-harness v0.4.9.1) |
Per ML research norms, earlier checkpoints stay live so that external benchmarks and papers citing this URL remain reproducible. This is not dead weight; it is the historical record.
Architecture covered by US provisional patents 64/026,886, 64/030,368, 64/034,028 (Kerr & Company LLC, 2026).