Motif-2-12.7B
Last update: 10 Dec. 2025
This is a reasoning-enhanced version of Motif-2-12.7B-Instruct. Detailed information will be released later.
| Benchmark | Evaluation setting | Motif-2-12.7B-Instruct | Motif-2-12.7B-Reasoning |
|---|---|---|---|
| MMLU | 0-shot | 86.11 | 84.07 |
| MMLU-Redux | - | 90.02 | 88.89 |
| BBH | 0-shot | 85.78 | 78.34 |
| GPQA-Diamond | 0-shot, CoT | 63.6 | 70 |
| GSM8K | 0-shot, CoT | 96.13 | 95.53 |
| MATH | 0-shot | 97 | 95.07 |
| MBPP | 3-shot | 91 | 88.9 |
| LiveBench 2024-11-25 | - | 33.8 | 49.9 |
| IFEval | strict prompt | 75.78 | 79.11 |
| IFEval | 0-shot | 76.52 | 81.89 |
| MATH-500 | - | 96.8 | 99.3 |
| AIME24 | - | 72.3 | 88.3 |
| AIME25 | - | 63.6 | 80 |
| ZebraLogic | - | 69.5 | 77 |
| BFCL v3 | - | 55.34 | 60.2 |
| LiveCodeBench v5 (2024.10 - 2025.2) | - | 50.03 | 65 |
| LiveCodeBench v5 | 0-shot, CoT | 61.66 | 60.1 |
| HumanEval | 0-shot | 93.2 | 93.2 |
| Average | - | 75.45 | 79.71 |
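
The Average row is consistent with an unweighted mean over the 18 benchmark rows above it. The model card does not state how the average was computed, so treat this as a check on the arithmetic rather than a description of the authors' method. A minimal sketch, with the scores copied by hand from the table (Instruct first, Reasoning second):

```python
# (instruct, reasoning) score pairs copied from the table, in row order,
# excluding the "Average" row itself.
scores = [
    (86.11, 84.07),  # MMLU
    (90.02, 88.89),  # MMLU-Redux
    (85.78, 78.34),  # BBH
    (63.6, 70),      # GPQA-Diamond
    (96.13, 95.53),  # GSM8K
    (97, 95.07),     # MATH
    (91, 88.9),      # MBPP
    (33.8, 49.9),    # LiveBench 2024-11-25
    (75.78, 79.11),  # IFEval (strict prompt)
    (76.52, 81.89),  # IFEval (0-shot)
    (96.8, 99.3),    # MATH-500
    (72.3, 88.3),    # AIME24
    (63.6, 80),      # AIME25
    (69.5, 77),      # ZebraLogic
    (55.34, 60.2),   # BFCL v3
    (50.03, 65),     # LiveCodeBench v5 (2024.10 - 2025.2)
    (61.66, 60.1),   # LiveCodeBench v5 (0-shot, CoT)
    (93.2, 93.2),    # HumanEval
]

# Assumption: every row carries equal weight.
instruct_avg = sum(i for i, _ in scores) / len(scores)
reasoning_avg = sum(r for _, r in scores) / len(scores)

print(f"Instruct:  {instruct_avg:.2f}")   # 75.45
print(f"Reasoning: {reasoning_avg:.2f}")  # 79.71
```

Because every row is weighted equally, the two IFEval rows and the two LiveCodeBench v5 rows each count twice toward the reported average.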
Base model: Motif-Technologies/Motif-2-12.7B-Base