Nemotron-Cascade • Collection • Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 17 items • Updated 6 days ago • 39
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-from-rl 8B • Updated 15 days ago • 13
pittawat/qwen2.5-14b-instruct-still-3-1k-grpo-with-length-0.1-cot-prompt-v6 15B • Updated 27 days ago • 15
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-124 8B • Updated 27 days ago • 15
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-93 8B • Updated 27 days ago • 16
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-62 8B • Updated 27 days ago • 14
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-31 8B • Updated 27 days ago • 16