Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Doohyuk Jang
jadohu
models 14
jadohu/Qwen2.5-32B-GRPO • Reinforcement Learning • 33B
jadohu/Qwen3-8B-GRPO • Reinforcement Learning • 8B
jadohu/Qwen3-8B-MASA-efficient • Reinforcement Learning • 8B
jadohu/Qwen3-8B-MASA • Reinforcement Learning • 8B
jadohu/Qwen3-14B-GRPO • Reinforcement Learning • 15B
jadohu/Qwen3-14B-MASA • Reinforcement Learning • 15B
jadohu/Qwen2.5-32B-MASA-efficient • Reinforcement Learning • 33B
jadohu/MongMong • Text Generation • 8B
jadohu/anole_drafter • Text-to-Image • 0.5B
jadohu/llamagen_drafter • Text-to-Image
datasets 0
None public yet