declare-lab/delta-mem_qwen3_4b-instruct
Text Generation • Updated • 4
Natural Language Processing
On the Limits of LLM-as-Judge for Scientific Novelty Assessment
GRAIL: Gradient-Reweighted Advantages for Reinforcement Learning with Verifiable Rewards