trl-lib/tldr
How to use phh/Qwen3-0.6B-TLDR-Lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
model = PeftModel.from_pretrained(base_model, "phh/Qwen3-0.6B-TLDR-Lora")

Source code available at https://github.com/phhusson/llm-rl/blob/main/grpo-tldr.py
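Once the adapter is loaded as above, generating a summary might look like the sketch below. It is not taken from the linked training script: the helper names `build_tldr_prompt` and `summarize` are hypothetical, the `SUBREDDIT / TITLE / POST / TL;DR:` prompt layout is an assumption about how trl-lib/tldr prompts are formatted, and the tokenizer is assumed to be the one shipped with the base model `Qwen/Qwen3-0.6B`.

```python
def build_tldr_prompt(subreddit: str, title: str, post: str) -> str:
    """Format a Reddit post as a summarization prompt.

    The layout is an assumption about the trl-lib/tldr prompt format,
    not something confirmed by the model card.
    """
    return f"SUBREDDIT: r/{subreddit}\n\nTITLE: {title}\n\nPOST: {post}\n\nTL;DR:"


def summarize(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the base model plus the LoRA adapter and generate a summary."""
    # Heavy imports stay local so the prompt helper works without them.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter reuses the base model's tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
    base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
    model = PeftModel.from_pretrained(base_model, "phh/Qwen3-0.6B-TLDR-Lora")
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `summarize(build_tldr_prompt("AskReddit", "some title", "some post text"))` downloads both the base weights and the adapter on first use, so it needs network access and the `peft`, `transformers`, and `torch` packages installed.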
16-bit