dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-webshop-stage3 • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-webshop • Text Generation • 8
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld • Text Generation • 5
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3-rft-setting_2 • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3-rft-setting_3 • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3-rft-setting_5 • Text Generation • 4
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3-rft-setting_4 • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3-rft • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard-3 • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3 • Text Generation • 8
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-agumented-hard • Text Generation • 5
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-all • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-agumented-hard • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-agumented-all
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-dpo • Text Generation • 7
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-etotraj • Text Generation • 5
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft-more • Text Generation • 7
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-rft • Text Generation • 7
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full • Text Generation • 7
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-pro • Text Generation • 6
dslighfdsl/Llama-3.1-8B-Instruct-GRPO • 11
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short • Text Generation • 5
dslighfdsl/Llama-3.1-8B-Instruct-Env-SFT-Latest-0410 • Text Generation • 5
dslighfdsl/Llama-3.1-8B-Instruct-Env-SFT-Latest • Text Generation • 7
dslighfdsl/Llama-3.1-8B-Instruct-Env-SFT-GRPO
dslighfdsl/Llama-3.1-8B-Instruct-Env-SFT • Text Generation • 11
dslighfdsl/sciworld_llama-3.1-8b-instruct-env-sft
dslighfdsl/Llama-3.2-3B-Instruct-ebrag • Text Generation • 8
dslighfdsl/Qwen2.5-3B-Instruct-ebrag
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-GRPO