Active filters: trl
JackBinary/Qwen3.5-24B-A3B-Claude-Opus-Gemini-3.1-Pro-Reasoning-Distilled-heretic
Text Generation
• 24B • Updated
• 53
• 1
Karan6124/llama3-8b-dpo-orca-adapter
Text Generation
• Updated
• 30
• 1
Text Generation
• 71B • Updated
• 1.3k
• 1
Text Generation
• 2B • Updated
• 33
• 1
N-Bot-Int/OpenElla-StoryWriter-TypeB
Text Generation
• 1B • Updated
• 118
• 1
kth8/gemma-3-270m-it-SuperGPQA-Classifier
Text Generation
• 0.3B • Updated
• 36
• 1
N-Bot-Int/OpenElla-StoryWriter-TypeB-GGUF
Text Generation
• 1B • Updated
• 162
• 1
filter-with-espresso/Qwen2.5-14B-Instruct-moltbook-finetune-v9
Updated
• 1
Dorjzodovsuren/MongolianTTS_elevenlabs
Text Generation
• 3B • Updated
• 45
• 1
Simonc-44/Cygnis-Alpha-2-7B-v0.1
Updated
• 1
mradermacher/Qwen3.5-24B-A3B-Claude-Opus-Gemini-3.1-Pro-Reasoning-Distilled-heretic-i1-GGUF
24B • Updated
• 2.1k
• 1
mirazrafi/NSFW-RP-RolePlay-LoRA-Qwen-3.5-4B
Text Generation
• Updated
• 194
• 4
arif-butt/finetuned-llama-3.2-1b-it
mirazrafi/NSFW-RP-RolePlay-LoRA-Qwen-3.5-9B
Text Generation
• Updated
• 171
• 1
Reinforcement Learning
• Updated
• 1
ybelkada/gpt-neo-125m-detox
Reinforcement Learning
• Updated
• 18
ybelkada/gpt-neo-125m-detoxified-long-context
Reinforcement Learning
• Updated
• 4
Reinforcement Learning
• Updated
• 2
• 1
SummerSigh/T5-Base-Rule-Of-Thumb-RM
Reinforcement Learning
• Updated
• 1
dshin/flan-t5-ppo-testing
Reinforcement Learning
• Updated
• 1
• 1
SummerSigh/T5-Base-EvilPrompterRM
Reinforcement Learning
• 0.2B • Updated
• 2
dshin/flan-t5-ppo-testing-violation
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
• 1
dshin/flan-t5-ppo-user-h-use-violation
Reinforcement Learning
• Updated
dshin/flan-t5-ppo-user-f-use-violation
Reinforcement Learning
• Updated
• 1
dshin/flan-t5-ppo-user-e-use-violation
Reinforcement Learning
• Updated
• 2
dshin/flan-t5-ppo-user-a-use-violation
Reinforcement Learning
• Updated
• 1
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-0
Reinforcement Learning
• Updated
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-0
Reinforcement Learning
• Updated
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-0-use-violation
Reinforcement Learning
• Updated
• 1