Supervised fine-tuning (SFT) of Qwen/Qwen3-4B-Base on the Concyclics/PeoplesDaily dataset:

  • batch_size: 96
  • epochs: 2
  • learning_rate: 1.0e-5
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • total_flops: 483TFlops
  • train_loss: 1.646
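
The settings above imply a learning-rate curve that can be sketched in a few lines. The following is a minimal illustration, not the author's training code: it assumes the common "linear warmup, then cosine decay to zero" shape, and the example count of 9,600 is purely hypothetical because the card does not state the dataset size. The helper names `total_steps` and `lr_at` are made up for this sketch.

```python
import math

# Values taken from the card.
PEAK_LR = 1.0e-5      # learning_rate
WARMUP_RATIO = 0.1    # warmup_ratio
BATCH_SIZE = 96       # batch_size
EPOCHS = 2            # epochs


def total_steps(num_examples, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Optimizer steps, assuming one step per batch and a final partial batch."""
    return math.ceil(num_examples / batch_size) * epochs


def lr_at(step, num_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Assumption: the warmup length is the truncated product of the ratio and
    the step count; real trainers may round differently.
    """
    warmup_steps = int(num_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, num_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical dataset size, chosen only to make the arithmetic easy.
    steps = total_steps(9_600)
    for s in (0, steps // 20, steps // 10, steps // 2, steps):
        print(f"step {s:4d}: lr = {lr_at(s, steps):.3e}")
```

With these settings the rate climbs during the first tenth of training, reaches 1.0e-5 once, and falls smoothly to zero by the last step.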
Model size: 4B params
Tensor type: BF16
Weights format: Safetensors
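
As a rough guide to what these figures mean in practice, the weight footprint can be estimated from the parameter count and the tensor type. This is back-of-the-envelope arithmetic only: it treats "4B" as exactly 4e9 parameters (the card gives a rounded figure) and ignores activations, KV cache, and any optimizer state. The helper name `weight_bytes` is hypothetical.

```python
PARAMS = 4e9          # "4B params" from the card, treated as exact here
BYTES_PER_BF16 = 2    # bfloat16 stores each value in 16 bits


def weight_bytes(n_params, bytes_per_param=BYTES_PER_BF16):
    """Approximate size of the raw weights in bytes."""
    return int(n_params * bytes_per_param)


if __name__ == "__main__":
    size = weight_bytes(PARAMS)
    print(f"{size / 1e9:.1f} GB ({size / 2**30:.2f} GiB) for weights alone")
```

Under these assumptions the BF16 weights alone take roughly 8 GB, so actual memory use at inference time will be somewhat higher.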

Model tree for Concyclics/PeoplesDaily-Qwen3-4B-Base

Base model: Qwen/Qwen3-4B-Base

Dataset used to train Concyclics/PeoplesDaily-Qwen3-4B-Base: Concyclics/PeoplesDaily