factory_qwen_results

This model is a PEFT adapter fine-tuned from Qwen/Qwen3-Coder-30B-A3B-Instruct on the training dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1424
  • Accuracy: 0.9676

Model description

More information needed

Intended uses & limitations

More information needed
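
Usage notes were not provided with this card. As a minimal, untested sketch (assuming the adapter loads with PEFT on top of the base model named above; the chat-template call and dtype/device settings are assumptions, not part of this card):

```python
# Hypothetical usage sketch: load this PEFT adapter on top of its base model.
# Repo ids are taken from this card; dtype/device settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
adapter_id = "finalform/velocityFoamQwen-30B"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Write a function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that running this requires downloading the 30B base model; hardware with enough memory for bf16 inference (or a quantized loading path) is assumed.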

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0004
  • train_batch_size: 1
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 12
  • total_eval_batch_size: 8
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.08
  • num_epochs: 3.0
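
The derived totals above follow from the per-device settings. A quick arithmetic check (assuming, as HF Trainer does by default, that warmup length is warmup_ratio times the total optimizer steps, rounded up; the 1110 total steps come from the results table below):

```python
import math

# Per-device settings from the hyperparameter list above.
per_device_train_batch_size = 1
per_device_eval_batch_size = 2
num_devices = 4
gradient_accumulation_steps = 3

# Effective batch sizes.
total_train_batch_size = per_device_train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Warmup steps implied by warmup_ratio over the full run (1110 steps, 3 epochs).
total_steps = 1110
warmup_steps = math.ceil(0.08 * total_steps)

print(total_train_batch_size)  # 12
print(total_eval_batch_size)   # 8
print(warmup_steps)            # 89
```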

Training results

Training Loss Epoch Step Accuracy Validation Loss
0.2607 0.0811 30 0.9369 0.2531
0.2818 0.1622 60 0.9464 0.2187
0.193 0.2432 90 0.9497 0.2058
0.1835 0.3243 120 0.9512 0.1971
0.1586 0.4054 150 0.9528 0.1891
0.141 0.4865 180 0.9552 0.1821
0.1359 0.5676 210 0.9561 0.1726
0.1038 0.6486 240 0.9574 0.1720
0.1784 0.7297 270 0.9578 0.1632
0.3386 0.8108 300 0.9590 0.1573
0.1101 0.8919 330 0.9609 0.1555
0.1123 0.9730 360 0.9619 0.1513
0.0956 1.0541 390 0.9618 0.1552
0.0802 1.1351 420 0.9634 0.1525
0.0671 1.2162 450 0.9634 0.1519
0.0738 1.2973 480 0.9639 0.1493
0.0622 1.3784 510 0.9639 0.1477
0.063 1.4595 540 0.9658 0.1435
0.0593 1.5405 570 0.9654 0.1499
0.2748 1.6216 600 0.9666 0.1479
0.0804 1.7027 630 0.9661 0.1440
0.0631 1.7838 660 0.9668 0.1427
0.0414 1.8649 690 0.9668 0.1446
0.0507 1.9459 720 0.9676 0.1424
0.0261 2.0270 750 0.9689 0.1542
0.0324 2.1081 780 0.9688 0.1578
0.0291 2.1892 810 0.9681 0.1501
0.0205 2.2703 840 0.9684 0.1578
0.0271 2.3514 870 0.9688 0.1545
0.0185 2.4324 900 0.9684 0.1644
0.0243 2.5135 930 0.9695 0.1571
0.0218 2.5946 960 0.9703 0.1562
0.0229 2.6757 990 0.9701 0.1565
0.028 2.7568 1020 0.9699 0.1583
0.0193 2.8378 1050 0.9703 0.1578
0.0192 2.9189 1080 0.9702 0.1598
0.0231 3.0 1110 0.9702 0.1610

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.1
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model tree for finalform/velocityFoamQwen-30B

  • Adapter of Qwen/Qwen3-Coder-30B-A3B-Instruct (this model; one of 32 adapters)