# factory_qwen_results

This model is a fine-tuned version of Qwen/Qwen3-Coder-30B-A3B-Instruct on the train dataset. It achieves the following results on the evaluation set:
- Loss: 0.1424
- Accuracy: 0.9676
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0004
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 3
- total_train_batch_size: 12
- total_eval_batch_size: 8
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.08
- num_epochs: 3.0
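The per-device and total batch sizes above are related through the device count and the accumulation steps. A minimal sketch of that arithmetic (variable names are illustrative, not taken from the training script):

```python
# How the total batch sizes in the list above are derived.
per_device_train_batch_size = 1
per_device_eval_batch_size = 2
num_devices = 4
gradient_accumulation_steps = 3

# Gradients are accumulated over 3 micro-batches on each of the 4 GPUs,
# so one optimizer step effectively sees 1 * 4 * 3 = 12 training examples.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does no accumulation: 2 examples per GPU across 4 GPUs.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 12 8
```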
### Training results
| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|---|---|---|---|---|
| 0.2607 | 0.0811 | 30 | 0.9369 | 0.2531 |
| 0.2818 | 0.1622 | 60 | 0.9464 | 0.2187 |
| 0.193 | 0.2432 | 90 | 0.9497 | 0.2058 |
| 0.1835 | 0.3243 | 120 | 0.9512 | 0.1971 |
| 0.1586 | 0.4054 | 150 | 0.9528 | 0.1891 |
| 0.141 | 0.4865 | 180 | 0.9552 | 0.1821 |
| 0.1359 | 0.5676 | 210 | 0.9561 | 0.1726 |
| 0.1038 | 0.6486 | 240 | 0.9574 | 0.1720 |
| 0.1784 | 0.7297 | 270 | 0.9578 | 0.1632 |
| 0.3386 | 0.8108 | 300 | 0.9590 | 0.1573 |
| 0.1101 | 0.8919 | 330 | 0.9609 | 0.1555 |
| 0.1123 | 0.9730 | 360 | 0.9619 | 0.1513 |
| 0.0956 | 1.0541 | 390 | 0.9618 | 0.1552 |
| 0.0802 | 1.1351 | 420 | 0.9634 | 0.1525 |
| 0.0671 | 1.2162 | 450 | 0.9634 | 0.1519 |
| 0.0738 | 1.2973 | 480 | 0.9639 | 0.1493 |
| 0.0622 | 1.3784 | 510 | 0.9639 | 0.1477 |
| 0.063 | 1.4595 | 540 | 0.9658 | 0.1435 |
| 0.0593 | 1.5405 | 570 | 0.9654 | 0.1499 |
| 0.2748 | 1.6216 | 600 | 0.9666 | 0.1479 |
| 0.0804 | 1.7027 | 630 | 0.9661 | 0.1440 |
| 0.0631 | 1.7838 | 660 | 0.9668 | 0.1427 |
| 0.0414 | 1.8649 | 690 | 0.9668 | 0.1446 |
| 0.0507 | 1.9459 | 720 | 0.9676 | 0.1424 |
| 0.0261 | 2.0270 | 750 | 0.9689 | 0.1542 |
| 0.0324 | 2.1081 | 780 | 0.9688 | 0.1578 |
| 0.0291 | 2.1892 | 810 | 0.9681 | 0.1501 |
| 0.0205 | 2.2703 | 840 | 0.9684 | 0.1578 |
| 0.0271 | 2.3514 | 870 | 0.9688 | 0.1545 |
| 0.0185 | 2.4324 | 900 | 0.9684 | 0.1644 |
| 0.0243 | 2.5135 | 930 | 0.9695 | 0.1571 |
| 0.0218 | 2.5946 | 960 | 0.9703 | 0.1562 |
| 0.0229 | 2.6757 | 990 | 0.9701 | 0.1565 |
| 0.028 | 2.7568 | 1020 | 0.9699 | 0.1583 |
| 0.0193 | 2.8378 | 1050 | 0.9703 | 0.1578 |
| 0.0192 | 2.9189 | 1080 | 0.9702 | 0.1598 |
| 0.0231 | 3.0 | 1110 | 0.9702 | 0.1610 |
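The Epoch column advances in increments of roughly 0.0811 because evaluation runs every 30 optimizer steps, and 1110 total steps over 3 epochs implies 370 steps per epoch. A quick sanity check of that reading of the table:

```python
# Sanity-check the Epoch column: 1110 total steps over 3 epochs
# implies 370 optimizer steps per epoch (an inference from the table,
# not a value stated in the training config).
total_steps = 1110
num_epochs = 3
steps_per_epoch = total_steps / num_epochs  # 370.0

eval_every = 30  # evaluation interval, read off the Step column
epoch_at_first_eval = eval_every / steps_per_epoch

print(round(epoch_at_first_eval, 4))  # 0.0811, matching the first row
```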
### Framework versions

- PEFT 0.17.1
- Transformers 4.57.1
- PyTorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
## Model tree for finalform/velocityFoamQwen-30B

Base model: Qwen/Qwen3-Coder-30B-A3B-Instruct
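Because this repository holds a PEFT adapter rather than full model weights, using it typically means loading the base checkpoint and attaching the adapter on top. A hedged sketch using the `transformers` and `peft` APIs (the repo ids come from this card; the dtype and device settings are illustrative assumptions, and imports are deferred so the sketch reads without the libraries installed):

```python
def load_velocity_foam_qwen(adapter_id="finalform/velocityFoamQwen-30B"):
    """Load the base Qwen3-Coder model and attach this card's PEFT adapter.

    device_map and dtype below are illustrative choices, not values
    recorded in the card.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # PeftModel wraps the base model and applies the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```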