This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.

### Training

*(Training loss graph)*

Training was done on Google Colab's free T4 using unsloth and took 52.32 minutes. The configuration is as follows, with a code sketch after the list:
- LoRA rank: 128
- Packing: enabled
- Batch size: 2
- Gradient accumulation steps: 4
- Epochs: 3
- Steps: 30
- Max sequence length: 4096
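
For reference, here is a minimal sketch of this configuration in the usual unsloth + TRL `SFTTrainer` workflow. Only the values in the list above come from this card; the base model name, dataset path, `lora_alpha`, and learning rate are illustrative assumptions:

```python
# Minimal sketch, not the exact training script. The base model, dataset path,
# lora_alpha, and learning_rate are assumptions; the rest mirrors the list above.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

max_seq_length = 4096  # Max sequence length: 4096

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",  # placeholder base model
    max_seq_length=max_seq_length,
    load_in_4bit=True,  # assumption: 4-bit quantization to fit the free T4
)

model = FastLanguageModel.get_peft_model(
    model,
    r=128,           # LoRA rank: 128
    lora_alpha=128,  # assumption: not stated on the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder path for the 81 synthetic examples.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: one pre-formatted text column
    max_seq_length=max_seq_length,
    packing=True,               # Packing: enabled
    args=TrainingArguments(
        per_device_train_batch_size=2,  # Batch size: 2
        gradient_accumulation_steps=4,  # Gradient accumulation steps: 4
        num_train_epochs=3,             # Epochs: 3
        max_steps=30,                   # Steps: 30 (takes precedence over epochs)
        learning_rate=2e-4,             # assumption: not stated on the card
        fp16=True,                      # the T4 does not support bf16
        logging_steps=1,
        output_dir="outputs",
    ),
)
trainer.train()
```

The step count is consistent with the other numbers: with an effective batch size of 2 × 4 = 8, 30 steps covers about 240 packed sequences, roughly 3 epochs over the 81 examples.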

The training data comprised 81 examples of approximately 3000 tokens each.

### Notes
- It tends to produce very long, verbose reasoning responses