Update README.md

README.md CHANGED:

@@ -20,8 +20,10 @@ tags:
 - chat
 ---
 
+⠀
 
-
+
+An earlier checkpoint of an unreleased (for now) model, using the same configuration as [Tor-8B]() but on Gemma rather than Nemo-8B. A finetune made for creative-writing and roleplay tasks, trained on top of the base Gemma2 9B model for 4 epochs; the 4-epoch checkpoint will become a future unreleased model, and the 2-epoch checkpoint is my own personal release. This model aims to have good prose and writing while not being as `Suggestive` as Magnum models usually are, along with keeping some of the intelligence that was nice to have with the Gemma2 family.
 
 # Quants
 
@@ -101,7 +103,7 @@ load_in_4bit: false
 strict: false
 
 datasets:
-  - path:
+  - path: [PRIVATE CLAUDE LOG FILTER]
     type: sharegpt
     conversation: chatml
   - path: NewEden/Claude-Instruct-5K
@@ -203,11 +205,11 @@ special_tokens:
 - [Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned](https://huggingface.co/datasets/Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned)
 - [anthracite-org/kalo_opus_misc_240827](https://huggingface.co/datasets/anthracite-org/kalo_opus_misc_240827)
 - [anthracite-org/kalo_misc_part2](https://huggingface.co/datasets/anthracite-org/kalo_misc_part2)
-- [
+- [Private re-Filter of Claude Logs](https://google.com)
 
 
 ## Training
-The training was done for
+The training was done for 4 epochs. We used 8 x [H100s](https://www.nvidia.com/en-us/data-center/h100/) GPUs graciously provided by [Lucy Knada](https://huggingface.co/lucyknada) for the full-parameter fine-tuning of the model.
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
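For context on the config excerpt this commit edits: the `datasets:` stanza follows Axolotl's dataset schema, where each entry names a dataset path, a loader `type`, and (for `sharegpt`-style data) a `conversation` template. A minimal sketch of such a stanza is below; only the `NewEden/Claude-Instruct-5K` path, the `sharegpt`/`chatml` settings, and the 4-epoch count come from the commit — the second dataset entry is a hypothetical placeholder, not the model's actual private data.

```yaml
# Sketch of an Axolotl datasets stanza (illustrative, not the full training config).
# `type: sharegpt` tells Axolotl each row is a ShareGPT-style conversation list;
# `conversation: chatml` renders turns with the ChatML template
# (<|im_start|>role ... <|im_end|>).
datasets:
  - path: NewEden/Claude-Instruct-5K    # dataset named in the commit
    type: sharegpt
    conversation: chatml
  - path: your-org/your-sharegpt-set    # hypothetical placeholder entry
    type: sharegpt
    conversation: chatml
num_epochs: 4                           # the card states training ran for 4 epochs
```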