GuppyLM: 9M Parameter Fish Chat Model
A ~9M parameter LLM trained from scratch that talks like a small fish named Guppy.
This project exists to show that training your own language model is not magic. One Colab notebook, 5 minutes, and you have a working LLM built from scratch.
Example
You> hi guppy
Guppy> hi there. i just found a nice spot near the rock. the temperature feels nice.
You> are you hungry
Guppy> yes. always yes. i will swim to the top right now. i promise to eat all of it.
You> what is the meaning of life
Guppy> food. the answer is always food.
You> tell me a joke
Guppy> what did the fish say when it hit the wall. dam.
You> goodnight guppy
Guppy> ok sleep time. i was following a bubble but now i'll stop. goodnight tank. goodnight water.
Architecture
| | |
|---|---|
| Parameters | 8.7M |
| Type | Vanilla transformer (from scratch) |
| Layers | 6 |
| Hidden dim | 384 |
| Heads | 6 |
| FFN | 768 (ReLU) |
| Vocab | 4,096 (BPE) |
| Max sequence | 128 tokens |
| Norm | LayerNorm |
| Position | Learned embeddings |
| LM head | Weight-tied with embeddings |
No GQA, no RoPE, no SwiGLU, no early exit. As simple as it gets.
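As a sanity check on the 8.7M figure, the parameter count can be reproduced from the table with a few lines of arithmetic. This is a rough sketch, not the project's own code: it assumes bias terms on every linear layer, two LayerNorms per block, and one final LayerNorm, none of which the table states. The function name `count_parameters` and its arguments are illustrative.

```python
def count_parameters(vocab=4096, d_model=384, n_layers=6, d_ff=768,
                     max_seq=128, bias=True, tied_head=True):
    """Estimate the parameter count of a vanilla transformer decoder.

    Assumptions (not confirmed by the README): biases on all linear
    layers, two LayerNorms per block, one final LayerNorm.
    """
    tok_emb = vocab * d_model              # token embedding table
    pos_emb = max_seq * d_model            # learned position embeddings
    # Q, K, V and output projections, each d_model x d_model
    attn = 4 * d_model * d_model + (4 * d_model if bias else 0)
    # Two linear layers: d_model -> d_ff -> d_model
    ffn = 2 * d_model * d_ff + ((d_ff + d_model) if bias else 0)
    norms = 2 * (2 * d_model)              # scale + shift for each LayerNorm
    per_layer = attn + ffn + norms
    final_norm = 2 * d_model
    # A weight-tied LM head reuses the token embedding, so it adds nothing
    lm_head = 0 if tied_head else vocab * d_model
    return tok_emb + pos_emb + n_layers * per_layer + final_norm + lm_head


total = count_parameters()
print(f"{total:,} parameters")   # 8,726,016 parameters
print(384 // 6)                  # 64, the per-head dimension
```

Under these assumptions the total lands at roughly 8.7M, matching the table. Untying the LM head would add another 4,096 × 384 ≈ 1.6M parameters, which is why weight tying matters at this scale.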
Training
- Data: 60K single-turn synthetic conversations across 60 topics
- Steps: 10,000
- Optimizer: AdamW (cosine LR schedule)
- Hardware: T4 GPU (~5 min)
- No system prompt: personality is baked into the weights
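For readers unfamiliar with a cosine LR schedule, here is a minimal sketch of what one looks like over 10,000 steps. Only the step count and the use of a cosine schedule come from the list above; the peak learning rate, the minimum learning rate, and the linear warmup are hypothetical values chosen for illustration, and the function name `cosine_lr` is not from the project.

```python
import math


def cosine_lr(step, total_steps=10_000, peak_lr=3e-4, min_lr=3e-5,
              warmup_steps=200):
    """Learning rate at a given step: linear warmup, then cosine decay.

    peak_lr, min_lr and warmup_steps are illustrative assumptions,
    not values taken from the GuppyLM training run.
    """
    if step < warmup_steps:
        # Ramp linearly up to peak_lr over the warmup steps
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine curve from peak_lr (progress = 0) down to min_lr (progress = 1)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


for s in (0, 200, 5_000, 10_000):
    print(s, f"{cosine_lr(s):.2e}")
```

In a PyTorch training loop, a function like this would typically be applied by setting `param_group["lr"]` on the AdamW optimizer before each step.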
Usage
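from inference import GuppyInference

# Load the trained checkpoint and its BPE tokenizer.
engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
# The model was trained on single-turn conversations, so pass one user message.
r = engine.chat_completion([{'role': 'user', 'content': 'hi guppy'}])
# The reply text is nested under choices -> message -> content.
print(r['choices'][0]['message']['content'])
# hi there. i just found a nice spot near the rock.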
License
MIT