GuppyLM

GuppyLM — 9M Parameter Fish Chat Model

A ~9M parameter LLM trained from scratch that talks like a small fish named Guppy.

This project exists to show that training your own language model is not magic. One Colab notebook, 5 minutes, and you have a working LLM built from scratch.

Example

You> hi guppy
Guppy> hi there. i just found a nice spot near the rock. the temperature feels nice.

You> are you hungry
Guppy> yes. always yes. i will swim to the top right now. i promise to eat all of it.

You> what is the meaning of life
Guppy> food. the answer is always food.

You> tell me a joke
Guppy> what did the fish say when it hit the wall. dam.

You> goodnight guppy
Guppy> ok sleep time. i was following a bubble but now i'll stop. goodnight tank. goodnight water.

Architecture


Parameters	8.7M
Type	Vanilla transformer (from scratch)
Layers	6
Hidden dim	384
Heads	6
FFN	768 (ReLU)
Vocab	4,096 (BPE)
Max sequence	128 tokens
Norm	LayerNorm
Position	Learned embeddings
LM head	Weight-tied with embeddings

No GQA, no RoPE, no SwiGLU, no early exit. As simple as it gets.

Training

Data: 60K single-turn synthetic conversations across 60 topics
Steps: 10,000
Optimizer: AdamW (cosine LR schedule)
Hardware: T4 GPU (~5 min)
No system prompt — personality is baked into the weights

Usage

from inference import GuppyInference

engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
r = engine.chat_completion([{'role': 'user', 'content': 'hi guppy'}])
print(r['choices'][0]['message']['content'])
# hi there. i just found a nice spot near the rock.

License

MIT

Downloads last month: 2,096