# SpoomplesMaxx Base - Gemma 3 27B
SpoomplesMaxx Base is a continued pre-training (CPT) run on top of unsloth/gemma-3-27b-pt, targeting improved creative writing, character voice, narrative prose, and multilingual fluency in English and Brazilian Portuguese.
This is the final release checkpoint of the CPT stage. It is a base model, not instruction-tuned. Downstream SFT and DPO stages are planned.
Part of the SpoomplesMaxx project, a hobbyist ML research effort focused on creative writing and roleplay capability in open base models.
## Model Details
| Property | Value |
|---|---|
| Base model | google/gemma-3-27b-pt |
| Architecture | Gemma 3 27B |
| Parameters | ~27B |
| Training stage | Continued Pre-Training (CPT) |
| Training framework | TRL + Unsloth |
| Training type | LoRA on text layers only (vision components frozen) |
| Languages | English (en), Brazilian Portuguese (pt) |
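The training type above (LoRA on text layers only, vision components frozen) can be sketched in plain PyTorch. The module names below are illustrative stand-ins, not the real Gemma 3 module tree, and the freezing helper is hypothetical:

```python
import torch.nn as nn

# Toy stand-in for a multimodal checkpoint split into a vision tower and a
# text stack; the real Gemma 3 class hierarchy is more elaborate.
class ToyMultimodalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(8, 8)    # stands in for the vision encoder
        self.language_model = nn.Linear(8, 8)  # stands in for the text layers

def freeze_vision_components(model: nn.Module) -> None:
    """Freeze every parameter whose name marks it as part of the vision tower."""
    for name, param in model.named_parameters():
        if name.startswith("vision_tower"):
            param.requires_grad = False

model = ToyMultimodalModel()
freeze_vision_components(model)

# After freezing, only language-model parameters remain trainable, which is
# where LoRA adapters would then be attached (e.g. via peft).
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```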
## Why Gemma 3 27B?
After earlier CPT runs on GLM-4-32B, Qwen3-14B, and SmolLM3, which surfaced issues with repetition and inconsistent cultural knowledge absorption, Gemma 3 27B PT was selected for its strong out-of-the-box creative writing quality and multilingual coverage.
## Uses

### Direct Use
This model is suitable for:
- Creative writing and prose generation
- Character roleplay and collaborative fiction
- Multilingual text generation (EN/PT)
- Base for downstream SFT/DPO fine-tuning
As a base model, it does not follow instructions and has no chat template. Use it with a completion interface or apply your own prompt structure.
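A minimal completion-style sketch, assuming the transformers library. The `build_completion_prompt` helper and the sampling settings are our own illustrations, not part of this release; actually running `generate_continuation` requires enough memory for a 27B checkpoint:

```python
def build_completion_prompt(opening_line: str) -> str:
    """A base model continues raw text: the 'prompt' is just the opening
    of the document you want continued, with no chat template applied."""
    return opening_line.rstrip() + " "

def generate_continuation(prompt: str, max_new_tokens: int = 200) -> str:
    """Load the checkpoint and continue the prompt (needs a large GPU)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("aimeri/spoomplesmaxx-base-gemma3-27b")
    model = AutoModelForCausalLM.from_pretrained(
        "aimeri/spoomplesmaxx-base-gemma3-27b", device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.9
    )
    return tok.decode(out[0], skip_special_tokens=True)

prompt = build_completion_prompt(
    "The lighthouse keeper had not spoken aloud in three years."
)
```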
### Downstream Use
The intended pipeline is CPT (this model) → SFT → DPO. The SFT and DPO stages are under active development and will be released separately.
### Out-of-Scope Use
- Drop-in replacement for instruction-following or chat models (no system prompt, no chat template)
- Production deployment without further alignment
- Tasks requiring factual grounding or safety constraints; this is an uncensored creative base model
## Training Details

### Training Data
The training corpus combines two sources, concatenated and shuffled (seed 1985) before a 99.8% / 0.2% train/eval split:
| Source | Rows | Description |
|---|---|---|
| aimeri/spoomplesmaxx-cpt-raw-small | 91,657 | Broad creative writing and prose CPT corpus |
| `characters_small.jsonl` | 10,000 | Curated character-focused entries |
| **Total** | **101,657** | |
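The concatenate → shuffle → split step can be sketched in a few lines. The toy corpus and field names here are illustrative; only the seed (1985) and the 99.8/0.2 ratio come from the actual setup:

```python
import random

def shuffle_and_split(docs, seed=1985, eval_fraction=0.002):
    """Deterministically shuffle a combined corpus, then carve off a small
    eval slice (0.2% by default) and return (train, eval)."""
    docs = list(docs)
    random.Random(seed).shuffle(docs)   # seeded shuffle for reproducibility
    n_eval = max(1, round(len(docs) * eval_fraction))
    return docs[n_eval:], docs[:n_eval]

# Toy stand-in for the two concatenated sources (101,657 rows in the real run).
corpus = [f"doc-{i}" for i in range(1000)]
train, eval_set = shuffle_and_split(corpus)
```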
Tokenization appends EOS at document boundaries. Documents are then concatenated and chunked into fixed-length sequences of 16,384 tokens, with the trailing remainder dropped. This yields the final packed training sequences.
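The packing scheme above can be sketched as follows. The token IDs and the EOS id are toy values; the real run uses the Gemma tokenizer and 16,384-token chunks:

```python
EOS = 0  # placeholder; the real EOS id comes from the Gemma tokenizer

def pack_documents(docs, chunk_len):
    """Append EOS at each document boundary, concatenate into one token
    stream, chunk into fixed-length sequences, and drop the remainder."""
    stream = []
    for doc in docs:
        stream.extend(doc)
        stream.append(EOS)               # EOS marks the document boundary
    n_chunks = len(stream) // chunk_len  # trailing partial chunk is dropped
    return [stream[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

# Three toy "tokenized documents" packed into length-4 sequences.
docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
chunks = pack_documents(docs, chunk_len=4)
```

Note that documents can straddle chunk boundaries; the EOS token is what tells the model where one document ends and the next begins.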
## Evaluation

No formal benchmarks have been run on this model. Evaluation is currently qualitative: creative writing samples, prose coherence, and character voice consistency across English and Portuguese.

If you run benchmarks on this model, please open a discussion; contributions are welcome.
## Project History
SpoomplesMaxx has gone through several base model iterations:
| Run | Base | Status |
|---|---|---|
| v1 | SmolLM3 3B | Experimental, archived |
| v2 | GLM-4-32B | Repetition issues, archived |
| v3 | Qwen3-14B | Released: aimeri/spoomplesmaxx-base-qwen3-14b |
| v4 (this) | Gemma 3 27B | Current |
## Model Card Authors
- aimeri
## Model Card Contact
Open a discussion on the repository page.