SpoomplesMaxx Base – Gemma 3 27B

SpoomplesMaxx Base is a continued pre-training (CPT) run on top of unsloth/gemma-3-27b-pt, targeting improved creative writing, character voice, narrative prose, and multilingual fluency in English and Brazilian Portuguese.

This is the final release checkpoint of the CPT stage. It is a base model and is not instruction-tuned. Downstream SFT and DPO stages are planned.

Part of the SpoomplesMaxx project, a hobbyist ML research effort focused on creative writing and roleplay capability in open base models.


Model Details

| Property | Value |
| --- | --- |
| Base model | google/gemma-3-27b-pt |
| Architecture | Gemma 3 27B |
| Parameters | ~27B |
| Training stage | Continued Pre-Training (CPT) |
| Training framework | TRL + Unsloth |
| Training type | LoRA on text layers only (vision components frozen) |
| Languages | English (en), Brazilian Portuguese (pt) |

Why Gemma 3 27B?

After earlier CPT runs on GLM-4-32B, Qwen3-14B, and SmolLM3 surfaced issues with repetition and inconsistent cultural knowledge absorption, Gemma 3 27B PT was selected for its strong out-of-the-box creative writing quality and multilingual coverage.


Uses

Direct Use

This model is suitable for:

  • Creative writing and prose generation
  • Character roleplay and collaborative fiction
  • Multilingual text generation (EN/PT)
  • Base for downstream SFT/DPO fine-tuning

As a base model, it does not follow instructions and has no chat template. Use it with a completion interface or apply your own prompt structure.
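For example, a completion-style call with the Hugging Face transformers library might look like the sketch below. The repo id matches this card, but the prompt helper, sampling settings, and dtype/device choices are illustrative assumptions, not the project's actual inference setup.

```python
# Completion-style usage sketch for a base model with no chat template.

def build_prompt(title: str, opening: str) -> str:
    # A base model simply continues raw text, so any prompt structure
    # (headings, story openings, character sheets) is up to you.
    return f"# {title}\n\n{opening}"

if __name__ == "__main__":
    # Heavy imports and the 27B weight download stay behind the guard.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "aimeri/spoomplesmaxx-base-gemma3-27b"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    prompt = build_prompt(
        "The Lighthouse",
        "The keeper had not spoken aloud in three winters.",
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200, do_sample=True)
    print(tok.decode(out[0], skip_special_tokens=True))
```

Since there is no chat template, whatever structure you put in the prompt is the structure the model continues.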

Downstream Use

The intended pipeline is CPT (this model) → SFT → DPO. The SFT and DPO stages are under active development and will be released separately.

Out-of-Scope Use

  • Drop-in replacement for instruction-following or chat models (no system prompt, no chat template)
  • Production deployment without further alignment
  • Tasks requiring factual grounding or safety constraints (this is an uncensored creative base)

Training Details

Training Data

The training corpus combines two sources, concatenated and shuffled (seed 1985) before a 99.8% / 0.2% train/eval split:

| Source | Rows | Description |
| --- | --- | --- |
| aimeri/spoomplesmaxx-cpt-raw-small | 91,657 | Broad creative writing and prose CPT corpus |
| characters_small.jsonl | 10,000 | Curated character-focused entries |
| **Total** | 101,657 | |
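The shuffle-then-split step can be sketched in plain Python. The seed and row count come from this card; the exact split mechanics of the actual training script are an assumption.

```python
import random

def shuffle_and_split(rows, seed=1985, eval_frac=0.002):
    # Shuffle the concatenated corpus with a fixed seed, then hold out
    # 0.2% of rows for eval (the 99.8% / 0.2% split described above).
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_frac)
    return shuffled[n_eval:], shuffled[:n_eval]

train, evaluation = shuffle_and_split(range(101_657))
print(len(train), len(evaluation))  # 101454 203
```

Fixing the seed makes the split reproducible across runs, so the eval set stays disjoint from training data if the corpus is re-processed.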

Tokenization appends EOS at document boundaries. Documents are then concatenated and chunked into fixed-length sequences of 16,384 tokens, with the trailing remainder dropped. This yields the final packed training sequences.
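A minimal sketch of that packing step, on already-tokenized documents (function and variable names are illustrative, not the project's actual code):

```python
def pack_sequences(token_docs, eos_id, seq_len=16_384):
    # Append EOS at each document boundary, concatenate everything into
    # one token stream, then chunk into fixed-length sequences, dropping
    # the trailing remainder that does not fill a full sequence.
    stream = []
    for doc in token_docs:
        stream.extend(doc)
        stream.append(eos_id)
    n_full = len(stream) // seq_len
    return [stream[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]

# Toy example with seq_len=4: two docs of 3 and 2 tokens plus EOS markers
# form a 7-token stream, which packs into one full sequence of 4.
packed = pack_sequences([[1, 2, 3], [4, 5]], eos_id=0, seq_len=4)
print(packed)  # [[1, 2, 3, 0]]
```

Packing wastes at most one partial sequence per epoch while keeping every training sequence exactly 16,384 tokens, which avoids padding entirely.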


Evaluation

No formal benchmarks have been run on this model. Evaluation is currently qualitative: creative writing samples, prose coherence, and character voice consistency across English and Portuguese.

If you run benchmarks on this model, please open a discussion; contributions are welcome.


Project History

SpoomplesMaxx has gone through several base model iterations:

| Run | Base | Status |
| --- | --- | --- |
| v1 | SmolLM3 3B | Experimental, archived |
| v2 | GLM-4-32B | Repetition issues, archived |
| v3 | Qwen3-14B | Released as aimeri/spoomplesmaxx-base-qwen3-14b |
| v4 (this) | Gemma 3 27B | Current |

Model Card Authors

  • aimeri

Model Card Contact

Open a discussion on the repository page.
