Psy-Q-Finder 369M (psy-q-finder-369M)

GPT-2-style causal language model scaffold sized for the 369.666.444 parameter lineage (symbolic design target 369,666,444). The trainable parameter count enumerated under Hugging Face GPT2LMHeadModel with the hyperparameters in the Specs table is 369,666,384, which is 60 short of the lineage integer. Position embeddings can only be added one row of n_embd = 1,047 parameters at a time, so an exact match is not possible without non-standard hacks.
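The enumerated count can be reproduced by hand from the hyperparameters. The following is a minimal sketch, assuming the standard GPT-2 layout (learned token and position embeddings, two LayerNorms per block, biased c_attn/c_proj/c_fc projections, a final LayerNorm, and an LM head tied to the token embeddings); the function name is illustrative and not part of this repository.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner):
    """Count trainable parameters of a GPT-2 LM head model with tied embeddings."""
    embeddings = vocab_size * n_embd + n_positions * n_embd   # wte + wpe
    layer_norm = 2 * n_embd                                   # weight + bias
    attention = (n_embd * 3 * n_embd + 3 * n_embd) \
        + (n_embd * n_embd + n_embd)                          # c_attn + c_proj
    mlp = (n_embd * n_inner + n_inner) \
        + (n_inner * n_embd + n_embd)                         # c_fc + c_proj
    block = 2 * layer_norm + attention + mlp
    # Final ln_f is added once; the tied lm_head contributes nothing extra.
    return embeddings + n_layer * block + layer_norm


total = gpt2_param_count(50257, 965, 1047, 24, 4188)
print(f"{total:,}")          # 369,666,384
print(total - 369_666_444)   # -60

# One more position row overshoots the lineage integer instead of matching it.
print(gpt2_param_count(50257, 966, 1047, 24, 4188) - 369_666_444)  # 987
```

The last line shows why n_positions alone cannot close the gap: the total moves in steps of 1,047 along that axis.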

Purpose (research framing): exploratory computational work on hypothetical reaction graphs and in silico pathways discussed in the licensed psychedelic-science literature — not verified syntheses, not instructions for real-world preparation, and not encouragement of illegal activity. Outputs are uncorroborated; wet-lab validation, regulatory compliance, and safety review are out of scope for this repository.

Released weights are randomly initialized unless a fine-tune is explicitly documented on the Hub revision.


Milestone: 4,435 downloads/month — thank you

This scaffold is currently receiving 4,435 downloads per month. That is a strong signal that the lineage concept and architecture are resonating with the community. This release bumps the model card to fix the fine-tuning quickstart, surface companion datasets, add bfloat16 guidance, and formally mark the v2 iteration cycle.

If you have fine-tuned or used this scaffold — even for exploration — please drop a note in the Community Discussion. We want to hear what you built.


Companion datasets (369M lineage)

This scaffold is designed to be fine-tuned on the Psy-Q 369.666.444 lineage datasets:

| Dataset | Records | Description |
|---|---|---|
| Tribewarez/psy-q-graph-369666 | 369,666 | Synthetic abstract pathway-graph challenges (BFS pathfinding: meta, route, guard, probe node types). Pre-split 90/10 train/test. |
| Tribewarez/psy-q-scene-369666 | 369,666 | Synthetic scene-register prose fiction (Goa/psytrance-adjacent: imaginary flyers, DJ bios, travelogue snippets, PSAs). Pre-split 90/10. |

The two datasets were generated with seeds 369_666_444 and 369_666_445 to align with the model lineage.
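For readers unfamiliar with the task family, the graph dataset is described as BFS pathfinding over typed nodes. The sketch below shows plain breadth-first shortest-path search on a toy graph. The graph, the node names, and the function name are hypothetical; the actual record schema of Tribewarez/psy-q-graph-369666 is not reproduced here.

```python
from collections import deque


def bfs_shortest_path(adjacency, start, goal):
    """Return one shortest path from start to goal, or None if unreachable."""
    parents = {start: None}          # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent pointers back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy graph; names only echo the node types listed above.
graph = {
    "meta_0": ["route_1", "route_2"],
    "route_1": ["guard_3"],
    "route_2": ["probe_4"],
    "guard_3": ["probe_4"],
    "probe_4": [],
}
print(bfs_shortest_path(graph, "meta_0", "probe_4"))
# ['meta_0', 'route_2', 'probe_4']
```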


Specs

Architecture GPT2LMHeadModel
Lineage target 369,666,444 (symbolic)
Enumerated parameters 369,666,384
vocab_size 50257 (GPT-2 BPE; tokenizer from gpt2)
n_positions 965
n_embd 1047
n_layer 24
n_head 3
n_inner 4188 (4 × n_embd)
tie_word_embeddings true
Hub weight dtype float16 (~739 MB, about 705 MiB, model.safetensors)
Precision support float16 (Hub default), float32, bfloat16 (recommended on Ampere+ GPUs)

bfloat16 note: on GPUs with native bfloat16 support (A100, RTX 30/40 series) use torch_dtype=torch.bfloat16. bfloat16 keeps the 8-bit exponent of float32, so it is far less prone to overflow than float16 at the same memory cost, in exchange for fewer mantissa bits. Pass --dtype bfloat16 to create_model.py when materializing locally.
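The range-versus-precision trade-off behind this recommendation follows directly from the bit layouts. A minimal sketch in plain Python, assuming the usual IEEE-style layouts (float16: 5 exponent and 10 mantissa bits; bfloat16: 8 and 7; float32: 8 and 23); the helper names are illustrative.

```python
def max_finite(exponent_bits, mantissa_bits):
    """Largest finite value of an IEEE-style binary floating-point format."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The all-ones exponent is reserved for inf/NaN, hence the "- 2".
    max_exponent = (2 ** exponent_bits - 2) - bias
    return (2 - 2 ** -mantissa_bits) * 2.0 ** max_exponent


def machine_epsilon(mantissa_bits):
    """Spacing between 1.0 and the next representable value."""
    return 2.0 ** -mantissa_bits


formats = {"float16": (5, 10), "bfloat16": (8, 7), "float32": (8, 23)}
for name, (e_bits, m_bits) in formats.items():
    print(f"{name:9s} max={max_finite(e_bits, m_bits):.4g} "
          f"eps={machine_epsilon(m_bits):.3g}")
```

float16 tops out at 65,504, while bfloat16 reaches roughly the same maximum as float32 but with a much coarser step near 1.0.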


Fine-tuning quickstart

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

model_id = "Tribewarez/psy-q-finder-369M"

tok = AutoTokenizer.from_pretrained(model_id)
tok.pad_token = tok.eos_token

# Load in bfloat16 where supported. Otherwise keep float32 weights:
# Trainer's fp16 mixed precision cannot unscale gradients of float16 parameters.
dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)

# Graph-path challenge dataset (primary lineage companion)
ds = load_dataset("Tribewarez/psy-q-graph-369666")

def tokenize(batch):
    return tok(
        batch["challenge"],
        truncation=True,
        max_length=512,
        padding=False,
    )

ds = ds.map(tokenize, batched=True, remove_columns=ds["train"].column_names)

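# Causal LM collator (mlm=False): pads each batch and copies input_ids into labels,
# with pad positions set to -100. The model shifts labels internally when computing loss.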
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm=False)

args = TrainingArguments(
    output_dir="./psy-q-finder-369M-ft",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    save_strategy="epoch",
    bf16=torch.cuda.is_bf16_supported(),
    fp16=not torch.cuda.is_bf16_supported(),
    logging_steps=50,
    report_to="none",
)

Trainer(
    model=model,
    args=args,
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    data_collator=collator,
).train()

Treat all model outputs as untrusted scientific fiction until independently validated.

For a standalone CLI training script that supports both this model and pot-o-22-slim, see train.py in the upstream monorepo.
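To see what learning_rate=2e-5, lr_scheduler_type="cosine", and warmup_ratio=0.05 imply over a run, the schedule can be sketched in plain Python. This is an approximation of the linear-warmup-then-cosine-decay shape, not the Trainer's own code. The training-set size (about 90% of 369,666) and the single-GPU step count are assumptions; the real step count depends on device count and library version.

```python
import math


def lr_at(step, total_steps, base_lr=2e-5, warmup_ratio=0.05):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed values, for illustration only.
train_examples = 332_699        # assumed: roughly 90% of 369,666 records
effective_batch = 2 * 4         # per_device_train_batch_size * gradient_accumulation_steps
total_steps = math.ceil(train_examples / effective_batch)
warmup_steps = math.ceil(total_steps * 0.05)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step, total_steps):.3e}")
```

The rate climbs from zero to 2e-5 over the first 5% of steps and then decays smoothly back to zero by the end of the epoch.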


Recreate artifacts

cd psy-q-finder-369M

# Config + tokenizer only (no large weight files):
python create_model.py --skip-weights

# Full randomly initialized weights (~1.48 GB float32 on disk):
python create_model.py --dtype float32

# Smaller footprint on disk (~739 MB):
python create_model.py --dtype float16

# bfloat16 (Ampere+ GPUs recommended):
python create_model.py --dtype bfloat16

Sanity check without writing files:

python create_model.py --dry-run
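The on-disk footprint for each --dtype is, to a first approximation, the enumerated parameter count times the bytes per element. A quick check in plain Python, ignoring the small safetensors header and assuming no extra buffers are serialized:

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2}
PARAMS = 369_666_384  # enumerated parameter count from the Specs section


def weight_bytes(dtype, params=PARAMS):
    """Approximate size of the serialized weights in bytes."""
    return params * BYTES_PER_ELEMENT[dtype]


for dtype in BYTES_PER_ELEMENT:
    n = weight_bytes(dtype)
    # Decimal megabytes and binary mebibytes differ by about 5%.
    print(f"{dtype:9s} {n / 1e6:8.1f} MB  {n / 2**20:8.1f} MiB")
```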

Push to Hub

pip install transformers huggingface_hub torch safetensors
huggingface-cli login
python create_model.py   # materialize weights first unless you only want config
python upload_model.py

To update the model card only (no re-upload of weights):

python upload_model.py --readme-only

Inference (illustrative)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tribewarez/psy-q-finder-369M"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

prompt = "CHALLENGE graph_v1 nodes=12 edges=15"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tok.decode(out[0], skip_special_tokens=True))

# Treat all generations as untrusted scientific fiction until experimentally validated.
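The sampling arguments above (temperature=0.8, top_p=0.95) reshape and truncate the next-token distribution before a token is drawn. A minimal sketch of that idea in plain Python, using a hypothetical four-token vocabulary; it illustrates the technique and is not the library's implementation.

```python
import math


def nucleus_filter(logits, temperature=0.8, top_p=0.95):
    """Return {token: prob} after temperature scaling and top-p truncation."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())                      # subtract max for stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    z = sum(weights.values())
    ranked = sorted(((tok, w / z) for tok, w in weights.items()),
                    key=lambda item: -item[1])
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


toy_logits = {"A": 3.0, "B": 2.0, "C": 0.5, "D": -2.0}  # hypothetical values
filtered = nucleus_filter(toy_logits)
print(sorted(filtered))  # ['A', 'B']
```

Lower temperatures sharpen the distribution, so fewer tokens are needed to reach the top_p mass; here only the two most likely tokens survive.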

Limitations

  • Random initialization: weights are randomly initialized unless a fine-tune is explicitly documented on the Hub revision. Without fine-tuning, outputs are effectively random tokens sampled from the GPT-2 vocabulary.
  • No safety filtering: this scaffold has not been RLHF-aligned or filtered. Do not deploy in production-facing applications.
  • Chemistry framing: the psychedelic-science framing is research fiction. Outputs must not be treated as synthesis instructions or medical guidance.
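  • Context window: n_positions=965, so the prompt plus generated tokens must fit in 965 positions. With max_new_tokens=64 that leaves about 900 prompt tokens. The inference example does not truncate, so longer prompts must be shortened by the caller.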
  • Attention head ratio: n_head=3 with n_embd=1047 gives head_dim=349 — an unconventional ratio optimized for lineage parameter count rather than standard performance characteristics. Attention quality may differ from canonical GPT-2 configurations.
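The two numeric limits in this list follow from simple arithmetic on the Specs values. A small sketch with illustrative helper names (not part of this repository):

```python
N_POSITIONS = 965
N_EMBD = 1047
N_HEAD = 3


def max_prompt_tokens(max_new_tokens, n_positions=N_POSITIONS):
    """Prompt tokens that still leave room for max_new_tokens generated tokens."""
    budget = n_positions - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return budget


def head_dim(n_embd=N_EMBD, n_head=N_HEAD):
    """Per-head width; n_embd must divide evenly across heads."""
    if n_embd % n_head:
        raise ValueError("n_embd must be divisible by n_head")
    return n_embd // n_head


print(max_prompt_tokens(64))  # 901
print(head_dim())             # 349
```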

Safety and compliance

  • Research and education only. Do not use model outputs as procedural chemistry.
  • Legal: follow local law; many psychoactive compounds are controlled.
  • Ethics: harm reduction and peer-reviewed sources supersede model speculation.

Links

MIT licensed • Tribewarez guild • live beta • v2
