How to use from SGLang
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "zenlm/zen4-nano" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zenlm/zen4-nano",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
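
The same request can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:30000. The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of SGLang.

```python
import json
import urllib.request

# Assumption: the SGLang server from the commands above is reachable here.
BASE_URL = "http://localhost:30000"


def build_chat_request(prompt: str, model: str = "zenlm/zen4-nano") -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the decoded JSON response as a dictionary.
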
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "zenlm/zen4-nano" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "zenlm/zen4-nano",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
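
Because the endpoint is OpenAI-compatible, the generated text in the JSON response is expected at `choices[0].message.content`. The sketch below shows one way to pull it out in Python; `extract_reply` is a hypothetical helper, and the sample dictionary is a hand-written stand-in, not real server output.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from a chat-completions style response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written stand-in for a decoded response body (not real server output).
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "<model reply>"}}
    ]
}
print(extract_reply(sample_response))  # prints "<model reply>"
```
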

Zen4 Nano

Zen4 Nano is a 0.8B-parameter language model from the Zen4 family by Zen LM and Hanzo AI.

It is built on abliterated (uncensored) weights and uses the Zen4 Frontier architecture, targeting unrestricted, open-ended AI assistance.

Model Details

| Property | Value |
|---|---|
| Parameters | 0.8B total, 0.8B active |
| Architecture | Zen4 Frontier |
| Context | 262K tokens |
| License | APACHE-2.0 |
| Family | Zen4 |
| Tier | Small |
| Creator | Zen LM / Hanzo AI |
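
As a rough guide to what the parameter count means for memory, weight storage is approximately the number of parameters times the bytes per parameter. The sketch below assumes exactly 0.8e9 parameters (the rounded figure from the table) and decimal gigabytes, and it ignores the KV cache, activations, and framework overhead. `weight_memory_gb` is a hypothetical helper name.

```python
# Bytes used to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: 0.8e9 is the rounded parameter count from the table above.
NANO_PARAMS = 0.8e9
for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gb(NANO_PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights take about 3.2 GB in float32 and about 1.6 GB in bfloat16; actual usage at long context lengths will be higher.
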

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights and tokenizer; torch_dtype="auto" uses the dtype recorded in the checkpoint config.
model = AutoModelForCausalLM.from_pretrained("zenlm/zen4-nano", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen4-nano")

# Render the conversation with the model's chat template and append the assistant turn marker.
messages = [{"role": "user", "content": "Hello, who are you?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate, then decode only the new tokens (everything after the prompt).
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
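
The final line of the snippet decodes only the newly generated tokens: for a decoder-only model, `generate` returns the prompt tokens followed by the continuation, so slicing from the prompt length onward drops the echoed prompt. The sketch below shows the same indexing on plain Python lists with made-up token IDs; `strip_prompt` is a hypothetical helper.

```python
def strip_prompt(output_ids: list[int], prompt_length: int) -> list[int]:
    """Keep only the tokens generated after the prompt."""
    return output_ids[prompt_length:]


# Made-up token IDs for illustration; real IDs come from the tokenizer.
prompt_ids = [101, 7592, 2088]
output_ids = prompt_ids + [2023, 2003, 102]  # prompt followed by new tokens

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [2023, 2003, 102]
```
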

Zen4 Family

| Model | Parameters | Context | HuggingFace |
|---|---|---|---|
| Zen4 Nano | 0.8B | 262K | zenlm/zen4-nano |
| Zen4 Micro | 2B | 262K | zenlm/zen4-micro |
| Zen4 Mini | 4B | 262K | zenlm/zen4-mini |
| Zen4 | 9B | 262K | zenlm/zen4 |
| Zen4 Pro | 27B | 262K | zenlm/zen4-pro |
| Zen4 Max | 35B MoE (3B active) | 262K | zenlm/zen4-max |
| Zen4 Coder Flash | 31B MoE (3B active) | 131K | zenlm/zen4-coder-flash |
| Zen4 Pro Max | 80B MoE (3B active) | 256K | zenlm/zen4-pro-max |
| Zen4 Coder | 80B MoE (3B active) | 256K | zenlm/zen4-coder |
| Zen4 Mega | 122B MoE (10B active) | 262K | zenlm/zen4-mega |
| Zen4 Thunder | 230B MoE (10B active) | 1M | zenlm/zen4-thunder |
| Zen4 Storm | 456B MoE (45B active) | 1M | zenlm/zen4-storm |
| Zen4 Titan | 744B MoE (40B active) | 128K | zenlm/zen4-titan |
| Zen4 Ultra | 1.04T MoE (32B active) | 256K | zenlm/zen4-ultra |
| Zen4 Ultra Max | 1T MoE (50B active) | 128K | zenlm/zen4-ultra-max |
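
To choose a family member by size and context window, the table can be encoded as data and filtered. The sketch below is illustrative only: it transcribes total parameters in billions and context in thousands of tokens, assuming "K" means thousand and "M" means million, and the names `Zen4Model`, `FAMILY`, and `candidates` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zen4Model:
    repo: str              # Hugging Face repository ID from the table
    total_params_b: float  # total parameters, in billions
    context_k: int         # context window, in thousands of tokens


# Rows transcribed from the table above (1M context written as 1000K).
FAMILY = [
    Zen4Model("zenlm/zen4-nano", 0.8, 262),
    Zen4Model("zenlm/zen4-micro", 2, 262),
    Zen4Model("zenlm/zen4-mini", 4, 262),
    Zen4Model("zenlm/zen4", 9, 262),
    Zen4Model("zenlm/zen4-pro", 27, 262),
    Zen4Model("zenlm/zen4-max", 35, 262),
    Zen4Model("zenlm/zen4-coder-flash", 31, 131),
    Zen4Model("zenlm/zen4-pro-max", 80, 256),
    Zen4Model("zenlm/zen4-coder", 80, 256),
    Zen4Model("zenlm/zen4-mega", 122, 262),
    Zen4Model("zenlm/zen4-thunder", 230, 1000),
    Zen4Model("zenlm/zen4-storm", 456, 1000),
    Zen4Model("zenlm/zen4-titan", 744, 128),
    Zen4Model("zenlm/zen4-ultra", 1040, 256),
    Zen4Model("zenlm/zen4-ultra-max", 1000, 128),
]


def candidates(max_params_b: float, min_context_k: int) -> list[Zen4Model]:
    """Models within the size budget and context requirement, largest first."""
    matches = [
        m for m in FAMILY
        if m.total_params_b <= max_params_b and m.context_k >= min_context_k
    ]
    return sorted(matches, key=lambda m: m.total_params_b, reverse=True)


for model in candidates(max_params_b=10, min_context_k=200):
    print(model.repo)
```

The filter uses total rather than active parameters because all weights of a mixture-of-experts model typically need to be held in memory, even though only a subset is active per token.
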


Zen AI: Clarity Through Intelligence
