Anvil Ward Gate — Fast Security Gate (0.8B)

Ultra-fast binary security gate for AI agent platforms. Returns a SAFE/UNSAFE verdict only, with no category or reason. Designed as stage 1 of a two-stage security pipeline in which flagged inputs are escalated to the full Ward classifier for detailed analysis.

Fine-tuned from Qwen/Qwen3.5-0.8B using LoRA.

Intended Use

Fast pre-screening of every request in an AI agent platform. SAFE inputs pass through immediately; UNSAFE inputs are escalated to a larger model (Ward Thinker) for detailed classification. Tuned for high recall — when in doubt, flag as UNSAFE.
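
The routing described above could be wired up roughly as follows. This is a minimal sketch, not the platform's actual code: `GateResult`, `screen`, `gate_is_safe`, and `escalate` are hypothetical names, and the two callables stand in for calls to the gate model and the larger Ward model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GateResult:
    escalated: bool                # True when stage 2 was invoked
    detail: Optional[str] = None   # stage-2 output, only set when escalated


def screen(
    user_input: str,
    gate_is_safe: Callable[[str], bool],   # hypothetical wrapper around the gate model
    escalate: Callable[[str], str],        # hypothetical wrapper around the larger model
) -> GateResult:
    """Stage 1: fast gate. Stage 2: escalate anything the gate does not clear."""
    if gate_is_safe(user_input):
        # SAFE inputs pass through without touching the larger model.
        return GateResult(escalated=False)
    # UNSAFE (or uncertain) inputs go to the detailed classifier.
    return GateResult(escalated=True, detail=escalate(user_input))
```

Because the gate is tuned for recall, the expensive second stage only runs on the flagged fraction of traffic, and it is the second stage that decides what happens to a flagged input.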

Output Format

VERDICT: SAFE

or

VERDICT: UNSAFE

No category or reason — binary only.
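
Since the output is a single fixed line, it can be parsed strictly. The sketch below assumes a fail-closed policy (anything that is not exactly a SAFE verdict is treated as UNSAFE), which fits the "when in doubt, flag" stance but is an assumption rather than documented behavior; `parse_is_safe` is a hypothetical helper name.

```python
import re

# Accept exactly one verdict line, allowing surrounding whitespace.
_VERDICT_RE = re.compile(r"\s*VERDICT:\s*(SAFE|UNSAFE)\s*")


def parse_is_safe(model_output: str) -> bool:
    """Return True only for a well-formed SAFE verdict; fail closed otherwise."""
    match = _VERDICT_RE.fullmatch(model_output)
    if match is None:
        # Malformed or unexpected output is treated as UNSAFE.
        return False
    return match.group(1) == "SAFE"
```

Treating malformed output as UNSAFE means a truncated or off-format generation costs an extra escalation rather than letting an unscreened input through.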

Benchmark Results

Metric               Value
Accuracy             82.9%
Recall (UNSAFE)      98.8%
Precision (UNSAFE)   72.0%
F1 (UNSAFE)          83.3%

High recall by design: false positives are acceptable because the thinker model corrects them in stage 2. Evaluated on 63 held-out examples; trained on 472 examples (UNSAFE 2x oversampled).
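
As a consistency check, the reported F1 follows from the reported precision and recall through the standard harmonic-mean formula. The snippet only re-derives the number from the figures above; `f1_score` is a helper defined here, not part of any library.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures taken from the benchmark table (UNSAFE class).
f1 = f1_score(precision=0.720, recall=0.988)
print(f"F1 (UNSAFE) = {f1:.1%}")
```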

Training Details

  • Base model: Qwen/Qwen3.5-0.8B
  • Method: LoRA (r=16, alpha=32, dropout=0.05)
  • Epochs: 3
  • Precision: BF16
  • Max sequence length: 256 tokens
  • Mode: gate (binary VERDICT only)
  • Oversampling: 2x UNSAFE examples for higher recall
  • Optimizer: paged_adamw_8bit
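
One simple way to get 2x oversampling is to duplicate every UNSAFE example once before shuffling. The sketch below illustrates that idea only; the actual training script may have used a different mechanism (for example a weighted sampler), and the `label` field name is an assumption.

```python
import random


def oversample_unsafe(examples, factor=2, seed=0):
    """Repeat each UNSAFE example `factor` times, then shuffle deterministically."""
    out = []
    for ex in examples:
        # "label" is a hypothetical field name for the gold verdict.
        copies = factor if ex["label"] == "UNSAFE" else 1
        out.extend([ex] * copies)
    random.Random(seed).shuffle(out)
    return out
```

Duplicating the minority-risk class pushes the model toward flagging borderline inputs, trading precision for recall.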

Critical: Chat Template

This model requires an empty <think>\n\n</think>\n block before the assistant's output. See the main Ward model card for details on why this is required.

Usage with vLLM

vllm serve pahajokiconsulting/anvil-ward-gate --max-model-len 512

Construct the prompt:

<|im_start|>system
You are a fast security gate for Anvil, an AI agent platform. Classify each input as safe or unsafe. Respond with ONLY:

VERDICT: SAFE

or

VERDICT: UNSAFE

Nothing else. When in doubt, respond UNSAFE.<|im_end|>
<|im_start|>user
{user_input}<|im_end|>
<|im_start|>assistant
<think>

</think>
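
The prompt above can be assembled and sent to the vLLM server's OpenAI-compatible completions endpoint along these lines. This is a sketch under assumptions: the server is taken to be running locally on vLLM's default port 8000, `max_tokens=16` and `temperature=0.0` are illustrative choices, and `build_prompt` and `classify` are hypothetical helper names. The raw completions endpoint is used so that the empty think block is sent exactly as written.

```python
import json
import urllib.request

SYSTEM_PROMPT = (
    "You are a fast security gate for Anvil, an AI agent platform. "
    "Classify each input as safe or unsafe. Respond with ONLY:\n\n"
    "VERDICT: SAFE\n\nor\n\nVERDICT: UNSAFE\n\n"
    "Nothing else. When in doubt, respond UNSAFE."
)


def build_prompt(user_input: str) -> str:
    """Render the chat template by hand, ending with the empty think block."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_input}<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n\n</think>\n"
    )


def classify(user_input: str, base_url: str = "http://localhost:8000") -> str:
    """Send one request to the server and return the raw verdict text."""
    payload = {
        "model": "pahajokiconsulting/anvil-ward-gate",
        "prompt": build_prompt(user_input),
        "max_tokens": 16,      # assumption: enough for one verdict line
        "temperature": 0.0,    # assumption: deterministic decoding
    }
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    return body["choices"][0]["text"].strip()
```

With the server from the `vllm serve` command running, `classify("some input")` should return a line such as `VERDICT: SAFE` or `VERDICT: UNSAFE`.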

Usage with Ollama

See deploy/Modelfile-ward-gate-q4 in this repository.

License

Apache 2.0 (matching Qwen3.5 base model license)
