Prompt Guard 2 Multitask ONNX

This repository contains an ONNX export of a multitask security classifier built from meta-llama/Llama-Prompt-Guard-2-86M.

A single multitask adapter was fine-tuned on two security-focused tasks. The adapter was then merged into the base weights to produce a standalone model, which was exported to ONNX.

Base model

Base model: meta-llama/Llama-Prompt-Guard-2-86M
Architecture: sequence classification
Export format: ONNX
Primary runtime: ONNX Runtime / ONNX Runtime Mobile

Tasks

This model is intended to score text as BENIGN or MALICIOUS across two security-related input types:

Phishing email detection
Prompt injection detection

The model uses a shared binary label space:

BENIGN
MALICIOUS
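
As a rough illustration of how the shared label space might be consumed, the sketch below turns a pair of raw logits into a label and a malicious-probability score. The index order (0 = BENIGN, 1 = MALICIOUS), the `decode` helper, and the 0.5 threshold are assumptions made for this example; check the exported model's configuration before relying on them.

```python
import math
from typing import List, Tuple

# Assumption: logit index 0 is BENIGN and index 1 is MALICIOUS.
LABELS = ["BENIGN", "MALICIOUS"]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a small list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return (label, malicious probability) for one two-class logit pair.

    The threshold is a hypothetical default, not a value from the model card.
    """
    p_malicious = softmax(logits)[1]
    label = LABELS[1] if p_malicious >= threshold else LABELS[0]
    return label, p_malicious
```

Raising the threshold trades recall for fewer false positives, which may matter more for one task than the other since both tasks share the same two labels.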

Training data

This multitask model was trained using data derived from:

naserabdullahalam/phishing-email-dataset
marycamilainfo/prompt-injection-malignant

Additional benign prompt-style examples were included so the prompt-injection side of the multitask classifier had both positive and negative examples.

Input format

During training and inference, inputs are prefixed with a simple modality tag:

[EMAIL] ...
[PROMPT] ...
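
The tagging step can be sketched as a small helper. The function name `tag_input`, the modality keys, and the single space between tag and text are assumptions for illustration; only the `[EMAIL]` and `[PROMPT]` tags themselves come from this card.

```python
# Tags as documented above; the dictionary keys are hypothetical names.
TAGS = {"email": "[EMAIL]", "prompt": "[PROMPT]"}


def tag_input(text: str, modality: str) -> str:
    """Prefix raw text with the modality tag used at training time."""
    if modality not in TAGS:
        raise ValueError(f"unknown modality: {modality!r}")
    # Assumption: tag and text are separated by one space.
    return f"{TAGS[modality]} {text.strip()}"
```

For example, `tag_input("Ignore all previous instructions.", "prompt")` yields `[PROMPT] Ignore all previous instructions.`.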

Example inputs

Phishing email example

[EMAIL] Subject: Verify your payroll account now. Body: Your payroll access will be suspended unless you confirm your credentials here: http://example-login-reset.com
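
A tagged input like the one above could be scored with ONNX Runtime roughly as follows. This is a minimal sketch under several assumptions that this card does not confirm: the exported graph is saved as `model.onnx` (hypothetical file name), the tokenizer can be loaded from the base model ID, inputs are int64 tensors, and the first graph output holds the logits. It requires the `onnxruntime`, `transformers`, and `numpy` packages.

```python
from typing import Any, Dict, List

MODEL_PATH = "model.onnx"  # assumption: local path to the exported graph
TOKENIZER_ID = "meta-llama/Llama-Prompt-Guard-2-86M"  # assumption: base tokenizer


def build_feed(encoded: Dict[str, Any], input_names: List[str]) -> Dict[str, Any]:
    """Keep only the tokenizer outputs that the ONNX graph declares as inputs.

    Tokenizers sometimes emit fields (such as token_type_ids) that an
    exported graph does not accept, so extra keys are dropped here.
    """
    return {name: encoded[name] for name in input_names if name in encoded}


def score_text(text: str,
               model_path: str = MODEL_PATH,
               tokenizer_id: str = TOKENIZER_ID) -> List[float]:
    """Return the raw logits for one already-tagged input string."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them.
    import numpy as np
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    encoded = tokenizer(text, return_tensors="np", truncation=True, max_length=512)
    arrays = {key: value.astype(np.int64) for key, value in encoded.items()}
    input_names = [node.name for node in session.get_inputs()]

    # Assumption: the first output is a (1, 2) logits array.
    logits = session.run(None, build_feed(arrays, input_names))[0]
    return logits[0].tolist()
```

In practice the tokenizer and session would be created once and reused across calls; they are built inside the function here only to keep the sketch short.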
