deberta-unfair-tos-augmented

Best-performing model: DeBERTa trained with augmented data for UNFAIR-ToS classification.

Model Description

This model is fine-tuned on the LexGLUE UNFAIR-ToS dataset to detect unfair clauses in Terms of Service documents.

Performance

Evaluation Metrics:

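Exact Match Accuracy: Percentage of samples whose predicted label set matches the ground truth exactly (a strict multi-label metric)
Micro-F1: Harmonic mean of micro-averaged precision and recall, with true positives, false positives, and false negatives pooled across all labels
Micro-Precision: Fraction of predicted labels that are correct, pooled across all labels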

Metric	Score
Exact Match Accuracy	94.12%
Micro-F1	0.96
Micro-Precision	0.98
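
To make the metric definitions concrete, here is a minimal pure-Python sketch that computes them on toy multi-hot vectors. The function names and the toy data are illustrative assumptions, not the evaluation code behind the reported scores, and returning 0.0 on an empty denominator is one convention among several.

```python
from typing import List, Sequence, Tuple


def exact_match_accuracy(y_true: Sequence[Sequence[int]],
                         y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose full label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def micro_precision_recall_f1(y_true: Sequence[Sequence[int]],
                              y_pred: Sequence[Sequence[int]]) -> Tuple[float, float, float]:
    """Pool TP/FP/FN over every (sample, label) cell, then compute P, R, F1."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 3 samples, 8 labels each.
y_true: List[List[int]] = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 1],
]
y_pred: List[List[int]] = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],  # misses the last label
]

accuracy = exact_match_accuracy(y_true, y_pred)
precision, recall, f1 = micro_precision_recall_f1(y_true, y_pred)
```

In the toy data the third sample misses one label, so it counts as a full miss for exact match while still contributing two true positives to the micro-averaged scores.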

Risk Categories

The model is multi-label: each input can be assigned any subset of the following 8 risk categories, or none of them:

ID	Category
0	Limitation of liability
1	Unilateral termination
2	Unilateral change
3	Content removal
4	Contract by using
5	Choice of law
6	Jurisdiction
7	Arbitration
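
The table above can be expressed as a lookup for turning a multi-hot prediction back into category names. This is a sketch that assumes the model's output index order matches the ID column; `ID2LABEL`, `LABEL2ID`, and `decode_multi_hot` are illustrative names, not part of the published model.

```python
from typing import Dict, List, Sequence

# ID-to-name mapping copied from the category table.
ID2LABEL: Dict[int, str] = {
    0: "Limitation of liability",
    1: "Unilateral termination",
    2: "Unilateral change",
    3: "Content removal",
    4: "Contract by using",
    5: "Choice of law",
    6: "Jurisdiction",
    7: "Arbitration",
}
LABEL2ID: Dict[str, int] = {name: idx for idx, name in ID2LABEL.items()}


def decode_multi_hot(vector: Sequence[int]) -> List[str]:
    """Return the category names whose position in the vector is set."""
    if len(vector) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} entries, got {len(vector)}")
    return [ID2LABEL[i] for i, flag in enumerate(vector) if flag]
```

An all-zero vector decodes to an empty list, meaning no category was flagged for that input.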

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Agreemind/deberta-unfair-tos-augmented"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "We reserve the right to terminate your account at any time."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.sigmoid(outputs.logits)

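# Print every category whose probability exceeds the 0.5 threshold.
# The list order must match the model's output indices (IDs 0-7 above).
labels = ["Limitation of liability", "Unilateral termination", "Unilateral change",
          "Content removal", "Contract by using", "Choice of law", "Jurisdiction", "Arbitration"]

# .tolist() converts the tensor row to plain Python floats for formatting.
for label, prob in zip(labels, probs[0].tolist()):
    if prob > 0.5:
        print(f"{label}: {prob:.2%}")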
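
The snippet above scores a single sentence. For a whole Terms of Service document, one option is to split it into sentences and score each one in turn. The sketch below assumes a naive regex splitter (legal text with abbreviations will need something sturdier) and a hypothetical `classify` callable that wraps the model and returns a label-to-probability mapping; `fake_classify` is a stand-in so the example runs without downloading weights.

```python
import re
from typing import Callable, Dict, List, Tuple


def split_sentences(document: str) -> List[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [part for part in parts if part]


def flag_clauses(document: str,
                 classify: Callable[[str], Dict[str, float]],
                 threshold: float = 0.5) -> List[Tuple[str, Dict[str, float]]]:
    """Return (sentence, {label: prob}) for sentences with any label above threshold."""
    flagged = []
    for sentence in split_sentences(document):
        scores = classify(sentence)
        hits = {label: p for label, p in scores.items() if p > threshold}
        if hits:
            flagged.append((sentence, hits))
    return flagged


def fake_classify(sentence: str) -> Dict[str, float]:
    """Hypothetical stand-in for the model: keys on the word 'terminate'."""
    score = 0.9 if "terminate" in sentence.lower() else 0.1
    return {"Unilateral termination": score}


doc = "Welcome to the service. We may terminate your account at any time. Enjoy!"
results = flag_clauses(doc, fake_classify)
```

In real use, `classify` would tokenize the sentence, run the model, apply the sigmoid, and pair each probability with its label.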

Training

Parameter	Value
Dataset	`coastalcph/lex_glue` (`unfair_tos` subset)
Training Samples	~5,532
Loss Function	Focal Loss with class weighting
Optimizer	AdamW with cosine LR schedule
Learning Rate	2e-5 with 10% warmup
Epochs	15 (with early stopping, patience=3)
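
To illustrate two of the settings listed above, here is a dependency-free sketch of a class-weighted binary focal loss and a cosine schedule with linear warmup. It is not the training code: the focusing parameter `gamma=2.0`, the per-label weights, and the total step count are assumptions, since the table does not state them. A real training loop would compute the loss on batched tensors rather than Python lists.

```python
import math
from typing import Sequence


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def focal_loss(logits: Sequence[float],
               targets: Sequence[int],
               class_weights: Sequence[float],
               gamma: float = 2.0) -> float:
    """Weighted binary focal loss for one sample, averaged over labels.

    Per label: -w * (1 - p_t)^gamma * log(p_t), where p_t is the probability
    the model assigns to the true outcome (label present or absent).
    """
    total = 0.0
    for logit, target, weight in zip(logits, targets, class_weights):
        z = logit if target == 1 else -logit
        log_pt = -softplus(-z)  # log(sigmoid(z))
        pt = math.exp(log_pt)
        total += -weight * (1.0 - pt) ** gamma * log_pt
    return total / len(logits)


def lr_at_step(step: int, total_steps: int,
               base_lr: float = 2e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With `gamma=0` and unit weights the loss reduces to ordinary binary cross-entropy; larger `gamma` shrinks the contribution of labels the model already predicts confidently, which is why focal loss is often paired with imbalanced label sets.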

Limitations

Arbitration class has lower recall (~38%) due to limited training samples
Optimized for English legal text
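
One common way to trade precision for recall on a weak class is a per-label decision threshold instead of a single 0.5 cutoff. The sketch below is illustrative only: the `0.3` threshold for Arbitration is a hypothetical value, not a tuned one, and any such threshold should be chosen on a held-out validation split.

```python
from typing import Dict, List, Sequence

LABELS: List[str] = [
    "Limitation of liability", "Unilateral termination", "Unilateral change",
    "Content removal", "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]


def apply_thresholds(probs: Sequence[float],
                     thresholds: Dict[str, float],
                     default: float = 0.5) -> List[str]:
    """Flag each label whose probability exceeds its own threshold."""
    return [
        label
        for label, p in zip(LABELS, probs)
        if p > thresholds.get(label, default)
    ]


# Hypothetical override: accept Arbitration at a lower confidence.
custom_thresholds = {"Arbitration": 0.3}

# Example probabilities (made up) where Arbitration sits below 0.5.
example_probs = [0.02, 0.81, 0.05, 0.01, 0.03, 0.04, 0.10, 0.40]

with_default = apply_thresholds(example_probs, {})
with_custom = apply_thresholds(example_probs, custom_thresholds)
```

Lowering a threshold catches more true positives for that label but also admits more false positives, so the effect on precision should be checked alongside recall.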

Citation

@misc{agreemind-unfair-tos,
  author = {Agreemind},
  title = {deberta-unfair-tos-augmented},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Agreemind/deberta-unfair-tos-augmented}
}

Property	Value
Format	Safetensors
Model size	0.1B params
Tensor type	F32
Base model	microsoft/deberta-base
