# RoBERTa-Logic-Finetuned on Chinese Logic Dataset
This is a RoBERTa model fine-tuned on a Chinese logic dataset for logic-aware sentiment classification (transitions, double negation, irony). The dataset was generated with the Doubao API. The model achieves a Macro-F1 of 96.00% on the validation set.
## Model Details
- Architecture: RoBERTa
- Base Model: uer/roberta-base-finetuned-dianping-chinese
- Framework: Transformers, PyTorch
- Fine-tuning Dataset: YiMeng-SYSU/chinese-logic-sentiment-dataset (generated with the Doubao API)
- Hardware: NVIDIA RTX 5070 Ti GPU + AMD 9800X3D CPU
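For reference, the sketch below shows how a comparable fine-tuning run could be set up with the Transformers `Trainer`. It is not the exact training script used for this model; the hyperparameters and the dataset column names (`text`, `label`) are assumptions.

```python
# Minimal fine-tuning sketch, not the exact recipe used for this model.
# Hyperparameters and dataset column names ("text", "label") are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

base_model = "uer/roberta-base-finetuned-dianping-chinese"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

dataset = load_dataset("YiMeng-SYSU/chinese-logic-sentiment-dataset")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-logic-sentiment-zh",
    learning_rate=2e-5,              # assumed
    per_device_train_batch_size=32,  # assumed
    num_train_epochs=3,              # assumed
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)
trainer.train()
```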
## Performance
The performance of this model on the Golden Test Set (960 samples) is as follows:
| Metric | Score |
|---|---|
| Accuracy | 96.15% |
| Macro-F1 | 96.00% |
| Recall (Negative) | 97.53% |
| Recall (Positive) | 94.15% |
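Metrics of this form can be recomputed with scikit-learn given the gold labels and model predictions on the test set; the label lists in the sketch below are placeholders (0 = negative, 1 = positive).

```python
# Sketch: Accuracy, Macro-F1, and per-class Recall from gold labels and predictions.
from sklearn.metrics import accuracy_score, f1_score, recall_score

y_true = [1, 0, 1, 0]  # gold labels from the test set (placeholder values)
y_pred = [1, 0, 0, 0]  # model predictions (placeholder values)

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")
recall_neg, recall_pos = recall_score(y_true, y_pred, average=None, labels=[0, 1])

print(f"Accuracy: {accuracy:.2%}, Macro-F1: {macro_f1:.2%}")
print(f"Recall (Negative): {recall_neg:.2%}, Recall (Positive): {recall_pos:.2%}")
```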
### Detailed Performance by Logic Type
| Type | Accuracy | Note |
|---|---|---|
| Simple (简单句) | 97.50% | Solid baseline performance |
| Transition (转折) | 97.03% | Correctly interprets transitional (contrastive) logic |
| Double Neg (双重否定) | 96.67% | Resolves the "double negation as affirmation/negation" ambiguity |
| Irony (反讽) | 92.47% | Key highlight: able to recognize irony (positive phrasing with negative intent) |
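The per-type breakdown can be reproduced by grouping test examples by their logic-type annotation; the field name `type` and the placeholder rows below are assumptions about how the test set is annotated.

```python
# Sketch: per-type accuracy, assuming each test example carries a logic-type
# annotation (the field name "type" and the rows below are placeholders).
from collections import defaultdict

examples = [
    {"type": "irony", "label": 0, "pred": 0},
    {"type": "simple", "label": 1, "pred": 1},
]

correct, total = defaultdict(int), defaultdict(int)
for ex in examples:
    total[ex["type"]] += 1
    correct[ex["type"]] += int(ex["pred"] == ex["label"])

for t in total:
    print(f"{t}: {correct[t] / total[t]:.2%}")
```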
## Usage
You can use the model for sentiment analysis tasks in Chinese by loading it through the Hugging Face Transformers library.
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "YiMeng-SYSU/roberta-logic-sentiment-zh"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "这部剧简直有毒,害得我昨晚又熬到凌晨三点,黑眼圈都出来了!"
# truncation/max_length make the tokenizer call more robust for long inputs
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Map the predicted id to a readable label (cf. id2label in the model config)
id2label = {0: "Negative (负面)", 1: "Positive (正面)"}
pred_label_id = torch.argmax(probs).item()

print(f"Label: {id2label[pred_label_id]}")
print(f"Confidence: {probs.max().item():.4f}")
```