🇰🇭 Khmer Sentiment Analysis using XLM-RoBERTa
This is an XLM-RoBERTa model fine-tuned for binary (negative/positive) sentiment classification.
It is designed mainly for Khmer text sentiment analysis, but it can also process English text due to the multilingual pretraining of XLM-RoBERTa.
Model Details
- Base Model: XLM-RoBERTa (FacebookAI/xlm-roberta-base)
- Architecture: Transformer Encoder for Sequence Classification
- Task: Sentiment Analysis
- Supported Languages:
  - Khmer (Primary 🇰🇭)
  - English (Partial 🇬🇧)
- Labels:
  - 0 → negative
  - 1 → positive
Model Description
The model was produced by fine-tuning XLM-RoBERTa on a Khmer sentiment dataset.
Multilingual pretraining lets it accept both Khmer and English inputs, but it is optimized for Khmer, so English support is only partial.
How to Use
Install dependencies
pip install transformers torch
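After installing, it can help to confirm that both packages are visible to the interpreter before downloading the model. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this model card or of either library.

```python
from importlib.util import find_spec

# Top-level import names for the packages installed above.
REQUIRED = ["transformers", "torch"]

def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies found.")
```

This only checks that the packages can be located; it does not import them or verify their versions.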
Run inference
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model_name = "phonsobon/khmer-sentiment-xlm-roberta"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
labels = {
    0: "negative",
    1: "positive",
}
# Khmer: "The service is very good."
text = "សេវាកម្មល្អណាស់"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
pred = torch.argmax(outputs.logits, dim=1).item()
print("Text:", text)
print("Prediction:", labels[pred])
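The script above reports only the winning label. If a confidence score is also useful, the two logits can be passed through a softmax. The sketch below does this in plain Python so it runs without the model; the logit values are made up for illustration, and the helper names `softmax` and `describe` are hypothetical. In the script above, real logits could be obtained with `outputs.logits[0].tolist()`.

```python
import math

# Label mapping as stated in the model card.
LABELS = {0: "negative", 1: "positive"}

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def describe(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for illustration, not real model output.
example_logits = [-1.2, 2.3]
label, confidence = describe(example_logits)
print(f"{label} ({confidence:.3f})")
```

A probability close to 0.5 suggests the model is unsure, which may be worth flagging for English inputs, where support is described as partial.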