# 🇺🇿 Uzbek POS-tagger (Fine-tuned BERTbek Model)
This repository contains a Part-of-Speech (POS) Tagging model for the Uzbek language, fine-tuned from the BERTbek model by Elmurod Kuriyozov.
## Model Overview
- Model name: Uzbek POS-tagger based on BERTbek
- Base model: BERTbek (news-big-cased)
- Architecture: BERT (Transformer-based encoder)
- Fine-tuned for: POS tagging task (token classification)
- Training platform: Google Colab (NVIDIA A100 GPU)
- License: CC BY-NC 4.0
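Since this is a standard token-classification checkpoint, it should load with the Hugging Face Transformers pipeline. The sketch below is illustrative and not an official usage example from the authors: the model id is taken from the repository URL in the citation, and the `aggregation_strategy` setting and the `words_and_tags` helper are assumptions.

```python
# Illustrative sketch, not released code from this repository.
# MODEL_ID is taken from the repository URL in the citation section.
MODEL_ID = "MaksudSharipov/UzbekPosTagger_BERTbek"

def load_tagger():
    """Build a token-classification pipeline (downloads weights on first use)."""
    from transformers import pipeline  # lazy import: requires `pip install transformers`
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge BERT wordpieces into whole words
    )

def words_and_tags(predictions):
    """Reduce the pipeline's output dicts to plain (word, POS-tag) pairs."""
    return [(p["word"], p["entity_group"]) for p in predictions]

# Example (requires network access to download the model):
#   tagger = load_tagger()
#   print(words_and_tags(tagger("Men kitob o'qiyapman.")))
```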
## Dataset
- Source: Manually annotated Uzbek POS-tagged dataset
- Size: 4,000 sentences (50,000 tokens)
- Tags: 16 POS tags based on the Universal Dependencies (UD) tagset
- Annotation: Conducted manually by linguists for high-quality labeling
## Model Performance
| Metric | Score |
|---|---|
| Accuracy | ~91% |
| F1-score | ~87% |
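The evaluation script is not included here, but the scores above can be reproduced in the usual way for POS tagging: token-level accuracy, and F1 averaged over tag classes. The sketch below assumes macro-averaged F1 (a common choice; the card does not state the averaging scheme) and uses toy tag sequences, not the actual test set:

```python
# Minimal sketch of token-level accuracy and macro-averaged F1.
# The averaging scheme and the toy data are assumptions for illustration.

def accuracy(gold, pred):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-tag F1 scores."""
    labels = set(gold) | set(pred)
    f1s = []
    for lab in labels:
        tp = sum(g == p == lab for g, p in zip(gold, pred))
        fp = sum(p == lab and g != lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy example: one NOUN token is mistagged as ADJ.
gold = ["NOUN", "VERB", "NOUN", "ADJ"]
pred = ["NOUN", "VERB", "ADJ", "ADJ"]
print(accuracy(gold, pred))  # 0.75
print(macro_f1(gold, pred))  # 0.777...
```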
## Applications
- Linguistic analysis of Uzbek texts
- Corpus annotation
- Preprocessing pipeline for downstream NLP tasks (NER, parsing, etc.)
## Author & Credits
- Model fine-tuning and dataset: Maksud Sharipov
- Base model author: Elmurod Kuriyozov
- Framework: Hugging Face Transformers
## License
This model is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to use and adapt the model for non-commercial research with appropriate credit.
## Citation
If you use this model in your research, please cite as:
```bibtex
@misc{sharipov2025uzbekpos,
  title        = {Uzbek POS-tagger (Fine-tuned BERTbek Model)},
  author       = {Maksud Sharipov},
  year         = {2025},
  howpublished = {Hugging Face},
  url          = {https://huggingface.co/MaksudSharipov/UzbekPosTagger_BERTbek}
}
```