Swahili-English Translation Model for Child Helpline Services

Model Description

This model is a fine-tuned version of Helsinki-NLP/opus-mt-mul-en for Swahili-to-English translation, specifically optimized for child helpline call transcriptions in East Africa.

Developed by: BITZ IT Consulting Ltd
Project: OpenCHS (Open Child Helpline System)
Funded by: UNICEF Venture Fund
License: Apache 2.0

Performance

Test Set (General Translation)

  • BLEU: 0.2735 (0-1 scale)
  • chrF: 46.78 (0-100 scale)
  • Improvement over baseline: +0.0%

Domain Evaluation (Call Transcriptions)

  • Domain BLEU: 0.0000
  • Domain chrF: 2.93
  • Domain COMET-QE: 0.0000
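
The card does not say which tool produced these scores. As a rough sketch, assuming the sacreBLEU library was used (an assumption, not something the card confirms), BLEU and chrF could be computed as below. sacreBLEU reports both metrics on a 0-100 scale, so the BLEU value is divided by 100 here to match the 0-1 convention the card appears to use. The helper names to_unit_scale and score_translations are hypothetical.

```python
from typing import Dict, List


def to_unit_scale(score_0_100: float) -> float:
    """Convert a 0-100 metric score to the 0-1 scale, rounded to 4 decimals."""
    return round(score_0_100 / 100.0, 4)


def score_translations(hypotheses: List[str], references: List[str]) -> Dict[str, float]:
    """Score model outputs against one reference translation per segment.

    Assumes sacreBLEU is installed (pip install sacrebleu); it is imported
    lazily so the helper above can be used without it.
    """
    import sacrebleu

    # sacreBLEU expects a list of reference streams, hence [references].
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    chrf = sacrebleu.corpus_chrf(hypotheses, [references])
    return {
        "bleu": to_unit_scale(bleu.score),  # 0-1 scale, as in the card
        "chrf": round(chrf.score, 2),       # 0-100 scale, as in the card
    }
```

Scores from a different tool or tokenization setting are not directly comparable, so the same configuration should be used for the baseline and the fine-tuned model.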

Intended Use

Primary Use Case: Translating Swahili helpline call transcriptions to English for case documentation, quality assurance, and cross-border referrals.

Languages: Swahili (source) → English (target)

Usage

from transformers import MarianTokenizer, MarianMTModel

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "rogendo/mul-sw-en-translation-phase1"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Tokenize the Swahili input and return PyTorch tensors
swahili_text = "Habari za asubuhi. Ninaitwa Amina na nina miaka 14."
inputs = tokenizer(swahili_text, return_tensors="pt", padding=True)

# Generate the English translation with beam search
outputs = model.generate(**inputs, num_beams=5, max_length=256)

# Decode the generated token IDs back to text
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
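
Call transcriptions usually run longer than a single sentence. The following is a minimal sketch of translating a whole transcript in batches, assuming the tokenizer and model objects loaded above. The helper names split_sentences, batched, and translate_transcript are hypothetical, and the regex-based sentence split is a naive heuristic, not part of the model's own pipeline.

```python
import re
from typing import Iterable, List


def split_sentences(transcript: str) -> List[str]:
    """Split text after '.', '!' or '?' followed by whitespace (naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [p for p in parts if p]


def batched(items: List[str], batch_size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_transcript(transcript: str, tokenizer, model, batch_size: int = 8) -> str:
    """Translate a transcript sentence by sentence and join the results.

    tokenizer and model are expected to be the MarianTokenizer and
    MarianMTModel instances from the usage example.
    """
    translations: List[str] = []
    for batch in batched(split_sentences(transcript), batch_size):
        # Pad to the longest sentence in the batch; truncate overlong inputs.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = model.generate(**inputs, num_beams=5, max_length=256)
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return " ".join(translations)
```

Calling translate_transcript(swahili_text, tokenizer, model) would translate the two sentences of the example above in one batch. Translating sentence by sentence keeps each input short, but it also discards cross-sentence context, so it may not suit every transcript.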

Training Details

Base Model: Helsinki-NLP/opus-mt-mul-en
Training Epochs: 8
Batch Size: 16
Learning Rate: 3e-05
Hardware: NVIDIA GPU with FP16 mixed precision
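
The original training script is not shown in this card. As an illustrative sketch only, the hyperparameters listed above map onto the Hugging Face Seq2SeqTrainingArguments argument names roughly as follows. Treating "Batch Size" as the per-device batch size is an assumption, and build_training_args and its output_dir parameter are hypothetical.

```python
# Hyperparameters from the card, keyed by Seq2SeqTrainingArguments argument names.
training_kwargs = {
    "num_train_epochs": 8,
    "per_device_train_batch_size": 16,  # assumption: "Batch Size" is per device
    "learning_rate": 3e-5,
    "fp16": True,  # FP16 mixed precision; requires a CUDA-capable GPU
}


def build_training_args(output_dir: str):
    """Build training arguments for fine-tuning Helsinki-NLP/opus-mt-mul-en.

    transformers is imported lazily so the dictionary above can be
    inspected without the library installed.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # generate translations during evaluation
        **training_kwargs,
    )
```

Settings the card does not list, such as warmup, weight decay, or the evaluation schedule, are left at the library defaults here and may differ from the actual training run.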


This model is part of the OpenCHS project supporting child helpline services across East Africa.

Model Size: 77.1M params (Safetensors, F32)