Bengali CRNN OCR: Custom EasyOCR Recognition Model
DocReader BD: CSC4233 NLP Final Project, AIUB
Results
| Model | CER ↓ | WER ↓ | Char Accuracy |
|---|---|---|---|
| Tesseract (baseline) | ~0.45 | ~0.60 | ~55% |
| EasyOCR default | ~0.25 | ~0.40 | ~75% |
| BengaliCRNN (ours) | 0.0348 | 0.1020 | 96.5% |
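
For readers unfamiliar with the metrics, CER and WER are conventionally defined as edit distance divided by reference length, counted over characters and over whitespace-separated words respectively. The sketch below illustrates that convention only; it is not the project's evaluation script, and the function names are hypothetical. It assumes no Unicode normalization and counts code points rather than grapheme clusters, which matters for Bengali conjuncts.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions all cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute, or match if cost is 0
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition, the reported character accuracy of 96.5% is consistent with 1 − CER = 1 − 0.0348.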
Architecture
ResNet34 (grayscale) + 2× BiLSTM (hidden=256) + CTC loss
Vocab: 152 Bengali + English chars | Input: 64×200px
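
Because the model is trained with CTC loss, its per-frame outputs have to be collapsed into a character string at inference time. The following is a generic sketch of greedy CTC decoding, not code from this repository. It assumes the blank symbol sits at index 0 of the character set, and the function name and toy character set are hypothetical.

```python
from typing import Sequence


def ctc_greedy_decode(frame_ids: Sequence[int],
                      charset: Sequence[str],
                      blank: int = 0) -> str:
    """Collapse consecutive repeats, then drop blanks.

    frame_ids: the argmax class index for each time step.
    charset:   index-to-character table; charset[blank] is a placeholder.
    """
    chars = []
    prev = blank
    for idx in frame_ids:
        # Emit a character only when it is not blank and differs from
        # the previous frame; a blank between two identical indices
        # lets the same character appear twice in a row.
        if idx != blank and idx != prev:
            chars.append(charset[idx])
        prev = idx
    return "".join(chars)


# Toy example with a 3-character set (index 0 is the assumed blank).
toy_charset = ["<blank>", "ক", "খ", "a"]
decoded = ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 3], toy_charset)
```

Here the repeated index 1 collapses to a single character, while the blank between the second and third frames allows that character to be emitted again.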
Files
- `bengali_crnn.pth`: EasyOCR-ready weights (`module.` prefix)
- `phase1_best.pth`: clean weights for further training
- `bengali_crnn.py`: EasyOCR network definition
- `bengali_crnn.yaml`: EasyOCR config
- `vocab.json`: character vocabulary
- `config.json`: model config
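
The `module.` prefix is the key naming that `torch.nn.DataParallel` gives to a wrapped model's state dict. The sketch below shows one way to convert key names between the two weight files, assuming both hold plain state dicts and differ only in that prefix (an assumption, not something the repository states). The helper names are hypothetical, and the functions work on any mapping, so they run without PyTorch.

```python
from typing import Any, Dict

PREFIX = "module."


def strip_module_prefix(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose keys have a leading 'module.' removed."""
    return {
        (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
        for key, value in state_dict.items()
    }


def add_module_prefix(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose keys all start with 'module.' (idempotent)."""
    return {
        (key if key.startswith(PREFIX) else PREFIX + key): value
        for key, value in state_dict.items()
    }
```

With PyTorch installed, the dictionaries would typically come from `torch.load(path, map_location="cpu")` and be written back with `torch.save`.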
Usage
import easyocr
reader = easyocr.Reader(
    lang_list=["bn"],
    recog_network="bengali_crnn",
    model_storage_directory="./bengali_ocr_model",
    user_network_directory="./bengali_ocr_model",
    gpu=True
)
results = reader.readtext("bengali_doc.jpg")
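
With default settings, `readtext` returns a list of `(bounding_box, text, confidence)` tuples. The sketch below shows one way to keep only the confident detections from such a list. The helper name and the 0.5 threshold are illustrative assumptions, not values recommended by the project.

```python
from typing import List, Sequence, Tuple

# One detection as returned by readtext with default detail level:
# four corner points, the recognized string, and a confidence score.
OcrResult = Tuple[Sequence[Sequence[float]], str, float]


def extract_text(results: Sequence[OcrResult],
                 min_confidence: float = 0.5) -> List[str]:
    """Keep the text of detections at or above the confidence threshold."""
    return [text for _bbox, text, conf in results if conf >= min_confidence]


# Hand-written sample in the same shape as readtext output.
sample = [
    ([[0, 0], [50, 0], [50, 20], [0, 20]], "বাংলা", 0.97),
    ([[0, 30], [50, 30], [50, 50], [0, 50]], "??", 0.12),
]
lines = extract_text(sample)
```

In practice `results` from the call above would be passed in place of `sample`, and the threshold tuned on held-out documents.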