---
license: apache-2.0
---

# TrOCR Base Fine-Tuned for Czech Historical Vital Records

This is a fine-tuned TrOCR-Base model (`microsoft/trocr-base-handwritten`) specializing in Handwritten Text Recognition of 19th-century Czech vital records (birth, marriage, death registers).

The model was trained as part of the Master's Thesis: **"Automated Transcription and Search in Historical Records Using Handwritten Text Recognition"**.

It was developed on an original, manually annotated dataset of historical Czech scripts and is designed to be used inside the full historical document processing pipeline (layout analysis → text detection → recognition → post-processing).

For detailed performance metrics, evaluation, and the full pipeline description, please refer to the thesis text.

## Citation

```
@misc{palkovic2025htr,
  AUTHOR      = {Palkovič, Radoslav},
  TITLE       = {Automated Transcription and Search in Historical Records Using Handwritten Text Recognition},
  YEAR        = {2025},
  TYPE        = {Master Thesis},
  INSTITUTION = {Masaryk University, Faculty of Informatics},
  LOCATION    = {Brno},
  SUPERVISOR  = {Michal Batko}
}
```
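
## Usage sketch

The recognition step of the pipeline described above can be sketched with the standard `transformers` TrOCR classes. This is a minimal illustration, not the thesis implementation. It assumes the input is already cropped to single text lines (the layout analysis and text detection stages are not shown), and `MODEL_ID` is a placeholder: this card does not state the fine-tuned repository id, so the base checkpoint named above stands in for it. The helper names `load_trocr` and `transcribe_lines` and the `max_new_tokens` value are illustrative choices.

```python
from typing import Any, Iterable, List

# Placeholder (assumption): the base checkpoint named in this card.
# Replace with the repository id of the fine-tuned model.
MODEL_ID = "microsoft/trocr-base-handwritten"


def load_trocr(model_id: str = MODEL_ID):
    """Load the TrOCR processor and model from the Hugging Face Hub."""
    # Imported here so transcribe_lines stays importable without transformers.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return processor, model


def transcribe_lines(
    line_images: Iterable[Any],
    processor: Any,
    model: Any,
    max_new_tokens: int = 64,  # illustrative cap on the decoded length
) -> List[str]:
    """Recognize text in a batch of cropped text-line images (RGB PIL images)."""
    images = list(line_images)
    if not images:
        return []
    # Resize and normalize the images into a batch of pixel tensors.
    pixel_values = processor(images=images, return_tensors="pt").pixel_values
    # Autoregressively decode token ids for every line in the batch.
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return [text.strip() for text in texts]


# Example call ("line.png" is a hypothetical path to one cropped text line):
#   from PIL import Image
#   processor, model = load_trocr()
#   line = Image.open("line.png").convert("RGB")
#   print(transcribe_lines([line], processor, model))
```

Passing the processor and model in as arguments keeps the loading cost out of the per-batch call, so the same pair can be reused across all lines of a register page.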