---
library_name: transformers
license: apache-2.0
base_model: bert-base-multilingual-cased
tags:
- generated_from_trainer
model-index:
- name: bert-base-multilingual-cased-finetuned-yiddish-experiment-3
  results: []
---


# bert-base-multilingual-cased-finetuned-yiddish-experiment-3

This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on a dataset that was not recorded in the training metadata.
It achieves the following results on the evaluation set:
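- Loss: 1.4254 (equal to the validation loss logged at step 2700, epoch 6.3830, the lowest value in the training results table)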
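
If the reported loss is a mean cross-entropy measured in nats (an assumption; this card does not state the training objective), it can be read as a perplexity by exponentiating it. A minimal sketch:

```python
import math

# Evaluation loss reported in this card.
eval_loss = 1.4254

# Assumption: the loss is a mean cross-entropy in nats,
# so exp(loss) can be interpreted as a perplexity.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the value comes out at about 4.16.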

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- num_epochs: 10
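
The `linear` schedule with 300 warmup steps can be sketched in plain Python as below. This is an illustrative reimplementation, not the Trainer's own code: the function name `linear_lr` is hypothetical, and `total_steps=4230` is an assumption inferred from the training results table (about 423 optimizer steps per epoch over 10 epochs).

```python
def linear_lr(step, base_lr=5e-6, warmup_steps=300, total_steps=4230):
    """Learning rate after `step` optimizer updates.

    Linear warmup from 0 to `base_lr` over `warmup_steps`, then linear
    decay to 0 at `total_steps`. `total_steps` is an assumed value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points along the run.
for step in (0, 150, 300, 2700, 4230):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-06 at step 300 and has decayed to roughly 1.9e-06 by step 2700.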

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 11.143        | 0.2364 | 100  | 7.6591          |
| 4.1737        | 0.4728 | 200  | 2.2642          |
| 2.0579        | 0.7092 | 300  | 1.7710          |
| 1.6963        | 0.9456 | 400  | 1.6712          |
| 1.5705        | 1.1820 | 500  | 1.6379          |
| 1.5353        | 1.4184 | 600  | 1.6003          |
| 1.5213        | 1.6548 | 700  | 1.5273          |
| 1.4387        | 1.8913 | 800  | 1.5415          |
| 1.3973        | 2.1277 | 900  | 1.5530          |
| 1.4266        | 2.3641 | 1000 | 1.5328          |
| 1.3365        | 2.6005 | 1100 | 1.5154          |
| 1.4423        | 2.8369 | 1200 | 1.4662          |
| 1.3948        | 3.0733 | 1300 | 1.5041          |
| 1.3244        | 3.3097 | 1400 | 1.4530          |
| 1.3645        | 3.5461 | 1500 | 1.4656          |
| 1.329         | 3.7825 | 1600 | 1.4542          |
| 1.3326        | 4.0189 | 1700 | 1.5293          |
| 1.2768        | 4.2553 | 1800 | 1.4575          |
| 1.3125        | 4.4917 | 1900 | 1.4638          |
| 1.2925        | 4.7281 | 2000 | 1.4867          |
| 1.281         | 4.9645 | 2100 | 1.4827          |
| 1.2966        | 5.2009 | 2200 | 1.4359          |
| 1.28          | 5.4374 | 2300 | 1.4761          |
| 1.2436        | 5.6738 | 2400 | 1.5006          |
| 1.2787        | 5.9102 | 2500 | 1.4511          |
| 1.2344        | 6.1466 | 2600 | 1.4430          |
| 1.199         | 6.3830 | 2700 | 1.4254          |
| 1.2899        | 6.6194 | 2800 | 1.4339          |
| 1.2637        | 6.8558 | 2900 | 1.4609          |
| 1.2186        | 7.0922 | 3000 | 1.4300          |
| 1.181         | 7.3286 | 3100 | 1.4407          |
| 1.2815        | 7.5650 | 3200 | 1.4471          |
| 1.2161        | 7.8014 | 3300 | 1.4413          |
| 1.1562        | 8.0378 | 3400 | 1.4695          |
| 1.1668        | 8.2742 | 3500 | 1.4940          |
| 1.2557        | 8.5106 | 3600 | 1.4430          |
| 1.1985        | 8.7470 | 3700 | 1.4562          |
| 1.2051        | 8.9835 | 3800 | 1.4412          |
| 1.1588        | 9.2199 | 3900 | 1.4421          |
| 1.2002        | 9.4563 | 4000 | 1.4477          |
| 1.2339        | 9.6927 | 4100 | 1.4573          |
| 1.1918        | 9.9291 | 4200 | 1.4463          |
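
To read the table programmatically, the sketch below copies the validation-loss column, locates the lowest value, and estimates the number of optimizer steps per epoch from the last row. Variable names are illustrative, and the training-set size estimate assumes a single device with no gradient accumulation, neither of which this card states.

```python
# Validation losses copied from the table, one per evaluation (every 100 steps).
val_losses = [
    7.6591, 2.2642, 1.7710, 1.6712, 1.6379, 1.6003, 1.5273,
    1.5415, 1.5530, 1.5328, 1.5154, 1.4662, 1.5041, 1.4530,
    1.4656, 1.4542, 1.5293, 1.4575, 1.4638, 1.4867, 1.4827,
    1.4359, 1.4761, 1.5006, 1.4511, 1.4430, 1.4254, 1.4339,
    1.4609, 1.4300, 1.4407, 1.4471, 1.4413, 1.4695, 1.4940,
    1.4430, 1.4562, 1.4412, 1.4421, 1.4477, 1.4573, 1.4463,
]
eval_every = 100  # evaluation interval in steps, as logged in the table

# Index of the lowest validation loss, converted back to a step number.
best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
best_step = (best_index + 1) * eval_every
best_loss = val_losses[best_index]

# The last row logs step 4200 at epoch 9.9291.
steps_per_epoch = round(4200 / 9.9291)

# Assumption: train_batch_size=4 on one device, no gradient accumulation.
# A partial final batch would make the true count slightly smaller.
approx_train_examples = steps_per_epoch * 4

print(best_step, best_loss)
print(steps_per_epoch, approx_train_examples)
```

This identifies step 2700 (validation loss 1.4254) as the best evaluation, which equals the evaluation loss reported at the top of the card; the final evaluation at step 4200 is 1.4463. Training loss keeps falling after step 2700 while validation loss does not improve further.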


### Framework versions

- Transformers 4.47.0
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0