Model Card for LeviatanAIResearch/cross-encoder-bert-base-fr-v1

Model Details

Model Description

This model is a cross encoder for French sentence-pair tasks. It is fine-tuned from Google's multilingual BERT checkpoint (bert-base-multilingual-cased) on the French portion of the PhilipMay/stsb_multi_mt dataset. After 10 epochs of training, it reached a Pearson correlation of 0.83621 and a Spearman correlation of 0.82456 on the STS-B test set.
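The two correlations reported above compare the model's predicted similarity scores with the gold STS-B labels. The following is a minimal, dependency-free sketch of how such figures can be computed; it is not the evaluator's actual code, and the function names (`pearson`, `average_ranks`, `spearman`) are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold labels and model scores, for illustration only.
gold = [0.0, 1.0, 2.0, 3.0]
predicted = [0.1, 0.9, 2.5, 2.4]
print(pearson(gold, predicted), spearman(gold, predicted))
```

Pearson rewards a linear match between scores and labels, while Spearman only looks at their ordering, which is why the two numbers can differ slightly for the same predictions.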

Developed by: Leviatan Research Team
Model type: Cross Encoder
Language(s) (NLP): French
Finetuned from model: Google's BERT (bert-base-multilingual-cased)
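As a rough usage sketch, a cross encoder scores each (query, document) pair jointly, and the scores can then be used to rerank candidates. The code below assumes the checkpoint loads with the `CrossEncoder` class from the sentence-transformers library and that access to the gated repository has already been granted; neither is confirmed by this card. `load_scorer` and `rerank` are hypothetical helper names.

```python
MODEL_ID = "LeviatanAIResearch/cross-encoder-bert-base-fr-v1"


def load_scorer(model_id=MODEL_ID):
    """Return a function mapping (query, document) pairs to float scores.

    Assumption: the checkpoint is compatible with sentence-transformers'
    CrossEncoder. The import is local so the rest of the sketch runs
    without the library installed.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_id)
    return lambda pairs: [float(s) for s in model.predict(list(pairs))]


def rerank(query, documents, score_pairs, top_k=5):
    """Score every (query, document) pair and keep the top_k best documents."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Usage (downloads the model, so it needs network access and credentials):
# scorer = load_scorer()
# rerank("Quelle est la capitale de la France ?",
#        ["Paris est la capitale de la France.", "Le chat dort."],
#        scorer)
```

Keeping the scoring function as a parameter makes it easy to swap in another cross encoder, for example when reproducing a side-by-side comparison.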

Results

STS-B Test Set:
- Evaluator: CECorrelationEvaluator
  - Pearson: 0.83621
  - Spearman: 0.82456

Zero-Shot Test using FQuAD as Knowledge Base:
- Number of questions tested: 3188
- Number of documents considered: 768
- Top-5 precision@k: 0.8563
- Top-5 MRR: 0.6898

Comparison with dangvantuan/CrossEncoder-camembert-large:
- Top-5 precision@k: 0.6688
- Top-5 MRR: 0.4131
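The card does not define how the top-5 figures were computed. The sketch below shows one plausible reading, assuming each question has exactly one relevant document: the top-5 precision is taken as the fraction of questions whose relevant document appears in the first five results, and MRR as the mean of 1/rank of that document (0 when it is not in the top five). The function names and the sample data are hypothetical.

```python
def hit_rate_at_k(rankings, gold_ids, k=5):
    """Fraction of questions whose gold document is among the top k results."""
    hits = sum(1 for ranking, gold in zip(rankings, gold_ids) if gold in ranking[:k])
    return hits / len(gold_ids)


def mrr_at_k(rankings, gold_ids, k=5):
    """Mean reciprocal rank of the gold document, counting only the top k."""
    total = 0.0
    for ranking, gold in zip(rankings, gold_ids):
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if doc_id == gold:
                total += 1.0 / rank
                break  # only the first match contributes
    return total / len(gold_ids)


# Illustrative data: one ranked list of document ids per question.
rankings = [[3, 1, 2], [5, 4, 9], [7, 8, 6]]
gold_ids = [3, 4, 0]  # the third question's gold document is never retrieved
print(hit_rate_at_k(rankings, gold_ids), mrr_at_k(rankings, gold_ids))
```

Under this reading, an MRR well below the hit rate indicates that the relevant document is often retrieved within the top five but not always in first position.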

Model size: 0.2B params (Safetensors)
Tensor type: F32

Collection: Cross Encoder (4 items, updated Feb 27, 2025), which includes LeviatanAIResearch/cross-encoder-bert-base-fr-v1