Text Classification
Transformers
Safetensors
sentence-transformers
French
bert
cross-encoder
text-embeddings-inference
How to use LeviatanAIResearch/cross-encoder-bert-base-fr-v1 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="LeviatanAIResearch/cross-encoder-bert-base-fr-v1")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("LeviatanAIResearch/cross-encoder-bert-base-fr-v1")
    model = AutoModelForSequenceClassification.from_pretrained("LeviatanAIResearch/cross-encoder-bert-base-fr-v1")

How to use LeviatanAIResearch/cross-encoder-bert-base-fr-v1 with sentence-transformers:

    from sentence_transformers import CrossEncoder

    model = CrossEncoder("LeviatanAIResearch/cross-encoder-bert-base-fr-v1")

    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
    ]

    # One similarity score per (query, passage) pair
    scores = model.predict([(query, passage) for passage in passages])
    print(scores)

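Since the model is aimed at French text, a reranking helper may be a more typical use than printing raw scores. The following is a minimal sketch, not part of the original card: `rank_passages` and `demo_with_cross_encoder` are hypothetical helper names, and the French query and passages are made-up examples. The helper accepts any scoring callable with the same shape as `CrossEncoder.predict` (a list of pairs in, one score per pair out).

```python
from typing import Callable, List, Sequence, Tuple

# Any callable that maps (query, passage) pairs to one score per pair.
# CrossEncoder.predict from sentence-transformers has this shape.
PredictFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rank_passages(predict: PredictFn, query: str, passages: List[str],
                  top_k: int = 5) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return the top_k passages, best first."""
    pairs = [(query, passage) for passage in passages]
    scores = [float(s) for s in predict(pairs)]
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def demo_with_cross_encoder() -> None:
    """Hypothetical demo; needs sentence-transformers installed and network access."""
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("LeviatanAIResearch/cross-encoder-bert-base-fr-v1")
    # Illustrative French inputs, not taken from the model card.
    query = "Quelle planète est surnommée la planète rouge ?"
    passages = [
        "Vénus est souvent appelée la jumelle de la Terre.",
        "Mars, en raison de sa couleur rougeâtre, est surnommée la planète rouge.",
        "Jupiter est la plus grande planète du système solaire.",
    ]
    for passage, score in rank_passages(model.predict, query, passages, top_k=3):
        print(f"{score:.4f}  {passage}")
```

Calling `demo_with_cross_encoder()` downloads the model on first use. Passing the scoring function in as an argument keeps the ranking logic testable without loading any weights.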
Model Card for LeviatanAIResearch/cross-encoder-bert-base-fr-v1
Model Details
Model Description
This is a cross-encoder specialized for French-language tasks. It is fine-tuned from Google's multilingual BERT checkpoint (bert-base-multilingual-cased) on the French subset of the PhilipMay/stsb_multi_mt dataset. After 10 epochs of training, the model reached a Pearson correlation of 0.83621 and a Spearman correlation of 0.82456 on the STS-B test set.
- Developed by: Leviatan Research Team
- Model type: Cross Encoder
- Language(s) (NLP): French
- Finetuned from model: Google's BERT (bert-base-multilingual-cased)
Results
STS-B Test Set:
- Metric: CECorrelationEvaluator
- Pearson: 0.83621
- Spearman: 0.82456
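For readers unfamiliar with the two reported numbers, here is a minimal pure-Python sketch of how Pearson and Spearman correlations between gold similarity scores and model predictions can be computed. It is not the evaluator's actual implementation; the function names and the toy data are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 0-based positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data, made up for illustration (not results from this model).
gold = [0.0, 0.2, 0.5, 0.8, 1.0]
predicted = [0.1, 0.3, 0.4, 0.9, 0.7]
print(f"Pearson:  {pearson(gold, predicted):.4f}")
print(f"Spearman: {spearman(gold, predicted):.4f}")
```

Pearson measures how linearly the predictions track the gold scores, while Spearman only checks that the ordering agrees, which is why the two values reported above differ slightly.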
Zero-Shot Test using FQuAD as Knowledge Base:
- Number of questions tested: 3188
- Number of documents considered: 768
- Precision@5: 0.8563
- MRR@5: 0.6898
Comparison with dangvantuan/CrossEncoder-camembert-large:
- Precision@5: 0.6688
- MRR@5: 0.4131
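The card does not define how the top-5 figures were computed. The sketch below shows one plausible reading, under the assumption that each question has a single gold document, that the precision figure is the fraction of questions whose gold document appears in the top 5, and that MRR is the mean reciprocal rank of the gold document within the top 5. The function names and toy data are hypothetical.

```python
from typing import Sequence


def hit_rate_at_k(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int = 5) -> float:
    """Fraction of questions whose gold document id is among the top k ranked ids."""
    hits = sum(1 for ranked, g in zip(rankings, gold) if g in ranked[:k])
    return hits / len(gold)


def mrr_at_k(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int = 5) -> float:
    """Mean of 1/rank of the gold document; a question scores 0 if it is outside the top k."""
    total = 0.0
    for ranked, g in zip(rankings, gold):
        top = list(ranked[:k])
        if g in top:
            total += 1.0 / (top.index(g) + 1)
    return total / len(gold)


# Toy data, made up for illustration: one ranked list of document ids per question.
rankings = [
    ["d1", "d2", "d3"],  # gold d1 at rank 1
    ["d4", "d1", "d2"],  # gold d1 at rank 2
    ["d2", "d3", "d4"],  # gold d4 at rank 3
    ["d3", "d4", "d5"],  # gold d9 not retrieved
]
gold = ["d1", "d1", "d4", "d9"]

print(f"hit rate@5: {hit_rate_at_k(rankings, gold):.4f}")
print(f"MRR@5:      {mrr_at_k(rankings, gold):.4f}")
```

In a setup like the one described above, each ranked list would come from scoring a question against the candidate documents with the cross-encoder and sorting by score.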