Knesset-multi-e5-large

This is a sentence-transformers model. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for tasks like clustering or semantic search.

Knesset-multi-e5-large is based on the intfloat/multilingual-e5-large model. The transformer encoder has been fine-tuned on Knesset data to better capture legislative and parliamentary language.


Usage (Sentence-Transformers)

Using this model is straightforward if you have sentence-transformers installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer

# Two Hebrew example sentences ("This is a first example sentence",
# "This is the second sentence")
sentences = ["זה משפט ראשון לדוגמה", "זה המשפט השני"]

model = SentenceTransformer('GiliGold/Knesset-multi-e5-large')
embeddings = model.encode(sentences)  # shape: (2, 1024)
print(embeddings)
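The intfloat/multilingual-e5 base models are trained with "query: " and "passage: " input prefixes for retrieval tasks. The card does not state whether this fine-tune keeps that convention, so treat the prefixes as an assumption worth validating on your own data; a minimal sketch of the prefixing step:

```python
def e5_prefix(texts, role):
    """Prepend the E5-style role prefix ('query' or 'passage') to each text.

    This mirrors the convention of the intfloat/multilingual-e5 base model;
    whether the fine-tuned model expects it is an assumption, not confirmed
    by this card.
    """
    assert role in ("query", "passage")
    return [f"{role}: {t}" for t in texts]

queries = e5_prefix(["מהי ההצעה לסדר היום?"], "query")            # search queries
passages = e5_prefix(["הצעת החוק עלתה לדיון במליאה."], "passage")  # documents
# The prefixed strings are then passed to model.encode(...) exactly as above.
```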

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Normalize()
)
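Because of the final Normalize() layer, the model's embeddings are unit-length, so cosine similarity reduces to a plain dot product. A small NumPy sketch of that property, using random vectors in place of real model outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(2, 1024))                   # stand-in for encoder outputs
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # what the Normalize() layer does

dot = float(emb[0] @ emb[1])
cos = float(emb[0] @ emb[1] / (np.linalg.norm(emb[0]) * np.linalg.norm(emb[1])))
assert abs(dot - cos) < 1e-9  # identical once the vectors are unit-length
```

This is why downstream code can score pairs with a matrix multiply instead of an explicit cosine computation.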

Additional Details

  • Base Model: intfloat/multilingual-e5-large
  • Fine-Tuning Data: Knesset data
  • Key Modifications: The encoder part has been fine-tuned on Knesset data to enhance performance for tasks involving legislative and parliamentary content. The original pooling and normalization layers have been retained to ensure that the model's embeddings remain consistent with the architecture of the base model.
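The retained Pooling layer (mean_tokens mode) is attention-mask-weighted mean pooling: padding positions are excluded before averaging. A minimal NumPy sketch of that step, with toy token embeddings standing in for the encoder output:

```python
import numpy as np

# Toy encoder output: batch of 1, 4 token positions, 3-dim embeddings
# (the real model produces 1024-dim vectors).
token_embeddings = np.array([[[1.0, 2.0, 3.0],
                              [3.0, 2.0, 1.0],
                              [0.0, 0.0, 0.0],   # padding position
                              [0.0, 0.0, 0.0]]])
attention_mask = np.array([[1, 1, 0, 0]])        # padding is masked out

mask = attention_mask[..., None]                 # broadcast over the embedding dim
summed = (token_embeddings * mask).sum(axis=1)   # sum only real tokens
counts = mask.sum(axis=1)                        # number of real tokens
mean_pooled = summed / counts                    # -> [[2.0, 2.0, 2.0]]
```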

Citing & Authors

@article{10.1162/COLI.a.600,
    author = {Goldin, Gili and Rabinovich, Ella and Wintner, Shuly},
    title = {Unveiling Affective Polarization Trends in Parliamentary Proceedings},
    journal = {Computational Linguistics},
    pages = {1-33},
    year = {2026},
    month = {01},
    abstract = {Recent years have seen an increase in polarized discourse worldwide, on various platforms. We propose a novel method for quantifying polarization, based on the emotional style of the discourse rather than on differences in ideological stands. Using measures of Valence, Arousal and Dominance, we detect signals of emotional discourse and use them to operationalize the concept of affective polarization. Applying this method to a recently released corpus of proceedings of the Knesset, the Israeli parliament (in Hebrew), we find that the emotional style of members of government differs from that of opposition members; and that the level of affective polarization, as reflected by this style, is significantly increasing with time.},
    issn = {0891-2017},
    doi = {10.1162/COLI.a.600},
    url = {https://doi.org/10.1162/COLI.a.600},
    eprint = {https://direct.mit.edu/coli/article-pdf/doi/10.1162/COLI.a.600/2573634/coli.a.600.pdf},
}