Meno-Lite-0.1
A 7B language model built to read, not to memorize.
💡 TL;DR
- 🎯 Focus: RAG, document QA, information extraction, knowledge graph construction, summarization
- 🧠 Core idea: train language skills (comprehension, extraction, reasoning), not factual memorization — knowledge comes from context
- 🏆 Results: top-performing 7B model on MultiQ (multi-hop QA); #1 on NEREL-bench (knowledge graph construction), outperforming models up to 32B; near-perfect passkey retrieval up to 128k tokens
- 🇷🇺 Languages: Russian (primary) + English
- ⚡ Tokenizer: 3.77 chars/token on Russian — 47% more efficient than vanilla Qwen2.5
- 🖥️ Deployment: fits on a single consumer GPU; works with vLLM and transformers out of the box
- 📜 License: Apache 2.0
Use when: you have documents and need to extract information, answer questions over them, or build a knowledge graph. Don't use when: you need a general-purpose chatbot with broad world knowledge and no retrieval pipeline.
🧠 Key idea
Why "Meno"? The name alludes to Plato's dialogue Meno, where Socrates argues that knowledge is not learned but recollected from within (ἀνάμνησις). We invert this metaphor: rather than assuming knowledge is already inside the model, we externalize it into a retrieval corpus and let the model "recollect" through a RAG pipeline. Like Socrates' interlocutor, the model doesn't carry the answers within itself — but given the right context, it can arrive at them. This is why Meno-Lite's training focuses on sharpening the skills that make such recollection possible: comprehension, extraction, inference, and generation.
We hypothesize that the capabilities of LLMs can be roughly decomposed into world knowledge (facts, dates, entities) and language skills (comprehension, extraction, inference, generation). While world knowledge demands ever more parameters, language skills appear to reach a usable plateau even in 7B-class models — provided they are deliberately cultivated. Meno-Lite-0.1 is an empirical test of this idea: by investing training compute into language skills rather than factual recall, we aim for a model that performs competitively on context-grounded tasks while remaining deployable on a single consumer GPU. The upcoming technical report will examine where this trade-off holds and where it breaks down.
🧬 Model Lineage
Meno-Lite-0.1 is derived from RuadaptQwen2.5-7B-Lite-Beta through a carefully designed two-stage training pipeline (continued pretraining → supervised fine-tuning) that sharpens the model's ability to work with documents rather than from parametric memory. The full lineage is:
Qwen/Qwen2.5-7B-Instruct
└─► t-tech/T-lite-it-1.0
└─► RefalMachine/RuadaptQwen2.5-7B-Lite-Beta
└─► bond005/Meno-Lite-0.1 ◄── you are here
Each ancestor added a layer of Russian-language adaptation; Meno-Lite-0.1 adds a final layer of skill-oriented training focused on information extraction, entity normalization, multi-hop reasoning over long contexts, and instruction following for RAG scenarios. Although the model is primarily oriented toward Russian, it retains strong English performance thanks to bilingual pretraining data (sampled FineWeb-Edu) and English-language SFT examples (MultiHopRAG, MTRAGEval).
- Developed by: Ivan Bondarenko, Novosibirsk State University (NSU)
- Model type: Causal decoder-only transformer (Qwen2.5 architecture)
- Parameters: ~7B
- Language(s): Russian (primary), English (retained)
- License: Apache 2.0
- Base model: RefalMachine/RuadaptQwen2.5-7B-Lite-Beta
⚡ Tokenizer Efficiency
An often-overlooked determinant of real-world throughput is tokenizer efficiency: the more characters each token covers, the fewer autoregressive steps are needed to generate text of a given length. Meno-Lite-0.1 inherits the extended tokenizer from RuadaptQwen2.5-7B-Lite-Beta, which dramatically improves Russian-language efficiency compared to the original Qwen2.5 vocabulary.
| Model | Chars/token (RU) | Chars/token (EN) |
|---|---|---|
| Meno-Lite-0.1 | 3.77 | 4.13 |
| RuadaptQwen2.5-7B-Lite-Beta | 3.77 | 4.13 |
| AvitoTech/avibe (8B) | 3.79 | 4.06 |
| t-tech/T-lite-it-2.1 (7B) | 3.74 | 4.14 |
| t-tech/T-lite-it-1.0 (7B) | 2.57 | 4.14 |
| Qwen/Qwen2.5-7B-Instruct | 2.57 | 4.14 |
| GigaChat3-10B-A1.8B | 3.74 | 3.99 |
Meno-Lite-0.1 achieves 3.77 characters per token on Russian text — a 47% improvement over the original Qwen2.5 tokenizer (2.57 chars/token). This translates directly into faster inference and lower serving costs for Russian-language workloads, while English efficiency remains on par with the best models in the class.
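The chars/token figures above can be reproduced on any corpus by dividing raw character count by token count. A minimal sketch of the metric follows; it uses a stand-in whitespace tokenizer purely so the snippet runs without downloading anything — substitute `AutoTokenizer.from_pretrained("bond005/meno-lite-0.1")` for real measurements.

```python
def chars_per_token(text: str, token_ids: list[int]) -> float:
    """Average number of characters each token covers; higher means fewer
    autoregressive steps (and lower cost) per character of generated text."""
    return len(text) / max(len(token_ids), 1)

# Stand-in tokenizer (whitespace split), used only to keep this sketch
# self-contained. Replace with a real HF tokenizer's encode() in practice.
def toy_tokenize(text: str) -> list[int]:
    return [hash(word) for word in text.split()]

sample = "Новосибирский государственный университет был основан в 1959 году"
ratio = chars_per_token(sample, toy_tokenize(sample))
print(f"{ratio:.2f} chars/token")
```

With the real tokenizer, averaging this ratio over a representative Russian corpus should land near the 3.77 reported above.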
📊 Evaluation
Note: A more detailed analysis of Meno-Lite-0.1's performance will be provided in an upcoming technical report.
MERA Benchmark
https://mera.a-ai.ru/ru/text/leaderboard
MERA is the most comprehensive benchmark for evaluating Russian LLMs on "strong AI" tasks. The benchmark comprises 23 tasks covering world knowledge, logic, causality, and AI ethics. Below we present results on 5 tasks chosen for their relevance to RAG and document-processing scenarios:
- MultiQ: Multi-hop question answering over multi-document contexts — directly measures core RAG capability
- RWSD: Coreference resolution (Winograd Schema) — tests discourse understanding
- RCB: Natural language inference with causality detection — evaluates reasoning over text
- CheGeKa: World knowledge QA — included for comparison to show the model's intentional design trade-off
- ruWorldTree: Elementary science facts — tests knowledge vs. reasoning balance
Note that the Overall Score column reflects performance across all 23 MERA tasks, not just the 5 shown here.
| Model | Size | Overall Score | MultiQ | RWSD | RCB | CheGeKa | ruWorldTree |
|---|---|---|---|---|---|---|---|
| GPT-4o | - | 0.642 | 0.572 / 0.431 | 0.496 | 0.557 / 0.521 | 0.553 / 0.464 | 0.985 / 0.985 |
| Meno-Lite-0.1 | 7B | 0.555 | 0.536 / 0.403 | 0.569 | 0.541 / 0.458 | 0.346 / 0.293 | 0.949 / 0.760 |
| T-lite-it-1.0 | 7B | 0.552 | 0.523 / 0.398 | 0.535 | 0.571 / 0.533 | 0.502 / 0.413 | 0.964 / 0.964 |
| AvitoTech/avibe | 8B | 0.618 | 0.539 / 0.410 | 0.565 | 0.582 / 0.547 | 0.168 / 0.120 | 0.968 / 0.968 |
| RuadaptQwen2.5-7B-Lite-Beta | 7B | 0.536 | 0.479 / 0.342 | 0.465 | 0.553 / 0.458 | 0.379 / 0.308 | 0.960 / 0.960 |
| Qwen2.5-7B-Instruct | 7B | 0.482 | 0.425 / 0.296 | 0.515 | 0.562 / 0.493 | 0.077 / 0.048 | 0.939 / 0.939 |
Key observations:
- Meno-Lite-0.1 achieves solid results within its size class (7B parameters), improving notably over its direct ancestors (RuadaptQwen2.5-7B-Lite-Beta and Qwen2.5-7B-Instruct)
- The model shows competitive performance on MultiQ (multi-hop question answering), which is particularly relevant for RAG pipelines
- As expected by design, world knowledge tasks (CheGeKa) are not the model's strength — this is an intentional trade-off for better context-grounded performance
NEREL-bench: Knowledge Graph Construction
https://huggingface.co/datasets/bond005/NEREL_bench
NEREL-bench evaluates LLM capabilities for knowledge graph construction: named entity recognition, relation extraction, and contextual definition generation. These tasks are critical for GraphRAG and knowledge-intensive applications.
Note: NEREL-bench was developed by the author of this model. To prevent data leakage between the SFT training set (NEREL-instruct) and the evaluation set (NEREL-bench), we followed the original train/dev/test split defined in the NEREL paper (Loukachevitch et al., 2021). Only the training portion of NEREL was used to construct SFT instructions; the test portion was held out and used exclusively for evaluation in NEREL-bench. However, because Meno-Lite-0.1 was exposed to the NEREL annotation schema and text domain during SFT, it may have a distributional advantage over models that were not. We encourage independent replication on other IE benchmarks.
| Model | Size | RuEntityRecognition (F1) | RuEntityDefinition (chrF++) | RuRelationExtraction (F1) | RuRelationDefinition (chrF++) | Harmonic Mean |
|---|---|---|---|---|---|---|
| Meno-Lite-0.1 | 7B | 0.5043 | 0.5273 | 0.3469 | 0.5582 | 0.4676 |
| Qwen2.5-32B-Instruct | 32B | 0.5361 | 0.5275 | 0.2393 | 0.5993 | 0.4163 |
| gemma-3-12b-it | 12B | 0.5136 | 0.4955 | 0.2450 | 0.5649 | 0.4075 |
| gemma-3-27b-it | 27B | 0.5436 | 0.4818 | 0.2243 | 0.5827 | 0.3964 |
| Qwen2.5-14B-Instruct | 14B | 0.5096 | 0.5182 | 0.2222 | 0.5829 | 0.3957 |
| Qwen2.5-7B-Instruct | 7B | 0.4770 | 0.4790 | 0.1919 | 0.5411 | 0.3558 |
| AvitoTech/avibe | 8B | 0.4683 | 0.4351 | 0.2207 | 0.3971 | 0.3483 |
| T-lite-it-1.0 | 7B | 0.4660 | 0.4644 | 0.1741 | 0.5329 | 0.3356 |
| T-lite-it-2.1 | 8B | 0.4889 | 0.3933 | 0.1308 | 0.5469 | 0.2845 |
| RuadaptQwen2.5-7B-Lite-Beta | 7B | 0.4208 | 0.3925 | 0.1215 | 0.5041 | 0.2642 |
Meno-Lite-0.1 achieves the highest harmonic mean score, outperforming models 2–4× larger on knowledge graph construction tasks. Keeping in mind the distributional advantage noted above, this result suggests that the model is well-suited for:
- GraphRAG pipelines
- Automated knowledge base construction
- Document analysis and entity extraction
- Building structured representations from unstructured text
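For GraphRAG-style use, the model's relation-extraction output can be post-processed into graph triples. The JSON schema below is an assumption for illustration — adapt it to whatever prompt format you actually use (see the NEREL-bench card for concrete prompts).

```python
import json

# Hypothetical relation-extraction output; the exact schema is illustrative only.
raw = json.dumps({
    "WORKS_AS": [{"subject": "Иван Бондаренко", "object": "научный сотрудник"}],
    "MEMBER_OF": [{"subject": "Иван Бондаренко", "object": "НГУ"}],
}, ensure_ascii=False)

def to_triples(model_output: str) -> list[tuple[str, str, str]]:
    """Flatten {relation: [{subject, object}, ...]} into (head, relation, tail) triples."""
    data = json.loads(model_output)
    return [
        (pair["subject"], relation, pair["object"])
        for relation, pairs in data.items()
        for pair in pairs
    ]

for head, rel, tail in to_triples(raw):
    print(head, "->", rel, "->", tail)
```

The resulting triples can be loaded directly into a graph store or used to build the entity/relation index of a GraphRAG pipeline.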
LIBRA: Long-Context Understanding
https://huggingface.co/datasets/ai-forever/LIBRA
LIBRA evaluates long-context understanding across tasks ranging from 4k to 128k tokens.
Simple Information Retrieval (Passkey)
| Model | 4k | 8k | 16k | 32k | 64k | 128k |
|---|---|---|---|---|---|---|
| Meno-Lite-0.1 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.98 |
| T-lite-it-2.1 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.895 |
| AvitoTech/avibe | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.895 |
| RefalMachine/RuadaptQwen2.5-7B-Lite-Beta | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.98 |
| T-lite-it-1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.58 |
| Qwen2.5-7B-Instruct | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.58 |
Meno-Lite-0.1 maintains near-perfect passkey retrieval performance across all context lengths, including 128k tokens.
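A passkey probe in the spirit of this task is easy to construct yourself: bury a key inside filler text of a target length and ask the model to retrieve it. The filler sentence and question wording below are illustrative, not the LIBRA originals.

```python
import random

def build_passkey_prompt(passkey: str, approx_chars: int, seed: int = 0) -> str:
    """Bury a passkey at a random position inside repeated filler text."""
    rng = random.Random(seed)
    filler = "Трава зелёная. Небо голубое. "  # illustrative filler, not LIBRA's
    needle = f"Запомните: секретный код равен {passkey}. "
    n_blocks = max(approx_chars // len(filler), 2)
    blocks = [filler] * n_blocks
    blocks.insert(rng.randrange(1, n_blocks), needle)
    return "".join(blocks) + "\nВопрос: какой секретный код упомянут в тексте?"

prompt = build_passkey_prompt("74213", approx_chars=2000)
print(len(prompt), "characters")
```

Scale `approx_chars` up to probe the 4k–128k token range; success is simply whether the passkey appears in the model's answer.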
Multi-hop Question Answering
Scores are reported as a range from shortest to longest context (4k → 128k).
| Model | LibrusecMHQA (8k) | ruBABILongQA1 (4k→128k) | ruBABILongQA4 (4k→128k) | ruBABILongQA5 (4k→128k) |
|---|---|---|---|---|
| Meno-Lite-0.1 | 0.484 | 0.72 → 0.36 | 0.56 → 0.22 | 0.80 → 0.54 |
| Qwen2.5-14B-Instruct | 0.484 | 0.90 → 0.38 | 0.66 → 0.15 | 0.86 → 0.64 |
| T-lite-it-2.1 | 0.453 | 0.77 → 0.44 | 0.60 → 0.27 | 0.79 → 0.69 |
| T-lite-it-1.0 | 0.456 | 0.74 → 0.34 | 0.56 → 0.15 | 0.81 → 0.54 |
| RefalMachine/RuadaptQwen2.5-7B-Lite-Beta | 0.432 | 0.74 → 0.29 | 0.59 → 0.22 | 0.79 → 0.49 |
| Qwen2.5-7B-Instruct | 0.419 | 0.65 → 0.48 | 0.62 → 0.08 | 0.81 → 0.69 |
Meno-Lite-0.1 shows competitive multi-hop reasoning at shorter contexts, matching Qwen2.5-14B-Instruct on LibrusecMHQA. Performance degrades at very long contexts, which is consistent with other models in this size class.
LLM Tool Calling Benchmark (BFCL Russian)
https://github.com/MKreGGo/ru_tool_calling_tests
For completeness, we include tool-calling results, although function calling is not a target capability of Meno-Lite-0.1.
| Model | Overall Success Rate |
|---|---|
| T-lite-it-2.1 | 84.5% |
| Qwen3-8B (thinking mode) | 80.3% |
| Qwen2.5-7B-Instruct | 76.1% |
| AvitoTech/avibe | 69.2% |
| Meno-Lite-0.1 | 58.9% |
| RefalMachine/RuadaptQwen2.5-7B-Lite-Beta | 58.5% |
| T-lite-it-1.0 | 2.9% |
Function-calling performance is moderate. Meno-Lite-0.1 essentially matches its direct ancestor, RuadaptQwen2.5-7B-Lite-Beta (58.9% vs. 58.5%), and far outperforms the earlier T-lite-it-1.0 (2.9%), but lags behind models with dedicated tool-calling training.
👨‍💻 Usage
How to Get Started with the Model
1. RAG Question Answering
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "bond005/meno-lite-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
SYSTEM_PROMPT = "Вы — полезный ассистент. Отвечайте на вопросы, опираясь на предоставленный контекст."
CHUNKS = [
"Новосибирский государственный университет (НГУ) был основан в 1959 году в Академгородке.",
"12 сентября 1959 года был успешно осуществлён запуск автоматической межпланетной станции «Луна-2». "
"14 сентября 1959 года станция «Луна-2» впервые в мире достигла поверхности Луны в районе Моря Дождей "
"вблизи кратеров Аристилл, Архимед и Автолик.",
"Московский государственный университет имени М. В. Ломоносова (МГУ) был основан в 1755 году. "
"Изначально университет располагался в здании Главной аптеки (бывший Земский приказ) на месте "
"Государственного исторического музея на Красной площади.",
]
CONTEXT = "\n\n".join([f"Контекст {idx + 1}:\n```text\n{val}\n```" for idx, val in enumerate(CHUNKS)])
question = "Какой университет был основан в том же году, когда впервые в истории рукотворный аппарат достиг поверхности Луны?"
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": CONTEXT + "\n\nВопрос: " + question + "\n"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
Expected output:
Новосибирский государственный университет (НГУ) был основан в том же году, когда впервые в истории рукотворный аппарат достиг поверхности Луны.
2. Multi-hop Reasoning
This example reuses the same model, tokenizer, and context from the previous snippet.
question = "Через сколько лет после университета в Москве был основан университет в Новосибирске?"
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": CONTEXT + "\n\nВопрос: " + question + "\n"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
Expected output:
Университет в Новосибирске был основан через 204 года после Московского государственного университета.
3. Few-Shot Named Entity Recognition
This example reuses the same model and tokenizer from the first snippet.
import json
few_shot_messages = [
{
"role": "system",
"content": "Вы - эксперт в области анализа текстов и извлечения семантической информации из них."
},
{
"role": "user",
"content": "Выделите именованные сущности классов ORGANIZATION, PERSON и LOCATION из входного текста "
"и запишите ответ в JSON-формате.\n\n"
"Входной текст:\n\n```text\n"
"Научный сотрудник лаборатории прикладных цифровых технологий Международного "
"научно-образовательного математического центра НГУ Иван Бондаренко рассказал "
"о грантовой программе и о том, как его проект RAGU попал в число победителей.\n```\n"
},
{
"role": "assistant",
"content": '{"ORGANIZATION": ["лаборатория прикладных цифровых технологий Международного '
'научно-образовательного математического центра НГУ", '
'"Международный научно-образовательный математический центр НГУ", "НГУ"], '
'"PERSON": ["Иван Бондаренко"], "LOCATION": []}'
},
{
"role": "user",
"content": "Выделите именованные сущности классов ORGANIZATION, PERSON и LOCATION из входного текста "
"и запишите ответ в JSON-формате.\n\n"
"Входной текст:\n\n```text\n"
"Национальный исследовательский университет «Высшая школа экономики» (НИУ ВШЭ) представил "
"результаты 15-го мониторинга качества приема на бюджетные и платные места российских вузов "
"в 2025 году. В группе лидеров 10 московских университетов, три питерских и по одному "
"представителю из таких регионов, как Татарстан (Иннополис), Нижний Новгород "
"и Новосибирск (НГУ).\n```\n"
},
{
"role": "assistant",
"content": '{"ORGANIZATION": ["Национальный исследовательский университет «Высшая школа экономики»", '
'"НИУ ВШЭ", "НГУ"], "PERSON": [], "LOCATION": ["московский", "питерский", "Татарстан", '
'"Иннополис", "Нижний Новгород", "Новосибирск"]}'
},
{
"role": "user",
"content": "Выделите именованные сущности классов ORGANIZATION, PERSON и LOCATION из входного текста "
"и запишите ответ в JSON-формате.\n\n"
"Входной текст:\n\n```text\n"
"Почему китайская ИИ-модель DeepSeek гораздо эффективнее и дешевле западных аналогов?\n```\n"
},
{
"role": "assistant",
"content": '{"ORGANIZATION": [], "PERSON": [], "LOCATION": ["китайская", "западный"]}'
}
]
input_text = (
"Станислав Владимирович Дробышевский – российский антрополог, кандидат биологических наук, "
"доцент кафедры антропологии биологического факультета МГУ им. М.В. Ломоносова, научный редактор "
"портала \u201cАнтропогенез.ру\u201d и, без сомнения, одна из самых ярких и узнаваемых фигур "
"в российской науке."
)
text = tokenizer.apply_chat_template(
few_shot_messages + [{"role": "user", "content": input_text}],
tokenize=False,
add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
response = json.loads(
tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
)
print(json.dumps(response, ensure_ascii=False, indent=4))
Expected output:
{
"ORGANIZATION": [
"биологический факультет МГУ им. М.В. Ломоносова",
"МГУ им. М.В. Ломоносова"
],
"PERSON": [
"Станислав Владимирович Дробышевский"
],
"LOCATION": [
"российская"
]
}
Using vLLM for high-throughput serving
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
model_name = "bond005/meno-lite-0.1"
tok = AutoTokenizer.from_pretrained(model_name)
llm = LLM(
model=model_name,
dtype="bfloat16",
max_model_len=32768,
gpu_memory_utilization=0.85
)
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)
messages = [
{
"role": "system",
"content": "Вы — Менон, разработанный в Новосибирском государственном университете. Вы — полезный помощник."
},
{
"role": "user",
"content": "Привет! Расскажи о себе."
}
]
input_text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = llm.generate([input_text], sampling_params)
print(outputs[0].outputs[0].text)
Expected output (with temperature=0.7, the exact wording will vary):
Привет! Меня зовут Менон, и я — виртуальный помощник, созданный в Новосибирском государственном университете. Я здесь, чтобы помочь вам с различными вопросами и задачами.
Additional Examples of Usage for NER and Relation Extraction
Important Note on Few-Shot Prompting
Using few-shot prompting (in-context learning) significantly improves Meno-Lite-0.1's performance on NER and relation extraction tasks and is therefore strongly recommended. The examples below demonstrate this approach.
The set of entity and relation classes is not limited to those shown in the examples — you can define any classes relevant to your domain. For more detailed examples covering diverse entity and relation types, see the NEREL-bench dataset card.
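When you swap in your own entity or relation classes, it helps to assemble the few-shot message list programmatically from (text, gold answer) pairs. A sketch of one way to do this; the instruction wording loosely mirrors the NER examples above (the code-fence wrapper around the input text is omitted here for brevity), and the helper name is ours, not an API of the model.

```python
import json

INSTRUCTION = ("Выделите именованные сущности классов {classes} из входного текста "
               "и запишите ответ в JSON-формате.\n\nВходной текст:\n\n{text}\n")

def build_few_shot(examples: list[tuple[str, dict]], query: str, classes: str) -> list[dict]:
    """Turn (text, gold_json) pairs into the user/assistant message list
    expected by tokenizer.apply_chat_template."""
    messages = [{
        "role": "system",
        "content": "Вы - эксперт в области анализа текстов и извлечения "
                   "семантической информации из них."
    }]
    for text, gold in examples:
        messages.append({"role": "user", "content": INSTRUCTION.format(classes=classes, text=text)})
        messages.append({"role": "assistant", "content": json.dumps(gold, ensure_ascii=False)})
    messages.append({"role": "user", "content": INSTRUCTION.format(classes=classes, text=query)})
    return messages

msgs = build_few_shot(
    [("НГУ основан в 1959 году.", {"ORGANIZATION": ["НГУ"], "PERSON": [], "LOCATION": []})],
    query="МГУ основан в 1755 году.",
    classes="ORGANIZATION, PERSON и LOCATION",
)
print(len(msgs), "messages")
```

One or two demonstrations per class are usually enough to lock the model onto your schema.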
Named Entity Recognition
https://colab.research.google.com/drive/1onh4ovG7iGEjr3SX_IeZ-ZmdO6VpOWBR?usp=sharing
Relation Extraction
https://colab.research.google.com/drive/1P5qQyjrv811jAKGqVZ4oSVO5M5WHsSzv?usp=sharing
📚 Training Details
Continued Pretraining (CPT) data
1.3B tokens
| Source | Language | Description |
|---|---|---|
| FineWeb-Edu (randomly sampled) | EN | High-quality educational web text |
| RuLM subset filtered by quality | RU | Russian web text selected for maximal FineWeb-Edu similarity using gte-multilingual-base embeddings |
| RU FinePDFs-edu | RU | Educational PDF documents in Russian |
| RuREBus (Dialogue'20) | RU | Unlabeled text corpus from the RuREBus shared task |
Supervised Fine-Tuning (SFT) data
50M tokens
| Source | Language | Description |
|---|---|---|
| NEREL-instruct → instructions | RU | Named entity recognition corpus converted to instruction format, plus LLM-generated and validated synthetic instructions for entity normalization and definitions |
| LightRAG query logs | RU | GPT-4o-generated queries over Habr articles and the NSU website |
| MultiHopRAG | EN | Multi-hop question answering training dialogs |
| MTRAGEval | EN | Multi-turn RAG evaluation training dialogs |
| Additional custom instructions | RU | Manually created samples for self-cognition and alignment |
Training Procedure
Stage 1 — Continued Pretraining (CPT): The model was further pretrained on a balanced mix of Russian and English educational, legal, and scientific-technical texts. The Russian subset was specifically selected to match the quality distribution of FineWeb-Edu, ensuring that the model absorbs high-quality linguistic patterns rather than noisy web crawls.
Stage 2 — Supervised Fine-Tuning (SFT): The SFT stage used a custom instruction set designed to reinforce extraction, normalization, summarization, and multi-hop QA capabilities. The critical distinction from conventional SFT: our instructions teach models to use context rather than to recall facts.
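The actual SFT sample schema is not published; the sketch below is purely illustrative of the principle. Each sample supplies the evidence in-context, so the target answer is derivable by reading, never by parametric recall.

```python
# Illustrative only: the real SFT data format is not published. The point is that
# every sample carries its own evidence, so the gold answer is grounded in the
# provided context rather than in memorized facts.
context_grounded_sample = {
    "messages": [
        {"role": "system",
         "content": "Отвечайте, опираясь на предоставленный контекст."},
        {"role": "user",
         "content": "Контекст: НГУ был основан в 1959 году.\n\nВопрос: Когда был основан НГУ?"},
        {"role": "assistant",
         "content": "НГУ был основан в 1959 году."},
    ]
}
print(context_grounded_sample["messages"][-1]["content"])
```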
⚠️ Bias, Risks, and Limitations
- Hallucination risk: Like all autoregressive LLMs, Meno-Lite-0.1 can generate plausible-sounding but factually incorrect text, especially when relevant context is not provided in the prompt.
- World knowledge gaps: The model deliberately trades factual recall capacity for context-grounded skills. It should not be used as a standalone knowledge base.
- Language coverage: While the model retains good English capabilities, it has been primarily validated on Russian and English. Performance on other languages supported by the Qwen2.5 backbone is untested.
- Training data biases: The model inherits biases present in its pretraining corpora (FineWeb-Edu, RuLM, Habr) and in the GPT-4o/GPT-4o-mini generations used for synthetic SFT data.
- Context window: Although the model handles contexts up to 128k tokens in passkey tasks, complex reasoning performance degrades at very long contexts (>32k), consistent with other models in this size class.
🎯 Recommendations
Best suited for:
- RAG pipelines — document QA, retrieval-augmented generation
- Information extraction — named entity recognition, relation extraction
- Knowledge graph construction — GraphRAG, automated KB building
- Document processing — summarization, analysis of legal/technical documents
- Structured data extraction — converting unstructured text to structured formats
Not recommended for:
- General-purpose conversational AI without context grounding
- Tasks requiring extensive world knowledge not provided in context
- Complex mathematical reasoning
- Production code generation
📖 Citation
If you use Meno-Lite-0.1 in your research, please cite:
BibTeX:
@misc{bondarenko2026menolite,
title={Meno-Lite-0.1: A 7B Language Model Optimized for Russian RAG Pipelines},
author={Ivan Bondarenko},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/bond005/meno-lite-0.1}
}
📜 License
This model is released under the Apache 2.0 license.
🙏 Acknowledgments
This model was developed at Novosibirsk State University. Special thanks to:
- The Qwen team for the base Qwen2.5 architecture
- The T-Tech team for T-lite-it-1.0
- Mikhail Tikhomirov and his colleagues for RuadaptQwen2.5-7B-Lite-Beta
- Natalia Loukachevitch and her colleagues for NEREL
- The creators of MERA, LIBRA, and LLM Tool Calling benchmarks