MindscapeRAG committed on
Commit e71673f · verified
1 Parent(s): 4164bc2

Update README.md

  1. README.md +227 -3
README.md CHANGED
@@ -1,3 +1,227 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ - zh
6
+ base_model:
7
+ - Qwen/Qwen3-Embedding-4B
8
+ tags:
9
+ - embedding
10
+ - retriever
11
+ - RAG
12
+ ---
13
+
14
+ # Mindscape-Aware RAG (MiA-RAG)
15
+
16
+ [![Paper](https://img.shields.io/badge/Paper-arXiv%3A2512.17220-red)](https://arxiv.org/pdf/2512.17220)
17
+ [![Model](https://img.shields.io/badge/HuggingFace-MiA--Emb--4B-yellow)](https://huggingface.co/MindscapeRAG/MiA-Emb-4B)
18
+
19
+ This repository provides the inference implementation for **MiA-Emb (Mindscape-Aware Embedding)**, the retriever component in the **MiA-RAG** framework.
20
+
21
+ **MiA-RAG** introduces explicit **global context awareness** via a **Mindscape**—a document-level semantic scaffold constructed by **hierarchical summarization**. By conditioning **both retrieval and generation** on the same Mindscape, MiA-RAG enables globally grounded retrieval and more coherent long-context reasoning.
22
+
23
+ ---
24
+
25
+ ## ✨ Key Features
26
+
27
+ - **Mindscape as Global Semantic Scaffold**
28
+ Builds a Mindscape through **hierarchical bottom-up summarization** (chunk summaries → global summary) and uses it as persistent global memory.
29
+
30
+ - **Mindscape-Aware Capabilities**
31
+ Supports the three core benefits for long-context understanding:
32
+ - **Enriched Understanding**: fill in missing context and resolve underspecified meanings
33
+ - **Selective Retrieval**: bias retrieval toward the active topic’s semantic frame
34
+ - **Integrative Reasoning**: interpret retrieved evidence within a coherent global context
35
+
36
+ - **Dual-Granularity Retrieval**
37
+ - **Chunk Retrieval** for narrative passages (standard RAG)
38
+ - **Node Retrieval** for knowledge graph entities (GraphRAG-style)
39
+
40
+ - **State-of-the-Art Retrieval Performance**
41
+ Reports strong results on long-context benchmarks such as NarrativeQA and DetectiveQA, outperforming baselines including Qwen3-Embedding and [SitEmb](https://huggingface.co/SituatedEmbedding/SitEmb-v1.5-Qwen3).
42
+
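The bottom-up construction described above (chunk summaries → global summary) can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: `build_mindscape`, `toy_summarize`, and `group_size` are hypothetical names, and the toy summarizer stands in for whatever LLM-based summarizer is actually used.

```python
from typing import Callable, List


def build_mindscape(
    chunks: List[str],
    summarize: Callable[[str], str],
    group_size: int = 4,
) -> str:
    """Bottom-up summarization: chunk summaries -> merged summaries -> one global summary."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so that each level shrinks")
    if not chunks:
        return ""
    # Level 0: summarize every chunk independently.
    level = [summarize(chunk) for chunk in chunks]
    # Merge neighbouring summaries until a single global summary remains.
    while len(level) > 1:
        level = [
            summarize("\n".join(level[i:i + group_size]))
            for i in range(0, len(level), group_size)
        ]
    return level[0]


# Hypothetical stand-in for an LLM summarizer: keep the first five words.
def toy_summarize(text: str) -> str:
    return " ".join(text.split()[:5])


chunks = [
    "Chapter one introduces the hero and her village.",
    "Chapter two follows the journey north.",
]
mindscape = build_mindscape(chunks, toy_summarize)
print(mindscape)
```

The resulting string plays the role of the `summary` argument passed to the query prompt in the usage section.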
43
+
44
+ ---
45
+
46
+
47
+ ## 🚀 Usage
48
+
49
+ ### Installation
50
+
51
+ ```bash
52
+ pip install torch "transformers>=4.53.0"
53
+ ```
54
+
55
+ ---
56
+
57
+ ### 1) Initialization
58
+
59
+ > MiA-Emb-4B is initialized from **`Qwen3-Embedding-4B`**.
60
+
61
+ ```python
62
+ import torch
63
+ import torch.nn.functional as F
64
+ from transformers import AutoTokenizer, AutoModel
65
+
66
+ # Configuration
67
+
68
+ device = "cuda" if torch.cuda.is_available() else "cpu"
69
+
70
+ # Inference Parameters
71
+ residual = True # Enable residual connection logic
72
+ residual_factor = 0.5 # Balance between local and global
73
+ node_delimiter = "<|repo_name|>" # Special token for Node tasks
74
+
75
+ # Load Tokenizer (base)
76
+ tokenizer = AutoTokenizer.from_pretrained(
77
+     "Qwen/Qwen3-Embedding-4B",
78
+     trust_remote_code=True,
79
+     padding_side="left"
80
+ )
81
+
82
+ # Load Model
83
+ model = AutoModel.from_pretrained(
84
+     "MindscapeRAG/MiA-Emb-4B",
85
+     trust_remote_code=True,
86
+     torch_dtype=torch.bfloat16,
87
+     attn_implementation="flash_attention_2",  # requires the flash-attn package and a CUDA GPU
88
+     device_map={"": 0}  # place the whole model on GPU 0
89
+ )
90
+ ```
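The loading call above hard-codes `flash_attention_2`, which fails when the `flash-attn` package or a CUDA device is missing. A minimal sketch of a fallback, assuming PyTorch's `"sdpa"` attention is an acceptable substitute (the README does not say whether the model was evaluated with it); `pick_attn_implementation` is a hypothetical helper:

```python
import importlib.util


def pick_attn_implementation(cuda_available: bool) -> str:
    """Return "flash_attention_2" only when it can plausibly be used, else "sdpa"."""
    has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    return "flash_attention_2" if (cuda_available and has_flash_attn) else "sdpa"


# In the setup above this would be called as
# pick_attn_implementation(torch.cuda.is_available()).
print(pick_attn_implementation(cuda_available=False))
```

The returned string can be passed as `attn_implementation=` to `AutoModel.from_pretrained`.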
91
+
92
+ ---
93
+
94
+ ### 2) Chunk Retrieval
95
+
96
+ Use this mode to retrieve narrative text chunks. A **Global Summary** is injected into the prompt as the “Mindscape”.
97
+
98
+ ```python
99
+ def get_query_prompt(query, summary="", residual=False):
100
+     """Construct input prompt with global summary (Eq. 5 in paper)."""
101
+     task_desc = "Given a search query with the book's summary, retrieve relevant chunks or helpful entities summaries from the given context that answer the query"
102
+     summary_prefix = "\n\nHere is the summary providing possibly useful global information. Please encode the query based on the summary:\n"
103
+
104
+     # Insert PAD token to capture residual embedding before the summary
105
+     middle_token = tokenizer.pad_token if residual else ""
106
+
107
+     return (
108
+         f"Instruct: {task_desc}\n"
109
+         f"Query: {query}{middle_token}{summary_prefix}{summary}{node_delimiter}"
110
+     )
111
+
112
+ def encode_chunk(texts, is_query=False, residual=False):
113
+     batch = tokenizer(
114
+         texts,
115
+         max_length=4096,
116
+         padding=True,
117
+         truncation=True,
118
+         return_tensors="pt"
119
+     ).to(model.device)
120
+
121
+     outputs = model(**batch)
122
+
123
+     # 1) Main Embedding (Last Token); last_token_pool is a helper not defined in this README
124
+     emb_main = last_token_pool(outputs.last_hidden_state, batch["attention_mask"])
125
+
126
+     # 2) Residual Embedding (PAD Token); extract_residual_token is likewise not defined here
127
+     emb_res = None
128
+     if residual and is_query:
129
+         emb_res = extract_residual_token(outputs, batch, tokenizer.pad_token_id)
130
+
131
+     emb_main = F.normalize(emb_main, p=2, dim=-1)
132
+     emb_res = F.normalize(emb_res, p=2, dim=-1) if emb_res is not None else None
133
+     return emb_main, emb_res
134
+
135
+
136
+ # --- Example ---
137
+ query = "Who is the protagonist?"
138
+ global_summ = "A summary of the entire book..."
139
+ chunk = "Harry looked at the scar on his forehead."
140
+
141
+ # Encode
142
+ q_emb, q_res = encode_chunk(
143
+     [get_query_prompt(query, global_summ, residual=True)],
144
+     is_query=True,
145
+     residual=True
146
+ )
147
+ c_emb, _ = encode_chunk([chunk], is_query=False)
148
+
149
+ # Score Fusion
150
+ score = q_emb @ c_emb.T
151
+ if q_res is not None:
152
+     score = (1 - residual_factor) * score + residual_factor * (q_res @ c_emb.T)
153
+
154
+ print(f"Chunk Similarity: {score.item():.4f}")
155
+ ```
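The snippet above calls `last_token_pool` and `extract_residual_token` without defining them. The position lookup they presumably rely on can be sketched in plain Python as below. The function names are hypothetical, and reading the residual at the first attended PAD token is an assumption based on the prompt layout: padding added by the tokenizer has `attention_mask == 0`, while a PAD token written into the prompt text is attended.

```python
from typing import List, Optional


def last_attended_position(attention_mask: List[int]) -> int:
    """Index of the last token with attention_mask == 1 (works for left or right padding)."""
    for pos in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[pos] == 1:
            return pos
    raise ValueError("sequence has no attended tokens")


def first_attended_token_position(
    input_ids: List[int], attention_mask: List[int], token_id: int
) -> Optional[int]:
    """Index of the first attended occurrence of token_id, or None if it is absent."""
    for pos, (tok, mask) in enumerate(zip(input_ids, attention_mask)):
        if mask == 1 and tok == token_id:
            return pos
    return None


# Toy example with made-up ids: 0 is the PAD id, the row is left-padded by two
# tokens, and a PAD token was inserted on purpose after the query tokens (11, 12).
PAD = 0
input_ids = [PAD, PAD, 11, 12, PAD, 21, 22, 23]
attention_mask = [0, 0, 1, 1, 1, 1, 1, 1]

main_pos = last_attended_position(attention_mask)
res_pos = first_attended_token_position(input_ids, attention_mask, PAD)
print(main_pos, res_pos)
```

With real model outputs, these indices would select rows of `outputs.last_hidden_state` for each sequence in the batch.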
156
+
157
+ ---
158
+
159
+ ### 3) Node Retrieval
160
+
161
+ MiA-Emb can also retrieve knowledge graph entities (**Nodes**). In this mode the query embedding is read from the position of the `<|repo_name|>` token, which `get_query_prompt` appends to the end of the prompt.
162
+
163
+ **Candidate format:**
164
+ `Entity Name : Entity Description`
165
+
166
+ Example:
167
+ `Mary Campbell Smith : Mary Campbell Smith is mentioned as the translator...`
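Building candidates in this format is a one-line string operation. A small sketch, where `format_node_candidate` is a hypothetical helper name and the exact whitespace handling is an assumption:

```python
def format_node_candidate(name: str, description: str) -> str:
    """Render an entity as "Entity Name : Entity Description"."""
    return f"{name.strip()} : {description.strip()}"


candidate = format_node_candidate("Harry Potter", "The main protagonist of the series...")
print(candidate)
```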
168
+
169
+ ```python
170
+ def encode_node_query(texts, residual=True, node_delimiter="<|repo_name|>"):
171
+     batch = tokenizer(texts, padding=True, return_tensors="pt").to(model.device)
172
+     outputs = model(**batch)
173
+
174
+     # 1) Node Main Embedding: extract from <|repo_name|> position
175
+     node_id = tokenizer.encode(node_delimiter, add_special_tokens=False)[0]
176
+     q_emb_node = extract_specific_token(outputs, batch, node_id)
177
+
178
+     # 2) Residual Embedding: extract from [PAD] position
179
+     q_emb_res = extract_residual_token(outputs, batch, tokenizer.pad_token_id) if residual else None
180
+
181
+     q_emb_node = F.normalize(q_emb_node, p=2, dim=-1)
182
+     q_emb_res = F.normalize(q_emb_res, p=2, dim=-1) if q_emb_res is not None else None
183
+     return q_emb_node, q_emb_res
184
+
185
+
186
+ # --- Example ---
187
+ query = "Who is the protagonist?"
188
+ global_summ = "A summary of the entire book..."
189
+
190
+ # 1) Encode Query (Node Token)
191
+ q_emb_node, q_emb_res = encode_node_query(
192
+     [get_query_prompt(query, global_summ, residual=True)],
193
+     residual=True
194
+ )
195
+
196
+ # 2) Encode Entity Candidate
197
+ entity_text = "Harry Potter : The main protagonist of the series..."
198
+ n_emb, _ = encode_chunk([entity_text], is_query=False)
199
+
200
+ # 3) Score Fusion
201
+ final_score = q_emb_node @ n_emb.T
202
+ if q_emb_res is not None:
203
+     final_score = (1 - residual_factor) * final_score + residual_factor * (q_emb_res @ n_emb.T)
204
+
205
+ print(f"Node Similarity: {final_score.item():.4f}")
206
+ ```
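Both retrieval modes end with a weighted fusion of the main and residual similarities. The sketch below restates that fusion on plain Python lists and uses it to rank several candidates. `dot`, `fused_score`, and `rank` are hypothetical helpers, the vectors are made-up unit vectors, and the score is left unscaled when no residual embedding is available.

```python
from typing import List, Optional, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def fused_score(
    q_main: Sequence[float],
    cand: Sequence[float],
    q_res: Optional[Sequence[float]] = None,
    residual_factor: float = 0.5,
) -> float:
    """(1 - f) * <q_main, cand> + f * <q_res, cand>, or the plain main score without q_res."""
    main = dot(q_main, cand)
    if q_res is None:
        return main
    return (1 - residual_factor) * main + residual_factor * dot(q_res, cand)


def rank(q_main, candidates, q_res=None, residual_factor=0.5) -> List[int]:
    """Candidate indices sorted from highest to lowest fused score."""
    scores = [fused_score(q_main, c, q_res, residual_factor) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 2-d unit vectors, for illustration only.
q_main = [1.0, 0.0]
q_res = [0.0, 1.0]
candidates = [[1.0, 0.0], [0.6, 0.8]]

order_main_only = rank(q_main, candidates)
order_fused = rank(q_main, candidates, q_res)
print(order_main_only, order_fused)
```

In this toy case the residual embedding changes which candidate ranks first, which is the effect `residual_factor` controls.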
207
+
208
+ ---
209
+
210
+
211
+ ## 📜 Citation
212
+
213
+
214
+
215
+ If you find this work useful, please cite:
216
+
217
+ ```bibtex
218
+ @misc{li2025mindscapeawareretrievalaugmentedgeneration,
219
+ title={Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding},
220
+ author={Yuqing Li and Jiangnan Li and Zheng Lin and Ziyan Zhou and Junjie Wu and Weiping Wang and Jie Zhou and Mo Yu},
221
+ year={2025},
222
+ eprint={2512.17220},
223
+ archivePrefix={arXiv},
224
+ primaryClass={cs.CL},
225
+ url={https://arxiv.org/abs/2512.17220},
226
+ }
227
+ ```