Multilingual Knowledge RAG Bot – Cross-Lingual Retrieval-Augmented Generation

This model performs cross-lingual question answering using Retrieval-Augmented Generation (RAG).
It accepts documents in multiple languages (Urdu, Hindi, Spanish, and English) and can answer in the same language or a different one.

Key Features

  • LLM Used: Meta-Llama-3-8B-Instruct
  • Embedding Model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • RAG Pipeline: FAISS-based vector search + context injection
  • Training/Processing: Implemented entirely in Google Colab using open-source tools only
  • Zero paid APIs: 100% free to run and deploy
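
The vector-search step of the pipeline can be illustrated with a minimal, dependency-free sketch. It is not the code used in this project: the toy 3-dimensional vectors stand in for real embeddings from the multilingual MiniLM model, and the helper names `normalize` and `top_k` are hypothetical. It performs an exact cosine-similarity top-k search, which gives the same ranking as an exact FAISS inner-product index (`IndexFlatIP`) over L2-normalized vectors.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_index, cosine_score) pairs most similar to the query."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(doc_vecs):
        d = normalize(vec)
        # Inner product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy vectors (assumed values) standing in for document and query embeddings.
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]
query_vec = [1.0, 0.0, 0.0]

hits = top_k(query_vec, doc_vecs, k=2)
for idx, score in hits:
    print(f"doc {idx}: cosine similarity {score:.3f}")
```

Because the embedding model maps all supported languages into one shared vector space, a query in one language can in principle retrieve passages written in another; the retrieved passages are then injected into the LLM prompt as context.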

Techniques Used

  • Vector Database: FAISS for similarity search
  • Cross-Lingual Embeddings: multilingual sentence transformers
  • Prompt Engineering: Context-aware question answering
  • Open-Source Deployment Ready: Hugging Face Spaces compatible
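
Context-aware prompting for cross-lingual answers might look like the sketch below. The prompt wording, the function name `build_prompt`, and the `answer_language` parameter are assumptions for illustration, not the template shipped with this model. The idea is to number the retrieved passages, inject them as context, and state the desired answer language explicitly.

```python
def build_prompt(question, passages, answer_language="English"):
    """Assemble a RAG prompt from retrieved passages (hypothetical template)."""
    # Number each passage so the model can tell the sources apart.
    context = "\n\n".join(
        f"[{i}] {p.strip()}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below. "
        f"Respond in {answer_language}. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Example: passages in two languages, answer requested in a third.
passages = [
    "La capital de Pakistán es Islamabad.",
    "Lahore is the second-largest city of Pakistan.",
]
prompt = build_prompt(
    "What is the capital of Pakistan?", passages, answer_language="Urdu"
)
print(prompt)
```

With an instruction-tuned model such as Meta-Llama-3-8B-Instruct, a string like this would typically be sent as the user message through the model's chat template rather than as raw text.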

License

Apache-2.0
