---
title: EUDR Chabo Orchestrator
emoji: 🐠
colorFrom: yellow
colorTo: pink
sdk: docker
pinned: false
---
Chabo Orchestrator Documentation
Table of Contents
- Overview
- System Architecture
- Components
- Configuration
- Deployment Guide
- API Reference
- Usage Examples
- Troubleshooting
Overview
The Chabo Orchestrator is the central coordination module of the Chabo RAG system. It orchestrates the flow between multiple microservices to provide intelligent document processing and question-answering capabilities. The system is designed for deployment on HuggingFace Spaces.
Key Features
- Workflow Orchestration: Uses LangGraph to manage complex processing pipelines
- Multi-Modal Support: Handles file types according to the ChatUI and Ingestor configuration (e.g. PDF, DOCX, GeoJSON, and JSON)
- Streaming Responses: Real-time response generation with Server-Sent Events (SSE)
- Dual Processing Modes:
- Direct Output Mode: Returns ingestor results immediately (e.g. EUDR use case)
- Standard RAG Mode: Full retrieval-augmented generation pipeline
- Intelligent Caching: Prevents redundant file processing (e.g. EUDR use case)
- Multiple Interfaces: FastAPI endpoints for modules; LangServe endpoints for ChatUI; Gradio UI for testing
System Architecture
High-Level Architecture
┌─────────────────┐
│ ChatUI │
│ Frontend │
└────────┬────────┘
│ HTTP/SSE
▼
┌─────────────────────────────────┐
│ Chabo Orchestrator │
│ ┌─────────────────────────┐ │
│ │ LangGraph Workflow │ │
│ │ ┌─────────────────┐ │ │
│ │ │ Detect File │ │ │
│ │ │ Type │ │ │
│ │ └────────┬────────┘ │ │
│ │ │ │ │
│ │ ┌────────▼────────┐ │ │
│ │ │ Ingest File │ │ │
│ │ └────────┬────────┘ │ │
│ │ │ │ │
│ │ ┌─────┴──────┐ │ │
│ │ │ │ │ │
│ │ ┌──▼───┐ ┌────▼───┐ │ │
│ │ │Direct│ │Retrieve│ │ │
│ │ │Output│ │Context │ │ │
│ │ └──┬───┘ └────┬───┘ │ │
│ │ │ │ │ │
│ │ │ ┌────▼───┐ │ │
│ │ │ │Generate│ │ │
│ │ │ │Response│ │ │
│ │ │ └────────┘ │ │
│ └──────┴──────────────────┘ │
└──────┬───────────┬──────────┬───┘
│ │ │
┌───▼──┐ ┌───▼───┐ ┌──▼────┐
│Ingest│ │Retrie-│ │Genera-│
│or │ │ver │ │tor │
└──────┘ └───────┘ └───────┘
Component Communication
All communication between modules happens over HTTP:
- Orchestrator ↔ Ingestor: Gradio Client (file upload, processing)
- Orchestrator ↔ Retriever: Gradio Client (semantic search)
- Orchestrator ↔ Generator: HTTP streaming (SSE for real-time responses)
- ChatUI ↔ Orchestrator: LangServe streaming endpoints
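As a rough illustration of these calls, the sketch below shows how the orchestrator might talk to each module. The api_name value, the generator endpoint path, and the payload fields are assumptions for illustration, not the modules' documented APIs.
# Illustrative sketch only - endpoint names and payload fields are assumptions.
import os
import httpx
from gradio_client import Client

HF_TOKEN = os.getenv("HF_TOKEN")

# Ingestor and Retriever are reached through the Gradio Client.
retriever = Client("https://giz-chatfed-retriever0-3.hf.space/", hf_token=HF_TOKEN)
context = retriever.predict("What are EUDR requirements?", api_name="/predict")

# Generator is consumed as an HTTP stream (SSE) for real-time responses.
generator_url = "https://giz-eudr-chabo-generator.hf.space/generate"  # assumed path
payload = {"query": "What are EUDR requirements?", "context": context}
with httpx.stream("POST", generator_url, json=payload, timeout=None) as response:
    for line in response.iter_lines():
        if line.startswith("data:"):
            print(line[len("data:"):].strip(), end="", flush=True)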
Workflow Logic and File Processing
The orchestrator implements a dual-mode workflow designed to handle non-standard ingestion operations (e.g., Whisp GeoJSON API calls, whose results need to be returned directly without going through the generator) while maintaining conversational context across multiple turns. This also addresses a ChatUI quirk: uploaded files are resent on each turn of the conversation (e.g. follow-up queries).
Processing Modes Overview
Mode 1: Direct Output (DIRECT_OUTPUT = True)
Purpose: Immediately return long-running ingestor results to the user without Generator (LLM) processing. Results are maintained in message history context for follow-up questions.
File Upload:
File Upload → Detect Type → Direct Output Ingest → Return Raw Results
Key Behaviors:
- File uploads return raw ingestor output immediately (no LLM generation)
- Each file upload is processed through the ingestor
- Suitable for immediate analysis results (e.g., Whisp API responses)
Example Conversation Flow:
User: [Uploads plot_boundaries.geojson]
System: [Returns API analysis results directly - no LLM processing]
User: "What deforestation risks were identified?"
System: [Conversation history + Retrieval → Generator - processes as standard query]
User: "How does this compare to EUDR requirements?"
System: [Conversation history + Retrieval → Generator - processes as standard query]
Mode 2: Standard RAG (DIRECT_OUTPUT = False)
Purpose: Traditional RAG pipeline where uploaded files are treated as additional context for query-based generation from the first turn onward.
Every Query (with or without file):
Query + Optional File → Detect Type → Ingest → Retrieved Context → Combined Context → Generator
                                         ↓
                                  Add to Context
Key Behaviors:
- Files are processed through ingestor when uploaded
- Ingestor output is added to the retrieval context (not returned directly)
- Generator always processes the combined context (ingestor + retriever)
Example Conversation Flow:
User: "What are EUDR requirements?" + policy_document.pdf
System: [PDF → Retrieval → Combined Context → Generator]
User: "Summarize section 3"
System: [Retrieval → Combined Context → Generator]
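A simplified sketch of how the two modes translate into a routing decision (the real logic lives in route_workflow() in nodes.py; the flag handling and field names here are illustrative):
# Illustrative routing sketch; the actual decision is made in route_workflow() in nodes.py.
from typing import Optional

DIRECT_OUTPUT = True  # from params.cfg [file_processing]

def choose_route(file_name: Optional[str]) -> str:
    """Pick the next node for this turn."""
    if file_name and DIRECT_OUTPUT:
        # Mode 1: return raw ingestor output, skipping the generator.
        return "direct_output"
    # Mode 2, or a follow-up turn without a new file: standard RAG.
    return "retrieve_context"

print(choose_route("plot_boundaries.geojson"))  # -> direct_output
print(choose_route(None))                       # -> retrieve_context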
Conversation Context Management
The system maintains conversation history separately from file processing with a simple management approach:
Context Building (build_conversation_context()):
- Always includes first user/assistant exchange
- Includes last N complete turns (default: 3)
- Respects character limits (default: 12,000 chars)
Retrieval Strategy:
- Uses only the latest user query for semantic search
- Does NOT send entire conversation history to retriever
- Ensures relevant document retrieval based on current question
Generation Context:
- Combines: Conversation history + Retrieved context + File ingestor results (if present)
- Generator uses full context to produce coherent, contextually-aware responses
Components
1. Main Application (main.py)
- LangServe endpoints for ChatUI integration
- Gradio web interface for testing
- FastAPI endpoints for diagnostics and future use (e.g. /health)
Key Functions:
- chatui_adapter(): Handles text-only queries
- chatui_file_adapter(): Handles file uploads with queries
- create_gradio_interface(): Test UI
2. Workflow Nodes (nodes.py)
LangGraph nodes that implement the processing pipeline:
Node Functions:
- detect_file_type_node(): Identifies file type and determines routing
- ingest_node(): Processes files through the appropriate ingestor
- direct_output_node(): Returns raw ingestor results
- retrieve_node(): Fetches relevant context from the vector store
- generate_node_streaming(): Streams LLM responses
- route_workflow(): Conditional routing logic
Helper Functions:
- process_query_streaming(): Unified streaming interface
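The sketch below shows one way the nodes above could be wired into a LangGraph workflow. The state schema and edge layout are assumptions based on the architecture diagram; the real graph construction lives in nodes.py / main.py.
# Assumed wiring of the workflow nodes; the real graph is built in nodes.py / main.py.
from typing import Optional, TypedDict
from langgraph.graph import StateGraph, END

class ChatState(TypedDict, total=False):
    query: str
    file_name: Optional[str]
    ingestor_output: str
    retrieved_context: str
    answer: str

# Placeholders standing in for the real node functions documented above.
def detect_file_type_node(state: ChatState) -> ChatState: ...
def ingest_node(state: ChatState) -> ChatState: ...
def direct_output_node(state: ChatState) -> ChatState: ...
def retrieve_node(state: ChatState) -> ChatState: ...
def generate_node_streaming(state: ChatState) -> ChatState: ...
def route_workflow(state: ChatState) -> str: ...

graph = StateGraph(ChatState)
graph.add_node("detect_file_type", detect_file_type_node)
graph.add_node("ingest", ingest_node)
graph.add_node("direct_output", direct_output_node)
graph.add_node("retrieve", retrieve_node)
graph.add_node("generate", generate_node_streaming)

graph.set_entry_point("detect_file_type")
graph.add_edge("detect_file_type", "ingest")
graph.add_conditional_edges("ingest", route_workflow,
                            {"direct_output": "direct_output", "retrieve": "retrieve"})
graph.add_edge("retrieve", "generate")
graph.add_edge("direct_output", END)
graph.add_edge("generate", END)

workflow = graph.compile()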
3. Data Models (models.py)
Pydantic models for type validation
4. Retriever Adapter (retriever_adapter.py)
Abstraction layer for managing different retriever configurations:
- Handles authentication
- Formats queries and filters
5. Utilities (utils.py)
Helper functions
Conversation Context Management
The build_conversation_context() function manages conversation history to provide relevant context to the generator while respecting token limits and conversation flow.
Key Features:
- Context Selection: Always includes the first user and assistant messages to maintain conversation context
- Recent Turn Limiting: Includes only the last N complete turns (user + assistant pairs) to focus on recent conversation (default: 3)
- Character Limit Management: Truncates to maximum character limits to prevent context overflow
Function Parameters:
def build_conversation_context(
messages, # List of Message objects from conversation
max_turns: int = 3, # Maximum number of recent turns to include
max_chars: int = 8000 # Maximum total characters in context
) -> str
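A minimal sketch of the behaviour described above, assuming Message objects expose role and content attributes (the real implementation may handle partial turns and truncation differently):
# Minimal sketch of the documented behaviour; the real implementation may differ in detail.
def build_conversation_context(messages, max_turns: int = 3, max_chars: int = 8000) -> str:
    lines = []
    # Always keep the first user/assistant exchange to anchor the conversation.
    for msg in messages[:2]:
        lines.append(f"{msg.role}: {msg.content}")
    # Keep only the last N complete turns (a turn = user + assistant pair).
    for msg in messages[2:][-(max_turns * 2):]:
        lines.append(f"{msg.role}: {msg.content}")
    context = "\n".join(lines)
    # Enforce the character budget, keeping the most recent text.
    if len(context) > max_chars:
        context = context[-max_chars:]
    return context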
Configuration
Configuration File (params.cfg)
[file_processing]
# Enable direct output mode: when True, ingestor results are returned directly
# without going through the generator. When False, all files go through full RAG pipeline.
# This also prevents ChatUI from resending the file in the conversation history with each turn
# Note: File type validation is handled by the ChatUI frontend
DIRECT_OUTPUT = True
[conversation_history]
# Limit the context window for the conversation history
MAX_TURNS = 3
MAX_CHARS = 12000
[retriever]
RETRIEVER = https://giz-chatfed-retriever0-3.hf.space/
# Optional
COLLECTION_NAME = EUDR
[generator]
GENERATOR = https://giz-eudr-chabo-generator.hf.space
[ingestor]
INGESTOR = https://giz-eudr-chabo-ingestor.hf.space
[general]
# Needed to stay within HF inference endpoint limits
MAX_CONTEXT_CHARS = 15000
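These values are read at startup; a minimal sketch of loading them with Python's configparser, using the section and key names shown above:
# Minimal sketch of loading params.cfg with the standard library.
import configparser

config = configparser.ConfigParser()
config.read("params.cfg")

DIRECT_OUTPUT = config.getboolean("file_processing", "DIRECT_OUTPUT")
MAX_TURNS = config.getint("conversation_history", "MAX_TURNS")
MAX_CHARS = config.getint("conversation_history", "MAX_CHARS")
RETRIEVER = config.get("retriever", "RETRIEVER")
COLLECTION_NAME = config.get("retriever", "COLLECTION_NAME", fallback=None)  # optional
GENERATOR = config.get("generator", "GENERATOR")
INGESTOR = config.get("ingestor", "INGESTOR")
MAX_CONTEXT_CHARS = config.getint("general", "MAX_CONTEXT_CHARS")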
Environment Variables
Create a .env file with:
# Required for accessing private HuggingFace Spaces modules
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxx
ChatUI Configuration
ChatUI DOTENV_LOCAL example deployment configuration:
MODELS=`[
{
"name": "asistente_eudr",
"displayName": "Asistente EUDR",
"description": "Retrieval-augmented generation on EUDR Whisp API powered by ChatFed modules.",
"instructions": {
"title": "EUDR Asistente: Instructiones",
"content": "Hola, soy Asistente EUDR, un asistente conversacional basado en inteligencia artificial diseñado para ayudarle a comprender el cumplimiento y el análisis del Reglamento de la UE sobre la deforestación. Responderé a sus preguntas utilizando los informes EUDR y los archivos GeoJSON cargados.\n\n💡 **Cómo utilizarlo (panel a la derecha)**\n\n**Modo de uso:** elija entre subir un archivo GeoJSON para su análisis o consultar los informes EUDR filtrados por país.\n\n**Ejemplos:** seleccione entre preguntas de ejemplo seleccionadas de diferentes categorías.\n\n**Referencias:** consulte las fuentes de contenido utilizadas para la verificación de datos.\n\n⚠️ Para conocer las limitaciones y la información sobre la recopilación de datos, consulte la pestaña «Exención de responsibilidad».\n\n⚠️ Al utilizar esta aplicación, usted acepta que recopilemos estadísticas de uso (como preguntas formuladas, comentarios realizados, duración de la sesión, tipo de dispositivo e información geográfica anónima) para comprender el rendimiento y mejorar continuamente la herramienta, basándonos en nuestro interés legítimo por mejorar nuestros servicios."
},
"multimodal": true,
"multimodalAcceptedMimetypes": [
"application/geojson"
],
"chatPromptTemplate": "{{#each messages}}{{#ifUser}}{{content}}{{/ifUser}}{{#ifAssistant}}{{content}}{{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.0,
"max_new_tokens": 2048
},
"endpoints": [{
"type": "langserve-streaming",
"url": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-ui-stream",
"streamingFileUploadUrl": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-with-file-stream",
"inputKey": "text",
"fileInputKey": "files"
}]
}
]`
PUBLIC_ANNOUNCEMENT_BANNERS=`[
{
"title": "This is Chat Prototype for DSC users",
"linkTitle": "Keep it Clean"
}
]`
PUBLIC_APP_DISCLAIMER_MESSAGE="Disclaimer: AI is an area of active research with known problems such as biased generation and misinformation. Do not use this application for high-stakes decisions or advice. Do not insert your personal data, especially sensitive data such as health data."
PUBLIC_APP_DESCRIPTION="Internal Chat-tool for DSC users for testing"
PUBLIC_APP_NAME="EUDR ChatUI"
ENABLE_ASSISTANTS=false
ENABLE_ASSISTANTS_RAG=false
COMMUNITY_TOOLS=false
MONGODB_URL=mongodb://localhost:27017
# Disable LLM-based title generation to prevent template queries
LLM_SUMMARIZATION=false
Key things to ensure here:
- multimodalAcceptedMimetypes: file types to accept for upload via ChatUI
- endpoints: the orchestrator URL and endpoint paths (note these are the HF API URLs, not the HF UI URLs)
ChatUI Source Citations
ChatUI expects the final SSE message from the Orchestrator to include sources under a Sources: heading. Each source is a markdown link whose target uses either a doc or http(s) scheme, i.e. [Title](doc:...) or [Title](https://...). For example:
**Sources:**
1. [decision - 72 - Decision Decision 72/18](https://multilateralfund.org/node/3600)
2. [decision - 67 - Decision Decision 67/6](https://multilateralfund.org/node/3394)
Both schemes are parsed into ChatUI's webSources and presented below the response body, with https links rendered as hyperlinks. Duplicate URLs are ignored; the first occurrence is retained.
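As a rough sketch, the footer could be assembled like this before being emitted in the final SSE chunk (format_sources is a hypothetical helper, not the orchestrator's actual code):
# Hypothetical helper for building the Sources footer appended to the final SSE chunk.
def format_sources(sources: list[dict]) -> str:
    """sources: [{"title": ..., "url": ...}]; duplicate URLs keep only the first occurrence."""
    seen, lines = set(), []
    for src in sources:
        if src["url"] in seen:
            continue
        seen.add(src["url"])
        lines.append(f"{len(lines) + 1}. [{src['title']}]({src['url']})")
    return "**Sources:**\n" + "\n".join(lines)

print(format_sources([
    {"title": "decision - 72 - Decision Decision 72/18", "url": "https://multilateralfund.org/node/3600"},
    {"title": "decision - 67 - Decision Decision 67/6", "url": "https://multilateralfund.org/node/3394"},
]))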
Deployment Guide
Local Development
Prerequisites:
- Python 3.10+
- pip
Steps:
- Clone the repository:
git clone <your-repo-url>
cd chabo-orchestrator
- Install dependencies:
pip install -r requirements.txt
- Configure the system:
# Create .env file
echo "HF_TOKEN=your_token_here" > .env
# Edit params.cfg with your service URLs
nano params.cfg
- Run the application:
python app/main.py
- Access interfaces:
- Gradio UI: http://localhost:7860/gradio
- API Docs: http://localhost:7860/docs
- Health Check: http://localhost:7860/health
Docker Deployment
Build the image:
docker build -t chabo-orchestrator .
Run the container:
docker run -d \
--name chabo-orchestrator \
-p 7860:7860 \
chabo-orchestrator
HuggingFace Spaces Deployment
Repository Structure:
your-space/
├── app/
│ ├── main.py
│ ├── nodes.py
│ ├── models.py
│ ├── retriever_adapter.py
│ └── utils.py
├── Dockerfile
├── requirements.txt
├── params.cfg
└── README.md
Steps:
- Create a new Space on HuggingFace
- Select "Docker" as the SDK
- Push your code to the Space repository
- Add secrets in Space settings:
HF_TOKEN: Your HuggingFace token
- The Space will automatically build and deploy
Important: Ensure all service URLs in params.cfg are publicly accessible.
Docker Compose (Multi-Service)
Example orchestrated deployment for the entire Chabo stack (NOTE - docker-compose will not run on HuggingFace Spaces):
version: '3.8'

services:
  orchestrator:
    build: ./orchestrator
    ports:
      - "7860:7860"
    environment:
      - HF_TOKEN=${HF_TOKEN}
      - RETRIEVER=http://retriever:7861
      - GENERATOR=http://generator:7862
      - INGESTOR=http://ingestor:7863
    depends_on:
      - retriever
      - generator
      - ingestor

  retriever:
    build: ./retriever
    ports:
      - "7861:7861"
    environment:
      - QDRANT_API_KEY=${QDRANT_API_KEY}

  generator:
    build: ./generator
    ports:
      - "7862:7862"
    environment:
      - HF_TOKEN=${HF_TOKEN}

  ingestor:
    build: ./ingestor
    ports:
      - "7863:7863"
API Reference
Endpoints
Health Check
GET /health
Returns service health status.
Response:
{
  "status": "healthy"
}
Root Information
GET /
Returns API metadata and available endpoints.
Text Query (Streaming)
POST /chatfed-ui-stream/stream
Content-Type: application/json
Request Body:
{
  "input": {
    "text": "What are EUDR requirements?"
  }
}
Response: Server-Sent Events stream
event: data
data: "The EUDR requires..."
event: sources
data: {"sources": [...]}
event: end
data: ""
File Upload Query (Streaming)
POST /chatfed-with-file-stream/stream
Content-Type: application/json
Request Body:
{
  "input": {
    "text": "Analyze this GeoJSON",
    "files": [
      {
        "name": "boundaries.geojson",
        "type": "base64",
        "content": "base64_encoded_content"
      }
    ]
  }
}
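A minimal sketch of building this request from a local file, using the field names shown above:
# Minimal sketch: base64-encode a local GeoJSON file and post it to the file-upload endpoint.
import base64
import requests

with open("boundaries.geojson", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "input": {
        "text": "Analyze this GeoJSON",
        "files": [{"name": "boundaries.geojson", "type": "base64", "content": encoded}],
    }
}
url = "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-with-file-stream/stream"
with requests.post(url, json=payload, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            print(line)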
Gradio Interface
Interactive Query
Gradio's default API endpoint for UI interactions. If running on HuggingFace Spaces, access it via: https://[ORG_NAME]-[SPACE_NAME].hf.space/gradio/
NOTE - for HF deployment the Gradio test UI must be accessed this way, because it is not possible to expose multiple ports on HF Spaces. The other modules expose Gradio directly, but the Orchestrator needs to run FastAPI to support the LangServe endpoints.
Troubleshooting
Common Issues
1. File Upload Fails
Symptoms: "Error reading file" or "Failed to decode uploaded file"
Solutions:
- Verify file is properly base64 encoded
- Check file size limits (default: varies by deployment)
- Ensure the MIME type is listed in multimodalAcceptedMimetypes
2. Slow Responses
Symptoms: Long wait times for responses
Solutions:
- Check network latency to external services
- Verify MAX_CONTEXT_CHARS isn't too high
- Consider enabling DIRECT_OUTPUT for suitable file types
- Check logs for retrieval/generation bottlenecks
3. Service Connection Errors
Symptoms: "Connection refused" or timeout errors
Solutions:
- Verify all service URLs in params.cfg are accessible
- Check HF_TOKEN is valid and has access to private Spaces (NOTE - the Orchestrator itself currently must be public)
- Test each service independently with health checks (see the sketch below)
- Review firewall/network policies
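For the independent health checks, a quick sketch that probes each dependent service (URLs taken from params.cfg above; adjust paths if a module exposes a dedicated /health endpoint):
# Quick sketch for checking whether each dependent service is reachable.
import requests

services = {
    "retriever": "https://giz-chatfed-retriever0-3.hf.space/",
    "generator": "https://giz-eudr-chabo-generator.hf.space",
    "ingestor": "https://giz-eudr-chabo-ingestor.hf.space",
}
for name, url in services.items():
    try:
        resp = requests.get(url, timeout=10)
        print(f"{name}: HTTP {resp.status_code}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")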
Version History
- v1.0.0: Initial release with LangGraph orchestration
- Current implementation supports streaming and dual-mode processing
Documentation Last Updated: 2025-10-01
Compatible With: Python 3.10+, LangGraph 0.2+, FastAPI 0.100+