# MCP Progressive Disclosure: Implementation Guide

**Extension Name:** Model Context Protocol (MCP) - Progressive Disclosure for Tool Descriptions
**Companion Specification:** `spec_mcp_progressive_disclosure_v2_0.md`
**Version:** 2.1
**Last Updated:** 2025-11-30

---

## Overview

This guide provides practical advice for implementing the **MCP Progressive Disclosure extension** for token-efficient tool description delivery. It covers common pitfalls, proven strategies, and real-world learnings from production deployments.

**If you're new here:**

1. Read the [MCP Progressive Disclosure specification](spec_mcp_progressive_disclosure_v2_0.md) first for protocol requirements
2. Come back here for implementation details and troubleshooting
3. Use the code examples as starting points

**What is MCP Progressive Disclosure?**

An extension to the Model Context Protocol (MCP) that lets servers expose minimal tool descriptions initially, then provide full documentation on demand through a standardized resource pattern.

---

## Quick Start

### Server-Side (5 Steps)

1. **Create tool descriptions directory**

   ```bash
   mkdir tool_descriptions/
   ```

2. **Extract tool descriptions to JSON files**

   ```json
   // tool_descriptions/my_tool.json
   {
     "name": "my_tool",
     "description": "Full detailed description...",
     "inputSchema": { /* complete schema */ },
     "examples": [ /* usage examples */ ]
   }
   ```

3. **Implement resource listing**
   - Expose `tool_descriptions` resource via `resources/list`
   - Include clear workflow guidance in description

4. **Implement resource reading**
   - Parse `?tools=` query parameter
   - Load requested tool descriptions
   - Authorize tools for session
   - Return JSON with descriptions

5. **Enforce authorization**
   - Check authorization before tool execution
   - Return clear error if not authorized

### Agent-Side (2 Steps)

1. **Enhance system prompt**
   - Detect `tool_descriptions` resource
   - Explain two-stage workflow
   - Provide example URI syntax

2. **Test the workflow**
   - Verify LLM picks tools from `tools/list`
   - Verify LLM fetches specific tools
   - Verify LLM calls tools successfully

---

## The Core Challenge: Tool Selection vs Tool Usage

### The Problem

The most common implementation issue is LLMs misunderstanding **when** to fetch tool descriptions. They consistently try one of two wrong approaches:

**Anti-Pattern 1: Fetch Everything First**

```
User: "Get some data"

❌ LLM: read_resource(resource_uri="resource:///tool_descriptions")
   Error: Must specify tools parameter

✅ LLM: read_resource(resource_uri="resource:///tool_descriptions?tools=get_data")
   Success, then calls tool
```

**Anti-Pattern 2: Skip Fetching Entirely**

```
User: "Get some data"

❌ LLM: get_data(query="data")
   Error: Tool description required

✅ LLM: read_resource(resource_uri="resource:///tool_descriptions?tools=get_data")
✅ LLM: get_data(query="data")
```

### Why This Happens

LLMs interpret "fetch descriptions before calling tools" as "fetch descriptions to help me **decide** which tool to use" rather than "fetch descriptions to learn **how to use** the tool I've already chosen."

**The Cognitive Model:**

- **Stage 1 (tools/list)**: WHAT does this tool do? → **Decision Point**
- **Stage 2 (tool_descriptions)**: HOW do I use this tool? → **Implementation Details**

LLMs need explicit guidance that Stage 1 descriptions are **sufficient for selection**.

---

## Server Implementation

### 1. Storage Structure

**Recommended:**

```
project/
├── tool_descriptions/
│   ├── tool_one.json
│   ├── tool_two.json
│   └── tool_three.json
├── tool_description_loader.py
├── session_auth.py
└── server.py
```

**tool_descriptions/tool_one.json:**

```json
{
  "name": "tool_one",
  "description": "Complete description with all context needed for reliable use",
  "inputSchema": {
    "type": "object",
    "properties": {
      "param1": {
        "type": "string",
        "description": "First parameter"
      },
      "param2": {
        "type": "integer",
        "description": "Second parameter",
        "default": 10
      }
    },
    "required": ["param1"]
  },
  "examples": [
    {
      "description": "Basic usage",
      "input": {"param1": "value"},
      "explanation": "Simplest form with just required parameter"
    },
    {
      "description": "With optional parameter",
      "input": {"param1": "value", "param2": 20},
      "explanation": "Override default for param2"
    }
  ],
  "usage_guidance": {
    "common_patterns": [
      "For X scenario, use param1='special_value'",
      "When Y, set param2 higher than default"
    ],
    "important_notes": [
      "Parameter validation happens server-side",
      "Results are paginated by default"
    ]
  },
  "error_guidance": {
    "common_errors": [
      {
        "error": "INVALID_PARAM1",
        "cause": "param1 must match pattern X",
        "solution": "Ensure param1 follows format Y"
      }
    ]
  }
}
```

### 2. Tool Description Loader

**tool_description_loader.py:**

```python
from pathlib import Path
import json
from typing import Dict, Optional, List


class ToolDescriptionLoader:
    """Loads and caches tool descriptions from JSON files"""

    def __init__(self, descriptions_dir: Path):
        self.descriptions_dir = descriptions_dir
        self._cache: Dict[str, dict] = {}

    def load(self, tool_name: str) -> Optional[dict]:
        """Load a single tool description"""
        if tool_name in self._cache:
            return self._cache[tool_name]

        # Reject names with path separators so a request cannot escape the directory
        if Path(tool_name).name != tool_name:
            return None

        desc_file = self.descriptions_dir / f"{tool_name}.json"
        if not desc_file.exists():
            return None

        with open(desc_file, 'r', encoding='utf-8') as f:
            description = json.load(f)

        self._cache[tool_name] = description
        return description

    def load_multiple(self, tool_names: List[str]) -> Dict[str, dict]:
        """Load multiple tool descriptions"""
        descriptions = {}
        for tool_name in tool_names:
            desc = self.load(tool_name)
            if desc is not None:
                descriptions[tool_name] = desc
            else:
                # Unknown tools get an error entry that lists the valid names
                descriptions[tool_name] = {
                    "error": f"Tool '{tool_name}' not found",
                    "available_tools": self.list_available()
                }
        return descriptions

    def list_available(self) -> List[str]:
        """Get list of available tool descriptions"""
        if not self.descriptions_dir.exists():
            return []
        return [f.stem for f in self.descriptions_dir.glob("*.json")]
```

### 3. Session Authorization

**session_auth.py:**

```python
from typing import Dict, Set
import time
import logging

logger = logging.getLogger(__name__)


class SessionAuthorization:
    """Manages per-session tool authorization state"""

    def __init__(self):
        self._sessions: Dict[int, Dict] = {}
        # session_id -> {
        #     'authorized_tools': Set[str],
        #     'created_at': float,
        #     'last_activity': float
        # }

    def get_session_id(self, session) -> int:
        """Get unique session identifier from MCP session object"""
        # id() is unique only while the session object is alive; Python may
        # reuse the value after the object is garbage-collected.
        return id(session)

    def authorize_tool(self, session, tool_name: str):
        """Mark a tool as authorized for this session"""
        session_id = self.get_session_id(session)

        if session_id not in self._sessions:
            self._sessions[session_id] = {
                'authorized_tools': set(),
                'created_at': time.time(),
                'last_activity': time.time()
            }

        self._sessions[session_id]['authorized_tools'].add(tool_name)
        self._sessions[session_id]['last_activity'] = time.time()

        logger.info(f"Session {session_id}: Authorized tool '{tool_name}'")

    def is_authorized(self, session, tool_name: str) -> bool:
        """Check if tool has been authorized in this session"""
        session_id = self.get_session_id(session)

        if session_id not in self._sessions:
            return False

        self._sessions[session_id]['last_activity'] = time.time()
        return tool_name in self._sessions[session_id]['authorized_tools']

    def cleanup_stale_sessions(self, max_age_seconds: int = 3600):
        """Remove inactive sessions"""
        now = time.time()
        stale = [
            sid for sid, data in self._sessions.items()
            if now - data['last_activity'] > max_age_seconds
        ]

        for session_id in stale:
            del self._sessions[session_id]

        if stale:
            logger.info(f"Cleaned up {len(stale)} stale session(s)")
```

### 4. Server Resource Handlers

**server.py:**

```python
from mcp.server import Server
from mcp.types import Resource, Tool, TextContent
from urllib.parse import urlparse, parse_qs
from pathlib import Path
import json

# Local modules from the storage structure above
from session_auth import SessionAuthorization
from tool_description_loader import ToolDescriptionLoader

app = Server("my-server")

# Initialize modules
session_auth = SessionAuthorization()
tool_loader = ToolDescriptionLoader(Path(__file__).parent / "tool_descriptions")


@app.list_resources()
async def list_resources() -> list[Resource]:
    """List available resources including tool_descriptions"""
    return [
        Resource(
            uri="resource:///tool_descriptions",
            name="Tool Descriptions - Required for tool use",
            description=(
                "WORKFLOW:\n"
                "\n"
                "Step 1: PICK which tool you need from tools/list (descriptions show WHAT each tool does)\n"
                "Step 2: FETCH that tool's full description from this resource (learn HOW to use it)\n"
                "        Example: resource:///tool_descriptions?tools=TOOL_NAME\n"
                "Step 3: CALL the tool with parameters you learned\n"
                "\n"
                "IMPORTANT: You CANNOT call a tool until you fetch its description.\n"
                "\n"
                "The short descriptions in tools/list are SUFFICIENT for choosing the right tool.\n"
                "This resource provides parameters, examples, and authorizes the tool for use.\n"
                "\n"
                "MUST include ?tools=TOOL_NAME (base URI without parameter will error)."
            ),
            mimeType="application/json"
        )
    ]


@app.read_resource()
async def read_resource(uri: str) -> str:
    """Read tool descriptions resource"""
    uri = str(uri)
    parsed = urlparse(uri)

    # Parse query parameters
    query_params = parse_qs(parsed.query)
    tools_param = query_params.get('tools', [])

    # Require tools parameter
    if not tools_param:
        error = {
            "error": {
                "code": "MISSING_TOOL_SELECTION",
                "message": "You must specify one or more tool names in the 'tools' parameter.",
                "examples": [
                    "resource:///tool_descriptions?tools=tool_one",
                    "resource:///tool_descriptions?tools=tool_one,tool_two"
                ],
                "available_tools": tool_loader.list_available()
            }
        }
        return json.dumps(error, indent=2)

    # Parse comma-separated tool names, ignoring empty entries
    requested_tools = [t.strip() for t in tools_param[0].split(',') if t.strip()]

    # Get session for authorization
    try:
        session = app.request_context.session
    except LookupError:
        session = None

    # Load descriptions
    descriptions = tool_loader.load_multiple(requested_tools)

    # Authorize tools for this session (only those that loaded successfully)
    if session:
        for tool_name in descriptions.keys():
            if "error" not in descriptions[tool_name]:
                session_auth.authorize_tool(session, tool_name)

    return json.dumps(descriptions, indent=2)


@app.list_tools()
async def list_tools() -> list[Tool]:
    """List tools with minimal descriptions"""
    return [
        Tool(
            name="tool_one",
            description="Brief description of what this tool does - sufficient for selection",
            inputSchema={
                "type": "object",
                "additionalProperties": True
            }
        ),
        Tool(
            name="tool_two",
            description="Brief description of what this tool does - sufficient for selection",
            inputSchema={
                "type": "object",
                "additionalProperties": True
            }
        )
    ]


@app.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle tool calls with authorization check"""
    # Get session
    try:
        session = app.request_context.session
    except LookupError:
        error = {"error": "Tool call outside session context"}
        return [TextContent(type="text", text=json.dumps(error, indent=2))]

    # Check authorization
    if not session_auth.is_authorized(session, name):
        error = {
            "error": {
                "code": "TOOL_DESCRIPTION_REQUIRED",
                "message": f"Tool '{name}' requires fetching its description before use.",
                "instructions": [
                    f"1. Fetch: read_resource(resource_uri=\"resource:///tool_descriptions?tools={name}\")",
                    "2. Review the parameters and examples",
                    "3. Then call the tool"
                ],
                "resource_uri": f"resource:///tool_descriptions?tools={name}"
            }
        }
        return [TextContent(type="text", text=json.dumps(error, indent=2))]

    # Tool is authorized - execute.
    # handle_tool_one / handle_tool_two are your application-specific handlers,
    # defined elsewhere in the server.
    if name == "tool_one":
        result = handle_tool_one(arguments)
    elif name == "tool_two":
        result = handle_tool_two(arguments)
    else:
        result = {"error": f"Unknown tool: {name}"}

    return [TextContent(type="text", text=json.dumps(result, indent=2))]
```

---

## Agent Implementation

### System Prompt Enhancement

**Key Strategy:** Auto-detect progressive disclosure servers and provide explicit workflow guidance.

**conversation.py or similar:**

```python
def build_system_prompt(self, tools, resources):
    """Build system prompt with progressive disclosure detection"""
    # Check if any resource is tool_descriptions
    has_progressive_disclosure = any(
        'tool_descriptions' in r.get('uri', '')
        for r in resources
    )

    # Build tool list with descriptions
    tool_list = "\n".join([
        f"  - {t['name']}: {t.get('description', 'No description')}"
        for t in tools
    ])

    prompt = f"""You are a helpful assistant with access to these tools:

{tool_list}

Use tools when needed to answer questions."""

    if has_progressive_disclosure:
        prompt += """

IMPORTANT - Tool Usage Workflow:
This server uses progressive disclosure for tools. Follow this exact workflow:

1. PICK the right tool based on the descriptions above (they tell you WHAT each tool does)
2. FETCH the full tool description using read_resource with the specific tool name
   Example: read_resource(resource_uri="resource:///tool_descriptions?tools=TOOL_NAME")
3. CALL the tool using the parameters you just learned

DO NOT try to fetch tool_descriptions without specifying which tool you want (?tools=TOOL_NAME).
The tool descriptions above are sufficient for choosing which tool you need.
You fetch the full description to learn the parameters and authorize the tool."""

    return prompt
```

---

## Resource Description Wording

### Proven Effective Pattern

Based on production testing, this structure achieves the highest LLM compliance:

```
WORKFLOW:

Step 1: PICK which tool you need from tools/list based on SHORT descriptions
Step 2: FETCH full description: resource:///tool_descriptions?tools=TOOL_NAME
Step 3: CALL the tool with parameters you learned

IMPORTANT: You CANNOT call a tool until you fetch its description.

The SHORT descriptions tell you WHICH tool to use (sufficient for selection).
This resource tells you HOW to use it (parameters, examples) and authorizes it.

MUST include ?tools=TOOL_NAME (base URI without tools will error).
```

### What Makes This Work

1. **Sequential Steps**: Clear 1-2-3 progression
2. **Separation of Concerns**: WHICH vs HOW distinction
3. **Mandatory Language**: "CANNOT" not "should not"
4. **Concrete Example**: Shows exact URI format
5. **Prohibition**: States what will fail
6. **Rationale**: Explains why the pattern exists (selection vs parameters)

### What DOESN'T Work

❌ **Too brief**: "Fetch descriptions before calling tools"
- Problem: Ambiguous about when to fetch

❌ **Too verbose**: Multiple paragraphs of explanation
- Problem: LLMs skip/skim long descriptions

❌ **No examples**: Abstract description only
- Problem: LLMs don't know the exact syntax

❌ **Missing prohibition**: Doesn't say what fails
- Problem: LLMs try the base URI without the tools parameter

---

## Testing Strategy

### 1. Unit Tests

Test each component independently:

```python
from pathlib import Path

from session_auth import SessionAuthorization
from tool_description_loader import ToolDescriptionLoader


def test_tool_loader():
    loader = ToolDescriptionLoader(Path("tool_descriptions"))

    # Test single load
    desc = loader.load("tool_one")
    assert desc['name'] == "tool_one"
    assert 'inputSchema' in desc

    # Test multiple load
    descs = loader.load_multiple(["tool_one", "tool_two"])
    assert len(descs) == 2

    # Test missing tool
    descs = loader.load_multiple(["nonexistent"])
    assert "error" in descs["nonexistent"]


def test_session_auth():
    auth = SessionAuthorization()

    # Mock session
    class MockSession:
        pass

    session = MockSession()

    # Test authorization flow
    assert not auth.is_authorized(session, "tool_one")
    auth.authorize_tool(session, "tool_one")
    assert auth.is_authorized(session, "tool_one")

    # Test session isolation
    session2 = MockSession()
    assert not auth.is_authorized(session2, "tool_one")
```

### 2. Integration Tests

Test the full workflow:

```python
import pytest


async def test_progressive_disclosure_workflow():
    # Connect to server (connect_mcp_server is a placeholder for your own
    # test client, assumed here to return parsed results and to raise on errors)
    server = await connect_mcp_server()

    # 1. List resources - should see tool_descriptions
    resources = await server.list_resources()
    assert any('tool_descriptions' in r['uri'] for r in resources)

    # 2. List tools - should see minimal descriptions
    tools = await server.list_tools()
    assert len(tools) > 0
    assert all('description' in t for t in tools)

    # 3. Try calling without fetching - should fail
    with pytest.raises(Exception) as exc:
        await server.call_tool("tool_one", {})
    # Inspect the raised exception itself, not the ExceptionInfo wrapper
    assert "TOOL_DESCRIPTION_REQUIRED" in str(exc.value)

    # 4. Fetch description
    desc = await server.read_resource("resource:///tool_descriptions?tools=tool_one")
    assert 'tool_one' in desc
    assert 'inputSchema' in desc['tool_one']

    # 5. Call tool - should succeed
    result = await server.call_tool("tool_one", {"param1": "value"})
    assert result['success']
```

### 3. LLM Behavior Tests

Test actual LLM compliance:

```python
async def test_llm_workflow():
    """Test that LLM follows correct workflow"""
    # TestAgent is a placeholder for your own agent test harness
    agent = TestAgent(server="my-server")

    # Give task that requires tool use
    response = await agent.query("Get some data")

    # Verify LLM workflow
    assert agent.trace.contains_call("read_resource")
    assert "?tools=" in agent.trace.last_resource_uri
    assert agent.trace.contains_call("tool_one")

    # Verify no errors
    assert not agent.trace.contains_error("MISSING_TOOL_SELECTION")
    assert not agent.trace.contains_error("TOOL_DESCRIPTION_REQUIRED")
```

---

## Common Pitfalls

### 1. Base URI Without Tools Parameter

**Problem:** LLM calls `resource:///tool_descriptions` without `?tools=`

**Cause:** Resource description not clear about the WHAT vs HOW distinction

**Solution:**
- Emphasize that tools/list is sufficient for selection
- Show the incorrect example explicitly
- Use system prompt reinforcement

### 2. Calling Tool Before Fetching

**Problem:** LLM tries to call the tool directly

**Cause:** Over-emphasizing "descriptions sufficient for selection"

**Solution:**
- Balance messaging: sufficient for **choosing**, not for **using**
- State clearly: "CANNOT call until fetched"
- Include authorization rationale

### 3. Session ID Issues

**Problem:** Authorization not persisting or crossing sessions

**Cause:** Using an unstable session identifier

**Solution:**
- Use `id(request_context.session)` as session ID
- This is stable for the connection lifetime (Python may reuse an `id()` value once the session object is garbage-collected, so stale entries should not be kept indefinitely)
- No external dependencies required

### 4. Tool Names Don't Match

**Problem:** Fetched tool name doesn't match tools/list name

**Cause:** Typo or case mismatch

**Solution:**
- Use exactly the same names in JSON files as in tools/list
- Include "available_tools" in error responses
- Log mismatches for debugging

### 5. Stale Sessions Accumulate

**Problem:** Memory grows over time

**Cause:** No session cleanup

**Solution:**
- Implement periodic cleanup (every 10 minutes)
- Remove sessions with no activity for 1 hour
- Run as background task

---

## Token Efficiency Analysis

### Baseline (Full Descriptions)

**Per-tool cost:** 3000-5000 tokens

**10 tools:** 30,000-50,000 tokens at startup

**Problem:** Consumes significant context before any actual work

### With Progressive Disclosure

**Minimal descriptions (all tools):** 500-1000 tokens

**Full description (when fetched):** 3000-5000 tokens per tool

**Typical usage (2 tools):** 500 + (2 × 4000) = 8,500 tokens

**Savings:** 75-80% token reduction for typical workflows (8,500 tokens vs. roughly 40,000 at the midpoint of the 10-tool baseline)

### Break-Even Analysis

Progressive disclosure saves tokens when:

```
Number of tools × (full description size - minimal size) > fetch overhead
```

For 5+ tools, progressive disclosure almost always wins.

---

## Migration Guide

### From Traditional to Progressive Disclosure

**Step 1:** Extract existing tool descriptions

```python
# Before: Full description in tools/list
Tool(
    name="my_tool",
    description="Long description...",
    inputSchema={...}  # full schema
)

# After: Minimal in tools/list
Tool(
    name="my_tool",
    description="Brief description of purpose",
    inputSchema={"type": "object", "additionalProperties": True}
)
# Full description moved to tool_descriptions/my_tool.json
```

**Step 2:** Implement resource handlers (see Server Implementation section)

**Step 3:** Add authorization enforcement

**Step 4:** Update agent system prompt (if you control the agent)

**Step 5:** Test with real queries

### Backwards Compatibility

To support both patterns during migration:

```python
@app.list_tools()
async def list_tools() -> list[Tool]:
    # Check if client supports progressive disclosure
    # (presence of read_resource capability or similar).
    # supports_progressive_disclosure, minimal_tools() and full_tools()
    # are placeholders for your own detection logic and tool lists.
    if supports_progressive_disclosure:
        return minimal_tools()
    else:
        return full_tools()
```

---

## Best Practices

### ✅ DO

- Store tool descriptions in
  separate JSON files
- Use clear sequential steps in resource description
- Distinguish WHAT (selection) from HOW (parameters)
- Provide system prompt guidance for agents
- Log authorization events for debugging
- Cache parsed descriptions in memory
- Clean up stale sessions periodically
- Include concrete examples in resource description
- Test with real LLMs, not just unit tests

### ❌ DON'T

- Don't make tool descriptions too minimal (must be sufficient for selection)
- Don't omit the ?tools= requirement from resource description
- Don't use unstable session identifiers
- Don't skip authorization checks
- Don't expose internal paths in descriptions
- Don't rely solely on resource description (use system prompt too)
- Don't optimize prematurely (measure token savings first)

---

## Troubleshooting

### LLM keeps trying base URI without tools parameter

**Diagnosis:** Resource description or system prompt not clear enough

**Fix:**
1. Add explicit prohibition in resource description
2. Show incorrect example with ❌
3. Enhance system prompt with workflow
4. Test wording iteratively

### Authorization failures despite fetching

**Diagnosis:** Session ID mismatch or session cleanup too aggressive

**Fix:**
1. Verify `id(session)` is stable
2. Log session IDs during fetch and call
3. Increase cleanup timeout
4. Check for session resets

### Tool descriptions not loading

**Diagnosis:** File path or JSON format issues

**Fix:**
1. Verify tool_descriptions directory exists
2. Check JSON syntax with `json.loads()`
3. Ensure file names match exactly (case-sensitive)
4. Log file paths being accessed

### Memory growth over time

**Diagnosis:** Sessions not being cleaned up

**Fix:**
1. Implement background cleanup task
2. Lower max_age_seconds threshold
3. Monitor session count in production
4. Consider LRU cache with size limit

---

## Production Checklist

Before deploying progressive disclosure:

- [ ] Tool descriptions extracted to JSON files
- [ ] Minimal descriptions sufficient for tool selection
- [ ] Resource description includes clear workflow
- [ ] System prompt enhanced (if controlling agent)
- [ ] Authorization enforcement implemented
- [ ] Session cleanup running
- [ ] Error messages include recovery URIs
- [ ] Logging enabled for debugging
- [ ] Integration tests passing
- [ ] LLM behavior tested with real queries
- [ ] Token savings measured and validated
- [ ] Documentation updated for users

---

## Further Reading

- [MCP Progressive Disclosure Specification v2.0](spec_mcp_progressive_disclosure_v2_0.md)
- [MCP Specification](https://spec.modelcontextprotocol.io/)
- [RFC 2119: Key words for RFCs](https://www.rfc-editor.org/rfc/rfc2119)

---

**Questions or Issues?**

If you encounter problems not covered in this guide, please:

1. Check the specification for normative requirements
2. Review the troubleshooting section
3. Test with minimal examples
4. Share findings with the community

**Last Updated:** 2025-11-30
**Companion Specification:** v2.0
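
---

## Appendix: Reproducing the Token Estimates

The figures in the Token Efficiency Analysis section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names (`eager_tokens`, `progressive_tokens`, `savings_fraction`, `progressive_wins`) are hypothetical, the defaults are this guide's own estimates (about 4,000 tokens per full description and 500 tokens for the minimal listing), and it assumes the only costs are the minimal listing plus each full description actually fetched. Per-request overhead is ignored, so measure real traffic before relying on these numbers.

```python
def eager_tokens(num_tools: int, full_tokens: int = 4000) -> int:
    """Tokens spent when every full description is sent up front."""
    return num_tools * full_tokens


def progressive_tokens(tools_fetched: int, full_tokens: int = 4000,
                       minimal_total: int = 500) -> int:
    """Tokens spent on the minimal listing plus each fetched description."""
    return minimal_total + tools_fetched * full_tokens


def savings_fraction(num_tools: int, tools_fetched: int,
                     full_tokens: int = 4000, minimal_total: int = 500) -> float:
    """Fraction of the eager baseline saved by progressive disclosure."""
    baseline = eager_tokens(num_tools, full_tokens)
    actual = progressive_tokens(tools_fetched, full_tokens, minimal_total)
    return 1 - actual / baseline


def progressive_wins(num_tools: int, tools_fetched: int,
                     full_tokens: int = 4000, minimal_total: int = 500) -> bool:
    """True when progressive disclosure costs fewer tokens than the baseline."""
    return (progressive_tokens(tools_fetched, full_tokens, minimal_total)
            < eager_tokens(num_tools, full_tokens))


if __name__ == "__main__":
    print(progressive_tokens(2))                 # 8500
    print(f"{savings_fraction(10, 2):.0%}")      # 79%
    print(progressive_wins(10, 10))              # False: every tool was fetched
```

Under these assumptions the saving shrinks as a session fetches more of the available tools, and disappears entirely once every tool has been fetched, because the minimal listing is then pure overhead.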