runtime error

Exit code: 1. Reason:
    master_addr: "localhost",
    master_port: 29500,
    huggingface_hub_cache: Some(
        "/data",
    ),
    weights_cache_override: None,
    disable_custom_kernels: false,
    cuda_memory_fraction: 1.0,
    rope_scaling: None,
    rope_factor: None,
    json_output: false,
    otlp_endpoint: None,
    otlp_service_name: "text-generation-inference.router",
    cors_allow_origin: [],
    api_key: None,
    watermark_gamma: None,
    watermark_delta: None,
    ngrok: false,
    ngrok_authtoken: None,
    ngrok_edge: None,
    tokenizer_config_path: None,
    disable_grammar_support: false,
    env: false,
    max_client_batch_size: 4,
    lora_adapters: None,
    usage_stats: On,
    payload_limit: 2000000,
    enable_prefill_logprobs: false,
    graceful_termination_timeout: 90,
}
Attempt 1/120 - waiting... (TGI PID: 14)
2026-01-10T13:36:16.668095Z  WARN text_generation_launcher::gpu: Cannot determine GPU compute capability: ModuleNotFoundError: No module named 'torch'
2026-01-10T13:36:16.668122Z  INFO text_generation_launcher: Using attention flashinfer - Prefix caching true
2026-01-10T13:36:16.668736Z  INFO text_generation_launcher: Default `max_batch_prefill_tokens` to 4096
2026-01-10T13:36:16.668749Z  INFO text_generation_launcher: Using default cuda graphs [1, 2, 4, 8, 16, 32]
2026-01-10T13:36:16.668753Z  WARN text_generation_launcher: `trust_remote_code` is set. Trusting that model `NousResearch/Hermes-3-Llama-3.1-8B` do not contain malicious code.
2026-01-10T13:36:16.668864Z  INFO download: text_generation_launcher: Starting check and download process for NousResearch/Hermes-3-Llama-3.1-8B
2026-01-10T13:36:16.670055Z ERROR download: text_generation_launcher: Permission denied (os error 13)
Error: DownloadError
āœ— TGI process died! Check logs above for errors
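The fatal line is the ERROR from the download step: the launcher got "Permission denied (os error 13)" while writing to the configured cache directory (the log shows huggingface_hub_cache: Some("/data")), so the process running TGI cannot write there; the earlier 'torch' warning only affects GPU capability detection. A minimal pre-launch sketch of a writability check, assuming the /data path from the log and the HUGGINGFACE_HUB_CACHE environment variable honored by huggingface_hub (the helper name cache_writable is hypothetical, not part of TGI):

```python
import os

def cache_writable(path: str) -> bool:
    """Return True if `path` exists, is a directory, and the current
    user can write to it (the condition the TGI download step needs)."""
    return os.path.isdir(path) and os.access(path, os.W_OK)

if __name__ == "__main__":
    # "/data" is the cache path taken from the launcher log above.
    cache = os.environ.get("HUGGINGFACE_HUB_CACHE", "/data")
    if cache_writable(cache):
        print(f"cache dir OK: {cache}")
    else:
        # Typical fix inside the container: create the directory and give it
        # to the UID/GID the server actually runs as, e.g.
        #   mkdir -p /data && chown -R <uid>:<gid> /data
        print(f"cannot write to {cache}: fix ownership/permissions first")
```

Running a check like this before starting the launcher surfaces the permission problem directly instead of letting the download fail with DownloadError after startup.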
