160be31b31
Dual-provider architecture:
- Both Gemini and Ollama initialize at startup (if configured)
- Primary (Gemini) tried first for every request
- On any error (429, 503, timeout), automatically falls back to Ollama
- No manual switching needed; completely transparent to the user
- Log shows: "Primary failed (gemini: ...), falling back to ollama: ..."

Warmup:
- POST /api/chat/warmup called silently when chat panel opens
- Pre-loads Ollama model in background (10-15s) while user reads welcome
- By the time user types, model is ready for instant response
- Warms up fallback provider specifically (Gemini doesn't need it)

Timeout:
- Agent context increased to 60s (Ollama first response can be slow)
- Each request creates a fresh session (stateless for fallback compat)
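
Illustrative sketch of the primary/fallback flow described above. This is not the code from this commit: the function name `complete_with_fallback`, the `(name, callable)` provider shape, and the exact log wording are assumptions made for the example.

```python
import logging
from typing import Callable, Optional, Tuple

log = logging.getLogger("chat")

# Hypothetical provider shape: a display name plus a callable that
# takes a prompt and returns the reply text.
Provider = Tuple[str, Callable[[str], str]]


def complete_with_fallback(
    prompt: str,
    primary: Provider,
    fallback: Optional[Provider] = None,
) -> str:
    """Try the primary provider first; on any error, use the fallback."""
    name, call = primary
    try:
        return call(prompt)
    except Exception as err:  # 429, 503, timeout: every failure is treated alike
        if fallback is None:
            # No fallback configured, so surface the original error.
            raise
        fb_name, fb_call = fallback
        # Log wording only approximates the message quoted in this commit.
        log.warning(
            "Primary failed (%s: %s), falling back to %s", name, err, fb_name
        )
        # Each call is independent of the failed one (stateless).
        return fb_call(prompt)
```

The caller never picks a provider: it passes both and gets back a reply from whichever one succeeds first, which matches the "no manual switching" behavior.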