cv-site/templates
juanatsap 160be31b31 feat: auto-fallback Gemini→Ollama + model warmup on chat open
Dual-provider architecture:
- Both Gemini and Ollama initialize at startup (if configured)
- Primary (Gemini) tried first for every request
- On any error (e.g. 429, 503, timeout), the request automatically falls back to Ollama
- No manual switching needed; the fallback is transparent to the user
- Log shows: "Primary failed (gemini: ...), falling back to ollama: ..."

Warmup:
- POST /api/chat/warmup called silently when chat panel opens
- Pre-loads the Ollama model in the background (10-15s) while the user reads the welcome message
- By the time the user types, the model should already be loaded, avoiding the cold-start delay
- Warms up fallback provider specifically (Gemini doesn't need it)

Timeout:
- Agent context timeout increased to 60s (Ollama's first response can be slow)
- Each request creates a fresh session (stateless for fallback compat)
2026-04-08 14:57:38 +01:00