feat: Ollama adapter + chat rate limiter (30 req/hour)

Ollama adapter (internal/chat/ollama.go):
- Implements model.LLM interface for ADK Go
- Talks to Ollama's OpenAI-compatible API (/v1/chat/completions)
- Full tool/function calling support (tested with Mistral Small 3.2)
- Converts ADK types to OpenAI format (messages, tools, tool_calls)
- Configurable via OLLAMA_HOST and OLLAMA_MODEL env vars

Multi-provider handler:
- MODEL_PROVIDER env: "gemini" (default) or "ollama"
- Gemini: requires GOOGLE_API_KEY (pay-as-you-go recommended)
- Ollama: connects to local or Tailscale-remote instance

Rate limiter:
- 30 requests/hour per IP on /api/chat endpoint
- Uses existing middleware.NewRateLimiter pattern

Tested: Ollama + Mistral Small 3.2 on M4 Pro 64GB — correct answers
This commit is contained in:
juanatsap
2026-04-08 14:47:14 +01:00
parent 4f558ac842
commit 8205a22972
5 changed files with 510 additions and 17 deletions
+5 -2
View File
@@ -18,8 +18,11 @@ func Setup(cvHandler *handlers.CVHandler, healthHandler *handlers.HealthHandler,
mux.HandleFunc("/cv-jamr-", cvHandler.DefaultCVShortcut)
// API routes (must be before "/" to avoid catch-all)
mux.HandleFunc("/api/cmd-k", cvHandler.CmdKData) // CMD+K command palette data
mux.HandleFunc("/api/chat", chatHandler.HandleChat) // AI chat endpoint
mux.HandleFunc("/api/cmd-k", cvHandler.CmdKData) // CMD+K command palette data
// Chat endpoint with rate limiting (30 requests/hour per IP)
chatRateLimiter := middleware.NewRateLimiter(c.RateLimitChatRequests, c.RateLimitChatWindow)
mux.Handle("/api/chat", chatRateLimiter.Middleware(http.HandlerFunc(chatHandler.HandleChat)))
// Public routes
mux.HandleFunc("/", cvHandler.Home)