feat: Ollama adapter + chat rate limiter (30 req/hour)
Ollama adapter (internal/chat/ollama.go): - Implements model.LLM interface for ADK Go - Talks to Ollama's OpenAI-compatible API (/v1/chat/completions) - Full tool/function calling support (tested with Mistral Small 3.2) - Converts ADK types to OpenAI format (messages, tools, tool_calls) - Configurable via OLLAMA_HOST and OLLAMA_MODEL env vars Multi-provider handler: - MODEL_PROVIDER env: "gemini" (default) or "ollama" - Gemini: requires GOOGLE_API_KEY (pay-as-you-go recommended) - Ollama: connects to local or Tailscale-remote instance Rate limiter: - 30 requests/hour per IP on /api/chat endpoint - Uses existing middleware.NewRateLimiter pattern Tested: Ollama + Mistral Small 3.2 on M4 Pro 64GB — correct answers
This commit is contained in:
@@ -18,8 +18,11 @@ func Setup(cvHandler *handlers.CVHandler, healthHandler *handlers.HealthHandler,
|
||||
mux.HandleFunc("/cv-jamr-", cvHandler.DefaultCVShortcut)
|
||||
|
||||
// API routes (must be before "/" to avoid catch-all)
|
||||
mux.HandleFunc("/api/cmd-k", cvHandler.CmdKData) // CMD+K command palette data
|
||||
mux.HandleFunc("/api/chat", chatHandler.HandleChat) // AI chat endpoint
|
||||
mux.HandleFunc("/api/cmd-k", cvHandler.CmdKData) // CMD+K command palette data
|
||||
|
||||
// Chat endpoint with rate limiting (30 requests/hour per IP)
|
||||
chatRateLimiter := middleware.NewRateLimiter(c.RateLimitChatRequests, c.RateLimitChatWindow)
|
||||
mux.Handle("/api/chat", chatRateLimiter.Middleware(http.HandlerFunc(chatHandler.HandleChat)))
|
||||
|
||||
// Public routes
|
||||
mux.HandleFunc("/", cvHandler.Home)
|
||||
|
||||
Reference in New Issue
Block a user