# 28. AI Chat Agent — CV Assistant Mascot
## 1. Overview
The CV site includes an AI-powered conversational assistant (the "mascot") that lets visitors ask natural language questions about the CV content. Built with [Google ADK Go 1.0](https://github.com/google/adk-go) (Agent Development Kit) and Gemini AI, it provides instant, accurate answers by querying the same cached JSON data that renders the site.
The mascot appears as a floating robot icon in the bottom-right corner of the page. Clicking it opens a chat panel where visitors can type questions or click suggested question chips. Answers are grounded in the cached CV data through the `query_cv` tool, so they stay in sync with the rendered site and are far less prone to hallucination than free-form generation.
**Why it exists:** A CV is a dense document. Visitors (recruiters, hiring managers) often have specific questions: "Does he know React?", "How many years of experience?", "What certifications?". Instead of making them scan every section, the mascot lets them ask directly and get precise, cross-referenced answers.
**Live example:** A visitor asks *"What is Juan's experience with Go?"* and gets a response listing Go projects (Immich Photo Manager, Cmux Resurrect), skill categories where Go appears, and experience entries involving Go — all pulled from the actual CV data in real time.
## 2. Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│ CV Site Server │
│ │
│ ┌──────────────┐ ┌──────────────────────────────────────┐ │
│ │ Data Cache │─────▶│ ADK Go Agent │ │
│ │ (cv-en.json) │ │ ┌────────────────────────────────┐ │ │
│ │ (cv-es.json) │ │ │ cv_assistant (LLM Agent) │ │ │
│ └──────────────┘ │ │ │ │ │
│ │ │ │ Tools: │ │ │
│ │ │ │ └─ query_cv(section, query) │ │ │
│ │ │ │ ├─ search (cross-section) │ │ │
│ │ │ │ ├─ experience │ │ │
│ │ │ │ ├─ projects │ │ │
│ │ │ │ ├─ skills │ │ │
│ │ │ │ ├─ education │ │ │
│ │ │ │ ├─ languages │ │ │
│ │ │ │ ├─ certifications │ │ │
│ │ │ │ ├─ courses │ │ │
│ │ │ │ ├─ awards │ │ │
│ │ │ │ ├─ summary │ │ │
│ │ │ │ └─ all │ │ │
│ │ │ └────────────────────────────────┘ │ │
│ │ └────────────────┬───────────────────┘ │
│ │ │ │
│ ┌──────▼────────────────────────────────▼────────────────────┐ │
│ │ POST /api/chat │ │
│ │ (chat.Handler) │ │
│ │ ├─ Session management (in-memory) │ │
│ │ ├─ ADK Runner execution │ │
│ │ ├─ Markdown-to-HTML conversion │ │
│ │ └─ HTML fragment response (HTMX swap) │ │
│ └────────────────────────────────────────────────────────────┘ │
│ ▲ │
│ │ hx-post="/api/chat" │
│ ┌───────────────────────────┴────────────────────────────────┐ │
│ │ Chat Widget (HTMX + Hyperscript) │ │
│ │ ├─ Floating mascot button (robot icon) │ │
│ │ ├─ Expandable chat panel │ │
│ │ ├─ Suggested question chips (5 per language) │ │
│ │ ├─ Message history with auto-scroll │ │
│ │ ├─ Typing indicator (animated dots) │ │
│ │ └─ Session ID persistence (OOB swap) │ │
│ └────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────┐
│ Gemini 2.5 Flash │
│ (Google AI) │
└──────────────────┘
```
### End-to-End Flow
1. **User clicks a chip or types a question** in the chat panel.
2. **Hyperscript** sets the input value (for chips) and triggers `submit` on `#chat-form`.
3. **HTMX** intercepts the form submit and sends `POST /api/chat` with `message`, `session_id`, and `lang` fields.
4. **Go handler** (`chat.HandleChat`) receives the request, ensures a session exists, and invokes the ADK runner with `runner.Run()`.
5. **ADK Runner** sends the message to Gemini along with the agent instruction and available tools.
6. **Gemini calls `query_cv`** with appropriate `section` and `query` parameters (the agent decides which sections to query based on its instruction strategy).
7. **`query_cv` tool** searches the cached CV JSON data (`cache.DataCache`) — the same data that renders the HTML pages. For technology queries, it performs cross-section search across experience, projects, skills, and courses simultaneously.
8. **Gemini synthesizes** the tool results into a natural language response.
9. **Handler renders** the response as an HTML fragment: user message bubble + agent message bubble + session ID hidden input.
10. **HTMX swaps** the fragment into `#chat-messages` with `beforeend` swap and auto-scrolls to the bottom.
11. **OOB swap** updates the `#chat-session-id` hidden input so subsequent messages maintain conversation context.
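
Steps 3, 4, and 9 of the flow above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's handler: `askFunc` is a hypothetical stand-in for the ADK runner call, the CSS class names are made up, and rejecting an empty message with a 400 is only one possible choice.

```go
package main

import (
	"fmt"
	"html"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// askFunc is a hypothetical stand-in for the ADK runner call (steps 5 to 8).
type askFunc func(sessionID, lang, message string) (string, error)

// chatHandler sketches steps 3, 4 and 9: read the form fields, ask the
// agent, and write an HTML fragment that HTMX appends to #chat-messages.
func chatHandler(ask askFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		msg := strings.TrimSpace(r.FormValue("message"))
		if msg == "" {
			// Assumption: the real handler may treat empty input differently.
			http.Error(w, "empty message", http.StatusBadRequest)
			return
		}
		lang := r.FormValue("lang")
		if lang == "" {
			lang = "en"
		}
		answer, err := ask(r.FormValue("session_id"), lang, msg)
		if err != nil {
			answer = "Sorry, something went wrong."
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		// Both bubbles are escaped; the CSS class names are illustrative.
		fmt.Fprintf(w, `<div class="chat-msg user">%s</div>`, html.EscapeString(msg))
		fmt.Fprintf(w, `<div class="chat-msg agent">%s</div>`, html.EscapeString(answer))
	}
}

// postMessage sends a form-encoded POST to h and returns status and body.
func postMessage(h http.Handler, message string) (int, string) {
	form := url.Values{"message": {message}, "lang": {"en"}}
	req := httptest.NewRequest(http.MethodPost, "/api/chat", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	echo := func(_, _, m string) (string, error) { return "echo: " + m, nil }
	code, body := postMessage(chatHandler(echo), "Does he know <b>React</b>?")
	fmt.Println(code)
	fmt.Println(body)
}
```

Passing the agent call in as a function keeps the HTTP layer testable without a Gemini key, which is also how the sketch exercises itself through `httptest`.
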
## 3. Components
### File Structure
```
internal/chat/
├── agent.go # Agent definition, query_cv tool, filter functions
└── handler.go # HTTP handler, session mgmt, Gemini init, response rendering
templates/partials/
├── widgets/chat-widget.html # HTMX chat panel with Hyperscript
└── modals/chat-help-modal.html # Help modal with example questions by category
static/css/04-interactive/
└── _chat.css # Styling (CV design tokens, dark theme, responsive)
tests/mjs/
└── 83-chat-mascot.test.mjs # 46 Playwright test assertions
```
### `internal/chat/agent.go`
Defines the single LLM agent (`cv_assistant`) with one tool (`query_cv`). Contains:
- **`NewAgent()`** — Creates the agent with a comprehensive instruction prompt covering 8 question types and query strategies.
- **`QueryCVArgs` / `QueryCVResult`** — Input/output structs for the tool with JSON schema annotations used by ADK for function calling.
- **`newQueryCVTool()`** — Wraps the query function as an agent-callable tool via `functiontool.New`. Supports 11 section values: `search`, `experience`, `projects`, `skills`, `education`, `languages`, `certifications`, `courses`, `awards`, `summary`, `all`.
- **Filter helpers** — `filterExperience()`, `filterProjects()`, `filterSkills()`, `filterCourses()` perform case-insensitive keyword matching across all relevant fields (title, company, technologies, descriptions, responsibilities).
- **`matchesAny()` / `matchesSlice()`** — Low-level string matching used by all filters.
- **`calculateYears()`** — Computes years of experience from career start date (April 2005).
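
The filter helpers listed above might look roughly like the following. This is a minimal sketch, assuming plain substring matching and a trimmed-down `Experience` struct; the real model has more fields, and the sample entries are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// Experience is a trimmed-down, hypothetical version of the CV model.
type Experience struct {
	Company      string
	Position     string
	Technologies []string
}

// matchesAny reports whether query occurs, case-insensitively, in any field.
func matchesAny(query string, fields ...string) bool {
	q := strings.ToLower(strings.TrimSpace(query))
	for _, f := range fields {
		if strings.Contains(strings.ToLower(f), q) {
			return true
		}
	}
	return false
}

// matchesSlice applies the same check to a list such as Technologies.
func matchesSlice(query string, items []string) bool {
	return matchesAny(query, items...)
}

// filterExperience keeps matching entries; an empty query keeps everything.
func filterExperience(entries []Experience, query string) []Experience {
	if strings.TrimSpace(query) == "" {
		return entries
	}
	var out []Experience
	for _, e := range entries {
		if matchesAny(query, e.Company, e.Position) || matchesSlice(query, e.Technologies) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	entries := []Experience{
		{Company: "Example Corp", Position: "Backend Developer", Technologies: []string{"Go", "HTMX"}},
		{Company: "Sample Ltd", Position: "Java Engineer"},
	}
	for _, e := range filterExperience(entries, "GO") {
		fmt.Println(e.Company) // prints "Example Corp"
	}
}
```

One caveat of substring matching: a short query such as `go` also matches words like "MongoDB". Whether the project's helpers guard against that is not stated here.
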
**Why a single agent?** The CV data is structured and bounded. There is no need for multi-agent orchestration. One agent with one tool is the right abstraction: simple, fast, predictable.
### `internal/chat/handler.go`
Handles the HTTP lifecycle:
- **`NewHandler()`** — Initializes Gemini model, creates the agent, sets up in-memory session service and ADK runner. Returns a disabled handler if `GOOGLE_API_KEY` is not set.
- **`Enabled()`** — Boolean check used by templates to conditionally render the widget.
- **`HandleChat()`** — Processes `POST /api/chat`. Validates input, ensures session exists, runs the agent with a 30-second timeout (using a dedicated context, not the HTTP request context), renders the HTML fragment response.
- **`formatResponse()`** — Converts basic markdown to HTML: escapes HTML entities first, then applies `**bold**` to `<strong>`, converts `- ` bullet lines to `<ul><li>`, and wraps text in `<p>` tags.
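
The escape-first ordering described for `formatResponse()` is what keeps LLM output from injecting markup. A minimal sketch of that ordering, assuming only the three conversions named above (bold, `- ` bullets, paragraphs); the real function may handle more cases:

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// boldRe matches **text** after escaping (asterisks are not HTML-escaped).
var boldRe = regexp.MustCompile(`\*\*(.+?)\*\*`)

// formatResponse escapes everything first, then adds a small set of safe tags.
func formatResponse(text string) string {
	escaped := html.EscapeString(text)
	escaped = boldRe.ReplaceAllString(escaped, "<strong>$1</strong>")

	var b strings.Builder
	inList := false
	for _, line := range strings.Split(escaped, "\n") {
		line = strings.TrimSpace(line)
		isBullet := strings.HasPrefix(line, "- ")
		if inList && !isBullet {
			b.WriteString("</ul>") // a non-bullet line ends the current list
			inList = false
		}
		switch {
		case line == "":
			continue
		case isBullet:
			if !inList {
				b.WriteString("<ul>")
				inList = true
			}
			b.WriteString("<li>" + strings.TrimPrefix(line, "- ") + "</li>")
		default:
			b.WriteString("<p>" + line + "</p>")
		}
	}
	if inList {
		b.WriteString("</ul>")
	}
	return b.String()
}

func main() {
	fmt.Println(formatResponse("**Go** projects:\n- Immich Photo Manager\n- <script>alert(1)</script>"))
}
```

Because escaping happens before any tag is added, the only angle brackets that reach the browser are the ones this function writes itself.
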
### `templates/partials/widgets/chat-widget.html`
The HTMX + Hyperscript chat UI. Conditionally rendered with `{{if .ChatEnabled}}`. Contains:
- **Toggle button** — Fixed position, Hyperscript toggles `.chat-open` on the panel and `.mascot-active` on itself.
- **Chat header** — Blue bar with robot icon, title (bilingual), and help button that opens the help modal via `command="show-modal"`.
- **Messages area** — Scrollable container (`#chat-messages`) where HTMX appends response fragments.
- **Typing indicator** — Three animated dots, shown/hidden via HTMX's `hx-indicator`.
- **Suggested question chips** — 5 per language, using Hyperscript (`_="on click set #chat-input.value to '...' then trigger submit on #chat-form"`).
- **Input form** — `hx-post="/api/chat"` with `hx-swap="beforeend scroll:#chat-messages:bottom"`. Hyperscript clears the input after each request.
### `templates/partials/modals/chat-help-modal.html`
A native `<dialog>` element styled as a modal. Organized into 6 sections with example questions:
1. **About Experience** — Years of experience, companies, specific employers (Olympic Broadcasting, SAP)
2. **About Technologies** — Programming languages, React, Go, Node.js
3. **About Projects** — Personal projects, Immich Photo Manager, open-source work
4. **Education & Certifications** — Certifications, education, courses
5. **About Skills** — Technical skills, Docker, CI/CD
6. **How it works** — Brief explanation of ADK Go + Gemini powering the assistant
Each section contains 3-4 example questions in both English and Spanish (toggled by `{{if eq .Lang "es"}}`).
### `static/css/04-interactive/_chat.css`
Complete styling for the chat widget. See sections 10 and 11 for design system and dark theme details.
## 4. The `query_cv` Tool
The `query_cv` tool is the agent's only way to access CV data. It accepts three parameters:
| Parameter | Type | Description |
|-----------|------|-------------|
| `section` | string | Which CV section to query (see values below) |
| `query` | string | Keyword filter. Empty returns all items in the section. |
| `language` | string | `"en"` or `"es"`. Defaults to `"en"`. |
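
As a rough illustration of the parameter table, the input struct could be declared along these lines. The tag texts, the `parseArgs` helper, and the rejection of unknown sections are assumptions made for this sketch; only the three field names and the `"en"` default come from the table.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QueryCVArgs mirrors the three documented parameters. The jsonschema tag
// texts are illustrative descriptions, not the project's actual wording.
type QueryCVArgs struct {
	Section  string `json:"section" jsonschema:"CV section to query"`
	Query    string `json:"query,omitempty" jsonschema:"keyword filter"`
	Language string `json:"language,omitempty" jsonschema:"en or es"`
}

// validSections lists the 11 section values documented for query_cv.
var validSections = map[string]bool{
	"search": true, "experience": true, "projects": true, "skills": true,
	"education": true, "languages": true, "certifications": true,
	"courses": true, "awards": true, "summary": true, "all": true,
}

// parseArgs is a hypothetical helper: it decodes a tool call payload,
// checks the section name, and applies the documented language default.
func parseArgs(raw string) (QueryCVArgs, error) {
	var args QueryCVArgs
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		return QueryCVArgs{}, err
	}
	if !validSections[args.Section] {
		return QueryCVArgs{}, fmt.Errorf("unknown section %q", args.Section)
	}
	if args.Language == "" {
		args.Language = "en"
	}
	return args, nil
}

func main() {
	args, err := parseArgs(`{"section":"search","query":"go"}`)
	fmt.Println(args.Section, args.Query, args.Language, err)
}
```
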
### Section Values
| Section | Returns | Filter Behavior |
|---------|---------|-----------------|
| `search` | Cross-section results (map with keys: experience, projects, skills, courses) | Case-insensitive keyword match across ALL four sections simultaneously |
| `experience` | `[]Experience` | Filters by company, position, location, dates, technologies, responsibilities, short description |
| `projects` | `[]Project` | Filters by title, short description, location, technologies, responsibilities |
| `skills` | `[]SkillCategory` | Filters by category name and individual skill items |
| `education` | `[]Education` | Returns all (no filtering) |
| `languages` | `[]Language` | Returns all (no filtering) |
| `certifications` | `[]Certification` | Returns all (no filtering) |
| `courses` | `[]Course` | Filters by title, institution, description |
| `awards` | `[]Award` | Returns all (no filtering) |
| `summary` | `{summary, years_of_experience}` | Returns the professional summary and calculated years |
| `all` | `{experience_count, project_count, skill_categories, ...}` | Returns high-level counts across all sections |
The tool reads from `cache.DataCache` — the same in-memory cache that powers the website rendering. Zero additional I/O, zero data duplication.
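
The `years_of_experience` value in the `summary` row comes from `calculateYears()`, which counts from the April 2005 career start. A minimal sketch of one way to compute it, assuming whole years are counted and the start falls on the first of the month (neither detail is stated in this document); `yearsBetween` and `yearsOn` are hypothetical helper names:

```go
package main

import (
	"fmt"
	"time"
)

// careerStart is April 2005; the day of the month is an assumption.
var careerStart = time.Date(2005, time.April, 1, 0, 0, 0, 0, time.UTC)

// yearsBetween returns the number of whole years elapsed from start to now.
func yearsBetween(start, now time.Time) int {
	years := now.Year() - start.Year()
	// Subtract one if this year's anniversary has not been reached yet.
	if now.Before(start.AddDate(years, 0, 0)) {
		years--
	}
	if years < 0 {
		return 0
	}
	return years
}

// yearsOn evaluates the calculation for a fixed calendar date.
func yearsOn(year, month, day int) int {
	now := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	return yearsBetween(careerStart, now)
}

func main() {
	fmt.Println(yearsOn(2026, 4, 9))                   // 21
	fmt.Println(yearsBetween(careerStart, time.Now())) // depends on today's date
}
```
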
## 5. Cross-Section Search
When `section="search"`, the tool performs a simultaneous keyword search across four sections:
1. **Experience** — Matches in company name, position, location, dates, technologies list, responsibilities list, and short description.
2. **Projects** — Matches in title, short description, location, technologies list, and responsibilities list.
3. **Skills** — Matches in skill category name (e.g., "Languages", "DevOps") and individual skill items.
4. **Courses** — Matches in title, institution, and description.
### Why Cross-Section Search Matters
Technology queries are the most common use case, and technologies can appear in multiple sections. For example, asking about "Java":
- **Experience**: Appears in 5+ job entries where Java was used
- **Projects**: May appear in project tech stacks
- **Skills**: Listed under "Programming Languages" with proficiency level
- **Courses**: May appear in training course titles
Without cross-section search, the agent would need to make 4 separate tool calls. With `section="search"`, a single call returns all matches organized by section, giving the agent complete context to synthesize a comprehensive answer.
### Return Format
```json
{
"section": "search",
"query": "go",
"total_found": 5,
"data": {
"experience": [...],
"projects": [...],
"skills": [...],
"courses": [...]
}
}
```
Only sections with matches are included in the result.
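
The shape of that result, including the rule that empty sections are omitted, can be sketched as follows. This is a simplification under stated assumptions: each section is reduced to a list of plain strings, `total_found` is taken to be the sum of hits across sections, and the sample data is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// SearchResult mirrors the JSON shape shown above.
type SearchResult struct {
	Section    string              `json:"section"`
	Query      string              `json:"query"`
	TotalFound int                 `json:"total_found"`
	Data       map[string][]string `json:"data"`
}

// crossSearch runs one keyword over every section and keeps only the
// sections that produced at least one case-insensitive match.
func crossSearch(sections map[string][]string, query string) SearchResult {
	q := strings.ToLower(strings.TrimSpace(query))
	res := SearchResult{Section: "search", Query: query, Data: map[string][]string{}}
	for name, items := range sections {
		var hits []string
		for _, item := range items {
			if strings.Contains(strings.ToLower(item), q) {
				hits = append(hits, item)
			}
		}
		if len(hits) > 0 {
			res.Data[name] = hits
			res.TotalFound += len(hits)
		}
	}
	return res
}

func main() {
	cv := map[string][]string{
		"experience": {"Backend developer using Go and HTMX"},
		"projects":   {"Immich Photo Manager (Go)", "Portfolio site (Rust)"},
		"skills":     {"Languages: Java, Python"},
		"courses":    {"Docker fundamentals"},
	}
	out, _ := json.MarshalIndent(crossSearch(cv, "Go"), "", "  ")
	fmt.Println(string(out))
}
```

With this sample data only `experience` and `projects` appear under `data`, so the model sees at a glance which sections mention the technology.
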
## 6. Agent Intelligence
The agent instruction defines query strategies for 8 question types. This is the core of the agent's intelligence — it tells Gemini exactly which section(s) to query for each type of question.
| # | Question Type | Query Strategy | Example |
|---|---------------|----------------|---------|
| 1 | **Technology** (Java, Go, React, Docker) | `section="search"` with technology name | "Has he worked with React?" -> `search`, query=`"react"` |
| 2 | **Company / Employer** | List all: `section="experience"` no query. Specific: `section="search"` with company name. | "What companies?" -> `experience`. "Tell me about SAP" -> `search`, query=`"sap"` |
| 3 | **Years / Career Overview** | `section="summary"` for years. `section="all"` for overview. | "How many years?" -> `summary` |
| 4 | **Projects** | List all: `section="projects"` no query. By tech: `section="search"`. | "Go projects?" -> `search`, query=`"go"` |
| 5 | **Education & Certifications** | `section="certifications"`, `section="education"`, or `section="courses"`. Topic-specific: `section="search"`. | "What certifications?" -> `certifications` |
| 6 | **Skills** | All skills: `section="skills"` no query. Specific: `section="search"`. | "Main skills?" -> `skills`. "Docker?" -> `search`, query=`"docker"` |
| 7 | **Awards** | `section="awards"` | "Any awards?" -> `awards` |
| 8 | **Language Proficiency** | `section="languages"` | "What languages does he speak?" -> `languages` |
### Bonus Context in the Instruction
The agent instruction also tells Gemini:
- The CV website itself is built with Go, HTMX, Hyperscript, and vanilla CSS — a real-world showcase of Juan's skills.
- The chat assistant is powered by Google ADK Go 1.0 and Gemini AI — another demonstration of Go expertise.
- For general questions ("tell me about Juan"), use `summary` first, then `all`.
### Language Behavior
The agent is instructed to respond in the same language the user writes in. If the user asks in Spanish, the response comes back in Spanish. This is handled entirely by Gemini's multilingual capabilities — no language detection code is needed.
## 7. Suggested Questions
The chat panel displays 5 clickable question chips per language. These serve as onboarding — showing visitors what they can ask.
### English Chips
| Chip Label | Full Question Sent |
|------------|-------------------|
| Go projects? | "What Go projects has he built?" |
| Years of experience? | "How many years of experience?" |
| Companies? | "What companies has he worked at?" |
| Knows React? | "Does he know React?" |
| Certifications? | "What certifications?" |
### Spanish Chips
| Chip Label | Full Question Sent |
|------------|-------------------|
| Proyectos en Go? | "Que proyectos en Go ha hecho?" |
| Anos de experiencia? | "Cuantos anos de experiencia tiene?" |
| Empresas? | "En que empresas ha trabajado?" |
| Conoce React? | "Conoce React?" |
| Certificaciones? | "Que certificaciones tiene?" |
### How Chips Work (Hyperscript)
Each chip uses Hyperscript to set the input value and trigger HTMX form submission:
```html
<button type="button" class="chat-chip"
_="on click set #chat-input.value to 'What Go projects has he built?' then trigger submit on #chat-form">
Go projects?
</button>
```
The flow:
1. Hyperscript `on click` handler fires.
2. `set #chat-input.value to '...'` writes the full question into the text input.
3. `trigger submit on #chat-form` dispatches a native `submit` event on the form element.
4. HTMX intercepts the submit event (because the form has `hx-post`) and sends the POST request.
This approach was chosen over an inline `onclick` handler that calls `htmx.trigger()`: Hyperscript's `trigger <event> on <element>` syntax resolves the CSS selector itself, keeps the behavior declarative in the markup, and works natively with HTMX's event listening.
## 8. Help Modal
The help modal (`chat-help-modal.html`) is a native `<dialog>` element opened via the `?` button in the chat header using the Invoker Commands API (`commandfor="chat-help-modal" command="show-modal"`).
### Structure
The modal contains 6 sections with 3-4 example questions each:
1. **About Experience** (briefcase icon) — Career duration, companies, specific employers
2. **About Technologies** (code-tags icon) — Programming languages, specific technologies
3. **About Projects** (rocket icon) — Personal projects, open-source, specific projects
4. **Education & Certifications** (school icon) — Certifications, education, courses
5. **About Skills** (star icon) — Technical skills, specific tools
6. **How it works** (info icon) — Brief explanation of the AI powering it
All text is bilingual (English/Spanish) using Go template conditionals. The modal uses the same `info-modal` CSS classes as other site modals (keyboard shortcuts, etc.) for visual consistency.
### Closing Mechanism
The modal closes via:
- The X button (`commandfor="chat-help-modal" command="close"`)
- Clicking the backdrop (Hyperscript: `_="on click call closeOnBackdrop(me, event)"`)
## 9. Graceful Degradation
The chat feature is entirely optional. When `GOOGLE_API_KEY` is not set:
1. `chat.NewHandler()` detects the missing key and returns `&Handler{enabled: false}`.
2. The CV handler receives `chatEnabled: false` from `handler.Enabled()`.
3. Template data includes `ChatEnabled: false`.
4. The chat widget template renders nothing — `{{if .ChatEnabled}}...{{end}}` produces zero HTML.
5. No JavaScript errors, no broken UI, no hidden network requests, no console warnings.
The same graceful fallback applies if:
- The Gemini model fails to initialize (bad API key, network error).
- The ADK agent creation fails.
- The ADK runner creation fails.
In each case, the handler logs a warning and disables itself. The rest of the site is completely unaffected.
**Zero impact on the site when disabled.**
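
The disable-on-failure pattern described in this section can be sketched as below. It is an illustration only: `newHandler`, `setupFunc`, `okSetup`, and `failingSetup` are hypothetical names, and the single `setup` callback stands in for the Gemini model, agent, and runner initialization, whose real signatures are not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
)

// Handler carries only the flag that templates check through Enabled().
type Handler struct {
	enabled bool
}

// setupFunc is a hypothetical stand-in for model, agent and runner setup.
type setupFunc func(apiKey string) error

// newHandler never returns an error: any failure yields a disabled handler.
func newHandler(apiKey string, setup setupFunc) *Handler {
	if apiKey == "" {
		log.Println("chat: GOOGLE_API_KEY not set, chat disabled")
		return &Handler{enabled: false}
	}
	if err := setup(apiKey); err != nil {
		log.Printf("chat: initialization failed, chat disabled: %v", err)
		return &Handler{enabled: false}
	}
	return &Handler{enabled: true}
}

// Enabled is safe to call on a nil handler.
func (h *Handler) Enabled() bool { return h != nil && h.enabled }

// okSetup and failingSetup simulate the two initialization outcomes.
func okSetup(string) error      { return nil }
func failingSetup(string) error { return errors.New("model init failed") }

func main() {
	h := newHandler(os.Getenv("GOOGLE_API_KEY"), okSetup)
	fmt.Println("ChatEnabled:", h.Enabled())
}
```

Returning a usable but disabled handler, rather than an error, means callers need no special case: they pass `Enabled()` to the template and the widget simply does not render.
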
## 10. Design System
The chat widget integrates with the CV site's existing design system, using the same CSS custom properties (design tokens) defined in `_variables.css`.
### Colors
| Element | Token | Default Value |
|---------|-------|---------------|
| Toggle button background | `--black-bar` | `#2b2b2b` |
| Toggle button hover / active | `--accent-blue` | `#0066cc` |
| Panel background | `--paper-bg` | `#ffffff` |
| Panel border | `--border-light` | `#e0e0e0` |
| Header background | `--accent-blue` | `#0066cc` |
| Agent bubble background | `--paper-secondary-bg` | `#f5f5f5` |
| Agent bubble text | `--text-secondary` | `#333333` |
| User bubble background | `--accent-blue` | `#0066cc` |
| User bubble text | (hardcoded) | `#ffffff` |
| Chip text | `--text-muted` | `#666666` |
| Chip border | `--border-light` | `#e0e0e0` |
| Input border focus | `--accent-blue` | `#0066cc` |
| Typing dots | `--text-light` | `#999999` |
### Typography
| Element | Font Family | Size |
|---------|-------------|------|
| Header | Quicksand (matches site headings) | 0.85rem |
| Messages | Source Sans Pro (matches body text) | 0.8rem |
| Chips | Source Sans Pro | 0.68rem |
| Input | Source Sans Pro | 0.8rem |
### Layout
- **Toggle button**: Fixed, `bottom: 6rem`, `right: 2rem`, 50px circle. Positioned just above the back-to-top button.
- **Chat panel**: Fixed, `bottom: 10.5rem`, `right: 2rem`, 360px wide, max 500px tall. Above the toggle button.
- **Shadow**: `var(--shadow-lg)` for the panel, custom shadow for the button.
- **Border radius**: 8px for the panel, 50% for the button, 8px for message bubbles (with 2px on the pointed corner), 14px for chips, 16px for the input.
### Responsive (Mobile)
At `max-width: 480px`:
- Panel goes full-width, bottom-anchored with top rounded corners.
- Button moves to `bottom: 5rem`, `right: 1rem`.
- Messages area reduces to `max-height: 200px`.
## 11. Dark Theme
The site's dark theme class (`.theme-clean`) triggers a complete color override for the chat widget:
| Element | Light | Dark |
|---------|-------|------|
| Panel background | `#ffffff` | `#1a1a1a` |
| Panel border | `#e0e0e0` | `#333333` |
| Header | `#0066cc` | `#003d7a` |
| Agent bubble | `#f5f5f5` | `#2a2a2a` |
| Agent text | `#333333` | `#d0d0d0` |
| User bubble | `#0066cc` | `#004d99` |
| Error bubble bg | `#fef2f2` | `#3a1010` |
| Error text | `#991b1b` | `#fca5a5` |
| Error border | `#fecaca` | `#5a1a1a` |
| Input area bg | `#ffffff` | `#1a1a1a` |
| Input bg | (same) | `#111111` |
| Input border | `#e0e0e0` | `#333333` |
| Chip text | `#666666` | `#999999` |
| Chip border | `#e0e0e0` | `#333333` |
| Chip hover bg | `#0066cc` | `#004d99` |
| Typing dots | `#999999` | `#555555` |
All dark theme rules are scoped under `.theme-clean` to avoid conflicts with the default light theme.
## 12. Session Management
The chat uses ADK Go's built-in `session.InMemoryService()` to maintain conversation context across multiple messages.
### Session Lifecycle
1. **First message**: No `session_id` in the form. Handler sets `sessionID = "default"` then tries to `Get()` it.
2. **Session not found**: Handler calls `session.Create()` which returns a new session with a UUID.
3. **Response includes session ID**: An OOB-swapped hidden input is appended to the response:
```html
<input type="hidden" id="chat-session-id" name="session_id" value="<uuid>"
form="chat-form" hx-swap-oob="true"/>
```
4. **Subsequent messages**: The form now includes the session ID. The handler calls `Get()` which succeeds, and the conversation continues with context.
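
The lifecycle above can be sketched without ADK. In this illustration `sessionStore` is a hypothetical in-memory map, not ADK's `session.InMemoryService()`, and a random hex string stands in for the UUID; `sessionInput` renders the out-of-band hidden input shown in step 3.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"html"
	"sync"
)

// sessionStore is a hypothetical stand-in for the in-memory session service.
type sessionStore struct {
	mu       sync.Mutex
	sessions map[string][]string // session ID -> message history
}

func newSessionStore() *sessionStore {
	return &sessionStore{sessions: map[string][]string{}}
}

// ensure returns id when it names an existing session; otherwise it
// creates a session under a fresh random ID and returns that ID.
func (s *sessionStore) ensure(id string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.sessions[id]; ok {
		return id
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		panic(err) // the system random source is unavailable
	}
	newID := hex.EncodeToString(buf)
	s.sessions[newID] = nil
	return newID
}

// sessionInput renders the hidden input that HTMX swaps out-of-band.
func sessionInput(sessionID string) string {
	return fmt.Sprintf(
		`<input type="hidden" id="chat-session-id" name="session_id" value="%s" form="chat-form" hx-swap-oob="true"/>`,
		html.EscapeString(sessionID),
	)
}

func main() {
	store := newSessionStore()
	first := store.ensure("default") // first message: no session yet
	again := store.ensure(first)     // follow-up: same session is reused
	fmt.Println(first == again)      // true
	fmt.Println(sessionInput(first))
}
```

Escaping the ID before writing it into the attribute matters because the value round-trips through the client and comes back as untrusted form input.
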
### Key Properties
- **In-memory only**: Sessions are not persisted to disk. Server restart clears all sessions.
- **Per-visitor isolation**: Each visitor gets an independent session. No session data is shared.
- **OOB swap**: The session ID is injected using HTMX's out-of-band swap mechanism (`hx-swap-oob="true"`), which replaces the hidden input by ID without affecting the chat messages swap.
- **Conversation context**: ADK's session service stores the full message history, allowing Gemini to handle follow-up questions (e.g., "Tell me more about that company").
## 13. Security
### Input Sanitization
- **User messages**: HTML-escaped via `html.EscapeString()` before rendering in the response fragment. Prevents XSS through user input.
- **Agent responses**: Processed through `formatResponse()` which first escapes all HTML, then applies safe markdown-to-HTML conversion (bold, bullet lists, paragraphs). No raw HTML from the LLM reaches the browser.
### Privacy Protection
The agent instruction explicitly states: *"Never reveal personal contact details (email, phone) — point them to the contact form on the website."* This prevents the agent from disclosing contact information even if it exists in the CV data.
### Infrastructure Security
- The `/api/chat` endpoint inherits the site's full middleware chain: recovery, logging, security headers (CSP, HSTS, X-Frame-Options, etc.).
- The agent context uses a 30-second timeout (`context.WithTimeout`) to prevent runaway requests.
- The agent context is detached from the HTTP request context (`context.Background()`) to avoid cancellation if the client disconnects mid-processing.
- Sessions are ephemeral (in-memory only) and not accessible across visitors.
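
The timeout and detachment points above combine into a small pattern: derive the agent context from `context.Background()` rather than from the request, and bound it with `context.WithTimeout`. A minimal sketch, where `runDetached`, `fakeAgent`, `askWithin`, and `timedOut` are hypothetical helpers and the fake agent merely sleeps:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// agentTimeout is the 30-second limit described above.
const agentTimeout = 30 * time.Second

// agentFunc is a hypothetical stand-in for running the agent.
type agentFunc func(ctx context.Context) (string, error)

// runDetached runs the agent under a context that is independent of any
// HTTP request and is cancelled once limit has elapsed.
func runDetached(limit time.Duration, run agentFunc) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()
	return run(ctx)
}

// fakeAgent answers after delay unless the context ends first.
func fakeAgent(delay time.Duration) agentFunc {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(delay):
			return "ok", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// askWithin runs a fake agent that needs workMs under a limitMs budget.
func askWithin(limitMs, workMs int) (string, error) {
	limit := time.Duration(limitMs) * time.Millisecond
	work := time.Duration(workMs) * time.Millisecond
	return runDetached(limit, fakeAgent(work))
}

// timedOut reports whether err was caused by the deadline.
func timedOut(err error) bool { return errors.Is(err, context.DeadlineExceeded) }

func main() {
	text, err := runDetached(agentTimeout, fakeAgent(10*time.Millisecond))
	fmt.Println(text, err) // ok <nil>

	_, err = askWithin(20, 500)
	fmt.Println(timedOut(err)) // true
}
```

The trade-off of detaching is that work continues after a client disconnects, so the timeout is the only bound on how long an abandoned request can hold resources.
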
### Rate Limiting
Gemini 2.5 Flash's free tier enforces 15 requests/minute at the API level, which acts as the upstream limit. The endpoint also runs behind the site's existing middleware chain (recovery, logging, security headers).
## 14. Testing
The chat mascot has a comprehensive Playwright test suite at `tests/mjs/83-chat-mascot.test.mjs` with **46 test assertions** across 25 test groups.
### Test Coverage
| Group | Tests | What's Verified |
|-------|-------|-----------------|
| 1. Mascot Button Presence | 3 | Toggle button visible, robot icon shown, close icon hidden initially |
| 2. Initial State | 1 | Chat panel hidden by default |
| 3. Open Chat Panel | 2 | Panel opens on click, button gets `.mascot-active` class |
| 4. Help Card (Onboarding) | 3 | Help card visible on first open, contains description, dismiss button present |
| 5. Dismiss Help Card | 1 | Help card hides after dismiss click |
| 6. Re-toggle Help Card | 1 | Help card re-opens via `?` button |
| 7. Welcome Message | 1 | English welcome message present |
| 8. Suggested Question Chips | 2 | 5 chips exist, first chip has text content |
| 9. Text Input | 2 | Input visible, has correct placeholder |
| 10. Send Button | 1 | Send button visible |
| 11. Chip Click -> Submit | 3 | User message appears, agent responds, response mentions Go |
| 12. Type Custom Question | 2 | Custom message appears, agent responds |
| 13. Input Clear After Submit | 1 | Input value is empty after submission |
| 14. Session Persistence | 1 | Session ID set after first response |
| 15. Close and Reopen | 3 | Panel closes, reopens, messages preserved |
| 16. Header Content | 2 | Shows "CV Assistant", help button present |
| 17. Spanish Language | 4 | Spanish header, chips, welcome message, placeholder |
| 18. Empty Message Handling | 1 | Graceful handling (no crash) |
| 19. Console Errors | 1 | No chat/htmx/hyperscript console errors |
| 20. CSS Positioning | 2 | Button on right side, panel on right side |
| 21. Intelligence: Go (cross-section) | 2 | Finds projects (Immich/Cmux), mentions skills |
| 22. Intelligence: Companies | 3 | Lists Olympic Broadcasting, Insa, SAP/Gigya |
| 23. Intelligence: Years | 1 | Reports 21 years of experience |
| 24. Intelligence: React (cross-section) | 1 | Finds experience entries with React |
| 25. Intelligence: Spanish response | 1 | Responds in Spanish when asked in Spanish |
### Running Tests
```bash
# Run the chat mascot test (requires running server with GOOGLE_API_KEY)
bun tests/mjs/83-chat-mascot.test.mjs
# Run all frontend tests
bun tests/run-all.mjs
```
Tests 21-25 (intelligence tests) require a valid `GOOGLE_API_KEY` and make real API calls to Gemini. They verify that the agent produces accurate, cross-referenced answers from the CV data.
## 15. Configuration
### Required
```bash
# .env
GOOGLE_API_KEY=your-gemini-api-key # From https://aistudio.google.com/apikey
```
Without this key, the chat feature is silently disabled (see section 9).
### Optional
```bash
MODEL_NAME=gemini-2.5-flash # Default model (free tier compatible)
```
### Cost
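Gemini 2.5 Flash free tier provides **15 requests/minute** with no credit card required. Each chat message consumes at least one request, and a message that triggers a `query_cv` tool call typically needs a follow-up request to turn the tool result into an answer. For a personal CV site, this is more than sufficient.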
If the free tier is exceeded, Gemini returns a rate limit error, which the handler catches and displays as a generic error message to the user.
## 16. LLM Provider Evolution & Benchmarks
### Provider Architecture
The chat uses a **dual-provider** strategy with automatic fallback:
```
Primary: Gemini 2.5 Flash (Google API — production)
↓ (if fails)
Fallback: Gemma 4 26B MoE via Ollama (local — development)
```
In development, set `GOOGLE_API_KEY=""` to force the local model.
### Model Selection History
| Date | Local Model | Why Changed |
|------|-------------|-------------|
| 2026-03 | Mistral Small 3.2 (24B) | Initial choice — good tool calling support |
| 2026-04-09 | GLM-4.7-Flash (30B) | Better quality, Spanish support, 198K context |
| 2026-04-09 | **Gemma 4 26B MoE** | 3-4x faster (MoE: only 4B active), less RAM, excellent quality |
### Benchmark: Same 4 Questions, Same Hardware
| Test | Gemini 2.5 Flash | Gemma 4 26B | GLM-4.7-Flash | Mistral 3.2 |
|------|-----------------|-------------|---------------|-------------|
| Summary (no tool) | **2s** | 5s | 7s | 10s |
| Go question (tool call) | **6s** | 13s | 42s | 50s |
| Spanish tech search | **4s** | 10s | 15s | No Spanish response |
| All companies (heavy) | **3s** | 16s | 63s | Timeout |
### Quality Comparison
| Aspect | Gemini 2.5 Flash | Gemma 4 26B | GLM-4.7-Flash | Mistral 3.2 |
|--------|-----------------|-------------|---------------|-------------|
| Language detection | Excellent | Excellent | Good | Poor |
| Tool calling | Native (ADK) | Via OpenAI compat | Via OpenAI compat | Via OpenAI compat |
| CV navigation links | Correct | Partial | Rare | None |
| Response exhaustiveness | Very complete | Very complete | Complete | Acceptable |
| Hallucination rate | None (tool-grounded) | None | Low | Medium |
### Resource Usage
| Model | Parameters | Active | Disk | RAM (inference) | Offline |
|-------|-----------|--------|------|-----------------|---------|
| Gemini 2.5 Flash | Cloud | Cloud | 0 | 0 | No |
| Gemma 4 26B MoE | 26B | **4B** | 18GB | **~8GB** | Yes |
| GLM-4.7-Flash | 30B | 30B | 19GB | ~19GB | Yes |
| Mistral Small 3.2 | 24B | 24B | 15GB | ~15GB | Yes |
### Why Gemma 4 26B MoE Wins for Local Development
1. **Mixture of Experts**: 26B total but only 4B activated per token — inference speed close to a 4B model with quality of a much larger one
2. **3-4x faster** than dense 30B models (GLM) on the same hardware
3. **~8GB RAM** vs 19GB for GLM — leaves room for other dev tools
4. **Excellent Spanish**: Responds in the correct language consistently
5. **Tool calling works**: Compatible with the OpenAI-compatible Ollama adapter
6. **256K context**: Largest context window of all local options tested
### Configuration
```bash
# .env — Development (local Gemma 4)
OLLAMA_MODEL=gemma4:26b
# GOOGLE_API_KEY= (leave empty to force local)
# .env — Production (Gemini API)
GOOGLE_API_KEY=your-key
OLLAMA_MODEL=gemma4:26b # fallback if Gemini fails
```
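
The primary-then-fallback order described in this section could be wired roughly as follows. This is a sketch under assumptions: `provider`, `chooseProviders`, `askWithFallback`, and `errUnavailable` are hypothetical names, and the `call` function stands in for the actual Gemini and Ollama clients. Only the environment variable names and the default model come from this document.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// provider names a backend and the model it should use.
type provider struct {
	name  string
	model string
}

// errUnavailable is returned when no provider could answer.
var errUnavailable = errors.New("provider unavailable")

// chooseProviders returns the providers to try, in order. An empty Google
// key skips Gemini, which forces the local model.
func chooseProviders(googleKey, modelName, ollamaModel string) []provider {
	var order []provider
	if googleKey != "" {
		if modelName == "" {
			modelName = "gemini-2.5-flash" // documented default
		}
		order = append(order, provider{name: "gemini", model: modelName})
	}
	if ollamaModel != "" {
		order = append(order, provider{name: "ollama", model: ollamaModel})
	}
	return order
}

// callFunc is a hypothetical stand-in for sending a message to a provider.
type callFunc func(p provider, message string) (string, error)

// askWithFallback returns the first successful answer in provider order.
func askWithFallback(order []provider, call callFunc, message string) (string, error) {
	err := errUnavailable
	for _, p := range order {
		answer, callErr := call(p, message)
		if callErr == nil {
			return answer, nil
		}
		err = fmt.Errorf("%s: %w", p.name, callErr)
	}
	return "", err
}

func main() {
	order := chooseProviders(os.Getenv("GOOGLE_API_KEY"), os.Getenv("MODEL_NAME"), os.Getenv("OLLAMA_MODEL"))
	for i, p := range order {
		fmt.Println(i+1, p.name, p.model)
	}
}
```
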
## 17. Dependencies
| Package | Purpose | Size Impact |
|---------|---------|-------------|
| `google.golang.org/adk` | Agent framework: runner, session, tools, agents | ~2 MB binary increase |
| `google.golang.org/genai` | Gemini API client (included with ADK) | Bundled |
No frontend dependencies are added. The chat widget uses HTMX and Hyperscript which are already loaded by the site.
## 18. ADK Go Concepts Used
| ADK Concept | Go Type / Function | Usage in This Project |
|-------------|-------------------|----------------------|
| LLM Agent | `llmagent.New(llmagent.Config{})` | Creates the `cv_assistant` agent with instruction, model, and tools |
| Function Tool | `functiontool.New(functiontool.Config{}, func)` | Wraps the `query_cv` Go function as an agent-callable tool with JSON schema |
| Runner | `runner.New(runner.Config{})` | Executes the agent within the HTTP handler with app name and session service |
| Session Service | `session.InMemoryService()` | Maintains per-visitor conversation context in memory |
| Content | `genai.NewContentFromText(msg, genai.RoleUser)` | Converts the user's text message to ADK content format for the runner |
| Event Stream | `runner.Run()` range iteration | Iterates over agent events; `event.IsFinalResponse()` extracts the final answer |
| Run Config | `agent.RunConfig{}` | Default (non-streaming) run configuration passed to the runner |
| Auto Session | `runner.Config{AutoCreateSession: true}` | Runner automatically creates sessions when they don't exist |
| Tool Context | `tool.Context` | Passed to the tool function by ADK; provides access to session and agent state |
| JSON Schema | `jsonschema:"..."` struct tags | Describes tool parameters to the LLM for function calling |
## 19. Relation to Other Documentation
- **[01-ARCHITECTURE.md](01-ARCHITECTURE.md)** — Overall system design
- **[03-API.md](03-API.md)** — HTTP API reference (includes `POST /api/chat`)
- **[14-BACKEND-HANDLERS.md](14-BACKEND-HANDLERS.md)** — Handler patterns
- **[23-DATA-CACHE.md](23-DATA-CACHE.md)** — How CV data is cached and accessed
- **[25-GO-TEMPLATE-SYSTEM.md](25-GO-TEMPLATE-SYSTEM.md)** — Template rendering and conditionals
- **[26-GO-ROUTES-API.md](26-GO-ROUTES-API.md)** — Route registration and middleware chain