
# 29. AI-Powered CV Navigation — Technical Showcase

## What This Is

This CV site includes an AI assistant that lets visitors navigate and query the CV through natural-language conversation. Instead of scanning a dense document, visitors ask questions like *"What Go projects has he built?"* or *"Has he worked with React?"* and get fast, cross-referenced answers with **clickable links that scroll directly to the relevant section**.

**Live at:** [juan.andres.morenorub.io](https://juan.andres.morenorub.io)

---

## The Problem

A CV is information-dense. Recruiters and hiring managers have specific questions but must scan every section to find answers. Technologies span multiple sections (a language appears in experience, projects, AND skills). Cross-referencing is manual and slow.

## The Solution

An AI agent embedded in the CV page that:
1. **Understands the entire CV** — searches across all sections simultaneously
2. **Answers in natural language** — bilingual (English/Spanish), concise, with bullet points
3. **Navigates the document** — every company, project, and section name in the response is a **clickable link** that closes the chat, scrolls to the target, and highlights it with a green pulse
4. **Degrades gracefully** — no API key? No chat icon. API down? Automatic fallback to local AI.

---

## Technical Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      CV Site (Go)                            │
│                                                              │
│   Visitor clicks mascot → chat opens → types question        │
│          │                                                   │
│          ▼                                                   │
│   HTMX POST /api/chat ──→ Go Handler                        │
│                              │                               │
│                    ┌─────────┴──────────┐                    │
│                    ▼                    ▼                     │
│              Try Gemini           Try Ollama                 │
│              (primary)          (auto-fallback)              │
│                    │                    │                     │
│                    └────────┬───────────┘                    │
│                             ▼                                │
│                    ADK Go Agent                              │
│                    "cv_assistant"                             │
│                         │                                    │
│                         ▼                                    │
│                    query_cv tool                             │
│                    (cross-section search)                     │
│                         │                                    │
│                         ▼                                    │
│                    Cached CV JSON                            │
│                    (same data that renders the page)          │
│                         │                                    │
│                         ▼                                    │
│                    Response with navigation links             │
│                    [Olympic Broadcasting](#exp-olympic)       │
│                         │                                    │
│                         ▼                                    │
│                    HTMX swaps into chat panel                │
│                    Links scroll + highlight on click          │
└─────────────────────────────────────────────────────────────┘
```

## Key Technical Decisions

### 1. Google ADK Go 1.0 as the Agent Framework

We chose [ADK Go](https://github.com/google/adk-go) (v1.0, released March 2026) for the agent layer. ADK Go provides:
- **`llmagent.New`** — declarative agent definition with instruction and tools
- **`functiontool.New`** — type-safe Go function → agent tool bridge with auto-generated JSON schema
- **`runner.Runner`** — manages agent execution, sessions, and tool calling loops
- **`session.InMemoryService`** — lightweight session management for conversation context

Why not a simpler approach (raw API calls)? ADK Go handles the tool-calling protocol automatically: the agent decides which tool to call, the framework executes it and feeds the result back, and the agent synthesizes the final answer. With raw API calls, we would have to implement this loop ourselves.
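
To make "this loop" concrete, here is a minimal sketch of what a hand-rolled version might look like. It uses neither ADK Go nor any real LLM API: `Step`, `Model`, `Tool`, and `runLoop` are hypothetical names invented for illustration, and the history format is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one model turn: either a final answer or a request to call a tool.
// Hypothetical type for illustration; not part of ADK Go.
type Step struct {
	Answer   string // set when the model is done
	ToolName string // set when the model wants a tool result
	ToolArg  string
}

// Model stands in for an LLM client (hypothetical).
type Model func(history []string) (Step, error)

// Tool is a plain Go function the model may ask for by name.
type Tool func(arg string) (string, error)

// runLoop alternates between the model and its tools until the model
// returns a final answer or maxTurns is exhausted.
func runLoop(m Model, tools map[string]Tool, question string, maxTurns int) (string, error) {
	history := []string{"user: " + question}
	for i := 0; i < maxTurns; i++ {
		step, err := m(history)
		if err != nil {
			return "", err
		}
		if step.ToolName == "" {
			return step.Answer, nil
		}
		tool, ok := tools[step.ToolName]
		if !ok {
			return "", fmt.Errorf("unknown tool %q", step.ToolName)
		}
		result, err := tool(step.ToolArg)
		if err != nil {
			// Feed the failure back so the model can react to it.
			result = "error: " + err.Error()
		}
		history = append(history, fmt.Sprintf("tool %s: %s", step.ToolName, result))
	}
	return "", errors.New("tool loop did not finish within maxTurns")
}
```

Even this toy version needs a turn limit, unknown-tool handling, and error feedback, which is the bookkeeping an agent framework takes over.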

### 2. Single Agent, Single Tool

The CV data is bounded and structured. We use one agent (`cv_assistant`) with one tool (`query_cv`). Multi-agent orchestration would be over-engineering here. The intelligence comes from:
- A comprehensive instruction prompt covering 8 question types
- A `search` mode that queries across experience, projects, skills, and courses simultaneously
- An instruction to always include navigation links using CV anchor IDs

### 3. Cross-Section Search

When a visitor asks about a technology (e.g., "Go"), the tool searches **all sections at once**:

```go
case "search":
    crossResult := make(map[string]any)
    if exp := filterExperience(cv.Experience, q); len(exp) > 0 {
        crossResult["experience"] = exp
    }
    if proj := filterProjects(cv.Projects, q); len(proj) > 0 {
        crossResult["projects"] = proj
    }
    if skills := filterSkills(cv.Skills, q); len(skills) > 0 {
        crossResult["skills"] = skills
    }
    if courses := filterCourses(cv.Courses, q); len(courses) > 0 {
        crossResult["courses"] = courses
    }
```

This prevents the classic problem of "I searched projects but the answer was in experience."
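
The per-section filters are not shown in this document. As a rough sketch of one plausible shape for a helper like `filterProjects`, assuming a simplified `Project` type (hypothetical, not the site's real data model):

```go
package main

import "strings"

// Project is a hypothetical stand-in for the site's real CV types.
type Project struct {
	Name         string
	Technologies []string
}

// filterProjects returns the projects that mention q. Technology tags must
// match exactly (ignoring case) so that "go" does not match "MongoDB";
// project names use a looser substring match.
func filterProjects(projects []Project, q string) []Project {
	q = strings.ToLower(strings.TrimSpace(q))
	if q == "" {
		return nil
	}
	var out []Project
	for _, p := range projects {
		if strings.Contains(strings.ToLower(p.Name), q) || hasTag(p.Technologies, q) {
			out = append(out, p)
		}
	}
	return out
}

// hasTag reports whether any tag equals q, ignoring case.
func hasTag(tags []string, q string) bool {
	for _, t := range tags {
		if strings.EqualFold(t, q) {
			return true
		}
	}
	return false
}
```

Short queries such as "Go" are the awkward case: a plain substring match on every field would also hit "Django" or "MongoDB", so the sketch matches tags exactly and keeps substring matching for names only.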

### 4. CV Navigation Links (GPS for the CV)

The agent includes markdown links in its responses:

```markdown
[Olympic Broadcasting](#exp-olympic-broadcasting) — SAP CDC solutions...
[Immich Photo Manager](#proj-immich-photo-manager) — MCP server for...
See the [Skills section](#skills) for full proficiency details.
```

The `formatResponse` function converts these to clickable HTML links. When a link is clicked, JavaScript:
1. Closes the chat panel
2. Smooth-scrolls to the target element
3. Pulses a green highlight for 2 seconds

This turns the chat into a **navigation tool** — like Google Maps for a document.
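
The body of `formatResponse` is not shown here. The link-conversion step could be sketched as follows, assuming only in-page anchors are allowed and everything else in the model output is HTML-escaped. The function name `linkifyAnchors` and the `chat-nav-link` class are assumptions made for this example.

```go
package main

import (
	"html"
	"regexp"
	"strings"
)

// navLink matches markdown links that point at an in-page anchor,
// for example [Skills section](#skills). Other link targets do not match.
var navLink = regexp.MustCompile(`\[([^\]]+)\]\(#([A-Za-z0-9_-]+)\)`)

// linkifyAnchors escapes the model output and turns in-page markdown
// links into <a> elements. Hypothetical helper, not the site's code.
func linkifyAnchors(text string) string {
	var b strings.Builder
	last := 0
	for _, m := range navLink.FindAllStringSubmatchIndex(text, -1) {
		// m[0]:m[1] is the whole match, m[2]:m[3] the label,
		// m[4]:m[5] the anchor ID (restricted to safe characters).
		b.WriteString(html.EscapeString(text[last:m[0]]))
		b.WriteString(`<a href="#` + text[m[4]:m[5]] + `" class="chat-nav-link">`)
		b.WriteString(html.EscapeString(text[m[2]:m[3]]))
		b.WriteString(`</a>`)
		last = m[1]
	}
	b.WriteString(html.EscapeString(text[last:]))
	return b.String()
}
```

Restricting the anchor pattern to letters, digits, `_`, and `-` means the model cannot smuggle a `javascript:` URL or extra attributes into the generated markup.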

### 5. Dual-Provider with Automatic Fallback

```go
// Handler has primary + fallback runners
type Handler struct {
    primary  *chatRunner  // Gemini (fast, cloud)
    fallback *chatRunner  // Ollama (local, unlimited)
}

// Try primary, fall back on any error
response, sessionID, err := h.runAgent(h.primary, message)
if err != nil && h.fallback != nil {
    log.Printf("Primary failed, falling back to %s", h.fallback.label)
    response, sessionID, err = h.runAgent(h.fallback, message)
}
```

- **Primary:** Gemini 2.5 Flash — fast (~2s), pay-as-you-go (~$0.0003/question)
- **Fallback:** Ollama with Mistral Small 3.2 on local Mac Mini via Tailscale — free, unlimited
- **Switching:** Automatic and transparent. If the Gemini call returns any error (for example HTTP 429 or 503), Ollama handles the request.
- **No manual intervention** — visitors never see the provider switch.

### 6. Model Warmup on Chat Open

Ollama loads models on demand (~10-15s cold start). To hide this latency:

```javascript
function toggleChatPanel() {
    // ... open panel ...
    if (!chatWarmedUp) {
        chatWarmedUp = true;
        fetch('/api/chat/warmup', { method: 'POST' });  // background
    }
}
```

When the visitor opens the chat, a silent warmup request fires. By the time they type a question, the model is loaded and ready.
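
The server side of `/api/chat/warmup` is not shown. One way it might work, sketched under assumptions: Ollama's documented behavior is that a `/api/generate` request with a model name and no prompt loads the model into memory, so the handler can fire that request in the background and reply at once. The helper names, the 60-second timeout, and the `202 Accepted` reply are choices made for this sketch.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// newWarmupRequest builds a prompt-less generate request, which asks
// Ollama to load the model without producing any text.
func newWarmupRequest(baseURL, model string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"model": model})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/generate", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// warmupHandler replies immediately and loads the model in the background.
// Hypothetical handler; the site's real implementation may differ.
func warmupHandler(client *http.Client, baseURL, model string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		go func() {
			// Detached from r.Context(), which is cancelled once the handler returns.
			ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
			defer cancel()
			req, err := newWarmupRequest(baseURL, model)
			if err != nil {
				log.Printf("warmup: %v", err)
				return
			}
			resp, err := client.Do(req.WithContext(ctx))
			if err != nil {
				log.Printf("warmup: %v", err)
				return
			}
			defer resp.Body.Close()
			if resp.StatusCode != http.StatusOK {
				log.Printf("warmup: unexpected status %s", resp.Status)
			}
		}()
		w.WriteHeader(http.StatusAccepted)
	}
}
```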

### 7. HTMX + Plain JavaScript

The chat widget uses HTMX for server communication and plain JavaScript for interactions:

```html
<!-- Form submits via HTMX -->
<form id="chat-form" hx-post="/api/chat"
      hx-target="#chat-messages"
      hx-swap="beforeend scroll:#chat-messages:bottom"
      hx-indicator="#chat-typing">
    <!-- ... message input and submit button ... -->
</form>

<!-- Chips trigger via JavaScript -->
<button onclick="sendChatQuestion('What Go projects has he built?')">
    Go projects?
</button>
```

Responses are HTML fragments — the server renders the chat bubbles, HTMX swaps them in. No client-side state management, no JSON parsing, no virtual DOM.
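
On the Go side, returning a fragment can be as small as executing a template into the response. The sketch below is illustrative only: the `message` form field, the CSS class names, and the `answerFn` callback are assumptions, and the answer is typed as `template.HTML` on the premise that it has already been escaped and formatted upstream.

```go
package main

import (
	"html/template"
	"net/http"
	"strings"
)

// bubbles renders one question/answer pair as two chat bubbles.
var bubbles = template.Must(template.New("bubbles").Parse(
	`<div class="chat-msg user">{{.Question}}</div>` +
		`<div class="chat-msg bot">{{.Answer}}</div>`))

type exchange struct {
	Question string        // visitor input, escaped by html/template
	Answer   template.HTML // trusted HTML produced by the response formatter
}

// renderExchange returns the HTML fragment that HTMX appends to the panel.
func renderExchange(question string, answer template.HTML) (string, error) {
	var b strings.Builder
	err := bubbles.Execute(&b, exchange{Question: question, Answer: answer})
	return b.String(), err
}

// chatHandler is a hypothetical handler; answerFn stands in for the agent call.
func chatHandler(answerFn func(string) template.HTML) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		q := strings.TrimSpace(r.FormValue("message"))
		if q == "" {
			http.Error(w, "empty message", http.StatusBadRequest)
			return
		}
		out, err := renderExchange(q, answerFn(q))
		if err != nil {
			http.Error(w, "render error", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		_, _ = w.Write([]byte(out))
	}
}
```

Because the visitor's question goes through `html/template` as a plain string, it is escaped automatically; only the formatter's output is allowed through as markup.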

### 8. Rate Limiting

```go
chatRateLimiter := middleware.NewRateLimiter(30, 1*time.Hour)
mux.Handle("/api/chat", chatRateLimiter.Middleware(...))
```

30 requests per hour per IP: generous for genuine visitors, restrictive enough to curb abuse.
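
The `middleware` package itself is not shown. A fixed-window, per-IP limiter with the same constructor shape could be sketched like this; it is an assumption about how such a limiter might work, not the site's implementation.

```go
package main

import (
	"net"
	"net/http"
	"sync"
	"time"
)

// window counts the requests one client made since start.
type window struct {
	start time.Time
	count int
}

// RateLimiter is a fixed-window counter keyed by client IP (sketch only).
type RateLimiter struct {
	mu      sync.Mutex
	limit   int
	period  time.Duration
	windows map[string]*window
}

func NewRateLimiter(limit int, period time.Duration) *RateLimiter {
	return &RateLimiter{limit: limit, period: period, windows: make(map[string]*window)}
}

// Allow reports whether key may make another request in its current window.
func (rl *RateLimiter) Allow(key string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	now := time.Now()
	w, ok := rl.windows[key]
	if !ok || now.Sub(w.start) >= rl.period {
		// First request, or the previous window expired: start a new one.
		rl.windows[key] = &window{start: now, count: 1}
		return true
	}
	if w.count >= rl.limit {
		return false
	}
	w.count++
	return true
}

// Middleware rejects over-limit clients with 429 Too Many Requests.
func (rl *RateLimiter) Middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr // no port present; use the address as-is
		}
		if !rl.Allow(ip) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Two caveats apply to this sketch: entries are never evicted, so a long-running server would want periodic cleanup, and behind a reverse proxy `r.RemoteAddr` is the proxy's address, so the client IP would have to come from a trusted forwarding header instead.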

### 9. Graceful Degradation

```go
func NewHandler(dataCache *cache.DataCache) *Handler {
    // Try Gemini → Try Ollama → Disable chat
    // If neither provider works, chat icon doesn't appear at all
}
```

```html
{{if .ChatEnabled}}
    <!-- entire chat widget -->
{{end}}
```

No API key? No Ollama? The chat icon simply doesn't render. Zero JavaScript errors, zero broken UI, zero console noise.
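
As an illustration of how the two halves fit together, the sketch below derives the provider order from configuration and renders the widget only when at least one provider exists. `Config`, `selectProviders`, `renderPage`, and the inline template are hypothetical; the real `NewHandler` may probe the providers rather than just inspect configuration.

```go
package main

import (
	"html/template"
	"strings"
)

// Config holds the settings that decide which providers exist (hypothetical).
type Config struct {
	GeminiAPIKey string
	OllamaURL    string
}

// selectProviders returns provider labels in the order they should be tried.
// An empty result means the chat feature is disabled.
func selectProviders(cfg Config) []string {
	var providers []string
	if cfg.GeminiAPIKey != "" {
		providers = append(providers, "gemini")
	}
	if cfg.OllamaURL != "" {
		providers = append(providers, "ollama")
	}
	return providers
}

// page is a stand-in for the real CV template.
var page = template.Must(template.New("page").Parse(
	`<main>CV</main>{{if .ChatEnabled}}<button id="chat-mascot">Chat</button>{{end}}`))

// renderPage omits every trace of the chat widget when chat is disabled.
func renderPage(cfg Config) (string, error) {
	data := struct{ ChatEnabled bool }{len(selectProviders(cfg)) > 0}
	var b strings.Builder
	err := page.Execute(&b, data)
	return b.String(), err
}
```

Because the decision is made once on the server, the browser never receives markup or scripts for a feature that cannot work.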

---

## Technology Stack

| Component | Technology | Purpose |
|-----------|-----------|---------|
| Agent Framework | [Google ADK Go 1.0](https://github.com/google/adk-go) | Agent definition, tool calling, session management |
| Primary LLM | Gemini 2.5 Flash | Cloud inference, fast responses |
| Fallback LLM | Mistral Small 3.2 via [Ollama](https://ollama.com) | Local inference on Apple Silicon |
| Server Communication | HTMX 2.0 | Form submission, response swapping, indicators |
| Interactions | Plain JavaScript | Panel toggle, chip clicks, navigation scroll |
| Backend | Go 1.25+ stdlib `net/http` | HTTP handler, markdown→HTML, rate limiting |
| Styling | CSS with CV design tokens | Green theme, dark mode, responsive |

## Files

```
internal/chat/
├── agent.go      # LLM agent + query_cv tool + cross-section search
├── handler.go    # Dual-provider handler, warmup, fallback, response rendering
└── ollama.go     # Ollama model.LLM adapter (OpenAI-compatible API)

templates/partials/
├── widgets/chat-widget.html        # HTMX chat panel + JS functions
└── modals/chat-help-modal.html     # Accordion help modal with clickable questions

static/css/04-interactive/_chat.css  # Full styling (tokens, dark theme, responsive, nav links)
tests/mjs/83-chat-mascot.test.mjs   # 46 Playwright test assertions
```

## What This Demonstrates

- **AI agent integration in production Go applications** — not a prototype, a deployed feature
- **ADK Go 1.0 in a real-world use case** — function calling, session management, multi-provider
- **Multi-provider LLM architecture** — cloud primary with local fallback, transparent switching
- **Hypermedia-driven AI UI** — HTMX server-rendered responses, no SPA framework needed
- **Document navigation via AI** — chat responses that link to and highlight document sections
- **Graceful engineering** — degrades cleanly, rate-limited, bilingual, theme-aware