cv-site/doc/15-SEO.md
juanatsap 19951b6f42 feat: Auto-detect text browsers and serve plain text CV
- Detect curl, wget, lynx, w3m, links, elinks, browsh, carbonyl
- Check User-Agent and Accept: text/plain header
- Redirect to /text endpoint automatically
- Document in SEO guide and modern techniques
2025-11-30 14:28:51 +00:00

SEO Implementation Guide

Project: CV Interactive Website
Last Updated: 2025-11-30
Status: Production Ready


Overview

This document describes the comprehensive SEO (Search Engine Optimization) implementation for the CV website, including traditional search engine optimization and modern AI-era optimizations for LLM crawlers and AI Overviews.


SEO Architecture

1. Traditional SEO Elements

Meta Tags (templates/index.html)

<!-- Primary Meta Tags -->
<title>{{.CV.Personal.Name}} - {{.CV.SEO.PageTitle}}</title>
<meta name="title" content="...">
<meta name="description" content="...">
<meta name="keywords" content="...">
<meta name="author" content="...">
<meta name="robots" content="index, follow">
<link rel="canonical" href="{{.CanonicalURL}}">

International SEO (Hreflang)

<link rel="alternate" hreflang="en" href="{{.AlternateEN}}">
<link rel="alternate" hreflang="es" href="{{.AlternateES}}">
<link rel="alternate" hreflang="x-default" href="https://juan.andres.morenorub.io/?lang=en">
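
The values behind `.AlternateEN` and `.AlternateES` have to be built per request. A minimal sketch, assuming a hypothetical helper named alternateURL and the `?lang=` query parameter visible in the x-default link:

```go
package main

import (
	"fmt"
	"net/url"
)

// alternateURL is a hypothetical helper: it returns baseURL with the
// "lang" query parameter set, keeping any other query parameters.
func alternateURL(baseURL, lang string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	q := u.Query()
	q.Set("lang", lang)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	for _, lang := range []string{"en", "es"} {
		alt, err := alternateURL("https://juan.andres.morenorub.io/", lang)
		if err != nil {
			panic(err)
		}
		fmt.Printf("hreflang=%s href=%s\n", lang, alt)
	}
}
```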

Social Media Integration

| Platform | Meta Type | Implementation |
|----------|-----------|----------------|
| Facebook | Open Graph | og:type, og:title, og:description, og:image |
| Twitter/X | Twitter Cards | twitter:card, twitter:title, twitter:description |
| LinkedIn | Open Graph | Uses same og:* tags |

2. Structured Data (JSON-LD)

The site implements multiple Schema.org types for comprehensive semantic understanding:

Person Schema (Primary)

{
  "@type": "Person",
  "@id": "{{.CV.Personal.Website}}/#person",
  "name": "...",
  "jobTitle": "...",
  "description": "...",
  "knowsAbout": [...],
  "knowsLanguage": [...],
  "worksFor": [...],
  "hasOccupation": [...]
}

Fields included:

  • Basic info: name, givenName, familyName, jobTitle
  • Contact: email, telephone, url
  • Demographics: birthDate, birthPlace, nationality
  • Location: address with locality and country
  • Social: sameAs (LinkedIn, GitHub, Domestika)
  • Education: alumniOf
  • Skills: knowsAbout (array of expertise areas)
  • Languages: knowsLanguage (with Language type)
  • Employment: worksFor (multiple organizations)
  • Occupations: hasOccupation (dynamically generated from experience)

WebSite Schema

{
  "@type": "WebSite",
  "name": "... - Professional CV",
  "url": "...",
  "inLanguage": ["en", "es"],
  "potentialAction": { "@type": "SearchAction", ... }
}

BreadcrumbList Schema

{
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "position": 1, "name": "Home", "item": "..." },
    { "position": 2, "name": "CV (English/Español)", "item": ".../?lang=..." }
  ]
}

ProfilePage Schema

{
  "@type": "ProfilePage",
  "mainEntity": { "@id": ".../#person" },
  "dateCreated": "...",
  "dateModified": "...",
  "inLanguage": "..."
}

EducationalOccupationalCredential Schema

Generated dynamically for each education entry:

{
  "@type": "EducationalOccupationalCredential",
  "name": "{{.Degree}}",
  "description": "{{.Field}}",
  "educationalLevel": "Bachelor's Degree",
  "credentialCategory": "degree",
  "recognizedBy": { "@type": "CollegeOrUniversity", ... }
}
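
"Generated dynamically" here means one JSON-LD block per education entry. The following sketch shows that loop in Go under stated assumptions: the Education shape (in particular the Institution field), the credentialsFor and scriptTag helpers, and the sample entries are all hypothetical, and only a subset of the fields above is emitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Education is an assumed shape for one education entry.
type Education struct {
	Degree      string
	Field       string
	Institution string
}

type Organization struct {
	Type string `json:"@type"`
	Name string `json:"name"`
}

type Credential struct {
	Context            string       `json:"@context"`
	Type               string       `json:"@type"`
	Name               string       `json:"name"`
	Description        string       `json:"description,omitempty"`
	CredentialCategory string       `json:"credentialCategory"`
	RecognizedBy       Organization `json:"recognizedBy"`
}

// credentialsFor builds one credential per education entry.
func credentialsFor(entries []Education) []Credential {
	out := make([]Credential, 0, len(entries))
	for _, e := range entries {
		out = append(out, Credential{
			Context:            "https://schema.org",
			Type:               "EducationalOccupationalCredential",
			Name:               e.Degree,
			Description:        e.Field,
			CredentialCategory: "degree",
			RecognizedBy:       Organization{Type: "CollegeOrUniversity", Name: e.Institution},
		})
	}
	return out
}

// scriptTag wraps a credential in a JSON-LD script element.
// encoding/json escapes <, > and & by default, so the payload
// cannot close the script element early.
func scriptTag(c Credential) (string, error) {
	body, err := json.Marshal(c)
	if err != nil {
		return "", fmt.Errorf("marshal credential: %w", err)
	}
	return fmt.Sprintf("<script type=\"application/ld+json\">%s</script>", body), nil
}

func main() {
	entries := []Education{ // placeholder entries
		{Degree: "Example Degree", Field: "Example Field", Institution: "Example University"},
	}
	for _, c := range credentialsFor(entries) {
		tag, err := scriptTag(c)
		if err != nil {
			panic(err)
		}
		fmt.Println(tag)
	}
}
```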

Course Schema

Generated dynamically for each course/certification:

{
  "@type": "Course",
  "name": "{{.Title}}",
  "description": "...",
  "provider": { "@type": "Organization", ... },
  "hasCourseInstance": { "@type": "CourseInstance", ... },
  "timeRequired": "{{.Duration}}"
}

3. AI-Era SEO Optimizations

llms.txt File (static/llms.txt)

A dedicated file for AI crawlers, following the llms.txt proposal published at llmstxt.org:

# llms.txt - AI Crawler Information
name: Juan Andrés Moreno Rubio - Professional CV
description: Interactive curriculum vitae...

## Professional Summary
- Senior Technical Consultant...

## Key Expertise
- SAP Customer Data Cloud...

## Contact
- Website: ...
- LinkedIn: ...

Purpose: Provides AI systems (ChatGPT, Claude, Perplexity, etc.) with structured, human-readable information about the site content.

Plain Text Auto-Detection (/text endpoint)

The site automatically detects text-based browsers and CLI tools and serves them a clean plain text version of the CV, wrapped at 80 characters.

Auto-detected clients:

| Client | Type |
|--------|------|
| curl | CLI tool |
| wget | CLI tool |
| HTTPie | CLI tool |
| Lynx | Text browser |
| w3m | Text browser |
| Links/ELinks | Text browser |
| Browsh | Terminal browser |
| Carbonyl | Terminal browser |

Usage:

# Auto-detected (serves plain text):
curl https://juan.andres.morenorub.io/

# Explicit endpoint:
curl https://juan.andres.morenorub.io/text?lang=en

# With Accept header:
curl -H "Accept: text/plain" https://juan.andres.morenorub.io/

Output features:

  • 80-character line wrapping
  • ASCII art section headers
  • Clean, structured text
  • All CV content preserved

robots.txt AI Bot Rules (static/robots.txt)

Explicit permissions for AI crawlers:

| Bot | Service | Status |
|-----|---------|--------|
| GPTBot | OpenAI/ChatGPT | Allowed |
| ChatGPT-User | OpenAI | Allowed |
| ClaudeBot | Anthropic | Allowed |
| Claude-Web | Anthropic | Allowed |
| anthropic-ai | Anthropic | Allowed |
| Google-Extended | Google AI/Gemini | Allowed |
| PerplexityBot | Perplexity AI | Allowed |
| cohere-ai | Cohere | Allowed |
| CCBot | Common Crawl | Allowed |
| Amazonbot | Amazon/Alexa | Allowed |
| Applebot | Apple/Siri | Allowed |
| Copilot | Microsoft | Allowed |
| YouBot | You.com | Allowed |
| BraveBot | Brave Search | Allowed |

E-E-A-T Signals

The implementation supports Google's E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) framework:

Experience

  • Detailed work history with responsibilities
  • Real project descriptions
  • Duration and dates for credibility

Expertise

  • Skills categorized by domain
  • Technologies listed per job
  • Certifications and courses

Authority

  • Links to LinkedIn, GitHub, portfolio
  • Company associations (SAP, Olympic Broadcasting)
  • Client count and project metrics in summary

Trust

  • Canonical URLs prevent duplicate content
  • HTTPS enforced
  • Clear contact information
  • Privacy-respecting analytics (Matomo)

Files Overview

| File | Purpose |
|------|---------|
| templates/index.html | Meta tags, JSON-LD schemas |
| static/robots.txt | Search engine and AI bot directives |
| static/llms.txt | AI crawler information file |
| static/sitemap.xml | XML sitemap for search engines |
| data/cv-en.json | SEO fields (pageTitle, metaTitle, etc.) |
| data/cv-es.json | Spanish SEO fields |
| /text endpoint | Plain text CV for CLI/TUI browsers |
| templates/cv-text.txt | Plain text template |

SEO Data Model

The SEO-specific fields in data/cv-{lang}.json:

{
  "seo": {
    "pageTitle": "Curriculum Vitae",
    "metaTitle": "Professional CV",
    "metaDescription": "18 years of experience in...",
    "ogDescription": "Senior Technical Consultant...",
    "keywords": "CV, Resume, FullStack Developer, SAP CDC..."
  }
}
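
On the Go side, these fields map naturally onto a struct with JSON tags. The sketch below is an assumption about how the data might be loaded: the SEO and CV type names are chosen to match the `.CV.SEO.PageTitle` path used in the template, and parseCV is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SEO matches the "seo" object in data/cv-{lang}.json.
type SEO struct {
	PageTitle       string `json:"pageTitle"`
	MetaTitle       string `json:"metaTitle"`
	MetaDescription string `json:"metaDescription"`
	OGDescription   string `json:"ogDescription"`
	Keywords        string `json:"keywords"`
}

// CV holds only the SEO section here; the real file has more sections.
type CV struct {
	SEO SEO `json:"seo"`
}

// parseCV decodes the JSON document; unknown fields are ignored.
func parseCV(data []byte) (CV, error) {
	var cv CV
	if err := json.Unmarshal(data, &cv); err != nil {
		return CV{}, fmt.Errorf("parse CV JSON: %w", err)
	}
	return cv, nil
}

func main() {
	raw := []byte(`{"seo": {"pageTitle": "Curriculum Vitae", "metaTitle": "Professional CV"}}`)
	cv, err := parseCV(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cv.SEO.PageTitle, "/", cv.SEO.MetaTitle)
}
```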

Validation & Testing

Schema Validation

Test structured data at:

Expected Schema Count

The site generates 12+ JSON-LD blocks:

  • 1 Person schema
  • 1 WebSite schema
  • 1 BreadcrumbList schema
  • 1 ProfilePage schema
  • N EducationalOccupationalCredential schemas (1 per education)
  • N Course schemas (1 per course)

robots.txt Validation

Test at: Google Search Console robots.txt report (the standalone Google Robots.txt Tester has been retired)


Best Practices Implemented

Content Structure

  • Clear H1-H6 heading hierarchy
  • Semantic HTML5 elements (article, section, nav)
  • Alt text for images
  • Descriptive link text

Technical SEO

  • Mobile-responsive design
  • Fast page load (bundled CSS, preloaded fonts)
  • Canonical URLs
  • Hreflang tags for multilingual content
  • Sitemap.xml
  • robots.txt with AI bot rules

Modern SEO (AI-Era)

  • llms.txt file
  • Comprehensive JSON-LD schemas
  • AI bot permissions in robots.txt
  • Clear, parseable content structure

Maintenance

When to Update

  1. Content changes: Update data/cv-{lang}.json SEO fields
  2. New sections: Add corresponding Schema.org types
  3. New AI bots: Add to robots.txt
  4. Annual review: Update llms.txt with current info

Monitoring

  • Google Search Console for traditional SEO
  • Matomo Analytics for traffic patterns
  • Manual testing in AI chat interfaces (ChatGPT, Claude, Perplexity)
