feat: Implement comprehensive AI-era SEO optimizations

- Add llms.txt file for AI crawlers (llmstxt.org standard)
- Enhance robots.txt with 15+ AI bot rules (GPTBot, ClaudeBot, etc.)
- Expand JSON-LD structured data from 1 to 12+ schema blocks:
  - Person (enhanced with occupations, languages, employers)
  - WebSite, BreadcrumbList, ProfilePage
  - EducationalOccupationalCredential (dynamic per education)
  - Course (dynamic per certification)
- Create doc/15-SEO.md with comprehensive SEO documentation
- Update MODERN-WEB-TECHNIQUES.md with SEO section (techniques 11-13)

Based on WPBeginner 2025 SEO recommendations for AI Overviews, structured data, and E-E-A-T signals.
2025-11-30 13:23:22 +00:00
parent 93ca00f26c
commit ae430e6ea7
5 changed files with 797 additions and 15 deletions
@@ -0,0 +1,314 @@
+# SEO Implementation Guide
+
+**Project:** CV Interactive Website
+**Last Updated:** 2025-11-30
+**Status:** Production Ready
+
+---
+
+## Overview
+
+This document describes the SEO (Search Engine Optimization) implementation for the CV website. It covers traditional search-engine optimization as well as AI-era measures aimed at LLM crawlers and AI Overviews.
+
+---
+
+## SEO Architecture
+
+### 1. Traditional SEO Elements
+
+#### Meta Tags (`templates/index.html`)
+
+```html
+<!-- Primary Meta Tags -->
+<title>{{.CV.Personal.Name}} - {{.CV.SEO.PageTitle}}</title>
+<meta name="title" content="...">
+<meta name="description" content="...">
+<meta name="keywords" content="...">
+<meta name="author" content="...">
+<meta name="robots" content="index, follow">
+<link rel="canonical" href="{{.CanonicalURL}}">
+```
+
+#### International SEO (Hreflang)
+
+```html
+<link rel="alternate" hreflang="en" href="{{.AlternateEN}}">
+<link rel="alternate" hreflang="es" href="{{.AlternateES}}">
+<link rel="alternate" hreflang="x-default" href="https://juan.andres.morenorub.io/?lang=en">
+```
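
A minimal Go sketch of how the `AlternateEN` and `AlternateES` values might be derived, assuming the language is selected with a `?lang=` query parameter as in the `x-default` link above. The `Alternate` type, the `buildAlternates` function, and the `example.com` base URL are hypothetical and not taken from the project.

```go
package main

import (
	"fmt"
	"net/url"
)

// Alternate is one hreflang entry (hypothetical type for illustration).
type Alternate struct {
	Hreflang string
	Href     string
}

// buildAlternates returns one entry per language plus an x-default entry
// that points at the default language.
func buildAlternates(base string, langs []string, def string) ([]Alternate, error) {
	u, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	// withLang copies the base URL and sets the lang query parameter.
	withLang := func(lang string) string {
		c := *u
		q := c.Query()
		q.Set("lang", lang)
		c.RawQuery = q.Encode()
		return c.String()
	}
	out := make([]Alternate, 0, len(langs)+1)
	for _, l := range langs {
		out = append(out, Alternate{Hreflang: l, Href: withLang(l)})
	}
	out = append(out, Alternate{Hreflang: "x-default", Href: withLang(def)})
	return out, nil
}

func main() {
	alts, err := buildAlternates("https://example.com/", []string{"en", "es"}, "en")
	if err != nil {
		panic(err)
	}
	for _, a := range alts {
		fmt.Printf("<link rel=\"alternate\" hreflang=%q href=%q>\n", a.Hreflang, a.Href)
	}
}
```

Deriving every link from one base URL keeps the `x-default` entry consistent with the per-language entries instead of hardcoding it.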
+
+#### Social Media Integration
+
+| Platform | Meta Type | Implementation |
+|----------|-----------|----------------|
+| Facebook | Open Graph | `og:type`, `og:title`, `og:description`, `og:image` |
+| Twitter/X | Twitter Cards | `twitter:card`, `twitter:title`, `twitter:description` |
+| LinkedIn | Open Graph | Uses same `og:*` tags |
+
+---
+
+### 2. Structured Data (JSON-LD)
+
+The site implements multiple Schema.org types for comprehensive semantic understanding:
+
+#### Person Schema (Primary)
+```json
+{
+  "@type": "Person",
+  "@id": "{{.CV.Personal.Website}}/#person",
+  "name": "...",
+  "jobTitle": "...",
+  "description": "...",
+  "knowsAbout": [...],
+  "knowsLanguage": [...],
+  "worksFor": [...],
+  "hasOccupation": [...]
+}
+```
+
+**Fields included:**
+- Basic info: name, givenName, familyName, jobTitle
+- Contact: email, telephone, url
+- Demographics: birthDate, birthPlace, nationality
+- Location: address with locality and country
+- Social: sameAs (LinkedIn, GitHub, Domestika)
+- Education: alumniOf
+- Skills: knowsAbout (array of expertise areas)
+- Languages: knowsLanguage (with Language type)
+- Employment: worksFor (multiple organizations)
+- Occupations: hasOccupation (dynamically generated from experience)
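
As an alternative to writing the JSON by hand in the template, the Person block could be built from typed data and serialized with `encoding/json`, which guarantees well-formed output. This is only a sketch under assumptions: the `Person` and `Language` types, `newPerson`, `toJSONLD`, and the sample values are hypothetical, and only a subset of the fields listed above is modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Language mirrors a Schema.org Language node.
type Language struct {
	Type string `json:"@type"`
	Name string `json:"name"`
}

// Person holds a subset of the Person fields listed above.
// Empty optional fields are omitted from the output.
type Person struct {
	Context       string     `json:"@context"`
	Type          string     `json:"@type"`
	ID            string     `json:"@id"`
	Name          string     `json:"name"`
	JobTitle      string     `json:"jobTitle,omitempty"`
	KnowsAbout    []string   `json:"knowsAbout,omitempty"`
	KnowsLanguage []Language `json:"knowsLanguage,omitempty"`
	SameAs        []string   `json:"sameAs,omitempty"`
}

// newPerson fills the fixed fields and derives @id from the website URL.
func newPerson(website, name string) Person {
	return Person{
		Context: "https://schema.org",
		Type:    "Person",
		ID:      website + "/#person",
		Name:    name,
	}
}

// toJSONLD serializes any schema value to compact JSON.
func toJSONLD(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	p := newPerson("https://example.com", "Jane Doe")
	p.JobTitle = "Senior Technical Consultant"
	p.KnowsLanguage = []Language{
		{Type: "Language", Name: "English"},
		{Type: "Language", Name: "Spanish"},
	}
	out, err := toJSONLD(p)
	if err != nil {
		panic(err)
	}
	fmt.Printf("<script type=\"application/ld+json\">%s</script>\n", out)
}
```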
+
+#### WebSite Schema
+```json
+{
+  "@type": "WebSite",
+  "name": "... - Professional CV",
+  "url": "...",
+  "inLanguage": ["en", "es"],
+  "potentialAction": { "@type": "SearchAction", ... }
+}
+```
+
+#### BreadcrumbList Schema
+```json
+{
+  "@type": "BreadcrumbList",
+  "itemListElement": [
+    { "position": 1, "name": "Home", "item": "..." },
+    { "position": 2, "name": "CV (English/Español)", "item": ".../?lang=..." }
+  ]
+}
+```
+
+#### ProfilePage Schema
+```json
+{
+  "@type": "ProfilePage",
+  "mainEntity": { "@id": ".../#person" },
+  "dateCreated": "...",
+  "dateModified": "...",
+  "inLanguage": "..."
+}
+```
+
+#### EducationalOccupationalCredential Schema
+Generated dynamically for each education entry:
+```json
+{
+  "@type": "EducationalOccupationalCredential",
+  "name": "{{.Degree}}",
+  "description": "{{.Field}}",
+  "educationalLevel": "Bachelor's Degree",
+  "credentialCategory": "degree",
+  "recognizedBy": { "@type": "CollegeOrUniversity", ... }
+}
+```
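
The "one block per education entry" idea can be sketched in Go as a simple mapping from CV entries to schema values. The `Education`, `Organization`, and `Credential` types, the `credentialsFor` function, and the sample entry are hypothetical; the fixed `credentialCategory` of `degree` follows the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Education is a hypothetical shape for one education entry in the CV data.
type Education struct {
	Degree      string
	Field       string
	Institution string
}

// Organization is the recognizing institution.
type Organization struct {
	Type string `json:"@type"`
	Name string `json:"name"`
}

// Credential mirrors the EducationalOccupationalCredential example above.
type Credential struct {
	Context            string       `json:"@context"`
	Type               string       `json:"@type"`
	Name               string       `json:"name"`
	Description        string       `json:"description"`
	CredentialCategory string       `json:"credentialCategory"`
	RecognizedBy       Organization `json:"recognizedBy"`
}

// credentialsFor produces exactly one credential per education entry.
func credentialsFor(entries []Education) []Credential {
	out := make([]Credential, 0, len(entries))
	for _, e := range entries {
		out = append(out, Credential{
			Context:            "https://schema.org",
			Type:               "EducationalOccupationalCredential",
			Name:               e.Degree,
			Description:        e.Field,
			CredentialCategory: "degree",
			RecognizedBy:       Organization{Type: "CollegeOrUniversity", Name: e.Institution},
		})
	}
	return out
}

func main() {
	edu := []Education{
		{Degree: "BSc Computer Science", Field: "Software Engineering", Institution: "Example University"},
	}
	for _, c := range credentialsFor(edu) {
		b, err := json.MarshalIndent(c, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```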
+
+#### Course Schema
+Generated dynamically for each course/certification:
+```json
+{
+  "@type": "Course",
+  "name": "{{.Title}}",
+  "description": "...",
+  "provider": { "@type": "Organization", ... },
+  "hasCourseInstance": { "@type": "CourseInstance", ... },
+  "timeRequired": "{{.Duration}}"
+}
+```
+
+---
+
+### 3. AI-Era SEO Optimizations
+
+#### llms.txt File (`static/llms.txt`)
+
+A dedicated file for AI crawlers, following the [llmstxt.org](https://llmstxt.org/) proposal:
+
+```
+# llms.txt - AI Crawler Information
+name: Juan Andrés Moreno Rubio - Professional CV
+description: Interactive curriculum vitae...
+
+## Professional Summary
+- Senior Technical Consultant...
+
+## Key Expertise
+- SAP Customer Data Cloud...
+
+## Contact
+- Website: ...
+- LinkedIn: ...
+```
+
+**Purpose:** Provides AI systems (ChatGPT, Claude, Perplexity, etc.) with structured, human-readable information about the site content.
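
If the file were generated from the CV data rather than maintained by hand, a small `text/template` could reproduce the layout shown above. This is a sketch only: the `LLMSData` type, the `renderLLMS` function, and the sample values are hypothetical, and the template copies this document's layout rather than any particular specification.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// LLMSData is a hypothetical view of the CV fields used in llms.txt.
type LLMSData struct {
	Name        string
	Description string
	Summary     []string
	Expertise   []string
}

// llmsTmpl reproduces the layout of the example file above.
const llmsTmpl = `# llms.txt - AI Crawler Information
name: {{.Name}}
description: {{.Description}}

## Professional Summary
{{range .Summary}}- {{.}}
{{end}}
## Key Expertise
{{range .Expertise}}- {{.}}
{{end}}`

// renderLLMS executes the template and returns the file contents.
func renderLLMS(d LLMSData) (string, error) {
	t, err := template.New("llms").Parse(llmsTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderLLMS(LLMSData{
		Name:        "Jane Doe - Professional CV",
		Description: "Interactive curriculum vitae",
		Summary:     []string{"Senior Technical Consultant"},
		Expertise:   []string{"Identity management", "Go"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Generating the file from the same data as the page would keep it from drifting out of date between the annual reviews.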
+
+#### robots.txt AI Bot Rules (`static/robots.txt`)
+
+Explicit permissions for AI crawlers:
+
+| Bot | Service | Status |
+|-----|---------|--------|
+| GPTBot | OpenAI/ChatGPT | Allowed |
+| ChatGPT-User | OpenAI | Allowed |
+| ClaudeBot | Anthropic | Allowed |
+| Claude-Web | Anthropic | Allowed |
+| anthropic-ai | Anthropic | Allowed |
+| Google-Extended | Google AI/Gemini | Allowed |
+| PerplexityBot | Perplexity AI | Allowed |
+| cohere-ai | Cohere | Allowed |
+| CCBot | Common Crawl | Allowed |
+| Amazonbot | Amazon/Alexa | Allowed |
+| Applebot | Apple/Siri | Allowed |
+| Copilot | Microsoft | Allowed |
+| YouBot | You.com | Allowed |
+| BraveBot | Brave Search | Allowed |
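
The table above maps naturally onto a list of user-agent tokens, so the file could be generated instead of edited by hand when a new bot appears. A minimal sketch, assuming every listed bot gets `Allow: /`; the `buildRobots` function and the sitemap URL are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// buildRobots emits one "User-agent / Allow" group per bot, followed by
// a Sitemap line.
func buildRobots(bots []string, sitemap string) string {
	var sb strings.Builder
	for _, b := range bots {
		fmt.Fprintf(&sb, "User-agent: %s\nAllow: /\n\n", b)
	}
	sb.WriteString("Sitemap: " + sitemap + "\n")
	return sb.String()
}

func main() {
	bots := []string{"GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"}
	fmt.Print(buildRobots(bots, "https://example.com/sitemap.xml"))
}
```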
+
+---
+
+## E-E-A-T Signals
+
+The implementation supports Google's E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) framework:
+
+### Experience
+- Detailed work history with responsibilities
+- Real project descriptions
+- Duration and dates for credibility
+
+### Expertise
+- Skills categorized by domain
+- Technologies listed per job
+- Certifications and courses
+
+### Authority
+- Links to LinkedIn, GitHub, portfolio
+- Company associations (SAP, Olympic Broadcasting)
+- Client count and project metrics in summary
+
+### Trust
+- Canonical URLs prevent duplicate content
+- HTTPS enforced
+- Clear contact information
+- Privacy-respecting analytics (Matomo)
+
+---
+
+## Files Overview
+
+| File | Purpose |
+|------|---------|
+| `templates/index.html` | Meta tags, JSON-LD schemas |
+| `static/robots.txt` | Search engine and AI bot directives |
+| `static/llms.txt` | AI crawler information file |
+| `static/sitemap.xml` | XML sitemap for search engines |
+| `data/cv-en.json` | SEO fields (pageTitle, metaTitle, etc.) |
+| `data/cv-es.json` | Spanish SEO fields |
+
+---
+
+## SEO Data Model
+
+The SEO-specific fields in `data/cv-{lang}.json`:
+
+```json
+{
+  "seo": {
+    "pageTitle": "Curriculum Vitae",
+    "metaTitle": "Professional CV",
+    "metaDescription": "18 years of experience in...",
+    "ogDescription": "Senior Technical Consultant...",
+    "keywords": "CV, Resume, FullStack Developer, SAP CDC..."
+  }
+}
+```
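
On the Go side, the `seo` object could be decoded into a struct whose JSON tags match the keys above. This is a sketch under assumptions: the `SEO` and `cvFile` types and the `parseSEO` and `pageTitle` helpers are hypothetical names, and `pageTitle` simply mirrors the `<title>` pattern (`name - pageTitle`) used in the template.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SEO matches the keys of the "seo" object in data/cv-{lang}.json.
type SEO struct {
	PageTitle       string `json:"pageTitle"`
	MetaTitle       string `json:"metaTitle"`
	MetaDescription string `json:"metaDescription"`
	OGDescription   string `json:"ogDescription"`
	Keywords        string `json:"keywords"`
}

// cvFile models only the part of the CV file needed here.
type cvFile struct {
	SEO SEO `json:"seo"`
}

// parseSEO extracts the SEO block from raw CV JSON.
func parseSEO(data []byte) (SEO, error) {
	var f cvFile
	if err := json.Unmarshal(data, &f); err != nil {
		return SEO{}, fmt.Errorf("parse seo block: %w", err)
	}
	return f.SEO, nil
}

// pageTitle mirrors the "<name> - <pageTitle>" pattern of the <title> tag.
func pageTitle(name string, s SEO) string {
	return name + " - " + s.PageTitle
}

func main() {
	raw := []byte(`{"seo": {"pageTitle": "Curriculum Vitae", "metaTitle": "Professional CV"}}`)
	seo, err := parseSEO(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(pageTitle("Jane Doe", seo))
}
```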
+
+---
+
+## Validation & Testing
+
+### Schema Validation
+Test structured data at:
+- [Google Rich Results Test](https://search.google.com/test/rich-results)
+- [Schema.org Validator](https://validator.schema.org/)
+
+### Expected Schema Count
+The site generates **12+ JSON-LD blocks**:
+- 1 Person schema
+- 1 WebSite schema
+- 1 BreadcrumbList schema
+- 1 ProfilePage schema
+- N EducationalOccupationalCredential schemas (1 per education)
+- N Course schemas (1 per course)
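
The breakdown above amounts to 4 fixed blocks plus one per education entry and one per course. A rough Go sketch of a sanity check that compares this expectation with the rendered HTML; `expectedBlocks` and `countJSONLD` are hypothetical helpers, and counting script tags by substring is a deliberate simplification rather than real HTML parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// fixedBlocks covers Person, WebSite, BreadcrumbList, and ProfilePage.
const fixedBlocks = 4

// expectedBlocks applies the breakdown above: 4 fixed blocks plus one
// block per education entry and one per course.
func expectedBlocks(educations, courses int) int {
	return fixedBlocks + educations + courses
}

// countJSONLD counts JSON-LD script tags by looking for the type attribute.
func countJSONLD(html string) int {
	return strings.Count(html, `type="application/ld+json"`)
}

func main() {
	page := `<script type="application/ld+json">{}</script>
<script type="application/ld+json">{}</script>`
	fmt.Println("found:", countJSONLD(page))
	fmt.Println("expected for 3 educations, 5 courses:", expectedBlocks(3, 5))
}
```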
+
+### robots.txt Validation
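+Check how Google fetched and parsed the file with the robots.txt report in Google Search Console. The legacy [Google Robots.txt Tester](https://www.google.com/webmasters/tools/robots-testing-tool) has been retired.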
+
+---
+
+## Best Practices Implemented
+
+### Content Structure
+- [ ] Clear H1-H6 heading hierarchy
+- [x] Semantic HTML5 elements (article, section, nav)
+- [x] Alt text for images
+- [x] Descriptive link text
+
+### Technical SEO
+- [x] Mobile-responsive design
+- [x] Fast page load (bundled CSS, preload fonts)
+- [x] Canonical URLs
+- [x] Hreflang for multilingual
+- [x] Sitemap.xml
+- [x] robots.txt with AI bot rules
+
+### Modern SEO (AI-Era)
+- [x] llms.txt file
+- [x] Comprehensive JSON-LD schemas
+- [x] AI bot permissions in robots.txt
+- [x] Clear, parseable content structure
+
+---
+
+## Maintenance
+
+### When to Update
+
+1. **Content changes**: Update `data/cv-{lang}.json` SEO fields
+2. **New sections**: Add corresponding Schema.org types
+3. **New AI bots**: Add to `robots.txt`
+4. **Annual review**: Update `llms.txt` with current info
+
+### Monitoring
+
+- Google Search Console for traditional SEO
+- Matomo Analytics for traffic patterns
+- Manual testing in AI chat interfaces (ChatGPT, Claude, Perplexity)
+
+---
+
+## References
+
+- [Schema.org](https://schema.org/)
+- [Google Search Central](https://developers.google.com/search)
+- [llmstxt.org Standard](https://llmstxt.org/)
+- [WPBeginner SEO Guide 2025](https://www.wpbeginner.com/opinion/does-seo-still-work/)
@@ -3409,4 +3409,147 @@ make css-clean   # Remove generated bundles

 ---

-*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, and superior user experience over JavaScript-heavy solutions.*
+## 🔍 SEO & AI-Era Optimization (2025)
+
+### The Challenge
+
+Traditional SEO focused on keywords and backlinks. Modern SEO must optimize for:
+1. **AI Overviews** - Content appearing in generative AI summaries
+2. **LLM Crawlers** - ChatGPT, Claude, Perplexity bots
+3. **Structured Data** - Schema.org for semantic understanding
+4. **E-E-A-T** - Experience, Expertise, Authoritativeness, Trustworthiness signals
+
+### Implementation
+
+#### 11. Comprehensive Schema.org Structured Data
+
+**Problem:** A single JSON-LD schema described only the person, not the content structure.
+
+**Solution:** Multiple interconnected Schema.org types:
+
+```html
+<!-- Person Schema (primary); hasOccupation is generated dynamically from experience -->
+<script type="application/ld+json">
+{
+  "@type": "Person",
+  "@id": "{{.Website}}/#person",
+  "name": "...",
+  "hasOccupation": [...],
+  "knowsLanguage": [...],
+  "worksFor": [...]
+}
+</script>
+
+<!-- WebSite Schema; author links back to the Person via @id -->
+<script type="application/ld+json">
+{
+  "@type": "WebSite",
+  "author": { "@id": ".../#person" }
+}
+</script>
+
+<!-- BreadcrumbList, ProfilePage, Course, EducationalOccupationalCredential... -->
+```
+
+**Schemas implemented:**
+| Schema Type | Purpose | Dynamic |
+|-------------|---------|---------|
+| Person | Primary profile | Yes (from CV data) |
+| WebSite | Site metadata | No |
+| BreadcrumbList | Navigation structure | Yes (language-aware) |
+| ProfilePage | CV page metadata | Yes |
+| EducationalOccupationalCredential | Education entries | Yes (loop) |
+| Course | Certifications/training | Yes (loop) |
+| Occupation | Work experience | Yes (embedded in Person) |
+
+**Result:** 12+ JSON-LD blocks providing comprehensive semantic data for search engines and AI.
+
+#### 12. llms.txt - AI Crawler Information File
+
+**Problem:** AI systems (ChatGPT, Claude, Perplexity) need structured access to site content.
+
+**Solution:** Add an `llms.txt` file following the [llmstxt.org](https://llmstxt.org/) proposal:
+
+```
+# static/llms.txt
+name: Juan Andrés Moreno Rubio - Professional CV
+description: Interactive curriculum vitae...
+
+## Professional Summary
+- Senior Technical Consultant...
+
+## Key Expertise
+- SAP Customer Data Cloud...
+
+## Contact
+- Website: ...
+```
+
+**Purpose:** Provides AI systems with human-readable, structured information about the site, in a form suited to RAG (Retrieval-Augmented Generation) systems.
+
+#### 13. robots.txt AI Bot Rules
+
+**Problem:** AI bots weren't explicitly permitted in `robots.txt`, so the site's content was potentially missing from AI training data.
+
+**Solution:** Comprehensive AI crawler permissions:
+
+```txt
+# static/robots.txt
+
+# Traditional Search Engines
+User-agent: Googlebot
+Allow: /
+
+# AI Crawlers - Explicitly Allowed
+User-agent: GPTBot          # OpenAI
+Allow: /
+
+User-agent: ClaudeBot       # Anthropic
+Allow: /
+
+User-agent: PerplexityBot   # Perplexity AI
+Allow: /
+
+User-agent: Google-Extended # Google AI/Gemini
+Allow: /
+
+# ... 15+ AI bot rules
+```
+
+**Bots covered:**
+- OpenAI (GPTBot, ChatGPT-User)
+- Anthropic (ClaudeBot, Claude-Web, anthropic-ai)
+- Google (Google-Extended)
+- Meta (FacebookBot, Meta-ExternalAgent)
+- Perplexity, Cohere, Amazon, Apple, Microsoft, You.com, Brave
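
To confirm that a given bot is covered, a small check can scan the file for a group that names the bot and allows the whole site. This is a simplified sketch, not a full implementation of the Robots Exclusion Protocol: `explicitlyAllowed` is a hypothetical helper that ignores wildcard groups and path-specific matching, and `BadBot` is an invented name used only for the negative case.

```go
package main

import (
	"fmt"
	"strings"
)

// group is one robots.txt record: its user agents and the rules seen.
type group struct {
	agents   []string
	allowAll bool // saw "Allow: /"
	disallow bool // saw a Disallow rule with a non-empty path
}

// explicitlyAllowed reports whether robots contains a group that names
// agent (case-insensitively), has "Allow: /", and has no non-empty Disallow.
func explicitlyAllowed(robots, agent string) bool {
	var groups []*group
	var cur *group
	lastWasAgent := false
	for _, line := range strings.Split(robots, "\n") {
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // strip trailing comments
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		key = strings.ToLower(strings.TrimSpace(key))
		val = strings.TrimSpace(val)
		switch key {
		case "user-agent":
			// Consecutive User-agent lines share one group.
			if cur == nil || !lastWasAgent {
				cur = &group{}
				groups = append(groups, cur)
			}
			cur.agents = append(cur.agents, strings.ToLower(val))
			lastWasAgent = true
		case "allow":
			if cur != nil && val == "/" {
				cur.allowAll = true
			}
			lastWasAgent = false
		case "disallow":
			if cur != nil && val != "" {
				cur.disallow = true
			}
			lastWasAgent = false
		}
	}
	want := strings.ToLower(agent)
	for _, g := range groups {
		for _, a := range g.agents {
			if a == want && g.allowAll && !g.disallow {
				return true
			}
		}
	}
	return false
}

func main() {
	robots := "User-agent: GPTBot          # OpenAI\nAllow: /\n\nUser-agent: BadBot\nDisallow: /\n"
	for _, bot := range []string{"GPTBot", "ClaudeBot", "BadBot"} {
		fmt.Printf("%s explicitly allowed: %v\n", bot, explicitlyAllowed(robots, bot))
	}
}
```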
+
+### E-E-A-T Signal Implementation
+
+| Signal | Implementation |
+|--------|----------------|
+| **Experience** | Detailed work history with dates, responsibilities, technologies |
+| **Expertise** | Skills categorization, certifications, course completions |
+| **Authority** | Social links (LinkedIn, GitHub), company associations |
+| **Trust** | HTTPS, canonical URLs, clear contact info, privacy-respecting analytics |
+
+### SEO Files Overview
+
+| File | Purpose |
+|------|---------|
+| `templates/index.html` | Meta tags, JSON-LD schemas |
+| `static/robots.txt` | Search engine + AI bot directives |
+| `static/llms.txt` | AI crawler information file |
+| `static/sitemap.xml` | XML sitemap |
+| `data/cv-{lang}.json` | SEO fields per language |
+
+### Validation
+
+Test structured data:
+- [Google Rich Results Test](https://search.google.com/test/rich-results)
+- [Schema.org Validator](https://validator.schema.org/)
+
+**Full documentation:** See `doc/15-SEO.md`
+
+---
+
+*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, AI-era SEO, and superior user experience over JavaScript-heavy solutions.*