feat: Implement comprehensive AI-era SEO optimizations

- Add llms.txt file for AI crawlers (llmstxt.org standard)
- Enhance robots.txt with 15+ AI bot rules (GPTBot, ClaudeBot, etc.)
- Expand JSON-LD structured data from 1 to 12+ schema blocks:
  - Person (enhanced with occupations, languages, employers)
  - WebSite, BreadcrumbList, ProfilePage
  - EducationalOccupationalCredential (dynamic per education)
  - Course (dynamic per certification)
- Create doc/15-SEO.md with comprehensive SEO documentation
- Update MODERN-WEB-TECHNIQUES.md with SEO section (techniques 11-13)

Based on WPBeginner 2025 SEO recommendations for AI Overviews, structured data, and E-E-A-T signals.
2025-11-30 13:23:22 +00:00
parent 93ca00f26c
commit ae430e6ea7
5 changed files with 797 additions and 15 deletions
@@ -3409,4 +3409,147 @@ make css-clean   # Remove generated bundles

 ---

-*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, and superior user experience over JavaScript-heavy solutions.*
+## 🔍 SEO & AI-Era Optimization (2025)
+
+### The Challenge
+
+Traditional SEO focused on keywords and backlinks. Modern SEO must optimize for:
+1. **AI Overviews** - Content appearing in generative AI summaries
+2. **LLM Crawlers** - ChatGPT, Claude, Perplexity bots
+3. **Structured Data** - Schema.org for semantic understanding
+4. **E-E-A-T** - Experience, Expertise, Authoritativeness, Trustworthiness signals
+
+### Implementation
+
+#### 11. Comprehensive Schema.org Structured Data
+
+**Problem:** A single JSON-LD schema described only the person, not the content structure.
+
+**Solution:** Multiple interconnected Schema.org types:
+
+```html
+<!-- Person schema (primary); hasOccupation is built dynamically from experience -->
+<script type="application/ld+json">
+{
+  "@type": "Person",
+  "@id": "{{.Website}}/#person",
+  "name": "...",
+  "hasOccupation": [...],
+  "knowsLanguage": [...],
+  "worksFor": [...]
+}
+</script>
+
+<!-- WebSite schema; author links to the Person node by @id -->
+<script type="application/ld+json">
+{
+  "@type": "WebSite",
+  "author": { "@id": ".../#person" }
+}
+</script>
+
+<!-- BreadcrumbList, ProfilePage, Course, EducationalOccupationalCredential... -->
+```
+
+**Schemas implemented:**
+| Schema Type | Purpose | Dynamic |
+|-------------|---------|---------|
+| Person | Primary profile | Yes (from CV data) |
+| WebSite | Site metadata | No |
+| BreadcrumbList | Navigation structure | Yes (language-aware) |
+| ProfilePage | CV page metadata | Yes |
+| EducationalOccupationalCredential | Education entries | Yes (loop) |
+| Course | Certifications/training | Yes (loop) |
+| Occupation | Work experience | Yes (embedded in Person) |
+
+**Result:** 12+ JSON-LD blocks (the looped types add one block per education or certification entry) that describe the person, site, and page structure to search engines and AI systems.
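The cross-links between blocks (for example, `WebSite.author` pointing at the Person `@id`) are easy to break when templates change. Below is a minimal Python sketch of one way to check them, assuming the rendered page is available as a string and each script tag holds a single JSON object. `SAMPLE_HTML`, `JsonLdExtractor`, and `dangling_references` are illustrative names that do not come from the repository, and `https://example.com` stands in for the real site URL.

```python
import json
from html.parser import HTMLParser

# Hypothetical rendered output; the real page is produced by templates/index.html.
SAMPLE_HTML = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person",
 "@id": "https://example.com/#person", "name": "Example Person"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite",
 "author": {"@id": "https://example.com/#person"}}
</script>
"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # json.loads raises ValueError if the block is not valid JSON.
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def referenced_ids(node):
    """Yield @id values of pure reference nodes ({"@id": ...} with no other keys)."""
    if isinstance(node, dict):
        if set(node) == {"@id"}:
            yield node["@id"]
        else:
            for value in node.values():
                yield from referenced_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from referenced_ids(item)


def dangling_references(blocks):
    """Return referenced @id values that no top-level block defines."""
    known = {b["@id"] for b in blocks if "@id" in b and "@type" in b}
    used = {ref for block in blocks for ref in referenced_ids(block)}
    return sorted(used - known)


if __name__ == "__main__":
    extractor = JsonLdExtractor()
    extractor.feed(SAMPLE_HTML)
    print(len(extractor.blocks), "blocks; dangling:", dangling_references(extractor.blocks))
```

An empty result means every `{"@id": ...}` reference resolves to a block on the same page. This check complements the external validators and does not replace them, since it does not verify Schema.org property names.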
+
+#### 12. llms.txt - AI Crawler Information File
+
+**Problem:** AI systems (ChatGPT, Claude, Perplexity) need structured access to site content.
+
+**Solution:** Implement the proposed [llmstxt.org](https://llmstxt.org/) standard:
+
+```
+# static/llms.txt
+name: Juan Andrés Moreno Rubio - Professional CV
+description: Interactive curriculum vitae...
+
+## Professional Summary
+- Senior Technical Consultant...
+
+## Key Expertise
+- SAP Customer Data Cloud...
+
+## Contact
+- Website: ...
+```
+
+**Purpose:** Gives AI systems a human-readable, structured summary of the site, in a form suited to RAG (Retrieval-Augmented Generation) systems.
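To illustrate why the sectioned layout is convenient for retrieval, here is a rough Python sketch of how a consumer might split the file into per-heading chunks. `split_sections` is a hypothetical helper, not part of any llms.txt tooling, and real RAG pipelines may chunk differently. The sample text is a trimmed copy of the excerpt above, with its elisions left as they are.

```python
# Trimmed copy of the excerpt above; the "..." elisions are kept verbatim.
LLMS_TXT = """\
description: Interactive curriculum vitae...

## Professional Summary
- Senior Technical Consultant...

## Key Expertise
- SAP Customer Data Cloud...

## Contact
- Website: ...
"""


def split_sections(text: str) -> dict[str, list[str]]:
    """Group non-empty lines under the most recent '## ' heading.

    Lines before the first heading (the preamble) are skipped.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


if __name__ == "__main__":
    for heading, lines in split_sections(LLMS_TXT).items():
        print(heading, "->", lines)
```

Each heading becomes a natural retrieval unit, so a question about contact details only needs the `Contact` chunk.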
+
+#### 13. robots.txt AI Bot Rules
+
+**Problem:** AI bots weren't explicitly permitted, so the site's content could be left out of their training data.
+
+**Solution:** Comprehensive AI crawler permissions:
+
+```txt
+# static/robots.txt
+
+# Traditional Search Engines
+User-agent: Googlebot
+Allow: /
+
+# AI Crawlers - Explicitly Allowed
+User-agent: GPTBot          # OpenAI
+Allow: /
+
+User-agent: ClaudeBot       # Anthropic
+Allow: /
+
+User-agent: PerplexityBot   # Perplexity AI
+Allow: /
+
+User-agent: Google-Extended # Google AI/Gemini
+Allow: /
+
+# ... 15+ AI bot rules
+```
+
+**Bots covered:**
+- OpenAI (GPTBot, ChatGPT-User)
+- Anthropic (ClaudeBot, Claude-Web, anthropic-ai)
+- Google (Google-Extended)
+- Meta (FacebookBot, Meta-ExternalAgent)
+- Perplexity, Cohere, Amazon, Apple, Microsoft, You.com, Brave
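A quick way to confirm that the rules behave as intended is to run them through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`. `ROBOTS_TXT` is only the excerpt shown above, not the full `static/robots.txt`, and `blocked_bots` and `AI_BOTS` are names made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the rules shown above; the real file contains more entries.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot          # OpenAI
Allow: /

User-agent: ClaudeBot       # Anthropic
Allow: /

User-agent: PerplexityBot   # Perplexity AI
Allow: /

User-agent: Google-Extended # Google AI/Gemini
Allow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_bots(robots_txt: str, bots: list[str], path: str = "/") -> list[str]:
    """Return the bots that the given robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, path)]


if __name__ == "__main__":
    print(blocked_bots(ROBOTS_TXT, AI_BOTS))
```

An empty list means none of the listed crawlers is blocked from the site root. The same function can run in CI against the deployed file to catch an accidental `Disallow`.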
+
+### E-E-A-T Signal Implementation
+
+| Signal | Implementation |
+|--------|----------------|
+| **Experience** | Detailed work history with dates, responsibilities, technologies |
+| **Expertise** | Skills categorization, certifications, course completions |
+| **Authoritativeness** | Social links (LinkedIn, GitHub), company associations |
+| **Trustworthiness** | HTTPS, canonical URLs, clear contact info, privacy-respecting analytics |
+
+### SEO Files Overview
+
+| File | Purpose |
+|------|---------|
+| `templates/index.html` | Meta tags, JSON-LD schemas |
+| `static/robots.txt` | Search engine + AI bot directives |
+| `static/llms.txt` | AI crawler information file |
+| `static/sitemap.xml` | XML sitemap |
+| `data/cv-{lang}.json` | SEO fields per language |
+
+### Validation
+
+Test structured data:
+- [Google Rich Results Test](https://search.google.com/test/rich-results)
+- [Schema.org Validator](https://validator.schema.org/)
+
+**Full documentation:** See `doc/15-SEO.md`
+
+---
+
+*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, AI-era SEO, and superior user experience over JavaScript-heavy solutions.*