feat: Implement comprehensive AI-era SEO optimizations

- Add llms.txt file for AI crawlers (llmstxt.org standard)
- Enhance robots.txt with 15+ AI bot rules (GPTBot, ClaudeBot, etc.)
- Expand JSON-LD structured data from 1 to 12+ schema blocks:
  - Person (enhanced with occupations, languages, employers)
  - WebSite, BreadcrumbList, ProfilePage
  - EducationalOccupationalCredential (dynamic per education)
  - Course (dynamic per certification)
- Create doc/15-SEO.md with comprehensive SEO documentation
- Update MODERN-WEB-TECHNIQUES.md with SEO section (techniques 11-13)

Based on WPBeginner 2025 SEO recommendations for AI Overviews,
structured data, and E-E-A-T signals.
juanatsap
2025-11-30 13:23:22 +00:00
parent 93ca00f26c
commit ae430e6ea7
5 changed files with 797 additions and 15 deletions
+144 -1
@@ -3409,4 +3409,147 @@ make css-clean # Remove generated bundles
---
*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, and superior user experience over JavaScript-heavy solutions.*
## 🔍 SEO & AI-Era Optimization (2025)
### The Challenge
Traditional SEO focused on keywords and backlinks. Modern SEO must optimize for:
1. **AI Overviews** - Content appearing in generative AI summaries
2. **LLM Crawlers** - ChatGPT, Claude, Perplexity bots
3. **Structured Data** - Schema.org for semantic understanding
4. **E-E-A-T** - Experience, Expertise, Authoritativeness, Trustworthiness signals
### Implementation
#### 11. Comprehensive Schema.org Structured Data
**Problem:** A single JSON-LD block described only the person, not the structure of the site's content.
**Solution:** Multiple interconnected Schema.org types:
```html
<!-- Person Schema (primary); hasOccupation is built dynamically from experience -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "{{.Website}}/#person",
  "name": "...",
  "hasOccupation": [...],
  "knowsLanguage": [...],
  "worksFor": [...]
}
</script>
<!-- WebSite Schema; author links to the Person node by @id -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "author": { "@id": ".../#person" }
}
</script>
<!-- BreadcrumbList, ProfilePage, Course, EducationalOccupationalCredential... -->
```
**Schemas implemented:**
| Schema Type | Purpose | Dynamic |
|-------------|---------|---------|
| Person | Primary profile | Yes (from CV data) |
| WebSite | Site metadata | No |
| BreadcrumbList | Navigation structure | Yes (language-aware) |
| ProfilePage | CV page metadata | Yes |
| EducationalOccupationalCredential | Education entries | Yes (loop) |
| Course | Certifications/training | Yes (loop) |
| Occupation | Work experience | Yes (embedded in Person) |
**Result:** 12+ JSON-LD blocks providing comprehensive semantic data for search engines and AI.
#### 12. llms.txt - AI Crawler Information File
**Problem:** AI systems (ChatGPT, Claude, Perplexity) need structured access to site content.
**Solution:** Implement the [llmstxt.org](https://llmstxt.org/) proposal:
```txt
# static/llms.txt
name: Juan Andrés Moreno Rubio - Professional CV
description: Interactive curriculum vitae...
## Professional Summary
- Senior Technical Consultant...
## Key Expertise
- SAP Customer Data Cloud...
## Contact
- Website: ...
```
**Purpose:** Gives AI systems human-readable, structured information about the site in a form suited to RAG (Retrieval-Augmented Generation) pipelines.
#### 13. robots.txt AI Bot Rules
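**Problem:** `robots.txt` had no rules for AI bots. Crawlers are allowed by default when no rule matches them, but without explicit rules the site's intent to be included in AI training and retrieval was not stated.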
**Solution:** Comprehensive AI crawler permissions:
```txt
# static/robots.txt
# Traditional Search Engines
User-agent: Googlebot
Allow: /
# AI Crawlers - Explicitly Allowed
User-agent: GPTBot # OpenAI
Allow: /
User-agent: ClaudeBot # Anthropic
Allow: /
User-agent: PerplexityBot # Perplexity AI
Allow: /
User-agent: Google-Extended # Google AI/Gemini
Allow: /
# ... 15+ AI bot rules
```
**Bots covered:**
- OpenAI (GPTBot, ChatGPT-User)
- Anthropic (ClaudeBot, Claude-Web, anthropic-ai)
- Google (Google-Extended)
- Meta (FacebookBot, Meta-ExternalAgent)
- Perplexity, Cohere, Amazon, Apple, Microsoft, You.com, Brave
### E-E-A-T Signal Implementation
| Signal | Implementation |
|--------|----------------|
| **Experience** | Detailed work history with dates, responsibilities, technologies |
| **Expertise** | Skills categorization, certifications, course completions |
| **Authoritativeness** | Social links (LinkedIn, GitHub), company associations |
| **Trustworthiness** | HTTPS, canonical URLs, clear contact info, privacy-respecting analytics |
### SEO Files Overview
| File | Purpose |
|------|---------|
| `templates/index.html` | Meta tags, JSON-LD schemas |
| `static/robots.txt` | Search engine + AI bot directives |
| `static/llms.txt` | AI crawler information file |
| `static/sitemap.xml` | XML sitemap |
| `data/cv-{lang}.json` | SEO fields per language |
### Validation
Test structured data:
- [Google Rich Results Test](https://search.google.com/test/rich-results)
- [Schema.org Validator](https://validator.schema.org/)
**Full documentation:** See `doc/15-SEO.md`
---
*This document serves as both a technical reference and a demonstration of modern web development practices that prioritize web standards, performance, progressive enhancement, AI-era SEO, and superior user experience over JavaScript-heavy solutions.*