feat: add origin validation and rate limiting for PDF endpoint

- Implemented origin checker middleware to prevent external sites from hotlinking the PDF generation endpoint
- Added rate limiter (3 requests per minute per IP) to protect resource-intensive PDF operations
- Configured allowed origins via ALLOWED_ORIGINS environment variable with localhost defaults for development
juanatsap
2025-11-09 14:00:10 +00:00
parent 5e132e7ec7
commit 24b2401519
4 changed files with 646 additions and 1 deletions
+8
@@ -18,9 +18,17 @@ DATA_DIR=data
READ_TIMEOUT=15
WRITE_TIMEOUT=15
# Security Configuration
# Allowed origins for API access (comma-separated domains)
# Prevents external sites from accessing your API/PDF endpoint
# Leave empty for development (localhost and 127.0.0.1 are always allowed)
# Example for production: ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com
ALLOWED_ORIGINS=
# Production Settings
# Uncomment for production:
# GO_ENV=production
# TEMPLATE_HOT_RELOAD=false
# READ_TIMEOUT=30
# WRITE_TIMEOUT=30
# ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com
+452
@@ -0,0 +1,452 @@
# API Protection & Security
**Protection against external access and DDoS attacks on resource-intensive endpoints.**
---
## Overview
The CV website implements multiple layers of protection to prevent external sites from accessing the API and to protect against DDoS attacks on resource-intensive endpoints like PDF generation.
### Protection Layers
1. **Origin Checking** - Only allows requests from your domain
2. **Rate Limiting** - Prevents abuse even from allowed origins
3. **Production Restrictions** - Stricter rules in production environments
---
## 1. Origin Checking
**File:** `internal/middleware/security.go` (`OriginChecker` middleware)
### How It Works
The origin checker examines incoming HTTP requests and validates them against a whitelist of allowed domains. It checks two headers:
1. **Origin Header** - Set by browsers for CORS requests
2. **Referer Header** - Set by browsers for navigation requests
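Both headers carry a full URL (or a scheme plus host), so the check comes down to extracting the host name and comparing it with the allow-list. The following is a minimal sketch of that idea using `net/url`; the helper names `hostOf` and `isAllowed` are illustrative and are not the names used in `security.go`, which trims the scheme and port with string operations instead.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostOf returns the lowercase host name of an Origin or Referer value,
// without scheme, port, or path. It returns "" if the value cannot be
// parsed or has no scheme (for example a bare "yourdomain.com").
func hostOf(headerValue string) string {
	u, err := url.Parse(strings.TrimSpace(headerValue))
	if err != nil {
		return ""
	}
	return strings.ToLower(u.Hostname())
}

// isAllowed reports whether the header's host equals one of the allowed hosts.
// The comparison is an exact, case-insensitive match on the whole host name.
func isAllowed(headerValue string, allowed []string) bool {
	host := hostOf(headerValue)
	if host == "" {
		return false
	}
	for _, a := range allowed {
		if strings.EqualFold(host, strings.TrimSpace(a)) {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"localhost", "127.0.0.1", "yourdomain.com"}
	fmt.Println(isAllowed("https://yourdomain.com/cv", allowed))        // true
	fmt.Println(isAllowed("http://localhost:1999/", allowed))           // true
	fmt.Println(isAllowed("https://yourdomain.com.evil.com/", allowed)) // false
}
```

Because the whole host is compared, a look-alike such as `yourdomain.com.evil.com` does not match `yourdomain.com`.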
### Configuration
**Environment Variable:** `ALLOWED_ORIGINS`
```bash
# Development (default)
ALLOWED_ORIGINS=
# Production
ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com
```
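As a rough illustration of how a comma-separated value like the one above can be turned into an allow-list, the sketch below trims whitespace and skips empty entries. `parseAllowedOrigins` is a hypothetical helper, not a function from the repository, and it deliberately leaves out the middleware's fallback domain for an empty variable.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAllowedOrigins splits a comma-separated ALLOWED_ORIGINS value,
// trims whitespace, drops empty entries, and always keeps the local hosts.
func parseAllowedOrigins(env string) []string {
	allowed := []string{"localhost", "127.0.0.1"}
	for _, entry := range strings.Split(env, ",") {
		if entry = strings.TrimSpace(entry); entry != "" {
			allowed = append(allowed, entry)
		}
	}
	return allowed
}

func main() {
	fmt.Println(parseAllowedOrigins("yourdomain.com, www.yourdomain.com"))
	// [localhost 127.0.0.1 yourdomain.com www.yourdomain.com]
	fmt.Println(parseAllowedOrigins(""))
	// [localhost 127.0.0.1]
}
```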
### Behavior
| Environment | No Header | Localhost | Your Domain | External Domain |
|-------------|-----------|-----------|-------------|-----------------|
| **Development** | ✅ Allowed | ✅ Allowed | ✅ Allowed | ❌ Blocked |
| **Production** | ❌ Blocked (PDF) | ✅ Allowed | ✅ Allowed | ❌ Blocked |
**Production PDF Endpoint:**
- Requires `Origin` or `Referer` header
- Blocks direct URL access without headers
- Prevents bookmarking and external hotlinking
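The behavior table can be read as a small decision function. The sketch below restates it under that reading; `decide` and its boolean parameters are illustrative names, and the real middleware derives these values from the request and the environment rather than taking them as arguments.

```go
package main

import (
	"fmt"
	"net/http"
)

// decide returns the HTTP status the origin check would produce.
// hasHeader is false when both Origin and Referer are absent;
// headerAllowed is the result of matching the header host against the allow-list.
func decide(production, pdfEndpoint, hasHeader, headerAllowed bool) int {
	switch {
	case hasHeader && !headerAllowed:
		// External domain: blocked in every environment.
		return http.StatusForbidden
	case !hasHeader && production && pdfEndpoint:
		// Direct access to the PDF endpoint is blocked in production only.
		return http.StatusForbidden
	default:
		return http.StatusOK
	}
}

func main() {
	fmt.Println(decide(false, true, false, false)) // development, no header: 200
	fmt.Println(decide(true, true, false, false))  // production, no header: 403
	fmt.Println(decide(true, true, true, true))    // production, own domain: 200
	fmt.Println(decide(true, true, true, false))   // production, external: 403
}
```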
### Example Responses
**Allowed Request:**
```bash
curl -H "Referer: https://yourdomain.com/" "http://localhost:1999/export/pdf?lang=en"
# Status: 200 OK
# PDF file downloaded
```
**Blocked Request (External Domain):**
```bash
curl -H "Referer: https://externaldomain.com/" "http://localhost:1999/export/pdf?lang=en"
# Status: 403 Forbidden
# Response: Forbidden: External access not allowed
```
**Blocked Request (Production, No Headers):**
```bash
# In production with GO_ENV=production
curl "http://yourdomain.com/export/pdf?lang=en"
# Status: 403 Forbidden
# Response: Forbidden: Direct access not allowed
```
---
## 2. Rate Limiting
**File:** `internal/middleware/security.go` (`RateLimiter`)
### How It Works
The rate limiter tracks requests per IP address and enforces limits on resource-intensive endpoints.
**Current Configuration:**
- **Limit:** 3 requests per minute per IP
- **Window:** 1 minute
- **Applied to:** `/export/pdf` endpoint only
### Implementation
```go
// Create rate limiter for PDF endpoint
pdfRateLimiter := middleware.NewRateLimiter(3, 1*time.Minute)
// Apply to PDF endpoint
protectedPDFHandler := middleware.OriginChecker(
	pdfRateLimiter.Middleware(
		http.HandlerFunc(cvHandler.ExportPDF),
	),
)
```
### Behavior
| Requests | Status | Response |
|----------|--------|----------|
| 1st request | ✅ 200 OK | PDF generated |
| 2nd request | ✅ 200 OK | PDF generated |
| 3rd request | ✅ 200 OK | PDF generated |
| 4th request (within 1 min) | ❌ 429 Too Many Requests | Rate limit exceeded |
| After 1 minute | ✅ 200 OK | Counter reset |
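The table describes a fixed-window counter: the window opens on a client's first request and the count resets once it expires. Here is a minimal, single-goroutine sketch of that logic with an injected clock so the behavior can be replayed deterministically. The type and function names are illustrative, and unlike the real `RateLimiter` this version has no mutex and no cleanup goroutine.

```go
package main

import (
	"fmt"
	"time"
)

// window holds the request count and expiry time for one client.
type window struct {
	count int
	reset time.Time
}

type limiter struct {
	limit   int
	size    time.Duration
	clients map[string]*window
}

func newLimiter(limit int, size time.Duration) *limiter {
	return &limiter{limit: limit, size: size, clients: map[string]*window{}}
}

// allow applies a fixed window per key, using the caller's clock value.
func (l *limiter) allow(key string, now time.Time) bool {
	w, ok := l.clients[key]
	if !ok || now.After(w.reset) {
		// First request from this key, or the previous window has expired.
		l.clients[key] = &window{count: 1, reset: now.Add(l.size)}
		return true
	}
	if w.count >= l.limit {
		return false
	}
	w.count++
	return true
}

// simulate sends n requests one second apart, then one more after the
// one-minute window has expired, and returns the decision for each.
func simulate(limit, n int) []bool {
	l := newLimiter(limit, time.Minute)
	start := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	var out []bool
	for i := 0; i < n; i++ {
		out = append(out, l.allow("203.0.113.7", start.Add(time.Duration(i)*time.Second)))
	}
	out = append(out, l.allow("203.0.113.7", start.Add(2*time.Minute)))
	return out
}

func main() {
	fmt.Println(simulate(3, 4)) // [true true true false true]
}
```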
### Headers
**Rate Limit Exceeded Response:**
```
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: text/plain; charset=utf-8
Rate limit exceeded. Please try again later.
```
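The `Retry-After` value shown above is a fixed 60 seconds, which matches the one-minute window. If the window is changed, one option is to derive the header from the time left in the client's window. The helper below is a hypothetical sketch of that calculation, not part of the current middleware.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// retryAfter converts the time remaining in a rate-limit window into a
// Retry-After header value in whole seconds, rounding up and never
// returning less than one second.
func retryAfter(remaining time.Duration) string {
	secs := int(math.Ceil(remaining.Seconds()))
	if secs < 1 {
		secs = 1
	}
	return strconv.Itoa(secs)
}

func main() {
	fmt.Println(retryAfter(60 * time.Second))        // 60
	fmt.Println(retryAfter(1500 * time.Millisecond)) // 2
	fmt.Println(retryAfter(0))                       // 1
}
```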
### IP Detection
The rate limiter detects client IP from:
1. `X-Forwarded-For` header (proxy/CDN)
2. `X-Real-IP` header (alternative proxy header)
3. `RemoteAddr` (direct connection)
**Supports reverse proxies:** Yes (Nginx, Cloudflare, etc.)
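The lookup order above can be sketched as follows. This is an illustrative variant, not the code in `security.go`: it takes only the first entry of a multi-hop `X-Forwarded-For` value and uses `net.SplitHostPort` so IPv6 addresses keep their full form. `clientIP` and `ipFor` are assumed names. Note that both proxy headers are client-controlled unless the reverse proxy overwrites them.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientIP picks the client address in the documented order:
// X-Forwarded-For, then X-Real-IP, then the connection's remote address.
func clientIP(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		// The header may hold "client, proxy1, proxy2"; take the first entry.
		first, _, _ := strings.Cut(xff, ",")
		return strings.TrimSpace(first)
	}
	if realIP := r.Header.Get("X-Real-IP"); realIP != "" {
		return strings.TrimSpace(realIP)
	}
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// ipFor builds a bare request with the given values and returns clientIP's result.
func ipFor(xff, realIP, remoteAddr string) string {
	r := &http.Request{Header: http.Header{}, RemoteAddr: remoteAddr}
	if xff != "" {
		r.Header.Set("X-Forwarded-For", xff)
	}
	if realIP != "" {
		r.Header.Set("X-Real-IP", realIP)
	}
	return clientIP(r)
}

func main() {
	fmt.Println(ipFor("203.0.113.7, 10.0.0.1", "", "10.0.0.1:5555")) // 203.0.113.7
	fmt.Println(ipFor("", "198.51.100.4", "10.0.0.1:5555"))          // 198.51.100.4
	fmt.Println(ipFor("", "", "[2001:db8::1]:443"))                  // 2001:db8::1
}
```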
---
## 3. Combined Protection
The PDF endpoint has **both** origin checking and rate limiting applied:
```
Request → OriginChecker → RateLimiter → PDF Handler
```
**Protection Flow:**
1. **Check Origin/Referer**
- If external domain → 403 Forbidden
- If production + no headers → 403 Forbidden
- Otherwise, continue
2. **Check Rate Limit**
- If > 3 requests/minute → 429 Too Many Requests
- Otherwise, continue
3. **Generate PDF**
- Process request normally
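The ordering in the flow above matters: a request rejected by the origin check never reaches the rate limiter. The sketch below demonstrates this with `net/http/httptest` and two simplified stand-in middlewares. `originCheck`, `rateLimit`, and `statuses` are hypothetical and much cruder than the real implementations (no allow-list, no time window, no per-IP tracking).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const allowedReferer = "http://localhost:1999/"

// originCheck is a stand-in: it rejects any non-empty Referer other than allowedReferer.
func originCheck(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if ref := r.Header.Get("Referer"); ref != "" && ref != allowedReferer {
			http.Error(w, "Forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// rateLimit is a stand-in: it lets the first `limit` requests through and rejects the rest.
func rateLimit(limit int, next http.Handler) http.Handler {
	seen := 0
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen++
		if seen > limit {
			http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statuses sends n requests with the given Referer through
// originCheck -> rateLimit -> handler and returns the status codes.
func statuses(referer string, n int) []int {
	pdf := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	chain := originCheck(rateLimit(3, pdf))
	var codes []int
	for i := 0; i < n; i++ {
		req := httptest.NewRequest(http.MethodGet, "/export/pdf?lang=en", nil)
		if referer != "" {
			req.Header.Set("Referer", referer)
		}
		rec := httptest.NewRecorder()
		chain.ServeHTTP(rec, req)
		codes = append(codes, rec.Code)
	}
	return codes
}

func main() {
	fmt.Println(statuses(allowedReferer, 4))      // [200 200 200 429]
	fmt.Println(statuses("https://evil.com/", 4)) // [403 403 403 403]
}
```

Because the external referer is rejected first, those requests do not consume any of the rate-limit budget.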
---
## 4. Configuration Examples
### Development Environment
```bash
# .env
PORT=1999
HOST=localhost
GO_ENV=development
ALLOWED_ORIGINS=
```
**Behavior:**
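- Allows `localhost` and `127.0.0.1`, plus the built-in fallback domain `juan.andres.morenorub.io` that the middleware uses when `ALLOWED_ORIGINS` is empty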
- Allows requests without headers
- Rate limit: 3 PDF/min per IP
### Production Environment
```bash
# .env
PORT=1999
HOST=0.0.0.0
GO_ENV=production
ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com
```
**Behavior:**
- Allows `yourdomain.com` and `www.yourdomain.com` (`localhost` and `127.0.0.1` are always allowed as well)
- Requires `Origin` or `Referer` header for PDF endpoint
- Rate limit: 3 PDF/min per IP
### Multiple Domains
```bash
# Support multiple domains (e.g., staging + production)
ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com,staging.yourdomain.com
```
---
## 5. Testing Protection
### Test Origin Checking
```bash
# ✅ Allowed (localhost in development)
curl "http://localhost:1999/export/pdf?lang=en"
# ✅ Allowed (with referer)
curl -H "Referer: http://localhost:1999/" "http://localhost:1999/export/pdf?lang=en"
# ❌ Blocked (external referer)
curl -H "Referer: https://evil.com/" "http://localhost:1999/export/pdf?lang=en"
# Expected: 403 Forbidden
```
### Test Rate Limiting
```bash
# Generate 4 PDFs quickly
for i in {1..4}; do
  printf "Request %s: " "$i"
  curl -w "Status: %{http_code}\n" -o /dev/null -s "http://localhost:1999/export/pdf?lang=en"
  sleep 1
done
# Expected output:
# Request 1: Status: 200
# Request 2: Status: 200
# Request 3: Status: 200
# Request 4: Status: 429
```
### Test Combined Protection
```bash
# Should be blocked by origin checker before rate limiter
for i in {1..5}; do
  curl -H "Referer: https://evil.com/" -w "Status: %{http_code}\n" -o /dev/null -s \
    "http://localhost:1999/export/pdf?lang=en"
done
# Expected: All requests get 403 (origin check fails immediately)
```
---
## 6. Monitoring & Logs
### Log Messages
**Origin Check Failure:**
```
# No specific log (returns 403 silently)
# Check server logs for 403 responses
```
**Rate Limit Exceeded:**
```
# No specific log (returns 429 silently)
# Monitor for frequent 429 responses
```
### Recommended Monitoring
1. **Track 403 responses** - Indicates potential attack attempts
2. **Track 429 responses** - Indicates rate limiting in effect
3. **Monitor PDF generation times** - Detect abuse patterns
4. **Alert on sustained high request rates** - DDoS detection
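Since neither middleware logs its rejections, one way to make 403 and 429 responses visible is a small wrapper that records the status code and logs it. The sketch below is an assumption about how that could look; `statusRecorder`, `isRejection`, and `logRejections` are hypothetical names that do not exist in the repository.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httptest"
)

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (s *statusRecorder) WriteHeader(code int) {
	s.status = code
	s.ResponseWriter.WriteHeader(code)
}

// isRejection reports whether a status is one of the two protection responses.
func isRejection(status int) bool {
	return status == http.StatusForbidden || status == http.StatusTooManyRequests
}

// logRejections logs every request that ends in 403 or 429.
func logRejections(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, r)
		if isRejection(rec.status) {
			log.Printf("rejected: status=%d path=%s remote=%s referer=%q",
				rec.status, r.URL.Path, r.RemoteAddr, r.Header.Get("Referer"))
		}
	})
}

func main() {
	// A handler that always rejects, standing in for a failed origin check.
	blocked := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "Forbidden: External access not allowed", http.StatusForbidden)
	})
	req := httptest.NewRequest(http.MethodGet, "/export/pdf?lang=en", nil)
	req.Header.Set("Referer", "https://evil.com/")
	logRejections(blocked).ServeHTTP(httptest.NewRecorder(), req)
}
```

Wrapping the protected handler from the outside, for example `logRejections(protectedPDFHandler)`, would cover both the origin check and the rate limiter.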
---
## 7. Customization
### Adjust Rate Limits
**File:** `main.go`
```go
// Pick ONE of the following (only one := declaration may be active).
// Current: 3 requests per minute
pdfRateLimiter := middleware.NewRateLimiter(3, 1*time.Minute)
// More restrictive: 5 per hour
// pdfRateLimiter := middleware.NewRateLimiter(5, 1*time.Hour)
// Less restrictive: 10 per minute
// pdfRateLimiter := middleware.NewRateLimiter(10, 1*time.Minute)
```
### Apply to Other Endpoints
```go
// Protect /cv endpoint
protectedCVHandler := middleware.OriginChecker(
	http.HandlerFunc(cvHandler.CVContent),
)
mux.Handle("/cv", protectedCVHandler)
```
### Disable Origin Checking (Not Recommended)
```go
// Apply only rate limiting (no origin check)
mux.Handle("/export/pdf", pdfRateLimiter.Middleware(
	http.HandlerFunc(cvHandler.ExportPDF),
))
```
---
## 8. Security Best Practices
### ✅ Recommended
1. **Set ALLOWED_ORIGINS in production** - Never run production without it
2. **Use HTTPS** - Protects headers from tampering in transit (it does not stop a client from sending a forged `Origin` or `Referer`)
3. **Monitor 403/429 responses** - Detect attack patterns
4. **Consider Cloudflare** - Additional DDoS protection layer
5. **Log suspicious requests** - For forensic analysis
### ❌ Anti-Patterns
1. **Don't disable protection in production** - Always use origin checking
2. **Don't set rate limits too high** - PDF generation is expensive
3. **Don't trust IP addresses alone** - Use combined protection
4. **Don't expose internal endpoints** - Keep admin routes private
---
## 9. Production Deployment Checklist
Before deploying to production:
- [ ] Set `GO_ENV=production` in environment
- [ ] Configure `ALLOWED_ORIGINS` with your domain(s)
- [ ] Test origin checking with external domain
- [ ] Test rate limiting with rapid requests
- [ ] Verify HTTPS is enabled (protects headers from tampering in transit)
- [ ] Set up monitoring for 403/429 responses
- [ ] Configure log retention for security analysis
- [ ] Test PDF generation under load
- [ ] Verify reverse proxy headers (X-Forwarded-For)
- [ ] Document allowed origins in runbook
---
## 10. Troubleshooting
### Problem: Legitimate users getting 403
**Cause:** ALLOWED_ORIGINS not configured correctly
**Solution:**
```bash
# Ensure all your domains are listed
ALLOWED_ORIGINS=yourdomain.com,www.yourdomain.com
# Check for typos (matching is case-insensitive, but each host must match an entry exactly; subdomains are not implied)
```
### Problem: Rate limit too restrictive
**Cause:** Legitimate users hitting limit
**Solution:**
```go
// Increase limit or window in main.go
pdfRateLimiter := middleware.NewRateLimiter(5, 1*time.Minute)
```
### Problem: Behind reverse proxy, rate limit not working
**Cause:** IP detection failing
**Solution:**
```nginx
# Ensure Nginx passes correct headers
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Real-IP $remote_addr;
```
### Problem: Origin header not being sent
**Cause:** Browsers typically omit the `Origin` header on same-origin GET requests
**Solution:** This is normal. The middleware checks Referer as fallback.
---
## 11. Attack Scenarios & Mitigation
### Scenario 1: DDoS via PDF Generation
**Attack:** External site hotlinks to `/export/pdf`, triggering many PDF generations
**Mitigation:**
1. ✅ Origin checker blocks external domains (403)
2. ✅ Rate limiter prevents >3 requests/min per IP (429)
3. ✅ Production mode requires headers (blocks direct access)
**Result:** Attack fails, server protected
### Scenario 2: Header Spoofing
**Attack:** Attacker spoofs `Referer` header to bypass origin check
**Mitigation:**
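1. ⚠️ HTTPS only protects headers in transit; a client can still send any `Referer` it likes
2. ✅ Rate limiter still applies (3 req/min limit)
3. ⚠️ Per-IP tracking caps each address, but the limiter trusts a client-supplied `X-Forwarded-For`, so the reverse proxy must overwrite that header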
**Result:** Individual attacker limited to 3 req/min
### Scenario 3: Distributed Attack
**Attack:** Botnet with many IPs, each generating PDFs
**Mitigation:**
1. ✅ Each IP limited to 3 req/min
2. ✅ Origin checker blocks if no valid referer
3. 🔴 Consider Cloudflare for large-scale DDoS
**Result:** Slowed but not fully blocked (add Cloudflare)
---
## Summary
**Protection Enabled:** ✅ Origin Checking + Rate Limiting
**Endpoints Protected:**
- `/export/pdf` - Full protection (origin + rate limit)
**Endpoints Unprotected:**
- `/` - Public home page
- `/cv` - Public CV content
- `/health` - Public health check
- `/static/*` - Public static files
**Configuration:** Environment-based via `ALLOWED_ORIGINS`
**Production Ready:** Yes (after setting ALLOWED_ORIGINS)
---
**For questions or to adjust protection levels, modify:**
- `internal/middleware/security.go` - Origin checking and rate limiting logic
- `main.go` - Apply protection to additional endpoints
- `.env` - Configure ALLOWED_ORIGINS for your domain
+174
@@ -3,6 +3,9 @@ package middleware
import (
"net/http"
"os"
"strings"
"sync"
"time"
)
// SecurityHeaders adds production-grade security headers to responses
@@ -47,3 +50,174 @@ func SecurityHeaders(next http.Handler) http.Handler {
next.ServeHTTP(w, r)
})
}
// OriginChecker restricts API access to requests from allowed origins only
// Prevents external sites from hotlinking/accessing resource-intensive endpoints
func OriginChecker(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Get allowed domains from environment (comma-separated)
		// Example: ALLOWED_ORIGINS="yourdomain.com,www.yourdomain.com"
		allowedOriginsEnv := os.Getenv("ALLOWED_ORIGINS")
		// If empty, fall back to "juan.andres.morenorub.io", the domain of the CV
		if allowedOriginsEnv == "" {
			allowedOriginsEnv = "juan.andres.morenorub.io"
		}
		// localhost and 127.0.0.1 are always allowed (needed for development)
		allowedOrigins := []string{"localhost", "127.0.0.1"}
		for _, origin := range strings.Split(allowedOriginsEnv, ",") {
			allowedOrigins = append(allowedOrigins, strings.TrimSpace(origin))
		}
		// Check Origin header (for CORS requests)
		origin := r.Header.Get("Origin")
		if origin != "" {
			if !isAllowedOrigin(origin, allowedOrigins) {
				http.Error(w, "Forbidden: External access not allowed", http.StatusForbidden)
				return
			}
		}
		// Check Referer header (sent when the request comes from another page)
		referer := r.Header.Get("Referer")
		if referer != "" {
			if !isAllowedOrigin(referer, allowedOrigins) {
				http.Error(w, "Forbidden: External access not allowed", http.StatusForbidden)
				return
			}
		}
		// No Origin and no Referer means direct access (address bar, bookmark, curl).
		// This is allowed everywhere except the PDF endpoint in production,
		// which requires at least one of the two headers.
		if origin == "" && referer == "" {
			if os.Getenv("GO_ENV") == "production" && r.URL.Path == "/export/pdf" {
				http.Error(w, "Forbidden: Direct access not allowed", http.StatusForbidden)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}
// isAllowedOrigin checks if the origin/referer matches allowed domains
func isAllowedOrigin(originURL string, allowedOrigins []string) bool {
	originURL = strings.TrimSpace(originURL)
	originURL = strings.TrimPrefix(originURL, "http://")
	originURL = strings.TrimPrefix(originURL, "https://")
	// Extract domain from URL (remove path)
	parts := strings.Split(originURL, "/")
	domain := parts[0]
	// Remove port if present
	domain = strings.Split(domain, ":")[0]
	for _, allowed := range allowedOrigins {
		if strings.EqualFold(domain, allowed) {
			return true
		}
	}
	return false
}
// rateLimitEntry tracks rate limiting per IP
type rateLimitEntry struct {
	count     int
	resetTime time.Time
}
// RateLimiter provides simple in-memory rate limiting
type RateLimiter struct {
	mu      sync.RWMutex
	clients map[string]*rateLimitEntry
	limit   int           // requests allowed
	window  time.Duration // time window
}
// NewRateLimiter creates a new rate limiter
func NewRateLimiter(limit int, window time.Duration) *RateLimiter {
	rl := &RateLimiter{
		clients: make(map[string]*rateLimitEntry),
		limit:   limit,
		window:  window,
	}
	// Cleanup expired entries every minute
	go rl.cleanup()
	return rl
}
// Middleware returns rate limiting middleware
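func (rl *RateLimiter) Middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Get client IP (handle X-Forwarded-For for proxies).
		// X-Forwarded-For may hold "client, proxy1, proxy2"; use the first entry.
		ip := strings.TrimSpace(strings.Split(r.Header.Get("X-Forwarded-For"), ",")[0])
		if ip == "" {
			ip = strings.TrimSpace(r.Header.Get("X-Real-IP"))
		}
		if ip == "" {
			// Strip the port from "host:port". Cutting at the last colon keeps
			// IPv6 addresses such as "[::1]:1234" distinct per client.
			ip = r.RemoteAddr
			if i := strings.LastIndex(ip, ":"); i != -1 {
				ip = ip[:i]
			}
		}
		if !rl.allow(ip) {
			w.Header().Set("Retry-After", "60")
			http.Error(w, "Rate limit exceeded. Please try again later.", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}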
// allow checks if the request is allowed based on rate limit
func (rl *RateLimiter) allow(ip string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	now := time.Now()
	entry, exists := rl.clients[ip]
	if !exists || now.After(entry.resetTime) {
		// New client or window expired
		rl.clients[ip] = &rateLimitEntry{
			count:     1,
			resetTime: now.Add(rl.window),
		}
		return true
	}
	if entry.count >= rl.limit {
		return false
	}
	entry.count++
	return true
}
// cleanup removes expired entries periodically
func (rl *RateLimiter) cleanup() {
	ticker := time.NewTicker(1 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		rl.mu.Lock()
		now := time.Now()
		for ip, entry := range rl.clients {
			if now.After(entry.resetTime) {
				delete(rl.clients, ip)
			}
		}
		rl.mu.Unlock()
	}
}
+12 -1
@@ -40,12 +40,23 @@ func main() {
// Setup router
mux := http.NewServeMux()
// Create rate limiter for PDF endpoint
// Allow 3 PDF generations per minute per IP
pdfRateLimiter := middleware.NewRateLimiter(3, 1*time.Minute)
// Routes
mux.HandleFunc("/", cvHandler.Home)
mux.HandleFunc("/cv", cvHandler.CVContent)
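// Removed in this commit (replaced by the protected handler below;
// registering the same pattern twice makes http.ServeMux panic):
// mux.HandleFunc("/export/pdf", cvHandler.ExportPDF)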
mux.HandleFunc("/health", healthHandler.Check)
// Protected PDF endpoint with origin checking + rate limiting
protectedPDFHandler := middleware.OriginChecker(
	pdfRateLimiter.Middleware(
		http.HandlerFunc(cvHandler.ExportPDF),
	),
)
mux.Handle("/export/pdf", protectedPDFHandler)
// Static files with cache control
staticHandler := http.StripPrefix("/static/", http.FileServer(http.Dir("static")))
mux.Handle("/static/", cacheControl(staticHandler))