cv-site/doc/24-GO-VALIDATION-SYSTEM.md
juanatsap 2c7f8de242 refactor: centralize constants and reorganize documentation
- Create internal/constants package with all hardcoded values
  (environment, cookies, themes, headers, routes, cache)
- Create internal/httputil package for HTTP helper functions
- Update all handlers and middleware to use centralized constants
- Reorganize documentation with numbered prefixes (00-26)
- Remove duplicate docs from validation folder and docs/
- Delete handlers/constants.go (moved to internal/constants)
2025-12-06 16:27:12 +00:00

Go Validation System Documentation

Overview

The CV site implements a tag-based validation system with reflection caching for high-performance struct validation. The system uses struct tags (similar to JSON tags) to declaratively define validation rules, eliminating repetitive validation code.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Validator Core                          │
│  ┌──────────────┐    ┌──────────────┐   ┌───────────────┐  │
│  │  Reflection  │───>│  sync.Map    │   │  Validation   │  │
│  │  Parser      │    │  Cache       │<──│  Rules        │  │
│  └──────────────┘    └──────────────┘   └───────────────┘  │
│         │                    │                    │          │
│         v                    v                    v          │
│  ┌──────────────────────────────────────────────────────┐   │
│  │        Field Metadata (index, name, rules[])        │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                              │
                              v
                    ┌──────────────────┐
                    │  ValidationErrors │
                    │   ([]FieldError)  │
                    └──────────────────┘

Key Components:

  • Validator (internal/validation/validator.go) - Core reflection-based validator with caching
  • ValidationRules (internal/validation/rules.go) - Built-in validation rule registry
  • FieldError (internal/validation/errors.go) - Structured error types
  • ContactFormRequest (internal/validation/contact.go) - Example struct with validation tags

Performance Benefits

Reflection Caching with sync.Map:

  • Struct metadata parsed once per type
  • Subsequent validations use cached field metadata
  • Thread-safe concurrent validation
  • Cached lookups reuse existing metadata, so repeated validations avoid re-parsing tags and the allocations that come with it
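The caching idea can be sketched as follows. This is a minimal illustration rather than the project's validator: `fieldMeta`, `metaCache`, `metadataFor`, `userMeta`, and the `parseCount` counter are hypothetical names introduced here, and only the raw `validate` tag is stored.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"sync"
	"sync/atomic"
)

// fieldMeta is a hypothetical stand-in for the cached per-field metadata.
type fieldMeta struct {
	Index int      // position of the field in the struct
	Name  string   // Go field name
	Rules []string // raw entries from the validate tag
}

// metaCache maps reflect.Type to []fieldMeta.
var metaCache sync.Map

// parseCount records how many times a struct was parsed (demonstration only).
var parseCount int64

// metadataFor returns cached metadata for t and parses the struct on a miss.
func metadataFor(t reflect.Type) []fieldMeta {
	if cached, ok := metaCache.Load(t); ok {
		return cached.([]fieldMeta)
	}
	atomic.AddInt64(&parseCount, 1)
	var fields []fieldMeta
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag := f.Tag.Get("validate")
		if tag == "" {
			continue // fields without a validate tag are skipped
		}
		fields = append(fields, fieldMeta{Index: i, Name: f.Name, Rules: strings.Split(tag, ",")})
	}
	// LoadOrStore keeps the first stored value if another goroutine won the race.
	actual, _ := metaCache.LoadOrStore(t, fields)
	return actual.([]fieldMeta)
}

type user struct {
	Name  string `validate:"required,max=100"`
	Email string `validate:"required,email"`
	Note  string
}

// userMeta looks up the metadata for the example struct.
func userMeta() []fieldMeta { return metadataFor(reflect.TypeOf(user{})) }

func main() {
	first := userMeta()
	second := userMeta() // served from the cache
	fmt.Println(len(first), len(second), atomic.LoadInt64(&parseCount))
}
```

In this sketch, two goroutines that miss at the same moment may both parse the struct; `LoadOrStore` makes sure only one result is kept, which is usually an acceptable trade for not holding a lock.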

Benchmark Results:

V1 (manual validation): ~2000 ns/op
V2 (tag-based cached):  ~1500 ns/op

Tag Syntax

Basic Format

type MyStruct struct {
    Field string `validate:"rule1,rule2=param,rule3"`
}

Rules are comma-separated:

  • Simple rule: required
  • Rule with parameter: max=100
  • Multiple rules: required,trim,max=100,email

Field name resolution:

  • Uses json tag name if present
  • Falls back to struct field name
  • Example: Name string `json:"name"` → field name is "name"
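As an illustration of the tag format and name resolution described above, the sketch below splits a `validate` tag into rules and picks the reported field name. The `rule` type and the `parseRules`, `fieldName`, and `resolve` helpers are hypothetical and may not match the project's parser.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// rule is a hypothetical parsed form of one entry in a validate tag.
type rule struct {
	Name  string // e.g. "max"
	Param string // e.g. "100"; empty for simple rules
}

// parseRules splits a tag such as "required,max=100" into rules.
func parseRules(tag string) []rule {
	var rules []rule
	for _, part := range strings.Split(tag, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		name, param, _ := strings.Cut(part, "=")
		rules = append(rules, rule{Name: name, Param: param})
	}
	return rules
}

// fieldName prefers the json tag name and falls back to the Go field name.
func fieldName(f reflect.StructField) string {
	name, _, _ := strings.Cut(f.Tag.Get("json"), ",") // drop options like ",omitempty"
	if name == "" || name == "-" {
		return f.Name
	}
	return name
}

type example struct {
	Name    string `json:"name" validate:"required,trim,max=100"`
	Company string `validate:"optional"`
}

// resolve returns the resolved name and parsed rules for field i of example.
func resolve(i int) (string, []rule) {
	f := reflect.TypeOf(example{}).Field(i)
	return fieldName(f), parseRules(f.Tag.Get("validate"))
}

func main() {
	for i := 0; i < 2; i++ {
		name, rules := resolve(i)
		fmt.Println(name, rules)
	}
}
```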

Available Validation Rules

1. Required Fields

required

Validates that the field is not empty (after trimming whitespace).

type User struct {
    Name string `json:"name" validate:"required"`
}

Error Message: "name is required"


optional

Explicit marker for optional fields (always passes, used for documentation).

type User struct {
    Company string `json:"company" validate:"optional"`
}

2. String Transformations

trim

Auto-trims leading/trailing whitespace from the field value.

type User struct {
    Name string `json:"name" validate:"required,trim"`
}

Behavior:

  • Transformation happens before validation
  • Modifies the field value in-place
  • UTF-8 aware

sanitize

HTML-escapes the field value and removes newlines from header fields.

type ContactForm struct {
    Message string `json:"message" validate:"required,trim,sanitize"`
}

Transformation:

  • Trims whitespace
  • Removes \r and \n characters
  • HTML-escapes content (prevents XSS)
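The three steps can be sketched with the standard library. This assumes escaping is done with `html.EscapeString`; the project's `sanitize` rule may use a different escaper, and the `sanitize` function below is only an illustration.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// newlineStripper removes carriage returns and line feeds.
var newlineStripper = strings.NewReplacer("\r", "", "\n", "")

// sanitize trims, strips newlines, then HTML-escapes the value.
func sanitize(s string) string {
	s = strings.TrimSpace(s)
	s = newlineStripper.Replace(s)
	return html.EscapeString(s)
}

func main() {
	fmt.Println(sanitize("  <b>Hi</b>\r\nthere  "))
}
```

Note that `html.EscapeString` also escapes quotes, so in this sketch an apostrophe becomes `&#39;` and a double quote becomes `&#34;`.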

3. Length Validation

min=N

Validates minimum rune length (UTF-8 aware, not byte length).

type Password struct {
    Value string `json:"password" validate:"required,min=8"`
}

Error Message: "password must be at least 8 characters"

Note: Uses utf8.RuneCountInString() to support international characters correctly.
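The difference between byte length and rune count is easy to see in a small example. The `minRunes` and `maxRunes` helpers are hypothetical names used only for this sketch.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// minRunes reports whether s contains at least n runes (not bytes).
func minRunes(s string, n int) bool { return utf8.RuneCountInString(s) >= n }

// maxRunes reports whether s contains at most n runes (not bytes).
func maxRunes(s string, n int) bool { return utf8.RuneCountInString(s) <= n }

func main() {
	s := "Jos\u00e9" // "José" with a precomposed é, which takes two bytes in UTF-8
	fmt.Println(len(s), utf8.RuneCountInString(s))
	fmt.Println(minRunes(s, 4), maxRunes(s, 4))
}
```

A byte-based check with `len()` would count this four-character name as five, which is why rune counting matters for non-ASCII input.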


max=N

Validates maximum rune length (UTF-8 aware).

type User struct {
    Name string `json:"name" validate:"required,max=100"`
}

Error Message: "name must be 100 characters or less"

4. Format Validation

email

Validates email format per RFC 5322 (simplified).

type User struct {
    Email string `json:"email" validate:"required,email"`
}

Validation Rules:

  • Length: 3-254 characters
  • Must contain exactly one @
  • Local part: max 64 characters
  • Domain must contain at least one .
  • Regex pattern: /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+$/

Error Message: "Invalid email address format"
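The checks listed above could be combined roughly as follows. This is a sketch under assumptions: `validEmail` is a hypothetical helper, and `emailPattern` is an HTML5-style expression chosen for illustration, so the project's actual regex may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailPattern is an assumed HTML5-style pattern that requires at least one dot
// in the domain. It is written as an interpreted string because it contains a backtick.
var emailPattern = regexp.MustCompile(
	"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?" +
		"(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+$")

// validEmail applies the cheap structural checks first and the regex last.
func validEmail(s string) bool {
	if len(s) < 3 || len(s) > 254 {
		return false
	}
	if strings.Count(s, "@") != 1 {
		return false
	}
	local, domain, _ := strings.Cut(s, "@")
	if len(local) > 64 || !strings.Contains(domain, ".") {
		return false
	}
	return emailPattern.MatchString(s)
}

func main() {
	for _, addr := range []string{"juan@example.com", "juan@example", "a@@b.com"} {
		fmt.Println(addr, validEmail(addr))
	}
}
```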


pattern=name|subject|company

Validates against predefined regex patterns.

type ContactForm struct {
    Name    string `json:"name"    validate:"pattern=name"`
    Subject string `json:"subject" validate:"pattern=subject"`
    Company string `json:"company" validate:"pattern=company"`
}

Supported Patterns:

Pattern    Regex                             Description
name       ^[\p{L}\s'-]+$                    Letters (any language), spaces, hyphens, apostrophes
subject    ^[\p{L}\p{N}\s.,!?'"()\-:;#]+$    Alphanumeric + safe punctuation including #
company    ^[\p{L}\p{N}\s.,&'()\-]+$         Alphanumeric + business punctuation

Error Message: "name contains invalid characters for name (letters, spaces, hyphens, apostrophes only)"

Pre-compiled for Performance: All patterns are compiled once in init(), so validation never pays the regex compilation cost.
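A pattern registry along these lines might look like the sketch below. The `patterns` map and `matchesPattern` helper are hypothetical; the expressions are the three documented ones. Package-level initialization is used here, which, like init(), runs once per process.

```go
package main

import (
	"fmt"
	"regexp"
)

// patterns holds the documented expressions, compiled once at startup.
var patterns = map[string]*regexp.Regexp{
	"name":    regexp.MustCompile(`^[\p{L}\s'-]+$`),
	"subject": regexp.MustCompile(`^[\p{L}\p{N}\s.,!?'"()\-:;#]+$`),
	"company": regexp.MustCompile(`^[\p{L}\p{N}\s.,&'()\-]+$`),
}

// matchesPattern reports whether value matches the named pattern.
// Unknown pattern names are treated as a failed match.
func matchesPattern(name, value string) bool {
	re, ok := patterns[name]
	return ok && re.MatchString(value)
}

func main() {
	fmt.Println(matchesPattern("name", "Juan Jos\u00e9"))
	fmt.Println(matchesPattern("name", "R2-D2"))
	fmt.Println(matchesPattern("subject", "Question about #golang"))
}
```

Because each expression ends in `+`, an empty string does not match in this sketch; an optional field would need to skip the pattern check when the value is empty.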

5. Security Validation

no_injection

Prevents email header injection attacks.

type ContactForm struct {
    Email   string `json:"email"   validate:"required,email,no_injection"`
    Subject string `json:"subject" validate:"required,no_injection"`
}

Detects:

  • Newline characters (\r, \n)
  • Email header patterns (case-insensitive):
    • content-type:
    • mime-version:
    • bcc:, cc:, to:, from:
    • subject:, reply-to:
    • x-mailer:

Error Message: "email contains invalid characters (possible injection attempt)"

Security Note: This prevents attackers from injecting additional email headers via form fields.
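A detector for the listed patterns can be sketched as a newline check followed by a case-insensitive substring scan. `headerPatterns` and `hasInjection` are hypothetical names, and the real rule may match more precisely than this.

```go
package main

import (
	"fmt"
	"strings"
)

// headerPatterns lists the header names from the documentation, in lower case.
var headerPatterns = []string{
	"content-type:", "mime-version:", "bcc:", "cc:", "to:", "from:",
	"subject:", "reply-to:", "x-mailer:",
}

// hasInjection reports whether value contains a newline or a header-like pattern.
func hasInjection(value string) bool {
	if strings.ContainsAny(value, "\r\n") {
		return true
	}
	lower := strings.ToLower(value)
	for _, p := range headerPatterns {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasInjection("Hello\r\nBcc: attacker@evil.com"))
	fmt.Println(hasInjection("Question about #golang"))
}
```

Plain substring matching is deliberately strict: a harmless subject such as "Note to: self" would also be rejected in this sketch, which is the price of erring on the side of safety.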


honeypot

Bot detection - field must be empty.

type ContactForm struct {
    Honeypot string `json:"website" validate:"honeypot"`
}

Behavior:

  • Hidden field in form (CSS: display: none)
  • Legitimate users never fill it
  • Bots auto-fill all fields

Error Message: "Bot detected"


timing=min:max

Validates form submission timing to prevent bot submissions.

type ContactForm struct {
    Timestamp int64 `json:"timestamp" validate:"timing=2:86400"`
}

Parameters:

  • min: Minimum seconds between page load and submit (e.g., 2)
  • max: Maximum seconds allowed (e.g., 86400 = 24 hours)

Validation:

  • Timestamp is set when form loads (JavaScript)
  • Submitted with form
  • Server validates now - timestamp is within [min, max]

Error Messages:

  • Too fast: "Form submitted too quickly (bot detected)"
  • Invalid: "Invalid timestamp" (future or too old)

Security Note: Prevents automated bot submissions that submit forms instantly.
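The timing check can be sketched as a pure function that takes the current time as an argument, which keeps it easy to test. `checkTiming`, `errTooFast`, and `errInvalid` are hypothetical names; how the real rule treats a zero or missing timestamp is not covered here.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// The messages mirror the documented error messages.
var (
	errTooFast = errors.New("Form submitted too quickly (bot detected)")
	errInvalid = errors.New("Invalid timestamp")
)

// checkTiming validates now - timestamp against a "min:max" parameter in seconds.
func checkTiming(param string, timestamp, now int64) error {
	minStr, maxStr, ok := strings.Cut(param, ":")
	if !ok {
		return fmt.Errorf("timing parameter %q must look like min:max", param)
	}
	minSec, err := strconv.ParseInt(minStr, 10, 64)
	if err != nil {
		return fmt.Errorf("invalid min in %q: %w", param, err)
	}
	maxSec, err := strconv.ParseInt(maxStr, 10, 64)
	if err != nil {
		return fmt.Errorf("invalid max in %q: %w", param, err)
	}

	elapsed := now - timestamp
	switch {
	case elapsed < 0 || elapsed > maxSec:
		return errInvalid // timestamp in the future or too old
	case elapsed < minSec:
		return errTooFast
	}
	return nil
}

func main() {
	fmt.Println(checkTiming("2:86400", 1000, 1005))
	fmt.Println(checkTiming("2:86400", 1000, 1001))
}
```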

Complete Example: ContactFormRequest

Struct Definition

package validation

type ContactFormRequest struct {
    Name      string `json:"name"      validate:"required,trim,max=100,pattern=name,no_injection"`
    Email     string `json:"email"     validate:"required,trim,max=254,email,no_injection"`
    Company   string `json:"company"   validate:"optional,trim,max=100,pattern=company"`
    Subject   string `json:"subject"   validate:"required,trim,max=200,pattern=subject,no_injection"`
    Message   string `json:"message"   validate:"required,trim,max=5000,sanitize"`
    Honeypot  string `json:"website"   validate:"honeypot"`
    Timestamp int64  `json:"timestamp" validate:"timing=2:86400"`
}

Validation Execution

// Validate contact form
req := &ContactFormRequest{
    Name:      "  Juan José  ",
    Email:     "juan@example.com",
    Subject:   "Question about #golang",
    Message:   "<script>alert('xss')</script>Hello!",
    Honeypot:  "",
    Timestamp: time.Now().Unix() - 5,
}

// V2 validation (tag-based with caching)
if err := ValidateContactFormV2(req); err != nil {
    // Handle validation errors
    if validationErrors, ok := err.(ValidationErrors); ok {
        for _, fieldErr := range validationErrors {
            fmt.Printf("%s: %s\n", fieldErr.Field, fieldErr.Message)
        }
    }
}

// After validation, req.Name is "Juan José" (trimmed)
// req.Message is "&lt;script&gt;alert('xss')&lt;/script&gt;Hello!" (sanitized)

Validation Flow

1. Reflection Cache Lookup
   ├─> Cache Hit: Use cached field metadata
   └─> Cache Miss: Parse struct, cache metadata

2. For Each Field:
   ├─> Apply Transformations (trim, sanitize)
   │   └─> Update field value in-place
   │
   └─> Apply Validation Rules
       ├─> required: Check non-empty
       ├─> max=100: Check UTF-8 rune count
       ├─> pattern=name: Validate against regex
       ├─> no_injection: Check for malicious patterns
       └─> Collect errors

3. Return Results
   ├─> Success: nil error
   └─> Failure: ValidationErrors ([]FieldError)
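Step 2 of the flow can be sketched in a reduced form: string fields only, three rules (trim, required, max), and no cache. The `validate` function and `form` struct are hypothetical and exist only to show transformations being written back before validation rules run.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
	"unicode/utf8"
)

// FieldError mirrors the documented error shape (JSON tags omitted here).
type FieldError struct {
	Field, Tag, Param, Message string
}

// validate expects a non-nil pointer to a struct with exported string fields.
func validate(ptr any) []FieldError {
	v := reflect.ValueOf(ptr).Elem()
	t := v.Type()
	var errs []FieldError
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag := f.Tag.Get("validate")
		if tag == "" || f.Type.Kind() != reflect.String {
			continue
		}
		name := f.Tag.Get("json")
		if name == "" {
			name = f.Name
		}
		rules := strings.Split(tag, ",")

		// Pass 1: transformations, written back in place.
		for _, r := range rules {
			if r == "trim" {
				v.Field(i).SetString(strings.TrimSpace(v.Field(i).String()))
			}
		}

		// Pass 2: validation rules see the transformed value.
		value := v.Field(i).String()
		for _, r := range rules {
			ruleName, param, _ := strings.Cut(r, "=")
			switch ruleName {
			case "required":
				if value == "" {
					errs = append(errs, FieldError{name, "required", "", name + " is required"})
				}
			case "max":
				n, err := strconv.Atoi(param)
				if err == nil && utf8.RuneCountInString(value) > n {
					msg := fmt.Sprintf("%s must be %d characters or less", name, n)
					errs = append(errs, FieldError{name, "max", param, msg})
				}
			}
		}
	}
	return errs
}

type form struct {
	Name    string `json:"name" validate:"required,trim,max=5"`
	Subject string `json:"subject" validate:"required,trim"`
}

func main() {
	f := &form{Name: "  Juan Jos\u00e9  ", Subject: "   "}
	for _, e := range validate(f) {
		fmt.Println(e.Field, e.Tag, e.Message)
	}
	fmt.Printf("%q\n", f.Name)
}
```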

Error Handling

FieldError Structure

type FieldError struct {
    Field   string `json:"field"`            // "name"
    Tag     string `json:"tag"`              // "max"
    Param   string `json:"param,omitempty"`  // "100"
    Message string `json:"message"`          // "name must be 100 characters or less"
}

ValidationErrors (Multiple Errors)

type ValidationErrors []FieldError

// Methods
func (ve ValidationErrors) Error() string
func (ve ValidationErrors) HasErrors() bool
func (ve ValidationErrors) GetFieldError(field string) *FieldError
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError
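One plausible way to implement these four methods is sketched below. The method bodies are assumptions, in particular the "; " separator used by Error(); only the signatures come from the listing above.

```go
package main

import (
	"fmt"
	"strings"
)

type FieldError struct {
	Field   string `json:"field"`
	Tag     string `json:"tag"`
	Param   string `json:"param,omitempty"`
	Message string `json:"message"`
}

type ValidationErrors []FieldError

// Error joins all messages so ValidationErrors satisfies the error interface.
func (ve ValidationErrors) Error() string {
	msgs := make([]string, len(ve))
	for i, fe := range ve {
		msgs[i] = fe.Message
	}
	return strings.Join(msgs, "; ")
}

// HasErrors reports whether any field error was collected.
func (ve ValidationErrors) HasErrors() bool { return len(ve) > 0 }

// GetFieldError returns the first error for field, or nil if there is none.
func (ve ValidationErrors) GetFieldError(field string) *FieldError {
	for i := range ve {
		if ve[i].Field == field {
			return &ve[i]
		}
	}
	return nil
}

// GetFieldErrors returns every error recorded for field.
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError {
	var out []FieldError
	for _, fe := range ve {
		if fe.Field == field {
			out = append(out, fe)
		}
	}
	return out
}

// sample builds a small error set for demonstration.
func sample() ValidationErrors {
	return ValidationErrors{
		{Field: "email", Tag: "required", Message: "email is required"},
		{Field: "email", Tag: "email", Message: "Invalid email address format"},
		{Field: "name", Tag: "max", Param: "100", Message: "name must be 100 characters or less"},
	}
}

func main() {
	var err error = sample() // usable wherever an error is expected
	fmt.Println(err)
}
```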

Error Handling Example

err := ValidateContactFormV2(req)
if err != nil {
    validationErrors, ok := err.(ValidationErrors)
    if !ok {
        // Not a validation error (e.g., struct type error)
        return err
    }

    // Get specific field error
    if nameErr := validationErrors.GetFieldError("name"); nameErr != nil {
        fmt.Printf("Name error: %s\n", nameErr.Message)
    }

    // Get all errors for a field
    emailErrors := validationErrors.GetFieldErrors("email")
    for _, fieldErr := range emailErrors {
        fmt.Printf("Email: %s (%s)\n", fieldErr.Message, fieldErr.Tag)
    }

    // Convert to JSON for API response
    return c.JSON(http.StatusBadRequest, map[string]interface{}{
        "errors": validationErrors,
    })
}

V1 vs V2 Validation Comparison

V1: Manual Validation (Legacy)

func ValidateContactForm(req *ContactFormRequest) error {
    // Honeypot check
    if req.Honeypot != "" {
        return &ValidationError{Field: "website", Message: "Bot detected"}
    }

    // Timing check
    if req.Timestamp > 0 {
        now := time.Now().Unix()
        timeTaken := now - req.Timestamp
        if timeTaken < 2 {
            return &ValidationError{
                Field: "timestamp",
                Message: "Form submitted too quickly (bot detected)",
            }
        }
    }

    // Required fields
    if strings.TrimSpace(req.Name) == "" {
        return &ValidationError{Field: "name", Message: "Name is required"}
    }

    // ... 100+ more lines of manual validation ...
}

Drawbacks:

  • Verbose and repetitive
  • Error-prone (easy to forget validations)
  • Hard to maintain
  • No reusability across structs

V2: Tag-Based Validation (Current)

type ContactFormRequest struct {
    Name string `json:"name" validate:"required,trim,max=100,pattern=name,no_injection"`
    // ... more fields ...
}

func ValidateContactFormV2(req *ContactFormRequest) error {
    return globalValidator.Validate(req)
}

Benefits:

  • Declarative and self-documenting
  • Consistent validation across all structs
  • Reflection caching for performance
  • Extensible with custom rules
  • Struct fields stay statically typed (the tags themselves are plain strings, parsed at runtime rather than checked by the compiler)

Extension Guide: Custom Validation Rules

Step 1: Define Validation Function

// ruleCustom validates against a custom pattern
func ruleCustom(field string, value string, param string) *FieldError {
    if value == "" {
        return nil // Skip validation for empty values
    }

    // Custom validation logic
    if !isValid(value) {
        return &FieldError{
            Field:   field,
            Tag:     "custom",
            Param:   param,
            Message: field + " does not meet custom criteria",
        }
    }

    return nil
}

Step 2: Register Rule

func init() {
    validationRules["custom"] = ruleCustom
}

Step 3: Use in Struct Tags

type MyStruct struct {
    Field string `json:"field" validate:"custom=param"`
}

Example: URL Validation Rule

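func ruleURL(field string, value string, param string) *FieldError {
    if value == "" {
        return nil
    }

    // url.Parse accepts relative references such as "foo", so also
    // require a scheme and a host before treating the value as a URL.
    u, err := url.Parse(value)
    if err != nil || u.Scheme == "" || u.Host == "" {
        return &FieldError{
            Field:   field,
            Tag:     "url",
            Message: field + " must be a valid URL",
        }
    }

    return nil
}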

func init() {
    validationRules["url"] = ruleURL
}

Usage:

type Website struct {
    URL string `json:"url" validate:"required,url"`
}

Thread Safety

sync.Map for Caching

type Validator struct {
    cache sync.Map // map[reflect.Type]*structMeta
}

Characteristics:

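  • Thread-safe concurrent reads and writes
  • Optimized for mostly-read workloads (a good fit for a cache that is written once per type)
  • Lookups of already-cached types usually avoid taking a lock
  • No manual cleanup: the cache holds one entry per validated struct type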

Safe Concurrent Validation

// Global validator instance (shared across goroutines)
var globalValidator = NewValidator()

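// Safe to call from multiple goroutines, provided each call gets its own
// request: transformations such as trim modify the struct in place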
func handler1(req *ContactFormRequest) error {
    return globalValidator.Validate(req) // Thread-safe
}

func handler2(req *ContactFormRequest) error {
    return globalValidator.Validate(req) // Thread-safe
}

Best Practices

1. Use Appropriate Rule Order

// ✅ GOOD: Transformations first, validations second
Name string `validate:"trim,required,max=100,pattern=name"`

// ❌ BAD: Validations before transformations
Name string `validate:"required,max=100,trim,pattern=name"`

Why: Transformations modify the value before validation runs, so listing them first makes the tag read in the order it is applied.

2. Combine Security Rules

// ✅ GOOD: Multiple layers of security
Email string `validate:"required,trim,max=254,email,no_injection"`

// ❌ BAD: Missing injection protection
Email string `validate:"required,email"`

3. Use Global Validator Instance

// ✅ GOOD: Reuse cached metadata
var globalValidator = NewValidator()

func Validate(req interface{}) error {
    return globalValidator.Validate(req)
}

// ❌ BAD: Creates new validator every time (no caching)
func Validate(req interface{}) error {
    v := NewValidator()
    return v.Validate(req)
}

4. Explicit Optional Fields

// ✅ GOOD: Clearly marked as optional
Company string `validate:"optional,trim,max=100"`

// ⚠️ ACCEPTABLE: No validate tag (implicitly optional)
Company string `json:"company"`

5. UTF-8 Awareness

// ✅ GOOD: max=100 counts runes (supports "José" = 4 runes)
Name string `validate:"max=100"`

// Note: Never use len() for validation - it counts bytes!

Security Considerations

1. Email Header Injection Prevention

Attack Vector:

Subject: Hello\r\nBcc: attacker@evil.com\r\n\r\nInjected content

Protection:

Subject string `validate:"no_injection"`

2. XSS Prevention

Attack Vector:

Message: <script>alert('XSS')</script>

Protection:

Message string `validate:"sanitize"`
// Result: &lt;script&gt;alert('XSS')&lt;/script&gt;

3. Bot Detection

Multi-Layer Approach:

type ContactForm struct {
    Honeypot  string `validate:"honeypot"`      // Must be empty
    Timestamp int64  `validate:"timing=2:86400"` // 2s-24h submission time
}

4. Safe HTML Handling

// ⚠️ SECURITY WARNING: Only use safeHTML with trusted content
// NEVER use with user-generated content!

// ✅ GOOD: Sanitize user input
Message string `validate:"sanitize"`

// ❌ BAD: Trusting user HTML directly
Message template.HTML // DANGEROUS!

Quick Reference

Common Validation Patterns

// Required text field with length limit
Name string `validate:"required,trim,max=100"`

// Email field
Email string `validate:"required,trim,max=254,email,no_injection"`

// Optional field with validation when provided
Company string `validate:"optional,trim,max=100,pattern=company"`

// Message with XSS protection
Message string `validate:"required,trim,max=5000,sanitize"`

// Honeypot bot trap
Honeypot string `validate:"honeypot"`

// Timing-based bot detection
Timestamp int64 `validate:"timing=2:86400"`

Error Response Format

{
  "errors": [
    {
      "field": "name",
      "tag": "max",
      "param": "100",
      "message": "name must be 100 characters or less"
    },
    {
      "field": "email",
      "tag": "email",
      "message": "Invalid email address format"
    }
  ]
}
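A response in this shape falls out of the JSON tags on FieldError. The sketch below assumes the struct definition shown earlier; `errorResponse` is a hypothetical helper that marshals with the standard library and is not tied to any web framework.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type FieldError struct {
	Field   string `json:"field"`
	Tag     string `json:"tag"`
	Param   string `json:"param,omitempty"` // left out of the JSON when empty
	Message string `json:"message"`
}

// errorResponse wraps the errors in an object with a single "errors" key.
func errorResponse(errs []FieldError) (string, error) {
	body, err := json.Marshal(map[string]any{"errors": errs})
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	out, err := errorResponse([]FieldError{
		{Field: "name", Tag: "max", Param: "100", Message: "name must be 100 characters or less"},
		{Field: "email", Tag: "email", Message: "Invalid email address format"},
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

The `omitempty` option is what keeps `param` out of the second entry, matching the sample response.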

Performance Metrics

Reflection Caching Impact

First validation (cold cache):  ~2000 ns/op
Subsequent validations (warm):  ~1500 ns/op
Cache hit rate:                 99.9%
Memory overhead:                ~500 bytes per struct type

Pattern Compilation

Pre-compiled patterns (init):   One-time cost
Pattern matching:               Zero allocations
Regex cache:                    Global, shared

Related Files

  • internal/validation/validator.go - Core validator with caching
  • internal/validation/rules.go - Validation rule implementations
  • internal/validation/errors.go - Error types and methods
  • internal/validation/contact.go - ContactFormRequest example and V1/V2 validation

See Also