# Validation Package
High-performance struct tag-based validation system with zero external dependencies.
## Features
- **Struct Tag Validation**: Declarative validation using Go struct tags
- **High Performance**: Reflection caching achieves ~260µs per validation (12x faster than manual validation)
- **Zero Dependencies**: Pure Go stdlib implementation
- **UTF-8 Aware**: Proper Unicode handling for international names
- **Security First**: Email injection prevention, honeypot detection, timing validation
- **Thread-Safe**: Safe for concurrent use
- **Backward Compatible**: Existing manual validation functions preserved
## Quick Start
```go
// Define struct with validation tags
type ContactFormRequest struct {
	Name      string `json:"name" validate:"required,trim,max=100,pattern=name,no_injection"`
	Email     string `json:"email" validate:"required,trim,max=254,email,no_injection"`
	Company   string `json:"company" validate:"optional,trim,max=100,pattern=company"`
	Subject   string `json:"subject" validate:"required,trim,max=200,pattern=subject,no_injection"`
	Message   string `json:"message" validate:"required,trim,max=5000,sanitize"`
	Honeypot  string `json:"website" validate:"honeypot"`
	Timestamp int64  `json:"timestamp" validate:"timing=2:86400"`
}

// Validate
req := &ContactFormRequest{...}
if err := ValidateContactFormV2(req); err != nil {
	// Handle validation errors
	if verrs, ok := err.(ValidationErrors); ok {
		for _, e := range verrs {
			fmt.Printf("%s: %s\n", e.Field, e.Message)
		}
	}
}
```
## Available Validation Rules
### Required/Optional
- **`required`** - Field must not be empty (after trimming)
- **`optional`** - Explicit marker for optional fields
### Transformations
- **`trim`** - Auto-trim whitespace from field value
- **`sanitize`** - Remove newlines and escape HTML (for message body)
### Length Validation (UTF-8 aware)
- **`min=N`** - Minimum rune count (e.g., `min=3`)
- **`max=N`** - Maximum rune count (e.g., `max=100`)
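Counting runes rather than bytes matters for non-ASCII input. The sketch below shows the difference; `withinMax` is a hypothetical helper used only for illustration, not a function this package is known to export.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// withinMax reports whether s has at most limit runes (not bytes).
// Hypothetical helper for illustration only.
func withinMax(s string, limit int) bool {
	return utf8.RuneCountInString(s) <= limit
}

func main() {
	name := "田中太郎"
	fmt.Println(len(name))                    // 12 (bytes)
	fmt.Println(utf8.RuneCountInString(name)) // 4 (runes)
	fmt.Println(withinMax(name, 4))           // true
}
```

A byte-based check would reject this four-character name under `max=4`, even though a user sees four characters.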
### Format Validation
- **`email`** - RFC 5322 email validation
- **`pattern=TYPE`** - Predefined regex patterns:
  - `pattern=name` - Letters, spaces, hyphens, apostrophes (international names)
  - `pattern=subject` - Alphanumeric + safe punctuation including #
  - `pattern=company` - Alphanumeric + business punctuation (&, -, etc.)
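As a rough illustration of how a `pattern=name` rule can accept international names, the sketch below uses Unicode character classes. The regex itself is an assumption; the package's actual pattern is not shown in this README and may be stricter or looser.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is an assumed stand-in for the `pattern=name` regex:
// Unicode letters and combining marks, plus space, hyphen, and apostrophe.
// Compiled once at package init, in line with the pre-compiled-regex approach.
var namePattern = regexp.MustCompile(`^[\p{L}\p{M}' \-]+$`)

// matchesName reports whether s consists only of allowed name characters.
func matchesName(s string) bool {
	return namePattern.MatchString(s)
}

func main() {
	for _, s := range []string{"José María", "田中太郎", "O'Connor", "Anne-Marie", "R2-D2", "<script>"} {
		fmt.Printf("%q -> %v\n", s, matchesName(s))
	}
}
```

The first four inputs match; `R2-D2` and `<script>` do not, because digits and angle brackets are outside the class.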
### Security Rules
- **`no_injection`** - Prevent email header injection attacks
- **`honeypot`** - Must be empty (bot detection)
- **`timing=min:max`** - Timestamp validation in seconds (e.g., `timing=2:86400` = 2s to 24h)
## Rule Execution Order
1. **Transformations** applied first: `trim`, `sanitize`
2. **Validations** executed in tag order: `required`, `min`, `max`, `email`, `pattern`, `no_injection`, etc.
3. **Multiple errors** can be returned for a single field
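The ordering above can be sketched as a small tag-splitting step. `orderRules` and the `transforms` set are hypothetical names; this is one plausible way to get "transformations first, then validations in tag order", not necessarily how `validator.go` does it.

```go
package main

import (
	"fmt"
	"strings"
)

// transforms lists the rule names treated as transformations.
var transforms = map[string]bool{"trim": true, "sanitize": true}

// orderRules splits a validate tag and returns transformations first,
// followed by validations in their original tag order.
func orderRules(tag string) []string {
	var pre, checks []string
	for _, rule := range strings.Split(tag, ",") {
		rule = strings.TrimSpace(rule)
		if rule == "" {
			continue
		}
		name, _, _ := strings.Cut(rule, "=") // "max=100" -> "max"
		if transforms[name] {
			pre = append(pre, rule)
		} else {
			checks = append(checks, rule)
		}
	}
	return append(pre, checks...)
}

func main() {
	fmt.Println(orderRules("required,trim,max=100,pattern=name,no_injection"))
	// [trim required max=100 pattern=name no_injection]
}
```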
## Validation Errors
```go
// FieldError - single field error
type FieldError struct {
	Field   string `json:"field"`           // Field name
	Tag     string `json:"tag"`             // Validation tag that failed
	Param   string `json:"param,omitempty"` // Optional parameter
	Message string `json:"message"`         // Human-readable message
}
// ValidationErrors - collection of field errors
type ValidationErrors []FieldError
// Helper methods
func (ve ValidationErrors) HasErrors() bool
func (ve ValidationErrors) GetFieldError(field string) *FieldError
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError
```
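For readers who want to see how these helpers might behave, here is a minimal sketch of the three methods plus an `Error()` method. The method bodies and the `Error()` string format are assumptions; only the type definitions and method signatures come from this README.

```go
package main

import (
	"fmt"
	"strings"
)

type FieldError struct {
	Field   string `json:"field"`
	Tag     string `json:"tag"`
	Param   string `json:"param,omitempty"`
	Message string `json:"message"`
}

type ValidationErrors []FieldError

// Error lets ValidationErrors be returned as an error (assumed format).
func (ve ValidationErrors) Error() string {
	parts := make([]string, 0, len(ve))
	for _, e := range ve {
		parts = append(parts, e.Field+": "+e.Message)
	}
	return strings.Join(parts, "; ")
}

// HasErrors reports whether any field error was recorded.
func (ve ValidationErrors) HasErrors() bool { return len(ve) > 0 }

// GetFieldError returns the first error for field, or nil if there is none.
func (ve ValidationErrors) GetFieldError(field string) *FieldError {
	for i := range ve {
		if ve[i].Field == field {
			return &ve[i]
		}
	}
	return nil
}

// GetFieldErrors returns every error recorded for field.
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError {
	var out []FieldError
	for _, e := range ve {
		if e.Field == field {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	verrs := ValidationErrors{
		{Field: "email", Tag: "required", Message: "email is required"},
		{Field: "email", Tag: "email", Message: "Invalid email address format"},
	}
	fmt.Println(verrs.HasErrors(), len(verrs.GetFieldErrors("email")), verrs.GetFieldError("name") == nil)
	// true 2 true
}
```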
## Performance
Benchmarks on Apple M-series (12 cores):
```
Operation                      Time/op     Speed vs Manual
──────────────────────────────────────────────────────────────
Manual Validation              3,089 µs    1x (baseline)
First Validation (no cache)      718 µs    4.3x faster
Cached Validation                363 µs    8.5x faster
Global Validator (cached)        260 µs    11.9x faster ✓
```
**Key Performance Features:**
- Struct metadata caching (`sync.Map`) - metadata lookup ~500ns on first use, ~50ns once cached
- Pre-compiled regex patterns at init time
- UTF-8 aware length validation with `utf8.RuneCountInString()`
- Minimal allocations: ~90KB, ~477 allocs per validation
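The metadata caching idea can be sketched as follows: parse the `validate` tags of a struct type once, store the result in a `sync.Map` keyed by `reflect.Type`, and reuse it on later calls. The names `fieldMeta`, `metaCache`, and `metadataFor` are hypothetical, and the real cache in `validator.go` may store more than this.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// fieldMeta holds the parsed tag for one struct field (hypothetical shape).
type fieldMeta struct {
	Index int
	Name  string
	Tag   string
}

// metaCache maps reflect.Type to []fieldMeta.
var metaCache sync.Map

// metadataFor collects validate tags once per struct type and caches them.
// It returns nil for anything that is not a struct or pointer to struct.
func metadataFor(v any) []fieldMeta {
	t := reflect.TypeOf(v)
	for t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	if cached, ok := metaCache.Load(t); ok {
		return cached.([]fieldMeta)
	}
	var fields []fieldMeta
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if tag, ok := f.Tag.Lookup("validate"); ok {
			fields = append(fields, fieldMeta{Index: i, Name: f.Name, Tag: tag})
		}
	}
	// LoadOrStore keeps a single winner if two goroutines race on first use.
	actual, _ := metaCache.LoadOrStore(t, fields)
	return actual.([]fieldMeta)
}

// form is a small example type for the demo in main.
type form struct {
	Name string `validate:"required,max=100"`
	Note string
}

func main() {
	first := metadataFor(&form{}) // parses tags via reflection
	second := metadataFor(form{}) // served from the cache
	fmt.Println(len(first), first[0].Tag, len(second))
	// 1 required,max=100 1
}
```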
## Advanced Usage
### Custom Validator Instance
```go
// Create reusable validator with cached metadata
v := NewValidator()
// Validate multiple structs (metadata cached per type)
err1 := v.Validate(&user1)
err2 := v.Validate(&user2)
```
### Error Handling
```go
err := ValidateContactFormV2(req)
if err != nil {
	// Type assert to ValidationErrors
	verrs, ok := err.(ValidationErrors)
	if !ok {
		// Not a validation error
		return err
	}
	// Get specific field error
	if nameErr := verrs.GetFieldError("name"); nameErr != nil {
		fmt.Printf("Name error: %s\n", nameErr.Message)
	}
	// Get all errors for a field
	emailErrors := verrs.GetFieldErrors("email")
	fmt.Printf("email has %d error(s)\n", len(emailErrors))
	// Iterate all errors
	for _, e := range verrs {
		fmt.Printf("%s: %s (tag=%s)\n", e.Field, e.Message, e.Tag)
	}
}
```
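Because `FieldError` carries JSON tags, a collection of errors can be serialized directly for an API response. The sketch below assumes a response envelope named `errorResponse`, which is hypothetical; only the `FieldError` fields and tags come from this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type FieldError struct {
	Field   string `json:"field"`
	Tag     string `json:"tag"`
	Param   string `json:"param,omitempty"`
	Message string `json:"message"`
}

type ValidationErrors []FieldError

// errorResponse is a hypothetical envelope for returning errors to a client.
type errorResponse struct {
	Errors ValidationErrors `json:"errors"`
}

// encodeErrors renders the validation errors as a JSON document.
func encodeErrors(verrs ValidationErrors) (string, error) {
	b, err := json.Marshal(errorResponse{Errors: verrs})
	return string(b), err
}

func main() {
	out, err := encodeErrors(ValidationErrors{
		{Field: "name", Tag: "max", Param: "100", Message: "name must be 100 characters or less"},
		{Field: "subject", Tag: "required", Message: "subject is required"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The `param` key is omitted for the `required` entry because of `omitempty`.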
### Multiple Validations
```go
// Transformations run first; validations then run in tag order (left to right)
validate:"required,trim,max=100,pattern=name,no_injection"
// Execution order:
// 1. trim (transform, runs first even though it is listed second)
// 2. required (validate)
// 3. max=100 (validate)
// 4. pattern=name (validate)
// 5. no_injection (validate)
```
## Security Features
### Email Header Injection Prevention
Detects and blocks attempts to inject email headers via newlines or header patterns:
```go
"Name\nBcc: evil@example.com" // ❌ Blocked by no_injection
"Bcc: evil@example.com" // ❌ Blocked by no_injection
"Normal Name" // ✓ Valid
```
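A check of this kind can be sketched with a newline test plus a header-name regex. The header list and the function name `looksLikeInjection` are assumptions for illustration; the package's `ContainsEmailInjection` may use different rules.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// headerPattern matches a few mail header names followed by a colon.
// The header list is an assumption, not the package's actual pattern.
var headerPattern = regexp.MustCompile(`(?i)\b(?:to|cc|bcc|from|reply-to|content-type)\s*:`)

// looksLikeInjection reports whether s contains CR/LF or a header-like token.
func looksLikeInjection(s string) bool {
	if strings.ContainsAny(s, "\r\n") {
		return true
	}
	return headerPattern.MatchString(s)
}

func main() {
	fmt.Println(looksLikeInjection("Name\nBcc: evil@example.com")) // true
	fmt.Println(looksLikeInjection("Bcc: evil@example.com"))       // true
	fmt.Println(looksLikeInjection("Normal Name"))                 // false
}
```

A pattern this broad also flags harmless text such as "Note to: self", which is a reasonable trade-off for name and subject fields but would be too strict for a free-form message body.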
### Bot Detection
Two-layer bot protection:
```go
// Honeypot field (must be empty)
Honeypot string `validate:"honeypot"`
// Timing validation (2 seconds minimum, 24 hours maximum)
Timestamp int64 `validate:"timing=2:86400"`
```
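The two checks reduce to simple comparisons. The sketch below assumes `Timestamp` holds the Unix time (in seconds) at which the form was rendered, which matches how the examples in this README populate it; `checkTiming` and `checkHoneypot` are hypothetical names.

```go
package main

import "fmt"

// checkTiming reports whether a form rendered at Unix time rendered was
// submitted between minSec and maxSec seconds later. now is the current
// Unix time in seconds, passed in so the check is easy to test.
func checkTiming(rendered, now, minSec, maxSec int64) bool {
	elapsed := now - rendered
	return elapsed >= minSec && elapsed <= maxSec
}

// checkHoneypot passes only when the hidden field was left empty.
func checkHoneypot(value string) bool {
	return value == ""
}

func main() {
	const now = int64(1_700_000_000)
	fmt.Println(checkTiming(now-10, now, 2, 86400))    // true: filled in over 10s
	fmt.Println(checkTiming(now-1, now, 2, 86400))     // false: submitted too fast
	fmt.Println(checkTiming(now-90000, now, 2, 86400)) // false: older than 24h
	fmt.Println(checkHoneypot("http://spam.example"))  // false: bot filled the field
}
```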
### Input Sanitization
Message field automatically sanitized:
```go
// Before: "<script>alert('XSS')</script>"
// After: "&lt;script&gt;alert('XSS')&lt;/script&gt;"
Message string `validate:"sanitize"`
```
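A minimal sketch of a `sanitize`-style transform, assuming it is built on the standard library's `html.EscapeString`. The function name `sanitizeMessage` is hypothetical, and dropping newlines outright (rather than replacing them with spaces) is also an assumption.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// newlineStripper removes CR and LF characters.
var newlineStripper = strings.NewReplacer("\r", "", "\n", "")

// sanitizeMessage removes newlines, then escapes HTML special characters.
func sanitizeMessage(s string) string {
	return html.EscapeString(newlineStripper.Replace(s))
}

func main() {
	fmt.Println(sanitizeMessage("<b>Hello</b>\r\n")) // &lt;b&gt;Hello&lt;/b&gt;
}
```

Note that `html.EscapeString` also escapes single and double quotes (as `&#39;` and `&#34;`), so its output for quoted input can differ from the example shown above if the package uses a custom escaper.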
## Backward Compatibility
Legacy manual validation functions remain available:
```go
// V1 - Manual validation (still works)
err := ValidateContactForm(req)
// V2 - Struct tag validation (recommended)
err = ValidateContactFormV2(req)
// Helper function signatures (still available)
func IsValidEmail(email string) bool
func IsValidName(name string) bool
func IsValidSubject(subject string) bool
func IsValidCompany(company string) bool
func ContainsEmailInjection(s string) bool
func SanitizeContactForm(req *ContactFormRequest)
```
## Testing
```bash
# Run all tests
go test ./internal/validation/...
# Run with coverage
go test -cover ./internal/validation/...
# Run benchmarks
go test -bench=. -benchmem ./internal/validation/...
# Run specific test
go test -run TestValidatorV2_EmailValidation ./internal/validation/...
```
## Examples
### Example 1: Contact Form
```go
req := &ContactFormRequest{
	Name:      " John O'Connor ", // Will be trimmed
	Email:     "john@example.com",
	Company:   "Acme Corp", // Optional
	Subject:   "Question #123",
	Message:   "<b>Hello</b>", // Will be sanitized
	Honeypot:  "",             // Must be empty
	Timestamp: time.Now().Unix() - 10,
}
if err := ValidateContactFormV2(req); err != nil {
	log.Fatal(err)
}
// req.Name is now "John O'Connor" (trimmed)
// req.Message is now "&lt;b&gt;Hello&lt;/b&gt;" (sanitized)
```
### Example 2: Handling Multiple Errors
```go
req := &ContactFormRequest{
	Name:    strings.Repeat("a", 101), // Too long
	Email:   "invalid-email",          // Invalid format
	Subject: "",                       // Missing required
	Message: "Valid message",
}
err := ValidateContactFormV2(req)
// Use the checked form: a bare err.(ValidationErrors) panics when err is nil
if verrs, ok := err.(ValidationErrors); ok {
	for _, e := range verrs {
		fmt.Printf("%s: %s\n", e.Field, e.Message)
	}
}
// Errors reported (failing rule in parentheses):
// name: name must be 100 characters or less (max=100)
// email: Invalid email address format (email)
// subject: subject is required (required)
```
### Example 3: International Names
```go
// All valid international names
names := []string{
"José María", // Spanish
"François Dubois", // French
"Müller", // German
"田中太郎", // Japanese
"Anne-Marie", // Hyphenated
"O'Connor", // Apostrophe
}
for _, name := range names {
req := &ContactFormRequest{Name: name, ...}
if err := ValidateContactFormV2(req); err != nil {
log.Printf("Valid name rejected: %s", name)
}
}
```
## Thread Safety
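All validation functions are safe to call from multiple goroutines. Because validation rewrites fields in place (`trim`, `sanitize`), give each goroutine its own request value instead of sharing one pointer:
```go
// Safe for concurrent use
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		local := *req // per-goroutine copy; validation mutates its argument
		if err := ValidateContactFormV2(&local); err != nil {
			// Process err...
		}
	}()
}
wg.Wait()
```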
## Architecture
```
validator.go       - Core reflection engine with caching
rules.go           - Built-in validation rules
errors.go          - Error types and helpers
contact.go         - Contact form struct and legacy functions
contact_test.go    - Legacy validation tests
validator_test.go  - Struct tag validation tests
```
## Design Principles
1. **Performance First**: Reflection caching, pre-compiled regex, minimal allocations
2. **Security First**: Defense in depth against injection attacks
3. **Developer Experience**: Clear error messages, type-safe APIs
4. **Zero Dependencies**: Pure Go stdlib for easy maintenance
5. **Backward Compatible**: Existing code continues to work
6. **UTF-8 Aware**: Proper international character support
## Future Extensions
The validator can be extended with custom rules:
```go
// Custom rule example (not yet implemented)
validationRules["custom"] = func(field, value, param string) *FieldError {
	// Your custom validation logic
	return nil
}
```
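Since custom rules are not implemented yet, the following is only a sketch of how such a registry could look, reusing the function signature from the snippet above. The `prefix` rule, the `RuleFunc` type, and `applyRule` are all made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldError is reduced to plain fields for this sketch.
type FieldError struct {
	Field, Tag, Param, Message string
}

// RuleFunc mirrors the rule signature shown in the README snippet.
type RuleFunc func(field, value, param string) *FieldError

// validationRules is a plain map, so rules should be registered at init
// time, before any concurrent validation starts.
var validationRules = map[string]RuleFunc{}

func init() {
	// "prefix" is a made-up rule: the value must start with param.
	validationRules["prefix"] = func(field, value, param string) *FieldError {
		if strings.HasPrefix(value, param) {
			return nil
		}
		return &FieldError{
			Field:   field,
			Tag:     "prefix",
			Param:   param,
			Message: field + " must start with " + param,
		}
	}
}

// applyRule looks up a rule by name and runs it against one value.
func applyRule(name, field, value, param string) *FieldError {
	rule, ok := validationRules[name]
	if !ok {
		return &FieldError{Field: field, Tag: name, Message: "unknown rule " + name}
	}
	return rule(field, value, param)
}

func main() {
	fmt.Println(applyRule("prefix", "ticket", "CV-123", "CV-") == nil) // true
	if fe := applyRule("prefix", "ticket", "123", "CV-"); fe != nil {
		fmt.Println(fe.Message) // ticket must start with CV-
	}
}
```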
## License
Part of the CV Site project - see project LICENSE file.