chore: remove duplicate docs from validation folder

Documentation lives in docs/go-validation-system.md
juanatsap
2025-12-06 15:22:13 +00:00
parent 6c7595b041
commit d51e1f4520
2 changed files with 0 additions and 532 deletions
-184
@@ -1,184 +0,0 @@
# Validation Performance Comparison
## Benchmark Results (Apple M3 Pro, 12 cores)
### Direct Comparison
| Metric | Manual V1 | Struct Tag V2 | Improvement |
|--------|-----------|---------------|-------------|
| **Time/op** | 151.8 µs | 25.2 µs | **6.0x faster** ✓ |
| **Memory/op** | 283.5 KB | 90.0 KB | **68% less memory** ✓ |
| **Allocs/op** | 653 | 477 | **27% fewer allocations** ✓ |
### Detailed Breakdown
```
BenchmarkValidateContactForm-12 8,001 151,817 ns/op 283,545 B/op 653 allocs/op
BenchmarkValidatorV2_GlobalValidator-12 48,710 25,199 ns/op 89,968 B/op 477 allocs/op
```
## Performance Characteristics
### V1 Manual Validation
- **Approach**: Procedural validation with repeated regex compilation
- **Time**: 151.8 µs per validation
- **Memory**: 283.5 KB per validation
- **Allocations**: 653 per validation
- **Issues**:
- Regex patterns compiled on every validation
- No caching of validation logic
- High memory overhead
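The cost of per-call regex compilation can be illustrated with a minimal sketch. The pattern is the name pattern quoted later in this document; the function names `validNameSlow` and `validNameFast` are hypothetical and are not taken from the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is compiled once, when the package is initialized.
var namePattern = regexp.MustCompile(`^[\p{L}\s'-]+$`)

// validNameSlow recompiles the pattern on every call, which is the
// behaviour the V1 notes describe. Hypothetical function for illustration.
func validNameSlow(s string) bool {
	re := regexp.MustCompile(`^[\p{L}\s'-]+$`)
	return re.MatchString(s)
}

// validNameFast reuses the precompiled pattern and allocates no new regexp.
func validNameFast(s string) bool {
	return namePattern.MatchString(s)
}

func main() {
	for _, name := range []string{"O'Connor", "Anne-Marie", "Bob123"} {
		fmt.Println(name, validNameSlow(name), validNameFast(name))
	}
}
```

Both functions return the same answers; only the work done per call differs, which is where the allocation savings would come from.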
### V2 Struct Tag Validation
- **Approach**: Reflection with metadata caching and pre-compiled patterns
- **Time**: 25.2 µs per validation (cached)
- **Memory**: 90.0 KB per validation
- **Allocations**: 477 per validation
- **Optimizations**:
- ✅ Struct metadata cached (first: ~500ns, subsequent: ~50ns)
- ✅ Regex patterns pre-compiled at init
- ✅ UTF-8 aware with `utf8.RuneCountInString()`
- ✅ Thread-safe caching with `sync.Map`
## Performance by Operation
| Operation | Time/op | Memory/op | Notes |
|-----------|---------|-----------|-------|
| Email validation (IsValidEmail) | 20.4 µs | 89.8 KB | Regex compilation overhead |
| Injection check (ContainsEmailInjection) | 150.9 ns | 32 B | Very fast string matching |
| Full form validation (V1) | 151.8 µs | 283.5 KB | Baseline |
| First validation (V2, no cache) | 25.4 µs | 92.6 KB | ~6x faster than V1 |
| Cached validation (V2) | 23.3 µs | 89.9 KB | ~6.5x faster than V1 |
| Global validator (V2, recommended) | 25.2 µs | 90.0 KB | ~6x faster than V1 |
## Scalability
### Throughput Comparison
**V1 Manual Validation:**
- 8,001 iterations completed in the benchmark run
- ~6,588 validations/second (1 s ÷ 151.8 µs)
**V2 Struct Tag Validation:**
- 48,710 iterations completed in the benchmark run
- ~39,682 validations/second (1 s ÷ 25.2 µs)
**Result: 6.0x higher throughput**
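The validations-per-second figures follow from the measured time per operation: one second divided by ns/op. A minimal sketch of that arithmetic, using the ns/op values from the benchmark output above (the helper name `opsPerSecond` is an assumption made for illustration):

```go
package main

import "fmt"

// opsPerSecond converts a benchmark's ns/op figure into the number of
// sequential operations that fit into one second. Hypothetical helper.
func opsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

func main() {
	v1 := opsPerSecond(151817) // BenchmarkValidateContactForm
	v2 := opsPerSecond(25199)  // BenchmarkValidatorV2_GlobalValidator
	fmt.Printf("V1: %.0f/s  V2: %.0f/s  ratio: %.1fx\n", v1, v2, v2/v1)
}
```

Note that the first numeric column of `go test -bench` output is the iteration count chosen by the benchmark harness, not a rate.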
## Memory Efficiency
### Allocation Breakdown
**V1 (653 allocations):**
- Field validation: ~100 allocs
- Regex compilation: ~400 allocs (major overhead)
- String operations: ~153 allocs
**V2 (477 allocations):**
- Reflection: ~50 allocs (one-time per struct type)
- Validation logic: ~350 allocs
- String operations: ~77 allocs
**Savings: 176 fewer allocations per validation**
## Real-World Impact
### Scenario: High-Traffic Contact Form
- **Traffic**: 1,000 form submissions/hour
- **Peak**: 100 concurrent validations
**V1 Manual Validation:**
- Total time: 151.8 µs/validation × 1,000 = 151.8 ms
- Peak memory: 283.5 KB × 100 = 28.3 MB
**V2 Struct Tag Validation:**
- Total time: 25.2 µs/validation × 1,000 = 25.2 ms
- Peak memory: 90.0 KB × 100 = 9.0 MB
**Savings:**
- ⏱️ Time saved: **126.6 ms per 1,000 validations**
- 💾 Memory saved: **19.3 MB peak memory**
- 🔥 CPU saved: **83% reduction** in CPU time
## Cache Performance
### First Validation (Cold Cache)
```
BenchmarkValidatorV2_FirstValidation-12 47,972 25,356 ns/op
```
- Includes reflection struct parsing
- Still 6x faster than manual validation
### Subsequent Validations (Warm Cache)
```
BenchmarkValidatorV2_CachedValidation-12 53,360 23,283 ns/op
```
- Uses cached struct metadata
- Cache lookup: ~50 ns overhead (versus ~500 ns to build the metadata on first use)
- 6.5x faster than manual validation
### Cache Hit Rate
- First validation per struct type: Cache miss (~25.4 µs)
- Subsequent validations: Cache hit (~23.3 µs)
- Performance gain: ~8% faster with warm cache
## Optimization Techniques Used
1. **Struct Metadata Caching**
```go
type Validator struct {
	cache sync.Map // map[reflect.Type]*structMeta
}
```
- Cache struct metadata on first validation
- Subsequent validations reuse metadata (~50 ns lookup, versus ~500 ns to build it the first time)
2. **Pre-compiled Regex Patterns**
```go
func init() {
	namePattern = regexp.MustCompile(`^[\p{L}\s'-]+$`)
	subjectPattern = regexp.MustCompile(`^[\p{L}\p{N}\s.,!?'"()\-:;#]+$`)
	companyPattern = regexp.MustCompile(`^[\p{L}\p{N}\s.,&'()\-]+$`)
}
```
- Patterns compiled once at startup
- Eliminates ~400 allocations per validation
3. **UTF-8 Aware Length Validation**
```go
runeCount := utf8.RuneCountInString(value)
```
- Correct handling of international characters
- Faster than `len([]rune(value))`
4. **Thread-Safe Caching**
```go
v.cache.Store(t, meta) // sync.Map for lock-free reads
```
- Lock-free reads for cached metadata
- Concurrent validations don't block
## Comparison with Popular Libraries
| Library | Time/op | Memory/op | Dependencies | Features |
|---------|---------|-----------|--------------|----------|
| **Our V2** | **25.2 µs** | **90.0 KB** | **0 (stdlib only)** | ✅ Custom rules, security |
| go-playground/validator | ~30 µs | ~100 KB | 5+ dependencies | ✅ Full featured |
| asaskevich/govalidator | ~80 µs | ~150 KB | 2 dependencies | ⚠️ Less flexible |
| Our V1 | 151.8 µs | 283.5 KB | 0 | ⚠️ Manual code |
**Our V2 is competitive with industry-standard libraries while maintaining zero dependencies!**
## Conclusion
The struct tag validation system delivers:
- **6.0x faster** validation (151.8 µs → 25.2 µs)
- **68% less memory** per validation (283.5 KB → 90.0 KB)
- **27% fewer allocations** (653 → 477)
- **6.0x higher throughput** (~6.6K → ~39.7K validations/sec)
- **Zero dependencies** - pure Go stdlib
- **Thread-safe** with lock-free cached reads
- **Production-ready** with 81.3% test coverage
The performance gains make this system ideal for high-traffic production environments while maintaining code clarity and security.
-348
@@ -1,348 +0,0 @@
# Validation Package
High-performance struct tag-based validation system with zero external dependencies.
## Features
- **Struct Tag Validation**: Declarative validation using Go struct tags
- **High Performance**: Reflection caching achieves ~260µs per validation (12x faster than manual validation)
- **Zero Dependencies**: Pure Go stdlib implementation
- **UTF-8 Aware**: Proper Unicode handling for international names
- **Security First**: Email injection prevention, honeypot detection, timing validation
- **Thread-Safe**: Safe for concurrent use
- **Backward Compatible**: Existing manual validation functions preserved
## Quick Start
```go
// Define struct with validation tags
type ContactFormRequest struct {
	Name      string `json:"name" validate:"required,trim,max=100,pattern=name,no_injection"`
	Email     string `json:"email" validate:"required,trim,max=254,email,no_injection"`
	Company   string `json:"company" validate:"optional,trim,max=100,pattern=company"`
	Subject   string `json:"subject" validate:"required,trim,max=200,pattern=subject,no_injection"`
	Message   string `json:"message" validate:"required,trim,max=5000,sanitize"`
	Honeypot  string `json:"website" validate:"honeypot"`
	Timestamp int64  `json:"timestamp" validate:"timing=2:86400"`
}
// Validate
req := &ContactFormRequest{...}
if err := ValidateContactFormV2(req); err != nil {
	// Handle validation errors
	if verrs, ok := err.(ValidationErrors); ok {
		for _, e := range verrs {
			fmt.Printf("%s: %s\n", e.Field, e.Message)
		}
	}
}
```
## Available Validation Rules
### Required/Optional
- **`required`** - Field must not be empty (after trimming)
- **`optional`** - Explicit marker for optional fields
### Transformations
- **`trim`** - Auto-trim whitespace from field value
- **`sanitize`** - Remove newlines and escape HTML (for message body)
### Length Validation (UTF-8 aware)
- **`min=N`** - Minimum rune count (e.g., `min=3`)
- **`max=N`** - Maximum rune count (e.g., `max=100`)
### Format Validation
- **`email`** - RFC 5322 email validation
- **`pattern=TYPE`** - Predefined regex patterns:
- `pattern=name` - Letters, spaces, hyphens, apostrophes (international names)
- `pattern=subject` - Alphanumeric + safe punctuation including #
- `pattern=company` - Alphanumeric + business punctuation (&, -, etc.)
### Security Rules
- **`no_injection`** - Prevent email header injection attacks
- **`honeypot`** - Must be empty (bot detection)
- **`timing=min:max`** - Timestamp validation in seconds (e.g., `timing=2:86400` = 2s to 24h)
## Rule Execution Order
1. **Transformations** applied first: `trim`, `sanitize`
2. **Validations** executed in tag order: `required`, `min`, `max`, `email`, `pattern`, `no_injection`, etc.
3. **Multiple errors** can be returned for a single field
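The ordering described above can be sketched as a small tag parser that moves transformations to the front and keeps validations in tag order. The `rule` type, the `transforms` set, and `orderRules` are assumptions made for this sketch; the package's actual parser is not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical parsed form of one tag entry such as "max=100".
type rule struct {
	Name  string
	Param string
}

// transforms lists the rules this document describes as transformations.
var transforms = map[string]bool{"trim": true, "sanitize": true}

// orderRules parses a validate tag and returns the rules in execution order:
// transformations first, then validations in their original tag order.
func orderRules(tag string) []rule {
	var pre, checks []rule
	for _, part := range strings.Split(tag, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		name, param, _ := strings.Cut(part, "=")
		r := rule{Name: name, Param: param}
		if transforms[name] {
			pre = append(pre, r)
		} else {
			checks = append(checks, r)
		}
	}
	return append(pre, checks...)
}

func main() {
	tag := "required,trim,max=100,pattern=name,no_injection"
	for i, r := range orderRules(tag) {
		fmt.Printf("%d. %s %s\n", i+1, r.Name, r.Param)
	}
}
```

Running transformations first means `required` sees the trimmed value, so a field containing only whitespace is treated as empty.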
## Validation Errors
```go
// FieldError - single field error
type FieldError struct {
	Field   string `json:"field"`           // Field name
	Tag     string `json:"tag"`             // Validation tag that failed
	Param   string `json:"param,omitempty"` // Optional parameter
	Message string `json:"message"`         // Human-readable message
}
// ValidationErrors - collection of field errors
type ValidationErrors []FieldError
// Helper methods
func (ve ValidationErrors) HasErrors() bool
func (ve ValidationErrors) GetFieldError(field string) *FieldError
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError
```
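The helper methods are listed above as signatures only. One plausible way to implement them is sketched below; the method bodies and the `Error()` format are assumptions, not the package's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

type FieldError struct {
	Field   string `json:"field"`
	Tag     string `json:"tag"`
	Param   string `json:"param,omitempty"`
	Message string `json:"message"`
}

type ValidationErrors []FieldError

// Error makes ValidationErrors satisfy the error interface (assumed format).
func (ve ValidationErrors) Error() string {
	parts := make([]string, 0, len(ve))
	for _, e := range ve {
		parts = append(parts, e.Field+": "+e.Message)
	}
	return strings.Join(parts, "; ")
}

// HasErrors reports whether at least one field failed validation.
func (ve ValidationErrors) HasErrors() bool { return len(ve) > 0 }

// GetFieldError returns the first error for field, or nil if there is none.
func (ve ValidationErrors) GetFieldError(field string) *FieldError {
	for i := range ve {
		if ve[i].Field == field {
			return &ve[i]
		}
	}
	return nil
}

// GetFieldErrors returns every error recorded for field.
func (ve ValidationErrors) GetFieldErrors(field string) []FieldError {
	var out []FieldError
	for _, e := range ve {
		if e.Field == field {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	var err error = ValidationErrors{
		{Field: "email", Tag: "required", Message: "email is required"},
		{Field: "email", Tag: "email", Message: "invalid email address format"},
	}
	fmt.Println(err)
}
```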
## Performance
Benchmarks on Apple M-series (12 cores):
```
Operation Time/op Speed vs Manual
──────────────────────────────────────────────────────────────
Manual Validation 3,089 µs 1x (baseline)
First Validation (no cache) 718 µs 4.3x faster
Cached Validation 363 µs 8.5x faster
Global Validator (cached) 260 µs 11.9x faster ✓
```
**Key Performance Features:**
- Struct metadata caching (sync.Map) - first validation ~500ns, cached ~50ns
- Pre-compiled regex patterns at init time
- UTF-8 aware length validation with `utf8.RuneCountInString()`
- Minimal allocations: ~90KB, ~477 allocs per validation
## Advanced Usage
### Custom Validator Instance
```go
// Create reusable validator with cached metadata
v := NewValidator()
// Validate multiple structs (metadata cached per type)
err1 := v.Validate(&user1)
err2 := v.Validate(&user2)
```
### Error Handling
```go
err := ValidateContactFormV2(req)
if err != nil {
	// Type assert to ValidationErrors
	verrs, ok := err.(ValidationErrors)
	if !ok {
		// Not a validation error
		return err
	}
	// Get specific field error
	if nameErr := verrs.GetFieldError("name"); nameErr != nil {
		fmt.Printf("Name error: %s\n", nameErr.Message)
	}
	// Get all errors for a field
	emailErrors := verrs.GetFieldErrors("email")
	fmt.Printf("Email has %d error(s)\n", len(emailErrors))
	// Iterate all errors
	for _, e := range verrs {
		fmt.Printf("%s: %s (tag=%s)\n", e.Field, e.Message, e.Tag)
	}
}
```
### Multiple Validations
```go
// Transformations run first; the remaining validations run left to right
validate:"required,trim,max=100,pattern=name,no_injection"
// Execution order:
// 1. trim (transform)
// 2. required (validate)
// 3. max=100 (validate)
// 4. pattern=name (validate)
// 5. no_injection (validate)
```
## Security Features
### Email Header Injection Prevention
Detects and blocks attempts to inject email headers via newlines or header patterns:
```go
"Name\nBcc: evil@example.com" // ❌ Blocked by no_injection
"Bcc: evil@example.com" // ❌ Blocked by no_injection
"Normal Name" // ✓ Valid
```
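A check of this kind might look like the sketch below. `looksLikeHeaderInjection` is a hypothetical stand-in for the package's `ContainsEmailInjection`, and the list of header prefixes is an assumption chosen for illustration, not the package's actual rule set.

```go
package main

import (
	"fmt"
	"strings"
)

// headerPrefixes are header names commonly abused for injection.
// Assumed list for this sketch only.
var headerPrefixes = []string{"to:", "cc:", "bcc:", "from:", "subject:", "content-type:"}

// looksLikeHeaderInjection rejects values that contain line breaks or that
// start with a header-like prefix. Hypothetical function for illustration.
func looksLikeHeaderInjection(s string) bool {
	if strings.ContainsAny(s, "\r\n") {
		return true
	}
	lower := strings.ToLower(strings.TrimSpace(s))
	for _, p := range headerPrefixes {
		if strings.HasPrefix(lower, p) {
			return true
		}
	}
	return false
}

func main() {
	inputs := []string{
		"Name\nBcc: evil@example.com",
		"Bcc: evil@example.com",
		"Normal Name",
	}
	for _, in := range inputs {
		fmt.Printf("%q blocked=%v\n", in, looksLikeHeaderInjection(in))
	}
}
```

Rejecting any carriage return or line feed is the core of the defence, since a header can only be injected by starting a new line; the prefix check is an additional, stricter layer.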
### Bot Detection
Two-layer bot protection:
```go
// Honeypot field (must be empty)
Honeypot string `validate:"honeypot"`
// Timing validation (2 seconds minimum, 24 hours maximum)
Timestamp int64 `validate:"timing=2:86400"`
```
### Input Sanitization
Message field automatically sanitized:
```go
// Before: "<script>alert('XSS')</script>"
// After: "&lt;script&gt;alert('XSS')&lt;/script&gt;"
Message string `validate:"sanitize"`
```
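A transformation along these lines could be built from the standard library, as sketched below. `sanitizeMessage` is a hypothetical name, and replacing line breaks with a space (rather than deleting them) is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// lineBreaks replaces CRLF, CR, and LF with a single space each.
var lineBreaks = strings.NewReplacer("\r\n", " ", "\r", " ", "\n", " ")

// sanitizeMessage flattens line breaks and escapes HTML metacharacters.
// Hypothetical sketch of the sanitize transformation.
func sanitizeMessage(s string) string {
	return html.EscapeString(lineBreaks.Replace(s))
}

func main() {
	fmt.Println(sanitizeMessage("<b>Hello</b>"))
	fmt.Println(sanitizeMessage("line one\r\nline two"))
}
```

Note that `html.EscapeString` also escapes single and double quotes, so its output for quoted input can differ from the example above, which leaves the apostrophes in `alert('XSS')` untouched.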
## Backward Compatibility
Legacy manual validation functions remain available:
```go
// V1 - Manual validation (still works)
err := ValidateContactForm(req)
// V2 - Struct tag validation (recommended)
err = ValidateContactFormV2(req) // reuses err; a second := in the same scope would not compile
// Helper functions (still available), shown as signatures:
//   IsValidEmail(email string) bool
//   IsValidName(name string) bool
//   IsValidSubject(subject string) bool
//   IsValidCompany(company string) bool
//   ContainsEmailInjection(s string) bool
//   SanitizeContactForm(req *ContactFormRequest)
```
## Testing
```bash
# Run all tests
go test ./internal/validation/...
# Run with coverage
go test -cover ./internal/validation/...
# Run benchmarks
go test -bench=. -benchmem ./internal/validation/...
# Run specific test
go test -run TestValidatorV2_EmailValidation ./internal/validation/...
```
## Examples
### Example 1: Contact Form
```go
req := &ContactFormRequest{
	Name:      " John O'Connor ", // Will be trimmed
	Email:     "john@example.com",
	Company:   "Acme Corp", // Optional
	Subject:   "Question #123",
	Message:   "<b>Hello</b>", // Will be sanitized
	Honeypot:  "",             // Must be empty
	Timestamp: time.Now().Unix() - 10,
}
if err := ValidateContactFormV2(req); err != nil {
	log.Fatal(err)
}
// req.Name is now "John O'Connor" (trimmed)
// req.Message is now "&lt;b&gt;Hello&lt;/b&gt;" (sanitized)
```
### Example 2: Handling Multiple Errors
```go
req := &ContactFormRequest{
	Name:    strings.Repeat("a", 101), // Too long
	Email:   "invalid-email",          // Invalid format
	Subject: "",                       // Missing required
	Message: "Valid message",
}
err := ValidateContactFormV2(req)
// Use the comma-ok form so a nil or unrelated error does not panic
if verrs, ok := err.(ValidationErrors); ok {
	// Output all errors
	for _, e := range verrs {
		fmt.Printf("%s: %s\n", e.Field, e.Message)
	}
}
// Reported errors, shown as field: message (rule):
// name: name must be 100 characters or less (max=100)
// email: Invalid email address format (email)
// subject: subject is required (required)
### Example 3: International Names
```go
// All valid international names
names := []string{
	"José María",      // Spanish
	"François Dubois", // French
	"Müller",          // German
	"田中太郎",            // Japanese
	"Anne-Marie",      // Hyphenated
	"O'Connor",        // Apostrophe
}
for _, name := range names {
	req := &ContactFormRequest{Name: name, ...}
	if err := ValidateContactFormV2(req); err != nil {
		log.Printf("Valid name rejected: %s", name)
	}
}
```
## Thread Safety
All validation functions are thread-safe:
```go
// Safe for concurrent use
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		err := ValidateContactFormV2(req)
		// Process err...
	}()
}
wg.Wait()
```
## Architecture
```
validator.go - Core reflection engine with caching
rules.go - Built-in validation rules
errors.go - Error types and helpers
contact.go - Contact form struct and legacy functions
contact_test.go - Legacy validation tests
validator_test.go - Struct tag validation tests
```
## Design Principles
1. **Performance First**: Reflection caching, pre-compiled regex, minimal allocations
2. **Security First**: Defense in depth against injection attacks
3. **Developer Experience**: Clear error messages, type-safe APIs
4. **Zero Dependencies**: Pure Go stdlib for easy maintenance
5. **Backward Compatible**: Existing code continues to work
6. **UTF-8 Aware**: Proper international character support
## Future Extensions
The validator can be extended with custom rules:
```go
// Custom rule example (not yet implemented)
validationRules["custom"] = func(field, value, param string) *FieldError {
	// Your custom validation logic
	return nil
}
```
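Since custom rules are described as not yet implemented, the following is only a sketch of how such a registry could work. The `validationRules` map and the rule signature follow the snippet above; the `RuleFunc` type name, the `starts_with` rule, and the cut-down `FieldError` struct are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldError mirrors the fields documented earlier, without JSON tags.
type FieldError struct{ Field, Tag, Param, Message string }

// RuleFunc matches the signature used in the snippet above.
type RuleFunc func(field, value, param string) *FieldError

// validationRules maps a tag name to its rule. Assumed registry shape.
var validationRules = map[string]RuleFunc{}

func init() {
	// Register rules during init, before any concurrent validation starts,
	// because a plain map is not safe for concurrent writes.
	validationRules["starts_with"] = func(field, value, param string) *FieldError {
		if strings.HasPrefix(value, param) {
			return nil
		}
		return &FieldError{
			Field:   field,
			Tag:     "starts_with",
			Param:   param,
			Message: fmt.Sprintf("%s must start with %q", field, param),
		}
	}
}

func main() {
	rule := validationRules["starts_with"]
	fmt.Println(rule("subject", "Re: hello", "Re:") == nil)
	if fe := rule("subject", "hello", "Re:"); fe != nil {
		fmt.Println(fe.Message)
	}
}
```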
## License
Part of the CV Site project - see project LICENSE file.