# Go Server Architecture: Why Goroutines and Graceful Shutdown
**Last Updated**: 2025-11-20
**Learning Value**: ⭐⭐⭐⭐⭐
## 📋 Table of Contents
1. [Server Startup Flow](#server-startup-flow)
2. [Why Start Server in a Goroutine?](#why-start-server-in-a-goroutine)
3. [Channel Communication](#channel-communication)
4. [Graceful Shutdown Pattern](#graceful-shutdown-pattern)
5. [Context and Timeouts](#context-and-timeouts)
6. [Production Best Practices](#production-best-practices)
---
## 🚀 Server Startup Flow
### The Code (main.go:60-101)
```go
// Start server in goroutine
serverErrors := make(chan error, 1)
go func() {
	log.Printf("✓ Server listening on http://%s:%s", cfg.Server.Host, cfg.Server.Port)
	serverErrors <- server.ListenAndServe() // Blocks until server stops
}()
// Setup graceful shutdown
shutdown := make(chan os.Signal, 1)
signal.Notify(shutdown, os.Interrupt, syscall.SIGTERM)
// Wait for shutdown signal or server error
select {
case err := <-serverErrors:
	// Server stopped unexpectedly
	if !errors.Is(err, http.ErrServerClosed) {
		log.Fatalf("❌ Server error: %v", err)
	}
case sig := <-shutdown:
	// User pressed Ctrl+C or system sent SIGTERM
	log.Printf("🛑 Shutdown signal received: %v", sig)
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := server.Shutdown(ctx); err != nil {
		log.Printf("⚠️ Graceful shutdown failed, forcing: %v", err)
		server.Close() // Force close if graceful fails
	}
}
```
### Visual Flow
```
main() starts
  ├─→ Load config
  ├─→ Initialize templates
  ├─→ Create handlers
  ├─→ Setup routes
  └─→ Create HTTP server
        ├─→ Create error channel (serverErrors)
        ├─→ Launch GOROUTINE ──────────────┐
        │                                  │
        │   (main thread continues)        │   (goroutine: runs in parallel)
        │                                  │
        │                                  ├─→ Log startup messages
        │                                  │
        │                                  └─→ server.ListenAndServe()
        │                                      (BLOCKS here, handling HTTP requests)
        ├─→ Create shutdown channel
        ├─→ Setup signal handler (Ctrl+C, SIGTERM)
        └─→ SELECT statement (BLOCKS here)
              ┌───────────────────────────────┐
              │                               │
              ├─→ Case 1: serverErrors        │   Case 2: shutdown signal
              │   (server crashed)            │   (user pressed Ctrl+C)
              │   → Log error                 │   → Initiate graceful shutdown
              │   → Exit                      │   → Wait max 30s for requests to finish
              │                               │   → Force close if timeout
              └───────────────────────────────┘
```
---
## 🤔 Why Start Server in a Goroutine?
### The Question
> "Why do we do `go func() { server.ListenAndServe() }()` instead of just `server.ListenAndServe()` directly?"
### The Answer: Blocking vs. Non-Blocking
#### Without Goroutine (WRONG):
```go
func main() {
	server := &http.Server{Addr: ":1999"}
	log.Println("Starting server...")
	server.ListenAndServe() // ← BLOCKS HERE for as long as the server runs
	// This code cannot run while the server is up!
	setupGracefulShutdown() // ❌ Not reached while the server is running
	waitForSignals()        // ❌ Not reached while the server is running
}
```
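**Problem**: `ListenAndServe()` **blocks** for as long as the server is running. It returns only when the listener fails (for example, the port is already in use) or after `Shutdown` or `Close` is called, and nothing in this program can call them. Any code placed after it cannot run while the server is up, so the shutdown logic is never installed in time.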
#### With Goroutine (CORRECT):
```go
func main() {
	server := &http.Server{Addr: ":1999"}
	// Launch server in separate goroutine
	serverErrors := make(chan error, 1)
	go func() {
		log.Println("Starting server...")
		serverErrors <- server.ListenAndServe() // Runs in parallel
	}()
	// Main thread continues immediately!
	setupGracefulShutdown() // ✅ Runs
	waitForSignals()        // ✅ Runs
}
```
**Solution**: The goroutine runs **concurrently** with the main goroutine (the one executing `main()`, loosely called the "main thread" in this guide). The server handles requests in its own goroutine while the main goroutine sets up shutdown logic.
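The serving call does return once the server is asked to stop, and the error it returns tells you why. The sketch below is a minimal illustration, not the project's `main.go`: it uses `net.Listen` on port 0 plus `Serve` (instead of `ListenAndServe`) so that any free port works, and the helper names `serveUntilShutdown` and `stoppedCleanly` are made up for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveUntilShutdown runs a server in a goroutine, shuts it down right away,
// and returns whatever error the serving call delivered on the channel.
func serveUntilShutdown() error {
	// Port 0 asks the OS for any free port (an assumption for this demo).
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}

	serverErrors := make(chan error, 1)
	go func() {
		serverErrors <- srv.Serve(ln) // blocks until the server is stopped
	}()

	// The main goroutine is free to do other work, including stopping the server.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	return <-serverErrors
}

// stoppedCleanly reports whether the serving call ended with ErrServerClosed,
// the sentinel error returned after Shutdown or Close.
func stoppedCleanly() bool {
	return errors.Is(serveUntilShutdown(), http.ErrServerClosed)
}

func main() {
	fmt.Println("stopped cleanly:", stoppedCleanly())
}
```

This is why the startup code checks `errors.Is(err, http.ErrServerClosed)` before treating the error as a failure.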
### What is a Goroutine?
> **Goroutine** = Lightweight thread managed by Go runtime
```go
// Regular function call (synchronous)
doWork() // Wait for doWork to finish before continuing
// Goroutine (asynchronous)
go doWork() // Start doWork in parallel, continue immediately
```
**Key Characteristics**:
- **Lightweight**: Starts with a ~2KB stack that grows on demand (OS threads typically reserve far more, often 1MB or larger)
- 🚀 **Fast**: Cheap to create, Go can run thousands simultaneously
- 🎯 **Scheduled by Go**: Runtime multiplexes goroutines onto OS threads
- 📡 **Communicate via channels**: "Do not communicate by sharing memory; instead, share memory by communicating"
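To make these characteristics concrete, here is a small sketch that starts ten thousand goroutines and collects their results over a channel rather than through shared memory. The function name `sumConcurrently` and the workload are invented for illustration.

```go
package main

import "fmt"

// sumConcurrently starts one goroutine per number in 1..n. Each goroutine
// sends its number on a channel and the caller adds them up, so no variable
// is shared between goroutines.
func sumConcurrently(n int) int {
	results := make(chan int)
	for i := 1; i <= n; i++ {
		go func(v int) {
			results <- v // blocks until the caller receives
		}(i)
	}

	total := 0
	for i := 0; i < n; i++ {
		total += <-results
	}
	return total
}

func main() {
	// 10,000 goroutines is unremarkable for the Go runtime.
	fmt.Println(sumConcurrently(10000))
}
```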
---
## 📡 Channel Communication
### What is a Channel?
> **Channel** = Typed conduit through which goroutines communicate
```go
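// Create a channel of ints
messages := make(chan int)
// Send from another goroutine: a send on an unbuffered channel blocks until
// someone receives, so sending and receiving in one goroutine would deadlock
go func() { messages <- 42 }()
// Receive value from channel (blocks until someone sends)
value := <-messages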
```
### Why Channels?
**Problem**: How does the main thread know when the server goroutine encounters an error?
**Solution**: Use a channel to communicate errors from goroutine → main thread
```go
// Main goroutine
serverErrors := make(chan error, 1) // Buffered channel (capacity 1)
// Server goroutine (scheduled by the Go runtime, not necessarily its own OS thread)
go func() {
	err := server.ListenAndServe()
	serverErrors <- err // Send error to channel
}()
// Main goroutine
select {
case err := <-serverErrors: // Receive error from channel
	log.Fatalf("Server failed: %v", err)
}
```
### Buffered vs. Unbuffered Channels
```go
// Unbuffered (capacity 0) - default
unbuffered := make(chan int)
// Sender blocks until receiver is ready
// Receiver blocks until sender sends
// Buffered (capacity > 0)
buffered := make(chan int, 1)
// Sender doesn't block if buffer has space
// Receiver blocks only if buffer is empty
```
**Why buffered for `serverErrors`?**
```go
serverErrors := make(chan error, 1) // Buffer size 1
```
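The buffer lets the server goroutine deliver its error and exit even when nobody is receiving. That matters most on the shutdown path: once `select` has taken the `shutdown` case, the main goroutine never reads `serverErrors` again, yet `ListenAndServe()` still returns `http.ErrServerClosed` after `Shutdown`. With an unbuffered channel that final send would block forever and leak the goroutine. An error that arrives **before** the `select` statement is reached is simply held in the buffer until then.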
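The difference is easy to observe with a non-blocking send. The sketch below uses `select` with a `default` branch to probe whether a send would block; `trySend` is a helper invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan error, err error) bool {
	select {
	case ch <- err:
		return true
	default:
		return false // the send would have blocked
	}
}

func main() {
	unbuffered := make(chan error)
	buffered := make(chan error, 1)
	err := errors.New("listen failed")

	// No receiver is waiting on either channel.
	fmt.Println("unbuffered:", trySend(unbuffered, err))     // nothing can take the value
	fmt.Println("buffered, 1st:", trySend(buffered, err))    // fits in the buffer
	fmt.Println("buffered, 2nd:", trySend(buffered, err))    // buffer is already full
}
```

A capacity of 1 is enough for `serverErrors` because the server goroutine sends exactly one value before it exits.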
---
## 🛑 Graceful Shutdown Pattern
### The Problem: Abrupt Shutdown
```go
// BAD: Immediate shutdown
func main() {
	server.ListenAndServe()
	// User presses Ctrl+C
	// → Server IMMEDIATELY stops
	// → Ongoing requests are KILLED mid-flight
	// → Data loss, corrupted responses
}
```
**Consequences**:
- User uploading a file → Upload lost
- Database transaction → Data inconsistent
- API call → Client gets network error
### The Solution: Graceful Shutdown
```go
// GOOD: Graceful shutdown
server.Shutdown(ctx) // Waits for ongoing requests to finish
```
**Process**:
1. Stop accepting new connections (close the listeners)
2. Close idle connections
3. Wait for active requests to complete
4. Shut down cleanly
### The Code Breakdown
#### Step 1: Setup Signal Handler
```go
shutdown := make(chan os.Signal, 1)
signal.Notify(shutdown, os.Interrupt, syscall.SIGTERM)
```
**What this does**:
- Creates a channel to receive OS signals
- Tells Go: "When user presses Ctrl+C (`SIGINT`) or system sends `SIGTERM`, send signal to this channel"
**Why `SIGTERM`?**
- Docker uses `SIGTERM` to stop containers
- Kubernetes uses `SIGTERM` before killing pods
- Systemd uses `SIGTERM` to stop services
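To check that the handler really intercepts `SIGTERM`, a process can send the signal to itself. This is a Unix-only sketch (it relies on `syscall.Kill`), and the helper names `receiveOwnSIGTERM` and `gotSIGTERM` are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// receiveOwnSIGTERM registers the same handler as the server code, sends
// SIGTERM to the current process, and returns the signal read from the channel.
func receiveOwnSIGTERM() (os.Signal, error) {
	shutdown := make(chan os.Signal, 1)
	signal.Notify(shutdown, os.Interrupt, syscall.SIGTERM)
	defer signal.Stop(shutdown) // restore default handling afterwards

	// Without the Notify call above, this would terminate the process.
	if err := syscall.Kill(os.Getpid(), syscall.SIGTERM); err != nil {
		return nil, err
	}

	select {
	case sig := <-shutdown:
		return sig, nil
	case <-time.After(2 * time.Second):
		return nil, errors.New("no signal received")
	}
}

// gotSIGTERM reports whether the signal arrived on the channel.
func gotSIGTERM() bool {
	sig, err := receiveOwnSIGTERM()
	return err == nil && sig == syscall.SIGTERM
}

func main() {
	fmt.Println("SIGTERM delivered to channel:", gotSIGTERM())
}
```

The channel is buffered because the `signal` package does not block when delivering: a signal sent while nobody is receiving on an unbuffered channel would be dropped.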
#### Step 2: Wait for Signal
```go
select {
case err := <-serverErrors:
	// Server crashed
	log.Fatalf("server error: %v", err)
case sig := <-shutdown:
	// Shutdown requested
	log.Printf("shutdown signal received: %v", sig)
}
```
**`select` statement**: Waits for **whichever happens first**:
- Server crashes → handle error
- User presses Ctrl+C → initiate shutdown
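The "whichever happens first" behavior can be exercised without a real server or a real signal by pre-loading one of the two channels. In this sketch `waitForStop` and `simulate` are illustrative names, and the error text is made up.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// waitForStop blocks until either channel delivers a value and describes
// which event won.
func waitForStop(serverErrors <-chan error, shutdown <-chan os.Signal) string {
	select {
	case err := <-serverErrors:
		return fmt.Sprintf("server error: %v", err)
	case sig := <-shutdown:
		return fmt.Sprintf("signal: %v", sig)
	}
}

// simulate pre-loads one channel and returns what waitForStop reports.
func simulate(crash bool) string {
	serverErrors := make(chan error, 1)
	shutdown := make(chan os.Signal, 1)
	if crash {
		serverErrors <- errors.New("address already in use")
	} else {
		shutdown <- os.Interrupt // stands in for Ctrl+C; no real signal is sent
	}
	return waitForStop(serverErrors, shutdown)
}

func main() {
	fmt.Println(simulate(true))
	fmt.Println(simulate(false))
}
```

If both channels are ready at the same moment, `select` picks one of the ready cases at random, so neither event has priority.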
#### Step 3: Graceful Shutdown with Timeout
```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(ctx); err != nil {
	log.Printf("⚠️ Graceful shutdown failed, forcing: %v", err)
	server.Close() // Force close
}
```
**What happens**:
1. Create context with 30-second timeout
2. Call `server.Shutdown(ctx)`:
   - Stop accepting new connections
   - Close idle connections
   - Wait for active requests to finish (max 30s)
3. If timeout expires:
   - `Shutdown` stops waiting and returns the context's error (`context.DeadlineExceeded`)
   - Force close the server (kill all connections)
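Both outcomes of this step can be reproduced with one in-flight request and two different timeouts. The sketch below is a self-contained illustration under assumptions: it listens on a random local port, the handler just sleeps, and `shutdownWithin` is a name invented for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// shutdownWithin serves one request that takes handlerDelay, then calls
// Shutdown with the given timeout. It reports whether shutdown was graceful.
func shutdownWithin(handlerDelay, timeout time.Duration) (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // any free local port
	if err != nil {
		return false, err
	}
	started := make(chan struct{})
	var once sync.Once
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		once.Do(func() { close(started) }) // the request is now in flight
		time.Sleep(handlerDelay)           // simulated slow work
		fmt.Fprintln(w, "done")
	})}
	go srv.Serve(ln)
	go func() {
		if resp, err := http.Get("http://" + ln.Addr().String()); err == nil {
			resp.Body.Close()
		}
	}()

	select {
	case <-started:
	case <-time.After(2 * time.Second):
		srv.Close()
		return false, errors.New("request never reached the handler")
	}

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		// The timeout expired with the request still active: force close.
		srv.Close()
		return false, nil
	}
	return true, nil
}

func main() {
	// Fast request, generous timeout: Shutdown waits and succeeds.
	fmt.Println(shutdownWithin(20*time.Millisecond, 2*time.Second))
	// Slow request, tight timeout: Shutdown gives up and the server is closed.
	fmt.Println(shutdownWithin(2*time.Second, 20*time.Millisecond))
}
```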
---
## 🕒 Context and Timeouts
### What is a Context?
> **Context** = Carries deadlines, cancellation signals, and request-scoped values across API boundaries
```go
// Create context with 5-second timeout
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel() // Always call cancel to release resources
// Use context in operation
err := longRunningOperation(ctx)
if errors.Is(err, context.DeadlineExceeded) {
	log.Println("Operation timed out!")
}
```
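The snippet above leaves `longRunningOperation` undefined. A minimal, hypothetical version that respects its context might look like the following; the extra `work` parameter and the `timedOut` helper exist only to make the example runnable.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// longRunningOperation is a hypothetical stand-in: it "works" for the given
// duration but gives up as soon as ctx is cancelled or its deadline passes.
func longRunningOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil // finished in time
	case <-ctx.Done():
		return ctx.Err() // context.DeadlineExceeded or context.Canceled
	}
}

// timedOut runs the operation under a timeout and reports whether the
// deadline was exceeded.
func timedOut(work, timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the timer
	return errors.Is(longRunningOperation(ctx, work), context.DeadlineExceeded)
}

func main() {
	fmt.Println("quick work timed out:", timedOut(time.Millisecond, time.Second))
	fmt.Println("slow work timed out:", timedOut(time.Second, time.Millisecond))
}
```

A context only helps if the code doing the work checks it; an operation that never looks at `ctx.Done()` keeps running past the deadline.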
### Why Context for Shutdown?
```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
server.Shutdown(ctx)
```
**Benefits**:
1. **Prevents infinite waiting**: If requests don't finish in 30s, proceed anyway
2. **Resource cleanup**: `defer cancel()` ensures context resources are freed
3. **Bounded wait only**: The context limits how long `Shutdown` itself waits; it is not passed to handlers, so in-flight requests are not cancelled by it
### Context Hierarchy
```
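context.Background()            ← Root context
└─→ context.WithTimeout(30s)    ← Child context, passed only to server.Shutdown
    (when the timeout expires, Shutdown stops waiting and returns an error)

Request contexts (r.Context()) form a separate tree rooted in the server's base context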
```
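The context given to `Shutdown` only bounds how long `Shutdown` waits; handlers do not receive it. If handlers should be told to wrap up, one option is the server's `BaseContext` field, from which every request context is derived. The sketch below is one possible approach under assumptions, not the project's code, and `handlerSawCancel` is a name invented for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// handlerSawCancel serves one request whose context derives from baseCtx,
// cancels baseCtx mid-request, and reports whether the handler noticed.
func handlerSawCancel() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // any free local port
	if err != nil {
		return false, err
	}
	baseCtx, cancelBase := context.WithCancel(context.Background())
	defer cancelBase()

	started := make(chan struct{})
	saw := make(chan bool, 1)
	var once sync.Once
	srv := &http.Server{
		// Every request context is derived from the context returned here.
		BaseContext: func(net.Listener) context.Context { return baseCtx },
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			once.Do(func() { close(started) })
			select {
			case <-r.Context().Done():
				saw <- true // told to wrap up
			case <-time.After(2 * time.Second):
				saw <- false // never notified
			}
		}),
	}
	defer srv.Close()
	go srv.Serve(ln)
	go func() {
		if resp, err := http.Get("http://" + ln.Addr().String()); err == nil {
			resp.Body.Close()
		}
	}()

	select {
	case <-started:
	case <-time.After(2 * time.Second):
		return false, errors.New("request never reached the handler")
	}
	cancelBase() // in a real server, call this just before srv.Shutdown(ctx)
	return <-saw, nil
}

func main() {
	fmt.Println(handlerSawCancel())
}
```

Handlers still have to watch `r.Context().Done()` themselves; cancellation is a request to stop, not a forced interruption.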
---
## 🎯 Why This Pattern?
### Comparison
| Approach | Pro | Con |
|----------|-----|-----|
| **No goroutine** | Simple | Can't handle shutdown, server blocks forever |
| **Goroutine without channels** | Server runs in background | Can't detect errors, no communication |
| **Our pattern** | Clean shutdown, error handling, production-ready | Slightly more complex |
### Real-World Scenarios
#### Scenario 1: Server Crash
```
1. Server goroutine encounters error (port already in use)
2. Error sent to serverErrors channel
3. Main thread receives error via select
4. Log error and exit with a non-zero status (log.Fatalf calls os.Exit(1))
```
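Scenario 1 is simple to reproduce: occupy a port, then start a server on the same address. In this sketch the port is chosen by the OS, and `startOnBusyPort` and `isStartupFailure` are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
)

// startOnBusyPort occupies a port, then tries to start a server on the same
// address and returns the error delivered through the channel.
func startOnBusyPort() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // first listener takes the port
	if err != nil {
		return err
	}
	defer ln.Close()

	srv := &http.Server{Addr: ln.Addr().String()}
	serverErrors := make(chan error, 1)
	go func() {
		// Returns almost immediately because the bind fails.
		serverErrors <- srv.ListenAndServe()
	}()
	return <-serverErrors
}

// isStartupFailure distinguishes a real failure from a normal stop.
func isStartupFailure(err error) bool {
	return err != nil && !errors.Is(err, http.ErrServerClosed)
}

func main() {
	err := startOnBusyPort()
	fmt.Println("startup failure:", isStartupFailure(err))
	fmt.Println("error:", err)
}
```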
#### Scenario 2: Graceful Deployment (Kubernetes)
```
1. Kubernetes sends SIGTERM (wants to update pod)
2. Signal handler receives SIGTERM
3. Server stops accepting new requests
4. Waits up to 30s for active requests to finish
5. Closes cleanly
6. Kubernetes replaces it with a new pod
→ Combined with a rolling update, in-flight requests are not dropped ✨
```
#### Scenario 3: Developer Stops Server (Ctrl+C)
```
1. Developer presses Ctrl+C (SIGINT)
2. Signal handler receives SIGINT
3. Ongoing PDF generation continues for up to 30s
4. Server shuts down cleanly
5. No corrupted files or broken requests
```
---
## 💼 Production Best Practices
### 1. **Always Use Timeouts**
```go
server := &http.Server{
	Addr:         ":1999",
	Handler:      handler,
	ReadTimeout:  15 * time.Second,  // Max time to read request
	WriteTimeout: 15 * time.Second,  // Max time to write response
	IdleTimeout:  120 * time.Second, // Max time for keep-alive connections
}
```
**Why?**
- Prevents slow clients from tying up server resources
- Protects against slowloris attacks
- Ensures predictable performance
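`http.Server` also has a `ReadHeaderTimeout` field, which limits only the time spent reading request headers and is the setting most directly aimed at clients that send headers very slowly. A sketch of a constructor that sets all four timeouts follows; the `newServer` name and the 5-second value are assumptions for illustration, not values from the project.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns a server with every timeout set. The values are
// illustrative assumptions; tune them for the actual workload.
func newServer(addr string, handler http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           handler,
		ReadHeaderTimeout: 5 * time.Second,   // max time to read request headers
		ReadTimeout:       15 * time.Second,  // max time to read the whole request
		WriteTimeout:      15 * time.Second,  // max time to write the response
		IdleTimeout:       120 * time.Second, // max idle time on keep-alive connections
	}
}

func main() {
	srv := newServer(":1999", http.NotFoundHandler())
	fmt.Println(srv.ReadHeaderTimeout, srv.ReadTimeout, srv.WriteTimeout, srv.IdleTimeout)
}
```

Note that `WriteTimeout` also caps how long a handler may take to respond, so a slow endpoint needs a value at least as large as its longest expected response time.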
### 2. **Use Buffered Channels for Errors**
```go
serverErrors := make(chan error, 1) // Buffer size 1
```
**Why?**
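- Lets the server goroutine send its final error and exit even if nobody is receiving (for example, after the shutdown case has already been taken)
- Allows an early error to be queued until the `select` is reached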
### 3. **Always `defer cancel()` with Contexts**
```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel() // ← CRITICAL: Release resources
```
**Why?**
- Prevents context leaks
- Frees timers and goroutines associated with context
- Go vet will warn if you forget
### 4. **Handle Both SIGINT and SIGTERM**
```go
signal.Notify(shutdown, os.Interrupt, syscall.SIGTERM)
```
**Why?**
- `SIGINT` (Ctrl+C): Developer stopping server locally
- `SIGTERM`: System/orchestrator (Docker, K8s) stopping server
- Ensures shutdown works in all environments
### 5. **Have a Force-Close Fallback**
```go
if err := server.Shutdown(ctx); err != nil {
	log.Printf("⚠️ Graceful shutdown failed, forcing: %v", err)
	server.Close() // ← Force close if graceful fails
}
```
**Why?**
- If requests don't finish in timeout, force close
- Prevents server hanging indefinitely
- Ensures server always stops eventually
---
## 🧪 Testing Shutdown Logic
### Manual Test
```bash
# Terminal 1: Start server
go run main.go
# Terminal 2: Send request that takes 5 seconds
curl "http://localhost:1999/slow-endpoint"
# Terminal 1: Press Ctrl+C while request is active
# → Server waits for request to finish (up to 30s)
# → Then shuts down cleanly
```
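The manual test needs an endpoint that is slow on purpose. The handler below is a hypothetical implementation of `/slow-endpoint` (the real project may not have one); `statusAfter` calls it in-process with `httptest`, so no port is opened.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowHandler simulates slow work. It is a hypothetical handler for the
// /slow-endpoint path used in the curl command above.
func slowHandler(delay time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			fmt.Fprintln(w, "finished slow work")
		case <-r.Context().Done():
			// The client went away or the request was cancelled.
			return
		}
	})
}

// statusAfter invokes the handler directly and returns the response status.
func statusAfter(delay time.Duration) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/slow-endpoint", nil)
	slowHandler(delay).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusAfter(10 * time.Millisecond))
}
```

To use it in the manual test, register it on the server's mux, for example `mux.Handle("/slow-endpoint", slowHandler(5*time.Second))`, where `mux` is whatever router the server uses.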
### Simulating SIGTERM (Production)
```bash
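# Build and start the binary directly, so the signal reaches the server
# itself rather than the `go run` wrapper (cv-server is an arbitrary name)
go build -o cv-server .
./cv-server &
# Send SIGTERM (like Kubernetes would) to the background job's PID
kill -TERM $!
# Server should shut down gracefully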
```
---
## 📚 Key Concepts Summary
### Goroutines
- **Lightweight threads** managed by Go runtime
- Use `go func()` to run functions concurrently
- Cheap to create (stacks start at ~2KB, versus roughly a megabyte or more for OS threads)
### Channels
- **Communication pipes** between goroutines
- Use `<-` to send/receive values
- Buffered channels can store values when no receiver ready
### Select
- **Multiplex** on multiple channel operations
- Blocks until one case can proceed
- Used to wait for first of multiple events
### Context
- **Carries deadlines** and cancellation signals
- Use `WithTimeout` to set operation deadlines
- Always `defer cancel()` to prevent leaks
### Graceful Shutdown
- **Stop accepting** new requests
- **Wait** for active requests to finish (with timeout)
- **Force close** if timeout expires
---
## 🎓 Interview Talking Points
### "Why do you use goroutines for the HTTP server?"
> "I use a goroutine to run `ListenAndServe()` because it's a blocking call—it runs forever handling requests. By launching it in a goroutine, the main thread remains free to set up graceful shutdown logic. This pattern allows me to handle OS signals like SIGTERM (from Kubernetes) and SIGINT (Ctrl+C) to shut down cleanly, ensuring ongoing requests finish before the server stops."
### "How do you handle server errors in a goroutine?"
> "I use a buffered channel (`make(chan error, 1)`) to communicate errors from the server goroutine back to the main thread. The goroutine sends any error from `ListenAndServe()` to this channel, and the main thread uses a `select` statement to wait for either an error or a shutdown signal, whichever comes first."
### "What is graceful shutdown and why is it important?"
> "Graceful shutdown means stopping the server without killing active requests. When I receive a shutdown signal, I call `server.Shutdown()` with a context that has a 30-second timeout. This stops accepting new connections but waits for ongoing requests to complete naturally. If the timeout expires, I force-close as a fallback. This prevents data loss and gives clients a clean response instead of a broken connection."
### "Why use context with timeout?"
> "Context with timeout ensures graceful shutdown doesn't wait forever. If some requests are hanging, the 30-second timeout ensures the server shuts down eventually. The context only bounds how long `Shutdown` waits; it does not cancel in-flight handlers, which is why I force-close as a fallback. Without timeout, a misbehaving request could prevent the server from ever shutting down."
---
## 🔗 Further Reading
### Official Go Documentation
- [Effective Go: Concurrency](https://go.dev/doc/effective_go#concurrency)
- [Go Blog: Concurrency Patterns](https://go.dev/blog/pipelines)
- [Go Blog: Context](https://go.dev/blog/context)
### Server Patterns
- [Graceful Shutdown in Go](https://pkg.go.dev/net/http#Server.Shutdown)
- [Go HTTP Server Guide](https://github.com/golang/go/wiki/HttpServerShutdown)
### Books
- "Concurrency in Go" - Katherine Cox-Buday
- "Go in Practice" - Matt Butcher & Matt Farina
---
**Practice Project**: CV Server (github.com/juanatsap/cv-site)