Backend

Hardening gRPC Services in Go

Deadlines & retries Require client deadlines; enforce server-side context with grpc.DeadlineExceeded handling. Configure retry/backoff on idempotent calls; avoid retry storms with jitter + max attempts. Interceptors Unary/stream interceptors for auth, metrics (Prometheus), logging, and panic recovery. Use per-RPC circuit breakers and rate limits for critical dependencies. TLS & auth Enable TLS everywhere; prefer mTLS for internal services. Rotate certs automatically; watch expiry metrics. Add authz checks in interceptors; propagate identity via metadata. Resource protection Limit concurrent streams and max message sizes. Bounded worker pools for handlers performing heavy work. Tune keepalive to detect dead peers without flapping. Observability Metrics: latency, error codes, message sizes, active streams, retries. Traces: annotate methods, peer info, attempt counts; sample smartly. Logs: structured fields for method, code, duration, peer. Checklist Deadlines required; retries only for idempotent calls with backoff. Interceptors for auth/metrics/logging/recovery. TLS/mTLS enabled; cert rotation automated. Concurrency and message limits set; keepalive tuned.

Java Virtual Threads (Loom) for IO-heavy Services

When it shines IO-heavy workloads with many concurrent requests. Simplifies thread-per-request code without callback hell. Great for blocking JDBC (with drivers that release threads), HTTP clients, and file IO. Caveats Avoid blocking operations that pin VTs (synchronized blocks, some native calls). Watch libraries that block on locks; prefer async-friendly drivers when possible. Pinning shows as carrier thread exhaustion; monitor. Usage Executors: Executors.newVirtualThreadPerTaskExecutor(). For servers (e.g., Spring): set spring.threads.virtual.enabled=true (Spring Boot 3.2+). Keep per-request timeouts; use structured concurrency where possible. Observability Metrics: carrier thread pool usage, VT creation rate, blocked/pinned threads. Profiling: use JDK Flight Recorder; check for pinning events. Checklist Dependencies vetted for blocking/pinning. Timeouts on all IO; circuit breakers still apply. Dashboards for carrier thread utilization and pinning. Load test before prod; compare throughput/latency vs platform threads.

PHP-FPM Tuning Guide

Process manager modes pm=dynamic for most apps; pm=static only when workload is predictable and memory bounded. Key knobs: pm.max_children, pm.start_servers, pm.min_spare_servers, pm.max_spare_servers. Size max_children = (available RAM - OS/webserver/DB) / avg worker RSS. Opcache Enable: opcache.enable=1, opcache.enable_cli=0, opcache.memory_consumption sized for codebase, opcache.interned_strings_buffer, opcache.max_accelerated_files. Avoid opcache.revalidate_freq=0 in prod unless you control deploy restarts; prefer deploy-triggered reloads. Timeouts & queues Keep request_terminate_timeout sane (e.g., 30s-60s); long requests move to queues. Use pm.max_requests to recycle leaky workers (e.g., 500-2000). Watch slowlog to catch blocking I/O or heavy CPU. Observability Export status_path and scrape: active/idle/slow requests, max children reached. Correlate with Nginx/Apache logs for upstream latency and 502/504s. Alert on max children reached, slowlog entries, and rising worker RSS. Checklist Pool sizing validated under load test. Opcache enabled and sized; reload on deploy. Timeouts/queues tuned; slowlog on. Status endpoint protected and scraped.