Deadlines & retries
- Require client deadlines; enforce server-side context with grpc.DeadlineExceeded handling.
- Configure retry/backoff on idempotent calls; avoid retry storms with jitter + max attempts.
Interceptors
- Unary/stream interceptors for auth, metrics (Prometheus), logging, and panic recovery.
- Use per-RPC circuit breakers and rate limits for critical dependencies.
TLS & auth
- Enable TLS everywhere; prefer mTLS for internal services.
- Rotate certs automatically; watch expiry metrics.
- Add authz checks in interceptors; propagate identity via metadata.
Resource protection
- Limit concurrent streams and max message sizes.
- Use bounded worker pools for handlers that perform heavy work.
- Tune keepalive to detect dead peers without flapping.
Observability
- Metrics: latency, error codes, message sizes, active streams, retries.
- Traces: annotate spans with method, peer info, and attempt count; sample so that errors and slow calls are retained.
- Logs: structured fields for method, code, duration, peer.
Checklist
- Deadlines required; retries only for idempotent calls with backoff.
- Interceptors for auth/metrics/logging/recovery.
- TLS/mTLS enabled; cert rotation automated.
- Concurrency and message limits set; keepalive tuned.