Metrics

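  • Use Micrometer + Prometheus: add micrometer-registry-prometheus and set management.endpoints.web.exposure.include=prometheus,health,info.
  • JVM and HikariCP pool meters are registered automatically once Actuator is on the classpath; set server.tomcat.mbeanregistry.enabled=true for the full set of Tomcat meters. Publish latency histograms (management.metrics.distribution.percentiles-histogram.http.server.requests=true) so percentiles can be aggregated across instances.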
  • Create SLIs: request latency, error rate, saturation (threads/connections), GC pauses.
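
To make the latency and error-rate SLIs concrete, here is a minimal stdlib-only sketch of the arithmetic behind them. The class and method names (SliSketch, percentile, errorRate) are hypothetical and not part of Micrometer or Spring; in a real service the meter registry and Prometheus would do this work from histogram buckets.

```java
import java.util.Arrays;

/** Hypothetical helper, not a Micrometer or Spring API: computes two SLIs from raw samples. */
final class SliSketch {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    /** Error-rate SLI: share of requests that ended with a 5xx status. */
    static double errorRate(int[] statusCodes) {
        if (statusCodes.length == 0) {
            return 0.0;
        }
        long errors = Arrays.stream(statusCodes).filter(s -> s >= 500 && s <= 599).count();
        return (double) errors / statusCodes.length;
    }

    public static void main(String[] args) {
        // Made-up samples: nine fast requests and one slow outlier.
        long[] latencies = {12, 15, 11, 14, 13, 16, 12, 15, 14, 900};
        System.out.println("mean ms = " + Arrays.stream(latencies).average().orElse(0));
        System.out.println("p50 ms  = " + percentile(latencies, 50));
        System.out.println("p99 ms  = " + percentile(latencies, 99));
        System.out.println("errors  = " + errorRate(new int[] {200, 200, 500, 404, 503}));
    }
}
```

With these made-up samples the mean is 102.2 ms while p50 is 14 ms and p99 is 900 ms, which is why a latency SLI is usually stated as a percentile rather than an average.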

Traces

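  • Spring Boot 3 auto-configures tracing through Actuator and Micrometer Tracing rather than a dedicated OTel starter: add spring-boot-starter-actuator + micrometer-tracing-bridge-otel + an exporter (opentelemetry-exporter-otlp or opentelemetry-exporter-zipkin; Jaeger accepts OTLP directly).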
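  • Propagate the W3C traceparent header across service calls; wrap async executors so the trace context follows the task, e.g. with Spring's ContextPropagatingTaskDecorator (Framework 6.1+) or Micrometer's ContextExecutorService.wrap(...).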
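  • Sample deliberately: management.tracing.sampling.probability sets the head-sampling rate (default 0.1). Noisy paths can be sampled less at the head, but keeping more error traces needs tail-based sampling (typically in the collector), because the outcome is unknown when a head sampler decides.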
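
The executor point is easiest to see in a small stdlib-only sketch. It is not how Micrometer or Spring implement propagation; TraceContextSketch, CURRENT_TRACE_ID and propagating are hypothetical names used only to show why a thread-local trace ID is lost on a pool thread unless the task is wrapped at submission time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a tracing context; real tracers manage this for you. */
final class TraceContextSketch {

    // Thread-local slot holding the current trace ID (null when no trace is active).
    static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();

    /** Captures the caller's trace ID now and restores it around the task when it runs. */
    static <T> Callable<T> propagating(Callable<T> task) {
        String captured = CURRENT_TRACE_ID.get(); // read on the submitting thread
        return () -> {
            String previous = CURRENT_TRACE_ID.get();
            CURRENT_TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                // Restore so pooled threads do not leak context into unrelated tasks.
                if (previous == null) {
                    CURRENT_TRACE_ID.remove();
                } else {
                    CURRENT_TRACE_ID.set(previous);
                }
            }
        };
    }

    /** Runs a task on a worker thread and returns the trace ID that thread observed. */
    static String traceIdSeenByWorker(String callerTraceId, boolean wrap) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            CURRENT_TRACE_ID.set(callerTraceId);
            Callable<String> task = () -> CURRENT_TRACE_ID.get();
            Callable<String> toRun = wrap ? propagating(task) : task;
            return pool.submit(toRun).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_TRACE_ID.remove();
            pool.shutdown();
        }
    }
}
```

Calling traceIdSeenByWorker(id, false) returns null because the worker thread never saw the caller's thread-local, while traceIdSeenByWorker(id, true) returns the caller's ID. In an application, prefer the library-provided wrappers over a hand-rolled one like this.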

Logs

  • Use a JSON layout; include traceId/spanId in every line for correlation (Micrometer Tracing puts both in the MDC).
  • Avoid verbose INFO in hot paths; keep payload size bounded.
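
As a rough illustration of both log bullets, the sketch below builds one JSON log line that carries traceId and spanId and caps the message length. JsonLogLineSketch and the 200-character bound are assumptions made for this example; in practice a JSON encoder for the logging framework would produce the line, and the bound would depend on the log pipeline.

```java
/** Hypothetical formatter; a real service would use its logging framework's JSON encoder. */
final class JsonLogLineSketch {

    static final int MAX_MESSAGE_CHARS = 200; // assumed bound, tune for your log pipeline

    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become six-character hex escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Cuts the message at the bound and marks the cut so readers know it is incomplete. */
    static String bounded(String message) {
        if (message.length() <= MAX_MESSAGE_CHARS) {
            return message;
        }
        // Simplification: a cut at a fixed index may split a surrogate pair.
        return message.substring(0, MAX_MESSAGE_CHARS) + "...[truncated]";
    }

    /** Builds a single-line JSON object with the correlation fields next to the message. */
    static String line(String level, String traceId, String spanId, String message) {
        return "{\"level\":\"" + escape(level)
                + "\",\"traceId\":\"" + escape(traceId)
                + "\",\"spanId\":\"" + escape(spanId)
                + "\",\"message\":\"" + escape(bounded(message)) + "\"}";
    }
}
```

Keeping traceId and spanId as top-level fields lets a log backend index them, so a trace view can link straight to the matching log lines.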

Dashboards & alerts

  • Latency/error SLO dashboards per endpoint.
  • DB pool saturation, thread pool queue depth, GC pause, heap used %, 5xx rate.
  • Alert on SLO burn rates; use exemplars to jump from a metric to a trace, and the shared traceId to get from that trace to its logs.
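
Burn-rate alerting comes down to simple arithmetic, sketched here with the standard library only. BurnRateSketch, its method names, and the 14.4 threshold with a two-window check are illustrative assumptions in the style of common multi-window alert policies, not a prescription; real alerts would normally be written as rules in the metrics backend.

```java
/** Hypothetical helper illustrating burn-rate arithmetic; thresholds are assumptions. */
final class BurnRateSketch {

    /** Burn rate = observed error ratio divided by the error budget (1 - SLO target). */
    static double burnRate(long failed, long total, double sloTarget) {
        if (sloTarget <= 0.0 || sloTarget >= 1.0) {
            throw new IllegalArgumentException("sloTarget must be between 0 and 1 (exclusive)");
        }
        if (total <= 0) {
            return 0.0; // no traffic, nothing burned
        }
        double errorRatio = (double) failed / total;
        double errorBudget = 1.0 - sloTarget;
        return errorRatio / errorBudget;
    }

    /** Fires only when both windows burn faster than the threshold, which filters short blips. */
    static boolean shouldPage(double longWindowBurn, double shortWindowBurn, double threshold) {
        return longWindowBurn >= threshold && shortWindowBurn >= threshold;
    }

    public static void main(String[] args) {
        double slo = 0.999;                               // 99.9% availability target
        double oneHour = burnRate(2_000, 100_000, slo);   // 2% errors over the last hour
        double fiveMin = burnRate(10, 10_000, slo);       // 0.1% errors over the last 5 minutes
        System.out.println("1h burn rate  = " + oneHour);
        System.out.println("5m burn rate  = " + fiveMin);
        System.out.println("page on-call? = " + shouldPage(oneHour, fiveMin, 14.4));
    }
}
```

A burn rate of 1 spends the error budget exactly over the SLO period; 14.4 sustained for one hour spends about 2% of a 30-day budget. In the example the hourly window is burning at roughly 20 but the 5-minute window is back near 1, so the page is suppressed because the problem has already subsided.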

Checklist

  • Actuator endpoints secured and exposed only where needed.
  • OTLP exporter configured; sampling tuned.
  • Trace/log correlation verified in staging.
  • Dashboards + alerts reviewed with the on-call team.