Containerized systems have unique failure modes. Here’s how to identify and prevent common issues.
1. Resource Exhaustion
Memory Limits
# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
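To see how close a running container is to the limit above, one option is to read the cgroup files from inside it. The following is a minimal sketch, assuming a cgroup v2 host where the limit and current usage appear as `memory.max` and `memory.current` under `/sys/fs/cgroup`; the helper names are illustrative, not part of any Docker API.

```python
from pathlib import Path
from typing import Optional

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumption: cgroup v2 unified hierarchy


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when the file says "max" (no limit)."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


def memory_usage_ratio(current: int, limit: Optional[int]) -> Optional[float]:
    """Fraction of the memory limit in use, or None when no limit is set."""
    if limit is None or limit <= 0:
        return None
    return current / limit


def read_memory_usage_ratio(root: Path = CGROUP_ROOT) -> Optional[float]:
    """Read current usage and limit from the cgroup files under `root`."""
    current = int((root / "memory.current").read_text().strip())
    limit = parse_limit((root / "memory.max").read_text())
    return memory_usage_ratio(current, limit)
```

A ratio that stays near 1.0 suggests the container is at risk of being OOM-killed and that the limit or the application's memory use needs another look.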
CPU Throttling
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.0'
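Throttling is easy to miss because the container keeps running, only slower. As a rough sketch, assuming a cgroup v2 host where `/sys/fs/cgroup/cpu.stat` reports `nr_periods` and `nr_throttled`, the share of scheduling periods in which the container was throttled can be computed like this (function names are illustrative):

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse "key value" lines, as found in a cgroup v2 cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_fraction(stats: Dict[str, int]) -> float:
    """Share of enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods
```

For example, `throttled_fraction(parse_cpu_stat(open("/sys/fs/cgroup/cpu.stat").read()))` run inside the container gives a single number to track over time.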
2. Container Restart Loops
Health Checks
# Dockerfile
# Assumes curl is installed in the image and the app serves /health on port 8080
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
  CMD curl -f http://localhost:8080/health || exit 1
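The health check above only works if the application actually answers on `/health`. The article does not show the application side, so the following is a hypothetical sketch using only the Python standard library; a real service would usually also verify its own dependencies before replying with 200.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and everything else with 404."""

    def do_GET(self):
        if self.path == "/health":
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep periodic health probes out of the container logs


def make_server(port: int = 8080) -> HTTPServer:
    """Bind on all interfaces; port 8080 matches the HEALTHCHECK URL."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_server().serve_forever()` from the container's entry point would make the `curl -f` probe succeed once the server is listening.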
Restart Policies
services:
  app:
    restart: unless-stopped
    # Options: "no" (must be quoted in YAML), always, on-failure, unless-stopped
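A container stuck in a restart loop does not restart at a constant rate. The Docker run reference describes an increasing delay that starts at 100 ms and doubles before each restart, resetting once the container has stayed up for a while. The sketch below only illustrates that doubling schedule; the function name is hypothetical and the code models no upper bound or reset.

```python
from typing import List


def restart_delays_ms(attempts: int, initial_ms: int = 100) -> List[int]:
    """Delay before each of the first `attempts` restarts, doubling every time."""
    return [initial_ms * (2 ** i) for i in range(attempts)]
```

This is why a crash-looping container appears to restart quickly at first and then more and more slowly.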
3. Network Issues
Port Conflicts
services:
  app:
    ports:
      - "8080:8080" # host:container
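A port conflict shows up when the host side of the mapping is already taken by another process. As a quick pre-flight check, a script can try to bind the port itself. This is a sketch with an illustrative function name; it checks TCP on IPv4 only, and the result can change between the check and the container start.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True
```

If `port_is_free(8080)` returns False, either stop the process holding the port or change the host side of the mapping, for example to "8081:8080".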
DNS Resolution
services:
  app:
    dns:
      - 8.8.8.8
      - 8.8.4.4
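Before overriding DNS servers, it helps to confirm that name resolution is really what fails. A small check run inside the container can do that; the sketch below uses the standard library resolver, and the function name is illustrative.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated addresses for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

An empty list for an external name, while a sibling service name on the same Compose network resolves, points at upstream DNS rather than at the container network itself.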
4. Volume Mount Problems
Permission Issues
# Fix permissions: the user must exist before it can own /app
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser
Volume Mounts
services:
  app:
    volumes:
      - ./data:/app/data:ro # Read-only bind mount
      - cache:/app/cache    # Named volume

# Named volumes must be declared at the top level
volumes:
  cache:
5. Image Layer Caching
Optimize Dockerfile
# Bad: any source change invalidates the cache for npm install
COPY . .
RUN npm install

# Good: dependencies are reinstalled only when package*.json changes
COPY package*.json ./
RUN npm install
COPY . .
6. Log Management
Log Rotation
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
7. Security Issues
Non-Root User
RUN useradd -m appuser
USER appuser
Secrets Management
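services:
  app:
    secrets:
      - db_password
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password

# Secrets must be declared at the top level; the file path is an assumed example
secrets:
  db_password:
    file: ./db_password.txt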
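The `DB_PASSWORD_FILE` variable only helps if the application reads the file it points to. The article does not show that side, so here is a hedged sketch of the common `*_FILE` convention: prefer the file, fall back to a plain environment variable. The `read_secret` helper is hypothetical.

```python
import os
from pathlib import Path


def read_secret(name: str) -> str:
    """Return the secret from the file named by <name>_FILE, else from <name>."""
    file_path = os.environ.get(f"{name}_FILE")
    if file_path:
        # Drop only the trailing newline that editors and echo tend to add.
        return Path(file_path).read_text().rstrip("\r\n")
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"neither {name}_FILE nor {name} is set")
    return value
```

With the Compose configuration above, `read_secret("DB_PASSWORD")` would read `/run/secrets/db_password`, so the password never has to appear in the environment or in the image.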
Prevention Strategies
- Set resource limits
- Implement health checks
- Use proper restart policies
- Monitor container metrics
- Test failure scenarios
- Use orchestration tools (Kubernetes, Docker Swarm)
Conclusion
Prevent container failures through:
- Resource management
- Health monitoring
- Proper configuration
- Security best practices
- Regular testing
Build resilient containerized systems! 🐳