Distributed Application Architecture Patterns

7 Resilience and Reliability Patterns

This chapter presents patterns that combat network unreliability and service instability, both of which are inevitable in distributed systems. Because these problems are so pervasive, it is the largest chapter in this work.

  1. The Bulkheads pattern in § 7.1 limits the impact of a failing service on the rest of the system.

  2. The Queue-Based Load Levelling pattern in § 7.2 evens out the load between services.

  3. The Retry pattern in § 7.3 guards against transient failures.

  4. The Health Monitoring pattern in § 7.4 ensures the system is running as expected.

  5. The Rate Limiting pattern in § 7.5 protects services from misuse.

  6. The Leader and Followers pattern in § 7.6 coordinates services in a cluster without a natural leader.
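To give a flavour of how small such a pattern can be in code, the Retry pattern from item 3 might be sketched as below. This is a minimal illustration rather than the formulation given in § 7.3: the names `retry` and `TransientError`, the exponential backoff schedule, and the default parameters are all assumptions made for this example.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed.

    The wait between calls doubles each time (base_delay, 2*base_delay, ...).
    Only TransientError is retried; any other exception propagates at once.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap a remote call, for example `retry(lambda: client.fetch(order_id))`, where `client.fetch` is a hypothetical function that raises `TransientError` on a timeout. Passing `sleep` as a parameter is a design choice that lets the backoff be observed in tests without actually waiting.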

Other patterns that could be considered part of this category include [92]:

Notable omissions in this category due to the methodology described in § 1 include: