Distributed Application Architecture Patterns

7 Resilience and Reliability Patterns

This chapter presents patterns that counter network unreliability and service instability, both of which are inevitable in distributed systems. Because these concerns are so pervasive, it is the largest chapter in this work.

The Bulkheads pattern in § 7.1 limits the impact of a failing service on the rest of the system
The Queue-Based Load Levelling pattern in § 7.2 evens out the load between services
The Retry pattern in § 7.3 guards against transient failures
The Health Monitoring pattern in § 7.4 ensures the system is running as expected
The Rate Limiting pattern in § 7.5 protects services from misuse
The Leader and Followers pattern in § 7.6 coordinates services in a cluster without a natural leader
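As a taste of what these patterns look like in practice, the Retry idea from § 7.3 can be sketched in a few lines. This is a minimal illustration and not the implementation discussed in that section: the `TransientError` class, the `retry` and `make_flaky` helpers, and the default values are all hypothetical names chosen for this example, and exponential backoff with jitter is only one possible retry policy.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation, retrying on TransientError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            # Double the delay after each failure and add random jitter so
            # that many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


def make_flaky(failures):
    """Return an operation that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("temporary outage")
        return "ok"

    return operation
```

Calling `retry(make_flaky(2), attempts=3)` succeeds on the third call, whereas an operation that keeps failing re-raises its last `TransientError` once the attempts are exhausted. Only errors assumed to be transient are retried; any other exception propagates immediately.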

Other patterns that could be considered part of this category include [92]:

Message brokers can improve communication reliability
Circuit Breaker improves overall service reliability during failures of its dependencies
Ambassador and Offload to Gateway present opportunities to implement these patterns
Competing Consumers & Load Balancer provide natural backups for failed services
Claim Check can help recover message contents in the event of a disaster, while Partitioning or Identity Provider can limit the amount of information lost
Chapter 10 discusses patterns to ensure consistency during failures
Backend for Frontend (BFF) can be used to tailor reliability to the different requirements of clients
Gateway Aggregation can improve reliability when a client is slow or on an unreliable network
Strangler Fig improves reliability during transitional periods
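Of the related patterns listed above, Circuit Breaker is perhaps the easiest to illustrate in code. The following is a rough sketch under stated assumptions, not the design presented elsewhere in this work: the `CircuitBreaker` and `CircuitOpenError` names, the failure threshold, and the reset timeout are all hypothetical, and the sketch is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of calling the dependency.
                raise CircuitOpenError("dependency marked as failing")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures,
            # (re)opens the circuit and restarts the timeout.
            if (self.opened_at is not None
                    or self.failures >= self.failure_threshold):
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers receive `CircuitOpenError` immediately rather than waiting on a dependency that is known to be failing. After the timeout a single trial call decides whether the circuit closes again.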

Notable omissions in this category due to the methodology described in § 1 include:

Cache-Aside and Throttling by Microsoft [93, 94], although common techniques, do not add anything new to the discussion in the context of distributed systems and are not recognised as patterns by other authors
Compensating Transaction by Microsoft [95] is discussed as part of Sagas
