Physical replication is one of the strengths of PostgreSQL and one of the reasons why some of the largest organizations in the world have chosen it for the management of their data in business continuity contexts. Primarily used to achieve high availability, physical replication also allows scale-out of read-only workloads and offloading of some work from the primary.
This section is about replication within the same Cluster resource managed in the same Kubernetes cluster. For information about how to replicate with another Postgres Cluster resource, even across different Kubernetes clusters, please refer to the "Replica clusters" section.
Having contributed throughout the years to the replication feature in PostgreSQL, we have decided to build high availability in CloudNativePG on top of the native physical replication technology, and integrate it directly in the Kubernetes API.
In Kubernetes terms, this is referred to as application-level replication, in contrast with storage-level replication.
A very mature technology
PostgreSQL has a very robust and mature native framework for replicating data from the primary instance to one or more replicas, built around the concept of transactional changes continuously stored in the WAL (Write Ahead Log).
Started as the evolution of crash recovery and point in time recovery technologies, physical replication was first introduced in PostgreSQL 8.2 (2006) through WAL shipping from the primary to a warm standby in continuous recovery.
PostgreSQL 9.0 (2010) enhanced it with WAL streaming and read-only replicas via hot standby, while 9.1 (2011) introduced synchronous replication at the transaction level (for RPO=0 clusters). Cascading replication was released with PostgreSQL 9.2 (2012). The foundations of logical replication were laid in PostgreSQL 9.4, while version 10 (2017) introduced native support for the publisher/subscriber pattern to replicate data from an origin to a destination.
Streaming replication support
At the moment, CloudNativePG natively and transparently manages physical streaming replicas within a cluster in a declarative way, based on the number of provided instances in the spec:

replicas = instances - 1 (where instances > 0)
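
For example, a minimal Cluster manifest like the following sketch (the resource name and storage size are illustrative) results in one primary and two physical streaming replicas:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3   # replicas = instances - 1 = 2
  storage:
    size: 1Gi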
Immediately after the initialization of a cluster, the operator creates a user
streaming_replica as follows:
CREATE USER streaming_replica WITH REPLICATION;
  -- NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOBYPASSRLS
Due to a pg_rewind requirement, in PostgreSQL 10 the user is created with SUPERUSER privileges.
Out of the box, the operator automatically sets up streaming replication within the cluster over an encrypted channel and enforces TLS client certificate authentication for the streaming_replica user, as highlighted by the following excerpt taken from pg_hba.conf:
# Require client certificate authentication for the streaming_replica user
hostssl postgres     streaming_replica all cert
hostssl replication  streaming_replica all cert
For details on how CloudNativePG manages certificates, please refer to the "Certificates" section in the documentation.
Continuous backup integration
In case continuous backup is configured in the cluster, CloudNativePG transparently configures replicas to take advantage of restore_command in continuous recovery. As a result, PostgreSQL can use the WAL archive as a fallback option whenever pulling WALs via streaming replication fails.
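
As an illustrative sketch, continuous backup can be enabled through the backup stanza of the Cluster spec; the object store path and credential secret below are assumptions, not working values:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
  backup:
    # Object store used for WAL archiving and base backups;
    # replicas fall back to this archive if streaming is interrupted
    barmanObjectStore:
      destinationPath: s3://backups/cluster-example/
      s3Credentials:
        accessKeyId:
          name: aws-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY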
CloudNativePG supports the configuration of quorum-based synchronous streaming replication via two configuration options called minSyncReplicas and maxSyncReplicas, which are the minimum and the maximum number of expected synchronous standby replicas available at any time.
For self-healing purposes, the operator always compares these two values with
the number of available replicas to determine the quorum.
Synchronous replication is disabled by default (minSyncReplicas and maxSyncReplicas are not defined).
In case both minSyncReplicas and maxSyncReplicas are set, CloudNativePG automatically updates the synchronous_standby_names option in PostgreSQL to the following value:
ANY q (pod1, pod2, ...)
q is an integer automatically calculated by the operator to be:
1 <= minSyncReplicas <= q <= maxSyncReplicas <= readyReplicas
pod1, pod2, ... is the list of all PostgreSQL pods in the cluster
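
As a sketch under the options described above (cluster name is illustrative), the following three-instance cluster enables quorum-based synchronous replication; with both replicas ready, the operator computes q so that 1 <= q <= 2 and lists the pod names in synchronous_standby_names:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  # quorum-based synchronous replication:
  # wait for at least 1 and at most 2 synchronous standbys
  minSyncReplicas: 1
  maxSyncReplicas: 2
  storage:
    size: 1Gi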
To provide self-healing capabilities, the operator can ignore minSyncReplicas if that value is higher than the currently available number of replicas. Synchronous replication is automatically disabled when readyReplicas is 0.
As stated in the PostgreSQL documentation, ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least the requested number of synchronous standbys in the list.
Even though the operator chooses self-healing over enforcement of
synchronous replication settings, our recommendation is to plan for
synchronous replication only in clusters with 3+ instances or,
more generally, when
maxSyncReplicas < (instances - 1).