<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML ><HEAD ><TITLE >Log-Shipping Standby Servers</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK REV="MADE" HREF="mailto:pgsql-docs@postgresql.org"><LINK REL="HOME" TITLE="PostgreSQL 9.6.21 Documentation" HREF="index.html"><LINK REL="UP" TITLE="High Availability, Load Balancing, and Replication" HREF="high-availability.html"><LINK REL="PREVIOUS" TITLE="Comparison of Different Solutions" HREF="different-replication-solutions.html"><LINK REL="NEXT" TITLE="Failover" HREF="warm-standby-failover.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="stylesheet.css"><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"><META NAME="creation" CONTENT="2021-02-27T18:26:08"></HEAD ><BODY CLASS="SECT1" ><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="4" ALIGN="center" VALIGN="bottom" ><A HREF="index.html" >PostgreSQL 9.6.21 Documentation</A ></TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A TITLE="Comparison of Different Solutions" HREF="different-replication-solutions.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A HREF="high-availability.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="60%" ALIGN="center" VALIGN="bottom" >Chapter 26. High Availability, Load Balancing, and Replication</TD ><TD WIDTH="20%" ALIGN="right" VALIGN="top" ><A TITLE="Failover" HREF="warm-standby-failover.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><DIV CLASS="SECT1" ><H1 CLASS="SECT1" ><A NAME="WARM-STANDBY" >26.2. Log-Shipping Standby Servers</A ></H1 ><P > Continuous archiving can be used to create a <I CLASS="FIRSTTERM" >high availability</I > (HA) cluster configuration with one or more <I CLASS="FIRSTTERM" >standby servers</I > ready to take over operations if the primary server fails. This capability is widely referred to as <I CLASS="FIRSTTERM" >warm standby</I > or <I CLASS="FIRSTTERM" >log shipping</I >. </P ><P > The primary and standby server work together to provide this capability, though the servers are only loosely coupled. The primary server operates in continuous archiving mode, while each standby server operates in continuous recovery mode, reading the WAL files from the primary. No changes to the database tables are required to enable this capability, so it offers low administration overhead compared to some other replication solutions. This configuration also has relatively low performance impact on the primary server. </P ><P > Directly moving WAL records from one database server to another is typically described as log shipping. <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > implements file-based log shipping by transferring WAL records one file (WAL segment) at a time. WAL files (16MB) can be shipped easily and cheaply over any distance, whether it be to an adjacent system, another system at the same site, or another system on the far side of the globe. The bandwidth required for this technique varies according to the transaction rate of the primary server. Record-based log shipping is more granular and streams WAL changes incrementally over a network connection (see <A HREF="warm-standby.html#STREAMING-REPLICATION" >Section 26.2.5</A >). </P ><P > It should be noted that log shipping is asynchronous, i.e., the WAL records are shipped after transaction commit. As a result, there is a window for data loss should the primary server suffer a catastrophic failure; transactions not yet shipped will be lost. The size of the data loss window in file-based log shipping can be limited by use of the <TT CLASS="VARNAME" >archive_timeout</TT > parameter, which can be set as low as a few seconds. However such a low setting will substantially increase the bandwidth required for file shipping. Streaming replication (see <A HREF="warm-standby.html#STREAMING-REPLICATION" >Section 26.2.5</A >) allows a much smaller window of data loss. </P ><P > Recovery performance is sufficiently good that the standby will typically be only moments away from full availability once it has been activated. As a result, this is called a warm standby configuration which offers high availability. Restoring a server from an archived base backup and rollforward will take considerably longer, so that technique only offers a solution for disaster recovery, not high availability. A standby server can also be used for read-only queries, in which case it is called a Hot Standby server. See <A HREF="hot-standby.html" >Section 26.5</A > for more information. </P ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="STANDBY-PLANNING" >26.2.1. Planning</A ></H2 ><P > It is usually wise to create the primary and standby servers so that they are as similar as possible, at least from the perspective of the database server. In particular, the path names associated with tablespaces will be passed across unmodified, so both primary and standby servers must have the same mount paths for tablespaces if that feature is used. Keep in mind that if <A HREF="sql-createtablespace.html" >CREATE TABLESPACE</A > is executed on the primary, any new mount point needed for it must be created on the primary and all standby servers before the command is executed. Hardware need not be exactly the same, but experience shows that maintaining two identical systems is easier than maintaining two dissimilar ones over the lifetime of the application and system. In any case the hardware architecture must be the same — shipping from, say, a 32-bit to a 64-bit system will not work. </P ><P > In general, log shipping between servers running different major <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > release levels is not possible. It is the policy of the PostgreSQL Global Development Group not to make changes to disk formats during minor release upgrades, so it is likely that running different minor release levels on primary and standby servers will work successfully. However, no formal support for that is offered and you are advised to keep primary and standby servers at the same release level as much as possible. When updating to a new minor release, the safest policy is to update the standby servers first — a new minor release is more likely to be able to read WAL files from a previous minor release than vice versa. </P ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="STANDBY-SERVER-OPERATION" >26.2.2. Standby Server Operation</A ></H2 ><P > In standby mode, the server continuously applies WAL received from the master server. The standby server can read WAL from a WAL archive (see <A HREF="archive-recovery-settings.html#RESTORE-COMMAND" >restore_command</A >) or directly from the master over a TCP connection (streaming replication). The standby server will also attempt to restore any WAL found in the standby cluster's <TT CLASS="FILENAME" >pg_xlog</TT > directory. That typically happens after a server restart, when the standby replays again WAL that was streamed from the master before the restart, but you can also manually copy files to <TT CLASS="FILENAME" >pg_xlog</TT > at any time to have them replayed. </P ><P > At startup, the standby begins by restoring all WAL available in the archive location, calling <TT CLASS="VARNAME" >restore_command</TT >. Once it reaches the end of WAL available there and <TT CLASS="VARNAME" >restore_command</TT > fails, it tries to restore any WAL available in the <TT CLASS="FILENAME" >pg_xlog</TT > directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or <TT CLASS="FILENAME" >pg_xlog</TT >. If that fails or streaming replication is not configured, or if the connection is later disconnected, the standby goes back to step 1 and tries to restore the file from the archive again. This loop of retries from the archive, <TT CLASS="FILENAME" >pg_xlog</TT >, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file. </P ><P > Standby mode is exited and the server switches to normal operation when <TT CLASS="COMMAND" >pg_ctl promote</TT > is run or a trigger file is found (<TT CLASS="VARNAME" >trigger_file</TT >). Before failover, any WAL immediately available in the archive or in <TT CLASS="FILENAME" >pg_xlog</TT > will be restored, but no attempt is made to connect to the master. </P ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="PREPARING-MASTER-FOR-STANDBY" >26.2.3. Preparing the Master for Standby Servers</A ></H2 ><P > Set up continuous archiving on the primary to an archive directory accessible from the standby, as described in <A HREF="continuous-archiving.html" >Section 25.3</A >. The archive location should be accessible from the standby even when the master is down, i.e., it should reside on the standby server itself or another trusted server, not on the master server. </P ><P > If you want to use streaming replication, set up authentication on the primary server to allow replication connections from the standby server(s); that is, create a role and provide a suitable entry or entries in <TT CLASS="FILENAME" >pg_hba.conf</TT > with the database field set to <TT CLASS="LITERAL" >replication</TT >. Also ensure <TT CLASS="VARNAME" >max_wal_senders</TT > is set to a sufficiently large value in the configuration file of the primary server. If replication slots will be used, ensure that <TT CLASS="VARNAME" >max_replication_slots</TT > is set sufficiently high as well. </P ><P > Take a base backup as described in <A HREF="continuous-archiving.html#BACKUP-BASE-BACKUP" >Section 25.3.2</A > to bootstrap the standby server. </P ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="STANDBY-SERVER-SETUP" >26.2.4. Setting Up a Standby Server</A ></H2 ><P > To set up the standby server, restore the base backup taken from primary server (see <A HREF="continuous-archiving.html#BACKUP-PITR-RECOVERY" >Section 25.3.4</A >). Create a recovery command file <TT CLASS="FILENAME" >recovery.conf</TT > in the standby's cluster data directory, and turn on <TT CLASS="VARNAME" >standby_mode</TT >. Set <TT CLASS="VARNAME" >restore_command</TT > to a simple command to copy files from the WAL archive. If you plan to have multiple standby servers for high availability purposes, set <TT CLASS="VARNAME" >recovery_target_timeline</TT > to <TT CLASS="LITERAL" >latest</TT >, to make the standby server follow the timeline change that occurs at failover to another standby. </P ><DIV CLASS="NOTE" ><BLOCKQUOTE CLASS="NOTE" ><P ><B >Note: </B > Do not use pg_standby or similar tools with the built-in standby mode described here. <TT CLASS="VARNAME" >restore_command</TT > should return immediately if the file does not exist; the server will retry the command again if necessary. See <A HREF="log-shipping-alternative.html" >Section 26.4</A > for using tools like pg_standby. </P ></BLOCKQUOTE ></DIV ><P > If you want to use streaming replication, fill in <TT CLASS="VARNAME" >primary_conninfo</TT > with a libpq connection string, including the host name (or IP address) and any additional details needed to connect to the primary server. If the primary needs a password for authentication, the password needs to be specified in <TT CLASS="VARNAME" >primary_conninfo</TT > as well. </P ><P > If you're setting up the standby server for high availability purposes, set up WAL archiving, connections and authentication like the primary server, because the standby server will work as a primary server after failover. </P ><P > If you're using a WAL archive, its size can be minimized using the <A HREF="archive-recovery-settings.html#ARCHIVE-CLEANUP-COMMAND" >archive_cleanup_command</A > parameter to remove files that are no longer required by the standby server. The <SPAN CLASS="APPLICATION" >pg_archivecleanup</SPAN > utility is designed specifically to be used with <TT CLASS="VARNAME" >archive_cleanup_command</TT > in typical single-standby configurations, see <A HREF="pgarchivecleanup.html" ><SPAN CLASS="APPLICATION" >pg_archivecleanup</SPAN ></A >. Note however, that if you're using the archive for backup purposes, you need to retain files needed to recover from at least the latest base backup, even if they're no longer needed by the standby. </P ><P > A simple example of a <TT CLASS="FILENAME" >recovery.conf</TT > is: </P><PRE CLASS="PROGRAMLISTING" >standby_mode = 'on' primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' restore_command = 'cp /path/to/archive/%f %p' archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'</PRE ><P> </P ><P > You can have any number of standby servers, but if you use streaming replication, make sure you set <TT CLASS="VARNAME" >max_wal_senders</TT > high enough in the primary to allow them to be connected simultaneously. </P ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="STREAMING-REPLICATION" >26.2.5. Streaming Replication</A ></H2 ><P > Streaming replication allows a standby server to stay more up-to-date than is possible with file-based log shipping. The standby connects to the primary, which streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled. </P ><P > Streaming replication is asynchronous by default (see <A HREF="warm-standby.html#SYNCHRONOUS-REPLICATION" >Section 26.2.8</A >), in which case there is a small delay between committing a transaction in the primary and the changes becoming visible in the standby. This delay is however much smaller than with file-based log shipping, typically under one second assuming the standby is powerful enough to keep up with the load. With streaming replication, <TT CLASS="VARNAME" >archive_timeout</TT > is not required to reduce the data loss window. </P ><P > If you use streaming replication without file-based continuous archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be reinitialized from a new base backup. You can avoid this by setting <TT CLASS="VARNAME" >wal_keep_segments</TT > to a value large enough to ensure that WAL segments are not recycled too early, or by configuring a replication slot for the standby. If you set up a WAL archive that's accessible from the standby, these solutions are not required, since the standby can always use the archive to catch up provided it retains enough segments. </P ><P > To use streaming replication, set up a file-based log-shipping standby server as described in <A HREF="warm-standby.html" >Section 26.2</A >. The step that turns a file-based log-shipping standby into streaming replication standby is setting <TT CLASS="VARNAME" >primary_conninfo</TT > setting in the <TT CLASS="FILENAME" >recovery.conf</TT > file to point to the primary server. Set <A HREF="runtime-config-connection.html#GUC-LISTEN-ADDRESSES" >listen_addresses</A > and authentication options (see <TT CLASS="FILENAME" >pg_hba.conf</TT >) on the primary so that the standby server can connect to the <TT CLASS="LITERAL" >replication</TT > pseudo-database on the primary server (see <A HREF="warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION" >Section 26.2.5.1</A >). </P ><P > On systems that support the keepalive socket option, setting <A HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-IDLE" >tcp_keepalives_idle</A >, <A HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-INTERVAL" >tcp_keepalives_interval</A > and <A HREF="runtime-config-connection.html#GUC-TCP-KEEPALIVES-COUNT" >tcp_keepalives_count</A > helps the primary promptly notice a broken connection. </P ><P > Set the maximum number of concurrent connections from the standby servers (see <A HREF="runtime-config-replication.html#GUC-MAX-WAL-SENDERS" >max_wal_senders</A > for details). </P ><P > When the standby is started and <TT CLASS="VARNAME" >primary_conninfo</TT > is set correctly, the standby will connect to the primary after replaying all WAL files available in the archive. If the connection is established successfully, you will see a walreceiver process in the standby, and a corresponding walsender process in the primary. </P ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="STREAMING-REPLICATION-AUTHENTICATION" >26.2.5.1. Authentication</A ></H3 ><P > It is very important that the access privileges for replication be set up so that only trusted users can read the WAL stream, because it is easy to extract privileged information from it. Standby servers must authenticate to the primary as a superuser or an account that has the <TT CLASS="LITERAL" >REPLICATION</TT > privilege. It is recommended to create a dedicated user account with <TT CLASS="LITERAL" >REPLICATION</TT > and <TT CLASS="LITERAL" >LOGIN</TT > privileges for replication. While <TT CLASS="LITERAL" >REPLICATION</TT > privilege gives very high permissions, it does not allow the user to modify any data on the primary system, which the <TT CLASS="LITERAL" >SUPERUSER</TT > privilege does. </P ><P > Client authentication for replication is controlled by a <TT CLASS="FILENAME" >pg_hba.conf</TT > record specifying <TT CLASS="LITERAL" >replication</TT > in the <TT CLASS="REPLACEABLE" ><I >database</I ></TT > field. For example, if the standby is running on host IP <TT CLASS="LITERAL" >192.168.1.100</TT > and the account name for replication is <TT CLASS="LITERAL" >foo</TT >, the administrator can add the following line to the <TT CLASS="FILENAME" >pg_hba.conf</TT > file on the primary: </P><PRE CLASS="PROGRAMLISTING" ># Allow the user "foo" from host 192.168.1.100 to connect to the primary # as a replication standby if the user's password is correctly supplied. # # TYPE DATABASE USER ADDRESS METHOD host replication foo 192.168.1.100/32 md5</PRE ><P> </P ><P > The host name and port number of the primary, connection user name, and password are specified in the <TT CLASS="FILENAME" >recovery.conf</TT > file. The password can also be set in the <TT CLASS="FILENAME" >~/.pgpass</TT > file on the standby (specify <TT CLASS="LITERAL" >replication</TT > in the <TT CLASS="REPLACEABLE" ><I >database</I ></TT > field). For example, if the primary is running on host IP <TT CLASS="LITERAL" >192.168.1.50</TT >, port <TT CLASS="LITERAL" >5432</TT >, the account name for replication is <TT CLASS="LITERAL" >foo</TT >, and the password is <TT CLASS="LITERAL" >foopass</TT >, the administrator can add the following line to the <TT CLASS="FILENAME" >recovery.conf</TT > file on the standby: </P><PRE CLASS="PROGRAMLISTING" ># The standby connects to the primary that is running on host 192.168.1.50 # and port 5432 as the user "foo" whose password is "foopass". primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'</PRE ><P> </P ></DIV ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="STREAMING-REPLICATION-MONITORING" >26.2.5.2. Monitoring</A ></H3 ><P > An important health indicator of streaming replication is the amount of WAL records generated in the primary, but not yet applied in the standby. You can calculate this lag by comparing the current WAL write location on the primary with the last WAL location received by the standby. They can be retrieved using <CODE CLASS="FUNCTION" >pg_current_xlog_location</CODE > on the primary and the <CODE CLASS="FUNCTION" >pg_last_xlog_receive_location</CODE > on the standby, respectively (see <A HREF="functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE" >Table 9-78</A > and <A HREF="functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE" >Table 9-79</A > for details). The last WAL receive location in the standby is also displayed in the process status of the WAL receiver process, displayed using the <TT CLASS="COMMAND" >ps</TT > command (see <A HREF="monitoring-ps.html" >Section 28.1</A > for details). </P ><P > You can retrieve a list of WAL sender processes via the <A HREF="monitoring-stats.html#PG-STAT-REPLICATION-VIEW" > <TT CLASS="LITERAL" >pg_stat_replication</TT ></A > view. Large differences between <CODE CLASS="FUNCTION" >pg_current_xlog_location</CODE > and <TT CLASS="LITERAL" >sent_location</TT > field might indicate that the master server is under heavy load, while differences between <TT CLASS="LITERAL" >sent_location</TT > and <CODE CLASS="FUNCTION" >pg_last_xlog_receive_location</CODE > on the standby might indicate network delay, or that the standby is under heavy load. </P ></DIV ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="STREAMING-REPLICATION-SLOTS" >26.2.6. Replication Slots</A ></H2 ><P > Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a <A HREF="hot-standby.html#HOT-STANDBY-CONFLICT" >recovery conflict</A > even when the standby is disconnected. </P ><P > In lieu of using replication slots, it is possible to prevent the removal of old WAL segments using <A HREF="runtime-config-replication.html#GUC-WAL-KEEP-SEGMENTS" >wal_keep_segments</A >, or by storing the segments in an archive using <A HREF="runtime-config-wal.html#GUC-ARCHIVE-COMMAND" >archive_command</A >. However, these methods often result in retaining more WAL segments than required, whereas replication slots retain only the number of segments known to be needed. An advantage of these methods is that they bound the space requirement for <TT CLASS="LITERAL" >pg_xlog</TT >; there is currently no way to do this using replication slots. </P ><P > Similarly, <A HREF="runtime-config-replication.html#GUC-HOT-STANDBY-FEEDBACK" >hot_standby_feedback</A > and <A HREF="runtime-config-replication.html#GUC-VACUUM-DEFER-CLEANUP-AGE" >vacuum_defer_cleanup_age</A > provide protection against relevant rows being removed by vacuum, but the former provides no protection during any time period when the standby is not connected, and the latter often needs to be set to a high value to provide adequate protection. Replication slots overcome these disadvantages. </P ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="STREAMING-REPLICATION-SLOTS-MANIPULATION" >26.2.6.1. Querying and manipulating replication slots</A ></H3 ><P > Each replication slot has a name, which can contain lower-case letters, numbers, and the underscore character. </P ><P > Existing replication slots and their state can be seen in the <A HREF="view-pg-replication-slots.html" ><TT CLASS="STRUCTNAME" >pg_replication_slots</TT ></A > view. </P ><P > Slots can be created and dropped either via the streaming replication protocol (see <A HREF="protocol-replication.html" >Section 51.3</A >) or via SQL functions (see <A HREF="functions-admin.html#FUNCTIONS-REPLICATION" >Section 9.26.6</A >). </P ></DIV ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="STREAMING-REPLICATION-SLOTS-CONFIG" >26.2.6.2. Configuration Example</A ></H3 ><P > You can create a replication slot like this: </P><PRE CLASS="PROGRAMLISTING" >postgres=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot'); slot_name | xlog_position -------------+--------------- node_a_slot | postgres=# SELECT slot_name, slot_type, active FROM pg_replication_slots; slot_name | slot_type | active -------------+-----------+-------- node_a_slot | physical | f (1 row)</PRE ><P> To configure the standby to use this slot, <TT CLASS="VARNAME" >primary_slot_name</TT > should be configured in the standby's <TT CLASS="FILENAME" >recovery.conf</TT >. Here is a simple example: </P><PRE CLASS="PROGRAMLISTING" >standby_mode = 'on' primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' primary_slot_name = 'node_a_slot'</PRE ><P> </P ></DIV ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="CASCADING-REPLICATION" >26.2.7. Cascading Replication</A ></H2 ><P > The cascading replication feature allows a standby server to accept replication connections and stream WAL records to other standbys, acting as a relay. This can be used to reduce the number of direct connections to the master and also to minimize inter-site bandwidth overheads. </P ><P > A standby acting as both a receiver and a sender is known as a cascading standby. Standbys that are more directly connected to the master are known as upstream servers, while those standby servers further away are downstream servers. Cascading replication does not place limits on the number or arrangement of downstream servers, though each standby connects to only one upstream server which eventually links to a single master/primary server. </P ><P > A cascading standby sends not only WAL records received from the master but also those restored from the archive. So even if the replication connection in some upstream connection is terminated, streaming replication continues downstream for as long as new WAL records are available. </P ><P > Cascading replication is currently asynchronous. Synchronous replication (see <A HREF="warm-standby.html#SYNCHRONOUS-REPLICATION" >Section 26.2.8</A >) settings have no effect on cascading replication at present. </P ><P > Hot Standby feedback propagates upstream, whatever the cascaded arrangement. </P ><P > If an upstream standby server is promoted to become new master, downstream servers will continue to stream from the new master if <TT CLASS="VARNAME" >recovery_target_timeline</TT > is set to <TT CLASS="LITERAL" >'latest'</TT >. </P ><P > To use cascading replication, set up the cascading standby so that it can accept replication connections (that is, set <A HREF="runtime-config-replication.html#GUC-MAX-WAL-SENDERS" >max_wal_senders</A > and <A HREF="runtime-config-replication.html#GUC-HOT-STANDBY" >hot_standby</A >, and configure <A HREF="auth-pg-hba-conf.html" >host-based authentication</A >). You will also need to set <TT CLASS="VARNAME" >primary_conninfo</TT > in the downstream standby to point to the cascading standby. </P ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="SYNCHRONOUS-REPLICATION" >26.2.8. Synchronous Replication</A ></H2 ><P > <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > streaming replication is asynchronous by default. If the primary server crashes then some transactions that were committed may not have been replicated to the standby server, causing data loss. The amount of data loss is proportional to the replication delay at the time of failover. </P ><P > Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers. This extends that standard level of durability offered by a transaction commit. This level of protection is referred to as 2-safe replication in computer science theory, and group-1-safe (group-safe and 1-safe) when <TT CLASS="VARNAME" >synchronous_commit</TT > is set to <TT CLASS="LITERAL" >remote_write</TT >. </P ><P > When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server. The only possibility that data can be lost is if both the primary and the standby suffer crashes at the same time. This can provide a much higher level of durability, though only if the sysadmin is cautious about the placement and management of the two servers. Waiting for confirmation increases the user's confidence that the changes will not be lost in the event of server crashes but it also necessarily increases the response time for the requesting transaction. The minimum wait time is the round-trip time between primary to standby. </P ><P > Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits. Long running actions such as data loading or index building do not wait until the very final commit message. All two-phase commit actions require commit waits, including both prepare and commit. </P ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="SYNCHRONOUS-REPLICATION-CONFIG" >26.2.8.1. Basic Configuration</A ></H3 ><P > Once streaming replication has been configured, configuring synchronous replication requires only one additional configuration step: <A HREF="runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES" >synchronous_standby_names</A > must be set to a non-empty value. <TT CLASS="VARNAME" >synchronous_commit</TT > must also be set to <TT CLASS="LITERAL" >on</TT >, but since this is the default value, typically no change is required. (See <A HREF="runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS" >Section 19.5.1</A > and <A HREF="runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER" >Section 19.6.2</A >.) This configuration will cause each commit to wait for confirmation that the standby has written the commit record to durable storage. <TT CLASS="VARNAME" >synchronous_commit</TT > can be set by individual users, so it can be configured in the configuration file, for particular users or databases, or dynamically by applications, in order to control the durability guarantee on a per-transaction basis. </P ><P > After a commit record has been written to disk on the primary, the WAL record is then sent to the standby. The standby sends reply messages each time a new batch of WAL data is written to disk, unless <TT CLASS="VARNAME" >wal_receiver_status_interval</TT > is set to zero on the standby. In the case that <TT CLASS="VARNAME" >synchronous_commit</TT > is set to <TT CLASS="LITERAL" >remote_apply</TT >, the standby sends reply messages when the commit record is replayed, making the transaction visible. If the standby is chosen as a synchronous standby, from a priority list of <TT CLASS="VARNAME" >synchronous_standby_names</TT > on the primary, the reply messages from that standby will be considered along with those from other synchronous standbys to decide when to release transactions waiting for confirmation that the commit record has been received. These parameters allow the administrator to specify which standby servers should be synchronous standbys. Note that the configuration of synchronous replication is mainly on the master. Named standbys must be directly connected to the master; the master knows nothing about downstream standby servers using cascaded replication. </P ><P > Setting <TT CLASS="VARNAME" >synchronous_commit</TT > to <TT CLASS="LITERAL" >remote_write</TT > will cause each commit to wait for confirmation that the standby has received the commit record and written it out to its own operating system, but not for the data to be flushed to disk on the standby. This setting provides a weaker guarantee of durability than <TT CLASS="LITERAL" >on</TT > does: the standby could lose the data in the event of an operating system crash, though not a <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > crash. However, it's a useful setting in practice because it can decrease the response time for the transaction. Data loss could only occur if both the primary and the standby crash and the database of the primary gets corrupted at the same time. </P ><P > Setting <TT CLASS="VARNAME" >synchronous_commit</TT > to <TT CLASS="LITERAL" >remote_apply</TT > will cause each commit to wait until the current synchronous standbys report that they have replayed the transaction, making it visible to user queries. In simple cases, this allows for load balancing with causal consistency. </P ><P > Users will stop waiting if a fast shutdown is requested. However, as when using asynchronous replication, the server will not fully shutdown until all outstanding WAL records are transferred to the currently connected standby servers. </P ></DIV ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="SYNCHRONOUS-REPLICATION-MULTIPLE-STANDBYS" >26.2.8.2. Multiple Synchronous Standbys</A ></H3 ><P > Synchronous replication supports one or more synchronous standby servers; transactions will wait until all the standby servers which are considered as synchronous confirm receipt of their data. The number of synchronous standbys that transactions must wait for replies from is specified in <TT CLASS="VARNAME" >synchronous_standby_names</TT >. This parameter also specifies a list of standby names, which determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys. If any of the current synchronous standbys disconnects for whatever reason, it will be replaced immediately with the next-highest-priority standby. </P ><P > An example of <TT CLASS="VARNAME" >synchronous_standby_names</TT > for multiple synchronous standbys is: </P><PRE CLASS="PROGRAMLISTING" >synchronous_standby_names = '2 (s1, s2, s3)'</PRE ><P> In this example, if four standby servers <TT CLASS="LITERAL" >s1</TT >, <TT CLASS="LITERAL" >s2</TT >, <TT CLASS="LITERAL" >s3</TT > and <TT CLASS="LITERAL" >s4</TT > are running, the two standbys <TT CLASS="LITERAL" >s1</TT > and <TT CLASS="LITERAL" >s2</TT > will be chosen as synchronous standbys because their names appear early in the list of standby names. <TT CLASS="LITERAL" >s3</TT > is a potential synchronous standby and will take over the role of synchronous standby when either of <TT CLASS="LITERAL" >s1</TT > or <TT CLASS="LITERAL" >s2</TT > fails. <TT CLASS="LITERAL" >s4</TT > is an asynchronous standby since its name is not in the list. </P ></DIV ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="SYNCHRONOUS-REPLICATION-PERFORMANCE" >26.2.8.3. Planning for Performance</A ></H3 ><P > Synchronous replication usually requires carefully planned and placed standby servers to ensure applications perform acceptably. Waiting doesn't utilize system resources, but transaction locks continue to be held until the transfer is confirmed. As a result, incautious use of synchronous replication will reduce performance for database applications because of increased response times and higher contention. </P ><P > <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > allows the application developer to specify the durability level required via replication. This can be specified for the system overall, though it can also be specified for specific users or connections, or even individual transactions. </P ><P > For example, an application workload might consist of: 10% of changes are important customer details, while 90% of changes are less important data that the business can more easily survive if it is lost, such as chat messages between users. </P ><P > With synchronous replication options specified at the application level (on the primary) we can offer synchronous replication for the most important changes, without slowing down the bulk of the total workload. Application level options are an important and practical tool for allowing the benefits of synchronous replication for high performance applications. </P ><P > You should consider that the network bandwidth must be higher than the rate of generation of WAL data. </P ></DIV ><DIV CLASS="SECT3" ><H3 CLASS="SECT3" ><A NAME="SYNCHRONOUS-REPLICATION-HA" >26.2.8.4. Planning for High Availability</A ></H3 ><P > <TT CLASS="VARNAME" >synchronous_standby_names</TT > specifies the number and names of synchronous standbys that transaction commits made when <TT CLASS="VARNAME" >synchronous_commit</TT > is set to <TT CLASS="LITERAL" >on</TT >, <TT CLASS="LITERAL" >remote_apply</TT > or <TT CLASS="LITERAL" >remote_write</TT > will wait for responses from. Such transaction commits may never be completed if any one of synchronous standbys should crash. </P ><P > The best solution for high availability is to ensure you keep as many synchronous standbys as requested. This can be achieved by naming multiple potential synchronous standbys using <TT CLASS="VARNAME" >synchronous_standby_names</TT >. The standbys whose names appear earlier in the list will be used as synchronous standbys. Standbys listed after these will take over the role of synchronous standby if one of current ones should fail. </P ><P > When a standby first attaches to the primary, it will not yet be properly synchronized. This is described as <TT CLASS="LITERAL" >catchup</TT > mode. Once the lag between standby and primary reaches zero for the first time we move to real-time <TT CLASS="LITERAL" >streaming</TT > state. The catch-up duration may be long immediately after the standby has been created. If the standby is shut down, then the catch-up period will increase according to the length of time the standby has been down. The standby is only able to become a synchronous standby once it has reached <TT CLASS="LITERAL" >streaming</TT > state. </P ><P > If primary restarts while commits are waiting for acknowledgement, those waiting transactions will be marked fully committed once the primary database recovers. There is no way to be certain that all standbys have received all outstanding WAL data at time of the crash of the primary. Some transactions may not show as committed on the standby, even though they show as committed on the primary. The guarantee we offer is that the application will not receive explicit acknowledgement of the successful commit of a transaction until the WAL data is known to be safely received by all the synchronous standbys. </P ><P > If you really cannot keep as many synchronous standbys as requested then you should decrease the number of synchronous standbys that transaction commits must wait for responses from in <TT CLASS="VARNAME" >synchronous_standby_names</TT > (or disable it) and reload the configuration file on the primary server. </P ><P > If the primary is isolated from remaining standby servers you should fail over to the best candidate of those other remaining standby servers. </P ><P > If you need to re-create a standby server while transactions are waiting, make sure that the commands pg_start_backup() and pg_stop_backup() are run in a session with <TT CLASS="VARNAME" >synchronous_commit</TT > = <TT CLASS="LITERAL" >off</TT >, otherwise those requests will wait forever for the standby to appear. </P ></DIV ></DIV ><DIV CLASS="SECT2" ><H2 CLASS="SECT2" ><A NAME="CONTINUOUS-ARCHIVING-IN-STANDBY" >26.2.9. Continuous archiving in standby</A ></H2 ><P > When continuous WAL archiving is used in a standby, there are two different scenarios: the WAL archive can be shared between the primary and the standby, or the standby can have its own WAL archive. When the standby has its own WAL archive, set <TT CLASS="VARNAME" >archive_mode</TT > to <TT CLASS="LITERAL" >always</TT >, and the standby will call the archive command for every WAL segment it receives, whether it's by restoring from the archive or by streaming replication. The shared archive can be handled similarly, but the <TT CLASS="VARNAME" >archive_command</TT > must test if the file being archived exists already, and if the existing file has identical contents. This requires more care in the <TT CLASS="VARNAME" >archive_command</TT >, as it must be careful to not overwrite an existing file with different contents, but return success if the exactly same file is archived twice. And all that must be done free of race conditions, if two servers attempt to archive the same file at the same time. </P ><P > If <TT CLASS="VARNAME" >archive_mode</TT > is set to <TT CLASS="LITERAL" >on</TT >, the archiver is not enabled during recovery or standby mode. If the standby server is promoted, it will start archiving after the promotion, but will not archive any WAL or timeline history files that it did not generate itself. To get a complete series of WAL files in the archive, you must ensure that all WAL is archived, before it reaches the standby. This is inherently true with file-based log shipping, as the standby can only restore files that are found in the archive, but not if streaming replication is enabled. When a server is not in recovery mode, there is no difference between <TT CLASS="LITERAL" >on</TT > and <TT CLASS="LITERAL" >always</TT > modes. </P ></DIV ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="different-replication-solutions.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="warm-standby-failover.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >Comparison of Different Solutions</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="high-availability.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >Failover</TD ></TR ></TABLE ></DIV ></BODY ></HTML >