<?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>18.4. Managing Kernel Resources</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="server-start.html" title="18.3. Starting the Database Server" /><link rel="next" href="server-shutdown.html" title="18.5. Shutting Down the Server" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">18.4. Managing Kernel Resources</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="server-start.html" title="18.3. Starting the Database Server">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 18. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 18. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 11.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-shutdown.html" title="18.5. Shutting Down the Server">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="KERNEL-RESOURCES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">18.4. Managing Kernel Resources</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="kernel-resources.html#SYSVIPC">18.4.1. Shared Memory and Semaphores</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#SYSTEMD-REMOVEIPC">18.4.2. 
systemd RemoveIPC</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#id-1.6.5.6.5">18.4.3. Resource Limits</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-MEMORY-OVERCOMMIT">18.4.4. Linux Memory Overcommit</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-HUGE-PAGES">18.4.5. Linux Huge Pages</a></span></dt></dl></div><p> <span class="productname">PostgreSQL</span> can sometimes exhaust various operating system resource limits, especially when multiple copies of the server are running on the same system, or in very large installations. This section explains the kernel resources used by <span class="productname">PostgreSQL</span> and the steps you can take to resolve problems related to kernel resource consumption. </p><div class="sect2" id="SYSVIPC"><div class="titlepage"><div><div><h3 class="title">18.4.1. Shared Memory and Semaphores</h3></div></div></div><a id="id-1.6.5.6.3.2" class="indexterm"></a><a id="id-1.6.5.6.3.3" class="indexterm"></a><p> <span class="productname">PostgreSQL</span> requires the operating system to provide inter-process communication (<acronym class="acronym">IPC</acronym>) features, specifically shared memory and semaphores. Unix-derived systems typically provide <span class="quote">“<span class="quote"><span class="systemitem">System V</span></span>”</span> <acronym class="acronym">IPC</acronym>, <span class="quote">“<span class="quote"><span class="systemitem">POSIX</span></span>”</span> <acronym class="acronym">IPC</acronym>, or both. <span class="systemitem">Windows</span> has its own implementation of these features and is not discussed here. </p><p> The complete lack of these facilities is usually manifested by an <span class="quote">“<span class="quote"><span class="errorname">Illegal system call</span></span>”</span> error upon server start. In that case there is no alternative but to reconfigure your kernel. 
<span class="productname">PostgreSQL</span> won't work without them. This situation is rare, however, among modern operating systems. </p><p> Upon starting the server, <span class="productname">PostgreSQL</span> normally allocates a very small amount of System V shared memory, as well as a much larger amount of POSIX (<code class="function">mmap</code>) shared memory. In addition a significant number of semaphores, which can be either System V or POSIX style, are created at server startup. Currently, POSIX semaphores are used on Linux and FreeBSD systems while other platforms use System V semaphores. </p><div class="note"><h3 class="title">Note</h3><p> Prior to <span class="productname">PostgreSQL</span> 9.3, only System V shared memory was used, so the amount of System V shared memory required to start the server was much larger. If you are running an older version of the server, please consult the documentation for your server version. </p></div><p> System V <acronym class="acronym">IPC</acronym> features are typically constrained by system-wide allocation limits. When <span class="productname">PostgreSQL</span> exceeds one of these limits, the server will refuse to start and should leave an instructive error message describing the problem and what to do about it. (See also <a class="xref" href="server-start.html#SERVER-START-FAILURES" title="18.3.1. Server Start-up Failures">Section 18.3.1</a>.) The relevant kernel parameters are named consistently across different systems; <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a> gives an overview. The methods to set them, however, vary. Suggestions for some platforms are given below. </p><div class="table" id="SYSVIPC-PARAMETERS"><p class="title"><strong>Table 18.1. 
<span class="systemitem">System V</span> <acronym class="acronym">IPC</acronym> Parameters</strong></p><div class="table-contents"><table class="table" summary="System V IPC Parameters" border="1"><colgroup><col /><col /><col /></colgroup><thead><tr><th>Name</th><th>Description</th><th>Values needed to run one <span class="productname">PostgreSQL</span> instance</th></tr></thead><tbody><tr><td><code class="varname">SHMMAX</code></td><td>Maximum size of shared memory segment (bytes)</td><td>at least 1kB, but the default is usually much higher</td></tr><tr><td><code class="varname">SHMMIN</code></td><td>Minimum size of shared memory segment (bytes)</td><td>1</td></tr><tr><td><code class="varname">SHMALL</code></td><td>Total amount of shared memory available (bytes or pages)</td><td>same as <code class="varname">SHMMAX</code> if bytes, or <code class="literal">ceil(SHMMAX/PAGE_SIZE)</code> if pages, plus room for other applications</td></tr><tr><td><code class="varname">SHMSEG</code></td><td>Maximum number of shared memory segments per process</td><td>only 1 segment is needed, but the default is much higher</td></tr><tr><td><code class="varname">SHMMNI</code></td><td>Maximum number of shared memory segments system-wide</td><td>like <code class="varname">SHMSEG</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNI</code></td><td>Maximum number of semaphore identifiers (i.e., sets)</td><td>at least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNS</code></td><td>Maximum number of semaphores system-wide</td><td><code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) * 17</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMSL</code></td><td>Maximum number of semaphores per set</td><td>at least 17</td></tr><tr><td><code 
class="varname">SEMMAP</code></td><td>Number of entries in semaphore map</td><td>see text</td></tr><tr><td><code class="varname">SEMVMX</code></td><td>Maximum value of semaphore</td><td>at least 1000 (The default is often 32767; do not change unless necessary)</td></tr></tbody></table></div></div><br class="table-break" /><p> <span class="productname">PostgreSQL</span> requires a few bytes of System V shared memory (typically 48 bytes, on 64-bit platforms) for each copy of the server. On most modern operating systems, this amount can easily be allocated. However, if you are running many copies of the server, or if other applications are also using System V shared memory, it may be necessary to increase <code class="varname">SHMALL</code>, which is the total amount of System V shared memory system-wide. Note that <code class="varname">SHMALL</code> is measured in pages rather than bytes on many systems. </p><p> Less likely to cause problems is the minimum size for shared memory segments (<code class="varname">SHMMIN</code>), which should be at most approximately 32 bytes for <span class="productname">PostgreSQL</span> (it is usually just 1). The limits on the number of segments system-wide (<code class="varname">SHMMNI</code>) and per-process (<code class="varname">SHMSEG</code>) are unlikely to cause a problem unless your system has them set to zero. </p><p> When using System V semaphores, <span class="productname">PostgreSQL</span> uses one semaphore per allowed connection (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>), in sets of 16. 
Each such set will also contain a 17th semaphore which contains a <span class="quote">“<span class="quote">magic number</span>”</span>, to detect collision with semaphore sets used by other applications. The maximum number of semaphores in the system is set by <code class="varname">SEMMNS</code>, which consequently must be at least as high as <code class="varname">max_connections</code> plus <code class="varname">autovacuum_max_workers</code> plus <code class="varname">max_worker_processes</code>, plus one extra for each 16 allowed connections plus workers (see the formula in <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a>). The parameter <code class="varname">SEMMNI</code> determines the limit on the number of semaphore sets that can exist on the system at one time. Hence this parameter must be at least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</code>. Lowering the number of allowed connections is a temporary workaround for failures, which are usually confusingly worded <span class="quote">“<span class="quote">No space left on device</span>”</span>, from the function <code class="function">semget</code>. </p><p> In some cases it might also be necessary to increase <code class="varname">SEMMAP</code> to be at least on the order of <code class="varname">SEMMNS</code>. If the system has this parameter (many do not), it defines the size of the semaphore resource map, in which each contiguous block of available semaphores needs an entry. When a semaphore set is freed it is either added to an existing entry that is adjacent to the freed block or it is registered under a new map entry. If the map is full, the freed semaphores get lost (until reboot). Fragmentation of the semaphore space could over time lead to fewer available semaphores than there should be. 
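</p><p> As a worked example of the semaphore formulas in <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a>, the following shell sketch computes the minimum <code class="varname">SEMMNI</code> and <code class="varname">SEMMNS</code> for a single server. The function names and the sample settings (100 connections, 3 autovacuum workers, 8 worker processes) are illustrative assumptions only, and room for other applications still has to be added on top of the results: </p><pre class="programlisting">
# Minimum number of semaphore sets (SEMMNI) for one server:
# ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)
# Adding 15 before the integer division rounds the result up.
pg_min_semmni() {
    echo $(( ($1 + $2 + $3 + 5 + 15) / 16 ))
}

# Minimum number of semaphores (SEMMNS): 17 per set
# (16 usable semaphores plus the magic-number semaphore).
pg_min_semmns() {
    echo $(( $(pg_min_semmni "$1" "$2" "$3") * 17 ))
}

pg_min_semmni 100 3 8   # prints 8
pg_min_semmns 100 3 8   # prints 136
</pre><p>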
</p><p> Various other settings related to <span class="quote">“<span class="quote">semaphore undo</span>”</span>, such as <code class="varname">SEMMNU</code> and <code class="varname">SEMUME</code>, do not affect <span class="productname">PostgreSQL</span>. </p><p> When using POSIX semaphores, the number of semaphores needed is the same as for System V, that is one semaphore per allowed connection (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>). On the platforms where this option is preferred, there is no specific kernel limit on the number of POSIX semaphores. </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="systemitem">AIX</span> <a id="id-1.6.5.6.3.16.1.1.2" class="indexterm"></a> </span></dt><dd><p> At least as of version 5.1, it should not be necessary to do any special configuration for such parameters as <code class="varname">SHMMAX</code>, as it appears this is configured to allow all memory to be used as shared memory. That is the sort of configuration commonly used for other databases such as <span class="application">DB/2</span>.</p><p> It might, however, be necessary to modify the global <code class="command">ulimit</code> information in <code class="filename">/etc/security/limits</code>, as the default hard limits for file sizes (<code class="varname">fsize</code>) and numbers of files (<code class="varname">nofiles</code>) might be too low. 
</p></dd><dt><span class="term"><span class="systemitem">FreeBSD</span> <a id="id-1.6.5.6.3.16.2.1.2" class="indexterm"></a> </span></dt><dd><p> The default IPC settings can be changed using the <code class="command">sysctl</code> or <code class="command">loader</code> interfaces. The following parameters can be set using <code class="command">sysctl</code>: </p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmall=32768</code></strong>
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmmax=134217728</code></strong>
</pre><p> To make these settings persist over reboots, modify <code class="filename">/etc/sysctl.conf</code>. </p><p> The following semaphore-related settings are read-only as far as <code class="command">sysctl</code> is concerned, but can be set in <code class="filename">/boot/loader.conf</code>: </p><pre class="programlisting">
kern.ipc.semmni=256
kern.ipc.semmns=512
</pre><p> After modifying that file, a reboot is required for the new settings to take effect. </p><p> You might also want to configure your kernel to lock shared memory into RAM and prevent it from being paged out to swap. This can be accomplished using the <code class="command">sysctl</code> setting <code class="literal">kern.ipc.shm_use_phys</code>. </p><p> If running in FreeBSD jails with <span class="application">sysctl</span>'s <code class="literal">security.jail.sysvipc_allowed</code> enabled, <span class="application">postmaster</span>s running in different jails should be run by different operating system users. This improves security because it prevents non-root users from interfering with shared memory or semaphores in different jails, and it allows the PostgreSQL IPC cleanup code to function properly. (In FreeBSD 6.0 and later the IPC cleanup code does not properly detect processes in other jails, preventing the running of postmasters on the same port in different jails.) 
</p><p> <span class="systemitem">FreeBSD</span> versions before 4.0 work like old <span class="systemitem">OpenBSD</span> (see below). </p></dd><dt><span class="term"><span class="systemitem">NetBSD</span> <a id="id-1.6.5.6.3.16.3.1.2" class="indexterm"></a> </span></dt><dd><p> In <span class="systemitem">NetBSD</span> 5.0 and later, IPC parameters can be adjusted using <code class="command">sysctl</code>, for example: </p><pre class="screen"> <code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kern.ipc.semmni=100</code></strong> </pre><p> To make these settings persist over reboots, modify <code class="filename">/etc/sysctl.conf</code>. </p><p> You will usually want to increase <code class="literal">kern.ipc.semmni</code> and <code class="literal">kern.ipc.semmns</code>, as <span class="systemitem">NetBSD</span>'s default settings for these are uncomfortably small. </p><p> You might also want to configure your kernel to lock shared memory into RAM and prevent it from being paged out to swap. This can be accomplished using the <code class="command">sysctl</code> setting <code class="literal">kern.ipc.shm_use_phys</code>. </p><p> <span class="systemitem">NetBSD</span> versions before 5.0 work like old <span class="systemitem">OpenBSD</span> (see below), except that kernel parameters should be set with the keyword <code class="literal">options</code> not <code class="literal">option</code>. </p></dd><dt><span class="term"><span class="systemitem">OpenBSD</span> <a id="id-1.6.5.6.3.16.4.1.2" class="indexterm"></a> </span></dt><dd><p> In <span class="systemitem">OpenBSD</span> 3.3 and later, IPC parameters can be adjusted using <code class="command">sysctl</code>, for example: </p><pre class="screen"> <code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.seminfo.semmni=100</code></strong> </pre><p> To make these settings persist over reboots, modify <code class="filename">/etc/sysctl.conf</code>. 
</p><p> You will usually want to increase <code class="literal">kern.seminfo.semmni</code> and <code class="literal">kern.seminfo.semmns</code>, as <span class="systemitem">OpenBSD</span>'s default settings for these are uncomfortably small. </p><p> In older <span class="systemitem">OpenBSD</span> versions, you will need to build a custom kernel to change the IPC parameters. Make sure that the options <code class="varname">SYSVSHM</code> and <code class="varname">SYSVSEM</code> are enabled, too. (They are by default.) The following shows an example of how to set the various parameters in the kernel configuration file: </p><pre class="programlisting">
option        SYSVSHM
option        SHMMAXPGS=4096
option        SHMSEG=256

option        SYSVSEM
option        SEMMNI=256
option        SEMMNS=512
option        SEMMNU=256
</pre><p> </p></dd><dt><span class="term"><span class="systemitem">HP-UX</span> <a id="id-1.6.5.6.3.16.5.1.2" class="indexterm"></a> </span></dt><dd><p> The default settings tend to suffice for normal installations. On <span class="productname">HP-UX</span> 10, the factory default for <code class="varname">SEMMNS</code> is 128, which might be too low for larger database sites. </p><p> <acronym class="acronym">IPC</acronym> parameters can be set in the <span class="application">System Administration Manager</span> (<acronym class="acronym">SAM</acronym>) under <span class="guimenu">Kernel Configuration</span> → <span class="guimenuitem">Configurable Parameters</span>. Choose <span class="guibutton">Create A New Kernel</span> when you're done. </p></dd><dt><span class="term"><span class="systemitem">Linux</span> <a id="id-1.6.5.6.3.16.6.1.2" class="indexterm"></a> </span></dt><dd><p> The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in unusual kernel configurations with <span class="quote">“<span class="quote">huge pages</span>”</span> (use <code class="literal">getconf PAGE_SIZE</code> to verify). 
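</p><p> Because the total size is counted in pages while the segment size is counted in bytes, it can help to convert between the two explicitly. A minimal sketch, assuming a shell with 64-bit arithmetic; the function name <code class="function">bytes_to_pages</code> is made up for this example: </p><pre class="programlisting">
# Convert a size in bytes to a page count, rounding up.
# $1 = size in bytes, $2 = page size in bytes
bytes_to_pages() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# 16 GB expressed in pages of the running kernel's page size;
# with 4096-byte pages this prints 4194304.
bytes_to_pages 17179869184 "$(getconf PAGE_SIZE)"
</pre><p>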
</p><p> The shared memory size settings can be changed via the <code class="command">sysctl</code> interface. For example, to allow 16 GB: </p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>sysctl -w kernel.shmmax=17179869184</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>sysctl -w kernel.shmall=4194304</code></strong>
</pre><p> In addition these settings can be preserved between reboots in the file <code class="filename">/etc/sysctl.conf</code>. Doing that is highly recommended. </p><p> Ancient distributions might not have the <code class="command">sysctl</code> program, but equivalent changes can be made by manipulating the <code class="filename">/proc</code> file system: </p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>echo 17179869184 >/proc/sys/kernel/shmmax</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>echo 4194304 >/proc/sys/kernel/shmall</code></strong>
</pre><p> </p><p> The remaining defaults are quite generously sized, and usually do not require changes. </p></dd><dt><span class="term"><span class="systemitem">macOS</span> <a id="id-1.6.5.6.3.16.7.1.2" class="indexterm"></a> </span></dt><dd><p> The recommended method for configuring shared memory in macOS is to create a file named <code class="filename">/etc/sysctl.conf</code>, containing variable assignments such as: </p><pre class="programlisting">
kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024
</pre><p> Note that in some macOS versions, <span class="emphasis"><em>all five</em></span> shared-memory parameters must be set in <code class="filename">/etc/sysctl.conf</code>, else the values will be ignored. </p><p> Beware that recent releases of macOS ignore attempts to set <code class="varname">SHMMAX</code> to a value that isn't an exact multiple of 4096. 
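</p><p> One way to avoid that pitfall is to round the intended value up to the next multiple of 4096 before writing it to <code class="filename">/etc/sysctl.conf</code>. A small sketch; the function name is hypothetical: </p><pre class="programlisting">
# Round a byte count up to the next multiple of 4096.
round_up_to_4096() {
    echo $(( ($1 + 4095) / 4096 * 4096 ))
}

round_up_to_4096 5000000   # prints 5001216
round_up_to_4096 4194304   # prints 4194304 (already a multiple)
</pre><p>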
</p><p> <code class="varname">SHMALL</code> is measured in 4 kB pages on this platform. </p><p> In older macOS versions, you will need to reboot to have changes in the shared memory parameters take effect. As of 10.5 it is possible to change all but <code class="varname">SHMMNI</code> on the fly, using <span class="application">sysctl</span>. But it's still best to set up your preferred values via <code class="filename">/etc/sysctl.conf</code>, so that the values will be kept across reboots. </p><p> The file <code class="filename">/etc/sysctl.conf</code> is only honored in macOS 10.3.9 and later. If you are running a previous 10.3.x release, you must edit the file <code class="filename">/etc/rc</code> and change the values in the following commands: </p><pre class="programlisting">
sysctl -w kern.sysv.shmmax
sysctl -w kern.sysv.shmmin
sysctl -w kern.sysv.shmmni
sysctl -w kern.sysv.shmseg
sysctl -w kern.sysv.shmall
</pre><p> Note that <code class="filename">/etc/rc</code> is usually overwritten by macOS system updates, so you should expect to have to redo these edits after each update. </p><p> In macOS 10.2 and earlier, instead edit these commands in the file <code class="filename">/System/Library/StartupItems/SystemTuning/SystemTuning</code>. </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.6 to 2.9 (Solaris 6 to Solaris 9) <a id="id-1.6.5.6.3.16.8.1.2" class="indexterm"></a> </span></dt><dd><p> The relevant settings can be changed in <code class="filename">/etc/system</code>, for example: </p><pre class="programlisting">
set shmsys:shminfo_shmmax=0x2000000
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=256
set shmsys:shminfo_shmseg=256

set semsys:seminfo_semmap=256
set semsys:seminfo_semmni=512
set semsys:seminfo_semmns=512
set semsys:seminfo_semmsl=32
</pre><p> You need to reboot for the changes to take effect. 
See also <a class="ulink" href="http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html" target="_top">http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html</a> for information on shared memory under older versions of Solaris. </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.10 (Solaris 10) and later<br /></span><span class="term"><span class="systemitem">OpenSolaris</span></span></dt><dd><p> In Solaris 10 and later, and OpenSolaris, the default shared memory and semaphore settings are good enough for most <span class="productname">PostgreSQL</span> applications. Solaris now defaults to a <code class="varname">SHMMAX</code> of one-quarter of system <acronym class="acronym">RAM</acronym>. To further adjust this setting, use a project setting associated with the <code class="literal">postgres</code> user. For example, run the following as <code class="literal">root</code>: </p><pre class="programlisting"> projadd -c "PostgreSQL DB User" -K "project.max-shm-memory=(privileged,8GB,deny)" -U postgres -G postgres user.postgres </pre><p> </p><p> This command adds the <code class="literal">user.postgres</code> project and sets the shared memory maximum for the <code class="literal">postgres</code> user to 8GB, and takes effect the next time that user logs in, or when you restart <span class="productname">PostgreSQL</span> (not reload). The above assumes that <span class="productname">PostgreSQL</span> is run by the <code class="literal">postgres</code> user in the <code class="literal">postgres</code> group. No server reboot is required. 
</p><p> Other recommended kernel setting changes for database servers which will have a large number of connections are: </p><pre class="programlisting">
project.max-shm-ids=(priv,32768,deny)
project.max-sem-ids=(priv,4096,deny)
project.max-msg-ids=(priv,4096,deny)
</pre><p> </p><p> Additionally, if you are running <span class="productname">PostgreSQL</span> inside a zone, you may need to raise the zone resource usage limits as well. See "Chapter 2: Projects and Tasks" in the <em class="citetitle">System Administrator's Guide</em> for more information on <code class="literal">projects</code> and <code class="command">prctl</code>. </p></dd></dl></div></div><div class="sect2" id="SYSTEMD-REMOVEIPC"><div class="titlepage"><div><div><h3 class="title">18.4.2. systemd RemoveIPC</h3></div></div></div><a id="id-1.6.5.6.4.2" class="indexterm"></a><p> If <span class="productname">systemd</span> is in use, some care must be taken that IPC resources (shared memory and semaphores) are not prematurely removed by the operating system. This is especially of concern when installing PostgreSQL from source. Users of distribution packages of PostgreSQL are less likely to be affected, as the <code class="literal">postgres</code> user is then normally created as a system user. </p><p> The setting <code class="literal">RemoveIPC</code> in <code class="filename">logind.conf</code> controls whether IPC objects are removed when a user fully logs out. System users are exempt. This setting defaults to on in stock <span class="productname">systemd</span>, but some operating system distributions default it to off. </p><p> A typical observed effect when this setting is on is that the semaphore objects used by a PostgreSQL server are removed at apparently random times, leading to the server crashing with log messages like </p><pre class="screen">
LOG: semctl(1234567890, 0, IPC_RMID, ...) failed: Invalid argument
</pre><p> Different types of IPC objects (shared memory vs. semaphores, System V vs. 
POSIX) are treated slightly differently by <span class="productname">systemd</span>, so one might observe that some IPC resources are not removed in the same way as others. But it is not advisable to rely on these subtle differences. </p><p> A <span class="quote">“<span class="quote">user logging out</span>”</span> might happen as part of a maintenance job or manually when an administrator logs in as the <code class="literal">postgres</code> user or something similar, so it is hard to prevent in general. </p><p> What is a <span class="quote">“<span class="quote">system user</span>”</span> is determined at <span class="productname">systemd</span> compile time from the <code class="symbol">SYS_UID_MAX</code> setting in <code class="filename">/etc/login.defs</code>. </p><p> Packaging and deployment scripts should be careful to create the <code class="literal">postgres</code> user as a system user by using <code class="literal">useradd -r</code>, <code class="literal">adduser --system</code>, or equivalent. </p><p> Alternatively, if the user account was created incorrectly or cannot be changed, it is recommended to set </p><pre class="programlisting"> RemoveIPC=no </pre><p> in <code class="filename">/etc/systemd/logind.conf</code> or another appropriate configuration file. </p><div class="caution"><h3 class="title">Caution</h3><p> At least one of these two things has to be ensured, or the PostgreSQL server will be very unreliable. </p></div></div><div class="sect2" id="id-1.6.5.6.5"><div class="titlepage"><div><div><h3 class="title">18.4.3. Resource Limits</h3></div></div></div><p> Unix-like operating systems enforce various kinds of resource limits that might interfere with the operation of your <span class="productname">PostgreSQL</span> server. Of particular importance are limits on the number of processes per user, the number of open files per process, and the amount of memory available to each process. 
Each of these has a <span class="quote">“<span class="quote">hard</span>”</span> and a <span class="quote">“<span class="quote">soft</span>”</span> limit. The soft limit is what actually counts, but it can be changed by the user up to the hard limit. The hard limit can only be changed by the root user. The system call <code class="function">setrlimit</code> is responsible for setting these parameters. The shell's built-in command <code class="command">ulimit</code> (Bourne shells) or <code class="command">limit</code> (<span class="application">csh</span>) is used to control the resource limits from the command line. On BSD-derived systems the file <code class="filename">/etc/login.conf</code> controls the various resource limits set during login. See the operating system documentation for details. The relevant parameters are <code class="varname">maxproc</code>, <code class="varname">openfiles</code>, and <code class="varname">datasize</code>. For example: </p><pre class="programlisting">
default:\
...
        :datasize-cur=256M:\
        :maxproc-cur=256:\
        :openfiles-cur=256:\
...
</pre><p> (<code class="literal">-cur</code> is the soft limit. Append <code class="literal">-max</code> to set the hard limit.) </p><p> Kernels can also have system-wide limits on some resources. </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> On <span class="productname">Linux</span> <code class="filename">/proc/sys/fs/file-max</code> determines the maximum number of open files that the kernel will support. It can be changed by writing a different number into the file or by adding an assignment in <code class="filename">/etc/sysctl.conf</code>. The maximum limit of files per process is fixed at the time the kernel is compiled; see <code class="filename">/usr/src/linux/Documentation/proc.txt</code> for more information. 
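</p><p> To see how the per-process and system-wide limits on open files compare on a given machine, both can be queried from a shell. A minimal sketch, assuming a shell whose <code class="command">ulimit</code> accepts <code class="literal">-S</code>, <code class="literal">-H</code>, and <code class="literal">-n</code> (as bash and dash do); the function name and the threshold of 1000 are hypothetical: </p><pre class="programlisting">
# Succeed if the soft open-files limit of this shell is at least $1.
soft_nofile_at_least() {
    limit=$(ulimit -Sn)
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$1" ]
}

ulimit -Sn    # soft limit on open files per process
ulimit -Hn    # hard limit on open files per process

# System-wide limit; this file exists on Linux only.
if [ -r /proc/sys/fs/file-max ]; then
    cat /proc/sys/fs/file-max
fi

soft_nofile_at_least 1000 || echo "soft open-files limit is below 1000"
</pre><p>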
</p></li></ul></div><p> </p><p> The <span class="productname">PostgreSQL</span> server uses one process per connection so you should provide for at least as many processes as allowed connections, in addition to what you need for the rest of your system. This is usually not a problem but if you run several servers on one machine things might get tight. </p><p> The factory default limit on open files is often set to <span class="quote">“<span class="quote">socially friendly</span>”</span> values that allow many users to coexist on a machine without using an inappropriate fraction of the system resources. If you run many servers on a machine this is perhaps what you want, but on dedicated servers you might want to raise this limit. </p><p> On the other side of the coin, some systems allow individual processes to open large numbers of files; if more than a few processes do so then the system-wide limit can easily be exceeded. If you find this happening, and you do not want to alter the system-wide limit, you can set <span class="productname">PostgreSQL</span>'s <a class="xref" href="runtime-config-resource.html#GUC-MAX-FILES-PER-PROCESS">max_files_per_process</a> configuration parameter to limit the consumption of open files. </p></div><div class="sect2" id="LINUX-MEMORY-OVERCOMMIT"><div class="titlepage"><div><div><h3 class="title">18.4.4. Linux Memory Overcommit</h3></div></div></div><a id="id-1.6.5.6.6.2" class="indexterm"></a><a id="id-1.6.5.6.6.3" class="indexterm"></a><a id="id-1.6.5.6.6.4" class="indexterm"></a><p> In Linux 2.4 and later, the default virtual memory behavior is not optimal for <span class="productname">PostgreSQL</span>. Because of the way that the kernel implements memory overcommit, the kernel might terminate the <span class="productname">PostgreSQL</span> postmaster (the master server process) if the memory demands of either <span class="productname">PostgreSQL</span> or another process cause the system to run out of virtual memory. 
</p><p> If this happens, you will see a kernel message that looks like this (consult your system documentation and configuration on where to look for such a message): </p><pre class="programlisting"> Out of Memory: Killed process 12345 (postgres). </pre><p> This indicates that the <code class="filename">postgres</code> process has been terminated due to memory pressure. Although existing database connections will continue to function normally, no new connections will be accepted. To recover, <span class="productname">PostgreSQL</span> will need to be restarted. </p><p> One way to avoid this problem is to run <span class="productname">PostgreSQL</span> on a machine where you can be sure that other processes will not run the machine out of memory. If memory is tight, increasing the swap space of the operating system can help avoid the problem, because the out-of-memory (OOM) killer is invoked only when physical memory and swap space are exhausted. </p><p> If <span class="productname">PostgreSQL</span> itself is the cause of the system running out of memory, you can avoid the problem by changing your configuration. In some cases, it may help to lower memory-related configuration parameters, particularly <a class="link" href="runtime-config-resource.html#GUC-SHARED-BUFFERS"><code class="varname">shared_buffers</code></a> and <a class="link" href="runtime-config-resource.html#GUC-WORK-MEM"><code class="varname">work_mem</code></a>. In other cases, the problem may be caused by allowing too many connections to the database server itself. In many cases, it may be better to reduce <a class="link" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS"><code class="varname">max_connections</code></a> and instead make use of external connection-pooling software. </p><p> On Linux 2.6 and later, it is possible to modify the kernel's behavior so that it will not <span class="quote">“<span class="quote">overcommit</span>”</span> memory. 
Although this setting will not prevent the <a class="ulink" href="https://lwn.net/Articles/104179/" target="_top">OOM killer</a> from being invoked altogether, it will lower the chances significantly and will therefore lead to more robust system behavior. This is done by selecting strict overcommit mode via <code class="command">sysctl</code>: </p><pre class="programlisting"> sysctl -w vm.overcommit_memory=2 </pre><p> or placing an equivalent entry in <code class="filename">/etc/sysctl.conf</code>. You might also wish to modify the related setting <code class="varname">vm.overcommit_ratio</code>. For details see the kernel documentation file <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/overcommit-accounting" target="_top">https://www.kernel.org/doc/Documentation/vm/overcommit-accounting</a>. </p><p> Another approach, which can be used with or without altering <code class="varname">vm.overcommit_memory</code>, is to set the process-specific <em class="firstterm">OOM score adjustment</em> value for the postmaster process to <code class="literal">-1000</code>, thereby guaranteeing it will not be targeted by the OOM killer. The simplest way to do this is to execute </p><pre class="programlisting"> echo -1000 > /proc/self/oom_score_adj </pre><p> in the postmaster's startup script just before invoking the postmaster. Note that this action must be done as root, or it will have no effect; so a root-owned startup script is the easiest place to do it. If you do this, you should also set these environment variables in the startup script before invoking the postmaster: </p><pre class="programlisting">
export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
export PG_OOM_ADJUST_VALUE=0
</pre><p> These settings will cause postmaster child processes to run with the normal OOM score adjustment of zero, so that the OOM killer can still target them at need. 
You could use some other value for <code class="envar">PG_OOM_ADJUST_VALUE</code> if you want the child processes to run with some other OOM score adjustment. (<code class="envar">PG_OOM_ADJUST_VALUE</code> can also be omitted, in which case it defaults to zero.) If you do not set <code class="envar">PG_OOM_ADJUST_FILE</code>, the child processes will run with the same OOM score adjustment as the postmaster, which is unwise since the whole point is to ensure that the postmaster has a preferential setting. </p><p> Older Linux kernels do not offer <code class="filename">/proc/self/oom_score_adj</code>, but may have a previous version of the same functionality called <code class="filename">/proc/self/oom_adj</code>. This works the same except the disable value is <code class="literal">-17</code> not <code class="literal">-1000</code>. </p><div class="note"><h3 class="title">Note</h3><p> Some vendors' Linux 2.4 kernels are reported to have early versions of the 2.6 overcommit <code class="command">sysctl</code> parameter. However, setting <code class="literal">vm.overcommit_memory</code> to 2 on a 2.4 kernel that does not have the relevant code will make things worse, not better. It is recommended that you inspect the actual kernel source code (see the function <code class="function">vm_enough_memory</code> in the file <code class="filename">mm/mmap.c</code>) to verify what is supported in your kernel before you try this in a 2.4 installation. The presence of the <code class="filename">overcommit-accounting</code> documentation file should <span class="emphasis"><em>not</em></span> be taken as evidence that the feature is there. If in any doubt, consult a kernel expert or your kernel vendor. </p></div></div><div class="sect2" id="LINUX-HUGE-PAGES"><div class="titlepage"><div><div><h3 class="title">18.4.5. 
Linux Huge Pages</h3></div></div></div><p> Using huge pages reduces overhead when using large contiguous chunks of memory, as <span class="productname">PostgreSQL</span> does, particularly when using large values of <a class="xref" href="runtime-config-resource.html#GUC-SHARED-BUFFERS">shared_buffers</a>. To use this feature in <span class="productname">PostgreSQL</span> you need a kernel with <code class="varname">CONFIG_HUGETLBFS=y</code> and <code class="varname">CONFIG_HUGETLB_PAGE=y</code>. You will also have to adjust the kernel setting <code class="varname">vm.nr_hugepages</code>. To estimate the number of huge pages needed, start <span class="productname">PostgreSQL</span> without huge pages enabled and check the postmaster's anonymous shared memory segment size, as well as the system's huge page size, using the <code class="filename">/proc</code> file system. This might look like: </p><pre class="programlisting">
$ <strong class="userinput"><code>head -1 $PGDATA/postmaster.pid</code></strong>
4170
$ <strong class="userinput"><code>pmap 4170 | awk '/rw-s/ &amp;&amp; /zero/ {print $2}'</code></strong>
6490428K
$ <strong class="userinput"><code>grep ^Hugepagesize /proc/meminfo</code></strong>
Hugepagesize:       2048 kB
</pre><p> <code class="literal">6490428</code> / <code class="literal">2048</code> gives approximately <code class="literal">3169.154</code>, so in this example we need at least <code class="literal">3170</code> huge pages, which we can set with: </p><pre class="programlisting"> $ <strong class="userinput"><code>sysctl -w vm.nr_hugepages=3170</code></strong> </pre><p> A larger setting would be appropriate if other programs on the machine also need huge pages. Don't forget to add this setting to <code class="filename">/etc/sysctl.conf</code> so that it will be reapplied after reboots. </p><p> Sometimes the kernel is not able to allocate the desired number of huge pages immediately, so it might be necessary to repeat the command or to reboot. 
(Immediately after a reboot, most of the machine's memory should be available to convert into huge pages.) To check how many huge pages were actually allocated, use: </p><pre class="programlisting"> $ <strong class="userinput"><code>grep Huge /proc/meminfo</code></strong> </pre><p> </p><p> It may also be necessary to give the database server's operating system user permission to use huge pages by setting <code class="varname">vm.hugetlb_shm_group</code> via <span class="application">sysctl</span>, and/or to give it permission to lock memory with <code class="command">ulimit -l</code>. </p><p> The default behavior for huge pages in <span class="productname">PostgreSQL</span> is to use them when possible and to fall back to normal pages if that fails. To enforce the use of huge pages, you can set <a class="xref" href="runtime-config-resource.html#GUC-HUGE-PAGES">huge_pages</a> to <code class="literal">on</code> in <code class="filename">postgresql.conf</code>. Note that with this setting <span class="productname">PostgreSQL</span> will fail to start if not enough huge pages are available. </p><p> For a detailed description of the <span class="productname">Linux</span> huge pages feature, see <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt" target="_top">https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt</a>. </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="server-start.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-shutdown.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">18.3. Starting the Database Server </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> 18.5. Shutting Down the Server</td></tr></table></div></body></html>