<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>18.4. Managing Kernel Resources</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="server-start.html" title="18.3. Starting the Database Server" /><link rel="next" href="server-shutdown.html" title="18.5. Shutting Down the Server" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">18.4. Managing Kernel Resources</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="server-start.html" title="18.3. Starting the Database Server">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 18. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 18. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 11.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-shutdown.html" title="18.5. Shutting Down the Server">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="KERNEL-RESOURCES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">18.4. Managing Kernel Resources</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="kernel-resources.html#SYSVIPC">18.4.1. Shared Memory and Semaphores</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#SYSTEMD-REMOVEIPC">18.4.2. systemd RemoveIPC</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#id-1.6.5.6.5">18.4.3. 
Resource Limits</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-MEMORY-OVERCOMMIT">18.4.4. Linux Memory Overcommit</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-HUGE-PAGES">18.4.5. Linux Huge Pages</a></span></dt></dl></div><p>
   <span class="productname">PostgreSQL</span> can sometimes exhaust various operating system
   resource limits, especially when multiple copies of the server are running
   on the same system, or in very large installations.  This section explains
   the kernel resources used by <span class="productname">PostgreSQL</span> and the steps you
   can take to resolve problems related to kernel resource consumption.
  </p><div class="sect2" id="SYSVIPC"><div class="titlepage"><div><div><h3 class="title">18.4.1. Shared Memory and Semaphores</h3></div></div></div><a id="id-1.6.5.6.3.2" class="indexterm"></a><a id="id-1.6.5.6.3.3" class="indexterm"></a><p>
    <span class="productname">PostgreSQL</span> requires the operating system to provide
    inter-process communication (<acronym class="acronym">IPC</acronym>) features, specifically
    shared memory and semaphores.  Unix-derived systems typically provide
    <span class="quote">“<span class="quote"><span class="systemitem">System V</span></span>”</span> <acronym class="acronym">IPC</acronym>,
    <span class="quote">“<span class="quote"><span class="systemitem">POSIX</span></span>”</span> <acronym class="acronym">IPC</acronym>, or both.
    <span class="systemitem">Windows</span> has its own implementation of
    these features and is not discussed here.
   </p><p>
    The complete lack of these facilities is usually manifested by an
    <span class="quote">“<span class="quote"><span class="errorname">Illegal system call</span></span>”</span> error upon server
    start.  In that case there is no alternative but to reconfigure your
    kernel.  <span class="productname">PostgreSQL</span> won't work without them.
    This situation is rare, however, among modern operating systems.
   </p><p>
    Upon starting the server, <span class="productname">PostgreSQL</span> normally allocates
    a very small amount of System V shared memory, as well as a much larger
    amount of POSIX (<code class="function">mmap</code>) shared memory.
    In addition, a significant number of semaphores, which can be either
    System V or POSIX style, are created at server startup.  Currently,
    POSIX semaphores are used on Linux and FreeBSD systems while other
    platforms use System V semaphores.
   </p><div class="note"><h3 class="title">Note</h3><p>
     Prior to <span class="productname">PostgreSQL</span> 9.3, only System V shared memory
     was used, so the amount of System V shared memory required to start the
     server was much larger.  If you are running an older version of the
     server, please consult the documentation for your server version.
    </p></div><p>
    System V <acronym class="acronym">IPC</acronym> features are typically constrained by
    system-wide allocation limits.
    When <span class="productname">PostgreSQL</span> exceeds one of these limits,
    the server will refuse to start and
    should leave an instructive error message describing the problem
    and what to do about it. (See also <a class="xref" href="server-start.html#SERVER-START-FAILURES" title="18.3.1. Server Start-up Failures">Section 18.3.1</a>.) The relevant kernel
    parameters are named consistently across different systems; <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a> gives an overview. The methods to set
    them, however, vary. Suggestions for some platforms are given below.
   </p><div class="table" id="SYSVIPC-PARAMETERS"><p class="title"><strong>Table 18.1. <span class="systemitem">System V</span> <acronym class="acronym">IPC</acronym> Parameters</strong></p><div class="table-contents"><table class="table" summary="System V IPC Parameters" border="1"><colgroup><col /><col /><col /></colgroup><thead><tr><th>Name</th><th>Description</th><th>Values needed to run one <span class="productname">PostgreSQL</span> instance</th></tr></thead><tbody><tr><td><code class="varname">SHMMAX</code></td><td>Maximum size of shared memory segment (bytes)</td><td>at least 1kB, but the default is usually much higher</td></tr><tr><td><code class="varname">SHMMIN</code></td><td>Minimum size of shared memory segment (bytes)</td><td>1</td></tr><tr><td><code class="varname">SHMALL</code></td><td>Total amount of shared memory available (bytes or pages)</td><td>same as <code class="varname">SHMMAX</code> if bytes,
        or <code class="literal">ceil(SHMMAX/PAGE_SIZE)</code> if pages,
        plus room for other applications</td></tr><tr><td><code class="varname">SHMSEG</code></td><td>Maximum number of shared memory segments per process</td><td>only 1 segment is needed, but the default is much higher</td></tr><tr><td><code class="varname">SHMMNI</code></td><td>Maximum number of shared memory segments system-wide</td><td>like <code class="varname">SHMSEG</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNI</code></td><td>Maximum number of semaphore identifiers (i.e., sets)</td><td>at least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNS</code></td><td>Maximum number of semaphores system-wide</td><td><code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) * 17</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMSL</code></td><td>Maximum number of semaphores per set</td><td>at least 17</td></tr><tr><td><code class="varname">SEMMAP</code></td><td>Number of entries in semaphore map</td><td>see text</td></tr><tr><td><code class="varname">SEMVMX</code></td><td>Maximum value of semaphore</td><td>at least 1000 (The default is often 32767; do not change unless necessary)</td></tr></tbody></table></div></div><br class="table-break" /><p>
    <span class="productname">PostgreSQL</span> requires a few bytes of System V shared memory
    (typically 48 bytes, on 64-bit platforms) for each copy of the server.
    On most modern operating systems, this amount can easily be allocated.
    However, if you are running many copies of the server, or if other
    applications are also using System V shared memory, it may be necessary to
    increase <code class="varname">SHMALL</code>, which is the total amount of System V shared
    memory system-wide.  Note that <code class="varname">SHMALL</code> is measured in pages
    rather than bytes on many systems.
   </p><p>
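    As a minimal sketch of the byte-to-page conversion, the following shell
    fragment rounds a byte count up to whole pages.  The helper name
    <code class="function">bytes_to_pages</code> is hypothetical, and
    <code class="literal">getconf PAGE_SIZE</code> is assumed to be available
    on your platform:
</p><pre class="programlisting">
# Hypothetical helper: convert a byte count ($1) into pages of size $2,
# rounding up, i.e. ceil(bytes / page_size).
bytes_to_pages() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

page_size=$(getconf PAGE_SIZE)
# Example: the number of pages that 16 GB corresponds to on this machine
bytes_to_pages 17179869184 "$page_size"
</pre><p>
    With a 4096-byte page size this prints 4194304.
   </p><p>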
    Less likely to cause problems is the minimum size for shared
    memory segments (<code class="varname">SHMMIN</code>), which should be at most
    approximately 32 bytes for <span class="productname">PostgreSQL</span> (it is
    usually just 1). The limits on the number of segments system-wide
    (<code class="varname">SHMMNI</code>) and per process (<code class="varname">SHMSEG</code>) are unlikely
    to cause a problem unless your system has them set to zero.
   </p><p>
    When using System V semaphores,
    <span class="productname">PostgreSQL</span> uses one semaphore per allowed connection
    (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
    (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background
    process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>), in sets of 16.
    Each such set will
    also contain a 17th semaphore which contains a <span class="quote">“<span class="quote">magic
    number</span>”</span>, to detect collision with semaphore sets used by
    other applications. The maximum number of semaphores in the system
    is set by <code class="varname">SEMMNS</code>, which consequently must be at least
    as high as <code class="varname">max_connections</code> plus
    <code class="varname">autovacuum_max_workers</code> plus <code class="varname">max_worker_processes</code>,
    plus one extra for each 16
    allowed connections plus workers (see the formula in <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a>).  The parameter <code class="varname">SEMMNI</code>
    determines the limit on the number of semaphore sets that can
    exist on the system at one time.  Hence this parameter must be at
    least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</code>.
    Failures caused by these limits come from the function
    <code class="function">semget</code> and are usually reported with the
    confusing message <span class="quote">“<span class="quote">No space left on device</span>”</span>.
    Lowering the number of allowed connections is a temporary workaround.
   </p><p>
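    The two formulas can be evaluated with a short shell sketch.  The function
    names <code class="function">pg_semmni</code> and
    <code class="function">pg_semmns</code> are hypothetical, and the sample
    values are only an illustration, so substitute your own settings:
</p><pre class="programlisting">
# Hypothetical helpers; arguments are max_connections,
# autovacuum_max_workers and max_worker_processes, in that order.

# Semaphore sets needed: ceil((sum + 5) / 16)
pg_semmni() {
    echo $(( ($1 + $2 + $3 + 5 + 15) / 16 ))
}

# Semaphores needed: 17 per set (16 usable plus the "magic number" one)
pg_semmns() {
    echo $(( $(pg_semmni "$1" "$2" "$3") * 17 ))
}

# Example: max_connections=100, autovacuum_max_workers=3,
# max_worker_processes=8
pg_semmni 100 3 8
pg_semmns 100 3 8
</pre><p>
    For these sample values the sketch prints 8 and 136.  Both figures are
    minimums for one server and leave no room for other applications.
   </p><p>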
    In some cases it might also be necessary to increase
    <code class="varname">SEMMAP</code> to be at least on the order of
    <code class="varname">SEMMNS</code>.  If the system has this parameter
    (many do not), it defines the size of the semaphore
    resource map, in which each contiguous block of available semaphores
    needs an entry. When a semaphore set is freed it is either added to
    an existing entry that is adjacent to the freed block or it is
    registered under a new map entry. If the map is full, the freed
    semaphores get lost (until reboot). Fragmentation of the semaphore
    space could over time lead to fewer available semaphores than there
    should be.
   </p><p>
    Various other settings related to <span class="quote">“<span class="quote">semaphore undo</span>”</span>, such as
    <code class="varname">SEMMNU</code> and <code class="varname">SEMUME</code>, do not affect
    <span class="productname">PostgreSQL</span>.
   </p><p>
    When using POSIX semaphores, the number of semaphores needed is the
    same as for System V, that is one semaphore per allowed connection
    (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
    (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background
    process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>).
    On the platforms where this option is preferred, there is no specific
    kernel limit on the number of POSIX semaphores.
   </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="systemitem">AIX</span>
      <a id="id-1.6.5.6.3.16.1.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        At least as of version 5.1, it should not be necessary to do
        any special configuration for such parameters as
        <code class="varname">SHMMAX</code>, as it appears this is configured to
        allow all memory to be used as shared memory.  That is the
        sort of configuration commonly used for other databases such
        as <span class="application">DB2</span>.</p><p> It might, however, be necessary to modify the global
       <code class="command">ulimit</code> information in
       <code class="filename">/etc/security/limits</code>, as the default hard
       limits for file sizes (<code class="varname">fsize</code>) and numbers of
       files (<code class="varname">nofiles</code>) might be too low.
       </p></dd><dt><span class="term"><span class="systemitem">FreeBSD</span>
      <a id="id-1.6.5.6.3.16.2.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        The default IPC settings can be changed using
        the <code class="command">sysctl</code> or
        <code class="command">loader</code> interfaces.  The following
        parameters can be set using <code class="command">sysctl</code>:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmall=32768</code></strong>
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmmax=134217728</code></strong>
</pre><p>
        To make these settings persist over reboots, modify
        <code class="filename">/etc/sysctl.conf</code>.
       </p><p>
        These semaphore-related settings are read-only as far as
        <code class="command">sysctl</code> is concerned, but can be set in
        <code class="filename">/boot/loader.conf</code>:
</p><pre class="programlisting">
kern.ipc.semmni=256
kern.ipc.semmns=512
</pre><p>
        After modifying that file, a reboot is required for the new
        settings to take effect.
       </p><p>
        You might also want to configure your kernel to lock shared
        memory into RAM and prevent it from being paged out to swap.
        This can be accomplished using the <code class="command">sysctl</code>
        setting <code class="literal">kern.ipc.shm_use_phys</code>.
       </p><p>
        If you run <span class="productname">PostgreSQL</span> in FreeBSD jails with System V IPC enabled via <span class="application">sysctl</span>'s
        <code class="literal">security.jail.sysvipc_allowed</code>, <span class="application">postmaster</span>s
        running in different jails should be run by different operating system
        users.  This improves security because it prevents non-root users
        from interfering with shared memory or semaphores in different jails,
        and it allows the PostgreSQL IPC cleanup code to function properly.
        (In FreeBSD 6.0 and later the IPC cleanup code does not properly detect
        processes in other jails, preventing the running of postmasters on the
        same port in different jails.)
       </p><p>
        <span class="systemitem">FreeBSD</span> versions before 4.0 work like
        old <span class="systemitem">OpenBSD</span> (see below).
       </p></dd><dt><span class="term"><span class="systemitem">NetBSD</span>
      <a id="id-1.6.5.6.3.16.3.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        In <span class="systemitem">NetBSD</span> 5.0 and later,
        IPC parameters can be adjusted using <code class="command">sysctl</code>,
        for example:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kern.ipc.semmni=100</code></strong>
</pre><p>
        To make these settings persist over reboots, modify
        <code class="filename">/etc/sysctl.conf</code>.
       </p><p>
        You will usually want to increase <code class="literal">kern.ipc.semmni</code>
        and <code class="literal">kern.ipc.semmns</code>,
        as <span class="systemitem">NetBSD</span>'s default settings
        for these are uncomfortably small.
       </p><p>
        You might also want to configure your kernel to lock shared
        memory into RAM and prevent it from being paged out to swap.
        This can be accomplished using the <code class="command">sysctl</code>
        setting <code class="literal">kern.ipc.shm_use_phys</code>.
       </p><p>
        <span class="systemitem">NetBSD</span> versions before 5.0
        work like old <span class="systemitem">OpenBSD</span>
        (see below), except that kernel parameters should be set with the
        keyword <code class="literal">options</code> not <code class="literal">option</code>.
       </p></dd><dt><span class="term"><span class="systemitem">OpenBSD</span>
      <a id="id-1.6.5.6.3.16.4.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        In <span class="systemitem">OpenBSD</span> 3.3 and later,
        IPC parameters can be adjusted using <code class="command">sysctl</code>,
        for example:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.seminfo.semmni=100</code></strong>
</pre><p>
        To make these settings persist over reboots, modify
        <code class="filename">/etc/sysctl.conf</code>.
       </p><p>
        You will usually want to
        increase <code class="literal">kern.seminfo.semmni</code>
        and <code class="literal">kern.seminfo.semmns</code>,
        as <span class="systemitem">OpenBSD</span>'s default settings
        for these are uncomfortably small.
       </p><p>
        In older <span class="systemitem">OpenBSD</span> versions,
        you will need to build a custom kernel to change the IPC parameters.
        Make sure that the options <code class="varname">SYSVSHM</code>
        and <code class="varname">SYSVSEM</code> are enabled, too.  (They are by
        default.)  The following shows an example of how to set the various
        parameters in the kernel configuration file:
</p><pre class="programlisting">
option        SYSVSHM
option        SHMMAXPGS=4096
option        SHMSEG=256

option        SYSVSEM
option        SEMMNI=256
option        SEMMNS=512
option        SEMMNU=256
</pre><p>
       </p></dd><dt><span class="term"><span class="systemitem">HP-UX</span>
      <a id="id-1.6.5.6.3.16.5.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        The default settings tend to suffice for normal installations.
        On <span class="productname">HP-UX</span> 10, the factory default for
        <code class="varname">SEMMNS</code> is 128, which might be too low for larger
        database sites.
       </p><p>
        <acronym class="acronym">IPC</acronym> parameters can be set in the <span class="application">System
        Administration Manager</span> (<acronym class="acronym">SAM</acronym>) under
        <span class="guimenu">Kernel
        Configuration</span> → <span class="guimenuitem">Configurable Parameters</span>. Choose
        <span class="guibutton">Create A New Kernel</span> when you're done.
       </p></dd><dt><span class="term"><span class="systemitem">Linux</span>
      <a id="id-1.6.5.6.3.16.6.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        The default maximum segment size is 32 MB, and the
        default maximum total size is 2097152
        pages.  A page is almost always 4096 bytes except in unusual
        kernel configurations with <span class="quote">“<span class="quote">huge pages</span>”</span>
        (use <code class="literal">getconf PAGE_SIZE</code> to verify).
       </p><p>
        The shared memory size settings can be changed via the
        <code class="command">sysctl</code> interface.  For example, to allow 16 GB:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kernel.shmmax=17179869184</code></strong>
<code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kernel.shmall=4194304</code></strong>
</pre><p>
        To preserve these settings across reboots, also add them to
        the file <code class="filename">/etc/sysctl.conf</code>.  Doing that is
        highly recommended.
       </p><p>
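        For instance, the 16 GB example above could be recorded with entries
        like the following.  This is only a sketch; some distributions prefer
        a separate file under <code class="filename">/etc/sysctl.d</code>:
</p><pre class="programlisting">
kernel.shmmax = 17179869184
kernel.shmall = 4194304
</pre><p>
        Running <code class="command">sysctl -p</code> as root loads the file
        without a reboot.
       </p><p>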
        Ancient distributions might not have the <code class="command">sysctl</code> program,
        but equivalent changes can be made by manipulating the
        <code class="filename">/proc</code> file system:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>echo 17179869184 &gt;/proc/sys/kernel/shmmax</code></strong>
<code class="prompt">#</code> <strong class="userinput"><code>echo 4194304 &gt;/proc/sys/kernel/shmall</code></strong>
</pre><p>
       </p><p>
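        Before changing anything, it can help to look at the values currently
        in effect.  The sketch below reads them from the
        <code class="filename">/proc</code> file system; the function name
        <code class="function">show_ipc_limits</code> is hypothetical, and
        files that do not exist on your kernel are skipped:
</p><pre class="programlisting">
# Hypothetical helper: print the current System V IPC limits.
# The "sem" file holds four numbers: SEMMSL SEMMNS SEMOPM SEMMNI.
show_ipc_limits() {
    for f in shmmax shmall shmmni sem; do
        if [ -r "/proc/sys/kernel/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "/proc/sys/kernel/$f")"
        fi
    done
}

show_ipc_limits
</pre><p>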
        The remaining defaults are quite generously sized, and usually
        do not require changes.
       </p></dd><dt><span class="term"><span class="systemitem">macOS</span>
      <a id="id-1.6.5.6.3.16.7.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        The recommended method for configuring shared memory in macOS
        is to create a file named <code class="filename">/etc/sysctl.conf</code>,
        containing variable assignments such as:
</p><pre class="programlisting">
kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024
</pre><p>
        Note that in some macOS versions,
        <span class="emphasis"><em>all five</em></span> shared-memory parameters must be set in
        <code class="filename">/etc/sysctl.conf</code>, else the values will be ignored.
       </p><p>
        Beware that recent releases of macOS ignore attempts to set
        <code class="varname">SHMMAX</code> to a value that isn't an exact multiple of 4096.
       </p><p>
        <code class="varname">SHMALL</code> is measured in 4 kB pages on this platform.
       </p><p>
        In older macOS versions, you will need to reboot to have changes in the
        shared memory parameters take effect.  As of 10.5 it is possible to
        change all but <code class="varname">SHMMNI</code> on the fly, using
        <span class="application">sysctl</span>.  But it's still best to set up your preferred
        values via <code class="filename">/etc/sysctl.conf</code>, so that the values will be
        kept across reboots.
       </p><p>
        The file <code class="filename">/etc/sysctl.conf</code> is only honored in macOS
        10.3.9 and later.  If you are running a previous 10.3.x release,
        you must edit the file <code class="filename">/etc/rc</code>
        and change the values in the following commands:
</p><pre class="programlisting">
sysctl -w kern.sysv.shmmax
sysctl -w kern.sysv.shmmin
sysctl -w kern.sysv.shmmni
sysctl -w kern.sysv.shmseg
sysctl -w kern.sysv.shmall
</pre><p>
        Note that
        <code class="filename">/etc/rc</code> is usually overwritten by macOS system updates,
        so you should expect to have to redo these edits after each update.
       </p><p>
        In macOS 10.2 and earlier, instead edit these commands in the file
        <code class="filename">/System/Library/StartupItems/SystemTuning/SystemTuning</code>.
       </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.6 to 2.9 (Solaris
      6 to Solaris 9)
      <a id="id-1.6.5.6.3.16.8.1.2" class="indexterm"></a>
      </span></dt><dd><p>
        The relevant settings can be changed in
        <code class="filename">/etc/system</code>, for example:
</p><pre class="programlisting">
set shmsys:shminfo_shmmax=0x2000000
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=256
set shmsys:shminfo_shmseg=256

set semsys:seminfo_semmap=256
set semsys:seminfo_semmni=512
set semsys:seminfo_semmns=512
set semsys:seminfo_semmsl=32
</pre><p>
        You need to reboot for the changes to take effect.  See also
        <a class="ulink" href="http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html" target="_top">http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html</a>
        for information on shared memory under older versions of Solaris.
       </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.10 (Solaris
      10) and later<br /></span><span class="term"><span class="systemitem">OpenSolaris</span></span></dt><dd><p>
        In Solaris 10 and later, and OpenSolaris, the default shared memory and
        semaphore settings are good enough for most
        <span class="productname">PostgreSQL</span> applications.  Solaris now defaults
        to a <code class="varname">SHMMAX</code> of one-quarter of system <acronym class="acronym">RAM</acronym>.
        To further adjust this setting, use a project setting associated
        with the <code class="literal">postgres</code> user.  For example, run the
        following as <code class="literal">root</code>:
</p><pre class="programlisting">
projadd -c "PostgreSQL DB User" -K "project.max-shm-memory=(privileged,8GB,deny)" -U postgres -G postgres user.postgres
</pre><p>
       </p><p>
        This command adds the <code class="literal">user.postgres</code> project and
        sets the shared memory maximum for the <code class="literal">postgres</code>
        user to 8GB, and takes effect the next time that user logs
        in, or when you restart <span class="productname">PostgreSQL</span> (not reload).
        The above assumes that <span class="productname">PostgreSQL</span> is run by
        the <code class="literal">postgres</code> user in the <code class="literal">postgres</code>
        group.  No server reboot is required.
       </p><p>
        Other recommended kernel setting changes for database servers which will
        have a large number of connections are:
</p><pre class="programlisting">
project.max-shm-ids=(priv,32768,deny)
project.max-sem-ids=(priv,4096,deny)
project.max-msg-ids=(priv,4096,deny)
</pre><p>
       </p><p>
        Additionally, if you are running <span class="productname">PostgreSQL</span>
        inside a zone, you may need to raise the zone resource usage
        limits as well.  See "Chapter 2: Projects and Tasks" in the
        <em class="citetitle">System Administrator's Guide</em> for more
        information on <code class="literal">projects</code> and <code class="command">prctl</code>.
       </p></dd></dl></div></div><div class="sect2" id="SYSTEMD-REMOVEIPC"><div class="titlepage"><div><div><h3 class="title">18.4.2. systemd RemoveIPC</h3></div></div></div><a id="id-1.6.5.6.4.2" class="indexterm"></a><p>
    If <span class="productname">systemd</span> is in use, some care must be taken
    that IPC resources (shared memory and semaphores) are not prematurely
    removed by the operating system.  This is especially of concern when
    installing PostgreSQL from source.  Users of distribution packages of
    PostgreSQL are less likely to be affected, as
    the <code class="literal">postgres</code> user is then normally created as a system
    user.
   </p><p>
    The setting <code class="literal">RemoveIPC</code>
    in <code class="filename">logind.conf</code> controls whether IPC objects are
    removed when a user fully logs out.  System users are exempt.  This
    setting defaults to on in stock <span class="productname">systemd</span>, but
    some operating system distributions default it to off.
   </p><p>
    A typical observed effect when this setting is on is that the semaphore
    objects used by a PostgreSQL server are removed at apparently random
    times, leading to the server crashing with log messages like
</p><pre class="screen">
LOG: semctl(1234567890, 0, IPC_RMID, ...) failed: Invalid argument
</pre><p>
    Different types of IPC objects (shared memory vs. semaphores, System V
    vs. POSIX) are treated slightly differently
    by <span class="productname">systemd</span>, so one might observe that some IPC
    resources are not removed in the same way as others.  But it is not
    advisable to rely on these subtle differences.
   </p><p>
    A <span class="quote">“<span class="quote">user logging out</span>”</span> might happen as part of a maintenance
    job or manually when an administrator logs in as
    the <code class="literal">postgres</code> user or something similar, so it is hard
    to prevent in general.
   </p><p>
    What is a <span class="quote">“<span class="quote">system user</span>”</span> is determined
    at <span class="productname">systemd</span> compile time from
    the <code class="symbol">SYS_UID_MAX</code> setting
    in <code class="filename">/etc/login.defs</code>.
   </p><p>
    Packaging and deployment scripts should be careful to create
    the <code class="literal">postgres</code> user as a system user by
    using <code class="literal">useradd -r</code>, <code class="literal">adduser --system</code>,
    or equivalent.
   </p><p>
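    To check how an existing account is classified, its UID can be compared
    with <code class="symbol">SYS_UID_MAX</code>.  The following is a sketch
    under several assumptions: the account is named
    <code class="literal">postgres</code>, the value in
    <code class="filename">/etc/login.defs</code> matches the one
    <span class="productname">systemd</span> was compiled with, and 999 is
    used when the file does not set it.  The helper name
    <code class="function">is_system_uid</code> is hypothetical:
</p><pre class="programlisting">
# Hypothetical helper: succeeds if UID $1 does not exceed the limit $2.
is_system_uid() {
    [ "$1" -le "$2" ]
}

# Assumption: SYS_UID_MAX in /etc/login.defs (999 if unset) matches the
# value that systemd was built with.
sys_uid_max=""
if [ -r /etc/login.defs ]; then
    sys_uid_max=$(awk '$1 == "SYS_UID_MAX" { print $2 }' /etc/login.defs)
fi

# Assumption: the server account is named "postgres".
if uid=$(id -u postgres); then
    if is_system_uid "$uid" "${sys_uid_max:-999}"; then
        echo "postgres (UID $uid) is a system user"
    else
        echo "postgres (UID $uid) is a regular user"
    fi
else
    echo "no postgres account found"
fi
</pre><p>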
    Alternatively, if the user account was created incorrectly or cannot be
    changed, it is recommended to set
</p><pre class="programlisting">
RemoveIPC=no
</pre><p>
    in <code class="filename">/etc/systemd/logind.conf</code> or another appropriate
    configuration file.
   </p><div class="caution"><h3 class="title">Caution</h3><p>
     At least one of these two things has to be ensured, or the PostgreSQL
     server will be very unreliable.
    </p></div></div><div class="sect2" id="id-1.6.5.6.5"><div class="titlepage"><div><div><h3 class="title">18.4.3. Resource Limits</h3></div></div></div><p>
    Unix-like operating systems enforce various kinds of resource limits
    that might interfere with the operation of your
    <span class="productname">PostgreSQL</span> server. Of particular
    importance are limits on the number of processes per user, the
    number of open files per process, and the amount of memory available
    to each process. Each of these has a <span class="quote">“<span class="quote">hard</span>”</span> and a
    <span class="quote">“<span class="quote">soft</span>”</span> limit. The soft limit is what actually counts
    but it can be changed by the user up to the hard limit. The hard
    limit can only be changed by the root user. The system call
    <code class="function">setrlimit</code> is responsible for setting these
    parameters. The shell's built-in command <code class="command">ulimit</code>
    (Bourne shells) or <code class="command">limit</code> (<span class="application">csh</span>) is
    used to control the resource limits from the command line. On
    BSD-derived systems the file <code class="filename">/etc/login.conf</code>
    controls the various resource limits set during login. See the
    operating system documentation for details. The relevant
    parameters are <code class="varname">maxproc</code>,
    <code class="varname">openfiles</code>, and <code class="varname">datasize</code>. For
    example:
</p><pre class="programlisting">
default:\
...
        :datasize-cur=256M:\
        :maxproc-cur=256:\
        :openfiles-cur=256:\
...
</pre><p>
    (<code class="literal">-cur</code> is the soft limit.  Append
    <code class="literal">-max</code> to set the hard limit.)
   </p><p>
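    As a rough illustration of the soft and hard limits described above, the
    following sketch for a Bourne-type shell prints both limits on open files
    and then raises the soft limit to the hard limit for the current shell
    session. It assumes the <code class="literal">-S</code>, <code class="literal">-H</code>,
    and <code class="literal">-n</code> options of <code class="command">ulimit</code>; the option
    letters for other resources differ between shells, so check your shell's
    documentation.
</p><pre class="programlisting">
# Soft and hard limits on open files for this shell and its children.
ulimit -Sn
ulimit -Hn

# Raise the soft limit up to the hard limit; no root privileges are needed.
ulimit -Sn "$(ulimit -Hn)"
</pre><p>
    The change applies only to the current shell and to processes started
    from it, so it would have to be made in the environment that launches
    the server.
   </p><p>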
    Kernels can also have system-wide limits on some resources.
    </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
      On <span class="productname">Linux</span>,
      <code class="filename">/proc/sys/fs/file-max</code> determines the
      maximum number of open files that the kernel will support.  It can
      be changed by writing a different number into the file or by
      adding an assignment in <code class="filename">/etc/sysctl.conf</code>.
      The maximum limit of files per process is fixed at the time the
      kernel is compiled; see
      <code class="filename">/usr/src/linux/Documentation/proc.txt</code> for
      more information.
      </p></li></ul></div><p>
   </p><p>
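    As a small sketch of the <span class="productname">Linux</span> case above, the
    current system-wide limit can be read directly from the file. The
    persistent setting shown in the comment uses a hypothetical value chosen
    only for illustration.
</p><pre class="programlisting">
# Show the current system-wide limit on open files.
cat /proc/sys/fs/file-max

# Hypothetical persistent setting, as a line in /etc/sysctl.conf:
#   fs.file-max = 200000
</pre><p>
    Writing a new value into the file requires root privileges and lasts
    only until the next reboot, which is why the
    <code class="filename">/etc/sysctl.conf</code> entry is the usual choice for a
    permanent change.
   </p><p>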
    The <span class="productname">PostgreSQL</span> server uses one process
    per connection, so you should provide for at least as many processes
    as allowed connections, in addition to what you need for the rest
    of your system.  This is usually not a problem, but if you run
    several servers on one machine, the process limit can become a constraint.
   </p><p>
    The factory default limit on open files is often set to
    <span class="quote">“<span class="quote">socially friendly</span>”</span> values that allow many users to
    coexist on a machine without using an inappropriate fraction of
    the system resources.  If you run many servers on a machine, this
    is perhaps what you want, but on a dedicated server you might want to
    raise this limit.
   </p><p>
    Conversely, some systems allow individual
    processes to open large numbers of files; if more than a few
    processes do so, the system-wide limit can easily be exceeded.
    If you find this happening, and you do not want to alter the
    system-wide limit, you can set <span class="productname">PostgreSQL</span>'s <a class="xref" href="runtime-config-resource.html#GUC-MAX-FILES-PER-PROCESS">max_files_per_process</a> configuration parameter to
    limit the consumption of open files.
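   </p><p>
    To judge how close a system is to its system-wide limit, current usage
    can be inspected first. The sketch below assumes a
    <span class="productname">Linux</span> kernel, which reports usage in
    <code class="filename">/proc/sys/fs/file-nr</code>; that file is not part of the
    discussion above and other platforms provide different tools.
</p><pre class="programlisting">
# Prints three numbers: allocated file handles, allocated but unused
# handles, and the system-wide maximum (the same value as file-max).
cat /proc/sys/fs/file-nr
</pre><p>
    If the first number approaches the third, lowering
    <code class="varname">max_files_per_process</code> in
    <code class="filename">postgresql.conf</code>, for instance to a hypothetical value
    such as <code class="literal">500</code>, is one way to reduce the server's share.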
   </p></div><div class="sect2" id="LINUX-MEMORY-OVERCOMMIT"><div class="titlepage"><div><div><h3 class="title">18.4.4. Linux Memory Overcommit</h3></div></div></div><a id="id-1.6.5.6.6.2" class="indexterm"></a><a id="id-1.6.5.6.6.3" class="indexterm"></a><a id="id-1.6.5.6.6.4" class="indexterm"></a><p>
    In Linux 2.4 and later, the default virtual memory behavior is not
    optimal for <span class="productname">PostgreSQL</span>. Because of the
    way that the kernel implements memory overcommit, the kernel might
    terminate the <span class="productname">PostgreSQL</span> postmaster (the
    master server process) if the memory demands of either
    <span class="productname">PostgreSQL</span> or another process cause the
    system to run out of virtual memory.
   </p><p>
    If this happens, you will see a kernel message that looks like
    this (consult your system documentation and configuration on where
    to look for such a message):
</p><pre class="programlisting">
Out of Memory: Killed process 12345 (postgres).
</pre><p>
    This indicates that the <code class="filename">postgres</code> process
    has been terminated due to memory pressure.
    Although existing database connections will continue to function
    normally, no new connections will be accepted.  To recover,
    <span class="productname">PostgreSQL</span> will need to be restarted.
   </p><p>
    One way to avoid this problem is to run
    <span class="productname">PostgreSQL</span> on a machine where you can
    be sure that other processes will not run the machine out of
    memory.  If memory is tight, increasing the swap space of the
    operating system can help avoid the problem, because the
    out-of-memory (OOM) killer is invoked only when physical memory and
    swap space are exhausted.
   </p><p>
    If <span class="productname">PostgreSQL</span> itself is the cause of the
    system running out of memory, you can avoid the problem by changing
    your configuration.  In some cases, it may help to lower memory-related
    configuration parameters, particularly
    <a class="link" href="runtime-config-resource.html#GUC-SHARED-BUFFERS"><code class="varname">shared_buffers</code></a>
    and <a class="link" href="runtime-config-resource.html#GUC-WORK-MEM"><code class="varname">work_mem</code></a>.  In
    other cases, the problem may be caused by allowing too many connections
    to the database server itself.  It is then often better to reduce
    <a class="link" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS"><code class="varname">max_connections</code></a>
    and instead make use of external connection-pooling software.
   </p><p>
    On Linux 2.6 and later, it is possible to modify the
    kernel's behavior so that it will not <span class="quote">“<span class="quote">overcommit</span>”</span> memory.
    Although this setting will not prevent the <a class="ulink" href="https://lwn.net/Articles/104179/" target="_top">OOM killer</a> from being invoked
    altogether, it will lower the chances significantly and will therefore
    lead to more robust system behavior.  This is done by selecting strict
    overcommit mode via <code class="command">sysctl</code>:
</p><pre class="programlisting">
sysctl -w vm.overcommit_memory=2
</pre><p>
    or placing an equivalent entry in <code class="filename">/etc/sysctl.conf</code>.
    You might also wish to modify the related setting
    <code class="varname">vm.overcommit_ratio</code>.  For details see the kernel documentation
    file <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/overcommit-accounting" target="_top">https://www.kernel.org/doc/Documentation/vm/overcommit-accounting</a>.
   </p><p>
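    To see how strict overcommit mode would affect a given machine, the
    kernel's commit accounting can be inspected through
    <code class="filename">/proc/meminfo</code>. This is only a sketch: it assumes the
    field names <code class="literal">CommitLimit</code> and
    <code class="literal">Committed_AS</code> reported by current Linux kernels, and
    the ratio shown in the comment is a hypothetical value.
</p><pre class="programlisting">
# CommitLimit:  the total the kernel allows to be committed in mode 2
# Committed_AS: the amount currently committed
grep -i commit /proc/meminfo

# Hypothetical persistent settings, as lines in /etc/sysctl.conf:
#   vm.overcommit_memory = 2
#   vm.overcommit_ratio = 80
</pre><p>
    In mode 2, <code class="literal">CommitLimit</code> is roughly the swap space plus
    <code class="varname">vm.overcommit_ratio</code> percent of physical memory, so
    raising the ratio allows more memory to be committed before allocations
    start to fail.
   </p><p>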
    Another approach, which can be used with or without altering
    <code class="varname">vm.overcommit_memory</code>, is to set the process-specific
    <em class="firstterm">OOM score adjustment</em> value for the postmaster process to
    <code class="literal">-1000</code>, thereby guaranteeing it will not be targeted by the OOM
    killer.  The simplest way to do this is to execute
</p><pre class="programlisting">
echo -1000 &gt; /proc/self/oom_score_adj
</pre><p>
    in the postmaster's startup script just before invoking the postmaster.
    Note that this action must be done as root, or it will have no effect;
    so a root-owned startup script is the easiest place to do it.  If you
    do this, you should also set these environment variables in the startup
    script before invoking the postmaster:
</p><pre class="programlisting">
export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
export PG_OOM_ADJUST_VALUE=0
</pre><p>
    These settings will cause postmaster child processes to run with the
    normal OOM score adjustment of zero, so that the OOM killer can still
    target them at need.  You could use some other value for
    <code class="envar">PG_OOM_ADJUST_VALUE</code> if you want the child processes to run
    with some other OOM score adjustment.  (<code class="envar">PG_OOM_ADJUST_VALUE</code>
    can also be omitted, in which case it defaults to zero.)  If you do not
    set <code class="envar">PG_OOM_ADJUST_FILE</code>, the child processes will run with the
    same OOM score adjustment as the postmaster, which is unwise since the
    whole point is to ensure that the postmaster has a preferential setting.
   </p><p>
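    One way to check that the settings above took effect is to read the
    adjustment value back from <code class="filename">/proc</code>. The helper
    below is a hypothetical convenience function, not part of
    <span class="productname">PostgreSQL</span>; it assumes a kernel that provides
    <code class="filename">oom_score_adj</code>.
</p><pre class="programlisting">
# Print the OOM score adjustment of the process with the given PID.
oom_adj_of() {
    cat "/proc/$1/oom_score_adj"
}

# Example (requires a running server and PGDATA set):
#   oom_adj_of "$(head -1 "$PGDATA/postmaster.pid")"
</pre><p>
    With the configuration described above, the postmaster should report
    <code class="literal">-1000</code> and its child processes should report the value
    of <code class="envar">PG_OOM_ADJUST_VALUE</code>.
   </p><p>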
    Older Linux kernels do not offer <code class="filename">/proc/self/oom_score_adj</code>,
    but may have a previous version of the same functionality called
    <code class="filename">/proc/self/oom_adj</code>.  This works the same, except that the
    value that disables OOM killing is <code class="literal">-17</code> rather than <code class="literal">-1000</code>.
   </p><div class="note"><h3 class="title">Note</h3><p>
    Some vendors' Linux 2.4 kernels are reported to have early versions
    of the 2.6 overcommit <code class="command">sysctl</code> parameter.  However, setting
    <code class="literal">vm.overcommit_memory</code> to 2
    on a 2.4 kernel that does not have the relevant code will make
    things worse, not better.  It is recommended that you inspect
    the actual kernel source code (see the function
    <code class="function">vm_enough_memory</code> in the file <code class="filename">mm/mmap.c</code>)
    to verify what is supported in your kernel before you try this in a 2.4
    installation.  The presence of the <code class="filename">overcommit-accounting</code>
    documentation file should <span class="emphasis"><em>not</em></span> be taken as evidence that the
    feature is there.  If in any doubt, consult a kernel expert or your
    kernel vendor.
   </p></div></div><div class="sect2" id="LINUX-HUGE-PAGES"><div class="titlepage"><div><div><h3 class="title">18.4.5. Linux Huge Pages</h3></div></div></div><p>
    Using huge pages reduces overhead when using large contiguous chunks of
    memory, as <span class="productname">PostgreSQL</span> does, particularly when
    using large values of <a class="xref" href="runtime-config-resource.html#GUC-SHARED-BUFFERS">shared_buffers</a>.  To use this
    feature in <span class="productname">PostgreSQL</span> you need a kernel
    with <code class="varname">CONFIG_HUGETLBFS=y</code> and
    <code class="varname">CONFIG_HUGETLB_PAGE=y</code>. You will also have to adjust
    the kernel setting <code class="varname">vm.nr_hugepages</code>. To estimate the
    number of huge pages needed, start <span class="productname">PostgreSQL</span>
    without huge pages enabled and check the
    postmaster's anonymous shared memory segment size, as well as the system's
    huge page size, using the <code class="filename">/proc</code> file system.  This might
    look like:
</p><pre class="programlisting">
$ <strong class="userinput"><code>head -1 $PGDATA/postmaster.pid</code></strong>
4170
$ <strong class="userinput"><code>pmap 4170 | awk '/rw-s/ &amp;&amp; /zero/ {print $2}'</code></strong>
6490428K
$ <strong class="userinput"><code>grep ^Hugepagesize /proc/meminfo</code></strong>
Hugepagesize:       2048 kB
</pre><p>
     <code class="literal">6490428</code> / <code class="literal">2048</code> gives approximately
     <code class="literal">3169.154</code>, so in this example we need at
     least <code class="literal">3170</code> huge pages, which we can set with:
</p><pre class="programlisting">
$ <strong class="userinput"><code>sysctl -w vm.nr_hugepages=3170</code></strong>
</pre><p>
    A larger setting would be appropriate if other programs on the machine
    also need huge pages.  Don't forget to add this setting
    to <code class="filename">/etc/sysctl.conf</code> so that it will be reapplied
    after reboots.
   </p><p>
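    The rounding-up step in the calculation above can be done with shell
    integer arithmetic. This sketch reuses the example figures; the variable
    names are illustrative only.
</p><pre class="programlisting">
shmem_kb=6490428     # shared memory segment size from pmap, in kB
hugepage_kb=2048     # Hugepagesize from /proc/meminfo, in kB

# Integer division rounded up to the next whole page.
pages=$(( (shmem_kb + hugepage_kb - 1) / hugepage_kb ))
echo "$pages"
</pre><p>
    For the example figures this prints <code class="literal">3170</code>, matching
    the value used with <code class="command">sysctl</code> above.
   </p><p>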
    Sometimes the kernel is not able to allocate the desired number of huge
    pages immediately, so it might be necessary to repeat the command or to
    reboot.  (Immediately after a reboot, most of the machine's memory
    should be available to convert into huge pages.)  To verify the huge
    page allocation situation, use:
</p><pre class="programlisting">
$ <strong class="userinput"><code>grep Huge /proc/meminfo</code></strong>
</pre><p>
    It may also be necessary to give the database server's operating system
    user permission to use huge pages by setting
    <code class="varname">vm.hugetlb_shm_group</code> via <span class="application">sysctl</span>, and/or
    give permission to lock memory with <code class="command">ulimit -l</code>.
   </p><p>
    The default behavior for huge pages in
    <span class="productname">PostgreSQL</span> is to use them when possible and
    to fall back to normal pages if the huge page allocation fails. To enforce the use of huge
    pages, you can set <a class="xref" href="runtime-config-resource.html#GUC-HUGE-PAGES">huge_pages</a>
    to <code class="literal">on</code> in <code class="filename">postgresql.conf</code>.
    Note that with this setting <span class="productname">PostgreSQL</span> will fail to
    start if not enough huge pages are available.
   </p><p>
    For a detailed description of the <span class="productname">Linux</span> huge
    pages feature, see
    <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt" target="_top">https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt</a>.
   </p></div></div></body></html>