<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>Reliability</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REV="MADE"
HREF="mailto:pgsql-docs@postgresql.org"><LINK
REL="HOME"
TITLE="PostgreSQL 8.4.12 Documentation"
HREF="index.html"><LINK
REL="UP"
TITLE="Reliability and the Write-Ahead Log"
HREF="wal.html"><LINK
REL="PREVIOUS"
TITLE="Reliability and the Write-Ahead Log"
HREF="wal.html"><LINK
REL="NEXT"
TITLE="Write-Ahead Logging (WAL)"
HREF="wal-intro.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="stylesheet.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=ISO-8859-1"><META
NAME="creation"
CONTENT="2012-05-31T23:30:11"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="5"
ALIGN="center"
VALIGN="bottom"
>PostgreSQL 8.4.12 Documentation</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="wal.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="wal.html"
>Fast Backward</A
></TD
><TD
WIDTH="60%"
ALIGN="center"
VALIGN="bottom"
>Chapter 28. Reliability and the Write-Ahead Log</TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="top"
><A
HREF="wal.html"
>Fast Forward</A
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="top"
><A
HREF="wal-intro.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="WAL-RELIABILITY"
>28.1. Reliability</A
></H1
><P
>   Reliability is an important property of any serious database
   system, and <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> does everything possible to
   guarantee reliable operation. One aspect of reliable operation is
   that all data recorded by a committed transaction should be stored
   in a nonvolatile area that is safe from power loss, operating
   system failure, and hardware failure (except failure of the
   nonvolatile area itself, of course).  Successfully writing the data
   to the computer's permanent storage (disk drive or equivalent)
   ordinarily meets this requirement.  In fact, even if a computer is
   fatally damaged, if the disk drives survive they can be moved to
   another computer with similar hardware and all committed
   transactions will remain intact.
  </P
><P
>   While forcing data periodically to the disk platters might seem like
   a simple operation, it is not. Because disk drives are dramatically
   slower than main memory and CPUs, several layers of caching exist
   between the computer's main memory and the disk platters.
   First, there is the operating system's buffer cache, which caches
   frequently requested disk blocks and combines disk writes. Fortunately,
   all operating systems give applications a way to force writes from
   the buffer cache to disk, and <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> uses those
   features.  (See the <A
HREF="runtime-config-wal.html#GUC-WAL-SYNC-METHOD"
>wal_sync_method</A
> parameter
   to adjust how this is done.)
  </P
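>
<P>   As a minimal sketch of forcing a write out of the operating system's
   buffer cache, the following Python fragment writes a buffer and calls
   <TT CLASS="LITERAL">fsync</TT> before reporting success. The function name
   <TT CLASS="LITERAL">durable_write</TT> and the file name are hypothetical,
   chosen only for illustration; this is not
   <SPAN CLASS="PRODUCTNAME">PostgreSQL</SPAN>'s own code. A successful
   <TT CLASS="LITERAL">fsync</TT> means only that the operating system has
   handed the data on to the storage hardware.
  </P>
<PRE CLASS="PROGRAMLISTING">
```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while len(view) != 0:
            # os.write may accept fewer bytes than offered, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the operating system to flush its buffer cache for this file.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(payload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name, used only for this demonstration.
        target = os.path.join(tmp, "commit_record.bin")
        print(durable_write(target, b"committed transaction data"))
```
</PRE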
><P
>   Next, there might be a cache in the disk drive controller; this is
   particularly common on <ACRONYM
CLASS="ACRONYM"
>RAID</ACRONYM
> controller cards. Some of
   these caches are <I
CLASS="FIRSTTERM"
>write-through</I
>, meaning writes are passed
   along to the drive as soon as they arrive. Others are
   <I
CLASS="FIRSTTERM"
>write-back</I
>, meaning data is passed on to the drive at
   some later time. Such caches can be a reliability hazard because the
   memory in the disk controller cache is volatile, and will lose its
   contents in a power failure.  Better controller cards have
   <I
CLASS="FIRSTTERM"
>battery-backed</I
> caches, meaning the card has a battery that
   maintains power to the cache in case of system power loss.  After power
   is restored the data will be written to the disk drives.
  </P
><P
>   And finally, most disk drives have caches. Some are write-through
   while some are write-back, and the
   same concerns about data loss exist for write-back drive caches as
   exist for disk controller caches.  Consumer-grade IDE and SATA drives are
   particularly likely to have write-back caches that will not survive a
   power failure.  To check write caching on <SPAN
CLASS="PRODUCTNAME"
>Linux</SPAN
> use
   <TT
CLASS="COMMAND"
>hdparm -I</TT
>;  it is enabled if there is a <TT
CLASS="LITERAL"
>*</TT
> next
   to <TT
CLASS="LITERAL"
>Write cache</TT
>.  Use <TT
CLASS="COMMAND"
>hdparm -W 0</TT
> to turn off
   write caching.  On <SPAN
CLASS="PRODUCTNAME"
>FreeBSD</SPAN
> use
   <SPAN
CLASS="APPLICATION"
>atacontrol</SPAN
>.  (For SCSI disks use <A
HREF="http://sg.torque.net/sg/sdparm.html"
TARGET="_top"
><SPAN
CLASS="APPLICATION"
>sdparm</SPAN
></A
>
   to turn off <TT
CLASS="LITERAL"
>WCE</TT
>.)  On <SPAN
CLASS="PRODUCTNAME"
>Solaris</SPAN
> the disk
   write cache is controlled by <A
HREF="http://www.sun.com/bigadmin/content/submitted/format_utility.jsp"
TARGET="_top"
><TT
CLASS="LITERAL"
>format
   -e</TT
></A
>. (The Solaris <ACRONYM
CLASS="ACRONYM"
>ZFS</ACRONYM
> file system is safe with
   disk write-cache enabled because it issues its own disk cache flush
   commands.)  On <SPAN
CLASS="PRODUCTNAME"
>Windows</SPAN
>, if <TT
CLASS="VARNAME"
>wal_sync_method</TT
>
   is <TT
CLASS="LITERAL"
>open_datasync</TT
> (the default), write caching can be disabled
   by unchecking <TT
CLASS="LITERAL"
>My Computer\Open\{select disk
   drive}\Properties\Hardware\Properties\Policies\Enable write caching on
   the disk</TT
>.  Also on Windows, <TT
CLASS="LITERAL"
>fsync</TT
> and
   <TT
CLASS="LITERAL"
>fsync_writethrough</TT
> never do write caching.  The
   <TT
CLASS="LITERAL"
>fsync_writethrough</TT
> option can also be used to disable
   write caching on <SPAN
CLASS="PRODUCTNAME"
>MacOS X</SPAN
>.
  </P
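>
<P>   The <TT CLASS="COMMAND">hdparm -I</TT> check described above can be
   sketched as a small Python helper that looks for a
   <TT CLASS="LITERAL">*</TT> in front of
   <TT CLASS="LITERAL">Write cache</TT>. The function name
   <TT CLASS="LITERAL">write_cache_enabled</TT> is hypothetical, and the
   sample strings are hand-written and simplified, not captured from a real
   drive, so real output may need more careful parsing.
  </P>
<PRE CLASS="PROGRAMLISTING">
```python
def write_cache_enabled(hdparm_output):
    """Return True or False for the Write cache line, or None if it is absent."""
    for line in hdparm_output.splitlines():
        if "Write cache" in line:
            # Text before the feature name holds the enabled marker, if any.
            marker = line.split("Write cache")[0]
            return "*" in marker
    return None


# Hand-written, simplified sample lines (an assumption, not real drive output).
SAMPLE_ENABLED = "   *    Write cache\n   *    Look-ahead\n"
SAMPLE_DISABLED = "        Write cache\n   *    Look-ahead\n"

if __name__ == "__main__":
    print(write_cache_enabled(SAMPLE_ENABLED))
    print(write_cache_enabled(SAMPLE_DISABLED))
```
</PRE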
><P
>   When the operating system sends a write request to the disk hardware,
   there is little it can do to make sure the data has arrived at a truly
   nonvolatile storage area. Rather, it is the
   administrator's responsibility to be sure that all storage components
   ensure data integrity.  Avoid disk controllers that have non-battery-backed
   write caches.  At the drive level, disable write-back caching if the
   drive cannot guarantee the data will be written before shutdown.
  </P
><P
>   Another risk of data loss is posed by the disk platter write
   operations themselves. Disk platters are divided into sectors,
   commonly 512 bytes each.  Every physical read or write operation
   processes a whole sector.
   When a write request arrives at the drive, it might be for 512 bytes,
   1024 bytes, or 8192 bytes, and the process of writing could fail due
   to power loss at any time, meaning some of the 512-byte sectors were
   written, and others were not.  To guard against such failures,
   <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> periodically writes full page images to
   permanent storage <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>before</I
></SPAN
> modifying the actual page on
   disk. By doing this, during crash recovery <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> can
   restore partially-written pages.  If you have a battery-backed disk
   controller or file-system software that prevents partial page writes
   (e.g., ReiserFS 4),  you can turn off this page imaging by using the
   <A
HREF="runtime-config-wal.html#GUC-FULL-PAGE-WRITES"
>full_page_writes</A
> parameter.
  </P
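>
<P>   The partial-write hazard can be illustrated with a toy model in Python.
   It assumes 512-byte sectors and an 8192-byte page, simulates a power loss
   after only some sectors have been written, and then restores the page
   from a full page image saved beforehand. All names here
   (<TT CLASS="LITERAL">torn_write</TT>, <TT CLASS="LITERAL">recover</TT>)
   are hypothetical, and the model deliberately ignores the real
   <ACRONYM CLASS="ACRONYM">WAL</ACRONYM> record format and replay logic.
  </P>
<PRE CLASS="PROGRAMLISTING">
```python
SECTOR_SIZE = 512   # bytes per sector, as assumed in the text
PAGE_SIZE = 8192    # bytes per page in this toy model


def torn_write(old_page, new_page, sectors_written):
    """Simulate power loss after only the first sectors_written sectors landed."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


def recover(page_on_disk, full_page_image):
    """Toy crash recovery: a saved full page image replaces the page on disk."""
    if full_page_image is None:
        # Nothing was saved, so a torn page cannot be repaired here.
        return page_on_disk
    return full_page_image


if __name__ == "__main__":
    old_page = b"A" * PAGE_SIZE
    new_page = b"B" * PAGE_SIZE

    # The full image is saved before the page itself is overwritten.
    saved_image = new_page

    damaged = torn_write(old_page, new_page, sectors_written=5)
    print(damaged == old_page, damaged == new_page)   # neither version survived
    print(recover(damaged, saved_image) == new_page)  # the image repairs the page
```
</PRE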
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="wal.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="wal-intro.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Reliability and the Write-Ahead Log</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="wal.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Write-Ahead Logging (<ACRONYM
CLASS="ACRONYM"
>WAL</ACRONYM
>)</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>