<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <meta http-equiv="content-type" content="text/html; charset=UTF-8"> <title>PHP Character Encoding Requirements</title> </head> <body><div class="manualnavbar" style="text-align: center;"> <div class="prev" style="text-align: left; float: left;"><a href="mbstring.overload.html">Function Overloading Feature</a></div> <div class="next" style="text-align: right; float: right;"><a href="ref.mbstring.html">Multibyte String Functions</a></div> <div class="up"><a href="book.mbstring.html">Multibyte String</a></div> <div class="home"><a href="index.html">PHP Manual</a></div> </div><hr /><div id="mbstring.php4.req" class="chapter"> <h1>PHP Character Encoding Requirements</h1> <p class="para"> Encodings of the following types can be used safely with PHP. <ul class="itemizedlist"> <li class="listitem"> <p class="para"> A single-byte encoding, <ul class="itemizedlist"> <li class="listitem"> <span class="simpara"> which has ASCII-compatible (ISO 646 compatible) mappings for the characters in the range <em>00h</em> to <em>7fh</em>. </span> </li> </ul> </p> </li> <li class="listitem"> <p class="para"> A multibyte encoding, <ul class="itemizedlist"> <li class="listitem"> <span class="simpara"> which has ASCII-compatible mappings for the characters in the range <em>00h</em> to <em>7fh</em>. </span> </li> <li class="listitem"> <span class="simpara"> which does not use ISO 2022 escape sequences. </span> </li> <li class="listitem"> <span class="simpara"> which does not use a value from <em>00h</em> to <em>7fh</em> in any of the bytes that together represent a single multibyte character. </span> </li> </ul> </p> </li> </ul> </p> <p class="para"> The following are examples of character encodings that are unlikely to work with PHP.
<div class="informalexample"> <div class="example-contents"> <div class="cdata"><pre> JIS, SJIS, ISO-2022-JP, BIG-5 </pre></div> </div> </div> </p> <p class="para"> PHP scripts written in any of these encodings might not work, especially when encoded strings appear as identifiers or literals in the script. In most cases you can avoid using these encodings by setting up <em>mbstring</em>'s transparent encoding filter function for incoming HTTP queries. </p> <blockquote class="note"><p><strong class="note">Note</strong>: <p class="para"> Using SJIS, BIG5, CP936, CP949 or GB18030 as the internal encoding is strongly discouraged unless you are familiar with the parser, the scanner and the character encoding. </p> </p></blockquote> <blockquote class="note"><p><strong class="note">Note</strong>: <p class="para"> If you are connecting to a database with PHP, it is recommended that you use the same character encoding for both the database and the <em>internal encoding</em>, for ease of use and better performance. </p> <p class="para"> If you are using PostgreSQL, the character encoding used in the database and the one used in PHP may differ, because PostgreSQL supports automatic character set conversion between the backend and the frontend. </p> </p></blockquote> </div> </body></html>