<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml" lang="">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Unicode data &#8212; Django 1.8.19 documentation</title>
    
    <link rel="stylesheet" href="../_static/default.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '1.8.19',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="top" title="Django 1.8.19 documentation" href="../contents.html" />
    <link rel="up" title="API Reference" href="index.html" />
    <link rel="next" title="django.core.urlresolvers utility functions" href="urlresolvers.html" />
    <link rel="prev" title="TemplateResponse and SimpleTemplateResponse" href="template-response.html" />



 
<script type="text/javascript" src="../templatebuiltins.js"></script>
<script type="text/javascript">
(function($) {
    if (typeof django_template_builtins === "undefined") {
       // templatebuiltins.js missing, do nothing.
       return;
    }
    $(document).ready(function() {
        // Hyperlink Django template tags and filters
        var base = "templates/builtins.html";
        if (base == "#") {
            // Special case for builtins.html itself
            base = "";
        }
        // Tags are keywords, class '.k'
        $("div.highlight\\-html\\+django span.k").each(function(i, elem) {
             var tagname = $(elem).text();
             if ($.inArray(tagname, django_template_builtins.ttags) != -1) {
                 var fragment = tagname.replace(/_/g, '-');
                 $(elem).html("<a href='" + base + "#" + fragment + "'>" + tagname + "</a>");
             }
        });
        // Filters are functions, class '.nf'
        $("div.highlight\\-html\\+django span.nf").each(function(i, elem) {
             var filtername = $(elem).text();
             if ($.inArray(filtername, django_template_builtins.tfilters) != -1) {
                 var fragment = filtername.replace(/_/g, '-');
                 $(elem).html("<a href='" + base + "#" + fragment + "'>" + filtername + "</a>");
             }
        });
    });
})(jQuery);
</script>


  </head>
  <body role="document">

    <div class="document">
  <div id="custom-doc" class="yui-t6">
    <div id="hd">
      <h1><a href="../index.html">Django 1.8.19 documentation</a></h1>
      <div id="global-nav">
        <a title="Home page" href="../index.html">Home</a>  |
        <a title="Table of contents" href="../contents.html">Table of contents</a>  |
        <a title="Global index" href="../genindex.html">Index</a>  |
        <a title="Module index" href="../py-modindex.html">Modules</a>
      </div>
      <div class="nav">
    &laquo; <a href="template-response.html" title="TemplateResponse and SimpleTemplateResponse">previous</a>
     |
    <a href="index.html" title="API Reference" accesskey="U">up</a>
   |
    <a href="urlresolvers.html" title="&lt;code class=&#34;docutils literal&#34;&gt;&lt;span class=&#34;pre&#34;&gt;django.core.urlresolvers&lt;/span&gt;&lt;/code&gt; utility functions">next</a> &raquo;</div>
    </div>

    <div id="bd">
      <div id="yui-main">
        <div class="yui-b">
          <div class="yui-g" id="ref-unicode">
            
  <div class="section" id="s-unicode-data">
<span id="unicode-data"></span><h1>Unicode data<a class="headerlink" href="#unicode-data" title="Permalink to this headline">¶</a></h1>
<p>Django natively supports Unicode data everywhere. Provided your database can
somehow store the data, you can safely pass around Unicode strings to
templates, models and the database.</p>
<p>This document tells you what you need to know if you&#8217;re writing applications
that use data or templates that are encoded in something other than ASCII.</p>
<div class="section" id="s-creating-the-database">
<span id="creating-the-database"></span><h2>Creating the database<a class="headerlink" href="#creating-the-database" title="Permalink to this headline">¶</a></h2>
<p>Make sure your database is configured to be able to store arbitrary string
data. Normally, this means giving it an encoding of UTF-8 or UTF-16. If you use
a more restrictive encoding &#8211; for example, latin1 (iso8859-1) &#8211; you won&#8217;t be
able to store certain characters in the database, and information will be lost.</p>
<ul class="simple">
<li>MySQL users, refer to the <a class="reference external" href="http://dev.mysql.com/doc/refman/5.6/en/charset-database.html">MySQL manual</a> for details on how to set or alter
the database character set encoding.</li>
<li>PostgreSQL users, refer to the <a class="reference external" href="http://www.postgresql.org/docs/current/static/multibyte.html">PostgreSQL manual</a> (section 22.3.2 in
PostgreSQL 9) for details on creating databases with the correct encoding.</li>
<li>Oracle users, refer to the <a class="reference external" href="http://docs.oracle.com/cd/E11882_01/server.112/e10729/toc.htm">Oracle manual</a> for details on how to set
(<a class="reference external" href="http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch2charset.htm#NLSPG002">section 2</a>) or alter (<a class="reference external" href="http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch11charsetmig.htm#NLSPG011">section 11</a>) the database character set encoding.</li>
<li>SQLite users, there is nothing you need to do. SQLite always uses UTF-8
for internal encoding.</li>
</ul>
<p>All of Django&#8217;s database backends automatically convert Unicode strings into
the appropriate encoding for talking to the database. They also automatically
convert strings retrieved from the database into Python Unicode strings. You
don&#8217;t even need to tell Django what encoding your database uses: that is
handled transparently.</p>
<p>For more, see the section &#8220;The database API&#8221; below.</p>
</div>
<div class="section" id="s-general-string-handling">
<span id="general-string-handling"></span><h2>General string handling<a class="headerlink" href="#general-string-handling" title="Permalink to this headline">¶</a></h2>
<p>Whenever you use strings with Django &#8211; e.g., in database lookups, template
rendering or anywhere else &#8211; you have two choices for encoding those strings.
You can use Unicode strings, or you can use normal strings (sometimes called
&#8220;bytestrings&#8221;) that are encoded using UTF-8.</p>
<p>In Python 3, the logic is reversed: normal strings are Unicode, and
when you want to specifically create a bytestring, you have to prefix the
string with a &#8216;b&#8217;. As we are doing in Django code from version 1.5,
we recommend that you import <code class="docutils literal"><span class="pre">unicode_literals</span></code> from the <code class="docutils literal"><span class="pre">__future__</span></code> module
in your code. Then, when you specifically want to create a bytestring literal,
prefix the string with &#8216;b&#8217;.</p>
<p>Python 2 legacy:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">my_string</span> <span class="o">=</span> <span class="s2">&quot;This is a bytestring&quot;</span>
<span class="n">my_unicode</span> <span class="o">=</span> <span class="s2">u&quot;This is a Unicode string&quot;</span>
</pre></div>
</div>
<p>Python 2 with unicode literals or Python 3:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">__future__</span> <span class="k">import</span> <span class="n">unicode_literals</span>

<span class="n">my_string</span> <span class="o">=</span> <span class="n">b</span><span class="s2">&quot;This is a bytestring&quot;</span>
<span class="n">my_unicode</span> <span class="o">=</span> <span class="s2">&quot;This is a Unicode string&quot;</span>
</pre></div>
</div>
<p>See also <a class="reference internal" href="../topics/python3.html"><span class="doc">Python 3 compatibility</span></a>.</p>
<div class="admonition warning">
<p class="first admonition-title">Warning</p>
<p>A bytestring does not carry any information with it about its encoding.
For that reason, we have to make an assumption, and Django assumes that all
bytestrings are in UTF-8.</p>
<p class="last">If you pass Django a bytestring that uses some other encoding,
things will go wrong in interesting ways. Usually, Django will raise a
<code class="docutils literal"><span class="pre">UnicodeDecodeError</span></code> at some point.</p>
</div>
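<p>The failure mode described in the warning can be reproduced with plain Python 3 and no Django at all. The sketch below is only an illustration: the sample word and the variable names are made up for the example, and the explicit <code class="docutils literal"><span class="pre">decode('utf-8')</span></code> call stands in for the UTF-8 assumption Django makes about bytestrings.</p>

```python
# Illustration only: plain Python 3, no Django involved.
text = 'Orléans'

utf8_bytes = text.encode('utf-8')      # b'Orl\xc3\xa9ans'
latin1_bytes = text.encode('latin-1')  # b'Orl\xe9ans'

# Bytes that really are UTF-8 decode back to the original text.
round_tripped = utf8_bytes.decode('utf-8')

# Bytes in another encoding, decoded under the UTF-8 assumption, fail.
try:
    latin1_bytes.decode('utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(round_tripped == text)  # True
print(decode_failed)          # True
```

<p>The bytes themselves carry no marker that says which encoding produced them, which is why the mismatch only surfaces when something tries to decode them.</p>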
<p>If your code only uses ASCII data, it&#8217;s safe to use your normal strings,
passing them around at will, because ASCII is a subset of UTF-8.</p>
<p>Don&#8217;t be fooled into thinking that if your <a class="reference internal" href="settings.html#std:setting-DEFAULT_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">DEFAULT_CHARSET</span></code></a> setting is set
to something other than <code class="docutils literal"><span class="pre">'utf-8'</span></code> you can use that other encoding in your
bytestrings! <a class="reference internal" href="settings.html#std:setting-DEFAULT_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">DEFAULT_CHARSET</span></code></a> only applies to the strings generated as
the result of template rendering (and email). Django will always assume UTF-8
encoding for internal bytestrings. The reason for this is that the
<a class="reference internal" href="settings.html#std:setting-DEFAULT_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">DEFAULT_CHARSET</span></code></a> setting is not actually under your control (if you are the
application developer). It&#8217;s under the control of the person installing and
using your application &#8211; and if that person chooses a different setting, your
code must still continue to work. Ergo, it cannot rely on that setting.</p>
<p>In most cases when Django is dealing with strings, it will convert them to
Unicode strings before doing anything else. So, as a general rule, if you pass
in a bytestring, be prepared to receive a Unicode string back in the result.</p>
<div class="section" id="s-translated-strings">
<span id="translated-strings"></span><h3>Translated strings<a class="headerlink" href="#translated-strings" title="Permalink to this headline">¶</a></h3>
<p>Aside from Unicode strings and bytestrings, there&#8217;s a third type of string-like
object you may encounter when using Django. The framework&#8217;s
internationalization features introduce the concept of a &#8220;lazy translation&#8221; &#8211;
a string that has been marked as translated but whose actual translation result
isn&#8217;t determined until the object is used in a string. This feature is useful
in cases where the translation locale is unknown until the string is used, even
though the string might have originally been created when the code was first
imported.</p>
<p>Normally, you won&#8217;t have to worry about lazy translations. Just be aware that
if you examine an object and it claims to be a
<code class="docutils literal"><span class="pre">django.utils.functional.__proxy__</span></code> object, it is a lazy translation.
Calling <code class="docutils literal"><span class="pre">unicode()</span></code> with the lazy translation as the argument will generate a
Unicode string in the current locale.</p>
<p>For more details about lazy translation objects, refer to the
<a class="reference internal" href="../topics/i18n/index.html"><span class="doc">internationalization</span></a> documentation.</p>
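<p>To make the idea of deferred evaluation concrete, here is a minimal Python 3 sketch. The names <code class="docutils literal"><span class="pre">LazyText</span></code>, <code class="docutils literal"><span class="pre">translate</span></code>, <code class="docutils literal"><span class="pre">CATALOGS</span></code> and <code class="docutils literal"><span class="pre">state</span></code> are hypothetical and exist only for this illustration; they are not Django&#8217;s proxy class or its translation machinery.</p>

```python
# Hypothetical stand-ins for illustration; this is not how Django implements it.
CATALOGS = {'fr': {'Hello': 'Bonjour'}, 'de': {'Hello': 'Hallo'}}
state = {'locale': 'fr'}


def translate(message):
    """Look the message up in the catalog of the currently active locale."""
    return CATALOGS.get(state['locale'], {}).get(message, message)


class LazyText(object):
    """Remembers the message and translates it only when converted to text."""

    def __init__(self, message):
        self._message = message

    def __str__(self):
        return translate(self._message)


greeting = LazyText('Hello')   # created early, e.g. at import time
state['locale'] = 'de'         # the locale becomes known later
result = str(greeting)         # translation happens here
print(result)                  # Hallo
```

<p>The object was created while the active locale was French, yet the text it produces is German, because the lookup is postponed until the object is actually turned into a string.</p>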
</div>
<div class="section" id="s-useful-utility-functions">
<span id="useful-utility-functions"></span><h3>Useful utility functions<a class="headerlink" href="#useful-utility-functions" title="Permalink to this headline">¶</a></h3>
<p>Because some string operations come up again and again, Django ships with a few
useful functions that should make working with Unicode and bytestring objects
a bit easier.</p>
<div class="section" id="s-conversion-functions">
<span id="conversion-functions"></span><h4>Conversion functions<a class="headerlink" href="#conversion-functions" title="Permalink to this headline">¶</a></h4>
<p>The <code class="docutils literal"><span class="pre">django.utils.encoding</span></code> module contains a few functions that are handy
for converting back and forth between Unicode and bytestrings.</p>
<ul>
<li><p class="first"><code class="docutils literal"><span class="pre">smart_text(s,</span> <span class="pre">encoding='utf-8',</span> <span class="pre">strings_only=False,</span> <span class="pre">errors='strict')</span></code>
converts its input to a Unicode string. The <code class="docutils literal"><span class="pre">encoding</span></code> parameter
specifies the input encoding. (For example, Django uses this internally
when processing form input data, which might not be UTF-8 encoded.) The
<code class="docutils literal"><span class="pre">strings_only</span></code> parameter, if set to True, will result in Python
numbers, booleans and <code class="docutils literal"><span class="pre">None</span></code> not being converted to a string (they keep
their original types). The <code class="docutils literal"><span class="pre">errors</span></code> parameter takes any of the values
that are accepted by Python&#8217;s <code class="docutils literal"><span class="pre">unicode()</span></code> function for its error
handling.</p>
<p>If you pass <code class="docutils literal"><span class="pre">smart_text()</span></code> an object that has a <code class="docutils literal"><span class="pre">__unicode__</span></code>
method, it will use that method to do the conversion.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">force_text(s,</span> <span class="pre">encoding='utf-8',</span> <span class="pre">strings_only=False,</span>
<span class="pre">errors='strict')</span></code> is identical to <code class="docutils literal"><span class="pre">smart_text()</span></code> in almost all
cases. The difference is when the first argument is a <a class="reference internal" href="../topics/i18n/translation.html#lazy-translations"><span class="std std-ref">lazy
translation</span></a> instance. While <code class="docutils literal"><span class="pre">smart_text()</span></code>
preserves lazy translations, <code class="docutils literal"><span class="pre">force_text()</span></code> forces those objects to a
Unicode string (causing the translation to occur). Normally, you&#8217;ll want
to use <code class="docutils literal"><span class="pre">smart_text()</span></code>. However, <code class="docutils literal"><span class="pre">force_text()</span></code> is useful in
template tags and filters that absolutely <em>must</em> have a string to work
with, not just something that can be converted to a string.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">smart_bytes(s,</span> <span class="pre">encoding='utf-8',</span> <span class="pre">strings_only=False,</span> <span class="pre">errors='strict')</span></code>
is essentially the opposite of <code class="docutils literal"><span class="pre">smart_text()</span></code>. It forces the first
argument to a bytestring. The <code class="docutils literal"><span class="pre">strings_only</span></code> parameter has the same
behavior as for <code class="docutils literal"><span class="pre">smart_text()</span></code> and <code class="docutils literal"><span class="pre">force_text()</span></code>. This is
slightly different semantics from Python&#8217;s builtin <code class="docutils literal"><span class="pre">str()</span></code> function,
but the difference is needed in a few places within Django&#8217;s internals.</p>
</li>
</ul>
<p>Normally, you&#8217;ll only need to use <code class="docutils literal"><span class="pre">smart_text()</span></code>. Call it as early as
possible on any input data that might be either Unicode or a bytestring, and
from then on, you can treat the result as always being Unicode.</p>
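<p>The behavior described for the <code class="docutils literal"><span class="pre">encoding</span></code>, <code class="docutils literal"><span class="pre">strings_only</span></code> and <code class="docutils literal"><span class="pre">errors</span></code> parameters can be sketched in a few lines of Python 3. The helper <code class="docutils literal"><span class="pre">to_text()</span></code> below is a hypothetical stand-in written for this illustration; it does not reproduce Django&#8217;s implementation and ignores lazy translations entirely.</p>

```python
def to_text(value, encoding='utf-8', strings_only=False, errors='strict'):
    """Hypothetical helper mirroring the behavior described above (not Django's code)."""
    if isinstance(value, str):
        return value  # already text, nothing to do
    if strings_only and (value is None or isinstance(value, (bool, int, float))):
        return value  # keep None, booleans and numbers as they are
    if isinstance(value, bytes):
        return value.decode(encoding, errors)  # decode using the stated input encoding
    return str(value)  # fall back to the object's own text conversion


print(to_text(b'Orl\xc3\xa9ans'))                   # Orléans
print(to_text(b'Orl\xe9ans', encoding='latin-1'))   # Orléans
print(repr(to_text(42)))                            # '42'
print(repr(to_text(42, strings_only=True)))         # 42
```

<p>Calling such a helper once at the boundary where data enters your code means everything downstream can assume it is working with text.</p>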
</div>
<div class="section" id="s-uri-and-iri-handling">
<span id="s-id1"></span><span id="uri-and-iri-handling"></span><span id="id1"></span><h4>URI and IRI handling<a class="headerlink" href="#uri-and-iri-handling" title="Permalink to this headline">¶</a></h4>
<p>Web frameworks have to deal with URLs (which are a type of <a class="reference external" href="http://www.ietf.org/rfc/rfc3987.txt">IRI</a>). One
requirement of URLs is that they are encoded using only ASCII characters.
However, in an international environment, you might need to construct a
URL from an <a class="reference external" href="http://www.ietf.org/rfc/rfc3987.txt">IRI</a> &#8211; very loosely speaking, a <a class="reference external" href="http://www.ietf.org/rfc/rfc2396.txt">URI</a> that can contain Unicode
characters. Quoting and converting an IRI to URI can be a little tricky, so
Django provides some assistance.</p>
<ul class="simple">
<li>The function <a class="reference internal" href="utils.html#django.utils.encoding.iri_to_uri" title="django.utils.encoding.iri_to_uri"><code class="xref py py-func docutils literal"><span class="pre">django.utils.encoding.iri_to_uri()</span></code></a> implements the
conversion from IRI to URI as required by the specification (<span class="target" id="index-0"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3987.html#section-3.1"><strong>RFC 3987#section-3.1</strong></a>).</li>
<li>The functions <a class="reference internal" href="utils.html#django.utils.http.urlquote" title="django.utils.http.urlquote"><code class="xref py py-func docutils literal"><span class="pre">django.utils.http.urlquote()</span></code></a> and
<a class="reference internal" href="utils.html#django.utils.http.urlquote_plus" title="django.utils.http.urlquote_plus"><code class="xref py py-func docutils literal"><span class="pre">django.utils.http.urlquote_plus()</span></code></a> are versions of Python&#8217;s standard
<code class="docutils literal"><span class="pre">urllib.quote()</span></code> and <code class="docutils literal"><span class="pre">urllib.quote_plus()</span></code> that work with non-ASCII
characters. (The data is converted to UTF-8 prior to encoding.)</li>
</ul>
<p>These two groups of functions have slightly different purposes, and it&#8217;s
important to keep them straight. Normally, you would use <code class="docutils literal"><span class="pre">urlquote()</span></code> on the
individual portions of the IRI or URI path so that any reserved characters
such as &#8216;&amp;&#8217; or &#8216;%&#8217; are correctly encoded. Then, you apply <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> to
the full IRI and it converts any non-ASCII characters to the correct encoded
values.</p>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">Technically, it isn&#8217;t correct to say that <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> implements the
full algorithm in the IRI specification. It doesn&#8217;t (yet) perform the
international domain name encoding portion of the algorithm.</p>
</div>
<p>The <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> function will not change ASCII characters that are
otherwise permitted in a URL. So, for example, the character &#8216;%&#8217; is not
further encoded when passed to <code class="docutils literal"><span class="pre">iri_to_uri()</span></code>. This means you can pass a
full URL to this function and it will not mess up the query string or anything
like that.</p>
<p>An example might clarify things here:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">urlquote</span><span class="p">(</span><span class="s1">&#39;Paris &amp; Orléans&#39;</span><span class="p">)</span>
<span class="go">&#39;Paris%20%26%20Orl%C3%A9ans&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">iri_to_uri</span><span class="p">(</span><span class="s1">&#39;/favorites/François/</span><span class="si">%s</span><span class="s1">&#39;</span> <span class="o">%</span> <span class="n">urlquote</span><span class="p">(</span><span class="s1">&#39;Paris &amp; Orléans&#39;</span><span class="p">))</span>
<span class="go">&#39;/favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans&#39;</span>
</pre></div>
</div>
<p>If you look carefully, you can see that the portion that was generated by
<code class="docutils literal"><span class="pre">urlquote()</span></code> in the second example was not double-quoted when passed to
<code class="docutils literal"><span class="pre">iri_to_uri()</span></code>. This is a very important and useful feature. It means that
you can construct your IRI without worrying about whether it contains
non-ASCII characters and then, right at the end, call <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> on the
result.</p>
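<p>The same two-step pattern can be tried without Django, using only <code class="docutils literal"><span class="pre">urllib.parse.quote()</span></code> from the Python 3 standard library. This is a sketch under stated assumptions: <code class="docutils literal"><span class="pre">iri_to_uri_sketch()</span></code> and the <code class="docutils literal"><span class="pre">SAFE</span></code> character set are chosen for the illustration and should not be read as the definitive behavior of Django&#8217;s functions.</p>

```python
from urllib.parse import quote

# Characters the sketch leaves untouched. '%' is in the set so that existing
# percent-escapes are not quoted a second time. This set is an assumption.
SAFE = "/#%[]=:;$&()+,!?*@'~"


def iri_to_uri_sketch(iri):
    """Percent-encode everything outside SAFE, using UTF-8 for non-ASCII text."""
    return quote(iri, safe=SAFE)


# Step 1: quote the individual path segment so reserved characters are escaped.
segment = quote('Paris & Orléans')
print(segment)  # Paris%20%26%20Orl%C3%A9ans

# Step 2: convert the assembled IRI; the already quoted segment is left alone.
uri = iri_to_uri_sketch('/favorites/François/' + segment)
print(uri)      # /favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans
```

<p>Because <code class="docutils literal"><span class="pre">%</span></code> is treated as safe in the second step, the escapes produced in the first step pass through unchanged.</p>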
<p>Similarly, Django provides <a class="reference internal" href="utils.html#django.utils.encoding.uri_to_iri" title="django.utils.encoding.uri_to_iri"><code class="xref py py-func docutils literal"><span class="pre">django.utils.encoding.uri_to_iri()</span></code></a> which
implements the conversion from URI to IRI as per <span class="target" id="index-1"></span><a class="rfc reference external" href="https://tools.ietf.org/html/rfc3987.html#section-3.2"><strong>RFC 3987#section-3.2</strong></a>.
It decodes all percent-encodings except those that don&#8217;t represent a valid
UTF-8 sequence.</p>
<p>An example to demonstrate:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">uri_to_iri</span><span class="p">(</span><span class="s1">&#39;/</span><span class="si">%E</span><span class="s1">2</span><span class="si">%99%</span><span class="s1">A5</span><span class="si">%E</span><span class="s1">2</span><span class="si">%99%</span><span class="s1">A5/?utf8=</span><span class="si">%E</span><span class="s1">2%9C%93&#39;</span><span class="p">)</span>
<span class="go">&#39;/♥♥/?utf8=✓&#39;</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">uri_to_iri</span><span class="p">(</span><span class="s1">&#39;%A9helloworld&#39;</span><span class="p">)</span>
<span class="go">&#39;%A9helloworld&#39;</span>
</pre></div>
</div>
<p>In the first example, the UTF-8 characters and reserved characters are
unquoted. In the second, the percent-encoding remains unchanged because
<code class="docutils literal"><span class="pre">%A9</span></code> on its own is not a valid UTF-8 sequence.</p>
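<p>The rule &#8220;decode a percent-encoding only if it forms valid UTF-8&#8221; can be sketched in plain Python 3. The function <code class="docutils literal"><span class="pre">decode_valid_utf8_escapes()</span></code> is a hypothetical name for this illustration, and it is a deliberate simplification: it treats each run of consecutive escapes as all-or-nothing and gives reserved characters no special handling, so it should not be taken as Django&#8217;s algorithm.</p>

```python
import re

# One or more consecutive percent-escapes, e.g. "%E2%99%A5".
_ESCAPE_RUN = re.compile(r'(?:%[0-9A-Fa-f]{2})+')


def decode_valid_utf8_escapes(uri):
    """Replace each run of escapes with its text if the bytes are valid UTF-8."""
    def replace(match):
        raw = bytes.fromhex(match.group(0).replace('%', ''))
        try:
            return raw.decode('utf-8')
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8: keep the escapes as written
    return _ESCAPE_RUN.sub(replace, uri)


print(decode_valid_utf8_escapes('/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93'))  # /♥♥/?utf8=✓
print(decode_valid_utf8_escapes('%A9helloworld'))                        # %A9helloworld
```

<p>The lone byte <code class="docutils literal"><span class="pre">0xA9</span></code> cannot start a UTF-8 sequence, so the second input comes back unchanged.</p>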
<p>Both the <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> and <code class="docutils literal"><span class="pre">uri_to_iri()</span></code> functions are idempotent, which means the
following is always true:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">iri_to_uri</span><span class="p">(</span><span class="n">iri_to_uri</span><span class="p">(</span><span class="n">some_string</span><span class="p">))</span> <span class="o">==</span> <span class="n">iri_to_uri</span><span class="p">(</span><span class="n">some_string</span><span class="p">)</span>
<span class="n">uri_to_iri</span><span class="p">(</span><span class="n">uri_to_iri</span><span class="p">(</span><span class="n">some_string</span><span class="p">))</span> <span class="o">==</span> <span class="n">uri_to_iri</span><span class="p">(</span><span class="n">some_string</span><span class="p">)</span>
</pre></div>
</div>
<p>So you can safely call either function multiple times on the same URI/IRI without risking
double-quoting problems.</p>
</div>
</div>
</div>
<div class="section" id="s-models">
<span id="models"></span><h2>Models<a class="headerlink" href="#models" title="Permalink to this headline">¶</a></h2>
<p>Because all strings are returned from the database as Unicode strings, model
fields that are character based (CharField, TextField, URLField, etc.) will
contain Unicode values when Django retrieves data from the database. This
is <em>always</em> the case, even if the data could fit into an ASCII bytestring.</p>
<p>You can pass in bytestrings when creating a model or populating a field, and
Django will convert them to Unicode when it needs to.</p>
<div class="section" id="s-choosing-between-str-and-unicode">
<span id="choosing-between-str-and-unicode"></span><h3>Choosing between <code class="docutils literal"><span class="pre">__str__()</span></code> and <code class="docutils literal"><span class="pre">__unicode__()</span></code><a class="headerlink" href="#choosing-between-str-and-unicode" title="Permalink to this headline">¶</a></h3>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">If you are on Python 3, you can skip this section because you&#8217;ll always
create <code class="docutils literal"><span class="pre">__str__()</span></code> rather than <code class="docutils literal"><span class="pre">__unicode__()</span></code>. If you&#8217;d like
compatibility with Python 2, you can decorate your model class with
<a class="reference internal" href="utils.html#django.utils.encoding.python_2_unicode_compatible" title="django.utils.encoding.python_2_unicode_compatible"><code class="xref py py-func docutils literal"><span class="pre">python_2_unicode_compatible()</span></code></a>.</p>
</div>
<p>One consequence of using Unicode by default is that you have to take some care
when printing data from the model.</p>
<p>In particular, rather than giving your model a <code class="docutils literal"><span class="pre">__str__()</span></code> method, we
recommend that you implement a <code class="docutils literal"><span class="pre">__unicode__()</span></code> method. In the <code class="docutils literal"><span class="pre">__unicode__()</span></code>
method, you can quite safely return the values of all your fields without
having to worry about whether they fit into a bytestring or not. (The way
Python works, the result of <code class="docutils literal"><span class="pre">__str__()</span></code> is <em>always</em> a bytestring, even if you
accidentally try to return a Unicode object.)</p>
<p>You can still create a <code class="docutils literal"><span class="pre">__str__()</span></code> method on your models if you want, of
course, but you shouldn&#8217;t need to do this unless you have a good reason.
Django&#8217;s <code class="docutils literal"><span class="pre">Model</span></code> base class automatically provides a <code class="docutils literal"><span class="pre">__str__()</span></code>
implementation that calls <code class="docutils literal"><span class="pre">__unicode__()</span></code> and encodes the result into UTF-8.
This means you&#8217;ll normally only need to implement a <code class="docutils literal"><span class="pre">__unicode__()</span></code> method
and let Django handle the coercion to a bytestring when required.</p>
</div>
<div class="section" id="s-taking-care-in-get-absolute-url">
<span id="taking-care-in-get-absolute-url"></span><h3>Taking care in <code class="docutils literal"><span class="pre">get_absolute_url()</span></code><a class="headerlink" href="#taking-care-in-get-absolute-url" title="Permalink to this headline">¶</a></h3>
<p>URLs can only contain ASCII characters. If you&#8217;re constructing a URL from
pieces of data that might be non-ASCII, be careful to encode the results in a
way that is suitable for a URL. The <a class="reference internal" href="urlresolvers.html#django.core.urlresolvers.reverse" title="django.core.urlresolvers.reverse"><code class="xref py py-func docutils literal"><span class="pre">reverse()</span></code></a>
function handles this for you automatically.</p>
<p>If you&#8217;re constructing a URL manually (i.e., <em>not</em> using the <code class="docutils literal"><span class="pre">reverse()</span></code>
function), you&#8217;ll need to take care of the encoding yourself. In this case,
use the <code class="docutils literal"><span class="pre">iri_to_uri()</span></code> and <code class="docutils literal"><span class="pre">urlquote()</span></code> functions that were documented
<a class="reference internal" href="#id1">above</a>. For example:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">django.utils.encoding</span> <span class="k">import</span> <span class="n">iri_to_uri</span>
<span class="kn">from</span> <span class="nn">django.utils.http</span> <span class="k">import</span> <span class="n">urlquote</span>

<span class="k">def</span> <span class="nf">get_absolute_url</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="s1">&#39;/person/</span><span class="si">%s</span><span class="s1">/?x=0&amp;y=0&#39;</span> <span class="o">%</span> <span class="n">urlquote</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">location</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">iri_to_uri</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</pre></div>
</div>
<p>This function returns a correctly encoded URL even if <code class="docutils literal"><span class="pre">self.location</span></code> is
something like &#8220;Jack visited Paris &amp; Orléans&#8221;. (In fact, the <code class="docutils literal"><span class="pre">iri_to_uri()</span></code>
call isn&#8217;t strictly necessary in the above example, because all the
non-ASCII characters have already been percent-encoded by <code class="docutils literal"><span class="pre">urlquote()</span></code> in the first line.)</p>
</div>
</div>
<div class="section" id="s-the-database-api">
<span id="the-database-api"></span><h2>The database API<a class="headerlink" href="#the-database-api" title="Permalink to this headline">¶</a></h2>
<p>You can pass either Unicode strings or UTF-8 bytestrings as arguments to
<code class="docutils literal"><span class="pre">filter()</span></code> methods and the like in the database API. The following two
querysets are identical:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">__future__</span> <span class="k">import</span> <span class="n">unicode_literals</span>

<span class="n">qs</span> <span class="o">=</span> <span class="n">People</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">name__contains</span><span class="o">=</span><span class="s1">&#39;Å&#39;</span><span class="p">)</span>
<span class="n">qs</span> <span class="o">=</span> <span class="n">People</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">name__contains</span><span class="o">=</span><span class="n">b</span><span class="s1">&#39;</span><span class="se">\xc3\x85</span><span class="s1">&#39;</span><span class="p">)</span> <span class="c1"># UTF-8 encoding of Å</span>
</pre></div>
</div>
</div>
<div class="section" id="s-templates">
<span id="templates"></span><h2>Templates<a class="headerlink" href="#templates" title="Permalink to this headline">¶</a></h2>
<p>You can use either Unicode strings or bytestrings when creating templates manually:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">__future__</span> <span class="k">import</span> <span class="n">unicode_literals</span>
<span class="kn">from</span> <span class="nn">django.template</span> <span class="k">import</span> <span class="n">Template</span>
<span class="n">t1</span> <span class="o">=</span> <span class="n">Template</span><span class="p">(</span><span class="n">b</span><span class="s1">&#39;This is a bytestring template.&#39;</span><span class="p">)</span>
<span class="n">t2</span> <span class="o">=</span> <span class="n">Template</span><span class="p">(</span><span class="s1">&#39;This is a Unicode template.&#39;</span><span class="p">)</span>
</pre></div>
</div>
<p>But the common case is to read templates from the filesystem, and this creates
a slight complication: not all filesystems store their data encoded as UTF-8.
If your template files are not stored with a UTF-8 encoding, set the <a class="reference internal" href="settings.html#std:setting-FILE_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">FILE_CHARSET</span></code></a>
setting to the encoding of the files on disk. When Django reads in a template
file, it will convert the data from this encoding to Unicode. (<a class="reference internal" href="settings.html#std:setting-FILE_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">FILE_CHARSET</span></code></a>
is set to <code class="docutils literal"><span class="pre">'utf-8'</span></code> by default.)</p>
<p>The <a class="reference internal" href="settings.html#std:setting-DEFAULT_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">DEFAULT_CHARSET</span></code></a> setting controls the encoding of rendered templates.
This is set to UTF-8 by default.</p>
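<p>The effect of the file encoding can be sketched with the standard library alone. In the example below, <code class="docutils literal"><span class="pre">file_charset</span></code> is a hypothetical local variable standing in for the <code class="docutils literal"><span class="pre">FILE_CHARSET</span></code> setting, and the temporary file stands in for a template on disk; this is not how Django's template loaders are implemented, only an illustration of the decoding step. Python 3 is assumed.</p>

```python
import io
import os
import tempfile

content = 'Orléans: {{ visitor }}'
file_charset = 'iso-8859-1'  # hypothetical stand-in for FILE_CHARSET

# Create a temporary "template file" stored in a non-UTF-8 encoding.
fd, path = tempfile.mkstemp(suffix='.html')
os.close(fd)
try:
    with io.open(path, 'w', encoding=file_charset) as f:
        f.write(content)

    # The raw bytes on disk are not valid UTF-8 ...
    with io.open(path, 'rb') as f:
        raw = f.read()

    # ... but decoding with the file's real encoding recovers the text.
    with io.open(path, 'r', encoding=file_charset) as f:
        decoded = f.read()
finally:
    os.remove(path)

try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(decoded == content, utf8_ok)
```

<p>Reading the same bytes as UTF-8 fails, which is why the setting has to match the encoding actually used on disk.</p>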
<div class="section" id="s-template-tags-and-filters">
<span id="template-tags-and-filters"></span><h3>Template tags and filters<a class="headerlink" href="#template-tags-and-filters" title="Permalink to this headline">¶</a></h3>
<p>A couple of tips to remember when writing your own template tags and filters:</p>
<ul class="simple">
<li>Always return Unicode strings from a template tag&#8217;s <code class="docutils literal"><span class="pre">render()</span></code> method
and from template filters.</li>
<li>Use <code class="docutils literal"><span class="pre">force_text()</span></code> in preference to <code class="docutils literal"><span class="pre">smart_text()</span></code> in these
places. Tag rendering and filter calls occur as the template is being
rendered, so there is no advantage to postponing the conversion of lazy
translation objects into strings. It&#8217;s easier to work solely with Unicode
strings at that point.</li>
</ul>
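<p>The first tip can be sketched without Django installed. Both functions below are hypothetical: <code class="docutils literal"><span class="pre">to_text()</span></code> is a simplified stand-in for an eager conversion such as <code class="docutils literal"><span class="pre">force_text()</span></code>, not Django's implementation, and <code class="docutils literal"><span class="pre">initials()</span></code> is an invented filter. In a real project you would call <code class="docutils literal"><span class="pre">force_text()</span></code> and register the filter with a template library. Python 3 is assumed.</p>

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical stand-in for an eager text conversion helper."""
    if isinstance(value, bytes):
        # Bytestrings are decoded rather than passed through str().
        return value.decode(encoding)
    # Anything else (including lazy objects) is converted immediately.
    return str(value)


def initials(value):
    """Hypothetical filter: upper-cased first letter of each word."""
    text = to_text(value)
    return ''.join(word[0].upper() for word in text.split())


class LazyName(object):
    """Minimal object that only produces its text when converted."""

    def __str__(self):
        return 'jack kerouac'


print(initials(b'\xc3\x85sa berg'))
print(initials(LazyName()))
```

<p>Whatever the input type, the filter converts once at the top and then works with, and returns, a Unicode string.</p>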
</div>
</div>
<div class="section" id="s-files">
<span id="s-unicode-files"></span><span id="files"></span><span id="unicode-files"></span><h2>Files<a class="headerlink" href="#files" title="Permalink to this headline">¶</a></h2>
<p>If you intend to allow users to upload files, you must ensure that the
environment used to run Django is configured to work with non-ASCII file names.
If your environment isn&#8217;t configured correctly, you&#8217;ll encounter
<code class="docutils literal"><span class="pre">UnicodeEncodeError</span></code> exceptions when saving files with file names that
contain non-ASCII characters.</p>
<p>Filesystem support for UTF-8 file names varies and might depend on the
environment. Check your current configuration in an interactive Python shell by
running:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="o">.</span><span class="n">getfilesystemencoding</span><span class="p">()</span>
</pre></div>
</div>
<p>This should output &#8220;UTF-8&#8221;.</p>
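<p>The reported name can vary in spelling and case (for example <code class="docutils literal"><span class="pre">utf-8</span></code> or <code class="docutils literal"><span class="pre">UTF8</span></code>), so a programmatic check is easier if the name is normalized first. The helper below, <code class="docutils literal"><span class="pre">is_utf8()</span></code>, is a hypothetical name introduced for this sketch; it relies only on the standard <code class="docutils literal"><span class="pre">codecs</span></code> module.</p>

```python
import codecs
import sys


def is_utf8(encoding_name):
    """Return True if the given encoding name is an alias of UTF-8."""
    # codecs.lookup() resolves aliases to one canonical codec name.
    return codecs.lookup(encoding_name).name == 'utf-8'


fs_encoding = sys.getfilesystemencoding()
print(fs_encoding, is_utf8(fs_encoding))
```

<p>A check like this could run at startup to warn about a misconfigured environment before any upload fails.</p>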
<p>The <code class="docutils literal"><span class="pre">LANG</span></code> environment variable is responsible for setting the expected
encoding on Unix platforms. Consult the documentation for your operating system
and application server for the appropriate syntax and location to set this
variable.</p>
<p>In your development environment, you might need to add a setting to your
<code class="docutils literal"><span class="pre">~/.bashrc</span></code> analogous to:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">export</span> <span class="n">LANG</span><span class="o">=</span><span class="s2">&quot;en_US.UTF-8&quot;</span>
</pre></div>
</div>
</div>
<div class="section" id="s-email">
<span id="email"></span><h2>Email<a class="headerlink" href="#email" title="Permalink to this headline">¶</a></h2>
<p>Django&#8217;s email framework (in <code class="docutils literal"><span class="pre">django.core.mail</span></code>) supports Unicode
transparently. You can use Unicode data in the message bodies and any headers.
However, you&#8217;re still obligated to respect the requirements of the email
specifications, so, for example, email addresses should use only ASCII
characters.</p>
<p>The following code example demonstrates that everything except email addresses
can be non-ASCII:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">__future__</span> <span class="k">import</span> <span class="n">unicode_literals</span>
<span class="kn">from</span> <span class="nn">django.core.mail</span> <span class="k">import</span> <span class="n">EmailMessage</span>

<span class="n">subject</span> <span class="o">=</span> <span class="s1">&#39;My visit to Sør-Trøndelag&#39;</span>
<span class="n">sender</span> <span class="o">=</span> <span class="s1">&#39;Arnbjörg Ráðormsdóttir &lt;arnbjorg@example.com&gt;&#39;</span>
<span class="n">recipients</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;Fred &lt;fred@example.com&gt;&#39;</span><span class="p">]</span>
<span class="n">body</span> <span class="o">=</span> <span class="s1">&#39;...&#39;</span>
<span class="n">msg</span> <span class="o">=</span> <span class="n">EmailMessage</span><span class="p">(</span><span class="n">subject</span><span class="p">,</span> <span class="n">body</span><span class="p">,</span> <span class="n">sender</span><span class="p">,</span> <span class="n">recipients</span><span class="p">)</span>
<span class="n">msg</span><span class="o">.</span><span class="n">attach</span><span class="p">(</span><span class="s2">&quot;Une pièce jointe.pdf&quot;</span><span class="p">,</span> <span class="s2">&quot;%PDF-1.4.%...&quot;</span><span class="p">,</span> <span class="n">mimetype</span><span class="o">=</span><span class="s2">&quot;application/pdf&quot;</span><span class="p">)</span>
<span class="n">msg</span><span class="o">.</span><span class="n">send</span><span class="p">()</span>
</pre></div>
</div>
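<p>To see what sending non-ASCII headers involves on the wire, the sketch below uses Python 3's standard <code class="docutils literal"><span class="pre">email</span></code> package directly. It is an illustration of header encoding in general, and Django's internal handling may differ in detail. The subject and sender name are taken from the example above.</p>

```python
from email.header import Header, decode_header, make_header
from email.utils import formataddr, parseaddr

subject = 'My visit to Sør-Trøndelag'

# Non-ASCII header text is wrapped in an ASCII-only "encoded word".
encoded_subject = Header(subject, 'utf-8').encode()

# A receiving client reverses the process to recover the text.
decoded_subject = str(make_header(decode_header(encoded_subject)))

# The display name may be non-ASCII; the address itself stays ASCII.
sender = formataddr(('Arnbjörg Ráðormsdóttir', 'arnbjorg@example.com'))
_, address = parseaddr(sender)

print(encoded_subject)
print(sender)
```

<p>Only the display name is encoded; the address part is passed through unchanged, which is why it has to be ASCII to begin with.</p>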
</div>
<div class="section" id="s-form-submission">
<span id="form-submission"></span><h2>Form submission<a class="headerlink" href="#form-submission" title="Permalink to this headline">¶</a></h2>
<p>HTML form submission is a tricky area. There&#8217;s no guarantee that the
submission will include encoding information, which means the framework might
have to guess at the encoding of submitted data.</p>
<p>Django adopts a &#8220;lazy&#8221; approach to decoding form data. The data in an
<code class="docutils literal"><span class="pre">HttpRequest</span></code> object is only decoded when you access it. In fact, most of
the data is not decoded at all. Only the <code class="docutils literal"><span class="pre">HttpRequest.GET</span></code> and
<code class="docutils literal"><span class="pre">HttpRequest.POST</span></code> data structures have any decoding applied to them. Those
two fields will return their members as Unicode data. All other attributes and
methods of <code class="docutils literal"><span class="pre">HttpRequest</span></code> return data exactly as it was submitted by the
client.</p>
<p>By default, the <a class="reference internal" href="settings.html#std:setting-DEFAULT_CHARSET"><code class="xref std std-setting docutils literal"><span class="pre">DEFAULT_CHARSET</span></code></a> setting is used as the assumed encoding
for form data. If you need to change this for a particular form, you can set
the <code class="docutils literal"><span class="pre">encoding</span></code> attribute on an <code class="docutils literal"><span class="pre">HttpRequest</span></code> instance. For example:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">some_view</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
    <span class="c1"># We know that the data must be encoded as KOI8-R (for some reason).</span>
    <span class="n">request</span><span class="o">.</span><span class="n">encoding</span> <span class="o">=</span> <span class="s1">&#39;koi8-r&#39;</span>
    <span class="o">...</span>
</pre></div>
</div>
<p>You can even change the encoding after having accessed <code class="docutils literal"><span class="pre">request.GET</span></code> or
<code class="docutils literal"><span class="pre">request.POST</span></code>, and all subsequent accesses will use the new encoding.</p>
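<p>Why the assumed encoding matters can be shown with the standard library's <code class="docutils literal"><span class="pre">urllib.parse</span></code> module under Python 3. The sketch below does not use <code class="docutils literal"><span class="pre">HttpRequest</span></code> at all; the field name and value are invented for illustration, and the same submitted bytes are simply decoded under two different assumptions.</p>

```python
from urllib.parse import parse_qs, urlencode

original = 'Привет'

# Simulate a legacy client that submits form data encoded as KOI8-R.
body = urlencode({'name': original}, encoding='koi8-r')

# Decoding with the encoding the client actually used recovers the text.
as_koi8r = parse_qs(body, encoding='koi8-r')['name'][0]

# Decoding the same bytes as UTF-8 yields replacement characters instead.
as_utf8 = parse_qs(body, encoding='utf-8', errors='replace')['name'][0]

print(as_koi8r == original)
print(as_utf8 == original)
```

<p>The submitted bytes are identical in both cases; only the decoding assumption changes the result, which is what setting <code class="docutils literal"><span class="pre">request.encoding</span></code> controls.</p>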
<p>Most developers won&#8217;t need to worry about changing form encoding, but this is
a useful feature for applications that talk to legacy systems whose encoding
you cannot control.</p>
<p>Django does not decode the data of file uploads, because that data is normally
treated as collections of bytes, rather than strings. Any automatic decoding
there would alter the meaning of the stream of bytes.</p>
</div>
</div>


          </div>
        </div>
      </div>
      
        
          <div class="yui-b" id="sidebar">
            
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
        </div>
      </div>
              <h3>Last update:</h3>
              <p class="topless">Mar 10, 2018</p>
          </div>
        
      
    </div>

  </div>

      <div class="clearer"></div>
    </div>
  </body>
</html>