<?xml version="1.0" encoding="ascii" ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=ascii" /> <meta name="generator" content="Docutils 0.5: http://docutils.sourceforge.net/" /> <title>End-to-end Tests for Unicode Encoding</title> <link rel="stylesheet" href="../custom.css" type="text/css" /> </head> <body> <div class="document" id="end-to-end-tests-for-unicode-encoding"> <h1 class="title">End-to-end Tests for Unicode Encoding</h1> <div class="section" id="test-function"> <h1>Test Function</h1> <p>The function <cite>testencoding</cite> is used as an end-to-end test for Unicode encodings. It takes a given string, writes it to a Python file, and processes that file's documentation. It then generates HTML output from the documentation, extracts all docstrings from the generated HTML output, and displays them. (In order to extract and display all docstrings, it monkey-patches the HTMLWriter.docstring_to_html() method.)</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span><span class="py-keyword">from</span> epydoc.test.util <span class="py-keyword">import</span> testencoding</pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span><span class="py-keyword">from</span> epydoc.test.util <span class="py-keyword">import</span> print_warnings <span class="py-prompt">>>> </span>print_warnings()</pre> </blockquote> </div> <div class="section" id="encoding-tests"> <h1>Encoding Tests</h1> <p>This section tests the output for a variety of different encodings. 
Note that some encodings (such as cp424) are not supported, because the ASCII coding directive would result in a syntax error once the file is interpreted in the declared encoding.</p> <p>Tests for several Microsoft code pages:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: cp874 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x85"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8230;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: cp1250 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x82 \x84 \x85 \xff"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8218; &#8222; &#8230; &#729;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: cp1251 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x81 \x82 \xff"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#1026; &#1027; &#8218; &#1103;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: cp1252 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x82 \x83 \xff"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8218; &#402; &#255;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: cp1253 -*-</span> <span class="py-more">... 
</span><span class="py-string">"""abc ABC 123 \x80 \x82 \x83 \xfe"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8218; &#402; &#974;</p></span></pre> </blockquote> <p>Unicode tests:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>utf8_test =<span class="py-string">'''\</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123</span> <span class="py-more">...</span> <span class="py-more">... </span><span class="py-string">0x80-0x7ff range:</span> <span class="py-more">... </span><span class="py-string">\xc2\x80 \xc2\x81 \xdf\xbe \xdf\xbf</span> <span class="py-more">...</span> <span class="py-more">... </span><span class="py-string">0x800-0xffff range:</span> <span class="py-more">... </span><span class="py-string">\xe0\xa0\x80 \xe0\xa0\x81 \xef\xbf\xbe \xef\xbf\xbf</span> <span class="py-more">...</span> <span class="py-more">... </span><span class="py-string">0x10000-0x10ffff range:</span> <span class="py-more">... </span><span class="py-string">\xf0\x90\x80\x80 \xf0\x90\x80\x81</span> <span class="py-more">... </span><span class="py-string">\xf4\x8f\xbf\xbe \xf4\x8f\xbf\xbf</span> <span class="py-more">... 
</span><span class="py-string">"""\n'''</span> <span class="py-prompt">>>> </span>utf8_bom = <span class="py-string">'\xef\xbb\xbf'</span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span><span class="py-comment"># UTF-8 with a coding directive:</span> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span><span class="py-comment"># UTF-8 with a BOM & a coding directive:</span> <span class="py-prompt">>>> </span>testencoding(utf8_bom+<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span><span class="py-comment"># UTF-8 with a BOM & no coding directive:</span> <span class="py-prompt">>>> </span>testencoding(utf8_bom+utf8_test) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> </blockquote> <p>Tests for KOI8-R:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span 
class="py-string">'''# -*- coding: koi8-r -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x82 \x83 \xff"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#9472; &#9484; &#9488; &#1066;</p></span></pre> </blockquote> <p>Tests for 'coding' directive on the second line:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''\n# -*- coding: cp1252 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x82 \x83 \xff"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8218; &#402; &#255;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# comment on the first line.\n# -*- coding: cp1252 -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \x80 \x82 \x83 \xff"""</span> <span class="py-more">... 
</span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#8364; &#8218; &#402; &#255;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">"\n# -*- coding: utf-8 -*-\n"</span>+utf8_test) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">"# comment\n# -*- coding: utf-8 -*-\n"</span>+utf8_test) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> </blockquote> <p>Tests for Shift-JIS:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: shift_jis -*-</span> <span class="py-more">... </span><span class="py-string">"""abc ABC 123 \xA1 \xA2 \xA3"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-comment"># doctest: +PYTHON2.4</span> <span class="py-output">abc ABC 123 &#65377; &#65378; &#65379;</span></pre> </blockquote> </div> <div class="section" id="str-unicode-test"> <h1>Str/Unicode Test</h1> <p>Make sure that the declared coding is applied to both str and unicode docstrings.</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... 
</span><span class="py-string">"""abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#128; &#2047; &#2048;</p></span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... </span><span class="py-string">u"""abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output"><p>abc ABC 123 &#128; &#2047; &#2048;</p></span></pre> </blockquote> <p>Under special circumstances, we may not be able to determine the proper encoding for a docstring. This happens when all of the following hold:</p> <ol class="arabic simple"> <li>The docstring is only available via introspection.</li> <li>We are unable to determine which module the object that owns the docstring came from.</li> <li>The docstring contains non-ASCII characters.</li> </ol> <p>In that case, we issue a warning and treat the docstring as latin-1. One example is a non-unicode docstring for a property:</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... </span><span class="py-string">p=property(doc="""\xc2\x80""")</span> <span class="py-more">... 
</span><span class="py-string">'''</span>) <span class="py-comment"># doctest: +ELLIPSIS</span> <span class="py-output"><property object at ...>'s docstring is not a unicode string, but it contains non-ascii data -- treating it as latin-1.</span> <span class="py-output">&#194;&#128;</span></pre> </blockquote> </div> <div class="section" id="introspection-parsing-tests"> <h1>Introspection/Parsing Tests</h1> <p>This section checks that introspection and parsing each produce the right results. The first group passes introspect=False, so the documentation comes from parsing alone; the second passes parse=False, so it comes from introspection alone.</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test, introspect=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span> <span class="py-output"></span><span class="py-prompt">>>> </span>testencoding(utf8_bom+<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test, introspect=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span> <span class="py-output"></span><span class="py-prompt">>>> </span>testencoding(utf8_bom+utf8_test, introspect=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> <pre 
class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test, parse=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span> <span class="py-output"></span><span class="py-prompt">>>> </span>testencoding(utf8_bom+<span class="py-string">"# -*- coding: utf-8 -*-\n"</span>+utf8_test, parse=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span> <span class="py-output"></span><span class="py-prompt">>>> </span>testencoding(utf8_bom+utf8_test, parse=False) <span class="py-output"><p>abc ABC 123</p></span> <span class="py-output"><p>0x80-0x7ff range: &#128; &#129; &#2046; &#2047;</p></span> <span class="py-output"><p>0x800-0xffff range: &#2048; &#2049; &#65534; &#65535;</p></span> <span class="py-output"><p>0x10000-0x10ffff range: &#65536; &#65537; &#1114110; &#1114111;</p></span></pre> </blockquote> </div> <div class="section" id="context-checks"> <h1>Context Checks</h1> <p>Make sure that docstrings are rendered correctly in different contexts: module-level fields, function fields, and class fields (including properties).</p> <blockquote> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... </span><span class="py-string">"""</span> <span class="py-more">... </span><span class="py-string">@var x: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... 
</span><span class="py-string">@group \xc2\x80: x</span> <span class="py-more">... </span><span class="py-string">"""</span> <span class="py-more">... </span><span class="py-string">'''</span>) <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... </span><span class="py-string">def f(x):</span> <span class="py-more">... </span><span class="py-string"> """</span> <span class="py-more">... </span><span class="py-string"> abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @param x: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @type x: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @return: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @rtype: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @except X: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> """</span> <span class="py-more">... 
</span><span class="py-string">'''</span>) <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output"><p>abc ABC 123 &#128; &#2047; &#2048;</p></span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span></pre> <pre class="py-doctest"> <span class="py-prompt">>>> </span>testencoding(<span class="py-string">'''# -*- coding: utf-8 -*-</span> <span class="py-more">... </span><span class="py-string">class A:</span> <span class="py-more">... </span><span class="py-string"> """</span> <span class="py-more">... </span><span class="py-string"> abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @ivar x: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @cvar y: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> @type x: abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80</span> <span class="py-more">... </span><span class="py-string"> """</span> <span class="py-more">...</span> <span class="py-more">... </span><span class="py-string"> z = property(doc=u"abc ABC 123 \xc2\x80 \xdf\xbf \xe0\xa0\x80")</span> <span class="py-more">... 
</span><span class="py-string">'''</span>) <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output"><p>abc ABC 123 &#128; &#2047; &#2048;</p></span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span> <span class="py-output">abc ABC 123 &#128; &#2047; &#2048;</span></pre> </blockquote> </div> </div> </body> </html>