<!-- HTML header for doxygen 1.8.7--> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/> <meta http-equiv="X-UA-Compatible" content="IE=9"/> <meta name="generator" content="Doxygen 1.8.13"/> <title>RapidJSON: Internals</title> <link href="tabs.css" rel="stylesheet" type="text/css"/> <script type="text/javascript" src="jquery.js"></script> <script type="text/javascript" src="dynsections.js"></script> <link href="navtree.css" rel="stylesheet" type="text/css"/> <script type="text/javascript" src="resize.js"></script> <script type="text/javascript" src="navtreedata.js"></script> <script type="text/javascript" src="navtree.js"></script> <script type="text/javascript"> $(document).ready(initResizable); </script> <link href="search/search.css" rel="stylesheet" type="text/css"/> <script type="text/javascript" src="search/searchdata.js"></script> <script type="text/javascript" src="search/search.js"></script> <script type="text/javascript"> $(document).ready(function() { init_search(); }); </script> <link href="doxygen.css" rel="stylesheet" type="text/css" /> <link href="doxygenextra.css" rel="stylesheet" type="text/css"/> </head> <body> <div id="top"><!-- do not remove this div, it is closed by doxygen! --> <div id="topbanner"><a href="https://github.com/miloyip/rapidjson" title="RapidJSON GitHub"><i class="githublogo"></i></a></div> <div id="MSearchBox" class="MSearchBoxInactive"> <span class="left"> <img id="MSearchSelect" src="search/mag_sel.png" onmouseover="return searchBox.OnSearchSelectShow()" onmouseout="return searchBox.OnSearchSelectHide()" alt=""/> <input type="text" id="MSearchField" value="Search" accesskey="S" onfocus="searchBox.OnSearchFieldFocus(true)" onblur="searchBox.OnSearchFieldFocus(false)" onkeyup="searchBox.OnSearchFieldChange(event)"/> </span><span class="right"> <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a> </span> </div> <!-- end header part --> <!-- Generated by Doxygen 1.8.13 --> <script type="text/javascript"> var searchBox = new SearchBox("searchBox", "search",false,'Search'); </script> </div><!-- top --> <div id="side-nav" class="ui-resizable side-nav-resizable"> <div id="nav-tree"> <div id="nav-tree-contents"> <div id="nav-sync" class="sync"></div> </div> </div> <div id="splitbar" style="-moz-user-select:none;" class="ui-resizable-handle"> </div> </div> <script type="text/javascript"> $(document).ready(function(){initNavTree('md_doc_internals.html','');}); </script> <div id="doc-content"> <!-- window showing the filter options --> <div id="MSearchSelectWindow" onmouseover="return searchBox.OnSearchSelectShow()" onmouseout="return searchBox.OnSearchSelectHide()" onkeydown="return searchBox.OnSearchSelectKey(event)"> </div> <!-- iframe showing the search results (closed by default) --> <div id="MSearchResultsWindow"> <iframe src="javascript:void(0)" frameborder="0" name="MSearchResults" id="MSearchResults"> </iframe> </div> <div class="header"> <div class="headertitle"> <div class="title">Internals </div> </div> </div><!--header--> <div class="contents"> <div class="toc"><h3>Table of Contents</h3> <ul><li class="level1"><a href="#Architecture">Architecture</a></li> <li class="level1"><a href="#Value">Value</a><ul><li class="level2"><a href="#DataLayout">Data Layout</a></li> <li class="level2"><a href="#Flags">Flags</a></li> <li class="level2"><a href="#ShortString">Short-String Optimization</a></li> </ul> </li> <li class="level1"><a href="#InternalAllocator">Allocator</a><ul><li class="level2"><a href="#MemoryPoolAllocator">MemoryPoolAllocator</a></li> </ul> </li> <li class="level1"><a href="#ParsingOptimization">Parsing Optimization</a><ul><li class="level2"><a href="#SkipwhitespaceWithSIMD">Skip Whitespaces with SIMD</a></li> <li class="level2"><a href="#LocalStreamCopy">Local Stream Copy</a></li> <li class="level2"><a href="#ParsingDouble">Parsing to Double</a></li> </ul> </li> <li class="level1"><a href="#GenerationOptimization">Generation Optimization</a><ul><li class="level2"><a href="#itoa">Integer-to-String conversion</a></li> <li class="level2"><a href="#dtoa">Double-to-String conversion</a></li> </ul> </li> <li class="level1"><a href="#Parser">Parser</a><ul><li class="level2"><a href="#IterativeParser">Iterative Parser</a><ul><li class="level3"><a href="#IterativeParserGrammar">Grammar</a></li> <li class="level3"><a href="#IterativeParserParsingTable">Parsing Table</a></li> <li class="level3"><a href="#IterativeParserImplementation">Implementation</a></li> </ul> </li> </ul> </li> </ul> </div> <div class="textblock"><p>This section records some design and implementation details.</p> <h1><a class="anchor" id="Architecture"></a> Architecture</h1> <h2>SAX and DOM</h2> <p>The basic relationships of SAX and DOM is shown in the following UML diagram.</p> <div class="image"> <img src="architecture.png" alt="architecture.png"/> <div class="caption"> Architecture UML class diagram</div></div> <p> The core of the relationship is the <code>Handler</code> concept. From the SAX side, <code>Reader</code> parses a JSON from a stream and publish events to a <code>Handler</code>. <code>Writer</code> implements the <code>Handler</code> concept to handle the same set of events. From the DOM side, <code>Document</code> implements the <code>Handler</code> concept to build a DOM according to the events. <code>Value</code> supports a <code>Value::Accept(Handler&)</code> function, which traverses the DOM to publish events.</p> <p>With this design, SAX is not dependent on DOM. Even <code>Reader</code> and <code>Writer</code> have no dependencies between them. This provides flexibility to chain event publisher and handlers. Besides, <code>Value</code> does not depends on SAX as well. So, in addition to stringify a DOM to JSON, user may also stringify it to a XML writer, or do anything else.</p> <h2>Utility Classes</h2> <p>Both SAX and DOM APIs depends on 3 additional concepts: <code>Allocator</code>, <code>Encoding</code> and <code>Stream</code>. Their inheritance hierarchy is shown as below.</p> <div class="image"> <img src="utilityclass.png" alt="utilityclass.png"/> <div class="caption"> Utility classes UML class diagram</div></div> <h1><a class="anchor" id="Value"></a> Value</h1> <p><code>Value</code> (actually a typedef of <code>GenericValue<UTF8<>></code>) is the core of DOM API. This section describes the design of it.</p> <h2><a class="anchor" id="DataLayout"></a> Data Layout</h2> <p><code>Value</code> is a <a href="http://en.wikipedia.org/wiki/Variant_type">variant type</a>. In RapidJSON's context, an instance of <code>Value</code> can contain 1 of 6 JSON value types. This is possible by using <code>union</code>. Each <code>Value</code> contains two members: <code>union Data data_</code> and a<code>unsigned flags_</code>. The <code>flags_</code> indiciates the JSON type, and also additional information.</p> <p>The following tables show the data layout of each type. The 32-bit/64-bit columns indicates the size of the field in bytes.</p> <table class="doxtable"> <tr> <th>Null </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNullType kNullFlag</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Bool </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kBoolType</code> (either <code>kTrueFlag</code> or <code>kFalseFlag</code>) </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>String </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>Ch* str</code> </td><td>Pointer to the string (may own) </td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>SizeType length</code> </td><td>Length of string </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kStringType kStringFlag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Object </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>Member* members</code> </td><td>Pointer to array of members (owned) </td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>SizeType size</code> </td><td>Number of members </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>SizeType capacity</code> </td><td>Capacity of members </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kObjectType kObjectFlag</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Array </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>Value* values</code> </td><td>Pointer to array of values (owned) </td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>SizeType size</code> </td><td>Number of values </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>SizeType capacity</code> </td><td>Capacity of values </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kArrayType kArrayFlag</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Number (Int) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>int i</code> </td><td>32-bit signed integer </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(zero padding) </td><td>0 </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNumberType kNumberFlag kIntFlag kInt64Flag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Number (UInt) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>unsigned u</code> </td><td>32-bit unsigned integer </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(zero padding) </td><td>0 </td><td align="center">4 </td><td align="center">4 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNumberType kNumberFlag kUIntFlag kUInt64Flag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Number (Int64) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>int64_t i64</code> </td><td>64-bit signed integer </td><td align="center">8 </td><td align="center">8 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNumberType kNumberFlag kInt64Flag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Number (Uint64) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>uint64_t i64</code> </td><td>64-bit unsigned integer </td><td align="center">8 </td><td align="center">8 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNumberType kNumberFlag kInt64Flag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <table class="doxtable"> <tr> <th>Number (Double) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>uint64_t i64</code> </td><td>Double precision floating-point </td><td align="center">8 </td><td align="center">8 </td></tr> <tr> <td>(unused) </td><td></td><td align="center">4 </td><td align="center">8 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kNumberType kNumberFlag kDoubleFlag</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <p>Here are some notes:</p><ul> <li>To reduce memory consumption for 64-bit architecture, <code>SizeType</code> is typedef as <code>unsigned</code> instead of <code>size_t</code>.</li> <li>Zero padding for 32-bit number may be placed after or before the actual type, according to the endianess. This makes possible for interpreting a 32-bit integer as a 64-bit integer, without any conversion.</li> <li>An <code>Int</code> is always an <code>Int64</code>, but the converse is not always true.</li> </ul> <h2><a class="anchor" id="Flags"></a> Flags</h2> <p>The 32-bit <code>flags_</code> contains both JSON type and other additional information. As shown in the above tables, each JSON type contains redundant <code>kXXXType</code> and <code>kXXXFlag</code>. This design is for optimizing the operation of testing bit-flags (<code>IsNumber()</code>) and obtaining a sequential number for each type (<code>GetType()</code>).</p> <p>String has two optional flags. <code>kCopyFlag</code> means that the string owns a copy of the string. <code>kInlineStrFlag</code> means using <a href="#ShortString">Short-String Optimization</a>.</p> <p>Number is a bit more complicated. For normal integer values, it can contains <code>kIntFlag</code>, <code>kUintFlag</code>, <code>kInt64Flag</code> and/or <code>kUint64Flag</code>, according to the range of the integer. For numbers with fraction, and integers larger than 64-bit range, they will be stored as <code>double</code> with <code>kDoubleFlag</code>.</p> <h2><a class="anchor" id="ShortString"></a> Short-String Optimization</h2> <p><a href="https://github.com/Kosta-Github">Kosta</a> provided a very neat short-string optimization. The optimization idea is given as follow. Excluding the <code>flags_</code>, a <code>Value</code> has 12 or 16 bytes (32-bit or 64-bit) for storing actual data. Instead of storing a pointer to a string, it is possible to store short strings in these space internally. For encoding with 1-byte character type (e.g. <code>char</code>), it can store maximum 11 or 15 characters string inside the <code>Value</code> type.</p> <table class="doxtable"> <tr> <th>ShortString (Ch=char) </th><th></th><th align="center">32-bit</th><th align="center">64-bit </th></tr> <tr> <td><code>Ch str[MaxChars]</code> </td><td>String buffer </td><td align="center">11 </td><td align="center">15 </td></tr> <tr> <td><code>Ch invLength</code> </td><td>MaxChars - Length </td><td align="center">1 </td><td align="center">1 </td></tr> <tr> <td><code>unsigned flags_</code> </td><td><code>kStringType kStringFlag ...</code> </td><td align="center">4 </td><td align="center">4 </td></tr> </table> <p>A special technique is applied. Instead of storing the length of string directly, it stores (MaxChars - length). This make it possible to store 11 characters with trailing <code>\0</code>.</p> <p>This optimization can reduce memory usage for copy-string. It can also improve cache-coherence thus improve runtime performance.</p> <h1><a class="anchor" id="InternalAllocator"></a> Allocator</h1> <p><code>Allocator</code> is a concept in RapidJSON: </p><div class="fragment"><div class="line">concept Allocator {</div><div class="line"> <span class="keyword">static</span> <span class="keyword">const</span> <span class="keywordtype">bool</span> kNeedFree; <span class="comment">//!< Whether this allocator needs to call Free().</span></div><div class="line"><span class="comment"></span></div><div class="line"> <span class="comment">// Allocate a memory block.</span></div><div class="line"> <span class="comment">// \param size of the memory block in bytes.</span></div><div class="line"> <span class="comment">// \returns pointer to the memory block.</span></div><div class="line"> <span class="keywordtype">void</span>* Malloc(<span class="keywordtype">size_t</span> size);</div><div class="line"></div><div class="line"> <span class="comment">// Resize a memory block.</span></div><div class="line"> <span class="comment">// \param originalPtr The pointer to current memory block. Null pointer is permitted.</span></div><div class="line"> <span class="comment">// \param originalSize The current size in bytes. (Design issue: since some allocator may not book-keep this, explicitly pass to it can save memory.)</span></div><div class="line"> <span class="comment">// \param newSize the new size in bytes.</span></div><div class="line"> <span class="keywordtype">void</span>* Realloc(<span class="keywordtype">void</span>* originalPtr, <span class="keywordtype">size_t</span> originalSize, <span class="keywordtype">size_t</span> newSize);</div><div class="line"></div><div class="line"> <span class="comment">// Free a memory block.</span></div><div class="line"> <span class="comment">// \param pointer to the memory block. Null pointer is permitted.</span></div><div class="line"> <span class="keyword">static</span> <span class="keywordtype">void</span> Free(<span class="keywordtype">void</span> *ptr);</div><div class="line">};</div></div><!-- fragment --><p>Note that <code>Malloc()</code> and <code>Realloc()</code> are member functions but <code>Free()</code> is static member function.</p> <h2><a class="anchor" id="MemoryPoolAllocator"></a> MemoryPoolAllocator</h2> <p><code>MemoryPoolAllocator</code> is the default allocator for DOM. It allocate but do not free memory. This is suitable for building a DOM tree.</p> <p>Internally, it allocates chunks of memory from the base allocator (by default <code>CrtAllocator</code>) and stores the chunks as a singly linked list. When user requests an allocation, it allocates memory from the following order:</p> <ol type="1"> <li>User supplied buffer if it is available. (See <a class="el" href="md_doc_dom.html">User Buffer section in DOM</a>)</li> <li>If user supplied buffer is full, use the current memory chunk.</li> <li>If the current block is full, allocate a new block of memory.</li> </ol> <h1><a class="anchor" id="ParsingOptimization"></a> Parsing Optimization</h1> <h2><a class="anchor" id="SkipwhitespaceWithSIMD"></a> Skip Whitespaces with SIMD</h2> <p>When parsing JSON from a stream, the parser need to skip 4 whitespace characters:</p> <ol type="1"> <li>Space (<code>U+0020</code>)</li> <li>Character Tabulation (<code>U+000B</code>)</li> <li>Line Feed (<code>U+000A</code>)</li> <li>Carriage Return (<code>U+000D</code>)</li> </ol> <p>A simple implementation will be simply: </p><div class="fragment"><div class="line"><span class="keywordtype">void</span> <a class="code" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a">SkipWhitespace</a>(InputStream& s) {</div><div class="line"> <span class="keywordflow">while</span> (s.Peek() == <span class="charliteral">' '</span> || s.Peek() == <span class="charliteral">'\n'</span> || s.Peek() == <span class="charliteral">'\r'</span> || s.Peek() == <span class="charliteral">'\t'</span>)</div><div class="line"> s.Take();</div><div class="line">}</div></div><!-- fragment --><p>However, this requires 4 comparisons and a few branching for each character. This was found to be a hot spot.</p> <p>To accelerate this process, SIMD was applied to compare 16 characters with 4 white spaces for each iteration. Currently RapidJSON only supports SSE2 and SSE4.2 instructions for this. And it is only activated for UTF-8 memory streams, including string stream or <em>in situ</em> parsing.</p> <p>To enable this optimization, need to define <code>RAPIDJSON_SSE2</code> or <code>RAPIDJSON_SSE42</code> before including <code><a class="el" href="rapidjson_8h.html" title="common definitions and configuration ">rapidjson.h</a></code>. Some compilers can detect the setting, as in <code>perftest.h</code>:</p> <div class="fragment"><div class="line"><span class="comment">// __SSE2__ and __SSE4_2__ are recognized by gcc, clang, and the Intel compiler.</span></div><div class="line"><span class="comment">// We use -march=native with gmake to enable -msse2 and -msse4.2, if supported.</span></div><div class="line"><span class="preprocessor">#if defined(__SSE4_2__)</span></div><div class="line"><span class="preprocessor"># define RAPIDJSON_SSE42</span></div><div class="line"><span class="preprocessor">#elif defined(__SSE2__)</span></div><div class="line"><span class="preprocessor"># define RAPIDJSON_SSE2</span></div><div class="line"><span class="preprocessor">#endif</span></div></div><!-- fragment --><p>Note that, these are compile-time settings. Running the executable on a machine without such instruction set support will make it crash.</p> <h3>Page boundary issue</h3> <p>In an early version of RapidJSON, <a href="https://code.google.com/archive/p/rapidjson/issues/104">an issue</a> reported that the <code>SkipWhitespace_SIMD()</code> causes crash very rarely (around 1 in 500,000). After investigation, it is suspected that <code>_mm_loadu_si128()</code> accessed bytes after `'\0'`, and across a protected page boundary.</p> <p>In <a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel® 64 and IA-32 Architectures Optimization Reference Manual</a>, section 10.2.1:</p> <blockquote class="doxtable"> <p>To support algorithms requiring unaligned 128-bit SIMD memory accesses, memory buffer allocation by a caller function should consider adding some pad space so that a callee function can safely use the address pointer safely with unaligned 128-bit SIMD memory operations. The minimal padding size should be the width of the SIMD register that might be used in conjunction with unaligned SIMD memory access. </p> </blockquote> <p>This is not feasible as RapidJSON should not enforce such requirement.</p> <p>To fix this issue, currently the routine process bytes up to the next aligned address. After tha, use aligned read to perform SIMD processing. Also see <a href="https://github.com/miloyip/rapidjson/issues/85">#85</a>.</p> <h2><a class="anchor" id="LocalStreamCopy"></a> Local Stream Copy</h2> <p>During optimization, it is found that some compilers cannot localize some member data access of streams into local variables or registers. Experimental results show that for some stream types, making a copy of the stream and used it in inner-loop can improve performance. For example, the actual (non-SIMD) implementation of <code><a class="el" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a" title="Skip the JSON white spaces in a stream. ">SkipWhitespace()</a></code> is implemented as:</p> <div class="fragment"><div class="line"><span class="keyword">template</span><<span class="keyword">typename</span> InputStream></div><div class="line"><span class="keywordtype">void</span> <a class="code" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a">SkipWhitespace</a>(InputStream& is) {</div><div class="line"> internal::StreamLocalCopy<InputStream> copy(is);</div><div class="line"> InputStream& s(copy.s);</div><div class="line"></div><div class="line"> <span class="keywordflow">while</span> (s.Peek() == <span class="charliteral">' '</span> || s.Peek() == <span class="charliteral">'\n'</span> || s.Peek() == <span class="charliteral">'\r'</span> || s.Peek() == <span class="charliteral">'\t'</span>)</div><div class="line"> s.Take();</div><div class="line">}</div></div><!-- fragment --><p>Depending on the traits of stream, <code>StreamLocalCopy</code> will make (or not make) a copy of the stream object, use it locally and copy the states of stream back to the original stream.</p> <h2><a class="anchor" id="ParsingDouble"></a> Parsing to Double</h2> <p>Parsing string into <code>double</code> is difficult. The standard library function <code>strtod()</code> can do the job but it is slow. By default, the parsers use normal precision setting. This has has maximum 3 <a href="http://en.wikipedia.org/wiki/Unit_in_the_last_place">ULP</a> error and implemented in <code>internal::StrtodNormalPrecision()</code>.</p> <p>When using <code>kParseFullPrecisionFlag</code>, the parsers calls <code>internal::StrtodFullPrecision()</code> instead, and this function actually implemented 3 versions of conversion methods.</p><ol type="1"> <li><a href="http://www.exploringbinary.com/fast-path-decimal-to-floating-point-conversion/">Fast-Path</a>.</li> <li>Custom DIY-FP implementation as in <a href="https://github.com/floitsch/double-conversion">double-conversion</a>.</li> <li>Big Integer Method as in (Clinger, William D. How to read floating point numbers accurately. Vol. 25. No. 6. ACM, 1990).</li> </ol> <p>If the first conversion methods fail, it will try the second, and so on.</p> <h1><a class="anchor" id="GenerationOptimization"></a> Generation Optimization</h1> <h2><a class="anchor" id="itoa"></a> Integer-to-String conversion</h2> <p>The naive algorithm for integer-to-string conversion involves division per each decimal digit. We have implemented various implementations and evaluated them in <a href="https://github.com/miloyip/itoa-benchmark">itoa-benchmark</a>.</p> <p>Although SSE2 version is the fastest but the difference is minor by comparing to the first running-up <code>branchlut</code>. And <code>branchlut</code> is pure C++ implementation so we adopt <code>branchlut</code> in RapidJSON.</p> <h2><a class="anchor" id="dtoa"></a> Double-to-String conversion</h2> <p>Originally RapidJSON uses <code>snprintf(..., ..., "%g")</code> to achieve double-to-string conversion. This is not accurate as the default precision is 6. Later we also find that this is slow and there is an alternative.</p> <p>Google's V8 <a href="https://github.com/floitsch/double-conversion">double-conversion</a> implemented a newer, fast algorithm called Grisu3 (Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM Sigplan Notices 45.6 (2010): 233-243.).</p> <p>However, since it is not header-only so that we implemented a header-only version of Grisu2. This algorithm guarantees that the result is always accurate. And in most of cases it produces the shortest (optimal) string representation.</p> <p>The header-only conversion function has been evaluated in <a href="https://github.com/miloyip/dtoa-benchmark">dtoa-benchmark</a>.</p> <h1><a class="anchor" id="Parser"></a> Parser</h1> <h2><a class="anchor" id="IterativeParser"></a> Iterative Parser</h2> <p>The iterative parser is a recursive descent LL(1) parser implemented in a non-recursive manner.</p> <h3><a class="anchor" id="IterativeParserGrammar"></a> Grammar</h3> <p>The grammar used for this parser is based on strict JSON syntax: </p><div class="fragment"><div class="line">S -> array | object</div><div class="line">array -> [ values ]</div><div class="line">object -> { members }</div><div class="line">values -> non-empty-values | ε</div><div class="line">non-empty-values -> value addition-values</div><div class="line">addition-values -> ε | , non-empty-values</div><div class="line">members -> non-empty-members | ε</div><div class="line">non-empty-members -> member addition-members</div><div class="line">addition-members -> ε | , non-empty-members</div><div class="line">member -> STRING : value</div><div class="line">value -> STRING | NUMBER | NULL | BOOLEAN | object | array</div></div><!-- fragment --><p>Note that left factoring is applied to non-terminals <code>values</code> and <code>members</code> to make the grammar be LL(1).</p> <h3><a class="anchor" id="IterativeParserParsingTable"></a> Parsing Table</h3> <p>Based on the grammar, we can construct the FIRST and FOLLOW set.</p> <p>The FIRST set of non-terminals is listed below:</p> <table class="doxtable"> <tr> <th align="center">NON-TERMINAL </th><th align="center">FIRST </th></tr> <tr> <td align="center">array </td><td align="center">[ </td></tr> <tr> <td align="center">object </td><td align="center">{ </td></tr> <tr> <td align="center">values </td><td align="center">ε STRING NUMBER NULL BOOLEAN { [ </td></tr> <tr> <td align="center">addition-values </td><td align="center">ε COMMA </td></tr> <tr> <td align="center">members </td><td align="center">ε STRING </td></tr> <tr> <td align="center">addition-members </td><td align="center">ε COMMA </td></tr> <tr> <td align="center">member </td><td align="center">STRING </td></tr> <tr> <td align="center">value </td><td align="center">STRING NUMBER NULL BOOLEAN { [ </td></tr> <tr> <td align="center">S </td><td align="center">[ { </td></tr> <tr> <td align="center">non-empty-members </td><td align="center">STRING </td></tr> <tr> <td align="center">non-empty-values </td><td align="center">STRING NUMBER NULL BOOLEAN { [ </td></tr> </table> <p>The FOLLOW set is listed below:</p> <table class="doxtable"> <tr> <th align="center">NON-TERMINAL </th><th align="center">FOLLOW </th></tr> <tr> <td align="center">S </td><td align="center">$ </td></tr> <tr> <td align="center">array </td><td align="center">, $ } ] </td></tr> <tr> <td align="center">object </td><td align="center">, $ } ] </td></tr> <tr> <td align="center">values </td><td align="center">] </td></tr> <tr> <td align="center">non-empty-values </td><td align="center">] </td></tr> <tr> <td align="center">addition-values </td><td align="center">] </td></tr> <tr> <td align="center">members </td><td align="center">} </td></tr> <tr> <td align="center">non-empty-members </td><td align="center">} </td></tr> <tr> <td align="center">addition-members </td><td align="center">} </td></tr> <tr> <td align="center">member </td><td align="center">, } </td></tr> <tr> <td align="center">value </td><td align="center">, } ] </td></tr> </table> <p>Finally the parsing table can be constructed from FIRST and FOLLOW set:</p> <table class="doxtable"> <tr> <th align="center">NON-TERMINAL </th><th align="center">[ </th><th align="center">{ </th><th align="center">, </th><th align="center">: </th><th align="center">] </th><th align="center">} </th><th align="center">STRING </th><th align="center">NUMBER </th><th align="center">NULL </th><th align="center">BOOLEAN </th></tr> <tr> <td align="center">S </td><td align="center">array </td><td align="center">object </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">array </td><td align="center">[ values ] </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">object </td><td align="center"></td><td align="center">{ members } </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">values </td><td align="center">non-empty-values </td><td align="center">non-empty-values </td><td align="center"></td><td align="center"></td><td align="center">ε </td><td align="center"></td><td align="center">non-empty-values </td><td align="center">non-empty-values </td><td align="center">non-empty-values </td><td align="center">non-empty-values </td></tr> <tr> <td align="center">non-empty-values </td><td align="center">value addition-values </td><td align="center">value addition-values </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">value addition-values </td><td align="center">value addition-values </td><td align="center">value addition-values </td><td align="center">value addition-values </td></tr> <tr> <td align="center">addition-values </td><td align="center"></td><td align="center"></td><td align="center">, non-empty-values </td><td align="center"></td><td align="center">ε </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">members </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">ε </td><td align="center">non-empty-members </td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">non-empty-members </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">member addition-members </td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">addition-members </td><td align="center"></td><td align="center"></td><td align="center">, non-empty-members </td><td align="center"></td><td align="center"></td><td align="center">ε </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">member </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">STRING : value </td><td align="center"></td><td align="center"></td><td align="center"></td></tr> <tr> <td align="center">value </td><td align="center">array </td><td align="center">object </td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">STRING </td><td align="center">NUMBER </td><td align="center">NULL </td><td align="center">BOOLEAN </td></tr> </table> <p>There is a great <a href="http://hackingoff.com/compilers/predict-first-follow-set">tool</a> for above grammar analysis.</p> <h3><a class="anchor" id="IterativeParserImplementation"></a> Implementation</h3> <p>Based on the parsing table, a direct(or conventional) implementation that pushes the production body in reverse order while generating a production could work.</p> <p>In RapidJSON, several modifications(or adaptations to current design) are made to a direct implementation.</p> <p>First, the parsing table is encoded in a state machine in RapidJSON. States are constructed by the head and body of production. State transitions are constructed by production rules. Besides, extra states are added for productions involved with <code>array</code> and <code>object</code>. In this way the generation of array values or object members would be a single state transition, rather than several pop/push operations in the direct implementation. This also makes the estimation of stack size more easier.</p> <p>The state diagram is shown as follows:</p> <div class="image"> <img src="iterative-parser-states-diagram.png" alt="iterative-parser-states-diagram.png"/> <div class="caption"> State Diagram</div></div> <p> Second, the iterative parser also keeps track of array's value count and object's member count in its internal stack, which may be different from a conventional implementation. </p> </div></div><!-- contents --> </div><!-- doc-content --> <!-- HTML footer for doxygen 1.8.7--> <!-- start footer part --> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> <ul> </ul> </div> </body> </html>