<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="generator" content="rustdoc"> <meta name="description" content="Source to the Rust file `src/libcore/num/dec2flt/algorithm.rs`."> <meta name="keywords" content="rust, rustlang, rust-lang"> <title>algorithm.rs.html -- source</title> <link rel="stylesheet" type="text/css" href="../../../../normalize.css"> <link rel="stylesheet" type="text/css" href="../../../../rustdoc.css"> <link rel="stylesheet" type="text/css" href="../../../../main.css"> <link rel="shortcut icon" href="https://doc.rust-lang.org/favicon.ico"> </head> <body class="rustdoc source"> <!--[if lte IE 8]> <div class="warning"> This old browser is unsupported and will most likely display funky things. </div> <![endif]--> <nav class="sidebar"> <a href='../../../../core/index.html'><img src='https://www.rust-lang.org/logos/rust-logo-128x128-blk-v2.png' alt='logo' width='100'></a> </nav> <nav class="sub"> <form class="search-form js-only"> <div class="search-container"> <input class="search-input" name="search" autocomplete="off" placeholder="Click or press ‘S’ to search, ‘?’ for more options…" type="search"> </div> </form> </nav> <section id='main' class="content"><pre class="line-numbers"><span id="1"> 1</span> <span id="2"> 2</span> <span id="3"> 3</span> <span id="4"> 4</span> <span id="5"> 5</span> <span id="6"> 6</span> <span id="7"> 7</span> <span id="8"> 8</span> <span id="9"> 9</span> <span id="10"> 10</span> <span id="11"> 11</span> <span id="12"> 12</span> <span id="13"> 13</span> <span id="14"> 14</span> <span id="15"> 15</span> <span id="16"> 16</span> <span id="17"> 17</span> <span id="18"> 18</span> <span id="19"> 19</span> <span id="20"> 20</span> <span id="21"> 21</span> <span id="22"> 22</span> <span id="23"> 23</span> <span id="24"> 24</span> <span id="25"> 25</span> <span id="26"> 26</span> <span id="27"> 27</span> <span id="28"> 28</span> <span 
id="29"> 29</span> <span id="30"> 30</span> <span id="31"> 31</span> <span id="32"> 32</span> <span id="33"> 33</span> <span id="34"> 34</span> <span id="35"> 35</span> <span id="36"> 36</span> <span id="37"> 37</span> <span id="38"> 38</span> <span id="39"> 39</span> <span id="40"> 40</span> <span id="41"> 41</span> <span id="42"> 42</span> <span id="43"> 43</span> <span id="44"> 44</span> <span id="45"> 45</span> <span id="46"> 46</span> <span id="47"> 47</span> <span id="48"> 48</span> <span id="49"> 49</span> <span id="50"> 50</span> <span id="51"> 51</span> <span id="52"> 52</span> <span id="53"> 53</span> <span id="54"> 54</span> <span id="55"> 55</span> <span id="56"> 56</span> <span id="57"> 57</span> <span id="58"> 58</span> <span id="59"> 59</span> <span id="60"> 60</span> <span id="61"> 61</span> <span id="62"> 62</span> <span id="63"> 63</span> <span id="64"> 64</span> <span id="65"> 65</span> <span id="66"> 66</span> <span id="67"> 67</span> <span id="68"> 68</span> <span id="69"> 69</span> <span id="70"> 70</span> <span id="71"> 71</span> <span id="72"> 72</span> <span id="73"> 73</span> <span id="74"> 74</span> <span id="75"> 75</span> <span id="76"> 76</span> <span id="77"> 77</span> <span id="78"> 78</span> <span id="79"> 79</span> <span id="80"> 80</span> <span id="81"> 81</span> <span id="82"> 82</span> <span id="83"> 83</span> <span id="84"> 84</span> <span id="85"> 85</span> <span id="86"> 86</span> <span id="87"> 87</span> <span id="88"> 88</span> <span id="89"> 89</span> <span id="90"> 90</span> <span id="91"> 91</span> <span id="92"> 92</span> <span id="93"> 93</span> <span id="94"> 94</span> <span id="95"> 95</span> <span id="96"> 96</span> <span id="97"> 97</span> <span id="98"> 98</span> <span id="99"> 99</span> <span id="100">100</span> <span id="101">101</span> <span id="102">102</span> <span id="103">103</span> <span id="104">104</span> <span id="105">105</span> <span id="106">106</span> <span id="107">107</span> <span 
id="108">108</span> <span id="109">109</span> <span id="110">110</span> <span id="111">111</span> <span id="112">112</span> <span id="113">113</span> <span id="114">114</span> <span id="115">115</span> <span id="116">116</span> <span id="117">117</span> <span id="118">118</span> <span id="119">119</span> <span id="120">120</span> <span id="121">121</span> <span id="122">122</span> <span id="123">123</span> <span id="124">124</span> <span id="125">125</span> <span id="126">126</span> <span id="127">127</span> <span id="128">128</span> <span id="129">129</span> <span id="130">130</span> <span id="131">131</span> <span id="132">132</span> <span id="133">133</span> <span id="134">134</span> <span id="135">135</span> <span id="136">136</span> <span id="137">137</span> <span id="138">138</span> <span id="139">139</span> <span id="140">140</span> <span id="141">141</span> <span id="142">142</span> <span id="143">143</span> <span id="144">144</span> <span id="145">145</span> <span id="146">146</span> <span id="147">147</span> <span id="148">148</span> <span id="149">149</span> <span id="150">150</span> <span id="151">151</span> <span id="152">152</span> <span id="153">153</span> <span id="154">154</span> <span id="155">155</span> <span id="156">156</span> <span id="157">157</span> <span id="158">158</span> <span id="159">159</span> <span id="160">160</span> <span id="161">161</span> <span id="162">162</span> <span id="163">163</span> <span id="164">164</span> <span id="165">165</span> <span id="166">166</span> <span id="167">167</span> <span id="168">168</span> <span id="169">169</span> <span id="170">170</span> <span id="171">171</span> <span id="172">172</span> <span id="173">173</span> <span id="174">174</span> <span id="175">175</span> <span id="176">176</span> <span id="177">177</span> <span id="178">178</span> <span id="179">179</span> <span id="180">180</span> <span id="181">181</span> <span id="182">182</span> <span id="183">183</span> <span id="184">184</span> 
<span id="185">185</span> <span id="186">186</span> <span id="187">187</span> <span id="188">188</span> <span id="189">189</span> <span id="190">190</span> <span id="191">191</span> <span id="192">192</span> <span id="193">193</span> <span id="194">194</span> <span id="195">195</span> <span id="196">196</span> <span id="197">197</span> <span id="198">198</span> <span id="199">199</span> <span id="200">200</span> <span id="201">201</span> <span id="202">202</span> <span id="203">203</span> <span id="204">204</span> <span id="205">205</span> <span id="206">206</span> <span id="207">207</span> <span id="208">208</span> <span id="209">209</span> <span id="210">210</span> <span id="211">211</span> <span id="212">212</span> <span id="213">213</span> <span id="214">214</span> <span id="215">215</span> <span id="216">216</span> <span id="217">217</span> <span id="218">218</span> <span id="219">219</span> <span id="220">220</span> <span id="221">221</span> <span id="222">222</span> <span id="223">223</span> <span id="224">224</span> <span id="225">225</span> <span id="226">226</span> <span id="227">227</span> <span id="228">228</span> <span id="229">229</span> <span id="230">230</span> <span id="231">231</span> <span id="232">232</span> <span id="233">233</span> <span id="234">234</span> <span id="235">235</span> <span id="236">236</span> <span id="237">237</span> <span id="238">238</span> <span id="239">239</span> <span id="240">240</span> <span id="241">241</span> <span id="242">242</span> <span id="243">243</span> <span id="244">244</span> <span id="245">245</span> <span id="246">246</span> <span id="247">247</span> <span id="248">248</span> <span id="249">249</span> <span id="250">250</span> <span id="251">251</span> <span id="252">252</span> <span id="253">253</span> <span id="254">254</span> <span id="255">255</span> <span id="256">256</span> <span id="257">257</span> <span id="258">258</span> <span id="259">259</span> <span id="260">260</span> <span 
id="261">261</span> <span id="262">262</span> <span id="263">263</span> <span id="264">264</span> <span id="265">265</span> <span id="266">266</span> <span id="267">267</span> <span id="268">268</span> <span id="269">269</span> <span id="270">270</span> <span id="271">271</span> <span id="272">272</span> <span id="273">273</span> <span id="274">274</span> <span id="275">275</span> <span id="276">276</span> <span id="277">277</span> <span id="278">278</span> <span id="279">279</span> <span id="280">280</span> <span id="281">281</span> <span id="282">282</span> <span id="283">283</span> <span id="284">284</span> <span id="285">285</span> <span id="286">286</span> <span id="287">287</span> <span id="288">288</span> <span id="289">289</span> <span id="290">290</span> <span id="291">291</span> <span id="292">292</span> <span id="293">293</span> <span id="294">294</span> <span id="295">295</span> <span id="296">296</span> <span id="297">297</span> <span id="298">298</span> <span id="299">299</span> <span id="300">300</span> <span id="301">301</span> <span id="302">302</span> <span id="303">303</span> <span id="304">304</span> <span id="305">305</span> <span id="306">306</span> <span id="307">307</span> <span id="308">308</span> <span id="309">309</span> <span id="310">310</span> <span id="311">311</span> <span id="312">312</span> <span id="313">313</span> <span id="314">314</span> <span id="315">315</span> <span id="316">316</span> <span id="317">317</span> <span id="318">318</span> <span id="319">319</span> <span id="320">320</span> <span id="321">321</span> <span id="322">322</span> <span id="323">323</span> <span id="324">324</span> <span id="325">325</span> <span id="326">326</span> <span id="327">327</span> <span id="328">328</span> <span id="329">329</span> <span id="330">330</span> <span id="331">331</span> <span id="332">332</span> <span id="333">333</span> <span id="334">334</span> <span id="335">335</span> <span id="336">336</span> <span id="337">337</span> 
<span id="338">338</span> <span id="339">339</span> <span id="340">340</span> <span id="341">341</span> <span id="342">342</span> <span id="343">343</span> <span id="344">344</span> <span id="345">345</span> <span id="346">346</span> <span id="347">347</span> <span id="348">348</span> <span id="349">349</span> <span id="350">350</span> <span id="351">351</span> <span id="352">352</span> <span id="353">353</span> <span id="354">354</span> <span id="355">355</span> <span id="356">356</span> <span id="357">357</span> <span id="358">358</span> <span id="359">359</span> <span id="360">360</span> <span id="361">361</span> <span id="362">362</span> <span id="363">363</span> <span id="364">364</span> <span id="365">365</span> <span id="366">366</span> <span id="367">367</span> <span id="368">368</span> <span id="369">369</span> <span id="370">370</span> <span id="371">371</span> <span id="372">372</span> <span id="373">373</span> <span id="374">374</span> <span id="375">375</span> <span id="376">376</span> <span id="377">377</span> <span id="378">378</span> <span id="379">379</span> <span id="380">380</span> <span id="381">381</span> <span id="382">382</span> <span id="383">383</span> <span id="384">384</span> <span id="385">385</span> <span id="386">386</span> <span id="387">387</span> <span id="388">388</span> <span id="389">389</span> <span id="390">390</span> <span id="391">391</span> <span id="392">392</span> <span id="393">393</span> <span id="394">394</span> <span id="395">395</span> <span id="396">396</span> <span id="397">397</span> <span id="398">398</span> <span id="399">399</span> <span id="400">400</span> <span id="401">401</span> <span id="402">402</span> <span id="403">403</span> <span id="404">404</span> <span id="405">405</span> <span id="406">406</span> <span id="407">407</span> <span id="408">408</span> <span id="409">409</span> <span id="410">410</span> <span id="411">411</span> <span id="412">412</span> <span id="413">413</span> <span 
id="414">414</span> <span id="415">415</span> <span id="416">416</span> <span id="417">417</span> <span id="418">418</span> <span id="419">419</span> <span id="420">420</span> <span id="421">421</span> <span id="422">422</span> <span id="423">423</span> <span id="424">424</span> </pre><pre class="rust "> <span class="comment">// Copyright 2015 The Rust Project Developers. See the COPYRIGHT</span> <span class="comment">// file at the top-level directory of this distribution and at</span> <span class="comment">// http://rust-lang.org/COPYRIGHT.</span> <span class="comment">//</span> <span class="comment">// Licensed under the Apache License, Version 2.0 &lt;LICENSE-APACHE or</span> <span class="comment">// http://www.apache.org/licenses/LICENSE-2.0&gt; or the MIT license</span> <span class="comment">// &lt;LICENSE-MIT or http://opensource.org/licenses/MIT&gt;, at your</span> <span class="comment">// option. This file may not be copied, modified, or distributed</span> <span class="comment">// except according to those terms.</span> <span class="doccomment">//! 
The various algorithms from the paper.</span> <span class="kw">use</span> <span class="ident">cmp</span>::<span class="ident">min</span>; <span class="kw">use</span> <span class="ident">cmp</span>::<span class="ident">Ordering</span>::{<span class="ident">Less</span>, <span class="ident">Equal</span>, <span class="ident">Greater</span>}; <span class="kw">use</span> <span class="ident">num</span>::<span class="ident">diy_float</span>::<span class="ident">Fp</span>; <span class="kw">use</span> <span class="ident">num</span>::<span class="ident">dec2flt</span>::<span class="ident">table</span>; <span class="kw">use</span> <span class="ident">num</span>::<span class="ident">dec2flt</span>::<span class="ident">rawfp</span>::{<span class="self">self</span>, <span class="ident">Unpacked</span>, <span class="ident">RawFloat</span>, <span class="ident">fp_to_float</span>, <span class="ident">next_float</span>, <span class="ident">prev_float</span>}; <span class="kw">use</span> <span class="ident">num</span>::<span class="ident">dec2flt</span>::<span class="ident">num</span>::{<span class="self">self</span>, <span class="ident">Big</span>}; <span class="doccomment">/// Number of significand bits in Fp</span> <span class="kw">const</span> <span class="ident">P</span>: <span class="ident">u32</span> <span class="op">=</span> <span class="number">64</span>; <span class="comment">// We simply store the best approximation for *all* exponents, so the variable "h" and the</span> <span class="comment">// associated conditions can be omitted. 
This trades a couple of kilobytes of space for performance.</span> <span class="kw">fn</span> <span class="ident">power_of_ten</span>(<span class="ident">e</span>: <span class="ident">i16</span>) <span class="op">-></span> <span class="ident">Fp</span> { <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">e</span> <span class="op">>=</span> <span class="ident">table</span>::<span class="ident">MIN_E</span>); <span class="kw">let</span> <span class="ident">i</span> <span class="op">=</span> <span class="ident">e</span> <span class="op">-</span> <span class="ident">table</span>::<span class="ident">MIN_E</span>; <span class="kw">let</span> <span class="ident">sig</span> <span class="op">=</span> <span class="ident">table</span>::<span class="ident">POWERS</span>.<span class="number">0</span>[<span class="ident">i</span> <span class="kw">as</span> <span class="ident">usize</span>]; <span class="kw">let</span> <span class="ident">exp</span> <span class="op">=</span> <span class="ident">table</span>::<span class="ident">POWERS</span>.<span class="number">1</span>[<span class="ident">i</span> <span class="kw">as</span> <span class="ident">usize</span>]; <span class="ident">Fp</span> { <span class="ident">f</span>: <span class="ident">sig</span>, <span class="ident">e</span>: <span class="ident">exp</span> } } <span class="comment">// On most architectures, floating point operations have an explicit bit size, therefore the</span> <span class="comment">// precision of the computation is determined on a per-operation basis.</span> <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">not</span>(<span class="ident">target_arch</span><span class="op">=</span><span class="string">"x86"</span>), <span class="ident">target_feature</span><span class="op">=</span><span class="string">"sse2"</span>))]</span> <span class="kw">mod</span> <span class="ident">fpu_precision</span> { <span 
class="kw">pub</span> <span class="kw">fn</span> <span class="ident">set_precision</span><span class="op"><</span><span class="ident">T</span><span class="op">></span>() { } } <span class="comment">// On x86, the x87 FPU is used for float operations if the SSE/SSE2 extensions are not available.</span> <span class="comment">// The x87 FPU operates with 80 bits of precision by default, which means that operations will</span> <span class="comment">// round to 80 bits, causing double rounding to happen when values are eventually represented as</span> <span class="comment">// 32/64 bit float values. To overcome this, the FPU control word can be set so that the</span> <span class="comment">// computations are performed in the desired precision.</span> <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">all</span>(<span class="ident">target_arch</span><span class="op">=</span><span class="string">"x86"</span>, <span class="ident">not</span>(<span class="ident">target_feature</span><span class="op">=</span><span class="string">"sse2"</span>)))]</span> <span class="kw">mod</span> <span class="ident">fpu_precision</span> { <span class="kw">use</span> <span class="ident">mem</span>::<span class="ident">size_of</span>; <span class="doccomment">/// A structure used to preserve the original value of the FPU control word, so that it can be</span> <span class="doccomment">/// restored when the structure is dropped.</span> <span class="doccomment">///</span> <span class="doccomment">/// The x87 FPU control word is a 16-bit register whose fields are as follows:</span> <span class="doccomment">///</span> <span class="doccomment">/// | 12-15 | 10-11 | 8-9 | 6-7 | 5 | 4 | 3 | 2 | 1 | 0 |</span> <span class="doccomment">/// |------:|------:|----:|----:|---:|---:|---:|---:|---:|---:|</span> <span class="doccomment">/// | | RC | PC | | PM | UM | OM | ZM | DM | IM |</span> <span class="doccomment">///</span> <span class="doccomment">/// The documentation for all of the fields 
is available in the IA-32 Architectures Software</span> <span class="doccomment">/// Developer's Manual (Volume 1).</span> <span class="doccomment">///</span> <span class="doccomment">/// The only field which is relevant for the following code is PC, Precision Control. This</span> <span class="doccomment">/// field determines the precision of the operations performed by the FPU. It can be set to:</span> <span class="doccomment">/// - 0b00, single precision i.e. 32-bits</span> <span class="doccomment">/// - 0b10, double precision i.e. 64-bits</span> <span class="doccomment">/// - 0b11, double extended precision i.e. 80-bits (default state)</span> <span class="doccomment">/// The 0b01 value is reserved and should not be used.</span> <span class="kw">pub</span> <span class="kw">struct</span> <span class="ident">FPUControlWord</span>(<span class="ident">u16</span>); <span class="kw">fn</span> <span class="ident">set_cw</span>(<span class="ident">cw</span>: <span class="ident">u16</span>) { <span class="kw">unsafe</span> { <span class="macro">asm</span><span class="macro">!</span>(<span class="string">"fldcw $0"</span> :: <span class="string">"m"</span> (<span class="ident">cw</span>) :: <span class="string">"volatile"</span>) } } <span class="doccomment">/// Set the precision field of the FPU to `T` and return a `FPUControlWord`</span> <span class="kw">pub</span> <span class="kw">fn</span> <span class="ident">set_precision</span><span class="op"><</span><span class="ident">T</span><span class="op">></span>() <span class="op">-></span> <span class="ident">FPUControlWord</span> { <span class="kw">let</span> <span class="ident">cw</span> <span class="op">=</span> <span class="number">0u16</span>; <span class="comment">// Compute the value for the Precision Control field that is appropriate for `T`.</span> <span class="kw">let</span> <span class="ident">cw_precision</span> <span class="op">=</span> <span class="kw">match</span> <span class="ident">size_of</span>::<span 
class="op"><</span><span class="ident">T</span><span class="op">></span>() { <span class="number">4</span> <span class="op">=></span> <span class="number">0x0000</span>, <span class="comment">// 32 bits</span> <span class="number">8</span> <span class="op">=></span> <span class="number">0x0200</span>, <span class="comment">// 64 bits</span> _ <span class="op">=></span> <span class="number">0x0300</span>, <span class="comment">// default, 80 bits</span> }; <span class="comment">// Get the original value of the control word to restore it later, when the</span> <span class="comment">// `FPUControlWord` structure is dropped</span> <span class="kw">unsafe</span> { <span class="macro">asm</span><span class="macro">!</span>(<span class="string">"fnstcw $0"</span> : <span class="string">"=*m"</span> (<span class="kw-2">&</span><span class="ident">cw</span>) ::: <span class="string">"volatile"</span>) } <span class="comment">// Set the control word to the desired precision. This is achieved by masking away the old</span> <span class="comment">// precision (bits 8 and 9, 0x300) and replacing it with the precision flag computed above.</span> <span class="ident">set_cw</span>((<span class="ident">cw</span> <span class="op">&</span> <span class="number">0xFCFF</span>) <span class="op">|</span> <span class="ident">cw_precision</span>); <span class="ident">FPUControlWord</span>(<span class="ident">cw</span>) } <span class="kw">impl</span> <span class="ident">Drop</span> <span class="kw">for</span> <span class="ident">FPUControlWord</span> { <span class="kw">fn</span> <span class="ident">drop</span>(<span class="kw-2">&</span><span class="kw-2">mut</span> <span class="self">self</span>) { <span class="ident">set_cw</span>(<span class="self">self</span>.<span class="number">0</span>) } } } <span class="doccomment">/// The fast path of Bellerophon using machine-sized integers and floats.</span> <span class="doccomment">///</span> <span class="doccomment">/// This is extracted into a 
separate function so that it can be attempted before constructing</span> <span class="doccomment">/// a bignum.</span> <span class="kw">pub</span> <span class="kw">fn</span> <span class="ident">fast_path</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">integral</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">fractional</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">e</span>: <span class="ident">i64</span>) <span class="op">-></span> <span class="prelude-ty">Option</span><span class="op"><</span><span class="ident">T</span><span class="op">></span> { <span class="kw">let</span> <span class="ident">num_digits</span> <span class="op">=</span> <span class="ident">integral</span>.<span class="ident">len</span>() <span class="op">+</span> <span class="ident">fractional</span>.<span class="ident">len</span>(); <span class="comment">// log_10(f64::MAX_SIG) ~ 15.95. 
We compare the exact value to MAX_SIG near the end,</span> <span class="comment">// this is just a quick, cheap rejection (and also frees the rest of the code from</span> <span class="comment">// worrying about underflow).</span> <span class="kw">if</span> <span class="ident">num_digits</span> <span class="op">></span> <span class="number">16</span> { <span class="kw">return</span> <span class="prelude-val">None</span>; } <span class="kw">if</span> <span class="ident">e</span>.<span class="ident">abs</span>() <span class="op">>=</span> <span class="ident">T</span>::<span class="ident">CEIL_LOG5_OF_MAX_SIG</span> <span class="kw">as</span> <span class="ident">i64</span> { <span class="kw">return</span> <span class="prelude-val">None</span>; } <span class="kw">let</span> <span class="ident">f</span> <span class="op">=</span> <span class="ident">num</span>::<span class="ident">from_str_unchecked</span>(<span class="ident">integral</span>.<span class="ident">iter</span>().<span class="ident">chain</span>(<span class="ident">fractional</span>.<span class="ident">iter</span>())); <span class="kw">if</span> <span class="ident">f</span> <span class="op">></span> <span class="ident">T</span>::<span class="ident">MAX_SIG</span> { <span class="kw">return</span> <span class="prelude-val">None</span>; } <span class="comment">// The fast path crucially depends on arithmetic being rounded to the correct number of bits</span> <span class="comment">// without any intermediate rounding. 
On x86 (without SSE or SSE2) this requires the precision</span> <span class="comment">// of the x87 FPU stack to be changed so that it directly rounds to 64/32 bit.</span> <span class="comment">// The `set_precision` function takes care of setting the precision on architectures which</span> <span class="comment">// require setting it by changing the global state (like the control word of the x87 FPU).</span> <span class="kw">let</span> <span class="ident">_cw</span> <span class="op">=</span> <span class="ident">fpu_precision</span>::<span class="ident">set_precision</span>::<span class="op"><</span><span class="ident">T</span><span class="op">></span>(); <span class="comment">// The case e < 0 cannot be folded into the other branch. Negative powers of ten have</span> <span class="comment">// a repeating fractional part in binary, which is rounded, causing real</span> <span class="comment">// (and occasionally quite significant!) errors in the final result.</span> <span class="kw">if</span> <span class="ident">e</span> <span class="op">>=</span> <span class="number">0</span> { <span class="prelude-val">Some</span>(<span class="ident">T</span>::<span class="ident">from_int</span>(<span class="ident">f</span>) <span class="op">*</span> <span class="ident">T</span>::<span class="ident">short_fast_pow10</span>(<span class="ident">e</span> <span class="kw">as</span> <span class="ident">usize</span>)) } <span class="kw">else</span> { <span class="prelude-val">Some</span>(<span class="ident">T</span>::<span class="ident">from_int</span>(<span class="ident">f</span>) <span class="op">/</span> <span class="ident">T</span>::<span class="ident">short_fast_pow10</span>(<span class="ident">e</span>.<span class="ident">abs</span>() <span class="kw">as</span> <span class="ident">usize</span>)) } } <span class="doccomment">/// Algorithm Bellerophon is trivial code justified by non-trivial numeric analysis.</span> <span class="doccomment">///</span> <span class="doccomment">/// 
It rounds `f` to a float with a 64-bit significand and multiplies it by the best approximation</span> <span class="doccomment">/// of `10^e` (in the same floating point format). This is often enough to get the correct result.</span> <span class="doccomment">/// However, when the result is close to halfway between two adjacent (ordinary) floats, the</span> <span class="doccomment">/// compound rounding error from multiplying two approximations means the result may be off by a</span> <span class="doccomment">/// few bits. When this happens, the iterative Algorithm R fixes things up.</span> <span class="doccomment">///</span> <span class="doccomment">/// The hand-wavy "close to halfway" is made precise by the numeric analysis in the paper.</span> <span class="doccomment">/// In the words of Clinger:</span> <span class="doccomment">///</span> <span class="doccomment">/// > Slop, expressed in units of the least significant bit, is an inclusive bound for the error</span> <span class="doccomment">/// > accumulated during the floating point calculation of the approximation to f * 10^e. 
(Slop is</span> <span class="doccomment">/// > not a bound for the true error, but bounds the difference between the approximation z and</span> <span class="doccomment">/// > the best possible approximation that uses p bits of significand.)</span> <span class="kw">pub</span> <span class="kw">fn</span> <span class="ident">bellerophon</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">f</span>: <span class="kw-2">&</span><span class="ident">Big</span>, <span class="ident">e</span>: <span class="ident">i16</span>) <span class="op">-></span> <span class="ident">T</span> { <span class="kw">let</span> <span class="ident">slop</span>; <span class="kw">if</span> <span class="ident">f</span> <span class="op"><=</span> <span class="kw-2">&</span><span class="ident">Big</span>::<span class="ident">from_u64</span>(<span class="ident">T</span>::<span class="ident">MAX_SIG</span>) { <span class="comment">// The cases abs(e) < log5(2^N) are in fast_path()</span> <span class="ident">slop</span> <span class="op">=</span> <span class="kw">if</span> <span class="ident">e</span> <span class="op">>=</span> <span class="number">0</span> { <span class="number">0</span> } <span class="kw">else</span> { <span class="number">3</span> }; } <span class="kw">else</span> { <span class="ident">slop</span> <span class="op">=</span> <span class="kw">if</span> <span class="ident">e</span> <span class="op">>=</span> <span class="number">0</span> { <span class="number">1</span> } <span class="kw">else</span> { <span class="number">4</span> }; } <span class="kw">let</span> <span class="ident">z</span> <span class="op">=</span> <span class="ident">rawfp</span>::<span class="ident">big_to_fp</span>(<span class="ident">f</span>).<span class="ident">mul</span>(<span class="kw-2">&</span><span class="ident">power_of_ten</span>(<span class="ident">e</span>)).<span class="ident">normalize</span>(); <span 
class="kw">let</span> <span class="ident">exp_p_n</span> <span class="op">=</span> <span class="number">1</span> <span class="op"><<</span> (<span class="ident">P</span> <span class="op">-</span> <span class="ident">T</span>::<span class="ident">SIG_BITS</span> <span class="kw">as</span> <span class="ident">u32</span>); <span class="kw">let</span> <span class="ident">lowbits</span>: <span class="ident">i64</span> <span class="op">=</span> (<span class="ident">z</span>.<span class="ident">f</span> <span class="op">%</span> <span class="ident">exp_p_n</span>) <span class="kw">as</span> <span class="ident">i64</span>; <span class="comment">// Is the slop large enough to make a difference when</span> <span class="comment">// rounding to n bits?</span> <span class="kw">if</span> (<span class="ident">lowbits</span> <span class="op">-</span> <span class="ident">exp_p_n</span> <span class="kw">as</span> <span class="ident">i64</span> <span class="op">/</span> <span class="number">2</span>).<span class="ident">abs</span>() <span class="op"><=</span> <span class="ident">slop</span> { <span class="ident">algorithm_r</span>(<span class="ident">f</span>, <span class="ident">e</span>, <span class="ident">fp_to_float</span>(<span class="ident">z</span>)) } <span class="kw">else</span> { <span class="ident">fp_to_float</span>(<span class="ident">z</span>) } } <span class="doccomment">/// An iterative algorithm that improves a floating point approximation of `f * 10^e`.</span> <span class="doccomment">///</span> <span class="doccomment">/// Each iteration gets one unit in the last place closer, which of course takes terribly long to</span> <span class="doccomment">/// converge if `z0` is even mildly off. 
Luckily, when used as fallback for Bellerophon, the</span> <span class="doccomment">/// starting approximation is off by at most one ULP.</span> <span class="kw">fn</span> <span class="ident">algorithm_r</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">f</span>: <span class="kw-2">&</span><span class="ident">Big</span>, <span class="ident">e</span>: <span class="ident">i16</span>, <span class="ident">z0</span>: <span class="ident">T</span>) <span class="op">-></span> <span class="ident">T</span> { <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">z</span> <span class="op">=</span> <span class="ident">z0</span>; <span class="kw">loop</span> { <span class="kw">let</span> <span class="ident">raw</span> <span class="op">=</span> <span class="ident">z</span>.<span class="ident">unpack</span>(); <span class="kw">let</span> (<span class="ident">m</span>, <span class="ident">k</span>) <span class="op">=</span> (<span class="ident">raw</span>.<span class="ident">sig</span>, <span class="ident">raw</span>.<span class="ident">k</span>); <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">x</span> <span class="op">=</span> <span class="ident">f</span>.<span class="ident">clone</span>(); <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">y</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_u64</span>(<span class="ident">m</span>); <span class="comment">// Find positive integers `x`, `y` such that `x / y` is exactly `(f * 10^e) / (m * 2^k)`.</span> <span class="comment">// This not only avoids dealing with the signs of `e` and `k`, we also eliminate the</span> <span class="comment">// power of two common to `10^e` and `2^k` to make the numbers smaller.</span> <span class="ident">make_ratio</span>(<span class="kw-2">&</span><span class="kw-2">mut</span> <span 
class="ident">x</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">y</span>, <span class="ident">e</span>, <span class="ident">k</span>); <span class="kw">let</span> <span class="ident">m_digits</span> <span class="op">=</span> [(<span class="ident">m</span> <span class="op">&</span> <span class="number">0xFF_FF_FF_FF</span>) <span class="kw">as</span> <span class="ident">u32</span>, (<span class="ident">m</span> <span class="op">>></span> <span class="number">32</span>) <span class="kw">as</span> <span class="ident">u32</span>]; <span class="comment">// This is written a bit awkwardly because our bignums don't support</span> <span class="comment">// negative numbers, so we use the absolute value + sign information.</span> <span class="comment">// The multiplication with m_digits can't overflow. If `x` or `y` are large enough that</span> <span class="comment">// we need to worry about overflow, then they are also large enough that `make_ratio` has</span> <span class="comment">// reduced the fraction by a factor of 2^64 or more.</span> <span class="kw">let</span> (<span class="ident">d2</span>, <span class="ident">d_negative</span>) <span class="op">=</span> <span class="kw">if</span> <span class="ident">x</span> <span class="op">>=</span> <span class="ident">y</span> { <span class="comment">// Don't need x any more, save a clone().</span> <span class="ident">x</span>.<span class="ident">sub</span>(<span class="kw-2">&</span><span class="ident">y</span>).<span class="ident">mul_pow2</span>(<span class="number">1</span>).<span class="ident">mul_digits</span>(<span class="kw-2">&</span><span class="ident">m_digits</span>); (<span class="ident">x</span>, <span class="bool-val">false</span>) } <span class="kw">else</span> { <span class="comment">// Still need y - make a copy.</span> <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">y</span> <span class="op">=</span> <span class="ident">y</span>.<span 
class="ident">clone</span>(); <span class="ident">y</span>.<span class="ident">sub</span>(<span class="kw-2">&</span><span class="ident">x</span>).<span class="ident">mul_pow2</span>(<span class="number">1</span>).<span class="ident">mul_digits</span>(<span class="kw-2">&</span><span class="ident">m_digits</span>); (<span class="ident">y</span>, <span class="bool-val">true</span>) }; <span class="kw">if</span> <span class="ident">d2</span> <span class="op"><</span> <span class="ident">y</span> { <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">d2_double</span> <span class="op">=</span> <span class="ident">d2</span>; <span class="ident">d2_double</span>.<span class="ident">mul_pow2</span>(<span class="number">1</span>); <span class="kw">if</span> <span class="ident">m</span> <span class="op">==</span> <span class="ident">T</span>::<span class="ident">MIN_SIG</span> <span class="op">&&</span> <span class="ident">d_negative</span> <span class="op">&&</span> <span class="ident">d2_double</span> <span class="op">></span> <span class="ident">y</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">prev_float</span>(<span class="ident">z</span>); } <span class="kw">else</span> { <span class="kw">return</span> <span class="ident">z</span>; } } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">d2</span> <span class="op">==</span> <span class="ident">y</span> { <span class="kw">if</span> <span class="ident">m</span> <span class="op">%</span> <span class="number">2</span> <span class="op">==</span> <span class="number">0</span> { <span class="kw">if</span> <span class="ident">m</span> <span class="op">==</span> <span class="ident">T</span>::<span class="ident">MIN_SIG</span> <span class="op">&&</span> <span class="ident">d_negative</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">prev_float</span>(<span class="ident">z</span>); } <span class="kw">else</span> 
{ <span class="kw">return</span> <span class="ident">z</span>; } } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">d_negative</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">prev_float</span>(<span class="ident">z</span>); } <span class="kw">else</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">next_float</span>(<span class="ident">z</span>); } } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">d_negative</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">prev_float</span>(<span class="ident">z</span>); } <span class="kw">else</span> { <span class="ident">z</span> <span class="op">=</span> <span class="ident">next_float</span>(<span class="ident">z</span>); } } } <span class="doccomment">/// Given `x = f` and `y = m` where `f` represents input decimal digits as usual and `m` is the</span> <span class="doccomment">/// significand of a floating point approximation, make the ratio `x / y` equal to</span> <span class="doccomment">/// `(f * 10^e) / (m * 2^k)`, possibly reduced by a power of two both have in common.</span> <span class="kw">fn</span> <span class="ident">make_ratio</span>(<span class="ident">x</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">Big</span>, <span class="ident">y</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">Big</span>, <span class="ident">e</span>: <span class="ident">i16</span>, <span class="ident">k</span>: <span class="ident">i16</span>) { <span class="kw">let</span> (<span class="ident">e_abs</span>, <span class="ident">k_abs</span>) <span class="op">=</span> (<span class="ident">e</span>.<span class="ident">abs</span>() <span class="kw">as</span> <span class="ident">usize</span>, <span class="ident">k</span>.<span class="ident">abs</span>() <span class="kw">as</span> <span class="ident">usize</span>); <span 
class="kw">if</span> <span class="ident">e</span> <span class="op">>=</span> <span class="number">0</span> { <span class="kw">if</span> <span class="ident">k</span> <span class="op">>=</span> <span class="number">0</span> { <span class="comment">// x = f * 10^e, y = m * 2^k, except that we reduce the fraction by some power of two.</span> <span class="kw">let</span> <span class="ident">common</span> <span class="op">=</span> <span class="ident">min</span>(<span class="ident">e_abs</span>, <span class="ident">k_abs</span>); <span class="ident">x</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">e_abs</span> <span class="op">-</span> <span class="ident">common</span>); <span class="ident">y</span>.<span class="ident">mul_pow2</span>(<span class="ident">k_abs</span> <span class="op">-</span> <span class="ident">common</span>); } <span class="kw">else</span> { <span class="comment">// x = f * 10^e * 2^abs(k), y = m</span> <span class="comment">// This can't overflow because it requires positive `e` and negative `k`, which can</span> <span class="comment">// only happen for values extremely close to 1, which means that `e` and `k` will be</span> <span class="comment">// comparatively tiny.</span> <span class="ident">x</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">e_abs</span> <span class="op">+</span> <span class="ident">k_abs</span>); } } <span class="kw">else</span> { <span class="kw">if</span> <span class="ident">k</span> <span class="op">>=</span> <span class="number">0</span> { <span class="comment">// x = f, y = m * 10^abs(e) * 2^k</span> <span class="comment">// This can't overflow either, see above.</span> <span class="ident">y</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">k_abs</span> <span class="op">+</span> 
<span class="ident">e_abs</span>); } <span class="kw">else</span> { <span class="comment">// x = f * 2^abs(k), y = m * 10^abs(e), again reducing by a common power of two.</span> <span class="kw">let</span> <span class="ident">common</span> <span class="op">=</span> <span class="ident">min</span>(<span class="ident">e_abs</span>, <span class="ident">k_abs</span>); <span class="ident">x</span>.<span class="ident">mul_pow2</span>(<span class="ident">k_abs</span> <span class="op">-</span> <span class="ident">common</span>); <span class="ident">y</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">e_abs</span> <span class="op">-</span> <span class="ident">common</span>); } } } <span class="doccomment">/// Conceptually, Algorithm M is the simplest way to convert a decimal to a float.</span> <span class="doccomment">///</span> <span class="doccomment">/// We form a ratio that is equal to `f * 10^e`, then throw in powers of two until it gives</span> <span class="doccomment">/// a valid float significand. The binary exponent `k` is the number of times we multiplied</span> <span class="doccomment">/// numerator or denominator by two, i.e., at all times `f * 10^e` equals `(u / v) * 2^k`.</span> <span class="doccomment">/// When we have found our significand, we only need to round by inspecting the remainder of the</span> <span class="doccomment">/// division, which is done in helper functions further below.</span> <span class="doccomment">///</span> <span class="doccomment">/// This algorithm is super slow, even with the optimization described in `quick_start()`.</span> <span class="doccomment">/// However, it's the simplest of the algorithms to adapt for overflow, underflow, and subnormal</span> <span class="doccomment">/// results. 
This implementation takes over when Bellerophon and Algorithm R are overwhelmed.</span> <span class="doccomment">/// Detecting underflow and overflow is easy: The ratio still isn't an in-range significand,</span> <span class="doccomment">/// yet the minimum/maximum exponent has been reached. In the case of overflow, we simply return</span> <span class="doccomment">/// infinity.</span> <span class="doccomment">///</span> <span class="doccomment">/// Handling underflow and subnormals is trickier. One big problem is that, with the minimum</span> <span class="doccomment">/// exponent, the ratio might still be too large for a significand. See underflow() for details.</span> <span class="kw">pub</span> <span class="kw">fn</span> <span class="ident">algorithm_m</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">f</span>: <span class="kw-2">&</span><span class="ident">Big</span>, <span class="ident">e</span>: <span class="ident">i16</span>) <span class="op">-></span> <span class="ident">T</span> { <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">u</span>; <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">v</span>; <span class="kw">let</span> <span class="ident">e_abs</span> <span class="op">=</span> <span class="ident">e</span>.<span class="ident">abs</span>() <span class="kw">as</span> <span class="ident">usize</span>; <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">k</span> <span class="op">=</span> <span class="number">0</span>; <span class="kw">if</span> <span class="ident">e</span> <span class="op"><</span> <span class="number">0</span> { <span class="ident">u</span> <span class="op">=</span> <span class="ident">f</span>.<span class="ident">clone</span>(); <span class="ident">v</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_small</span>(<span 
class="number">1</span>); <span class="ident">v</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">e_abs</span>); } <span class="kw">else</span> { <span class="comment">// FIXME possible optimization: generalize big_to_fp so that we can do the equivalent of</span> <span class="comment">// fp_to_float(big_to_fp(u)) here, only without the double rounding.</span> <span class="ident">u</span> <span class="op">=</span> <span class="ident">f</span>.<span class="ident">clone</span>(); <span class="ident">u</span>.<span class="ident">mul_pow5</span>(<span class="ident">e_abs</span>).<span class="ident">mul_pow2</span>(<span class="ident">e_abs</span>); <span class="ident">v</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_small</span>(<span class="number">1</span>); } <span class="ident">quick_start</span>::<span class="op"><</span><span class="ident">T</span><span class="op">></span>(<span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">u</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">v</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">k</span>); <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">rem</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_small</span>(<span class="number">0</span>); <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">x</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_small</span>(<span class="number">0</span>); <span class="kw">let</span> <span class="ident">min_sig</span> <span class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_u64</span>(<span class="ident">T</span>::<span class="ident">MIN_SIG</span>); <span class="kw">let</span> <span class="ident">max_sig</span> <span 
class="op">=</span> <span class="ident">Big</span>::<span class="ident">from_u64</span>(<span class="ident">T</span>::<span class="ident">MAX_SIG</span>); <span class="kw">loop</span> { <span class="ident">u</span>.<span class="ident">div_rem</span>(<span class="kw-2">&</span><span class="ident">v</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">x</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">rem</span>); <span class="kw">if</span> <span class="ident">k</span> <span class="op">==</span> <span class="ident">T</span>::<span class="ident">MIN_EXP_INT</span> { <span class="comment">// We have to stop at the minimum exponent, if we wait until `k < T::MIN_EXP_INT`,</span> <span class="comment">// then we'd be off by a factor of two. Unfortunately this means we have to special-</span> <span class="comment">// case normal numbers with the minimum exponent.</span> <span class="comment">// FIXME find a more elegant formulation, but run the `tiny-pow10` test to make sure</span> <span class="comment">// that it's actually correct!</span> <span class="kw">if</span> <span class="ident">x</span> <span class="op">>=</span> <span class="ident">min_sig</span> <span class="op">&&</span> <span class="ident">x</span> <span class="op"><=</span> <span class="ident">max_sig</span> { <span class="kw">break</span>; } <span class="kw">return</span> <span class="ident">underflow</span>(<span class="ident">x</span>, <span class="ident">v</span>, <span class="ident">rem</span>); } <span class="kw">if</span> <span class="ident">k</span> <span class="op">></span> <span class="ident">T</span>::<span class="ident">MAX_EXP_INT</span> { <span class="kw">return</span> <span class="ident">T</span>::<span class="ident">INFINITY</span>; } <span class="kw">if</span> <span class="ident">x</span> <span class="op"><</span> <span class="ident">min_sig</span> { <span class="ident">u</span>.<span class="ident">mul_pow2</span>(<span 
class="number">1</span>); <span class="ident">k</span> <span class="op">-=</span> <span class="number">1</span>; } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">x</span> <span class="op">></span> <span class="ident">max_sig</span> { <span class="ident">v</span>.<span class="ident">mul_pow2</span>(<span class="number">1</span>); <span class="ident">k</span> <span class="op">+=</span> <span class="number">1</span>; } <span class="kw">else</span> { <span class="kw">break</span>; } } <span class="kw">let</span> <span class="ident">q</span> <span class="op">=</span> <span class="ident">num</span>::<span class="ident">to_u64</span>(<span class="kw-2">&</span><span class="ident">x</span>); <span class="kw">let</span> <span class="ident">z</span> <span class="op">=</span> <span class="ident">rawfp</span>::<span class="ident">encode_normal</span>(<span class="ident">Unpacked</span>::<span class="ident">new</span>(<span class="ident">q</span>, <span class="ident">k</span>)); <span class="ident">round_by_remainder</span>(<span class="ident">v</span>, <span class="ident">rem</span>, <span class="ident">q</span>, <span class="ident">z</span>) } <span class="doccomment">/// Skip over most Algorithm M iterations by checking the bit length.</span> <span class="kw">fn</span> <span class="ident">quick_start</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">u</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">Big</span>, <span class="ident">v</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">Big</span>, <span class="ident">k</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">i16</span>) { <span class="comment">// The bit length is an estimate of the base two logarithm, and log(u / v) = log(u) - log(v).</span> <span class="comment">// The estimate is off by at most 
1, but always an under-estimate, so the error on log(u)</span> <span class="comment">// and log(v) are of the same sign and cancel out (if both are large). Therefore the error</span> <span class="comment">// for log(u / v) is at most one as well.</span> <span class="comment">// The target ratio is one where u/v is in an in-range significand. Thus our termination</span> <span class="comment">// condition is log2(u / v) being the significand bits, plus/minus one.</span> <span class="comment">// FIXME Looking at the second bit could improve the estimate and avoid some more divisions.</span> <span class="kw">let</span> <span class="ident">target_ratio</span> <span class="op">=</span> <span class="ident">T</span>::<span class="ident">SIG_BITS</span> <span class="kw">as</span> <span class="ident">i16</span>; <span class="kw">let</span> <span class="ident">log2_u</span> <span class="op">=</span> <span class="ident">u</span>.<span class="ident">bit_length</span>() <span class="kw">as</span> <span class="ident">i16</span>; <span class="kw">let</span> <span class="ident">log2_v</span> <span class="op">=</span> <span class="ident">v</span>.<span class="ident">bit_length</span>() <span class="kw">as</span> <span class="ident">i16</span>; <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">u_shift</span>: <span class="ident">i16</span> <span class="op">=</span> <span class="number">0</span>; <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">v_shift</span>: <span class="ident">i16</span> <span class="op">=</span> <span class="number">0</span>; <span class="macro">assert</span><span class="macro">!</span>(<span class="kw-2">*</span><span class="ident">k</span> <span class="op">==</span> <span class="number">0</span>); <span class="kw">loop</span> { <span class="kw">if</span> <span class="kw-2">*</span><span class="ident">k</span> <span class="op">==</span> <span class="ident">T</span>::<span class="ident">MIN_EXP_INT</span> { 
<span class="comment">// Underflow or subnormal. Leave it to the main function.</span> <span class="kw">break</span>; } <span class="kw">if</span> <span class="kw-2">*</span><span class="ident">k</span> <span class="op">==</span> <span class="ident">T</span>::<span class="ident">MAX_EXP_INT</span> { <span class="comment">// Overflow. Leave it to the main function.</span> <span class="kw">break</span>; } <span class="kw">let</span> <span class="ident">log2_ratio</span> <span class="op">=</span> (<span class="ident">log2_u</span> <span class="op">+</span> <span class="ident">u_shift</span>) <span class="op">-</span> (<span class="ident">log2_v</span> <span class="op">+</span> <span class="ident">v_shift</span>); <span class="kw">if</span> <span class="ident">log2_ratio</span> <span class="op"><</span> <span class="ident">target_ratio</span> <span class="op">-</span> <span class="number">1</span> { <span class="ident">u_shift</span> <span class="op">+=</span> <span class="number">1</span>; <span class="kw-2">*</span><span class="ident">k</span> <span class="op">-=</span> <span class="number">1</span>; } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">log2_ratio</span> <span class="op">></span> <span class="ident">target_ratio</span> <span class="op">+</span> <span class="number">1</span> { <span class="ident">v_shift</span> <span class="op">+=</span> <span class="number">1</span>; <span class="kw-2">*</span><span class="ident">k</span> <span class="op">+=</span> <span class="number">1</span>; } <span class="kw">else</span> { <span class="kw">break</span>; } } <span class="ident">u</span>.<span class="ident">mul_pow2</span>(<span class="ident">u_shift</span> <span class="kw">as</span> <span class="ident">usize</span>); <span class="ident">v</span>.<span class="ident">mul_pow2</span>(<span class="ident">v_shift</span> <span class="kw">as</span> <span class="ident">usize</span>); } <span class="kw">fn</span> <span 
class="ident">underflow</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">x</span>: <span class="ident">Big</span>, <span class="ident">v</span>: <span class="ident">Big</span>, <span class="ident">rem</span>: <span class="ident">Big</span>) <span class="op">-></span> <span class="ident">T</span> { <span class="kw">if</span> <span class="ident">x</span> <span class="op"><</span> <span class="ident">Big</span>::<span class="ident">from_u64</span>(<span class="ident">T</span>::<span class="ident">MIN_SIG</span>) { <span class="kw">let</span> <span class="ident">q</span> <span class="op">=</span> <span class="ident">num</span>::<span class="ident">to_u64</span>(<span class="kw-2">&</span><span class="ident">x</span>); <span class="kw">let</span> <span class="ident">z</span> <span class="op">=</span> <span class="ident">rawfp</span>::<span class="ident">encode_subnormal</span>(<span class="ident">q</span>); <span class="kw">return</span> <span class="ident">round_by_remainder</span>(<span class="ident">v</span>, <span class="ident">rem</span>, <span class="ident">q</span>, <span class="ident">z</span>); } <span class="comment">// Ratio isn't an in-range significand with the minimum exponent, so we need to round off</span> <span class="comment">// excess bits and adjust the exponent accordingly. The real value now looks like this:</span> <span class="comment">//</span> <span class="comment">// x lsb</span> <span class="comment">// /--------------\/</span> <span class="comment">// 1010101010101010.10101010101010 * 2^k</span> <span class="comment">// \-----/\-------/ \------------/</span> <span class="comment">// q trunc. (represented by rem)</span> <span class="comment">//</span> <span class="comment">// Therefore, when the rounded-off bits are != 0.5 ULP, they decide the rounding</span> <span class="comment">// on their own. 
When they are equal and the remainder is non-zero, the value still</span> <span class="comment">// needs to be rounded up. Only when the rounded-off bits are exactly 0.5 ULP and the remainder</span> <span class="comment">// is zero do we have a half-to-even situation.</span> <span class="kw">let</span> <span class="ident">bits</span> <span class="op">=</span> <span class="ident">x</span>.<span class="ident">bit_length</span>(); <span class="kw">let</span> <span class="ident">lsb</span> <span class="op">=</span> <span class="ident">bits</span> <span class="op">-</span> <span class="ident">T</span>::<span class="ident">SIG_BITS</span> <span class="kw">as</span> <span class="ident">usize</span>; <span class="kw">let</span> <span class="ident">q</span> <span class="op">=</span> <span class="ident">num</span>::<span class="ident">get_bits</span>(<span class="kw-2">&</span><span class="ident">x</span>, <span class="ident">lsb</span>, <span class="ident">bits</span>); <span class="kw">let</span> <span class="ident">k</span> <span class="op">=</span> <span class="ident">T</span>::<span class="ident">MIN_EXP_INT</span> <span class="op">+</span> <span class="ident">lsb</span> <span class="kw">as</span> <span class="ident">i16</span>; <span class="kw">let</span> <span class="ident">z</span> <span class="op">=</span> <span class="ident">rawfp</span>::<span class="ident">encode_normal</span>(<span class="ident">Unpacked</span>::<span class="ident">new</span>(<span class="ident">q</span>, <span class="ident">k</span>)); <span class="kw">let</span> <span class="ident">q_even</span> <span class="op">=</span> <span class="ident">q</span> <span class="op">%</span> <span class="number">2</span> <span class="op">==</span> <span class="number">0</span>; <span class="kw">match</span> <span class="ident">num</span>::<span class="ident">compare_with_half_ulp</span>(<span class="kw-2">&</span><span class="ident">x</span>, <span class="ident">lsb</span>) { <span class="ident">Greater</span> <span 
class="op">=></span> <span class="ident">next_float</span>(<span class="ident">z</span>), <span class="ident">Less</span> <span class="op">=></span> <span class="ident">z</span>, <span class="ident">Equal</span> <span class="kw">if</span> <span class="ident">rem</span>.<span class="ident">is_zero</span>() <span class="op">&&</span> <span class="ident">q_even</span> <span class="op">=></span> <span class="ident">z</span>, <span class="ident">Equal</span> <span class="op">=></span> <span class="ident">next_float</span>(<span class="ident">z</span>), } } <span class="doccomment">/// Ordinary round-to-even, obfuscated by having to round based on the remainder of a division.</span> <span class="kw">fn</span> <span class="ident">round_by_remainder</span><span class="op"><</span><span class="ident">T</span>: <span class="ident">RawFloat</span><span class="op">></span>(<span class="ident">v</span>: <span class="ident">Big</span>, <span class="ident">r</span>: <span class="ident">Big</span>, <span class="ident">q</span>: <span class="ident">u64</span>, <span class="ident">z</span>: <span class="ident">T</span>) <span class="op">-></span> <span class="ident">T</span> { <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">v_minus_r</span> <span class="op">=</span> <span class="ident">v</span>; <span class="ident">v_minus_r</span>.<span class="ident">sub</span>(<span class="kw-2">&</span><span class="ident">r</span>); <span class="kw">if</span> <span class="ident">r</span> <span class="op"><</span> <span class="ident">v_minus_r</span> { <span class="ident">z</span> } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">r</span> <span class="op">></span> <span class="ident">v_minus_r</span> { <span class="ident">next_float</span>(<span class="ident">z</span>) } <span class="kw">else</span> <span class="kw">if</span> <span class="ident">q</span> <span class="op">%</span> <span class="number">2</span> <span class="op">==</span> 
<span class="number">0</span> { <span class="ident">z</span> } <span class="kw">else</span> { <span class="ident">next_float</span>(<span class="ident">z</span>) } } </pre> </section> <section id='search' class="content hidden"></section> <section class="footer"></section> <aside id="help" class="hidden"> <div> <h1 class="hidden">Help</h1> <div class="shortcuts"> <h2>Keyboard Shortcuts</h2> <dl> <dt>?</dt> <dd>Show this help dialog</dd> <dt>S</dt> <dd>Focus the search field</dd> <dt>⇤</dt> <dd>Move up in search results</dd> <dt>⇥</dt> <dd>Move down in search results</dd> <dt>⏎</dt> <dd>Go to active search result</dd> <dt>+</dt> <dd>Collapse/expand all sections</dd> </dl> </div> <div class="infos"> <h2>Search Tricks</h2> <p> Prefix searches with a type followed by a colon (e.g. <code>fn:</code>) to restrict the search to a given type. </p> <p> Accepted types are: <code>fn</code>, <code>mod</code>, <code>struct</code>, <code>enum</code>, <code>trait</code>, <code>type</code>, <code>macro</code>, and <code>const</code>. </p> <p> Search functions by type signature (e.g. <code>vec -> usize</code> or <code>* -> vec</code>) </p> </div> </div> </aside> <script> window.rootPath = "../../../../"; window.currentCrate = "core"; </script> <script src="../../../../main.js"></script> <script defer src="../../../../search-index.js"></script> </body> </html>