<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="X-UA-Compatible" content="IE=Edge" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Design and Usage of the InAlloca Attribute — LLVM 8 documentation</title> <link rel="stylesheet" href="_static/llvm-theme.css" type="text/css" /> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <script type="text/javascript" id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script> <script type="text/javascript" src="_static/jquery.js"></script> <script type="text/javascript" src="_static/underscore.js"></script> <script type="text/javascript" src="_static/doctools.js"></script> <script type="text/javascript" src="_static/language_data.js"></script> <link rel="index" title="Index" href="genindex.html" /> <link rel="search" title="Search" href="search.html" /> <link rel="next" title="Using ARM NEON instructions in big endian mode" href="BigEndianNEON.html" /> <link rel="prev" title="Stack maps and patch points in LLVM" href="StackMaps.html" /> <style type="text/css"> table.right { float: right; margin-left: 20px; } table.right td { border: 1px solid #ccc; } </style> </head><body> <div class="logo"> <a href="index.html"> <img src="_static/logo.png" alt="LLVM Logo" width="250" height="88"/></a> </div> <div class="related" role="navigation" aria-label="related navigation"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="BigEndianNEON.html" title="Using ARM NEON instructions in big endian mode" accesskey="N">next</a> |</li> <li class="right" > <a href="StackMaps.html" title="Stack maps and patch points in LLVM" accesskey="P">previous</a> |</li> <li><a href="http://llvm.org/">LLVM 
Home</a> | </li> <li><a href="index.html">Documentation</a>»</li> </ul> </div> <div class="document"> <div class="documentwrapper"> <div class="body" role="main"> <div class="section" id="design-and-usage-of-the-inalloca-attribute"> <h1>Design and Usage of the InAlloca Attribute<a class="headerlink" href="#design-and-usage-of-the-inalloca-attribute" title="Permalink to this headline">¶</a></h1> <div class="section" id="introduction"> <h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2> <p>The <a class="reference internal" href="LangRef.html#attr-inalloca"><span class="std std-ref">inalloca</span></a> attribute is designed to allow taking the address of an aggregate argument that is being passed by value through memory. Primarily, this feature is required for compatibility with the Microsoft C++ ABI. Under that ABI, class instances that are passed by value are constructed directly into argument stack memory. Prior to the addition of inalloca, calls in LLVM were indivisible instructions. There was no way to perform intermediate work, such as object construction, between the first stack adjustment and the final control transfer. With inalloca, all arguments passed in memory are modelled as a single alloca, which can be stored to prior to the call. Unfortunately, this complicated feature comes with a large set of restrictions designed to bound the lifetime of the argument memory around the call.</p> <p>For now, it is recommended that frontends and optimizers avoid producing this construct, primarily because it forces the use of a base pointer. 
This feature may grow in the future to allow general mid-level optimization, but for now, it should be regarded as less efficient than passing by value with a copy.</p> </div> <div class="section" id="intended-usage"> <h2>Intended Usage<a class="headerlink" href="#intended-usage" title="Permalink to this headline">¶</a></h2> <p>The example below is the intended LLVM IR lowering for some C++ code that passes two default-constructed <code class="docutils literal notranslate"><span class="pre">Foo</span></code> objects to <code class="docutils literal notranslate"><span class="pre">g</span></code> in the 32-bit Microsoft C++ ABI.</p> <div class="highlight-c++ notranslate"><div class="highlight"><pre><span></span><span class="c1">// Foo is non-trivial.</span>
<span class="k">struct</span> <span class="n">Foo</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">;</span> <span class="n">Foo</span><span class="p">();</span> <span class="o">~</span><span class="n">Foo</span><span class="p">();</span> <span class="n">Foo</span><span class="p">(</span><span class="k">const</span> <span class="n">Foo</span> <span class="o">&</span><span class="p">);</span> <span class="p">};</span>
<span class="kt">void</span> <span class="nf">g</span><span class="p">(</span><span class="n">Foo</span> <span class="n">a</span><span class="p">,</span> <span class="n">Foo</span> <span class="n">b</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">g</span><span class="p">(</span><span class="n">Foo</span><span class="p">(),</span> <span class="n">Foo</span><span class="p">());</span>
<span class="p">}</span>
</pre></div> </div> <div class="highlight-text notranslate"><div class="highlight"><pre><span></span>%struct.Foo = type { i32, i32 }
declare void @Foo_ctor(%struct.Foo* %this)
declare void @Foo_dtor(%struct.Foo* %this)
declare void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)

define void @f() {
entry:
  %base = call i8* @llvm.stacksave()
  ; The argument allocation itself must be tagged inalloca.
  %memargs = alloca inalloca <{ %struct.Foo, %struct.Foo }>
  ; Field 1 of the packed argument struct holds b.
  %b = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 1
  call void @Foo_ctor(%struct.Foo* %b)

  ; If a's ctor throws, we must destruct b.
  %a = getelementptr <{ %struct.Foo, %struct.Foo }>, <{ %struct.Foo, %struct.Foo }>* %memargs, i32 0, i32 0
  invoke void @Foo_ctor(%struct.Foo* %a)
      to label %invoke.cont unwind label %invoke.unwind

invoke.cont:
  call void @g(<{ %struct.Foo, %struct.Foo }>* inalloca %memargs)
  call void @llvm.stackrestore(i8* %base)
  ...

invoke.unwind:
  call void @Foo_dtor(%struct.Foo* %b)
  call void @llvm.stackrestore(i8* %base)
  ...
}
</pre></div> </div> <p>To avoid stack leaks, the frontend saves the current stack pointer with a call to <a class="reference internal" href="LangRef.html#int-stacksave"><span class="std std-ref">llvm.stacksave</span></a>. Then, it allocates the argument stack space with alloca and calls the default constructor for each argument. The second constructor call could throw an exception, so the frontend has to create a landing pad that destroys the already constructed argument <code class="docutils literal notranslate"><span class="pre">b</span></code> before restoring the stack pointer. If the constructor does not unwind, <code class="docutils literal notranslate"><span class="pre">g</span></code> is called.
In the Microsoft C++ ABI, <code class="docutils literal notranslate"><span class="pre">g</span></code> will destroy its arguments, and then the stack is restored in <code class="docutils literal notranslate"><span class="pre">f</span></code>.</p> </div> <div class="section" id="design-considerations"> <h2>Design Considerations<a class="headerlink" href="#design-considerations" title="Permalink to this headline">¶</a></h2> <div class="section" id="lifetime"> <h3>Lifetime<a class="headerlink" href="#lifetime" title="Permalink to this headline">¶</a></h3> <p>The biggest design consideration for this feature is object lifetime. We cannot model the arguments as static allocas in the entry block, because all calls need to use the memory at the top of the stack to pass arguments. We cannot vend pointers to that memory at function entry because after code generation they will alias.</p> <p>The rule against allocas between argument allocations and the call site avoids this problem, but it creates a cleanup problem. Cleanup and lifetime are handled explicitly with stack save and restore calls. In the future, we may want to introduce a new construct such as <code class="docutils literal notranslate"><span class="pre">freea</span></code> or <code class="docutils literal notranslate"><span class="pre">afree</span></code> to make it clear that this stack-adjusting cleanup is less powerful than a full stack save and restore.</p> </div> <div class="section" id="nested-calls-and-copy-elision"> <h3>Nested Calls and Copy Elision<a class="headerlink" href="#nested-calls-and-copy-elision" title="Permalink to this headline">¶</a></h3> <p>We also want to be able to support copy elision into these argument slots.
This means we have to support multiple live argument allocations.</p> <p>Consider the evaluation of:</p> <div class="highlight-c++ notranslate"><div class="highlight"><pre><span></span><span class="c1">// Foo is non-trivial.</span>
<span class="k">struct</span> <span class="n">Foo</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">a</span><span class="p">;</span> <span class="n">Foo</span><span class="p">();</span> <span class="n">Foo</span><span class="p">(</span><span class="k">const</span> <span class="n">Foo</span> <span class="o">&</span><span class="p">);</span> <span class="o">~</span><span class="n">Foo</span><span class="p">();</span> <span class="p">};</span>
<span class="n">Foo</span> <span class="nf">bar</span><span class="p">(</span><span class="n">Foo</span> <span class="n">b</span><span class="p">);</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">bar</span><span class="p">(</span><span class="n">bar</span><span class="p">(</span><span class="n">Foo</span><span class="p">()));</span>
<span class="p">}</span>
</pre></div> </div> <p>In this case, we want to be able to elide copies into <code class="docutils literal notranslate"><span class="pre">bar</span></code>’s argument slots. That means we need to have more than one set of argument frames active at the same time. First, we need to allocate the frame for the outer call so we can pass it in as the hidden struct return pointer to the inner call to <code class="docutils literal notranslate"><span class="pre">bar</span></code>. Then we do the same for the inner call, allocating a frame and passing its address to <code class="docutils literal notranslate"><span class="pre">Foo</span></code>’s default constructor.
By wrapping the evaluation of the inner <code class="docutils literal notranslate"><span class="pre">bar</span></code> with stack save and restore, we can have multiple overlapping active call frames.</p> </div> <div class="section" id="callee-cleanup-calling-conventions"> <h3>Callee-cleanup Calling Conventions<a class="headerlink" href="#callee-cleanup-calling-conventions" title="Permalink to this headline">¶</a></h3> <p>Another wrinkle is the existence of callee-cleanup conventions. On Windows, all methods and many other functions adjust the stack to clear the memory used to pass their arguments. In some sense, this means that the allocas are automatically cleared by the call. However, LLVM models this as a write of undef to all of the inalloca values passed to the call rather than as a stack adjustment. Frontends should still restore the stack pointer to avoid a stack leak.</p> </div> <div class="section" id="exceptions"> <h3>Exceptions<a class="headerlink" href="#exceptions" title="Permalink to this headline">¶</a></h3> <p>There is also the possibility of an exception. If argument evaluation or copy construction throws an exception, the landing pad must do cleanup, which includes adjusting the stack pointer to avoid a stack leak. This means the cleanup of the stack memory cannot be tied to the call itself. There needs to be a separate IR-level instruction that can perform independent cleanup of arguments.</p> </div> <div class="section" id="efficiency"> <h3>Efficiency<a class="headerlink" href="#efficiency" title="Permalink to this headline">¶</a></h3> <p>Eventually, it should be possible to generate efficient code for this construct. In particular, using inalloca should not require a base pointer. If the backend can prove that every point in the CFG has only one possible stack level, then it can address the stack directly from the stack pointer.
While this is not yet implemented, the plan is that the inalloca attribute should not change much, but the frontend IR generation recommendations may change.</p> </div> </div> </div> </div> </div> <div class="clearer"></div> </div> <div class="footer" role="contentinfo"> © Copyright 2003-2020, LLVM Project. Last updated on 2020-09-07. Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.8.4. </div> </body> </html>