<!DOCTYPE HTML>
<html lang="en" class="sidebar-visible no-js">
    <head>
        <!-- Book generated using mdBook -->
        <meta charset="UTF-8">
        <title></title>
        <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
        <meta name="description" content="">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <meta name="theme-color" content="#ffffff" />

        <base href="">

        <link rel="stylesheet" href="book.css">
        <link href="https://fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet" type="text/css">
        <link href="https://fonts.googleapis.com/css?family=Source+Code+Pro:500" rel="stylesheet" type="text/css">

        <link rel="shortcut icon" href="favicon.png">

        <!-- Font Awesome -->
        <link rel="stylesheet" href="_FontAwesome/css/font-awesome.css">

        <link rel="stylesheet" href="highlight.css">
        <link rel="stylesheet" href="tomorrow-night.css">
        <link rel="stylesheet" href="ayu-highlight.css">

        <!-- Custom theme stylesheets -->
        

        

    </head>
    <body class="light">
        <!-- Work around some values being stored in localStorage wrapped in quotes -->
        <script type="text/javascript">
            try {
                var theme = localStorage.getItem('mdbook-theme');
                var sidebar = localStorage.getItem('mdbook-sidebar');

                if (theme.startsWith('"') && theme.endsWith('"')) {
                    localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
                }

                if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
                    localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
                }
            } catch (e) { }
        </script>

        <!-- Set the theme before any content is loaded, prevents flash -->
        <script type="text/javascript">
            var theme;
            try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { } 
            if (theme === null || theme === undefined) { theme = 'light'; }
            document.body.className = theme;
            document.querySelector('html').className = theme + ' js';
        </script>

        <!-- Hide / unhide sidebar before it is displayed -->
        <script type="text/javascript">
            var html = document.querySelector('html');
            var sidebar = 'hidden';
            if (document.body.clientWidth >= 1080) {
                try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
                sidebar = sidebar || 'visible';
            }
            html.classList.remove('sidebar-visible');
            html.classList.add("sidebar-" + sidebar);
        </script>

        <nav id="sidebar" class="sidebar" aria-label="Table of contents">
            <ol class="chapter"><li class="affix"><a href="README.html">Introduction</a></li><li><a href="meet-safe-and-unsafe.html"><strong aria-hidden="true">1.</strong> Meet Safe and Unsafe</a></li><li><ol class="section"><li><a href="safe-unsafe-meaning.html"><strong aria-hidden="true">1.1.</strong> How Safe and Unsafe Interact</a></li><li><a href="what-unsafe-does.html"><strong aria-hidden="true">1.2.</strong> What Unsafe Can Do</a></li><li><a href="working-with-unsafe.html"><strong aria-hidden="true">1.3.</strong> Working with Unsafe</a></li></ol></li><li><a href="data.html"><strong aria-hidden="true">2.</strong> Data Layout</a></li><li><ol class="section"><li><a href="repr-rust.html"><strong aria-hidden="true">2.1.</strong> repr(Rust)</a></li><li><a href="exotic-sizes.html"><strong aria-hidden="true">2.2.</strong> Exotically Sized Types</a></li><li><a href="other-reprs.html"><strong aria-hidden="true">2.3.</strong> Other reprs</a></li></ol></li><li><a href="ownership.html"><strong aria-hidden="true">3.</strong> Ownership</a></li><li><ol class="section"><li><a href="references.html"><strong aria-hidden="true">3.1.</strong> References</a></li><li><a href="aliasing.html"><strong aria-hidden="true">3.2.</strong> Aliasing</a></li><li><a href="lifetimes.html"><strong aria-hidden="true">3.3.</strong> Lifetimes</a></li><li><a href="lifetime-mismatch.html"><strong aria-hidden="true">3.4.</strong> Limits of Lifetimes</a></li><li><a href="lifetime-elision.html"><strong aria-hidden="true">3.5.</strong> Lifetime Elision</a></li><li><a href="unbounded-lifetimes.html"><strong aria-hidden="true">3.6.</strong> Unbounded Lifetimes</a></li><li><a href="hrtb.html"><strong aria-hidden="true">3.7.</strong> Higher-Rank Trait Bounds</a></li><li><a href="subtyping.html"><strong aria-hidden="true">3.8.</strong> Subtyping and Variance</a></li><li><a href="dropck.html"><strong aria-hidden="true">3.9.</strong> Drop Check</a></li><li><a href="phantom-data.html"><strong 
aria-hidden="true">3.10.</strong> PhantomData</a></li><li><a href="borrow-splitting.html"><strong aria-hidden="true">3.11.</strong> Splitting Borrows</a></li></ol></li><li><a href="conversions.html"><strong aria-hidden="true">4.</strong> Type Conversions</a></li><li><ol class="section"><li><a href="coercions.html"><strong aria-hidden="true">4.1.</strong> Coercions</a></li><li><a href="dot-operator.html"><strong aria-hidden="true">4.2.</strong> The Dot Operator</a></li><li><a href="casts.html"><strong aria-hidden="true">4.3.</strong> Casts</a></li><li><a href="transmutes.html"><strong aria-hidden="true">4.4.</strong> Transmutes</a></li></ol></li><li><a href="uninitialized.html"><strong aria-hidden="true">5.</strong> Uninitialized Memory</a></li><li><ol class="section"><li><a href="checked-uninit.html"><strong aria-hidden="true">5.1.</strong> Checked</a></li><li><a href="drop-flags.html"><strong aria-hidden="true">5.2.</strong> Drop Flags</a></li><li><a href="unchecked-uninit.html"><strong aria-hidden="true">5.3.</strong> Unchecked</a></li></ol></li><li><a href="obrm.html"><strong aria-hidden="true">6.</strong> Ownership Based Resource Management</a></li><li><ol class="section"><li><a href="constructors.html"><strong aria-hidden="true">6.1.</strong> Constructors</a></li><li><a href="destructors.html"><strong aria-hidden="true">6.2.</strong> Destructors</a></li><li><a href="leaking.html"><strong aria-hidden="true">6.3.</strong> Leaking</a></li></ol></li><li><a href="unwinding.html"><strong aria-hidden="true">7.</strong> Unwinding</a></li><li><ol class="section"><li><a href="exception-safety.html"><strong aria-hidden="true">7.1.</strong> Exception Safety</a></li><li><a href="poisoning.html"><strong aria-hidden="true">7.2.</strong> Poisoning</a></li></ol></li><li><a href="concurrency.html"><strong aria-hidden="true">8.</strong> Concurrency</a></li><li><ol class="section"><li><a href="races.html"><strong aria-hidden="true">8.1.</strong> Races</a></li><li><a 
href="send-and-sync.html"><strong aria-hidden="true">8.2.</strong> Send and Sync</a></li><li><a href="atomics.html"><strong aria-hidden="true">8.3.</strong> Atomics</a></li></ol></li><li><a href="vec.html"><strong aria-hidden="true">9.</strong> Implementing Vec</a></li><li><ol class="section"><li><a href="vec-layout.html"><strong aria-hidden="true">9.1.</strong> Layout</a></li><li><a href="vec-alloc.html"><strong aria-hidden="true">9.2.</strong> Allocating</a></li><li><a href="vec-push-pop.html"><strong aria-hidden="true">9.3.</strong> Push and Pop</a></li><li><a href="vec-dealloc.html"><strong aria-hidden="true">9.4.</strong> Deallocating</a></li><li><a href="vec-deref.html"><strong aria-hidden="true">9.5.</strong> Deref</a></li><li><a href="vec-insert-remove.html"><strong aria-hidden="true">9.6.</strong> Insert and Remove</a></li><li><a href="vec-into-iter.html"><strong aria-hidden="true">9.7.</strong> IntoIter</a></li><li><a href="vec-raw.html"><strong aria-hidden="true">9.8.</strong> RawVec</a></li><li><a href="vec-drain.html"><strong aria-hidden="true">9.9.</strong> Drain</a></li><li><a href="vec-zsts.html"><strong aria-hidden="true">9.10.</strong> Handling Zero-Sized Types</a></li><li><a href="vec-final.html"><strong aria-hidden="true">9.11.</strong> Final Code</a></li></ol></li><li><a href="arc-and-mutex.html"><strong aria-hidden="true">10.</strong> Implementing Arc and Mutex</a></li><li><a href="ffi.html"><strong aria-hidden="true">11.</strong> FFI</a></li></ol>
        </nav>

        <div id="page-wrapper" class="page-wrapper">

            <div class="page">
                
                <div id="menu-bar" class="menu-bar">
                    <div id="menu-bar-sticky-container">
                        <div class="left-buttons">
                            <button id="sidebar-toggle" class="icon-button" type="button" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
                                <i class="fa fa-bars"></i>
                            </button>
                            <button id="theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="theme-list">
                                <i class="fa fa-paint-brush"></i>
                            </button>
                            <ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
                                <li role="none"><button role="menuitem" class="theme" id="light">Light <span class="default">(default)</span></button></li>
                                <li role="none"><button role="menuitem" class="theme" id="rust">Rust</button></li>
                                <li role="none"><button role="menuitem" class="theme" id="coal">Coal</button></li>
                                <li role="none"><button role="menuitem" class="theme" id="navy">Navy</button></li>
                                <li role="none"><button role="menuitem" class="theme" id="ayu">Ayu</button></li>
                            </ul>
                            
                            <button id="search-toggle" class="icon-button" type="button" title="Search. (Shortkey: s)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="S" aria-controls="searchbar">
                                <i class="fa fa-search"></i>
                            </button>
                            
                        </div>

                        <h1 class="menu-title"></h1> 

                        <div class="right-buttons">
                            <a href="print.html" title="Print this book" aria-label="Print this book">
                                <i id="print-button" class="fa fa-print"></i>
                            </a>
                        </div>
                    </div>
                </div>

                
                <div id="search-wrapper" class="hidden">
                    <form id="searchbar-outer" class="searchbar-outer">
                        <input type="search" name="search" id="searchbar" placeholder="Search this book ..." aria-controls="searchresults-outer" aria-describedby="searchresults-header">
                    </form>
                    <div id="searchresults-outer" class="searchresults-outer hidden">
                        <div id="searchresults-header" class="searchresults-header"></div>
                        <ul id="searchresults">
                        </ul>
                    </div>
                </div>
                

                <!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
                <script type="text/javascript">
                    document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
                    document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
                    Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
                        link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
                    });
                </script>

                <div id="content" class="content">
                    <main>
                        <a class="header" href="print.html#the-rustonomicon" id="the-rustonomicon"><h1>The Rustonomicon</h1></a>
<a class="header" href="print.html#the-dark-arts-of-advanced-and-unsafe-rust-programming" id="the-dark-arts-of-advanced-and-unsafe-rust-programming"><h4>The Dark Arts of Advanced and Unsafe Rust Programming</h4></a>
<a class="header" href="print.html#note-this-is-a-draft-document-that-discusses-several-unstable-aspects-of-rust-and-may-contain-serious-errors-or-outdated-information" id="note-this-is-a-draft-document-that-discusses-several-unstable-aspects-of-rust-and-may-contain-serious-errors-or-outdated-information"><h1>NOTE: This is a draft document that discusses several unstable aspects of Rust, and may contain serious errors or outdated information.</h1></a>
<blockquote>
<p>Instead of the programs I had hoped for, there came only a shuddering blackness
and ineffable loneliness; and I saw at last a fearful truth which no one had
ever dared to breathe before — the unwhisperable secret of secrets — The fact
that this language of stone and stridor is not a sentient perpetuation of Rust
as London is of Old London and Paris of Old Paris, but that it is in fact
quite unsafe, its sprawling body imperfectly embalmed and infested with queer
animate things which have nothing to do with it as it was in compilation.</p>
</blockquote>
<p>This book digs into all the awful details that are necessary to understand in
order to write correct Unsafe Rust programs. Due to the nature of this problem,
it may lead to unleashing untold horrors that shatter your psyche into a billion
infinitesimal fragments of despair.</p>
<p>Should you wish a long and happy career of writing Rust programs, you should
turn back now and forget you ever saw this book. It is not necessary. However,
if you intend to write unsafe code, or just want to dig into the guts of the
language, this book contains lots of useful information.</p>
<p>Unlike <em><a href="../book/index.html">The Rust Programming Language</a></em>, we will be assuming considerable
prior knowledge. In particular, you should be comfortable with basic systems
programming and Rust. If you don't feel comfortable with these topics, you
should consider <a href="../book/index.html">reading The Book</a> first. That said, we won't assume you
have read it, and we will take care to occasionally give a refresher on the
basics where appropriate. You can skip straight to this book if you want;
just know that we won't be explaining everything from the ground up.</p>
<p>We're going to dig into exception-safety, pointer aliasing, memory models,
compiler and hardware implementation details, and even some type-theory.
Much text will be devoted to exotic corner cases that no one <em>should</em> ever have
to care about, but suddenly become important because we wrote <code>unsafe</code>.</p>
<p>We will also be spending a lot of time talking about the different kinds of
safety and guarantees that programs could care about.</p>
<a class="header" href="print.html#meet-safe-and-unsafe" id="meet-safe-and-unsafe"><h1>Meet Safe and Unsafe</h1></a>
<p><img src="img/safeandunsafe.svg" alt="safe and unsafe" /></p>
<p>It would be great to not have to worry about low-level implementation details.
Who could possibly care how much space the empty tuple occupies? Sadly, it
sometimes matters and we need to worry about it. The most common reason
developers start to care about implementation details is performance, but more
importantly, these details can become a matter of correctness when interfacing
directly with hardware, operating systems, or other languages.</p>
<p>When implementation details start to matter in a safe programming language,
programmers usually have three options:</p>
<ul>
<li>fiddle with the code to encourage the compiler/runtime to perform an optimization</li>
<li>adopt a more unidiomatic or cumbersome design to get the desired implementation</li>
<li>rewrite the implementation in a language that lets you deal with those details</li>
</ul>
<p>For that last option, the language programmers tend to use is <em>C</em>. This is often
necessary to interface with systems that only declare a C interface.</p>
<p>Unfortunately, C is incredibly unsafe to use (sometimes for good reason),
and this unsafety is magnified when trying to interoperate with another
language. Care must be taken to ensure C and the other language agree on
what's happening, and that they don't step on each other's toes.</p>
<p>So what does this have to do with Rust?</p>
<p>Well, unlike C, Rust is a safe programming language.</p>
<p>But, like C, Rust is an unsafe programming language.</p>
<p>More accurately, Rust <em>contains</em> both a safe and unsafe programming language.</p>
<p>Rust can be thought of as a combination of two programming languages: <em>Safe
Rust</em> and <em>Unsafe Rust</em>. Conveniently, these names mean exactly what they say:
Safe Rust is Safe. Unsafe Rust is, well, not. In fact, Unsafe Rust lets us
do some <em>really</em> unsafe things. Things the Rust authors will implore you not to
do, but we'll do anyway.</p>
<p>Safe Rust is the <em>true</em> Rust programming language. If all you do is write Safe
Rust, you will never have to worry about type-safety or memory-safety. You will
never endure a dangling pointer, a use-after-free, or any other kind of
Undefined Behavior.</p>
<p>The standard library also gives you enough utilities out of the box that you'll
be able to write high-performance applications and libraries in pure idiomatic
Safe Rust.</p>
<p>But maybe you want to talk to another language. Maybe you're writing a
low-level abstraction not exposed by the standard library. Maybe you're
<em>writing</em> the standard library (which is written entirely in Rust). Maybe you
need to do something the type-system doesn't understand and just <em>frob some dang
bits</em>. Maybe you need Unsafe Rust.</p>
<p>Unsafe Rust is exactly like Safe Rust with all the same rules and semantics.
It just lets you do some <em>extra</em> things that are Definitely Not Safe
(which we will define in the next section).</p>
<p>The value of this separation is that we gain the benefits of using an unsafe
language like C — low level control over implementation details — without most
of the problems that come with trying to integrate it with a completely
different safe language.</p>
<p>There are still some problems — most notably, we must become aware of properties
that the type system assumes and audit them in any code that interacts with
Unsafe Rust. That's the purpose of this book: to teach you about these assumptions
and how to manage them.</p>
<a class="header" href="print.html#how-safe-and-unsafe-interact" id="how-safe-and-unsafe-interact"><h1>How Safe and Unsafe Interact</h1></a>
<p>What's the relationship between Safe Rust and Unsafe Rust? How do they
interact?</p>
<p>The separation between Safe Rust and Unsafe Rust is controlled with the
<code>unsafe</code> keyword, which acts as an interface from one to the other. This is
why we can say Safe Rust is a safe language: all the unsafe parts are kept
exclusively behind the <code>unsafe</code> boundary. If you wish, you can even toss
<code>#![forbid(unsafe_code)]</code> into your code base to statically guarantee that
you're only writing Safe Rust.</p>
<p>The <code>unsafe</code> keyword has two uses: to declare the existence of contracts the
compiler can't check, and to declare that a programmer has checked that these
contracts have been upheld.</p>
<p>You can use <code>unsafe</code> to indicate the existence of unchecked contracts on
<em>functions</em> and <em>trait declarations</em>. On functions, <code>unsafe</code> means that
users of the function must check that function's documentation to ensure
they are using it in a way that maintains the contracts the function
requires. On trait declarations, <code>unsafe</code> means that implementors of the
trait must check the trait documentation to ensure their implementation
maintains the contracts the trait requires.</p>
<p>You can use <code>unsafe</code> on a block to declare that all unsafe actions performed
within are verified to uphold the contracts of those operations. For instance,
this asserts that the index passed to <code>slice::get_unchecked</code> is in bounds.</p>
<p>You can use <code>unsafe</code> on a trait implementation to declare that the implementation
upholds the trait's contract. For instance, that a type implementing <code>Send</code> is
really safe to move to another thread.</p>
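<p>As a minimal sketch of these four positions of the keyword, consider the
following. The names <code>Zeroable</code>, <code>first_unchecked</code>, and <code>first_or_zero</code> are
hypothetical and exist only for this illustration; they are not part of the
standard library.</p>
<pre><code class="language-rust">/// Unsafe trait: implementors promise that the all-zero bit pattern is a
/// valid value of the type. The compiler cannot check this promise.
unsafe trait Zeroable {}

// Unsafe impl: we declare that we have checked the contract for `u8`.
unsafe impl Zeroable for u8 {}

/// Unsafe function: the caller must guarantee that `slice` is non-empty.
unsafe fn first_unchecked(slice: &amp;[u8]) -&gt; u8 {
    *slice.get_unchecked(0)
}

/// Safe wrapper: the emptiness check upholds the contract, and the
/// `unsafe` block records that we verified it.
fn first_or_zero(slice: &amp;[u8]) -&gt; u8 {
    if slice.is_empty() {
        0
    } else {
        unsafe { first_unchecked(slice) }
    }
}
</code></pre>
<p>Callers of <code>first_or_zero</code> never need to write <code>unsafe</code> themselves, because
the wrapper discharges the obligation that <code>first_unchecked</code> declares.</p>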
<p>The standard library has a number of unsafe functions, including:</p>
<ul>
<li><code>slice::get_unchecked</code>, which performs unchecked indexing, allowing
memory safety to be freely violated.</li>
<li><code>mem::transmute</code> reinterprets some value as having a given type, bypassing
type safety in arbitrary ways (see <a href="conversions.html">conversions</a> for details).</li>
<li>Every raw pointer to a sized type has an <code>offset</code> method that
invokes Undefined Behavior if the passed offset is not <a href="../std/primitive.pointer.html#method.offset">&quot;in bounds&quot;</a>.</li>
<li>All FFI (Foreign Function Interface) functions are <code>unsafe</code> to call because the
other language can do arbitrary operations that the Rust compiler can't check.</li>
</ul>
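<p>The first three items in the list above can be sketched as follows. The
helper names are hypothetical, and each <code>unsafe</code> block carries a comment
stating the contract it is assumed to uphold.</p>
<pre><code class="language-rust">use std::mem;

/// Returns the last element without a bounds check.
fn last_unchecked(values: &amp;[u32]) -&gt; Option&lt;u32&gt; {
    if values.is_empty() {
        return None;
    }
    // Contract: the index is in bounds because `len() &gt;= 1` here.
    Some(unsafe { *values.get_unchecked(values.len() - 1) })
}

/// Reinterprets the bits of an `f32` as a `u32`.
fn float_bits(x: f32) -&gt; u32 {
    // Contract: both types are 4 bytes wide and every bit pattern is a
    // valid `u32`.
    unsafe { mem::transmute::&lt;f32, u32&gt;(x) }
}

/// Reads the element `n` places after the start of the slice through a
/// raw pointer.
fn nth_via_offset(values: &amp;[u32], n: usize) -&gt; Option&lt;u32&gt; {
    if n &gt;= values.len() {
        return None;
    }
    let base = values.as_ptr();
    // Contract: `base.offset(n)` stays inside the slice's allocation.
    Some(unsafe { *base.offset(n as isize) })
}
</code></pre>
<p>In each case the safe code surrounding the <code>unsafe</code> block is what makes the
call sound. For the float example, <code>f32::to_bits</code> is the safe way to get the
same result.</p>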
<p>As of Rust 1.0 there are exactly two unsafe traits:</p>
<ul>
<li><code>Send</code> is a marker trait (a trait with no API) that promises implementors are
safe to send (move) to another thread.</li>
<li><code>Sync</code> is a marker trait that promises threads can safely share implementors
through a shared reference.</li>
</ul>
<p>Much of the Rust standard library also uses Unsafe Rust internally. These
implementations have generally been rigorously manually checked, so the Safe Rust
interfaces built on top of these implementations can be assumed to be safe.</p>
<p>The need for all of this separation boils down to a single fundamental property
of Safe Rust:</p>
<p><strong>No matter what, Safe Rust can't cause Undefined Behavior.</strong></p>
<p>The design of the safe/unsafe split means that there is an asymmetric trust
relationship between Safe and Unsafe Rust. Safe Rust inherently has to
trust that any Unsafe Rust it touches has been written correctly.
On the other hand, Unsafe Rust has to be very careful about trusting Safe Rust.</p>
<p>As an example, Rust has the <code>PartialOrd</code> and <code>Ord</code> traits to differentiate
between types which can &quot;just&quot; be compared, and those that provide a &quot;total&quot;
ordering (which basically means that comparison behaves reasonably).</p>
<p><code>BTreeMap</code> doesn't really make sense for partially-ordered types, and so it
requires that its keys implement <code>Ord</code>. However, <code>BTreeMap</code> has Unsafe Rust code
inside of its implementation. Because it would be unacceptable for a sloppy <code>Ord</code>
implementation (which is Safe to write) to cause Undefined Behavior, the Unsafe
code in <code>BTreeMap</code> must be written to be robust against <code>Ord</code> implementations which
aren't actually total — even though that's the whole point of requiring <code>Ord</code>.</p>
<p>The Unsafe Rust code just can't trust the Safe Rust code to be written correctly.
That said, <code>BTreeMap</code> will still behave completely erratically if you feed in
values that don't have a total ordering. It just won't ever cause Undefined
Behavior.</p>
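<p>To make this concrete, here is a small sketch of a deliberately broken <code>Ord</code>
implementation written entirely in Safe Rust. The type <code>Sloppy</code> and the
function <code>sloppy_demo</code> are hypothetical names for this example. What the map
does with such keys is not something to rely on; the only point is that the
program contains no <code>unsafe</code> and stays memory-safe.</p>
<pre><code class="language-rust">use std::cmp::Ordering;
use std::collections::BTreeMap;

/// A key whose ordering is deliberately broken: every comparison reports
/// `Less`, so the "order" is neither total nor consistent with `Eq`.
#[derive(PartialEq, Eq)]
struct Sloppy(u32);

impl PartialOrd for Sloppy {
    fn partial_cmp(&amp;self, other: &amp;Sloppy) -&gt; Option&lt;Ordering&gt; {
        Some(self.cmp(other))
    }
}

impl Ord for Sloppy {
    fn cmp(&amp;self, _other: &amp;Sloppy) -&gt; Ordering {
        Ordering::Less
    }
}

/// Inserts three keys, then looks one of them up again.
/// Returns the map's length and whether the lookup succeeded.
fn sloppy_demo() -&gt; (usize, bool) {
    let mut map = BTreeMap::new();
    for i in 0..3 {
        map.insert(Sloppy(i), i);
    }
    let found = map.get(&amp;Sloppy(1)).is_some();
    (map.len(), found)
}
</code></pre>
<p>A lookup for a key that was just inserted may well fail here, since no
comparison ever reports <code>Equal</code>. That is erratic behavior, but it is not
Undefined Behavior.</p>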
<p>One may wonder, if <code>BTreeMap</code> cannot trust <code>Ord</code> because it's Safe, why can it
trust <em>any</em> Safe code? For instance <code>BTreeMap</code> relies on integers and slices to
be implemented correctly. Those are safe too, right?</p>
<p>The difference is one of scope. When <code>BTreeMap</code> relies on integers and slices,
it's relying on one very specific implementation. This is a measured risk that
can be weighed against the benefit. In this case there's basically zero risk;
if integers and slices are broken, <em>everyone</em> is broken. Also, they're maintained
by the same people who maintain <code>BTreeMap</code>, so it's easy to keep tabs on them.</p>
<p>On the other hand, <code>BTreeMap</code>'s key type is generic. Trusting its <code>Ord</code> implementation
means trusting every <code>Ord</code> implementation in the past, present, and future.
Here the risk is high: someone somewhere is going to make a mistake and mess up
their <code>Ord</code> implementation, or even just straight up lie about providing a total
ordering because &quot;it seems to work&quot;. When that happens, <code>BTreeMap</code> needs to be
prepared.</p>
<p>The same logic applies to trusting a closure that's passed to you to behave
correctly.</p>
<p>This problem of unbounded generic trust is the problem that <code>unsafe</code> traits
exist to resolve. The <code>BTreeMap</code> type could theoretically require that keys
implement a new trait called <code>UnsafeOrd</code>, rather than <code>Ord</code>, that might look
like this:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
use std::cmp::Ordering;

unsafe trait UnsafeOrd {
    fn cmp(&amp;self, other: &amp;Self) -&gt; Ordering;
}
#}</code></pre></pre>
<p>Then, a type would use <code>unsafe</code> to implement <code>UnsafeOrd</code>, indicating that
they've ensured their implementation maintains whatever contracts the
trait expects. In this situation, the Unsafe Rust in the internals of
<code>BTreeMap</code> would be justified in trusting that the key type's <code>UnsafeOrd</code>
implementation is correct. If it isn't, it's the fault of the unsafe trait
implementation, which is consistent with Rust's safety guarantees.</p>
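<p>A sketch of what such an implementation might look like follows. The trait
is repeated so the example is self-contained, and <code>max_of</code> is a hypothetical
helper standing in for code that relies on the contract.</p>
<pre><code class="language-rust">use std::cmp::Ordering;

unsafe trait UnsafeOrd {
    fn cmp(&amp;self, other: &amp;Self) -&gt; Ordering;
}

// By writing `unsafe impl`, we take responsibility for the contract.
// Here we simply delegate to the total order that `u32` already has.
unsafe impl UnsafeOrd for u32 {
    fn cmp(&amp;self, other: &amp;Self) -&gt; Ordering {
        Ord::cmp(self, other)
    }
}

/// Hypothetical helper: returns the larger of two values, trusting the
/// `UnsafeOrd` contract instead of defending against a bad comparison.
fn max_of&lt;T: UnsafeOrd&gt;(a: T, b: T) -&gt; T {
    match UnsafeOrd::cmp(&amp;a, &amp;b) {
        Ordering::Less =&gt; b,
        _ =&gt; a,
    }
}
</code></pre>
<p>Note that the <code>unsafe</code> keyword appears on the <code>impl</code>, not on the method: the
burden of proof sits with whoever implements the trait.</p>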
<p>The decision of whether to mark a trait <code>unsafe</code> is an API design choice.
Rust has traditionally avoided doing this because it makes Unsafe
Rust pervasive, which isn't desirable. <code>Send</code> and <code>Sync</code> are marked unsafe
because thread safety is a <em>fundamental property</em> that unsafe code can't
possibly hope to defend against in the way it could defend against a bad
<code>Ord</code> implementation. The decision of whether to mark your own traits <code>unsafe</code>
depends on the same sort of consideration. If <code>unsafe</code> code can't reasonably
expect to defend against a bad implementation of the trait, then marking the
trait <code>unsafe</code> is a reasonable choice.</p>
<p>As an aside, while <code>Send</code> and <code>Sync</code> are <code>unsafe</code> traits, they are <em>also</em>
automatically implemented for types when such derivations are provably safe
to do. <code>Send</code> is automatically derived for all types composed only of values
whose types also implement <code>Send</code>. <code>Sync</code> is automatically derived for all
types composed only of values whose types also implement <code>Sync</code>. This minimizes
the pervasive unsafety of making these two traits <code>unsafe</code>.</p>
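<p>The automatic derivation can be observed with a small compile-time check.
This is only a sketch: <code>assert_send</code>, <code>assert_sync</code>, <code>Plain</code>, and <code>Handle</code> are
hypothetical names, and the <code>unsafe impl</code> below is assumed to be justified for
the sake of the example.</p>
<pre><code class="language-rust">/// Compiles only if `T` is `Send`; does nothing at run time.
fn assert_send&lt;T: Send&gt;() {}

/// Compiles only if `T` is `Sync`; does nothing at run time.
fn assert_sync&lt;T: Sync&gt;() {}

/// Composed only of `Send + Sync` fields, so both traits are derived.
struct Plain {
    id: u64,
    name: String,
}

/// A raw pointer is neither `Send` nor `Sync`, so nothing is derived here.
struct Handle {
    ptr: *mut u8,
}

// Assumed promise: `Handle` is only ever used in a way that makes moving it
// to another thread sound. The compiler cannot check this for us.
unsafe impl Send for Handle {}

fn check() {
    assert_send::&lt;Plain&gt;();
    assert_sync::&lt;Plain&gt;();
    assert_send::&lt;Handle&gt;();
    // assert_sync::&lt;Handle&gt;();          // would not compile: no `Sync` impl
    // assert_send::&lt;std::rc::Rc&lt;u8&gt;&gt;(); // would not compile: `Rc` is not `Send`
}
</code></pre>
<p>Only <code>Handle</code> needed an <code>unsafe impl</code>; <code>Plain</code> got both traits for free.</p>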
<p>This is the balance between Safe and Unsafe Rust. The separation is designed to
make using Safe Rust as ergonomic as possible, but requires extra effort and
care when writing Unsafe Rust. The rest of this book is largely a discussion
of the sort of care that must be taken, and what contracts Unsafe Rust must uphold.</p>
<a class="header" href="print.html#what-unsafe-rust-can-do" id="what-unsafe-rust-can-do"><h1>What Unsafe Rust Can Do</h1></a>
<p>The only things that are different in Unsafe Rust are that you can:</p>
<ul>
<li>Dereference raw pointers</li>
<li>Call <code>unsafe</code> functions (including C functions, compiler intrinsics, and the raw allocator)</li>
<li>Implement <code>unsafe</code> traits</li>
<li>Mutate statics</li>
</ul>
<p>That's it. The reason these operations are relegated to Unsafe is that misusing
any of these things will cause the ever dreaded Undefined Behavior. Invoking
Undefined Behavior gives the compiler full rights to do arbitrarily bad things
to your program. You definitely <em>should not</em> invoke Undefined Behavior.</p>
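<p>Two of these extra abilities, dereferencing a raw pointer and mutating a
static, can be sketched as below. <code>COUNTER</code>, <code>bump</code>, and <code>raw_roundtrip</code> are
hypothetical names, and the single-threaded contract on <code>bump</code> is an
assumption of this example.</p>
<pre><code class="language-rust">static mut COUNTER: u32 = 0;

/// Increments a global counter and returns the new value.
///
/// Contract (an assumption of this sketch): the caller guarantees that no
/// other thread touches `COUNTER` at the same time.
unsafe fn bump() -&gt; u32 {
    unsafe {
        COUNTER = COUNTER + 1;
        COUNTER
    }
}

/// Writes through a raw pointer and reads the result back.
fn raw_roundtrip() -&gt; i32 {
    let mut value = 10;
    // Creating a raw pointer is safe...
    let p = &amp;mut value as *mut i32;
    // ...but dereferencing it requires `unsafe`. The pointer is valid
    // here because `value` is still alive and not otherwise borrowed.
    unsafe {
        *p += 1;
        *p
    }
}
</code></pre>
<p>Calling <code>bump</code> is itself an <code>unsafe</code> operation, so the obligation to avoid a
data race is passed on to the caller instead of being hidden.</p>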
<p>Unlike in C, Undefined Behavior is pretty limited in scope in Rust. All the core
language cares about is preventing the following things:</p>
<ul>
<li>Dereferencing null, dangling, or unaligned pointers</li>
<li>Reading <a href="uninitialized.html">uninitialized memory</a></li>
<li>Breaking the <a href="references.html">pointer aliasing rules</a></li>
<li>Producing invalid primitive values:
<ul>
<li>dangling/null references</li>
<li>a <code>bool</code> that isn't 0 or 1</li>
<li>an undefined <code>enum</code> discriminant</li>
<li>a <code>char</code> outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]</li>
<li>a non-UTF-8 <code>str</code></li>
</ul>
</li>
<li>Unwinding into another language</li>
<li>Causing a <a href="races.html">data race</a></li>
</ul>
<p>That's it. That's all the causes of Undefined Behavior baked into Rust. Of
course, unsafe functions and traits are free to declare arbitrary other
constraints that a program must maintain to avoid Undefined Behavior. For
instance, the allocator APIs declare that deallocating unallocated memory is
Undefined Behavior.</p>
<p>However, violations of these constraints generally will just transitively lead to one of
the above problems. Some additional constraints may also derive from compiler
intrinsics that make special assumptions about how code can be optimized. For instance,
<code>Vec</code> and <code>Box</code> make use of intrinsics that require their pointers to be non-null at all times.</p>
<p>Rust is otherwise quite permissive with respect to other dubious operations.
Rust considers it &quot;safe&quot; to:</p>
<ul>
<li>Deadlock</li>
<li>Have a <a href="races.html">race condition</a></li>
<li>Leak memory</li>
<li>Fail to call destructors</li>
<li>Overflow integers</li>
<li>Abort the program</li>
<li>Delete the production database</li>
</ul>
<p>However, any program that actually manages to do such a thing is <em>probably</em>
incorrect. Rust provides lots of tools to make these things rare, but
these problems are considered impractical to categorically prevent.</p>
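<p>A few of the items in that list can be demonstrated without a single
<code>unsafe</code> block. The following is a sketch with hypothetical names (<code>Noisy</code>,
<code>drops_observed</code>, <code>leak_vec</code>, <code>wrap_around</code>).</p>
<pre><code class="language-rust">use std::cell::Cell;
use std::mem;

/// Counts how many times its destructor has run.
struct Noisy&lt;'a&gt;(&amp;'a Cell&lt;u32&gt;);

impl&lt;'a&gt; Drop for Noisy&lt;'a&gt; {
    fn drop(&amp;mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Creates two values but lets only one destructor run.
fn drops_observed() -&gt; u32 {
    let count = Cell::new(0);
    drop(Noisy(&amp;count));        // destructor runs
    mem::forget(Noisy(&amp;count)); // destructor is skipped, in safe code
    count.get()
}

/// Leaks a heap allocation: the `Vec`'s buffer is never freed.
fn leak_vec() {
    let data = vec![1u8, 2, 3];
    mem::forget(data);
}

/// Overflows an integer using explicitly wrapping arithmetic.
fn wrap_around() -&gt; u8 {
    255u8.wrapping_add(1)
}
</code></pre>
<p>None of this is Undefined Behavior, which is why unsafe code must not rely
on destructors running for its own soundness.</p>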
<a class="header" href="print.html#working-with-unsafe" id="working-with-unsafe"><h1>Working with Unsafe</h1></a>
<p>Rust generally only gives us the tools to talk about Unsafe Rust in a scoped and
binary manner. Unfortunately, reality is significantly more complicated than
that. For instance, consider the following toy function:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn index(idx: usize, arr: &amp;[u8]) -&gt; Option&lt;u8&gt; {
    if idx &lt; arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
#}</code></pre></pre>
<p>This function is safe and correct. We check that the index is in bounds, and if it
is, index into the array in an unchecked manner. But even in such a trivial
function, the scope of the unsafe block is questionable. Consider changing the
<code>&lt;</code> to a <code>&lt;=</code>:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn index(idx: usize, arr: &amp;[u8]) -&gt; Option&lt;u8&gt; {
    if idx &lt;= arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
#}</code></pre></pre>
<p>This program is now unsound, and yet <em>we only modified safe code</em>. This is the
fundamental problem of safety: it's non-local. The soundness of our unsafe
operations necessarily depends on the state established by otherwise
&quot;safe&quot; operations.</p>
<p>Safety is modular in the sense that opting into unsafety doesn't require you
to consider arbitrary other kinds of badness. For instance, doing an unchecked
index into a slice doesn't mean you suddenly need to worry about the slice being
null or containing uninitialized memory. Nothing fundamentally changes. However
safety <em>isn't</em> modular in the sense that programs are inherently stateful and
your unsafe operations may depend on arbitrary other state.</p>
<p>This non-locality gets much worse when we incorporate actual persistent state.
Consider a simple implementation of <code>Vec</code>:</p>
<pre><pre class="playpen"><code class="language-rust">use std::ptr;

// Note: This definition is naive. See the chapter on implementing Vec.
pub struct Vec&lt;T&gt; {
    ptr: *mut T,
    len: usize,
    cap: usize,
}

// Note this implementation does not correctly handle zero-sized types.
// See the chapter on implementing Vec.
impl&lt;T&gt; Vec&lt;T&gt; {
    pub fn push(&amp;mut self, elem: T) {
        if self.len == self.cap {
            // not important for this example
            self.reallocate();
        }
        unsafe {
            ptr::write(self.ptr.offset(self.len as isize), elem);
            self.len += 1;
        }
    }
    # fn reallocate(&amp;mut self) { }
}

# fn main() {}
</code></pre></pre>
<p>This code is simple enough to reasonably audit and informally verify. Now consider
adding the following method:</p>
<pre><code class="language-rust ignore">fn make_room(&amp;mut self) {
    // grow the capacity
    self.cap += 1;
}
</code></pre>
<p>This code is 100% Safe Rust but it is also completely unsound. Changing the
capacity violates the invariants of Vec (that <code>cap</code> reflects the allocated space
in the Vec). This is not something the rest of Vec can guard against. It <em>has</em>
to trust the capacity field because there's no way to verify it.</p>
<p>Because it relies on invariants of a struct field, this <code>unsafe</code> code
does more than pollute a whole function: it pollutes a whole <em>module</em>.
Generally, the only bullet-proof way to limit the scope of unsafe code is at the
module boundary with privacy.</p>
<p>However this works <em>perfectly</em>. The existence of <code>make_room</code> is <em>not</em> a
problem for the soundness of Vec because we didn't mark it as public. Only the
module that defines this function can call it. Also, <code>make_room</code> directly
accesses the private fields of Vec, so it can only be written in the same module
as Vec.</p>
<p>It is therefore possible for us to write a completely safe abstraction that
relies on complex invariants. This is <em>critical</em> to the relationship between
Safe Rust and Unsafe Rust.</p>
<p>We have already seen that Unsafe code must trust <em>some</em> Safe code, but shouldn't
trust <em>generic</em> Safe code. Privacy is important to unsafe code for similar reasons:
it saves us from having to trust all the safe code in the universe not to mess
with our trusted state.</p>
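<p>As a minimal sketch of this idea, a private field can carry an invariant that the
<code>unsafe</code> code inside the module is then entitled to trust. The module <code>bounded</code>, the
type <code>Idx4</code>, and the function <code>lookup</code> are hypothetical names made up for illustration:</p>
<pre><code class="language-rust">mod bounded {
    /// An index that is known to be in bounds for any `[u8; 4]`.
    /// The field is private, so code outside this module cannot forge one.
    pub struct Idx4(usize);

    impl Idx4 {
        /// The only way to obtain an `Idx4`; the invariant is checked here.
        pub fn new(i: usize) -&gt; Option&lt;Idx4&gt; {
            if i &lt; 4 { Some(Idx4(i)) } else { None }
        }

        /// Relies on the invariant `self.0 &lt; 4` established by `new`.
        pub fn get(&amp;self, arr: &amp;[u8; 4]) -&gt; u8 {
            unsafe { *arr.get_unchecked(self.0) }
        }
    }
}

/// Safe code outside the module can only go through the checked constructor.
pub fn lookup(arr: &amp;[u8; 4], i: usize) -&gt; Option&lt;u8&gt; {
    bounded::Idx4::new(i).map(|idx| idx.get(arr))
}
</code></pre>
<p>Code outside <code>bounded</code> has no way to build an <code>Idx4</code> holding an out-of-range value, so
auditing that one module should be enough to justify the unchecked access.</p>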
<p>Safety lives!</p>
<a class="header" href="print.html#data-representation-in-rust" id="data-representation-in-rust"><h1>Data Representation in Rust</h1></a>
<p>Low-level programming cares a lot about data layout. It's a big deal. It also
pervasively influences the rest of the language, so we're going to start by
digging into how data is represented in Rust.</p>
<a class="header" href="print.html#reprrust" id="reprrust"><h1>repr(Rust)</h1></a>
<p>First and foremost, all types have an alignment specified in bytes. The
alignment of a type specifies what addresses are valid to store the value at. A
value of alignment <code>n</code> must only be stored at an address that is a multiple of
<code>n</code>. So alignment 2 means you must be stored at an even address, and 1 means
that you can be stored anywhere. Alignment is at least 1, and always a power
of 2. Most primitives are generally aligned to their size, although this is
platform-specific behavior. In particular, on x86 <code>u64</code> and <code>f64</code> may be only
aligned to 32 bits.</p>
<p>A type's size must always be a multiple of its alignment. This ensures that an
array of that type may always be indexed by offsetting by a multiple of its
size. Note that the size and alignment of a type may not be known
statically in the case of <a href="exotic-sizes.html#dynamically-sized-types-dsts">dynamically sized types</a>.</p>
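<p>These rules can be checked for statically sized types with <code>std::mem::size_of</code> and
<code>std::mem::align_of</code>. The following is only a sketch; the helper names
<code>layout_rules_hold</code> and <code>check_some_types</code> are made up for this example:</p>
<pre><code class="language-rust">use std::mem::{align_of, size_of};

/// Checks the rules stated above for a given type `T`.
fn layout_rules_hold&lt;T&gt;() -&gt; bool {
    let size = size_of::&lt;T&gt;();
    let align = align_of::&lt;T&gt;();
    // Alignment is a power of two (so at least 1), and size is a multiple of it.
    align.is_power_of_two() &amp;&amp; size % align == 0
}

/// Applies the check to a few primitive and composite types.
fn check_some_types() -&gt; bool {
    layout_rules_hold::&lt;u8&gt;()
        &amp;&amp; layout_rules_hold::&lt;u64&gt;()
        &amp;&amp; layout_rules_hold::&lt;f64&gt;()
        &amp;&amp; layout_rules_hold::&lt;(u8, u32, u16)&gt;()
        &amp;&amp; layout_rules_hold::&lt;[u16; 3]&gt;()
}
</code></pre>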
<p>Rust gives you the following ways to lay out composite data:</p>
<ul>
<li>structs (named product types)</li>
<li>tuples (anonymous product types)</li>
<li>arrays (homogeneous product types)</li>
<li>enums (named sum types -- tagged unions)</li>
</ul>
<p>An enum is said to be <em>field-less</em> if none of its variants have associated data.</p>
<p>Composite structures will have an alignment equal to the maximum
of their fields' alignment. Rust will consequently insert padding where
necessary to ensure that all fields are properly aligned and that the overall
type's size is a multiple of its alignment. For instance:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct A {
    a: u8,
    b: u32,
    c: u16,
}
#}</code></pre></pre>
<p>will be 32-bit aligned on an architecture that aligns these primitives to their
respective sizes. The whole struct will therefore have a size that is a multiple
of 32 bits. It will potentially become:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct A {
    a: u8,
    _pad1: [u8; 3], // to align `b`
    b: u32,
    c: u16,
    _pad2: [u8; 2], // to make overall size multiple of 4
}
#}</code></pre></pre>
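<p>To observe this padding directly, one option is to pin the field order with
<code>repr(C)</code> (covered later in this chapter) and measure where each field ends up. This
is a sketch that assumes a target where <code>u32</code> is 4-byte aligned and <code>u16</code> is 2-byte
aligned; the helpers <code>field_offsets</code> and <code>size_and_align</code> are hypothetical:</p>
<pre><code class="language-rust">use std::mem::{align_of, size_of};

// `repr(C)` keeps the fields in declaration order, which makes the padding
// predictable. With the default representation Rust may choose another layout.
#[repr(C)]
#[derive(Default)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}

/// Byte offsets of `a`, `b`, and `c` from the start of an `A`.
fn field_offsets() -&gt; (usize, usize, usize) {
    let v = A::default();
    let base = &amp;v as *const A as usize;
    (
        &amp;v.a as *const u8 as usize - base,
        &amp;v.b as *const u32 as usize - base,
        &amp;v.c as *const u16 as usize - base,
    )
}

/// Total size and alignment of `A`, in bytes.
fn size_and_align() -&gt; (usize, usize) {
    (size_of::&lt;A&gt;(), align_of::&lt;A&gt;())
}
</code></pre>
<p>Under the stated assumptions the offsets come out as 0, 4, and 8, with a total size of
12 bytes and an alignment of 4, matching the padded struct shown above.</p>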
<p>There is <em>no indirection</em> for these types; all data is stored within the struct,
as you would expect in C. However with the exception of arrays (which are
densely packed and in-order), the layout of data is not by default specified in
Rust. Given the two following struct definitions:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct A {
    a: i32,
    b: u64,
}

struct B {
    a: i32,
    b: u64,
}
#}</code></pre></pre>
<p>Rust <em>does</em> guarantee that two instances of A have their data laid out in
exactly the same way. However Rust <em>does not</em> currently guarantee that an
instance of A has the same field ordering or padding as an instance of B, though
in practice there's no reason why they wouldn't.</p>
<p>With A and B as written, this point would seem to be pedantic, but several other
features of Rust make it desirable for the language to play with data layout in
complex ways.</p>
<p>For instance, consider this struct:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Foo&lt;T, U&gt; {
    count: u16,
    data1: T,
    data2: U,
}
#}</code></pre></pre>
<p>Now consider the monomorphizations of <code>Foo&lt;u32, u16&gt;</code> and <code>Foo&lt;u16, u32&gt;</code>. If
Rust lays out the fields in the order specified, we expect it to pad the
values in the struct to satisfy their alignment requirements. So if Rust
didn't reorder fields, we would expect it to produce the following:</p>
<pre><code class="language-rust ignore">struct Foo&lt;u16, u32&gt; {
    count: u16,
    data1: u16,
    data2: u32,
}

struct Foo&lt;u32, u16&gt; {
    count: u16,
    _pad1: u16,
    data1: u32,
    data2: u16,
    _pad2: u16,
}
</code></pre>
<p>The latter case quite simply wastes space. An optimal use of space therefore
requires different monomorphizations to have <em>different field orderings</em>.</p>
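<p>The cost of a fixed field order can be measured. In the sketch below, <code>FooC</code> is a
hypothetical copy of <code>Foo</code> that uses <code>repr(C)</code> to force declaration order, so it mirrors
the two layouts written out above. The sizes for plain <code>Foo</code> are whatever the compiler
chooses, and are deliberately not assumed here:</p>
<pre><code class="language-rust">use std::mem::size_of;

// Declaration order is kept: this mirrors the "no reordering" layouts above.
#[repr(C)]
#[allow(dead_code)]
struct FooC&lt;T, U&gt; {
    count: u16,
    data1: T,
    data2: U,
}

// Default representation: Rust is free to reorder the fields.
#[allow(dead_code)]
struct Foo&lt;T, U&gt; {
    count: u16,
    data1: T,
    data2: U,
}

/// Sizes of the two monomorphizations when the field order is fixed.
fn fixed_order_sizes() -&gt; (usize, usize) {
    (size_of::&lt;FooC&lt;u16, u32&gt;&gt;(), size_of::&lt;FooC&lt;u32, u16&gt;&gt;())
}

/// Sizes of the same monomorphizations with the default representation.
fn default_sizes() -&gt; (usize, usize) {
    (size_of::&lt;Foo&lt;u16, u32&gt;&gt;(), size_of::&lt;Foo&lt;u32, u16&gt;&gt;())
}
</code></pre>
<p>On a target that aligns <code>u16</code> to 2 bytes and <code>u32</code> to 4 bytes, <code>fixed_order_sizes</code>
reports 8 and 12 bytes. The fields themselves only need 8 bytes, which is the lower
bound for <code>default_sizes</code>.</p>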
<p>Enums make this consideration even more complicated. Naively, an enum such as:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}
#}</code></pre></pre>
<p>would be laid out as:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct FooRepr {
    data: u64, // this is either a u64, u32, or u8 based on `tag`
    tag: u8,   // 0 = A, 1 = B, 2 = C
}
#}</code></pre></pre>
<p>And indeed this is approximately how it would be laid out in general (modulo the
size and position of <code>tag</code>).</p>
<p>However there are several cases where such a representation is inefficient. The
classic case of this is Rust's &quot;null pointer optimization&quot;: an enum consisting
of a single outer unit variant (e.g. <code>None</code>) and a (potentially nested)
non-nullable pointer variant (e.g. <code>&amp;T</code>) makes the tag unnecessary, because a null
pointer value can safely be interpreted to mean that the unit variant is chosen
instead. The net result is that, for example, <code>size_of::&lt;Option&lt;&amp;T&gt;&gt;() == size_of::&lt;&amp;T&gt;()</code>.</p>
<p>There are many types in Rust that are, or contain, non-nullable pointers such as
<code>Box&lt;T&gt;</code>, <code>Vec&lt;T&gt;</code>, <code>String</code>, <code>&amp;T</code>, and <code>&amp;mut T</code>. Similarly, one can imagine
nested enums pooling their tags into a single discriminant, as they are by
definition known to have a limited range of valid values. In principle enums could
use fairly elaborate algorithms to cache bits throughout nested types with
special constrained representations. As such it is <em>especially</em> desirable that
we leave enum layout unspecified today.</p>
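<p>The null pointer optimization is easy to observe with <code>size_of</code>. In this small sketch the
helper <code>option_is_free</code> is a made-up name. Only references and <code>Box</code> are checked, since
the layout of the other types mentioned above is less firmly specified:</p>
<pre><code class="language-rust">use std::mem::size_of;

/// True if wrapping `T` in `Option` costs no extra space.
fn option_is_free&lt;T&gt;() -&gt; bool {
    size_of::&lt;Option&lt;T&gt;&gt;() == size_of::&lt;T&gt;()
}

/// Pointer-like types can use null to encode `None`; a plain integer cannot.
fn examples() -&gt; (bool, bool, bool) {
    (
        option_is_free::&lt;&amp;u32&gt;(),
        option_is_free::&lt;Box&lt;u32&gt;&gt;(),
        option_is_free::&lt;u32&gt;(),
    )
}
</code></pre>
<p>Every bit pattern of a <code>u32</code> is a valid value, so <code>Option&lt;u32&gt;</code> has to store its tag
somewhere else and ends up larger than <code>u32</code>.</p>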
<a class="header" href="print.html#exotically-sized-types" id="exotically-sized-types"><h1>Exotically Sized Types</h1></a>
<p>Most of the time, we think in terms of types with a fixed, positive size. This
is not always the case, however.</p>
<a class="header" href="print.html#dynamically-sized-types-dsts" id="dynamically-sized-types-dsts"><h1>Dynamically Sized Types (DSTs)</h1></a>
<p>Rust in fact supports Dynamically Sized Types (DSTs): types without a statically
known size or alignment. On the surface, this is a bit nonsensical: Rust <em>must</em>
know the size and alignment of something in order to correctly work with it! In
this regard, DSTs are not normal types. Due to their lack of a statically known
size, these types can only exist behind some kind of pointer. Any pointer to a
DST consequently becomes a <em>fat</em> pointer consisting of the pointer and the
information that &quot;completes&quot; them (more on this below).</p>
<p>There are two major DSTs exposed by the language: trait objects, and slices.</p>
<p>A trait object represents some type that implements the traits it specifies.
The exact original type is <em>erased</em> in favor of runtime reflection
with a vtable containing all the information necessary to use the type.
This is the information that completes a trait object: a pointer to its vtable.</p>
<p>A slice is simply a view into some contiguous storage -- typically an array or
<code>Vec</code>. The information that completes a slice is just the number of elements
it points to.</p>
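<p>The extra word carried by a fat pointer shows up in its size. This sketch measures
references in units of <code>usize</code>; the helper names are hypothetical, and the two-word size of
fat pointers is how current compilers behave rather than something this text promises:</p>
<pre><code class="language-rust">use std::fmt::Debug;
use std::mem::size_of;

/// Size of a reference to `T`, measured in machine words (`usize`).
fn pointer_words&lt;T: ?Sized&gt;() -&gt; usize {
    size_of::&lt;&amp;T&gt;() / size_of::&lt;usize&gt;()
}

fn pointer_sizes() -&gt; [usize; 4] {
    [
        pointer_words::&lt;u8&gt;(),        // thin pointer
        pointer_words::&lt;[u8]&gt;(),      // pointer + number of elements
        pointer_words::&lt;str&gt;(),       // pointer + length in bytes
        pointer_words::&lt;dyn Debug&gt;(), // pointer + pointer to vtable
    ]
}
</code></pre>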
<p>Structs can actually store a single DST directly as their last field, but this
makes them a DST as well:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
// Can't be stored on the stack directly
struct Foo {
    info: u32,
    data: [u8],
}
#}</code></pre></pre>
<a class="header" href="print.html#zero-sized-types-zsts" id="zero-sized-types-zsts"><h1>Zero Sized Types (ZSTs)</h1></a>
<p>Rust actually allows types to be specified that occupy no space:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Foo; // No fields = no size

// All fields have no size = no size
struct Baz {
    foo: Foo,
    qux: (),      // empty tuple has no size
    baz: [u8; 0], // empty array has no size
}
#}</code></pre></pre>
<p>On their own, Zero Sized Types (ZSTs) are, for obvious reasons, pretty useless.
However as with many curious layout choices in Rust, their potential is realized
in a generic context: Rust largely understands that any operation that produces
or stores a ZST can be reduced to a no-op. First off, storing it doesn't even
make sense -- it doesn't occupy any space. Also there's only one value of that
type, so anything that loads it can just produce it from the aether -- which is
also a no-op since it doesn't occupy any space.</p>
<p>One of the most extreme examples of this is Sets and Maps. Given a
<code>Map&lt;Key, Value&gt;</code>, it is common to implement a <code>Set&lt;Key&gt;</code> as just a thin wrapper
around <code>Map&lt;Key, UselessJunk&gt;</code>. In many languages, this would necessitate
allocating space for UselessJunk and doing work to store and load UselessJunk
only to discard it. Proving this unnecessary would be a difficult analysis for
the compiler.</p>
<p>However in Rust, we can just say that <code>Set&lt;Key&gt; = Map&lt;Key, ()&gt;</code>. Now Rust
statically knows that every load and store is useless, and no allocation has any
size. The result is that the monomorphized code is basically a custom
implementation of a HashSet with none of the overhead that HashMap would have to
support values.</p>
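<p>Here is a rough sketch of that wrapper. <code>Set</code> and <code>entry_overhead</code> are hypothetical names
for this example, and the claim that a <code>(K, ())</code> pair is no bigger than a bare <code>K</code> describes
what current compilers do, not a layout guarantee:</p>
<pre><code class="language-rust">use std::collections::HashMap;
use std::hash::Hash;
use std::mem::size_of;

/// A set implemented as a thin wrapper around a map whose values are `()`.
pub struct Set&lt;K&gt; {
    map: HashMap&lt;K, ()&gt;,
}

impl&lt;K: Hash + Eq&gt; Set&lt;K&gt; {
    pub fn new() -&gt; Self {
        Set { map: HashMap::new() }
    }

    /// Returns `true` if the key was not already present.
    pub fn insert(&amp;mut self, key: K) -&gt; bool {
        // Storing `()` is a no-op; only the key takes up space.
        self.map.insert(key, ()).is_none()
    }

    pub fn contains(&amp;self, key: &amp;K) -&gt; bool {
        self.map.contains_key(key)
    }
}

/// Extra bytes needed to store a `()` next to a `K`.
fn entry_overhead&lt;K&gt;() -&gt; usize {
    size_of::&lt;(K, ())&gt;() - size_of::&lt;K&gt;()
}
</code></pre>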
<p>Safe code need not worry about ZSTs, but <em>unsafe</em> code must be careful about the
consequence of types with no size. In particular, pointer offsets are no-ops,
and standard allocators (including jemalloc, the one used by default in Rust)
may return <code>nullptr</code> when a zero-sized allocation is requested, which is
indistinguishable from out of memory.</p>
<a class="header" href="print.html#empty-types" id="empty-types"><h1>Empty Types</h1></a>
<p>Rust also enables types to be declared that <em>cannot even be instantiated</em>. These
types can only be talked about at the type level, and never at the value level.
Empty types can be declared by specifying an enum with no variants:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
enum Void {} // No variants = EMPTY
#}</code></pre></pre>
<p>Empty types are even more marginal than ZSTs. The primary motivating example for
Void types is type-level unreachability. For instance, suppose an API needs to
return a Result in general, but a specific case actually is infallible. It's
actually possible to communicate this at the type level by returning a
<code>Result&lt;T, Void&gt;</code>. Consumers of the API can confidently unwrap such a Result
knowing that it's <em>statically impossible</em> for this value to be an <code>Err</code>, as
this would require providing a value of type <code>Void</code>.</p>
<p>In principle, Rust can do some interesting analyses and optimizations based
on this fact. For instance, <code>Result&lt;T, Void&gt;</code> could be represented as just <code>T</code>,
because the <code>Err</code> case doesn't actually exist. The following <em>could</em> also
compile:</p>
<pre><code class="language-rust ignore">enum Void {}

let res: Result&lt;u32, Void&gt; = Ok(0);

// Err doesn't exist anymore, so Ok is actually irrefutable.
let Ok(num) = res;
</code></pre>
<p>But neither of these tricks work today, so all Void types get you is
the ability to be confident that certain situations are statically impossible.</p>
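<p>One thing that does work is matching on a value of the empty type with an empty <code>match</code>,
which lets us unwrap such a Result without any panicking path. In this sketch,
<code>parse_infallible</code> and <code>into_ok</code> are hypothetical functions invented for illustration:</p>
<pre><code class="language-rust">/// An empty type: it has no variants, so no value of it can exist.
pub enum Void {}

/// Returns a `Result` for uniformity with fallible APIs, but can never fail.
pub fn parse_infallible(input: &amp;str) -&gt; Result&lt;usize, Void&gt; {
    Ok(input.len())
}

/// Extracts the `Ok` value without any possibility of panicking.
pub fn into_ok&lt;T&gt;(res: Result&lt;T, Void&gt;) -&gt; T {
    match res {
        Ok(value) =&gt; value,
        // `err` has no possible values, so an empty match covers every case.
        Err(err) =&gt; match err {},
    }
}
</code></pre>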
<p>One final subtle detail about empty types is that raw pointers to them are
actually valid to construct, but dereferencing them is Undefined Behavior
because that doesn't actually make sense. That is, you could model C's <code>void *</code>
type with <code>*const Void</code>, but this doesn't necessarily gain anything over using
e.g. <code>*const ()</code>, which <em>is</em> safe to randomly dereference.</p>
<a class="header" href="print.html#alternative-representations" id="alternative-representations"><h1>Alternative representations</h1></a>
<p>Rust allows you to specify alternative data layout strategies from the default.</p>
<a class="header" href="print.html#reprc" id="reprc"><h1>repr(C)</h1></a>
<p>This is the most important <code>repr</code>. It has fairly simple intent: do what C does.
The order, size, and alignment of fields is exactly what you would expect from C
or C++. Any type you expect to pass through an FFI boundary should have
<code>repr(C)</code>, as C is the lingua-franca of the programming world. This is also
necessary to soundly do more elaborate tricks with data layout such as
reinterpreting values as a different type.</p>
<p>However, the interaction with Rust's more exotic data layout features must be
kept in mind. Due to its dual purpose as &quot;for FFI&quot; and &quot;for layout control&quot;,
<code>repr(C)</code> can be applied to types that will be nonsensical or problematic if
passed through the FFI boundary.</p>
<ul>
<li>
<p>ZSTs are still zero-sized, even though this is not a standard behavior in
C, and is explicitly contrary to the behavior of an empty type in C++, which
still consumes a byte of space.</p>
</li>
<li>
<p>DST pointers (fat pointers), tuples, and enums with fields are not a concept
in C, and as such are never FFI-safe.</p>
</li>
<li>
<p>If <code>T</code> is an <a href="ffi.html#the-nullable-pointer-optimization">FFI-safe non-nullable pointer
type</a>,
<code>Option&lt;T&gt;</code> is guaranteed to have the same layout and ABI as <code>T</code> and is
therefore also FFI-safe. As of this writing, this covers <code>&amp;</code>, <code>&amp;mut</code>,
and function pointers, all of which can never be null.</p>
</li>
<li>
<p>Tuple structs are like structs with regards to <code>repr(C)</code>, as the only
difference from a struct is that the fields aren’t named.</p>
</li>
<li>
<p>This is equivalent to one of <code>repr(u*)</code> (see the next section) for enums. The
chosen size is the default enum size for the target platform's C application
binary interface (ABI). Note that enum representation in C is implementation
defined, so this is really a &quot;best guess&quot;. In particular, this may be incorrect
when the C code of interest is compiled with certain flags.</p>
</li>
<li>
<p>Field-less enums with <code>repr(C)</code> or <code>repr(u*)</code> still may not be set to an
integer value without a corresponding variant, even though this is
permitted behavior in C or C++. It is undefined behavior to (unsafely)
construct an instance of an enum that does not match one of its
variants. (This allows exhaustive matches to continue to be written and
compiled as normal.)</p>
</li>
</ul>
<a class="header" href="print.html#repru-repri" id="repru-repri"><h1>repr(u*), repr(i*)</h1></a>
<p>These specify the size to make a field-less enum. If the discriminant overflows
the integer it has to fit in, it will produce a compile-time error. You can
manually ask Rust to allow this by setting the overflowing element to explicitly
be 0. However Rust will not allow you to create an enum where two variants have
the same discriminant.</p>
<p>The term &quot;field-less enum&quot; only means that the enum doesn't have data in any
of its variants. A field-less enum without a <code>repr(u*)</code> or <code>repr(C)</code> is
still a Rust native type, and does not have a stable ABI representation.
Adding a <code>repr</code> causes it to be treated exactly like the specified
integer size for ABI purposes.</p>
<p>Any enum with fields is a Rust type with no guaranteed ABI (even if the
only data is <code>PhantomData</code> or something else with zero size).</p>
<p>Adding an explicit <code>repr</code> to an enum suppresses the null-pointer
optimization.</p>
<p>These reprs have no effect on a struct.</p>
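<p>As a small sketch of how this is typically used, here is a field-less enum with
<code>repr(u8)</code> together with a checked conversion from a raw byte. <code>Color</code>,
<code>color_from_u8</code>, and <code>color_size</code> are made-up names for this example:</p>
<pre><code class="language-rust">use std::mem::size_of;

#[repr(u8)]
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Checked conversion from a raw byte. Building a `Color` from an arbitrary
/// integer (for example with `transmute`) would be Undefined Behavior.
pub fn color_from_u8(raw: u8) -&gt; Option&lt;Color&gt; {
    match raw {
        0 =&gt; Some(Color::Red),
        1 =&gt; Some(Color::Green),
        2 =&gt; Some(Color::Blue),
        _ =&gt; None,
    }
}

/// With `repr(u8)` the enum occupies exactly one byte.
pub fn color_size() -&gt; usize {
    size_of::&lt;Color&gt;()
}
</code></pre>
<p>Going the other way is always fine: <code>Color::Blue as u8</code> yields the discriminant.</p>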
<a class="header" href="print.html#reprpacked" id="reprpacked"><h1>repr(packed)</h1></a>
<p><code>repr(packed)</code> forces Rust to strip any padding, and only align the type to a
byte. This may improve the memory footprint, but will likely have other negative
side-effects.</p>
<p>In particular, most architectures <em>strongly</em> prefer values to be aligned. This
may mean that unaligned loads are penalized (x86), or even fault (some ARM
chips). For simple cases like directly loading or storing a packed field, the
compiler might be able to paper over alignment issues with shifts and masks.
However if you take a reference to a packed field, it's unlikely that the
compiler will be able to emit code to avoid an unaligned load.</p>
<p><strong><a href="https://github.com/rust-lang/rust/issues/27060">As of Rust 1.0 this can cause undefined behavior.</a></strong></p>
<p><code>repr(packed)</code> is not to be used lightly. Unless you have extreme requirements,
this should not be used.</p>
<p>This repr is a modifier on <code>repr(C)</code> and <code>repr(Rust)</code>.</p>
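<p>To make the trade-off concrete, this sketch compares a padded and a packed version of the
same struct. It assumes a target where <code>u32</code> is normally 4-byte aligned, and the names
<code>Padded</code>, <code>Packed</code>, <code>layouts</code>, <code>make_packed</code>, and <code>read_b</code> are hypothetical:</p>
<pre><code class="language-rust">use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
pub struct Padded {
    a: u8,
    b: u32,
}

#[repr(C, packed)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
pub struct Packed {
    a: u8,
    b: u32,
}

/// (size, alignment) of the padded layout, then of the packed layout.
pub fn layouts() -&gt; ((usize, usize), (usize, usize)) {
    (
        (size_of::&lt;Padded&gt;(), align_of::&lt;Padded&gt;()),
        (size_of::&lt;Packed&gt;(), align_of::&lt;Packed&gt;()),
    )
}

pub fn make_packed(a: u8, b: u32) -&gt; Packed {
    Packed { a: a, b: b }
}

/// Reads the packed field by value. Copying the field out lets the compiler
/// emit an unaligned load; no reference to `b` is ever created.
pub fn read_b(p: Packed) -&gt; u32 {
    p.b
}
</code></pre>
<p>Under that assumption the padded struct is 8 bytes with alignment 4, while the packed one
is 5 bytes with alignment 1. Copying <code>b</code> out by value sidesteps the reference problem
described above.</p>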
<a class="header" href="print.html#ownership-and-lifetimes" id="ownership-and-lifetimes"><h1>Ownership and Lifetimes</h1></a>
<p>Ownership is the breakout feature of Rust. It allows Rust to be completely
memory-safe and efficient, while avoiding garbage collection. Before getting
into the ownership system in detail, we will consider the motivation of this
design.</p>
<p>We will assume that you accept that garbage collection (GC) is not always an
optimal solution, and that it is desirable to manually manage memory in some
contexts. If you do not accept this, might I interest you in a different
language?</p>
<p>Regardless of your feelings on GC, it is pretty clearly a <em>massive</em> boon to
making code safe. You never have to worry about things going away <em>too soon</em>
(although whether you still wanted to be pointing at that thing is a different
issue...). This is a pervasive problem that C and C++ programs need to deal
with. Consider this simple mistake that all of us who have used a non-GC'd
language have made at one point:</p>
<pre><code class="language-rust ignore">fn as_str(data: &amp;u32) -&gt; &amp;str {
    // compute the string
    let s = format!(&quot;{}&quot;, data);

    // OH NO! We returned a reference to something that
    // exists only in this function!
    // Dangling pointer! Use after free! Alas!
    // (this does not compile in Rust)
    &amp;s
}
</code></pre>
<p>This is exactly what Rust's ownership system was built to solve.
Rust knows the scope in which the <code>&amp;s</code> lives, and as such can prevent it from
escaping. However this is a simple case that even a C compiler could plausibly
catch. Things get more complicated as code gets bigger and pointers get fed through
various functions. Eventually, a C compiler will fall down and won't be able to
perform sufficient escape analysis to prove your code unsound. It will consequently
be forced to accept your program on the assumption that it is correct.</p>
<p>This will never happen to Rust. It's up to the programmer to prove to the
compiler that everything is sound.</p>
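<p>For this particular function, one way to satisfy the compiler is to hand ownership of the
string to the caller instead of returning a reference. This is a sketch of one possible
fix, with the hypothetical name <code>as_string</code>:</p>
<pre><code class="language-rust">/// Returns an owned `String`, so nothing borrowed from this function's
/// stack frame escapes it.
fn as_string(data: &amp;u32) -&gt; String {
    format!(&quot;{}&quot;, data)
}
</code></pre>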
<p>Of course, Rust's story around ownership is much more complicated than just
verifying that references don't escape the scope of their referent. That's
because ensuring pointers are always valid is much more complicated than this.
For instance in this code,</p>
<pre><code class="language-rust ignore">let mut data = vec![1, 2, 3];
// get an internal reference
let x = &amp;data[0];

// OH NO! `push` causes the backing storage of `data` to be reallocated.
// Dangling pointer! Use after free! Alas!
// (this does not compile in Rust)
data.push(4);

println!(&quot;{}&quot;, x);
</code></pre>
<p>naive scope analysis would be insufficient to prevent this bug, because <code>data</code>
does in fact live as long as we needed. However it was <em>changed</em> while we had
a reference into it. This is why Rust requires any references to freeze the
referent and its owners.</p>
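<p>One way to restructure code like this so that it compiles is to make sure no reference
into the vector is alive across the <code>push</code>. In this sketch (the function name
<code>first_then_push</code> is made up) the element is copied out first:</p>
<pre><code class="language-rust">/// Copies the first element out before mutating the vector, so no
/// reference into `data` is alive across the `push`.
fn first_then_push(data: &amp;mut Vec&lt;i32&gt;) -&gt; i32 {
    let x = data[0]; // `i32` is `Copy`: this is a value, not a reference
    data.push(4);    // may reallocate; nothing points into the old storage
    x
}
</code></pre>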
<a class="header" href="print.html#references" id="references"><h1>References</h1></a>
<p>There are two kinds of reference:</p>
<ul>
<li>Shared reference: <code>&amp;</code></li>
<li>Mutable reference: <code>&amp;mut</code></li>
</ul>
<p>Which obey the following rules:</p>
<ul>
<li>A reference cannot outlive its referent</li>
<li>A mutable reference cannot be aliased</li>
</ul>
<p>That's it. That's the whole model references follow.</p>
<p>Of course, we should probably define what <em>aliased</em> means.</p>
<pre><code class="language-text">error[E0425]: cannot find value `aliased` in this scope
 --&gt; &lt;rust.rs&gt;:2:20
  |
2 |     println!(&quot;{}&quot;, aliased);
  |                    ^^^^^^^ not found in this scope

error: aborting due to previous error
</code></pre>
<p>Unfortunately, Rust hasn't actually defined its aliasing model. 🙀</p>
<p>While we wait for the Rust devs to specify the semantics of their language,
let's use the next section to discuss what aliasing is in general, and why it
matters.</p>
<a class="header" href="print.html#aliasing" id="aliasing"><h1>Aliasing</h1></a>
<p>First off, let's get some important caveats out of the way:</p>
<ul>
<li>
<p>We will be using the broadest possible definition of aliasing for the sake
of discussion. Rust's definition will probably be more restricted to factor
in mutations and liveness.</p>
</li>
<li>
<p>We will be assuming a single-threaded, interrupt-free, execution. We will also
be ignoring things like memory-mapped hardware. Rust assumes these things
don't happen unless you tell it otherwise. For more details, see the
<a href="concurrency.html">Concurrency Chapter</a>.</p>
</li>
</ul>
<p>With that said, here's our working definition: variables and pointers <em>alias</em>
if they refer to overlapping regions of memory.</p>
<a class="header" href="print.html#why-aliasing-matters" id="why-aliasing-matters"><h1>Why Aliasing Matters</h1></a>
<p>So why should we care about aliasing?</p>
<p>Consider this simple function:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn compute(input: &amp;u32, output: &amp;mut u32) {
    if *input &gt; 10 {
        *output = 1;
    }
    if *input &gt; 5 {
        *output *= 2;
    }
}
#}</code></pre></pre>
<p>We would <em>like</em> to be able to optimize it to the following function:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn compute(input: &amp;u32, output: &amp;mut u32) {
    let cached_input = *input; // keep *input in a register
    if cached_input &gt; 10 {
        *output = 2;  // x &gt; 10 implies x &gt; 5, so double and exit immediately
    } else if cached_input &gt; 5 {
        *output *= 2;
    }
}
#}</code></pre></pre>
<p>In Rust, this optimization should be sound. For almost any other language, it
wouldn't be (barring global analysis). This is because the optimization relies
on knowing that aliasing doesn't occur, which most languages are fairly liberal
with. Specifically, we need to worry about function arguments that make <code>input</code>
and <code>output</code> overlap, such as <code>compute(&amp;x, &amp;mut x)</code>.</p>
<p>With that input, we could get this execution:</p>
<pre><code class="language-rust ignore">                    //  input ==  output == 0xabad1dea
                    // *input == *output == 20
if *input &gt; 10 {    // true  (*input == 20)
    *output = 1;    // also overwrites *input, because they are the same
}
if *input &gt; 5 {     // false (*input == 1)
    *output *= 2;
}
                    // *input == *output == 1
</code></pre>
<p>Our optimized function would produce <code>*output == 2</code> for this input, so the
correctness of our optimization relies on this input being impossible.</p>
<p>In Rust we know this input should be impossible because <code>&amp;mut</code> isn't allowed to be
aliased. So we can safely reject its possibility and perform this optimization.
In most other languages, this input would be entirely possible, and must be considered.</p>
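<p>As a sanity check, the two versions can be run side by side on locations that do not
overlap. This is only a sketch: <code>compute_optimized</code> is the hand-optimized function from
above under a different name, and <code>run_both</code> is a hypothetical helper:</p>
<pre><code class="language-rust">fn compute(input: &amp;u32, output: &amp;mut u32) {
    if *input &gt; 10 {
        *output = 1;
    }
    if *input &gt; 5 {
        *output *= 2;
    }
}

fn compute_optimized(input: &amp;u32, output: &amp;mut u32) {
    let cached_input = *input; // keep *input in a register
    if cached_input &gt; 10 {
        *output = 2;
    } else if cached_input &gt; 5 {
        *output *= 2;
    }
}

/// Runs both versions on separate (non-aliasing) locations.
fn run_both(input: u32, initial_output: u32) -&gt; (u32, u32) {
    let mut a = initial_output;
    let mut b = initial_output;
    compute(&amp;input, &amp;mut a);
    compute_optimized(&amp;input, &amp;mut b);
    // Writing `compute(&amp;a, &amp;mut a)` here would be rejected by the borrow
    // checker, which is what rules out the problematic input.
    (a, b)
}
</code></pre>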
<p>This is why alias analysis is important: it lets the compiler perform useful
optimizations! Some examples:</p>
<ul>
<li>keeping values in registers by proving no pointers access the value's memory</li>
<li>eliminating reads by proving some memory hasn't been written to since last we read it</li>
<li>eliminating writes by proving some memory is never read before the next write to it</li>
<li>moving or reordering reads and writes by proving they don't depend on each other</li>
</ul>
<p>These optimizations also tend to prove the soundness of bigger optimizations
such as loop vectorization, constant propagation, and dead code elimination.</p>
<p>In the previous example, we used the fact that <code>&amp;mut u32</code> can't be aliased to prove
that writes to <code>*output</code> can't possibly affect <code>*input</code>. This let us cache <code>*input</code>
in a register, eliminating a read.</p>
<p>By caching this read, we knew that the write in the <code>&gt; 10</code> branch couldn't
affect whether we take the <code>&gt; 5</code> branch, allowing us to also eliminate a
read-modify-write (doubling <code>*output</code>) when <code>*input &gt; 10</code>.</p>
<p>The key thing to remember about alias analysis is that writes are the primary
hazard for optimizations. That is, the only thing that prevents us
from moving a read to any other part of the program is the possibility of us
re-ordering it with a write to the same location.</p>
<p>For instance, we have no concern for aliasing in the following modified version
of our function, because we've moved the only write to <code>*output</code> to the very
end of our function. This allows us to freely reorder the reads of <code>*input</code> that
occur before it:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn compute(input: &amp;u32, output: &amp;mut u32) {
    let mut temp = *output;
    if *input &gt; 10 {
        temp = 1;
    }
    if *input &gt; 5 {
        temp *= 2;
    }
    *output = temp;
}
#}</code></pre></pre>
<p>We're still relying on alias analysis to assume that <code>temp</code> doesn't alias
<code>input</code>, but the proof is much simpler: the value of a local variable can't be
aliased by things that existed before it was declared. This is an assumption
every language freely makes, and so this version of the function could be
optimized the way we want in any language.</p>
<p>This is why the definition of &quot;alias&quot; that Rust will use likely involves some
notion of liveness and mutation: we don't actually care if aliasing occurs if
there aren't any actual writes to memory happening.</p>
<p>Of course, a full aliasing model for Rust must also take into consideration things like
function calls (which may mutate things we don't see), raw pointers (which have
no aliasing requirements on their own), and UnsafeCell (which lets the referent
of an <code>&amp;</code> be mutated).</p>
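<p>To see why these cases matter, here is a sketch of <code>compute</code> rewritten with
<code>std::cell::Cell</code>, a safe wrapper built on UnsafeCell. Shared references to a <code>Cell</code>
are allowed to alias and still permit mutation, so the problematic execution from earlier
can really happen. The function names are hypothetical:</p>
<pre><code class="language-rust">use std::cell::Cell;

/// Same logic as `compute`, but both parameters are shared references to
/// `Cell`s, so they are allowed to alias.
fn compute_cell(input: &amp;Cell&lt;u32&gt;, output: &amp;Cell&lt;u32&gt;) {
    if input.get() &gt; 10 {
        output.set(1);
    }
    if input.get() &gt; 5 {
        output.set(output.get() * 2);
    }
}

/// Passes the same cell as both input and output.
fn aliased_result(start: u32) -&gt; u32 {
    let x = Cell::new(start);
    compute_cell(&amp;x, &amp;x);
    x.get()
}

/// Passes two distinct cells.
fn separate_result(input: u32, output: u32) -&gt; u32 {
    let i = Cell::new(input);
    let o = Cell::new(output);
    compute_cell(&amp;i, &amp;o);
    o.get()
}
</code></pre>
<p>Starting from 20, the aliased call ends with 1 while the non-aliased call ends with 2, so
the cached-input optimization would not be valid for this version of the function.</p>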
<a class="header" href="print.html#lifetimes" id="lifetimes"><h1>Lifetimes</h1></a>
<p>Rust enforces these rules through <em>lifetimes</em>. Lifetimes are effectively
just names for scopes somewhere in the program. Each reference,
and anything that contains a reference, is tagged with a lifetime specifying
the scope it's valid for.</p>
<p>Within a function body, Rust generally doesn't let you explicitly name the
lifetimes involved. This is because it's generally not really necessary
to talk about lifetimes in a local context; Rust has all the information and
can work out everything as optimally as possible. Many anonymous scopes and
temporaries that you would otherwise have to write are often introduced to
make your code Just Work.</p>
<p>However once you cross the function boundary, you need to start talking about
lifetimes. Lifetimes are denoted with an apostrophe: <code>'a</code>, <code>'static</code>. To dip
our toes into lifetimes, we're going to pretend that we're actually allowed
to label scopes with lifetimes, and desugar the examples from the start of
this chapter.</p>
<p>Originally, our examples made use of <em>aggressive</em> sugar -- high fructose corn
syrup even -- around scopes and lifetimes, because writing everything out
explicitly is <em>extremely noisy</em>. All Rust code relies on aggressive inference
and elision of &quot;obvious&quot; things.</p>
<p>One particularly interesting piece of sugar is that each <code>let</code> statement implicitly
introduces a scope. For the most part, this doesn't really matter. However it
does matter for variables that refer to each other. As a simple example, let's
completely desugar this simple piece of Rust code:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
let x = 0;
let y = &amp;x;
let z = &amp;y;
#}</code></pre></pre>
<p>The borrow checker always tries to minimize the extent of a lifetime, so it will
likely desugar to the following:</p>
<pre><code class="language-rust ignore">// NOTE: `'a: {` and `&amp;'b x` are not valid syntax!
'a: {
    let x: i32 = 0;
    'b: {
        // lifetime used is 'b because that's good enough.
        let y: &amp;'b i32 = &amp;'b x;
        'c: {
            // ditto on 'c
            let z: &amp;'c &amp;'b i32 = &amp;'c y;
        }
    }
}
</code></pre>
<p>Wow. That's... awful. Let's all take a moment to thank Rust for making this easier.</p>
<p>Actually passing references to outer scopes will cause Rust to infer
a larger lifetime:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
let x = 0;
let z;
let y = &amp;x;
z = y;
#}</code></pre></pre>
<pre><code class="language-rust ignore">'a: {
    let x: i32 = 0;
    'b: {
        let z: &amp;'b i32;
        'c: {
            // Must use 'b here because this reference is
            // being passed to that scope.
            let y: &amp;'b i32 = &amp;'b x;
            z = y;
        }
    }
}
</code></pre>
<a class="header" href="print.html#example-references-that-outlive-referents" id="example-references-that-outlive-referents"><h1>Example: references that outlive referents</h1></a>
<p>Alright, let's look at some of those examples from before:</p>
<pre><code class="language-rust ignore">fn as_str(data: &amp;u32) -&gt; &amp;str {
    let s = format!(&quot;{}&quot;, data);
    &amp;s
}
</code></pre>
<p>desugars to:</p>
<pre><code class="language-rust ignore">fn as_str&lt;'a&gt;(data: &amp;'a u32) -&gt; &amp;'a str {
    'b: {
        let s = format!(&quot;{}&quot;, data);
        return &amp;'a s;
    }
}
</code></pre>
<p>This signature of <code>as_str</code> takes a reference to a u32 with <em>some</em> lifetime, and
promises that it can produce a reference to a str that can live <em>just as long</em>.
Already we can see why this signature might be trouble. That basically implies
that we're going to find a str somewhere in the scope the reference
to the u32 originated in, or somewhere <em>even earlier</em>. That's a bit of a tall
order.</p>
<p>We then proceed to compute the string <code>s</code>, and return a reference to it. Since
the contract of our function says the reference must outlive <code>'a</code>, that's the
lifetime we infer for the reference. Unfortunately, <code>s</code> was defined in the
scope <code>'b</code>, so the only way this is sound is if <code>'b</code> contains <code>'a</code> -- which is
clearly false since <code>'a</code> must contain the function call itself. We have therefore
created a reference whose lifetime outlives its referent, which is <em>literally</em>
the first thing we said that references can't do. The compiler rightfully blows
up in our face.</p>
<p>To make this more clear, we can expand the example:</p>
<pre><code class="language-rust ignore">fn as_str&lt;'a&gt;(data: &amp;'a u32) -&gt; &amp;'a str {
    'b: {
        let s = format!(&quot;{}&quot;, data);
        return &amp;'a s;
    }
}

fn main() {
    'c: {
        let x: u32 = 0;
        'd: {
            // An anonymous scope is introduced because the borrow does not
            // need to last for the whole scope x is valid for. The return
            // of as_str must find a str somewhere before this function
            // call. Obviously not happening.
            println!(&quot;{}&quot;, as_str::&lt;'d&gt;(&amp;'d x));
        }
    }
}
</code></pre>
<p>Shoot!</p>
<p>Of course, the right way to write this function is as follows:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn to_string(data: &amp;u32) -&gt; String {
    format!(&quot;{}&quot;, data)
}
#}</code></pre></pre>
<p>We must produce an owned value inside the function to return it! The only way
we could have returned an <code>&amp;'a str</code> would have been if it was in a field of the
<code>&amp;'a u32</code>, which is obviously not the case.</p>
<p>(Actually we could have also just returned a string literal, which as a global
can be considered to reside at the bottom of the stack; though this limits
our implementation <em>just a bit</em>.)</p>
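<p>The options described above can be sketched side by side. The <code>Record</code> type and all
function names below are hypothetical, introduced only for illustration: a borrowed
<code>&amp;'a str</code> can come out of a field of the input, a string literal can be returned as
<code>&amp;'static str</code>, and anything computed inside the function has to be returned as an
owned <code>String</code>.</p>

```rust
// Hypothetical type used only for this sketch.
struct Record {
    id: u32,
    name: String,
}

// Fine: the returned str lives inside the referent of `record`,
// so it really can live as long as 'a.
fn name_of<'a>(record: &'a Record) -> &'a str {
    &record.name
}

// Fine: a string literal is `&'static str`, which outlives any input lifetime.
fn describe(data: &u32) -> &'static str {
    if *data == 0 { "zero" } else { "nonzero" }
}

// Fine: a freshly computed string is returned by value (owned).
fn id_string(record: &Record) -> String {
    format!("{}", record.id)
}
```

<p>Note how <code>describe</code> illustrates the limitation mentioned above: it can only ever
hand back one of a fixed set of literals.</p>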
<a class="header" href="print.html#example-aliasing-a-mutable-reference" id="example-aliasing-a-mutable-reference"><h1>Example: aliasing a mutable reference</h1></a>
<p>How about the other example:</p>
<pre><code class="language-rust ignore">let mut data = vec![1, 2, 3];
let x = &amp;data[0];
data.push(4);
println!(&quot;{}&quot;, x);
</code></pre>
<pre><code class="language-rust ignore">'a: {
    let mut data: Vec&lt;i32&gt; = vec![1, 2, 3];
    'b: {
        // 'b is as big as we need this borrow to be
        // (just need to get to `println!`)
        let x: &amp;'b i32 = Index::index::&lt;'b&gt;(&amp;'b data, 0);
        'c: {
            // Temporary scope because we don't need the
            // &amp;mut to last any longer.
            Vec::push(&amp;'c mut data, 4);
        }
        println!(&quot;{}&quot;, x);
    }
}
</code></pre>
<p>The problem here is a bit more subtle and interesting. We want Rust to
reject this program for the following reason: We have a live shared reference <code>x</code>
to a descendant of <code>data</code> when we try to take a mutable reference to <code>data</code>
to <code>push</code>. This would create an aliased mutable reference, which would
violate the <em>second</em> rule of references.</p>
<p>However this is <em>not at all</em> how Rust reasons that this program is bad. Rust
doesn't understand that <code>x</code> is a reference to a subpath of <code>data</code>. It doesn't
understand Vec at all. What it <em>does</em> see is that <code>x</code> has to live for <code>'b</code> to
be printed. The signature of <code>Index::index</code> subsequently demands that the
reference we take to <code>data</code> has to survive for <code>'b</code>. When we try to call <code>push</code>,
it then sees us try to make an <code>&amp;'c mut data</code>. Rust knows that <code>'c</code> is contained
within <code>'b</code>, and rejects our program because the <code>&amp;'b data</code> must still be live!</p>
<p>Here we see that the lifetime system is much more coarse than the reference
semantics we're actually interested in preserving. For the most part, <em>that's
totally ok</em>, because it keeps us from spending all day explaining our program
to the compiler. However it does mean that several programs that are totally
correct with respect to Rust's <em>true</em> semantics are rejected because lifetimes
are too dumb.</p>
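<p>One way to restructure such a program so that it is accepted, sketched here under the
assumption that a copy of the element is good enough, is to make sure no shared borrow of
<code>data</code> is still live when <code>push</code> is called. The helper name <code>first_then_push</code> is
hypothetical.</p>

```rust
// Hypothetical helper: copies the first element out before pushing,
// so no shared borrow of `data` is live across the mutable borrow.
fn first_then_push(data: &mut Vec<i32>) -> i32 {
    let x = data[0]; // i32 is Copy: the value is read and the borrow ends here
    data.push(4);
    x
}
```

<p>The trade-off is that <code>x</code> is now an independent value rather than a view into the vector.</p>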
<a class="header" href="print.html#limits-of-lifetimes" id="limits-of-lifetimes"><h1>Limits of Lifetimes</h1></a>
<p>Given the following code:</p>
<pre><code class="language-rust ignore">struct Foo;

impl Foo {
    fn mutate_and_share(&amp;mut self) -&gt; &amp;Self { &amp;*self }
    fn share(&amp;self) {}
}

fn main() {
    let mut foo = Foo;
    let loan = foo.mutate_and_share();
    foo.share();
}
</code></pre>
<p>One might expect it to compile. We call <code>mutate_and_share</code>, which mutably borrows
<code>foo</code> temporarily, but then returns only a shared reference. Therefore we
would expect <code>foo.share()</code> to succeed as <code>foo</code> shouldn't be mutably borrowed.</p>
<p>However when we try to compile it:</p>
<pre><code class="language-text">&lt;anon&gt;:11:5: 11:8 error: cannot borrow `foo` as immutable because it is also borrowed as mutable
&lt;anon&gt;:11     foo.share();
              ^~~
&lt;anon&gt;:10:16: 10:19 note: previous borrow of `foo` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `foo` until the borrow ends
&lt;anon&gt;:10     let loan = foo.mutate_and_share();
                         ^~~
&lt;anon&gt;:12:2: 12:2 note: previous borrow ends here
&lt;anon&gt;:8 fn main() {
&lt;anon&gt;:9     let mut foo = Foo;
&lt;anon&gt;:10     let loan = foo.mutate_and_share();
&lt;anon&gt;:11     foo.share();
&lt;anon&gt;:12 }
          ^
</code></pre>
<p>What happened? Well, we got the exact same reasoning as we did for
<a href="lifetimes.html#example-aliasing-a-mutable-reference">Example 2 in the previous section</a>. We desugar the program and we get
the following:</p>
<pre><code class="language-rust ignore">struct Foo;

impl Foo {
    fn mutate_and_share&lt;'a&gt;(&amp;'a mut self) -&gt; &amp;'a Self { &amp;'a *self }
    fn share&lt;'a&gt;(&amp;'a self) {}
}

fn main() {
    'b: {
        let mut foo: Foo = Foo;
        'c: {
            let loan: &amp;'c Foo = Foo::mutate_and_share::&lt;'c&gt;(&amp;'c mut foo);
            'd: {
                Foo::share::&lt;'d&gt;(&amp;'d foo);
            }
        }
    }
}
</code></pre>
<p>The lifetime system is forced to extend the <code>&amp;mut foo</code> to have lifetime <code>'c</code>,
due to the lifetime of <code>loan</code> and the signature of <code>mutate_and_share</code>. Then when we
try to call <code>share</code>, it sees we're trying to alias that <code>&amp;'c mut foo</code> and
blows up in our face!</p>
<p>This program is clearly correct according to the reference semantics we actually
care about, but the lifetime system is too coarse-grained to handle that.</p>
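<p>A common way around this kind of rejection, sketched here with a hypothetical <code>count</code>
field and hypothetical method names, is to keep the mutation and the shared loan in
separate methods, so that the returned reference is tied to a shared borrow rather than
to the mutable one.</p>

```rust
struct Foo {
    count: u32, // hypothetical field so the mutation is observable
}

impl Foo {
    // The mutable borrow lasts only for the duration of this call.
    fn mutate(&mut self) {
        self.count += 1;
    }

    // The returned reference is tied to a *shared* borrow of self.
    fn share(&self) -> &Self {
        self
    }

    fn peek(&self) -> u32 {
        self.count
    }
}

fn demo() -> (u32, u32) {
    let mut foo = Foo { count: 0 };
    foo.mutate();
    let loan = foo.share();
    let seen = foo.peek(); // a second shared borrow is fine while `loan` is live
    (loan.count, seen)
}
```

<p>This is a restructuring of the API, not a change to how lifetimes are checked.</p>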
<p>TODO: other common problems? SEME regions stuff, mostly?</p>
<a class="header" href="print.html#lifetime-elision" id="lifetime-elision"><h1>Lifetime Elision</h1></a>
<p>In order to make common patterns more ergonomic, Rust allows lifetimes to be
<em>elided</em> in function signatures.</p>
<p>A <em>lifetime position</em> is anywhere you can write a lifetime in a type:</p>
<pre><code class="language-rust ignore">&amp;'a T
&amp;'a mut T
T&lt;'a&gt;
</code></pre>
<p>Lifetime positions can appear as either &quot;input&quot; or &quot;output&quot;:</p>
<ul>
<li>
<p>For <code>fn</code> definitions, input refers to the types of the formal arguments
in the <code>fn</code> definition, while output refers to
result types. So <code>fn foo(s: &amp;str) -&gt; (&amp;str, &amp;str)</code> has elided one lifetime in
input position and two lifetimes in output position.
Note that the input positions of a <code>fn</code> method definition do not
include the lifetimes that occur in the method's <code>impl</code> header
(nor lifetimes that occur in the trait header, for a default method).</p>
</li>
<li>
<p>In the future, it should be possible to elide <code>impl</code> headers in the same manner.</p>
</li>
</ul>
<p>Elision rules are as follows:</p>
<ul>
<li>
<p>Each elided lifetime in input position becomes a distinct lifetime
parameter.</p>
</li>
<li>
<p>If there is exactly one input lifetime position (elided or not), that lifetime
is assigned to <em>all</em> elided output lifetimes.</p>
</li>
<li>
<p>If there are multiple input lifetime positions, but one of them is <code>&amp;self</code> or
<code>&amp;mut self</code>, the lifetime of <code>self</code> is assigned to <em>all</em> elided output lifetimes.</p>
</li>
<li>
<p>Otherwise, it is an error to elide an output lifetime.</p>
</li>
</ul>
<p>Examples:</p>
<pre><code class="language-rust ignore">fn print(s: &amp;str);                                      // elided
fn print&lt;'a&gt;(s: &amp;'a str);                               // expanded

fn debug(lvl: usize, s: &amp;str);                          // elided
fn debug&lt;'a&gt;(lvl: usize, s: &amp;'a str);                   // expanded

fn substr(s: &amp;str, until: usize) -&gt; &amp;str;               // elided
fn substr&lt;'a&gt;(s: &amp;'a str, until: usize) -&gt; &amp;'a str;     // expanded

fn get_str() -&gt; &amp;str;                                   // ILLEGAL

fn frob(s: &amp;str, t: &amp;str) -&gt; &amp;str;                      // ILLEGAL

fn get_mut(&amp;mut self) -&gt; &amp;mut T;                        // elided
fn get_mut&lt;'a&gt;(&amp;'a mut self) -&gt; &amp;'a mut T;              // expanded

fn args&lt;T: ToCStr&gt;(&amp;mut self, args: &amp;[T]) -&gt; &amp;mut Command;                  // elided
fn args&lt;'a, 'b, T: ToCStr&gt;(&amp;'a mut self, args: &amp;'b [T]) -&gt; &amp;'a mut Command; // expanded

fn new(buf: &amp;mut [u8]) -&gt; BufWriter;                    // elided
fn new&lt;'a&gt;(buf: &amp;'a mut [u8]) -&gt; BufWriter&lt;'a&gt;;         // expanded

</code></pre>
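<p>The signatures above have no bodies, so here is a small compilable sketch of two of the
rules. The function names and the <code>Buffer</code> type are hypothetical. The first pair shows
that the elided and expanded forms are interchangeable; the method shows the <code>&amp;self</code>
rule picking the lifetime of <code>self</code> even though there is a second input reference.</p>

```rust
// Elided and expanded forms of the same signature.
fn substr_elided(s: &str, until: usize) -> &str {
    &s[..until]
}

fn substr_expanded<'a>(s: &'a str, until: usize) -> &'a str {
    &s[..until]
}

// Hypothetical type for the `&self` rule.
struct Buffer {
    bytes: Vec<u8>,
}

impl Buffer {
    // Two input lifetimes, but one is `&self`, so the output gets
    // the lifetime of `self`, not of `_label`.
    fn first(&self, _label: &str) -> &u8 {
        &self.bytes[0]
    }
}
```

<p>Because the result of <code>first</code> borrows only from <code>self</code>, it may be kept after the
<code>_label</code> argument has gone out of scope.</p>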
<a class="header" href="print.html#unbounded-lifetimes" id="unbounded-lifetimes"><h1>Unbounded Lifetimes</h1></a>
<p>Unsafe code can often end up producing references or lifetimes out of thin air.
Such lifetimes come into the world as <em>unbounded</em>. The most common source of this
is dereferencing a raw pointer, which produces a reference with an unbounded lifetime.
Such a lifetime becomes as big as context demands. This is in fact more powerful
than simply becoming <code>'static</code>, because for instance <code>&amp;'static &amp;'a T</code>
will fail to typecheck, but the unbounded lifetime will perfectly mold into
<code>&amp;'a &amp;'a T</code> as needed. However for most intents and purposes, such an unbounded
lifetime can be regarded as <code>'static</code>.</p>
<p>Almost no reference is <code>'static</code>, so this is probably wrong. <code>transmute</code> and
<code>transmute_copy</code> are the two other primary offenders. One should endeavor to
bound an unbounded lifetime as quickly as possible, especially across function
boundaries.</p>
<p>Given a function, any output lifetimes that don't derive from inputs are
unbounded. For instance:</p>
<pre><code class="language-rust ignore">fn get_str&lt;'a&gt;() -&gt; &amp;'a str;
</code></pre>
<p>will produce an <code>&amp;str</code> with an unbounded lifetime. The easiest way to avoid
unbounded lifetimes is to use lifetime elision at the function boundary.
If an output lifetime is elided, then it <em>must</em> be bounded by an input lifetime.
Of course it might be bounded by the <em>wrong</em> lifetime, but this will usually
just cause a compiler error, rather than allow memory safety to be trivially
violated.</p>
<p>Within a function, bounding lifetimes is more error-prone. The safest and easiest
way to bound a lifetime is to return it from a function with a bounded lifetime.
However if this is unacceptable, the reference can be placed in a location with
a specific lifetime. Unfortunately it's impossible to name all lifetimes involved
in a function.</p>
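<p>As a rough sketch of the difference, the two hypothetical functions below both
dereference a raw pointer. In the first, the output lifetime does not derive from any
input, so it is unbounded. In the second, elision ties the output to the slice argument,
which bounds it right at the function boundary.</p>

```rust
// Unbounded: 'a appears only in the output, so the caller may pick
// any lifetime at all, including 'static. Nothing ties it to `ptr`.
unsafe fn deref_unbounded<'a>(ptr: *const u8) -> &'a u8 {
    unsafe { &*ptr }
}

// Bounded: the elided output lifetime is tied to `owner`, so the
// result cannot be used after the borrow of the slice ends.
fn get_bounded(owner: &[u8], index: usize) -> &u8 {
    assert!(index < owner.len());
    let ptr = owner.as_ptr();
    // In bounds because of the assertion above.
    unsafe { &*ptr.add(index) }
}
```

<p>Callers of <code>deref_unbounded</code> are responsible for not letting the reference outlive
the pointee; callers of <code>get_bounded</code> get that check from the compiler.</p>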
<a class="header" href="print.html#higher-rank-trait-bounds-hrtbs" id="higher-rank-trait-bounds-hrtbs"><h1>Higher-Rank Trait Bounds (HRTBs)</h1></a>
<p>Rust's <code>Fn</code> traits are a little bit magic. For instance, we can write the
following code:</p>
<pre><pre class="playpen"><code class="language-rust">struct Closure&lt;F&gt; {
    data: (u8, u16),
    func: F,
}

impl&lt;F&gt; Closure&lt;F&gt;
    where F: Fn(&amp;(u8, u16)) -&gt; &amp;u8,
{
    fn call(&amp;self) -&gt; &amp;u8 {
        (self.func)(&amp;self.data)
    }
}

fn do_it(data: &amp;(u8, u16)) -&gt; &amp;u8 { &amp;data.0 }

fn main() {
    let clo = Closure { data: (0, 1), func: do_it };
    println!(&quot;{}&quot;, clo.call());
}
</code></pre></pre>
<p>If we try to naively desugar this code in the same way that we did in the
lifetimes section, we run into some trouble:</p>
<pre><code class="language-rust ignore">struct Closure&lt;F&gt; {
    data: (u8, u16),
    func: F,
}

impl&lt;F&gt; Closure&lt;F&gt;
    // where F: Fn(&amp;'??? (u8, u16)) -&gt; &amp;'??? u8,
{
    fn call&lt;'a&gt;(&amp;'a self) -&gt; &amp;'a u8 {
        (self.func)(&amp;self.data)
    }
}

fn do_it&lt;'b&gt;(data: &amp;'b (u8, u16)) -&gt; &amp;'b u8 { &amp;'b data.0 }

fn main() {
    'x: {
        let clo = Closure { data: (0, 1), func: do_it };
        println!(&quot;{}&quot;, clo.call());
    }
}
</code></pre>
<p>How on earth are we supposed to express the lifetimes on <code>F</code>'s trait bound? We
need to provide some lifetime there, but the lifetime we care about can't be
named until we enter the body of <code>call</code>! Also, that isn't some fixed lifetime;
<code>call</code> works with <em>any</em> lifetime <code>&amp;self</code> happens to have at that point.</p>
<p>This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we
desugar this is as follows:</p>
<pre><code class="language-rust ignore">where for&lt;'a&gt; F: Fn(&amp;'a (u8, u16)) -&gt; &amp;'a u8,
</code></pre>
<p>(Where <code>Fn(a, b, c) -&gt; d</code> is itself just sugar for the unstable <em>real</em> <code>Fn</code>
trait)</p>
<p><code>for&lt;'a&gt;</code> can be read as &quot;for all choices of <code>'a</code>&quot;, and basically produces an
<em>infinite list</em> of trait bounds that F must satisfy. Intense. There aren't many
places outside of the <code>Fn</code> traits where we encounter HRTBs, and even for
those we have a nice magic sugar for the common cases.</p>
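<p>To see the &quot;for all choices&quot; reading in action, here is a small sketch with the bound
written out explicitly. The function <code>call_on_local</code> is hypothetical. It applies the
callback to a tuple that lives only inside its own body, a lifetime the caller has no way
to name, which is exactly why the bound has to hold for every <code>'a</code>.</p>

```rust
fn do_it(data: &(u8, u16)) -> &u8 {
    &data.0
}

// `for<'a>` written out explicitly: F must work for *every* 'a,
// including the lifetime of `local`, which the caller cannot name.
fn call_on_local<F>(func: F) -> u8
where
    F: for<'a> Fn(&'a (u8, u16)) -> &'a u8,
{
    let local = (7, 9);
    *func(&local)
}
```

<p>Calling <code>call_on_local(do_it)</code> type-checks because <code>do_it</code> is itself generic over the
lifetime of its argument.</p>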
<a class="header" href="print.html#subtyping-and-variance" id="subtyping-and-variance"><h1>Subtyping and Variance</h1></a>
<p>Although Rust doesn't have any notion of structural inheritance, it <em>does</em>
include subtyping. In Rust, subtyping derives entirely from lifetimes. Since
lifetimes are scopes, we can partially order them based on the <em>contains</em>
(outlives) relationship. We can even express this as a generic bound.</p>
<p>Subtyping on lifetimes is in terms of that relationship: if <code>'a: 'b</code> (&quot;a contains
b&quot; or &quot;a outlives b&quot;), then <code>'a</code> is a subtype of <code>'b</code>. This is a large source of
confusion, because it seems intuitively backwards to many: the bigger scope is a
<em>subtype</em> of the smaller scope.</p>
<p>This does in fact make sense, though. The intuitive reason for this is that if
you expect an <code>&amp;'a u8</code> (for some concrete <code>'a</code> that you have already chosen),
then it's totally fine for me to hand you an <code>&amp;'static u8</code> even if <code>'static != 'a</code>, in the same way that if you expect an Animal in Java, it's totally fine
for me to hand you a Cat. Cats are just Animals <em>and more</em>, just as <code>'static</code>
is just <code>'a</code> <em>and more</em>.</p>
<p>(Note, the subtyping relationship and typed-ness of lifetimes is a fairly
arbitrary construct that some disagree with. However it simplifies our analysis
to treat lifetimes and types uniformly.)</p>
<p>Higher-ranked lifetimes are also subtypes of every concrete lifetime. This is
because taking an arbitrary lifetime is strictly more general than taking a
specific one.</p>
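<p>Both the generic bound and the &quot;hand you a <code>'static</code>&quot; intuition can be written down
directly. The names below are hypothetical; the sketch only relies on the outlives bound
<code>'long: 'short</code> and on a <code>static</code> item having the <code>'static</code> lifetime.</p>

```rust
static FOREVER: u8 = 42;

// `'long: 'short` says 'long outlives 'short, so a `&'long u8`
// can be used where a `&'short u8` is expected.
fn shorten<'long: 'short, 'short>(r: &'long u8) -> &'short u8 {
    r
}

// A `&'static u8` is accepted where a `&'a u8` is expected.
fn pick<'a>(local: &'a u8, use_static: bool) -> &'a u8 {
    if use_static { &FOREVER } else { local }
}
```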
<a class="header" href="print.html#variance" id="variance"><h1>Variance</h1></a>
<p>Variance is where things get a bit complicated.</p>
<p>Variance is a property that <em>type constructors</em> have with respect to their
arguments. A type constructor in Rust is a generic type with unbound arguments.
For instance <code>Vec</code> is a type constructor that takes a <code>T</code> and returns a
<code>Vec&lt;T&gt;</code>. <code>&amp;</code> and <code>&amp;mut</code> are type constructors that take two inputs: a
lifetime, and a type to point to.</p>
<p>A type constructor's <em>variance</em> is how the subtyping of its inputs affects the
subtyping of its outputs. There are two kinds of variance in Rust:</p>
<ul>
<li>F is <em>variant</em> over <code>T</code> if <code>T</code> being a subtype of <code>U</code> implies
<code>F&lt;T&gt;</code> is a subtype of <code>F&lt;U&gt;</code> (subtyping &quot;passes through&quot;)</li>
<li>F is <em>invariant</em> over <code>T</code> otherwise (no subtyping relation can be derived)</li>
</ul>
<p>(For those of you who are familiar with variance from other languages, what we
refer to as &quot;just&quot; variance is in fact <em>covariance</em>. Rust has <em>contravariance</em>
for functions. The future of contravariance is uncertain and it may be
scrapped. For now, <code>fn(T)</code> is contravariant in <code>T</code>, which is used in matching
methods in trait implementations to the trait definition. Traits don't have
inferred variance, so <code>Fn(T)</code> is invariant in <code>T</code>).</p>
<p>Some important variances:</p>
<ul>
<li><code>&amp;'a T</code> is variant over <code>'a</code> and <code>T</code> (as is <code>*const T</code> by metaphor)</li>
<li><code>&amp;'a mut T</code> is variant over <code>'a</code> but invariant over <code>T</code></li>
<li><code>Fn(T) -&gt; U</code> is invariant over <code>T</code>, but variant over <code>U</code></li>
<li><code>Box</code>, <code>Vec</code>, and all other collections are variant over the types of
their contents</li>
<li><code>UnsafeCell&lt;T&gt;</code>, <code>Cell&lt;T&gt;</code>, <code>RefCell&lt;T&gt;</code>, <code>Mutex&lt;T&gt;</code> and all other
interior mutability types are invariant over T (as is <code>*mut T</code> by metaphor)</li>
</ul>
<p>To understand why these variances are correct and desirable, we will consider
several examples.</p>
<p>We have already covered why <code>&amp;'a T</code> should be variant over <code>'a</code> when
introducing subtyping: it's desirable to be able to pass longer-lived things
where shorter-lived things are needed.</p>
<p>Similar reasoning applies to why it should be variant over T. It is reasonable
to be able to pass <code>&amp;&amp;'static str</code> where an <code>&amp;&amp;'a str</code> is expected. The
additional level of indirection does not change the desire to be able to pass
longer-lived things where shorter-lived things are expected.</p>
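<p>A minimal sketch of this, with hypothetical names: <code>longer</code> forces both arguments to
agree on a single <code>'a</code>, so passing a reference to a <code>&amp;'static str</code> next to a reference
to a shorter-lived <code>&amp;str</code> only works because the first argument can be weakened.</p>

```rust
// Both arguments must agree on a single 'a.
fn longer<'a>(a: &&'a str, b: &&'a str) -> &'a str {
    if a.len() >= b.len() { *a } else { *b }
}

fn demo() -> usize {
    let forever: &'static str = "hello";
    let owned = String::from("hi");
    let short: &str = &owned;
    // `&forever` is a `&&'static str`, accepted as `&&'a str`
    // where 'a is the shorter lifetime of `short`.
    let result = longer(&forever, &short);
    result.len()
}
```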
<p>However this logic doesn't apply to <code>&amp;mut</code>. To see why <code>&amp;mut</code> should
be invariant over T, consider the following code:</p>
<pre><code class="language-rust ignore">fn overwrite&lt;T: Copy&gt;(input: &amp;mut T, new: &amp;mut T) {
    *input = *new;
}

fn main() {
    let mut forever_str: &amp;'static str = &quot;hello&quot;;
    {
        let string = String::from(&quot;world&quot;);
        overwrite(&amp;mut forever_str, &amp;mut &amp;*string);
    }
    // Oops, printing freed memory
    println!(&quot;{}&quot;, forever_str);
}
</code></pre>
<p>The signature of <code>overwrite</code> is clearly valid: it takes mutable references to
two values of the same type, and overwrites one with the other. If <code>&amp;mut T</code> was
variant over T, then <code>&amp;mut &amp;'static str</code> would be a subtype of <code>&amp;mut &amp;'a str</code>,
since <code>&amp;'static str</code> is a subtype of <code>&amp;'a str</code>. Therefore the lifetime of
<code>forever_str</code> would successfully be &quot;shrunk&quot; down to the shorter lifetime of
<code>string</code>, and <code>overwrite</code> would be called successfully. <code>string</code> would
subsequently be dropped, and <code>forever_str</code> would point to freed memory when we
print it! Therefore <code>&amp;mut</code> should be invariant.</p>
<p>This is the general theme of variance vs invariance: if variance would allow you
to store a short-lived value into a longer-lived slot, then you must be
invariant.</p>
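<p>For contrast, here is a sketch of a call to <code>overwrite</code> that is accepted. The
<code>demo</code> wrapper and the <code>other</code> variable are hypothetical. Both arguments have exactly
the same type, so invariance has nothing to object to and no short-lived value can end up
in the long-lived slot.</p>

```rust
fn overwrite<T: Copy>(input: &mut T, new: &mut T) {
    *input = *new;
}

fn demo() -> &'static str {
    let mut forever_str: &'static str = "hello";
    let mut other: &'static str = "world";
    // Both arguments are `&mut &'static str`, so T is exactly
    // `&'static str` and nothing short-lived can be stored.
    overwrite(&mut forever_str, &mut other);
    forever_str
}
```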
<p>However it <em>is</em> sound for <code>&amp;'a mut T</code> to be variant over <code>'a</code>. The key difference
between <code>'a</code> and T is that <code>'a</code> is a property of the reference itself,
while T is something the reference is borrowing. If you change T's type, then
the source still remembers the original type. However if you change the
lifetime's type, no one but the reference knows this information, so it's fine.
Put another way: <code>&amp;'a mut T</code> owns <code>'a</code>, but only <em>borrows</em> T.</p>
<p><code>Box</code> and <code>Vec</code> are interesting cases because they're variant, but you can
definitely store values in them! This is where Rust gets really clever: it's
fine for them to be variant because you can only store values
in them <em>via a mutable reference</em>! The mutable reference makes the whole type
invariant, and therefore prevents you from smuggling a short-lived type into
them.</p>
<p>Being variant allows <code>Box</code> and <code>Vec</code> to be weakened when shared
immutably. So you can pass a <code>&amp;Box&lt;&amp;'static str&gt;</code> where a <code>&amp;Box&lt;&amp;'a str&gt;</code> is
expected.</p>
<p>However what should happen when passing <em>by-value</em> is less obvious. It turns out
that, yes, you can use subtyping when passing by-value. That is, this works:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
fn get_box&lt;'a&gt;(str: &amp;'a str) -&gt; Box&lt;&amp;'a str&gt; {
    // string literals are `&amp;'static str`s
    Box::new(&quot;hello&quot;)
}
#}</code></pre></pre>
<p>Weakening when you pass by-value is fine because there's no one else who
&quot;remembers&quot; the old lifetime in the Box. The reason a variant <code>&amp;mut</code> was
trouble was because there's always someone else who remembers the original
subtype: the actual owner.</p>
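<p>The same two situations can be sketched for <code>Vec</code>. All names here are hypothetical.
<code>total_len</code> takes two shared references that must agree on one <code>'a</code>, and <code>weaken</code>
takes a vector by value and returns it at a shorter lifetime.</p>

```rust
// Shared: both vectors must agree on one 'a.
fn total_len<'a>(a: &Vec<&'a str>, b: &Vec<&'a str>) -> usize {
    a.iter().chain(b.iter()).map(|s| s.len()).sum()
}

// By value: a `Vec<&'static str>` is weakened to `Vec<&'a str>`.
fn weaken<'a>(v: Vec<&'static str>, _witness: &'a str) -> Vec<&'a str> {
    v
}

fn demo() -> (usize, usize) {
    let statics: Vec<&'static str> = vec!["ab", "cd"];
    let owned = String::from("xyz");
    let locals: Vec<&str> = vec![owned.as_str()];
    // `&Vec<&'static str>` is accepted next to a `&Vec<&'a str>`.
    let total = total_len(&statics, &locals);
    let weakened = weaken(statics, &owned);
    (total, weakened.len())
}
```

<p>The <code>_witness</code> parameter exists only to pin <code>'a</code> to something shorter than <code>'static</code>.</p>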
<p>The invariance of the cell types can be seen as follows: <code>&amp;</code> is like an <code>&amp;mut</code>
for a cell, because you can still store values in them through an <code>&amp;</code>. Therefore
cells must be invariant to avoid lifetime smuggling.</p>
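<p>A short sketch of the &quot;<code>&amp;</code> is like an <code>&amp;mut</code> for a cell&quot; observation, using hypothetical
names: <code>store</code> writes through a shared reference. Because <code>Cell</code> is invariant, the
<code>'a</code> of the slot and the <code>'a</code> of the value have to match exactly, so in <code>demo</code> only a
<code>&amp;'static str</code> can be stored.</p>

```rust
use std::cell::Cell;

// Only a shared reference to the cell is needed to store into it.
fn store<'a>(slot: &Cell<&'a str>, value: &'a str) {
    slot.set(value);
}

fn demo() -> &'static str {
    let slot: Cell<&'static str> = Cell::new("hello");
    // 'a is forced to be 'static here; a shorter-lived str would be rejected.
    store(&slot, "world");
    slot.get()
}
```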
<p><code>Fn</code> is the most subtle case because it has mixed variance. To see why
<code>Fn(T) -&gt; U</code> should be invariant over T, consider the following function
signature:</p>
<pre><code class="language-rust ignore">// 'a is derived from some parent scope
fn foo(&amp;'a str) -&gt; usize;
</code></pre>
<p>This signature claims that it can handle any <code>&amp;str</code> that lives at least as
long as <code>'a</code>. Now if this signature was variant over <code>&amp;'a str</code>, that
would mean</p>
<pre><code class="language-rust ignore">fn foo(&amp;'static str) -&gt; usize;
</code></pre>
<p>could be provided in its place, as it would be a subtype. However this function
has a stronger requirement: it says that it can only handle <code>&amp;'static str</code>s,
and nothing else. Giving <code>&amp;'a str</code>s to it would be unsound, as it's free to
assume that what it's given lives forever. Therefore functions are not variant
over their arguments.</p>
<p>To see why <code>Fn(T) -&gt; U</code> should be variant over U, consider the following
function signature:</p>
<pre><code class="language-rust ignore">// 'a is derived from some parent scope
fn foo(usize) -&gt; &amp;'a str;
</code></pre>
<p>This signature claims that it will return something that outlives <code>'a</code>. It is
therefore completely reasonable to provide</p>
<pre><code class="language-rust ignore">fn foo(usize) -&gt; &amp;'static str;
</code></pre>
<p>in its place. Therefore functions are variant over their return type.</p>
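<p>The return-type case can be tried out with plain function pointers. This is a sketch
with hypothetical names: <code>call_with</code> expects a function returning <code>&amp;'a str</code> for some
caller-chosen <code>'a</code>, and is handed one that returns <code>&amp;'static str</code>.</p>

```rust
fn static_name(_: usize) -> &'static str {
    "static"
}

// Expects a function returning `&'a str` for the caller's 'a.
fn call_with<'a>(f: fn(usize) -> &'a str, _witness: &'a str) -> &'a str {
    f(0)
}

fn demo() -> usize {
    let owned = String::from("short-lived");
    let f: fn(usize) -> &'static str = static_name;
    // `fn(usize) -> &'static str` is accepted as `fn(usize) -> &'a str`.
    let name = call_with(f, &owned);
    name.len()
}
```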
<p><code>*const</code> has the exact same semantics as <code>&amp;</code>, so variance follows. <code>*mut</code> on the
other hand can dereference to an <code>&amp;mut</code> whether shared or not, so it is marked
as invariant just like cells.</p>
<p>This is all well and good for the types the standard library provides, but
how is variance determined for types that <em>you</em> define? A struct, informally
speaking, inherits the variance of its fields. If a struct <code>Foo</code>
has a generic argument <code>A</code> that is used in a field <code>a</code>, then Foo's variance
over <code>A</code> is exactly <code>a</code>'s variance. However this is complicated if <code>A</code> is used
in multiple fields.</p>
<ul>
<li>If all uses of A are variant, then Foo is variant over A</li>
<li>Otherwise, Foo is invariant over A</li>
</ul>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
use std::cell::Cell;

struct Foo&lt;'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H&gt; {
    a: &amp;'a A,     // variant over 'a and A
    b: &amp;'b mut B, // variant over 'b and invariant over B
    c: *const C,  // variant over C
    d: *mut D,    // invariant over D
    e: Vec&lt;E&gt;,    // variant over E
    f: Cell&lt;F&gt;,   // invariant over F
    g: G,         // variant over G
    h1: H,        // would also be variant over H except...
    h2: Cell&lt;H&gt;,  // invariant over H, because invariance wins
}
#}</code></pre></pre>
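<p>The inherited variance can be observed on a user-defined type. In this sketch the
<code>Label</code> struct and the function names are hypothetical. The only use of <code>'a</code> is in a
<code>&amp;'a str</code> field, so <code>Label&lt;'a&gt;</code> is variant over <code>'a</code> and a <code>Label&lt;'static&gt;</code> can be
weakened to a shorter lifetime.</p>

```rust
// Hypothetical struct: its only use of 'a is in a `&'a str` field,
// which is variant over 'a, so `Label<'a>` is variant over 'a too.
struct Label<'a> {
    text: &'a str,
}

fn weaken<'a>(label: Label<'static>, _witness: &'a str) -> Label<'a> {
    label
}

fn demo() -> usize {
    let owned = String::from("local");
    let label: Label<'static> = Label { text: "static text" };
    let weakened = weaken(label, &owned);
    weakened.text.len()
}
```

<p>If the field were a <code>Cell&lt;&amp;'a str&gt;</code> instead, <code>weaken</code> would be expected to stop
compiling, since invariance wins.</p>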
<a class="header" href="print.html#drop-check" id="drop-check"><h1>Drop Check</h1></a>
<p>We have seen how lifetimes provide us some fairly simple rules for ensuring
that we never read dangling references. However up to this point we have only ever
interacted with the <em>outlives</em> relationship in an inclusive manner. That is,
when we talked about <code>'a: 'b</code>, it was ok for <code>'a</code> to live <em>exactly</em> as long as
<code>'b</code>. At first glance, this seems to be a meaningless distinction. Nothing ever
gets dropped at the same time as another, right? This is why we used the
following desugaring of <code>let</code> statements:</p>
<pre><code class="language-rust ignore">let x;
let y;
</code></pre>
<pre><code class="language-rust ignore">{
    let x;
    {
        let y;
    }
}
</code></pre>
<p>Each creates its own scope, clearly establishing that one drops before the
other. However, what if we do the following?</p>
<pre><code class="language-rust ignore">let (x, y) = (vec![], vec![]);
</code></pre>
<p>Does either value strictly outlive the other? The answer is in fact <em>no</em>,
neither value strictly outlives the other. Of course, one of x or y will be
dropped before the other, but the actual order is not specified. Tuples aren't
special in this regard; composite structures just don't guarantee their
destruction order as of Rust 1.0.</p>
<p>We <em>could</em> specify this for the fields of built-in composites like tuples and
structs. However, what about something like Vec? Vec has to manually drop its
elements via pure-library code. In general, anything that implements Drop has
a chance to fiddle with its innards during its final death knell. Therefore
the compiler can't sufficiently reason about the actual destruction order
of the contents of any type that implements Drop.</p>
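<p>The ordering that separate <code>let</code> statements do establish can be observed with a small
sketch. The <code>Noisy</code> type and <code>drop_order</code> function are hypothetical: each value
records its name in a shared log when it is dropped.</p>

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    // `log` is declared first, so it strictly outlives both values.
    let log = RefCell::new(Vec::new());
    {
        let _x = Noisy { name: "x", log: &log };
        let _y = Noisy { name: "y", log: &log };
        // `_y` sits in the inner implicit scope, so it is dropped first.
    }
    log.into_inner()
}
```

<p>Here <code>drop_order()</code> returns <code>["y", "x"]</code>: variables bound by separate <code>let</code>
statements are dropped in reverse order of declaration.</p>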
<p>So why do we care? We care because if the type system isn't careful, it could
accidentally make dangling pointers. Consider the following simple program:</p>
<pre><pre class="playpen"><code class="language-rust">struct Inspector&lt;'a&gt;(&amp;'a u8);

fn main() {
    let (inspector, days);
    days = Box::new(1);
    inspector = Inspector(&amp;days);
}
</code></pre></pre>
<p>This program is totally sound and compiles today. The fact that <code>days</code> does
not <em>strictly</em> outlive <code>inspector</code> doesn't matter. As long as the <code>inspector</code>
is alive, so is <code>days</code>.</p>
<p>However if we add a destructor, the program will no longer compile!</p>
<pre><code class="language-rust ignore">struct Inspector&lt;'a&gt;(&amp;'a u8);

impl&lt;'a&gt; Drop for Inspector&lt;'a&gt; {
    fn drop(&amp;mut self) {
        println!(&quot;I was only {} days from retirement!&quot;, self.0);
    }
}

fn main() {
    let (inspector, days);
    days = Box::new(1);
    inspector = Inspector(&amp;days);
    // Let's say `days` happens to get dropped first.
    // Then when Inspector is dropped, it will try to read freed memory!
}
</code></pre>
<pre><code class="language-text">error: `days` does not live long enough
  --&gt; &lt;anon&gt;:15:1
   |
12 |     inspector = Inspector(&amp;days);
   |                            ---- borrow occurs here
...
15 | }
   | ^ `days` dropped here while still borrowed
   |
   = note: values in a scope are dropped in the opposite order they are created

error: aborting due to previous error
</code></pre>
<p>Implementing <code>Drop</code> lets the <code>Inspector</code> execute some arbitrary code during its
death. This means it can potentially observe that values that are supposed to
live as long as it does were actually destroyed first.</p>
<p>Interestingly, only generic types need to worry about this. If they aren't
generic, then the only lifetimes they can harbor are <code>'static</code>, which will truly
live <em>forever</em>. This is why this problem is referred to as <em>sound generic drop</em>.
Sound generic drop is enforced by the <em>drop checker</em>. As of this writing, some
of the finer details of how the drop checker validates types are totally up in
the air. However, The Big Rule is the subtlety that we have focused on this whole
section:</p>
<p><strong>For a generic type to soundly implement drop, its generic arguments must
strictly outlive it.</strong></p>
<p>Obeying this rule is (usually) necessary to satisfy the borrow
checker; obeying it is sufficient but not necessary to be
sound. That is, if your type obeys this rule then it's definitely
sound to drop.</p>
<p>The reason that it is not always necessary to satisfy the above rule
is that some Drop implementations will not access borrowed data even
though their type gives them the capability for such access.</p>
<p>For example, this variant of the above <code>Inspector</code> example will never
access borrowed data:</p>
<pre><code class="language-rust ignore">struct Inspector&lt;'a&gt;(&amp;'a u8, &amp;'static str);

impl&lt;'a&gt; Drop for Inspector&lt;'a&gt; {
    fn drop(&amp;mut self) {
        println!(&quot;Inspector(_, {}) knows when *not* to inspect.&quot;, self.1);
    }
}

fn main() {
    let (inspector, days);
    days = Box::new(1);
    inspector = Inspector(&amp;days, &quot;gadget&quot;);
    // Let's say `days` happens to get dropped first.
    // Even when Inspector is dropped, its destructor will not access the
    // borrowed `days`.
}
</code></pre>
<p>Likewise, this variant will also never access borrowed data:</p>
<pre><code class="language-rust ignore">use std::fmt;

struct Inspector&lt;T: fmt::Display&gt;(T, &amp;'static str);

impl&lt;T: fmt::Display&gt; Drop for Inspector&lt;T&gt; {
    fn drop(&amp;mut self) {
        println!(&quot;Inspector(_, {}) knows when *not* to inspect.&quot;, self.1);
    }
}

fn main() {
    let (inspector, days): (Inspector&lt;&amp;u8&gt;, Box&lt;u8&gt;);
    days = Box::new(1);
    inspector = Inspector(&amp;days, &quot;gadget&quot;);
    // Let's say `days` happens to get dropped first.
    // Even when Inspector is dropped, its destructor will not access the
    // borrowed `days`.
}
</code></pre>
<p>However, <em>both</em> of the above variants are rejected by the borrow
checker during the analysis of <code>fn main</code>, saying that <code>days</code> does not
live long enough.</p>
<p>The reason is that the borrow checking analysis of <code>main</code> does not
know about the internals of each <code>Inspector</code>'s <code>Drop</code> implementation.  As
far as the borrow checker knows while it is analyzing <code>main</code>, the body
of an inspector's destructor might access that borrowed data.</p>
<p>Therefore, the drop checker forces all borrowed data in a value to
strictly outlive that value.</p>
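<p>One way to satisfy this requirement, sketched below on the assumption that
reordering the declarations is acceptable, is to give <code>days</code> its own <code>let</code>
before <code>inspector</code>. Locals are dropped in the opposite order they are
declared, so <code>days</code> then strictly outlives the <code>Inspector</code> that borrows it.
The helper name <code>days_seen_by_inspector</code> is hypothetical and exists only for
illustration.</p>

```rust
struct Inspector<'a>(&'a u8);

impl<'a> Drop for Inspector<'a> {
    fn drop(&mut self) {
        println!("I was only {} days from retirement!", self.0);
    }
}

// Hypothetical helper, not part of the original example.
fn days_seen_by_inspector(n: u8) -> u8 {
    // `days` is declared first, so it is dropped last.
    let days = Box::new(n);
    // `inspector` is declared second, so it is dropped first, while the
    // data it borrows is still alive.
    let inspector = Inspector(&days);
    *inspector.0
}
```

<p>Calling <code>days_seen_by_inspector(1)</code> returns the borrowed value and runs the
destructor before <code>days</code> is freed, so the drop checker accepts it.</p>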
<a class="header" href="print.html#an-escape-hatch" id="an-escape-hatch"><h1>An Escape Hatch</h1></a>
<p>The precise rules that govern drop checking may be less restrictive in
the future.</p>
<p>The current analysis is deliberately conservative and trivial; it forces all
borrowed data in a value to outlive that value, which is certainly sound.</p>
<p>Future versions of the language may make the analysis more precise, to
reduce the number of cases where sound code is rejected as unsafe.
This would help address cases such as the two <code>Inspector</code>s above that
know not to inspect during destruction.</p>
<p>In the meantime, there is an unstable attribute that one can use to
assert (unsafely) that a generic type's destructor is <em>guaranteed</em> to
not access any expired data, even if its type gives it the capability
to do so.</p>
<p>That attribute is called <code>may_dangle</code> and was introduced in <a href="https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md">RFC 1327</a>.
To deploy it on the <code>Inspector</code> example from above, we would write:</p>
<pre><code class="language-rust ignore">struct Inspector&lt;'a&gt;(&amp;'a u8, &amp;'static str);

unsafe impl&lt;#[may_dangle] 'a&gt; Drop for Inspector&lt;'a&gt; {
    fn drop(&amp;mut self) {
        println!(&quot;Inspector(_, {}) knows when *not* to inspect.&quot;, self.1);
    }
}
</code></pre>
<p>Use of this attribute requires the <code>Drop</code> impl to be marked <code>unsafe</code> because the
compiler is not checking the implicit assertion that no potentially expired data
(e.g. <code>self.0</code> above) is accessed.</p>
<p>The attribute can be applied to any number of lifetime and type parameters. In
the following example, we assert that we access no data behind a reference of
lifetime <code>'b</code> and that the only uses of <code>T</code> will be moves or drops, but omit
the attribute from <code>'a</code> and <code>U</code>, because we do access data with that lifetime
and that type:</p>
<pre><code class="language-rust ignore">use std::fmt::Display;

struct Inspector&lt;'a, 'b, T, U: Display&gt;(&amp;'a u8, &amp;'b u8, T, U);

unsafe impl&lt;'a, #[may_dangle] 'b, #[may_dangle] T, U: Display&gt; Drop for Inspector&lt;'a, 'b, T, U&gt; {
    fn drop(&amp;mut self) {
        println!(&quot;Inspector({}, _, _, {})&quot;, self.0, self.3);
    }
}
</code></pre>
<p>It is sometimes obvious that no such access can occur, like the case above.
However, when dealing with a generic type parameter, such access can
occur indirectly. Examples of such indirect access are:</p>
<ul>
<li>invoking a callback,</li>
<li>making a trait method call.</li>
</ul>
<p>(Future changes to the language, such as impl specialization, may add
other avenues for such indirect access.)</p>
<p>Here is an example of invoking a callback:</p>
<pre><code class="language-rust ignore">struct Inspector&lt;T&gt;(T, &amp;'static str, Box&lt;for &lt;'r&gt; fn(&amp;'r T) -&gt; String&gt;);

impl&lt;T&gt; Drop for Inspector&lt;T&gt; {
    fn drop(&amp;mut self) {
        // The `self.2` call could access a borrow e.g. if `T` is `&amp;'a _`.
        println!(&quot;Inspector({}, {}) unwittingly inspects expired data.&quot;,
                 (self.2)(&amp;self.0), self.1);
    }
}
</code></pre>
<p>Here is an example of a trait method call:</p>
<pre><code class="language-rust ignore">use std::fmt;

struct Inspector&lt;T: fmt::Display&gt;(T, &amp;'static str);

impl&lt;T: fmt::Display&gt; Drop for Inspector&lt;T&gt; {
    fn drop(&amp;mut self) {
        // There is a hidden call to `&lt;T as Display&gt;::fmt` below, which
        // could access a borrow e.g. if `T` is `&amp;'a _`
        println!(&quot;Inspector({}, {}) unwittingly inspects expired data.&quot;,
                 self.0, self.1);
    }
}
</code></pre>
<p>And of course, all of these accesses could be further hidden within
some other method invoked by the destructor, rather than being written
directly within it.</p>
<p>In all of the above cases where the <code>&amp;'a u8</code> is accessed in the
destructor, adding the <code>#[may_dangle]</code>
attribute makes the type vulnerable to misuse that the borrow
checker will not catch, inviting havoc. It is better to avoid adding
the attribute.</p>
<a class="header" href="print.html#is-that-all-about-drop-checker" id="is-that-all-about-drop-checker"><h1>Is that all about the drop checker?</h1></a>
<p>It turns out that when writing unsafe code, we generally don't need to
worry at all about doing the right thing for the drop checker. However there
is one special case that you need to worry about, which we will look at in
the next section.</p>
<a class="header" href="print.html#phantomdata" id="phantomdata"><h1>PhantomData</h1></a>
<p>When working with unsafe code, we can often end up in a situation where
types or lifetimes are logically associated with a struct, but not actually
part of a field. This most commonly occurs with lifetimes. For instance, the
<code>Iter</code> for <code>&amp;'a [T]</code> is (approximately) defined as follows:</p>
<pre><code class="language-rust ignore">struct Iter&lt;'a, T: 'a&gt; {
    ptr: *const T,
    end: *const T,
}
</code></pre>
<p>However because <code>'a</code> is unused within the struct's body, it's <em>unbounded</em>.
Because of the troubles this has historically caused, unbounded lifetimes and
types are <em>forbidden</em> in struct definitions. Therefore we must somehow refer
to these types in the body. Correctly doing this is necessary to have
correct variance and drop checking.</p>
<p>We do this using <code>PhantomData</code>, which is a special marker type. <code>PhantomData</code>
consumes no space, but simulates a field of the given type for the purpose of
static analysis. This was deemed to be less error-prone than explicitly telling
the type-system the kind of variance that you want, while also providing other
useful things such as the information needed by drop check.</p>
<p>Iter logically contains a bunch of <code>&amp;'a T</code>s, so this is exactly what we tell
the PhantomData to simulate:</p>
<pre><code>use std::marker;

struct Iter&lt;'a, T: 'a&gt; {
    ptr: *const T,
    end: *const T,
    _marker: marker::PhantomData&lt;&amp;'a T&gt;,
}
</code></pre>
<p>and that's it. The lifetime will be bounded, and your iterator will be variant
over <code>'a</code> and <code>T</code>. Everything Just Works.</p>
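<p>To see the marker doing its job, here is a minimal sketch of how such an
iterator might be built and driven. The constructor <code>Iter::new</code> and the
<code>Iterator</code> impl are assumptions made for illustration, not the standard
library's implementation, and the sketch assumes <code>T</code> is not zero-sized (for a
zero-sized <code>T</code> the two pointers would always be equal).</p>

```rust
use std::marker::PhantomData;

struct Iter<'a, T: 'a> {
    ptr: *const T,
    end: *const T,
    _marker: PhantomData<&'a T>,
}

impl<'a, T: 'a> Iter<'a, T> {
    // Hypothetical constructor: ties the raw pointers to the borrow `'a`.
    fn new(slice: &'a [T]) -> Self {
        let ptr = slice.as_ptr();
        Iter {
            ptr,
            // A pointer one past the end of the slice is allowed to exist.
            end: unsafe { ptr.add(slice.len()) },
            _marker: PhantomData,
        }
    }
}

impl<'a, T: 'a> Iterator for Iter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if self.ptr == self.end {
            return None;
        }
        unsafe {
            // `ptr` is strictly before `end`, so it points at a live element
            // that is borrowed for `'a`.
            let item: &'a T = &*self.ptr;
            self.ptr = self.ptr.add(1);
            Some(item)
        }
    }
}
```

<p>Because of the <code>PhantomData&lt;&amp;'a T&gt;</code> field, the borrow checker treats an
<code>Iter</code> as if it held a shared borrow of the slice for <code>'a</code>, so the slice
cannot be mutated or freed while the iterator is alive.</p>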
<p>Another important example is Vec, which is (approximately) defined as follows:</p>
<pre><code>struct Vec&lt;T&gt; {
    data: *const T, // *const for variance!
    len: usize,
    cap: usize,
}
</code></pre>
<p>Unlike the previous example, it <em>appears</em> that everything is exactly as we
want. Every generic argument to Vec shows up in at least one field.
Good to go!</p>
<p>Nope.</p>
<p>The drop checker will generously determine that <code>Vec&lt;T&gt;</code> does not own any values
of type T. This will in turn make it conclude that it doesn't need to worry
about Vec dropping any T's in its destructor for determining drop check
soundness. This will in turn allow people to create unsoundness using
Vec's destructor.</p>
<p>In order to tell dropck that we <em>do</em> own values of type T, and therefore may
drop some T's when <em>we</em> drop, we must add an extra PhantomData saying exactly
that:</p>
<pre><code>use std::marker;

struct Vec&lt;T&gt; {
    data: *const T, // *const for variance!
    len: usize,
    cap: usize,
    _marker: marker::PhantomData&lt;T&gt;,
}
</code></pre>
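<p>The extra field costs nothing at runtime. The following sketch compares the
two layouts; the struct names <code>VecWithoutMarker</code> and <code>VecWithMarker</code> and the
helper <code>marker_adds_no_space</code> are hypothetical, and the comparison reflects
what current compilers do rather than a formal layout guarantee.</p>

```rust
use std::marker::PhantomData;
use std::mem::size_of;

#[allow(dead_code)]
struct VecWithoutMarker<T> {
    data: *const T,
    len: usize,
    cap: usize,
}

#[allow(dead_code)]
struct VecWithMarker<T> {
    data: *const T,
    len: usize,
    cap: usize,
    // Tells dropck that this type owns values of type `T`.
    _marker: PhantomData<T>,
}

// Hypothetical helper: true if the marker is zero-sized and leaves the
// size of the containing struct unchanged.
fn marker_adds_no_space() -> bool {
    size_of::<PhantomData<String>>() == 0
        && size_of::<VecWithMarker<String>>() == size_of::<VecWithoutMarker<String>>()
}
```

<p>The marker only changes what the static analysis believes about ownership;
the in-memory representation stays a pointer and two <code>usize</code>s.</p>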
<p>Raw pointers that own an allocation are such a pervasive pattern that the
standard library made a utility for itself called <code>Unique&lt;T&gt;</code> which:</p>
<ul>
<li>wraps a <code>*const T</code> for variance</li>
<li>includes a <code>PhantomData&lt;T&gt;</code></li>
<li>auto-derives <code>Send</code>/<code>Sync</code> as if <code>T</code> were contained</li>
<li>marks the pointer as <code>NonZero</code> for the null-pointer optimization</li>
</ul>
<a class="header" href="print.html#table-of-phantomdata-patterns" id="table-of-phantomdata-patterns"><h2>Table of <code>PhantomData</code> patterns</h2></a>
<p>Here’s a table of all the wonderful ways <code>PhantomData</code> could be used:</p>
<table><thead><tr><th> Phantom type                </th><th> <code>'a</code>      </th><th> <code>T</code>                       </th></tr></thead><tbody>
<tr><td> <code>PhantomData&lt;T&gt;</code>            </td><td> -         </td><td> variant (with drop check) </td></tr>
<tr><td> <code>PhantomData&lt;&amp;'a T&gt;</code>        </td><td> variant   </td><td> variant                   </td></tr>
<tr><td> <code>PhantomData&lt;&amp;'a mut T&gt;</code>    </td><td> variant   </td><td> invariant                 </td></tr>
<tr><td> <code>PhantomData&lt;*const T&gt;</code>     </td><td> -         </td><td> variant                   </td></tr>
<tr><td> <code>PhantomData&lt;*mut T&gt;</code>       </td><td> -         </td><td> invariant                 </td></tr>
<tr><td> <code>PhantomData&lt;fn(T)&gt;</code>        </td><td> -         </td><td> contravariant (*)         </td></tr>
<tr><td> <code>PhantomData&lt;fn() -&gt; T&gt;</code>    </td><td> -         </td><td> variant                   </td></tr>
<tr><td> <code>PhantomData&lt;fn(T) -&gt; T&gt;</code>   </td><td> -         </td><td> invariant                 </td></tr>
<tr><td> <code>PhantomData&lt;Cell&lt;&amp;'a ()&gt;&gt;</code> </td><td> invariant </td><td> -                         </td></tr>
</tbody></table>
<p>(*) If contravariance gets scrapped, this would be invariant.</p>
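<p>As a small sketch of what the <code>'a</code> column means in practice, the marker type
below is variant over <code>'a</code> because it uses <code>PhantomData&lt;&amp;'a ()&gt;</code>, so a value
with a long lifetime can be passed where a shorter one is expected. The names
<code>Covariant</code> and <code>shorten</code> are hypothetical.</p>

```rust
use std::marker::PhantomData;

// Hypothetical marker type that is variant over `'a`.
struct Covariant<'a>(PhantomData<&'a ()>);

impl<'a> Covariant<'a> {
    fn new() -> Self {
        Covariant(PhantomData)
    }
}

// Compiles only because `Covariant<'long>` is a subtype of
// `Covariant<'short>` whenever `'long: 'short`.
fn shorten<'short, 'long: 'short>(c: Covariant<'long>) -> Covariant<'short> {
    c
}
```

<p>If the field were <code>PhantomData&lt;Cell&lt;&amp;'a ()&gt;&gt;</code> instead, the type would be
invariant over <code>'a</code> and the body of <code>shorten</code> would be rejected.</p>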
<a class="header" href="print.html#splitting-borrows" id="splitting-borrows"><h1>Splitting Borrows</h1></a>
<p>The mutual exclusion property of mutable references can be very limiting when
working with a composite structure. The borrow checker understands some basic
stuff, but will fall over pretty easily. It does understand structs
sufficiently to know that it's possible to borrow disjoint fields of a struct
simultaneously. So this works today:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Foo {
    a: i32,
    b: i32,
    c: i32,
}

let mut x = Foo {a: 0, b: 0, c: 0};
let a = &amp;mut x.a;
let b = &amp;mut x.b;
let c = &amp;x.c;
*b += 1;
let c2 = &amp;x.c;
*a += 10;
println!(&quot;{} {} {} {}&quot;, a, b, c, c2);
#}</code></pre></pre>
<p>However borrowck doesn't understand arrays or slices in any way, so this doesn't
work:</p>
<pre><code class="language-rust ignore">let mut x = [1, 2, 3];
let a = &amp;mut x[0];
let b = &amp;mut x[1];
println!(&quot;{} {}&quot;, a, b);
</code></pre>
<pre><code class="language-text">&lt;anon&gt;:4:14: 4:18 error: cannot borrow `x[..]` as mutable more than once at a time
&lt;anon&gt;:4 let b = &amp;mut x[1];
                      ^~~~
&lt;anon&gt;:3:14: 3:18 note: previous borrow of `x[..]` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `x[..]` until the borrow ends
&lt;anon&gt;:3 let a = &amp;mut x[0];
                      ^~~~
&lt;anon&gt;:6:2: 6:2 note: previous borrow ends here
&lt;anon&gt;:1 fn main() {
&lt;anon&gt;:2 let mut x = [1, 2, 3];
&lt;anon&gt;:3 let a = &amp;mut x[0];
&lt;anon&gt;:4 let b = &amp;mut x[1];
&lt;anon&gt;:5 println!(&quot;{} {}&quot;, a, b);
&lt;anon&gt;:6 }
         ^
error: aborting due to 2 previous errors
</code></pre>
<p>While it was plausible that borrowck could understand this simple case, it's
pretty clearly hopeless for borrowck to understand disjointness in general
container types like a tree, especially if distinct keys actually <em>do</em> map
to the same value.</p>
<p>In order to &quot;teach&quot; borrowck that what we're doing is ok, we need to drop down
to unsafe code. For instance, mutable slices expose a <code>split_at_mut</code> function
that consumes the slice and returns two mutable slices: one for everything to
the left of the index, and one for everything to the right. Intuitively we know
this is safe because the slices don't overlap, and therefore don't alias. However,
the implementation requires some unsafety:</p>
<pre><code class="language-rust ignore">fn split_at_mut(&amp;mut self, mid: usize) -&gt; (&amp;mut [T], &amp;mut [T]) {
    let len = self.len();
    let ptr = self.as_mut_ptr();
    assert!(mid &lt;= len);
    unsafe {
        (from_raw_parts_mut(ptr, mid),
         from_raw_parts_mut(ptr.offset(mid as isize), len - mid))
    }
}
</code></pre>
<p>This is actually a bit subtle. So as to avoid ever making two <code>&amp;mut</code>'s to the
same value, we explicitly construct brand-new slices through raw pointers.</p>
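<p>From the caller's side no unsafe code is needed. As a minimal sketch, the
array example that borrowck rejected earlier can be written with
<code>split_at_mut</code>; the function name <code>bump_first_two</code> is hypothetical, and it
assumes the slice has at least two elements (it panics otherwise).</p>

```rust
// Hypothetical helper: mutates the first two elements through two
// simultaneously live mutable references.
fn bump_first_two(x: &mut [i32]) {
    // `left` covers index 0, `right` covers everything from index 1 on.
    let (left, right) = x.split_at_mut(1);
    let a = &mut left[0];
    let b = &mut right[0];
    *a += 10;
    *b += 1;
}
```

<p>Both <code>a</code> and <code>b</code> are alive at the same time, which is exactly what the
direct <code>&amp;mut x[0]</code> and <code>&amp;mut x[1]</code> borrows could not express.</p>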
<p>However more subtle is how iterators that yield mutable references work.
The iterator trait is defined as follows:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
trait Iterator {
    type Item;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt;;
}
#}</code></pre></pre>
<p>Given this definition, Self::Item has <em>no</em> connection to <code>self</code>. This means that
we can call <code>next</code> several times in a row, and hold onto all the results
<em>concurrently</em>. This is perfectly fine for by-value iterators, which have
exactly these semantics. It's also actually fine for shared references, as they
admit arbitrarily many references to the same thing (although the iterator needs
to be a separate object from the thing being shared).</p>
<p>But mutable references make this a mess. At first glance, they might seem
completely incompatible with this API, as it would produce multiple mutable
references to the same object!</p>
<p>However it actually <em>does</em> work, exactly because iterators are one-shot objects.
Everything an IterMut yields will be yielded at most once, so we don't
actually ever yield multiple mutable references to the same piece of data.</p>
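<p>A short sketch of what this allows: two items obtained from successive
<code>next</code> calls on the standard slice iterator can be held and used at the same
time. The function name <code>swap_first_two</code> is hypothetical.</p>

```rust
use std::mem;

// Hypothetical helper: swaps the first two elements, if both exist.
fn swap_first_two(v: &mut [i32]) {
    let mut iter = v.iter_mut();
    // Neither result borrows from `iter`, so both can be kept around.
    let first = iter.next();
    let second = iter.next();
    if let (Some(a), Some(b)) = (first, second) {
        mem::swap(a, b);
    }
}
```

<p>The two references point at different elements because the iterator never
yields the same element twice.</p>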
<p>Perhaps surprisingly, mutable iterators don't require unsafe code to be
implemented for many types!</p>
<p>For instance here's a singly linked list:</p>
<pre><pre class="playpen"><code class="language-rust"># fn main() {}
type Link&lt;T&gt; = Option&lt;Box&lt;Node&lt;T&gt;&gt;&gt;;

struct Node&lt;T&gt; {
    elem: T,
    next: Link&lt;T&gt;,
}

pub struct LinkedList&lt;T&gt; {
    head: Link&lt;T&gt;,
}

pub struct IterMut&lt;'a, T: 'a&gt;(Option&lt;&amp;'a mut Node&lt;T&gt;&gt;);

impl&lt;T&gt; LinkedList&lt;T&gt; {
    fn iter_mut(&amp;mut self) -&gt; IterMut&lt;T&gt; {
        IterMut(self.head.as_mut().map(|node| &amp;mut **node))
    }
}

impl&lt;'a, T&gt; Iterator for IterMut&lt;'a, T&gt; {
    type Item = &amp;'a mut T;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        self.0.take().map(|node| {
            self.0 = node.next.as_mut().map(|node| &amp;mut **node);
            &amp;mut node.elem
        })
    }
}
</code></pre></pre>
<p>Here's a mutable slice:</p>
<pre><pre class="playpen"><code class="language-rust"># fn main() {}
use std::mem;

pub struct IterMut&lt;'a, T: 'a&gt;(&amp;'a mut[T]);

impl&lt;'a, T&gt; Iterator for IterMut&lt;'a, T&gt; {
    type Item = &amp;'a mut T;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        let slice = mem::replace(&amp;mut self.0, &amp;mut []);
        if slice.is_empty() { return None; }

        let (l, r) = slice.split_at_mut(1);
        self.0 = r;
        l.get_mut(0)
    }
}

impl&lt;'a, T&gt; DoubleEndedIterator for IterMut&lt;'a, T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        let slice = mem::replace(&amp;mut self.0, &amp;mut []);
        if slice.is_empty() { return None; }

        let new_len = slice.len() - 1;
        let (l, r) = slice.split_at_mut(new_len);
        self.0 = l;
        r.get_mut(0)
    }
}
</code></pre></pre>
<p>And here's a binary tree:</p>
<pre><pre class="playpen"><code class="language-rust"># fn main() {}
use std::collections::VecDeque;

type Link&lt;T&gt; = Option&lt;Box&lt;Node&lt;T&gt;&gt;&gt;;

struct Node&lt;T&gt; {
    elem: T,
    left: Link&lt;T&gt;,
    right: Link&lt;T&gt;,
}

pub struct Tree&lt;T&gt; {
    root: Link&lt;T&gt;,
}

struct NodeIterMut&lt;'a, T: 'a&gt; {
    elem: Option&lt;&amp;'a mut T&gt;,
    left: Option&lt;&amp;'a mut Node&lt;T&gt;&gt;,
    right: Option&lt;&amp;'a mut Node&lt;T&gt;&gt;,
}

enum State&lt;'a, T: 'a&gt; {
    Elem(&amp;'a mut T),
    Node(&amp;'a mut Node&lt;T&gt;),
}

pub struct IterMut&lt;'a, T: 'a&gt;(VecDeque&lt;NodeIterMut&lt;'a, T&gt;&gt;);

impl&lt;T&gt; Tree&lt;T&gt; {
    pub fn iter_mut(&amp;mut self) -&gt; IterMut&lt;T&gt; {
        let mut deque = VecDeque::new();
        self.root.as_mut().map(|root| deque.push_front(root.iter_mut()));
        IterMut(deque)
    }
}

impl&lt;T&gt; Node&lt;T&gt; {
    pub fn iter_mut(&amp;mut self) -&gt; NodeIterMut&lt;T&gt; {
        NodeIterMut {
            elem: Some(&amp;mut self.elem),
            left: self.left.as_mut().map(|node| &amp;mut **node),
            right: self.right.as_mut().map(|node| &amp;mut **node),
        }
    }
}


impl&lt;'a, T&gt; Iterator for NodeIterMut&lt;'a, T&gt; {
    type Item = State&lt;'a, T&gt;;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        match self.left.take() {
            Some(node) =&gt; Some(State::Node(node)),
            None =&gt; match self.elem.take() {
                Some(elem) =&gt; Some(State::Elem(elem)),
                None =&gt; match self.right.take() {
                    Some(node) =&gt; Some(State::Node(node)),
                    None =&gt; None,
                }
            }
        }
    }
}

impl&lt;'a, T&gt; DoubleEndedIterator for NodeIterMut&lt;'a, T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        match self.right.take() {
            Some(node) =&gt; Some(State::Node(node)),
            None =&gt; match self.elem.take() {
                Some(elem) =&gt; Some(State::Elem(elem)),
                None =&gt; match self.left.take() {
                    Some(node) =&gt; Some(State::Node(node)),
                    None =&gt; None,
                }
            }
        }
    }
}

impl&lt;'a, T&gt; Iterator for IterMut&lt;'a, T&gt; {
    type Item = &amp;'a mut T;
    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        loop {
            match self.0.front_mut().and_then(|node_it| node_it.next()) {
                Some(State::Elem(elem)) =&gt; return Some(elem),
                Some(State::Node(node)) =&gt; self.0.push_front(node.iter_mut()),
                None =&gt; if let None = self.0.pop_front() { return None },
            }
        }
    }
}

impl&lt;'a, T&gt; DoubleEndedIterator for IterMut&lt;'a, T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        loop {
            match self.0.back_mut().and_then(|node_it| node_it.next_back()) {
                Some(State::Elem(elem)) =&gt; return Some(elem),
                Some(State::Node(node)) =&gt; self.0.push_back(node.iter_mut()),
                None =&gt; if let None = self.0.pop_back() { return None },
            }
        }
    }
}
</code></pre></pre>
<p>All of these are completely safe and work on stable Rust! This ultimately
falls out of the simple struct case we saw before: Rust understands that you
can safely split a mutable reference into subfields. We can then encode
permanently consuming a reference via Options (or in the case of slices,
replacing with an empty slice).</p>
<a class="header" href="print.html#type-conversions" id="type-conversions"><h1>Type Conversions</h1></a>
<p>At the end of the day, everything is just a pile of bits somewhere, and type
systems are just there to help us use those bits right. There are two common
problems with typing bits: needing to reinterpret those exact bits as a
different type, and needing to change the bits to have equivalent meaning for
a different type. Because Rust encourages encoding important properties in the
type system, these problems are incredibly pervasive. As such, Rust
gives you several ways to solve them.</p>
<p>First we'll look at the ways that Safe Rust gives you to reinterpret values.
The most trivial way to do this is to just destructure a value into its
constituent parts and then build a new type out of them. e.g.</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Foo {
    x: u32,
    y: u16,
}

struct Bar {
    a: u32,
    b: u16,
}

fn reinterpret(foo: Foo) -&gt; Bar {
    let Foo { x, y } = foo;
    Bar { a: x, b: y }
}
#}</code></pre></pre>
<p>But this is, at best, annoying. For common conversions, Rust provides
more ergonomic alternatives.</p>
<a class="header" href="print.html#coercions" id="coercions"><h1>Coercions</h1></a>
<p>Types can implicitly be coerced to change in certain contexts. These changes are
generally just <em>weakening</em> of types, largely focused around pointers and
lifetimes. They mostly exist to make Rust &quot;just work&quot; in more cases, and are
largely harmless.</p>
<p>Coercion is allowed between the following types:</p>
<ul>
<li>Transitivity: <code>T_1</code> to <code>T_3</code> where <code>T_1</code> coerces to <code>T_2</code> and <code>T_2</code> coerces to
<code>T_3</code></li>
<li>Pointer Weakening:
<ul>
<li><code>&amp;mut T</code> to <code>&amp;T</code></li>
<li><code>*mut T</code> to <code>*const T</code></li>
<li><code>&amp;T</code> to <code>*const T</code></li>
<li><code>&amp;mut T</code> to <code>*mut T</code></li>
</ul>
</li>
<li>Unsizing: <code>T</code> to <code>U</code> if <code>T</code> implements <code>CoerceUnsized&lt;U&gt;</code></li>
<li>Deref coercion: Expression <code>&amp;x</code> of type <code>&amp;T</code> to <code>&amp;*x</code> of type <code>&amp;U</code> if <code>T</code> derefs to <code>U</code> (i.e. <code>T: Deref&lt;Target=U&gt;</code>)</li>
</ul>
<p><code>CoerceUnsized&lt;Pointer&lt;U&gt;&gt; for Pointer&lt;T&gt; where T: Unsize&lt;U&gt;</code> is implemented
for all pointer types (including smart pointers like Box and Rc). Unsize is
only implemented automatically, and enables the following transformations:</p>
<ul>
<li><code>[T; n]</code> =&gt; <code>[T]</code></li>
<li><code>T</code> =&gt; <code>Trait</code> where <code>T: Trait</code></li>
<li><code>Foo&lt;..., T, ...&gt;</code> =&gt; <code>Foo&lt;..., U, ...&gt;</code> where:
<ul>
<li><code>T: Unsize&lt;U&gt;</code></li>
<li><code>Foo</code> is a struct</li>
<li>Only the last field of <code>Foo</code> has a type involving <code>T</code></li>
<li><code>T</code> is not part of the type of any other fields</li>
<li><code>Bar&lt;T&gt;: Unsize&lt;Bar&lt;U&gt;&gt;</code>, if the last field of <code>Foo</code> has type <code>Bar&lt;T&gt;</code></li>
</ul>
</li>
</ul>
<p>Coercions occur at a <em>coercion site</em>. Any location that is explicitly typed
will cause a coercion to its type. If inference is necessary, the coercion will
not be performed. Exhaustively, the coercion sites for an expression <code>e</code> to
type <code>U</code> are:</p>
<ul>
<li>let statements, statics, and consts: <code>let x: U = e</code></li>
<li>Arguments to functions: <code>takes_a_U(e)</code></li>
<li>Any expression that will be returned: <code>fn foo() -&gt; U { e }</code></li>
<li>Struct literals: <code>Foo { some_u: e }</code></li>
<li>Array literals: <code>let x: [U; 10] = [e, ..]</code></li>
<li>Tuple literals: <code>let x: (U, ..) = (e, ..)</code></li>
<li>The last expression in a block: <code>let x: U = { ..; e }</code></li>
</ul>
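<p>The following sketch exercises several of the coercions listed above at
return and argument sites. All function names are hypothetical and exist only
to give each coercion an explicitly typed site.</p>

```rust
use std::fmt::Display;

// Pointer weakening at a return site: `&mut T` to `&T`.
fn weaken(x: &mut i32) -> &i32 {
    x
}

// Pointer weakening at a return site: `&T` to `*const T`.
fn to_raw(x: &i32) -> *const i32 {
    x
}

// Unsizing: `&[u8; 3]` to `&[u8]`.
fn as_slice(a: &[u8; 3]) -> &[u8] {
    a
}

// Unsizing: `Box<i32>` to a boxed trait object.
fn as_display(b: Box<i32>) -> Box<dyn Display> {
    b
}

fn str_len(s: &str) -> usize {
    s.len()
}

// Deref coercion at a function argument: `&String` to `&str`.
fn string_len(s: &String) -> usize {
    str_len(s)
}
```

<p>In every case the target type is written out in the signature; none of these
bodies would compile if the compiler had to infer the target type instead.</p>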
<p>Note that we do not perform coercions when matching traits (except for
receivers, see below). If there is an impl for some type <code>U</code> and <code>T</code> coerces to
<code>U</code>, that does not constitute an implementation for <code>T</code>. For example, the
following will not type check, even though it is OK to coerce <code>t</code> to <code>&amp;T</code> and
there is an impl for <code>&amp;T</code>:</p>
<pre><code class="language-rust ignore">trait Trait {}

fn foo&lt;X: Trait&gt;(t: X) {}

impl&lt;'a&gt; Trait for &amp;'a i32 {}


fn main() {
    let t: &amp;mut i32 = &amp;mut 0;
    foo(t);
}
</code></pre>
<pre><code class="language-text">&lt;anon&gt;:10:5: 10:8 error: the trait bound `&amp;mut i32 : Trait` is not satisfied [E0277]
&lt;anon&gt;:10     foo(t);
              ^~~
</code></pre>
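<p>A minimal sketch of one way around this is to perform the conversion
explicitly before the trait is matched. Here <code>foo</code> has been given a <code>bool</code>
return value and wrapped in a hypothetical <code>call_foo</code> function purely so the
result can be observed.</p>

```rust
trait Trait {}

impl<'a> Trait for &'a i32 {}

// Returns `true` only so that the call has an observable result.
fn foo<X: Trait>(_t: X) -> bool {
    true
}

// Hypothetical wrapper around the call from the example above.
fn call_foo() -> bool {
    let t: &mut i32 = &mut 0;
    // `foo(t)` is rejected: `&mut i32` does not implement `Trait`.
    // Reborrowing explicitly produces an `&i32`, which does.
    foo(&*t)
}
```

<p>The explicit reborrow <code>&amp;*t</code> makes the argument an <code>&amp;i32</code> before trait
selection happens, so no coercion is needed.</p>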
<a class="header" href="print.html#the-dot-operator" id="the-dot-operator"><h1>The Dot Operator</h1></a>
<p>The dot operator will perform a lot of magic to convert types. It will perform
auto-referencing, auto-dereferencing, and coercion until types match.</p>
<p>TODO: steal information from http://stackoverflow.com/questions/28519997/what-are-rusts-exact-auto-dereferencing-rules/28552082#28552082</p>
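<p>As a rough illustration only (the exact lookup rules are not spelled out in
this section), the sketch below shows two method calls where the receiver's
type does not match the method's <code>self</code> type as written. The function names
are hypothetical.</p>

```rust
// `starts_with` is defined on `str`. The dot operator dereferences
// `Box<String>` to `String`, then to `str`, and finally takes a reference.
fn boxed_starts_with_he() -> bool {
    let boxed: Box<String> = Box::new(String::from("hello"));
    boxed.starts_with("he")
}

fn push_twice() -> Vec<i32> {
    let mut v = Vec::new();
    // Auto-referencing: equivalent to `Vec::push(&mut v, 1)`.
    v.push(1);
    {
        let r = &mut v;
        // The receiver is already a `&mut Vec<i32>`; it is reborrowed.
        r.push(2);
    }
    v
}
```

<p>Neither call needs an explicit <code>&amp;</code>, <code>&amp;mut</code>, or <code>*</code> on the receiver.</p>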
<a class="header" href="print.html#casts" id="casts"><h1>Casts</h1></a>
<p>Casts are a superset of coercions: every coercion can be explicitly
invoked via a cast. However some conversions require a cast.
While coercions are pervasive and largely harmless, these &quot;true casts&quot;
are rare and potentially dangerous. As such, casts must be explicitly invoked
using the <code>as</code> keyword: <code>expr as Type</code>.</p>
<p>True casts generally revolve around raw pointers and the primitive numeric
types. Even though they're dangerous, these casts are infallible at runtime.
If a cast triggers some subtle corner case no indication will be given that
this occurred. The cast will simply succeed. That said, casts must be valid
at the type level, or else they will be prevented statically. For instance,
<code>7u8 as bool</code> will not compile.</p>
<p>That said, casts aren't <code>unsafe</code> because they generally can't violate memory
safety <em>on their own</em>. For instance, converting an integer to a raw pointer can
very easily lead to terrible things. However the act of creating the pointer
itself is safe, because actually using a raw pointer is already marked as
<code>unsafe</code>.</p>
<p>Here's an exhaustive list of all the true casts. For brevity, we will use <code>*</code>
to denote either a <code>*const</code> or <code>*mut</code>, and <code>integer</code> to denote any integral
primitive:</p>
<ul>
<li><code>*T as *U</code> where <code>T, U: Sized</code></li>
<li><code>*T as *U</code> TODO: explain unsized situation</li>
<li><code>*T as integer</code></li>
<li><code>integer as *T</code></li>
<li><code>number as number</code></li>
<li><code>field-less enum as integer</code></li>
<li><code>bool as integer</code></li>
<li><code>char as integer</code></li>
<li><code>u8 as char</code></li>
<li><code>&amp;[T; n] as *const T</code></li>
<li><code>fn as *T</code> where <code>T: Sized</code></li>
<li><code>fn as integer</code></li>
</ul>
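<p>A few of these casts in action, as a minimal sketch. The enum <code>Level</code> and
all function names are hypothetical.</p>

```rust
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Level {
    Low = 1,
    High = 10,
}

// field-less enum as integer
fn enum_as_integer(level: Level) -> i32 {
    level as i32
}

// bool as integer
fn bool_as_integer(b: bool) -> u8 {
    b as u8
}

// char as integer
fn char_as_integer(c: char) -> u32 {
    c as u32
}

// u8 as char
fn u8_as_char(b: u8) -> char {
    b as char
}

// &[T; n] as *const T
fn first_element_ptr(a: &[i32; 3]) -> *const i32 {
    a as *const i32
}

// *T as integer, then integer as *T
fn pointer_round_trip(p: *const i32) -> *const i32 {
    let addr = p as usize;
    addr as *const i32
}
```

<p>None of these functions needs <code>unsafe</code>: creating the values is safe, and only
dereferencing the resulting raw pointers would require it.</p>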
<p>Note that lengths are not adjusted when casting raw slices -
<code>*const [u16] as *const [u8]</code> creates a slice that only includes
half of the original memory.</p>
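<p>The effect on the length can be observed directly. In this sketch the
hypothetical function <code>byte_view_len</code> casts a raw <code>u16</code> slice to a raw <code>u8</code>
slice and reports the length of the result.</p>

```rust
// Hypothetical helper: returns the length of the `u8` view produced by the cast.
fn byte_view_len(words: &[u16]) -> usize {
    let raw_words = words as *const [u16];
    // The element count is carried over unchanged by this cast.
    let raw_bytes = raw_words as *const [u8];
    // The first `words.len()` bytes lie inside the original slice, and `u8`
    // has no alignment or validity requirements, so this reference is valid.
    unsafe { (&*raw_bytes).len() }
}
```

<p>For a slice of four <code>u16</code>s (eight bytes) the result is four, so the <code>u8</code>
view covers only the first half of the memory.</p>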
<p>Casting is not transitive, that is, even if <code>e as U1 as U2</code> is a valid
expression, <code>e as U2</code> is not necessarily so.</p>
<p>For numeric casts, there are quite a few cases to consider:</p>
<ul>
<li>casting between two integers of the same size (e.g. i32 -&gt; u32) is a no-op</li>
<li>casting from a larger integer to a smaller integer (e.g. u32 -&gt; u8) will
truncate</li>
<li>casting from a smaller integer to a larger integer (e.g. u8 -&gt; u32) will
<ul>
<li>zero-extend if the source is unsigned</li>
<li>sign-extend if the source is signed</li>
</ul>
</li>
<li>casting from a float to an integer will round the float towards zero
<ul>
<li><strong><a href="https://github.com/rust-lang/rust/issues/10184">NOTE: currently this will cause Undefined Behavior if the rounded
value cannot be represented by the target integer type</a></strong>.
This includes Inf and NaN. This is a bug and will be fixed.</li>
</ul>
</li>
<li>casting from an integer to float will produce the floating point
representation of the integer, rounded if necessary (rounding to
nearest, ties to even)</li>
<li>casting from an f32 to an f64 is perfect and lossless</li>
<li>casting from an f64 to an f32 will produce the closest possible value
(rounding to nearest, ties to even)</li>
</ul>
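<p>The rules above can be checked with a few concrete values. This is a sketch
with a hypothetical function name; it deliberately avoids out-of-range float to
integer casts, given the note above.</p>

```rust
// Hypothetical helper: panics if any of the listed rules does not hold.
fn check_numeric_casts() {
    // Same size: the bits are unchanged, only their interpretation differs.
    assert_eq!(-1i32 as u32, u32::MAX);
    // Larger to smaller: truncation (300 = 256 + 44).
    assert_eq!(300u32 as u8, 44);
    // Smaller to larger, unsigned source: zero-extension.
    assert_eq!(200u8 as u32, 200);
    // Smaller to larger, signed source: sign-extension.
    assert_eq!(-1i8 as i32, -1);
    assert_eq!(-1i8 as u32, u32::MAX);
    // Float to integer: rounds towards zero.
    assert_eq!(2.9f32 as i32, 2);
    assert_eq!(-2.9f32 as i32, -2);
    // Integer to float: 2^24 + 1 is not representable in an f32 and is
    // rounded to the nearest representable value, with ties going to even.
    assert_eq!(16_777_217i32 as f32, 16_777_216.0f32);
    // f32 to f64 and back is lossless.
    assert_eq!(0.1f32 as f64 as f32, 0.1f32);
    // f64 to f32: rounds to the closest f32.
    assert_eq!(1.000000001f64 as f32, 1.0f32);
}
```

<p>None of these casts reports the loss of information; the cast simply
succeeds with the adjusted value.</p>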
<a class="header" href="print.html#transmutes" id="transmutes"><h1>Transmutes</h1></a>
<p>Get out of our way type system! We're going to reinterpret these bits or die
trying! Even though this book is all about doing things that are unsafe, I
really can't emphasize enough that you should deeply think about finding Another
Way than the operations covered in this section. This is really, truly, the most
horribly unsafe thing you can do in Rust. The guardrails here are dental floss.</p>
<p><code>mem::transmute&lt;T, U&gt;</code> takes a value of type <code>T</code> and reinterprets it to have
type <code>U</code>. The only restriction is that <code>T</code> and <code>U</code> are verified to have the
same size. The ways to cause Undefined Behavior with this are mind boggling.</p>
<ul>
<li>First and foremost, creating an instance of <em>any</em> type with an invalid state
is going to cause arbitrary chaos that can't really be predicted.</li>
<li>Transmute has an overloaded return type. If you do not specify the return type
it may produce a surprising type to satisfy inference.</li>
<li>Making a primitive with an invalid value is UB</li>
<li>Transmuting between non-repr(C) types is UB</li>
<li>Transmuting an &amp; to &amp;mut is UB
<ul>
<li>Transmuting an &amp; to &amp;mut is <em>always</em> UB</li>
<li>No you can't do it</li>
<li>No you're not special</li>
</ul>
</li>
<li>Transmuting to a reference without an explicitly provided lifetime
produces an <a href="unbounded-lifetimes.html">unbounded lifetime</a></li>
</ul>
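<p>For contrast with the hazards listed above, here is a sketch of about the
tamest transmute there is. The helper name <code>float_bits</code> is hypothetical. It
works only because <code>f32</code> and <code>u32</code> have the same size and every bit pattern is a
valid <code>u32</code>, and both type parameters are spelled out so inference has nothing
to guess.</p>
<pre><code class="language-rust">use std::mem;

// Hypothetical helper: reinterpret the bits of an f32 as a u32.
fn float_bits(x: f32) -&gt; u32 {
    // Source and target types are written explicitly, so the overloaded
    // return type cannot be inferred as something surprising.
    unsafe { mem::transmute::&lt;f32, u32&gt;(x) }
}

fn main() {
    // 1.0f32 is sign 0, exponent 127, mantissa 0.
    assert_eq!(float_bits(1.0), 0x3F80_0000);
    // The safe standard-library method gives the same answer.
    assert_eq!(float_bits(1.0), 1.0f32.to_bits());
}
</code></pre>
<p>When a safe equivalent like <code>f32::to_bits</code> exists, that is the Another Way you
should be reaching for.</p>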
<p><code>mem::transmute_copy&lt;T, U&gt;</code> somehow manages to be <em>even more</em> wildly unsafe than
this. It copies <code>size_of::&lt;U&gt;()</code> bytes out of an <code>&amp;T</code> and interprets them as a <code>U</code>.
The size check that <code>mem::transmute</code> has is gone (as it may be valid to copy
out a prefix), though it is Undefined Behavior for <code>U</code> to be larger than <code>T</code>.</p>
<p>Also of course you can get most of the functionality of these functions using
pointer casts.</p>
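<p>As a rough illustration of copying out a prefix, the sketch below reads the
first <code>u16</code> of a <code>[u16; 2]</code> twice: once with <code>mem::transmute_copy</code> and once with a
pointer cast. The function names are invented for this example.</p>
<pre><code class="language-rust">use std::mem;

// Copies size_of::&lt;u16&gt;() bytes from the front of the array.
fn first_via_transmute_copy(pair: &amp;[u16; 2]) -&gt; u16 {
    // U (u16) is smaller than T ([u16; 2]), so only a prefix is read.
    unsafe { mem::transmute_copy::&lt;[u16; 2], u16&gt;(pair) }
}

// The same read expressed as a pointer cast followed by a dereference.
fn first_via_pointer_cast(pair: &amp;[u16; 2]) -&gt; u16 {
    unsafe { *(pair as *const [u16; 2] as *const u16) }
}

fn main() {
    let pair = [0xABCD, 0x1234];
    assert_eq!(first_via_transmute_copy(&amp;pair), 0xABCD);
    assert_eq!(first_via_pointer_cast(&amp;pair), 0xABCD);
}
</code></pre>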
<a class="header" href="print.html#working-with-uninitialized-memory" id="working-with-uninitialized-memory"><h1>Working With Uninitialized Memory</h1></a>
<p>All runtime-allocated memory in a Rust program begins its life as
<em>uninitialized</em>. In this state the value of the memory is an indeterminate pile
of bits that may or may not even reflect a valid state for the type that is
supposed to inhabit that location of memory. Attempting to interpret this memory
as a value of <em>any</em> type will cause Undefined Behavior. Do Not Do This.</p>
<p>Rust provides mechanisms to work with uninitialized memory in checked (safe) and
unchecked (unsafe) ways.</p>
<a class="header" href="print.html#checked-uninitialized-memory" id="checked-uninitialized-memory"><h1>Checked Uninitialized Memory</h1></a>
<p>Like C, all stack variables in Rust are uninitialized until a value is
explicitly assigned to them. Unlike C, Rust statically prevents you from ever
reading them until you do:</p>
<pre><code class="language-rust ignore">fn main() {
    let x: i32;
    println!(&quot;{}&quot;, x);
}
</code></pre>
<pre><code class="language-text">src/main.rs:3:20: 3:21 error: use of possibly uninitialized variable: `x`
src/main.rs:3     println!(&quot;{}&quot;, x);
                                 ^
</code></pre>
<p>This is based on a basic branch analysis: every branch must assign a value
to <code>x</code> before it is first used. Interestingly, Rust doesn't require the variable
to be mutable to perform a delayed initialization if every branch assigns
exactly once. However, the analysis does not take advantage of constant analysis
or anything like that. So this compiles:</p>
<pre><pre class="playpen"><code class="language-rust">fn main() {
    let x: i32;

    if true {
        x = 1;
    } else {
        x = 2;
    }

    println!(&quot;{}&quot;, x);
}
</code></pre></pre>
<p>but this doesn't:</p>
<pre><code class="language-rust ignore">fn main() {
    let x: i32;
    if true {
        x = 1;
    }
    println!(&quot;{}&quot;, x);
}
</code></pre>
<pre><code class="language-text">src/main.rs:6:17: 6:18 error: use of possibly uninitialized variable: `x`
src/main.rs:6   println!(&quot;{}&quot;, x);
</code></pre>
<p>while this does:</p>
<pre><pre class="playpen"><code class="language-rust">fn main() {
    let x: i32;
    if true {
        x = 1;
        println!(&quot;{}&quot;, x);
    }
    // Don't care that there are branches where it's not initialized
    // since we don't use the value in those branches
}
</code></pre></pre>
<p>Of course, while the analysis doesn't consider actual values, it does
have a relatively sophisticated understanding of dependencies and control
flow. For instance, this works:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
let x: i32;

loop {
    // Rust doesn't understand that this branch will be taken unconditionally,
    // because it relies on actual values.
    if true {
        // But it does understand that it will only be taken once because
        // we unconditionally break out of it. Therefore `x` doesn't
        // need to be marked as mutable.
        x = 0;
        break;
    }
}
// It also knows that it's impossible to get here without reaching the break.
// And therefore that `x` must be initialized here!
println!(&quot;{}&quot;, x);
#}</code></pre></pre>
<p>If a value is moved out of a variable, that variable becomes logically
uninitialized if the type of the value isn't Copy. That is:</p>
<pre><pre class="playpen"><code class="language-rust">fn main() {
    let x = 0;
    let y = Box::new(0);
    let z1 = x; // x is still valid because i32 is Copy
    let z2 = y; // y is now logically uninitialized because Box isn't Copy
}
</code></pre></pre>
<p>However reassigning <code>y</code> in this example <em>would</em> require <code>y</code> to be marked as
mutable, as a Safe Rust program could observe that the value of <code>y</code> changed:</p>
<pre><pre class="playpen"><code class="language-rust">fn main() {
    let mut y = Box::new(0);
    let z = y; // y is now logically uninitialized because Box isn't Copy
    y = Box::new(1); // reinitialize y
}
</code></pre></pre>
<p>Otherwise it's like <code>y</code> is a brand new variable.</p>
<a class="header" href="print.html#drop-flags" id="drop-flags"><h1>Drop Flags</h1></a>
<p>The examples in the previous section introduce an interesting problem for Rust.
We have seen that it's possible to conditionally initialize, deinitialize, and
reinitialize locations of memory totally safely. For Copy types, this isn't
particularly notable since they're just a random pile of bits. However types
with destructors are a different story: Rust needs to know whether to call a
destructor whenever a variable is assigned to, or a variable goes out of scope.
How can it do this with conditional initialization?</p>
<p>Note that this is not a problem that all assignments need worry about. In
particular, assigning through a dereference unconditionally drops, and assigning
in a <code>let</code> unconditionally doesn't drop:</p>
<pre><code>let mut x = Box::new(0); // let makes a fresh variable, so never need to drop
let y = &amp;mut x;
*y = Box::new(1); // Deref assumes the referent is initialized, so always drops
</code></pre>
<p>This is only a problem when overwriting a previously initialized variable or
one of its subfields.</p>
<p>It turns out that Rust actually tracks whether a variable should be dropped or
not <em>at runtime</em>. As a variable becomes initialized and uninitialized, a
<em>drop flag</em> for that variable is toggled. When a variable might need to be
dropped, this flag is evaluated to determine if it should be dropped.</p>
<p>Of course, it is often the case that a value's initialization state can be
statically known at every point in the program. If this is the case, then the
compiler can theoretically generate more efficient code! For instance,
straight-line code has such <em>static drop semantics</em>:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
let mut x = Box::new(0); // x was uninit; just overwrite.
let mut y = x;           // y was uninit; just overwrite and make x uninit.
x = Box::new(0);         // x was uninit; just overwrite.
y = x;                   // y was init; Drop y, overwrite it, and make x uninit!
                         // y goes out of scope; y was init; Drop y!
                         // x goes out of scope; x was uninit; do nothing.
#}</code></pre></pre>
<p>Similarly, branched code where all branches have the same behavior with respect
to initialization has static drop semantics:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
# let condition = true;
let mut x = Box::new(0);    // x was uninit; just overwrite.
if condition {
    drop(x)                 // x gets moved out; make x uninit.
} else {
    println!(&quot;{}&quot;, x);
    drop(x)                 // x gets moved out; make x uninit.
}
x = Box::new(0);            // x was uninit; just overwrite.
                            // x goes out of scope; x was init; Drop x!
#}</code></pre></pre>
<p>However code like this <em>requires</em> runtime information to correctly Drop:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
# let condition = true;
let x;
if condition {
    x = Box::new(0);        // x was uninit; just overwrite.
    println!(&quot;{}&quot;, x);
}
                            // x goes out of scope; x might be uninit;
                            // check the flag!
#}</code></pre></pre>
<p>Of course, in this case it's trivial to retrieve static drop semantics:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
# let condition = true;
if condition {
    let x = Box::new(0);
    println!(&quot;{}&quot;, x);
}
#}</code></pre></pre>
<p>The drop flags are tracked on the stack and no longer stashed in types that
implement drop.</p>
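<p>You can't see a drop flag directly, but you can observe its effect. The sketch
below uses a made-up <code>Noisy</code> type that counts how many times its destructor
runs. The variable is initialized in only one branch, so the destructor must
run in exactly that case and no other.</p>
<pre><code class="language-rust">use std::cell::Cell;

// Hypothetical type whose destructor bumps a shared counter.
struct Noisy&lt;'a&gt; {
    drops: &amp;'a Cell&lt;u32&gt;,
}

impl&lt;'a&gt; Drop for Noisy&lt;'a&gt; {
    fn drop(&amp;mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns how many destructors ran for the conditionally initialized `x`.
fn drops_observed(condition: bool) -&gt; u32 {
    let drops = Cell::new(0);
    {
        let x;
        if condition {
            x = Noisy { drops: &amp;drops };
            // x is usable here and nothing has been dropped yet.
            assert_eq!(x.drops.get(), 0);
        }
        // x goes out of scope; whether it was initialized depends on
        // `condition`, which is only known at runtime.
    }
    drops.get()
}

fn main() {
    assert_eq!(drops_observed(true), 1);
    assert_eq!(drops_observed(false), 0);
}
</code></pre>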
<a class="header" href="print.html#unchecked-uninitialized-memory" id="unchecked-uninitialized-memory"><h1>Unchecked Uninitialized Memory</h1></a>
<p>One interesting exception to this rule is working with arrays. Safe Rust doesn't
permit you to partially initialize an array. When you initialize an array, you
can either set every value to the same thing with <code>let x = [val; N]</code>, or you can
specify each member individually with <code>let x = [val1, val2, val3]</code>.
Unfortunately this is pretty rigid, especially if you need to initialize your
array in a more incremental or dynamic way.</p>
<p>Unsafe Rust gives us a powerful tool to handle this problem:
<code>mem::uninitialized</code>. This function pretends to return a value when really
it does nothing at all. Using it, we can convince Rust that we have initialized
a variable, allowing us to do trickier things with conditional and incremental
initialization.</p>
<p>Unfortunately, this opens us up to all kinds of problems. Assignment has a
different meaning to Rust based on whether it believes that a variable is
initialized or not. If it's believed uninitialized, then Rust will semantically
just memcopy the bits over the uninitialized ones, and do nothing else. However
if Rust believes a value to be initialized, it will try to <code>Drop</code> the old value!
Since we've tricked Rust into believing that the value is initialized, we can no
longer safely use normal assignment.</p>
<p>This is also a problem if you're working with a raw system allocator, which
returns a pointer to uninitialized memory.</p>
<p>To handle this, we must use the <code>ptr</code> module. In particular, it provides
three functions that allow us to assign bytes to a location in memory without
dropping the old value: <code>write</code>, <code>copy</code>, and <code>copy_nonoverlapping</code>.</p>
<ul>
<li><code>ptr::write(ptr, val)</code> takes a <code>val</code> and moves it into the address pointed
to by <code>ptr</code>.</li>
<li><code>ptr::copy(src, dest, count)</code> copies the bits that <code>count</code> T's would occupy
from src to dest. (this is equivalent to memmove -- note that the argument
order is reversed!)</li>
<li><code>ptr::copy_nonoverlapping(src, dest, count)</code> does what <code>copy</code> does, but a
little faster on the assumption that the two ranges of memory don't overlap.
(this is equivalent to memcpy -- note that the argument order is reversed!)</li>
</ul>
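<p>Here is a small sketch of the three functions on plain <code>u32</code> buffers. The
helper names are hypothetical. Since <code>u32</code> has no destructor, ordinary assignment
would do just as well here; the point is only to show the argument order and
which function tolerates overlap.</p>
<pre><code class="language-rust">use std::ptr;

// Shift every element one slot to the right and put 0 in front.
fn shift_right(buf: &amp;mut [u32]) {
    let n = buf.len();
    if n == 0 {
        return;
    }
    unsafe {
        let p = buf.as_mut_ptr();
        // Source and destination overlap, so this must be `copy` (memmove).
        // Note the order: source first, destination second.
        ptr::copy(p, p.add(1), n - 1);
        // Overwrite the first slot without reading or dropping the old value.
        ptr::write(p, 0);
    }
}

// Copy a slice into a freshly allocated buffer.
fn duplicate(src: &amp;[u32]) -&gt; Vec&lt;u32&gt; {
    let mut dst = vec![0u32; src.len()];
    unsafe {
        // The two buffers are distinct allocations, so they cannot overlap.
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
    dst
}

fn main() {
    let mut buf = [1, 2, 3, 4, 5];
    shift_right(&amp;mut buf);
    assert_eq!(buf, [0, 1, 2, 3, 4]);
    assert_eq!(duplicate(&amp;buf), vec![0, 1, 2, 3, 4]);
}
</code></pre>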
<p>It should go without saying that these functions, if misused, will cause serious
havoc or just straight up Undefined Behavior. The only things that these
functions <em>themselves</em> require is that the locations you want to read and write
are allocated. However the ways writing arbitrary bits to arbitrary
locations of memory can break things are basically uncountable!</p>
<p>Putting this all together, we get the following:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
use std::mem;
use std::ptr;

// size of the array is hard-coded but easy to change. This means we can't
// use [a, b, c] syntax to initialize the array, though!
const SIZE: usize = 10;

let mut x: [Box&lt;u32&gt;; SIZE];

unsafe {
    // convince Rust that x is Totally Initialized
    x = mem::uninitialized();
    for i in 0..SIZE {
        // very carefully overwrite each index without reading it
        // NOTE: exception safety is not a concern; Box can't panic
        ptr::write(&amp;mut x[i], Box::new(i as u32));
    }
}

println!(&quot;{:?}&quot;, x);
#}</code></pre></pre>
<p>It's worth noting that you don't need to worry about <code>ptr::write</code>-style
shenanigans with types which don't implement <code>Drop</code> or contain <code>Drop</code> types,
because Rust knows not to try to drop them. Similarly you should be able to
assign to fields of partially initialized structs directly if those fields don't
contain any <code>Drop</code> types.</p>
<p>However when working with uninitialized memory you need to be ever-vigilant for
Rust trying to drop values you make like this before they're fully initialized.
Every control path through that variable's scope must initialize the value
before it ends, if it has a destructor.
<em><a href="unwinding.html">This includes code panicking</a></em>.</p>
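<p>One way to keep Rust from dropping slots that were never written is to let a
<code>Vec</code> do the bookkeeping. This is a sketch of that alternative, not the approach
used above: the helper <code>boxed_range</code> is hypothetical, and it relies on
<code>Vec::with_capacity</code> handing back uninitialized spare capacity that the length
doesn't cover yet.</p>
<pre><code class="language-rust">use std::ptr;

// Build [Box(0), Box(1), ..., Box(n - 1)] by writing into spare capacity.
fn boxed_range(n: usize) -&gt; Vec&lt;Box&lt;u32&gt;&gt; {
    let mut v: Vec&lt;Box&lt;u32&gt;&gt; = Vec::with_capacity(n);
    unsafe {
        let base = v.as_mut_ptr();
        for i in 0..n {
            // The slot is uninitialized, so write without dropping.
            ptr::write(base.add(i), Box::new(i as u32));
            // Grow the length only after the slot is initialized. If a later
            // iteration panicked, the Vec would drop exactly the elements
            // written so far and nothing else.
            v.set_len(i + 1);
        }
    }
    v
}

fn main() {
    let v = boxed_range(3);
    assert_eq!(v, vec![Box::new(0), Box::new(1), Box::new(2)]);
}
</code></pre>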
<p>And that's about it for working with uninitialized memory! Basically nothing
anywhere expects to be handed uninitialized memory, so if you're going to pass
it around at all, be sure to be <em>really</em> careful.</p>
<a class="header" href="print.html#the-perils-of-ownership-based-resource-management-obrm" id="the-perils-of-ownership-based-resource-management-obrm"><h1>The Perils Of Ownership Based Resource Management (OBRM)</h1></a>
<p>OBRM (AKA RAII: Resource Acquisition Is Initialization) is something you'll
interact with a lot in Rust, especially if you use the standard library.</p>
<p>Roughly speaking the pattern is as follows: to acquire a resource, you create an
object that manages it. To release the resource, you simply destroy the object,
and it cleans up the resource for you. The most common &quot;resource&quot; this pattern
manages is simply <em>memory</em>. <code>Box</code>, <code>Rc</code>, and basically everything in
<code>std::collections</code> is a convenience to enable correctly managing memory. This is
particularly important in Rust because we have no pervasive GC to rely on for
memory management. Which is the point, really: Rust is about control. However we
are not limited to just memory. Pretty much every other system resource like a
thread, file, or socket is exposed through this kind of API.</p>
<a class="header" href="print.html#constructors" id="constructors"><h1>Constructors</h1></a>
<p>There is exactly one way to create an instance of a user-defined type: name it,
and initialize all its fields at once:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Foo {
    a: u8,
    b: u32,
    c: bool,
}

enum Bar {
    X(u32),
    Y(bool),
}

struct Unit;

let foo = Foo { a: 0, b: 1, c: false };
let bar = Bar::X(0);
let empty = Unit;
#}</code></pre></pre>
<p>That's it. Every other way you make an instance of a type is just calling a
totally vanilla function that does some stuff and eventually bottoms out to The
One True Constructor.</p>
<p>Unlike C++, Rust does not come with a slew of built-in kinds of constructor.
There are no Copy, Default, Assignment, Move, or whatever constructors. The
reasons for this are varied, but it largely boils down to Rust's philosophy of
<em>being explicit</em>.</p>
<p>Move constructors are meaningless in Rust because we don't enable types to
&quot;care&quot; about their location in memory. Every type must be ready for it to be
blindly memcopied to somewhere else in memory. This means pure
on-the-stack-but-still-movable intrusive linked lists are simply not happening
in Rust (safely).</p>
<p>Assignment and copy constructors similarly don't exist because move semantics
are the only semantics in Rust. At most <code>x = y</code> just moves the bits of y into
the x variable. Rust does provide two facilities for providing C++'s
copy-oriented semantics: <code>Copy</code> and <code>Clone</code>. Clone is our moral equivalent of a copy
constructor, but it's never implicitly invoked. You have to explicitly call
<code>clone</code> on an element you want to be cloned. Copy is a special case of Clone
where the implementation is just &quot;copy the bits&quot;. Copy types <em>are</em> implicitly
cloned whenever they're moved, but because of the definition of Copy this just
means not treating the old copy as uninitialized -- a no-op.</p>
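<p>The difference is easy to see side by side. In this sketch <code>Point</code> and <code>Label</code>
are made-up types: one is Copy, the other is only Clone.</p>
<pre><code class="language-rust">#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Clone, Debug, PartialEq)]
struct Label(String);

fn copy_then_use(p: Point) -&gt; (Point, Point) {
    let q = p; // the bits are copied; p stays initialized because Point is Copy
    (p, q)
}

fn clone_then_move(a: Label) -&gt; (Label, Label) {
    let b = a.clone(); // explicit call; nothing is ever cloned implicitly
    let c = a;         // plain move: a is now logically uninitialized
    (b, c)
}

fn main() {
    let (p, q) = copy_then_use(Point { x: 1, y: 2 });
    assert_eq!(p, q);

    let (b, c) = clone_then_move(Label(String::from("hi")));
    assert_eq!(b, c);
}
</code></pre>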
<p>While Rust provides a <code>Default</code> trait for specifying the moral equivalent of a
default constructor, it's incredibly rare for this trait to be used. This is
because variables <a href="uninitialized.html">aren't implicitly initialized</a>. Default is basically
only useful for generic programming. In concrete contexts, a type will provide a
static <code>new</code> method for any kind of &quot;default&quot; constructor. This has no relation
to <code>new</code> in other languages and has no special meaning. It's just a naming
convention.</p>
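<p>As a sketch of how these pieces usually fit together, here is a hypothetical
<code>Retry</code> type. Its <code>new</code> is an ordinary associated function that bottoms out in
the struct literal, and <code>Default</code> only matters to the generic helper <code>fresh</code>.</p>
<pre><code class="language-rust">#[derive(Debug, PartialEq)]
struct Retry {
    attempts: u32,
    verbose: bool,
}

impl Retry {
    // Just a naming convention: a plain function that ends in the one true
    // constructor, the struct literal.
    fn new(attempts: u32) -&gt; Retry {
        Retry { attempts: attempts, verbose: false }
    }
}

impl Default for Retry {
    fn default() -&gt; Retry {
        Retry::new(3)
    }
}

// Generic code cannot name a concrete `new`, so it asks for Default instead.
fn fresh&lt;T: Default&gt;() -&gt; T {
    T::default()
}

fn main() {
    assert_eq!(fresh::&lt;Retry&gt;(), Retry::new(3));
    assert_eq!(fresh::&lt;u32&gt;(), 0);
}
</code></pre>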
<p>TODO: talk about &quot;placement new&quot;?</p>
<a class="header" href="print.html#destructors" id="destructors"><h1>Destructors</h1></a>
<p>What the language <em>does</em> provide is full-blown automatic destructors through the
<code>Drop</code> trait, which provides the following method:</p>
<pre><code class="language-rust ignore">fn drop(&amp;mut self);
</code></pre>
<p>This method gives the type time to somehow finish what it was doing.</p>
<p><strong>After <code>drop</code> is run, Rust will recursively try to drop all of the fields
of <code>self</code>.</strong></p>
<p>This is a convenience feature so that you don't have to write &quot;destructor
boilerplate&quot; to drop children. If a struct has no special logic for being
dropped other than dropping its children, then it means <code>Drop</code> doesn't need to
be implemented at all!</p>
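<p>The recursive behavior can be made visible with a log. In this sketch the
<code>Parent</code> and <code>Child</code> types are invented for illustration. The parent's <code>drop</code>
runs first, and only afterwards are its fields dropped, in declaration order.</p>
<pre><code class="language-rust">use std::cell::RefCell;

struct Child&lt;'a&gt; {
    name: &amp;'static str,
    log: &amp;'a RefCell&lt;Vec&lt;&amp;'static str&gt;&gt;,
}

impl&lt;'a&gt; Drop for Child&lt;'a&gt; {
    fn drop(&amp;mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

#[allow(dead_code)]
struct Parent&lt;'a&gt; {
    first: Child&lt;'a&gt;,
    second: Child&lt;'a&gt;,
    log: &amp;'a RefCell&lt;Vec&lt;&amp;'static str&gt;&gt;,
}

impl&lt;'a&gt; Drop for Parent&lt;'a&gt; {
    fn drop(&amp;mut self) {
        // Runs before any field is dropped; the fields are still usable here.
        self.log.borrow_mut().push(&quot;parent&quot;);
    }
}

// Returns the order in which the destructors ran.
fn drop_order() -&gt; Vec&lt;&amp;'static str&gt; {
    let log = RefCell::new(Vec::new());
    {
        let _parent = Parent {
            first: Child { name: &quot;first&quot;, log: &amp;log },
            second: Child { name: &quot;second&quot;, log: &amp;log },
            log: &amp;log,
        };
    }
    log.into_inner()
}

fn main() {
    assert_eq!(drop_order(), vec![&quot;parent&quot;, &quot;first&quot;, &quot;second&quot;]);
}
</code></pre>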
<p><strong>There is no stable way to prevent this behavior in Rust 1.0.</strong></p>
<p>Note that taking <code>&amp;mut self</code> means that even if you could suppress recursive
Drop, Rust will prevent you from e.g. moving fields out of self. For most types,
this is totally fine.</p>
<p>For instance, a custom implementation of <code>Box</code> might write <code>Drop</code> like this:</p>
<pre><pre class="playpen"><code class="language-rust">#![feature(ptr_internals, allocator_api, unique)]

use std::alloc::{Global, GlobalAlloc, Layout};
use std::mem;
use std::ptr::{drop_in_place, Unique};

struct Box&lt;T&gt;{ ptr: Unique&lt;T&gt; }

impl&lt;T&gt; Drop for Box&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            drop_in_place(self.ptr.as_ptr());
            Global.dealloc(self.ptr.as_ptr() as *mut _, Layout::new::&lt;T&gt;())
        }
    }
}
# fn main() {}
</code></pre></pre>
<p>and this works fine because when Rust goes to drop the <code>ptr</code> field it just sees
a <a href="phantom-data.html">Unique</a> that has no actual <code>Drop</code> implementation. Similarly nothing can
use-after-free the <code>ptr</code> because when drop exits, it becomes inaccessible.</p>
<p>However this wouldn't work:</p>
<pre><pre class="playpen"><code class="language-rust">#![feature(allocator_api, ptr_internals, unique)]

use std::alloc::{Global, GlobalAlloc, Layout};
use std::ptr::{drop_in_place, Unique};
use std::mem;

struct Box&lt;T&gt;{ ptr: Unique&lt;T&gt; }

impl&lt;T&gt; Drop for Box&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            drop_in_place(self.ptr.as_ptr());
            Global.dealloc(self.ptr.as_ptr() as *mut _, Layout::new::&lt;T&gt;());
        }
    }
}

struct SuperBox&lt;T&gt; { my_box: Box&lt;T&gt; }

impl&lt;T&gt; Drop for SuperBox&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            // Hyper-optimized: deallocate the box's contents for it
            // without `drop`ing the contents
            Global.dealloc(self.my_box.ptr.as_ptr() as *mut _, Layout::new::&lt;T&gt;());
        }
    }
}
# fn main() {}
</code></pre></pre>
<p>After we deallocate the <code>box</code>'s ptr in SuperBox's destructor, Rust will
happily proceed to tell the box to Drop itself and everything will blow up with
use-after-frees and double-frees.</p>
<p>Note that the recursive drop behavior applies to all structs and enums
regardless of whether they implement Drop. Therefore something like</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct Boxy&lt;T&gt; {
    data1: Box&lt;T&gt;,
    data2: Box&lt;T&gt;,
    info: u32,
}
#}</code></pre></pre>
<p>will have the destructors of its <code>data1</code> and <code>data2</code> fields run whenever it
&quot;would&quot; be dropped, even though it itself doesn't implement Drop. We say that
such a type <em>needs Drop</em>, even though it is not itself Drop.</p>
<p>Similarly,</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
enum Link {
    Next(Box&lt;Link&gt;),
    None,
}
#}</code></pre></pre>
<p>will have its inner Box field dropped if and only if an instance stores the
Next variant.</p>
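<p>Both claims can be checked by counting destructor calls. This sketch uses
made-up types (<code>Counted</code>, <code>Pair</code>, <code>Slot</code>) in place of the ones above. It also
asks <code>mem::needs_drop</code> about <code>Pair</code>, which must answer <code>true</code> for a type whose
fields have destructors.</p>
<pre><code class="language-rust">use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical type that counts how often its destructor runs.
struct Counted(Rc&lt;Cell&lt;u32&gt;&gt;);

impl Drop for Counted {
    fn drop(&amp;mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// No Drop impl here, yet both fields are still dropped.
#[allow(dead_code)]
struct Pair {
    first: Counted,
    second: Counted,
}

#[allow(dead_code)]
enum Slot {
    Full(Counted),
    Empty,
}

fn pair_drops() -&gt; u32 {
    let count = Rc::new(Cell::new(0));
    {
        let _pair = Pair {
            first: Counted(count.clone()),
            second: Counted(count.clone()),
        };
    }
    count.get()
}

fn slot_drops(full: bool) -&gt; u32 {
    let count = Rc::new(Cell::new(0));
    {
        let _slot = if full {
            Slot::Full(Counted(count.clone()))
        } else {
            Slot::Empty
        };
    }
    count.get()
}

fn main() {
    assert!(mem::needs_drop::&lt;Pair&gt;());
    assert_eq!(pair_drops(), 2);
    // The payload is dropped only when the variant that holds it is stored.
    assert_eq!(slot_drops(true), 1);
    assert_eq!(slot_drops(false), 0);
}
</code></pre>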
<p>In general this works really nicely because you don't need to worry about
adding/removing drops when you refactor your data layout. Still, there are
certainly many valid use cases for needing to do trickier things with
destructors.</p>
<p>The classic safe solution to overriding recursive drop and allowing moving out
of Self during <code>drop</code> is to use an Option:</p>
<pre><pre class="playpen"><code class="language-rust">#![feature(allocator_api, ptr_internals, unique)]

use std::alloc::{GlobalAlloc, Global, Layout};
use std::ptr::{drop_in_place, Unique};
use std::mem;

struct Box&lt;T&gt;{ ptr: Unique&lt;T&gt; }

impl&lt;T&gt; Drop for Box&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            drop_in_place(self.ptr.as_ptr());
            Global.dealloc(self.ptr.as_ptr() as *mut _, Layout::new::&lt;T&gt;());
        }
    }
}

struct SuperBox&lt;T&gt; { my_box: Option&lt;Box&lt;T&gt;&gt; }

impl&lt;T&gt; Drop for SuperBox&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            // Hyper-optimized: deallocate the box's contents for it
            // without `drop`ing the contents. Need to set the `box`
            // field as `None` to prevent Rust from trying to Drop it.
            let my_box = self.my_box.take().unwrap();
            Global.dealloc(my_box.ptr.as_ptr() as *mut _, Layout::new::&lt;T&gt;());
            mem::forget(my_box);
        }
    }
}
# fn main() {}
</code></pre></pre>
<p>However this has fairly odd semantics: you're saying that a field that <em>should</em>
always be Some <em>may</em> be None, just because that happens in the destructor. Of
course this conversely makes a lot of sense: you can call arbitrary methods on
self during the destructor, and this should prevent you from ever doing so after
deinitializing the field. Not that it will prevent you from producing any other
arbitrarily invalid state in there.</p>
<p>On balance this is an ok choice. Certainly what you should reach for by default.
However, in the future we expect there to be a first-class way to announce that
a field shouldn't be automatically dropped.</p>
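<p>The same Option trick shows up in entirely safe code whenever a destructor has
to call something that takes <code>self</code> by value. Here is a sketch on stable Rust;
the <code>Worker</code> type is hypothetical, and it joins its thread when dropped.</p>
<pre><code class="language-rust">use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

struct Worker {
    // Always Some until the destructor runs.
    handle: Option&lt;JoinHandle&lt;()&gt;&gt;,
}

impl Worker {
    fn spawn(done: Arc&lt;AtomicBool&gt;) -&gt; Worker {
        let handle = thread::spawn(move || done.store(true, Ordering::SeqCst));
        Worker { handle: Some(handle) }
    }
}

impl Drop for Worker {
    fn drop(&amp;mut self) {
        // JoinHandle::join consumes the handle, but we only have &amp;mut self.
        // take() moves the handle out and leaves None behind, so the
        // recursive drop of the field afterwards has nothing left to do.
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

// Returns whether the worker's closure had run by the time drop returned.
fn finished_after_drop() -&gt; bool {
    let done = Arc::new(AtomicBool::new(false));
    drop(Worker::spawn(done.clone()));
    done.load(Ordering::SeqCst)
}

fn main() {
    assert!(finished_after_drop());
}
</code></pre>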
<a class="header" href="print.html#leaking" id="leaking"><h1>Leaking</h1></a>
<p>Ownership-based resource management is intended to simplify composition. You
acquire resources when you create the object, and you release the resources when
it gets destroyed. Since destruction is handled for you, it means you can't
forget to release the resources, and it happens as soon as possible! Surely this
is perfect and all of our problems are solved.</p>
<p>Everything is terrible and we have new and exotic problems to try to solve.</p>
<p>Many people like to believe that Rust eliminates resource leaks. In practice,
this is basically true. You would be surprised to see a Safe Rust program
leak resources in an uncontrolled way.</p>
<p>However from a theoretical perspective this is absolutely not the case, no
matter how you look at it. In the strictest sense, &quot;leaking&quot; is so abstract as
to be unpreventable. It's quite trivial to initialize a collection at the start
of a program, fill it with tons of objects with destructors, and then enter an
infinite event loop that never refers to it. The collection will sit around
uselessly, holding on to its precious resources until the program terminates (at
which point all those resources would have been reclaimed by the OS anyway).</p>
<p>We may consider a more restricted form of leak: failing to drop a value that is
unreachable. Rust also doesn't prevent this. In fact Rust <em>has a function for
doing this</em>: <code>mem::forget</code>. This function consumes the value it is passed <em>and
then doesn't run its destructor</em>.</p>
<p>In the past <code>mem::forget</code> was marked as unsafe as a sort of lint against using
it, since failing to call a destructor is generally not a well-behaved thing to
do (though useful for some special unsafe code). However this was generally
determined to be an untenable stance to take: there are many ways to fail to
call a destructor in safe code. The most famous example is creating a cycle of
reference-counted pointers using interior mutability.</p>
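<p>Here is a sketch of that famous example, using a made-up <code>Node</code> type that
counts its own destructor calls. Without a cycle both nodes are dropped. With a
cycle, or with <code>mem::forget</code>, no destructor ever runs, and no <code>unsafe</code> is needed
to get there.</p>
<pre><code class="language-rust">use std::cell::{Cell, RefCell};
use std::mem;
use std::rc::Rc;

struct Node {
    next: RefCell&lt;Option&lt;Rc&lt;Node&gt;&gt;&gt;,
    drops: Rc&lt;Cell&lt;u32&gt;&gt;,
}

impl Drop for Node {
    fn drop(&amp;mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn node(drops: &amp;Rc&lt;Cell&lt;u32&gt;&gt;) -&gt; Rc&lt;Node&gt; {
    Rc::new(Node { next: RefCell::new(None), drops: drops.clone() })
}

// Links a -&gt; b, and optionally b -&gt; a, then lets both handles go out of scope.
fn drops_after_scope(make_cycle: bool) -&gt; u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = node(&amp;drops);
        let b = node(&amp;drops);
        *a.next.borrow_mut() = Some(b.clone());
        if make_cycle {
            // Each node now keeps the other alive forever.
            *b.next.borrow_mut() = Some(a.clone());
        }
    }
    drops.get()
}

fn drops_after_forget() -&gt; u32 {
    let drops = Rc::new(Cell::new(0));
    mem::forget(node(&amp;drops));
    drops.get()
}

fn main() {
    assert_eq!(drops_after_scope(false), 2);
    assert_eq!(drops_after_scope(true), 0);
    assert_eq!(drops_after_forget(), 0);
}
</code></pre>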
<p>It is reasonable for safe code to assume that destructor leaks do not happen, as
any program that leaks destructors is probably wrong. However <em>unsafe</em> code
cannot rely on destructors to be run in order to be safe. For most types this
doesn't matter: if you leak the destructor then the type is by definition
inaccessible, so it doesn't matter, right? For instance, if you leak a <code>Box&lt;u8&gt;</code>
then you waste some memory but that's hardly going to violate memory-safety.</p>
<p>However, we must be careful about destructor leaks with <em>proxy</em> types. These
are types which manage access to a distinct object, but don't actually own it.
Proxy objects are quite rare. Proxy objects you'll need to care about are even
rarer. However we'll focus on three interesting examples in the standard
library:</p>
<ul>
<li><code>vec::Drain</code></li>
<li><code>Rc</code></li>
<li><code>thread::scoped::JoinGuard</code></li>
</ul>
<a class="header" href="print.html#drain" id="drain"><h2>Drain</h2></a>
<p><code>drain</code> is a collections API that moves data out of the container without
consuming the container. This enables us to reuse the allocation of a <code>Vec</code>
after claiming ownership over all of its contents. It produces an iterator
(Drain) that returns the contents of the Vec by-value.</p>
<p>Now, consider Drain in the middle of iteration: some values have been moved out,
and others haven't. This means that part of the Vec is now full of logically
uninitialized data! We could backshift all the elements in the Vec every time we
remove a value, but this would have pretty catastrophic performance
consequences.</p>
<p>Instead, we would like Drain to fix the Vec's backing storage when it is
dropped. It should run itself to completion, backshift any elements that weren't
removed (drain supports subranges), and then fix Vec's <code>len</code>. It's even
unwinding-safe! Easy!</p>
<p>Now consider the following:</p>
<pre><code class="language-rust ignore">let mut vec = vec![Box::new(0); 4];

{
    // start draining, vec can no longer be accessed
    let mut drainer = vec.drain(..);

    // pull out two elements and immediately drop them
    drainer.next();
    drainer.next();

    // get rid of drainer, but don't call its destructor
    mem::forget(drainer);
}

// Oops, vec[0] was dropped, we're reading a pointer into free'd memory!
println!(&quot;{}&quot;, vec[0]);
</code></pre>
<p>This is pretty clearly Not Good. Unfortunately, we're kind of stuck between a
rock and a hard place: maintaining consistent state at every step has an
enormous cost (and would negate any benefits of the API). Failing to maintain
consistent state gives us Undefined Behavior in safe code (making the API
unsound).</p>
<p>So what can we do? Well, we can pick a trivially consistent state: set the Vec's
len to be 0 when we start the iteration, and fix it up if necessary in the
destructor. That way, if everything executes like normal we get the desired
behavior with minimal overhead. But if someone has the <em>audacity</em> to
mem::forget us in the middle of the iteration, all that does is <em>leak even more</em>
(and possibly leave the Vec in an unexpected but otherwise consistent state).
Since we've accepted that mem::forget is safe, this is definitely safe. We call
leaks causing more leaks a <em>leak amplification</em>.</p>
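<p>With the real <code>Vec::drain</code>, forgetting the iterator is merely wasteful. The
sketch below only checks what I'm confident can be relied on: the Vec is still
safe to use afterwards and has not grown. Exactly how many elements survive is
an implementation detail, so the code does not assert a specific length.</p>
<pre><code class="language-rust">use std::mem;

// Start draining, pull out two elements, then leak the iterator.
fn len_after_forgotten_drain() -&gt; usize {
    let mut v = vec![Box::new(0), Box::new(1), Box::new(2), Box::new(3)];
    {
        let mut drainer = v.drain(..);

        // These two boxes are moved out and dropped normally.
        drainer.next();
        drainer.next();

        // The destructor that would fix up the Vec never runs.
        mem::forget(drainer);
    }
    // Still safe: whatever the Vec reports as its contents is initialized.
    // Elements it no longer reports have been leaked, not freed.
    for b in &amp;v {
        let _value: i32 = **b;
    }
    v.len()
}

fn main() {
    assert!(len_after_forgotten_drain() &lt;= 4);
}
</code></pre>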
<a class="header" href="print.html#rc" id="rc"><h2>Rc</h2></a>
<p>Rc is an interesting case because at first glance it doesn't appear to be a
proxy value at all. After all, it manages the data it points to, and dropping
all the Rcs for a value will drop that value. Leaking an Rc doesn't seem like it
would be particularly dangerous. It will leave the refcount permanently
incremented and prevent the data from being freed or dropped, but that seems
just like Box, right?</p>
<p>Nope.</p>
<p>Let's consider a simplified implementation of Rc:</p>
<pre><code class="language-rust ignore">struct Rc&lt;T&gt; {
    ptr: *mut RcBox&lt;T&gt;,
}

struct RcBox&lt;T&gt; {
    data: T,
    ref_count: usize,
}

impl&lt;T&gt; Rc&lt;T&gt; {
    fn new(data: T) -&gt; Self {
        unsafe {
            // Wouldn't it be nice if heap::allocate worked like this?
            let ptr = heap::allocate::&lt;RcBox&lt;T&gt;&gt;();
            ptr::write(ptr, RcBox {
                data: data,
                ref_count: 1,
            });
            Rc { ptr: ptr }
        }
    }

    fn clone(&amp;self) -&gt; Self {
        unsafe {
            (*self.ptr).ref_count += 1;
        }
        Rc { ptr: self.ptr }
    }
}

impl&lt;T&gt; Drop for Rc&lt;T&gt; {
    fn drop(&amp;mut self) {
        unsafe {
            (*self.ptr).ref_count -= 1;
            if (*self.ptr).ref_count == 0 {
                // drop the data and then free it
                ptr::read(self.ptr);
                heap::deallocate(self.ptr);
            }
        }
    }
}
</code></pre>
<p>This code contains an implicit and subtle assumption: <code>ref_count</code> can fit in a
<code>usize</code>, because there can't be more than <code>usize::MAX</code> Rcs in memory. However
this itself assumes that the <code>ref_count</code> accurately reflects the number of Rcs
in memory, which we know is false with <code>mem::forget</code>. Using <code>mem::forget</code> we can
overflow the <code>ref_count</code>, and then get it down to 0 with outstanding Rcs. Then
we can happily use-after-free the inner data. Bad Bad Not Good.</p>
<p>This can be solved by just checking the <code>ref_count</code> and doing <em>something</em>. The
standard library's stance is to just abort, because your program has become
horribly degenerate. Also <em>oh my gosh</em> it's such a ridiculous corner case.</p>
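<p>A minimal sketch of such a check follows. It is not the standard library's
actual code: <code>checked_increment</code> and <code>increment_or_abort</code> are hypothetical
helpers that a <code>clone</code> like the one above could call instead of a bare
<code>+= 1</code>.</p>
<pre><code class="language-rust">use std::process;

// The new count, or None if incrementing would overflow.
fn checked_increment(ref_count: usize) -&gt; Option&lt;usize&gt; {
    ref_count.checked_add(1)
}

// Bump the count in place, aborting the whole process on overflow rather
// than letting the count wrap around towards 0.
fn increment_or_abort(ref_count: &amp;mut usize) {
    match checked_increment(*ref_count) {
        Some(n) =&gt; *ref_count = n,
        None =&gt; process::abort(),
    }
}

fn main() {
    let mut count = 1;
    increment_or_abort(&amp;mut count);
    assert_eq!(count, 2);
    assert_eq!(checked_increment(std::usize::MAX), None);
}
</code></pre>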
<a class="header" href="print.html#threadscopedjoinguard" id="threadscopedjoinguard"><h2>thread::scoped::JoinGuard</h2></a>
<p>The thread::scoped API intends to allow threads to be spawned that reference
data on their parent's stack without any synchronization over that data by
ensuring the parent joins the thread before any of the shared data goes out
of scope.</p>
<pre><code class="language-rust ignore">pub fn scoped&lt;'a, F&gt;(f: F) -&gt; JoinGuard&lt;'a&gt;
    where F: FnOnce() + Send + 'a
</code></pre>
<p>Here <code>f</code> is some closure for the other thread to execute. Saying that
<code>F: Send + 'a</code> means that it closes over data that lives for <code>'a</code>, and that it
either owns that data or the data is Sync (implying <code>&amp;data</code> is Send).</p>
<p>Because JoinGuard has a lifetime, it keeps all the data it closes over
borrowed in the parent thread. This means the JoinGuard can't outlive
the data that the other thread is working on. When the JoinGuard <em>does</em> get
dropped it blocks the parent thread, ensuring the child terminates before any
of the closed-over data goes out of scope in the parent.</p>
<p>Usage looked like:</p>
<pre><code class="language-rust ignore">let mut data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
{
    let mut guards = vec![];
    for x in &amp;mut data {
        // Move the mutable reference into the closure, and execute
        // it on a different thread. The closure has a lifetime bound
        // by the lifetime of the mutable reference `x` we store in it.
        // The guard that is returned is in turn assigned the lifetime
        // of the closure, so it also mutably borrows `data` as `x` did.
        // This means we cannot access `data` until the guard goes away.
        let guard = thread::scoped(move || {
            *x *= 2;
        });
        // store the thread's guard for later
        guards.push(guard);
    }
    // All guards are dropped here, forcing the threads to join
    // (this thread blocks here until the others terminate).
    // Once the threads join, the borrow expires and the data becomes
    // accessible again in this thread.
}
// data is definitely mutated here.
</code></pre>
<p>In principle, this totally works! Rust's ownership system perfectly ensures it!
...except it relies on a destructor being called to be safe.</p>
<pre><code class="language-rust ignore">let mut data = Box::new(0);
{
    let guard = thread::scoped(|| {
        // This is at best a data race. At worst, it's also a use-after-free.
        *data += 1;
    });
    // Because the guard is forgotten, the loan expires without blocking this
    // thread.
    mem::forget(guard);
}
// So the Box is dropped here while the scoped thread may or may not be trying
// to access it.
</code></pre>
<p>Dang. Here the destructor running was pretty fundamental to the API, and it had
to be scrapped in favor of a completely different design.</p>
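<p>One closure-based design of this kind is <code>std::thread::scope</code>, which was added to the standard library in Rust 1.63, well after this text was written. The sketch below assumes a recent toolchain; <code>double_all</code> is a hypothetical helper name. The join happens inside <code>scope</code> itself rather than in a destructor, so there is no guard for the caller to forget.</p>

```rust
use std::thread;

/// Doubles every element, using one scoped thread per element.
fn double_all(data: &mut [i32]) {
    thread::scope(|s| {
        for x in data.iter_mut() {
            // Each thread takes exclusive ownership of one `&mut i32`.
            s.spawn(move || {
                *x *= 2;
            });
        }
        // Every spawned thread is joined before `scope` returns,
        // whether or not the handles were kept around.
    });
}
```

<p>After <code>double_all</code> returns, the borrow of <code>data</code> has ended and all writes are visible to the caller.</p>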
<a class="header" href="print.html#unwinding" id="unwinding"><h1>Unwinding</h1></a>
<p>Rust has a <em>tiered</em> error-handling scheme:</p>
<ul>
<li>If something might reasonably be absent, Option is used.</li>
<li>If something goes wrong and can reasonably be handled, Result is used.</li>
<li>If something goes wrong and cannot reasonably be handled, the thread panics.</li>
<li>If something catastrophic happens, the program aborts.</li>
</ul>
<p>Option and Result are overwhelmingly preferred in most situations, especially
since they can be promoted into a panic or abort at the API user's discretion.
Panics cause the thread to halt normal execution and unwind its stack, calling
destructors as if every function instantly returned.</p>
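<p>The first three tiers can be sketched as follows. The function names are hypothetical and exist only for illustration; note how the caller chooses to promote a <code>Result</code> into a panic with <code>expect</code>.</p>

```rust
use std::num::ParseIntError;

/// Absence is reasonable here, so return an Option.
fn first_char(s: &str) -> Option<char> {
    s.chars().next()
}

/// Failure can reasonably be handled by the caller, so return a Result.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse::<u16>()
}

/// The caller decides the error cannot be handled and promotes it to a panic.
fn port_or_panic(s: &str) -> u16 {
    parse_port(s).expect("port must be a valid u16")
}
```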
<p>As of 1.0, Rust is of two minds when it comes to panics. In the long-long-ago,
Rust was much more like Erlang. Like Erlang, Rust had lightweight tasks,
and tasks were intended to kill themselves with a panic when they reached an
untenable state. Unlike an exception in Java or C++, a panic could not be
caught at any time. Panics could only be caught by the owner of the task, at which
point they had to be handled or <em>that</em> task would itself panic.</p>
<p>Unwinding was important to this story because if a task's
destructors weren't called, it would cause memory and other system resources to
leak. Since tasks were expected to die during normal execution, this would make
Rust very poor for long-running systems!</p>
<p>As the Rust we know today came to be, this style of programming grew out of
fashion in the push for less-and-less abstraction. Light-weight tasks were
killed in the name of heavy-weight OS threads. Still, on stable Rust as of 1.0
panics can only be caught by the parent thread. This means catching a panic
requires spinning up an entire OS thread! This unfortunately stands in conflict
with Rust's philosophy of zero-cost abstractions.</p>
<p>The unstable <code>catch_panic</code> API, since stabilized as <code>std::panic::catch_unwind</code>
(Rust 1.9), enables catching a panic without spawning a thread. Still, we would encourage you to only do this
sparingly. In particular, Rust's current unwinding implementation is heavily
optimized for the &quot;doesn't unwind&quot; case. If a program doesn't unwind, there
should be no runtime cost for the program being <em>ready</em> to unwind. As a
consequence, actually unwinding will be more expensive than in e.g. Java.
Don't build your programs to unwind under normal circumstances. Ideally, you
should only panic for programming errors or <em>extreme</em> problems.</p>
<p>Rust's unwinding strategy is not specified to be fundamentally compatible
with any other language's unwinding. As such, unwinding into Rust from another
language, or unwinding into another language from Rust is Undefined Behavior.
You must <em>absolutely</em> catch any panics at the FFI boundary! What you do at that
point is up to you, but <em>something</em> must be done. If you fail to do this,
at best, your application will crash and burn. At worst, your application <em>won't</em>
crash and burn, and will proceed with completely clobbered state.</p>
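<p>A minimal sketch of catching a panic at an FFI boundary, using <code>std::panic::catch_unwind</code>. The function <code>div_checked</code> and its status-code convention are assumptions made for illustration, and this only works when panics unwind (it cannot help under <code>panic=abort</code>).</p>

```rust
use std::panic;

/// Hypothetical C-callable entry point.
/// Returns 0 and writes the quotient to `out` on success,
/// or returns 1 if the Rust code panicked.
pub extern "C" fn div_checked(a: i32, b: i32, out: &mut i32) -> i32 {
    // Integer division panics on a zero divisor or on overflow.
    match panic::catch_unwind(|| a / b) {
        Ok(quotient) => {
            *out = quotient;
            0
        }
        // The panic stops here instead of unwinding into the caller.
        Err(_) => 1,
    }
}
```

<p>What the error branch does is up to you; the point is only that the panic never crosses the boundary.</p>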
<a class="header" href="print.html#exception-safety" id="exception-safety"><h1>Exception Safety</h1></a>
<p>Although programs should use unwinding sparingly, there's a lot of code that
<em>can</em> panic. If you unwrap a None, index out of bounds, or divide by 0, your
program will panic. On debug builds, every arithmetic operation can panic
if it overflows. Unless you are very careful and tightly control what code runs,
pretty much everything can unwind, and you need to be ready for it.</p>
<p>Being ready for unwinding is often referred to as <em>exception safety</em>
in the broader programming world. In Rust, there are two levels of exception
safety that one may concern themselves with:</p>
<ul>
<li>
<p>In unsafe code, we <em>must</em> be exception safe to the point of not violating
memory safety. We'll call this <em>minimal</em> exception safety.</p>
</li>
<li>
<p>In safe code, it is <em>good</em> to be exception safe to the point of your program
doing the right thing. We'll call this <em>maximal</em> exception safety.</p>
</li>
</ul>
<p>As is the case in many places in Rust, Unsafe code must be ready to deal with
bad Safe code when it comes to unwinding. Code that transiently creates
unsound states must be careful that a panic does not cause that state to be
used. Generally this means ensuring that only non-panicking code is run while
these states exist, or making a guard that cleans up the state in the case of
a panic. This does not necessarily mean that the state a panic witnesses is a
fully coherent state. We need only guarantee that it's a <em>safe</em> state.</p>
<p>Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe.
It controls all the code that runs, and most of that code can't panic. However
it is not uncommon for Unsafe code to work with arrays of temporarily
uninitialized data while repeatedly invoking caller-provided code. Such code
needs to be careful and consider exception safety.</p>
<a class="header" href="print.html#vecpush_all" id="vecpush_all"><h2>Vec::push_all</h2></a>
<p><code>Vec::push_all</code> is a temporary hack to make extending a Vec by a slice reliably
efficient without specialization. Here's a simple implementation:</p>
<pre><code class="language-rust ignore">impl&lt;T: Clone&gt; Vec&lt;T&gt; {
    fn push_all(&amp;mut self, to_push: &amp;[T]) {
        self.reserve(to_push.len());
        unsafe {
            // can't overflow because we just reserved this
            let old_len = self.len();
            self.set_len(old_len + to_push.len());

            for (i, x) in to_push.iter().enumerate() {
                // write past the existing elements, not over them
                self.ptr().offset((old_len + i) as isize).write(x.clone());
            }
        }
    }
}
</code></pre>
<p>We bypass <code>push</code> in order to avoid redundant capacity and <code>len</code> checks on the
Vec that we definitely know has capacity. The logic is totally correct, except
there's a subtle problem with our code: it's not exception-safe! <code>set_len</code>,
<code>offset</code>, and <code>write</code> are all fine; <code>clone</code> is the panic bomb we over-looked.</p>
<p>Clone is completely out of our control, and is totally free to panic. If it
does, our function will exit early with the length of the Vec set too large. If
the Vec is looked at or dropped, uninitialized memory will be read!</p>
<p>The fix in this case is fairly simple. If we want to guarantee that the values
we <em>did</em> clone are dropped, we can set the <code>len</code> every loop iteration. If we
just want to guarantee that uninitialized memory can't be observed, we can set
the <code>len</code> after the loop.</p>
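<p>The first fix, setting the <code>len</code> on every iteration, might look like the sketch below. It is written as a free function over the real <code>std::vec::Vec</code> rather than as a method on the book's own Vec, which is an assumption made so the example is self-contained.</p>

```rust
/// Appends clones of `to_push` to `vec`, staying consistent if `clone` panics.
fn push_all<T: Clone>(vec: &mut Vec<T>, to_push: &[T]) {
    vec.reserve(to_push.len());
    let mut len = vec.len();
    for x in to_push {
        // `clone` runs first and may panic. If it does, the Vec's length
        // still covers only initialized elements.
        let value = x.clone();
        unsafe {
            // In bounds: `reserve` guaranteed capacity for index `len`.
            vec.as_mut_ptr().add(len).write(value);
            len += 1;
            vec.set_len(len);
        }
    }
}
```

<p>If <code>clone</code> panics partway through, the elements cloned so far are owned by the Vec and will be dropped with it.</p>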
<a class="header" href="print.html#binaryheapsift_up" id="binaryheapsift_up"><h2>BinaryHeap::sift_up</h2></a>
<p>Bubbling an element up a heap is a bit more complicated than extending a Vec.
The pseudocode is as follows:</p>
<pre><code class="language-text">bubble_up(heap, index):
    while index != 0 &amp;&amp; heap[index] &lt; heap[parent(index)]:
        heap.swap(index, parent(index))
        index = parent(index)

</code></pre>
<p>A literal transcription of this code to Rust is totally fine, but has an annoying
performance characteristic: the <code>self</code> element is swapped over and over again
uselessly. We would rather have the following:</p>
<pre><code class="language-text">bubble_up(heap, index):
    let elem = heap[index]
    while index != 0 &amp;&amp; elem &lt; heap[parent(index)]:
        heap[index] = heap[parent(index)]
        index = parent(index)
    heap[index] = elem
</code></pre>
<p>This code ensures that each element is copied as little as possible (it is in
fact necessary that elem be copied twice in general). However it now exposes
some exception safety trouble! At all times, there exist two copies of one
value. If we panic in this function something will be double-dropped.
Unfortunately, we also don't have full control of the code: that comparison is
user-defined!</p>
<p>Unlike Vec, the fix isn't as easy here. One option is to break the user-defined
code and the unsafe code into two separate phases:</p>
<pre><code class="language-text">bubble_up(heap, index):
    let end_index = index;
    while end_index != 0 &amp;&amp; heap[index] &lt; heap[parent(end_index)]:
        end_index = parent(end_index)

    let elem = heap[index]
    while index != end_index:
        heap[index] = heap[parent(index)]
        index = parent(index)
    heap[index] = elem
</code></pre>
<p>If the user-defined code blows up, that's no problem anymore, because we haven't
actually touched the state of the heap yet. Once we do start messing with the
heap, we're working with only data and functions that we trust, so there's no
concern of panics.</p>
<p>Perhaps you're not happy with this design. Surely it's cheating! And we have
to do the complex heap traversal <em>twice</em>! Alright, let's bite the bullet. Let's
intermix untrusted and unsafe code <em>for reals</em>.</p>
<p>If Rust had <code>try</code> and <code>finally</code> like in Java, we could do the following:</p>
<pre><code class="language-text">bubble_up(heap, index):
    let elem = heap[index]
    try:
        while index != 0 &amp;&amp; elem &lt; heap[parent(index)]:
            heap[index] = heap[parent(index)]
            index = parent(index)
    finally:
        heap[index] = elem
</code></pre>
<p>The basic idea is simple: if the comparison panics, we just toss the loose
element in the logically uninitialized index and bail out. Anyone who observes
the heap will see a potentially <em>inconsistent</em> heap, but at least it won't
cause any double-drops! If the algorithm terminates normally, then this
operation happens to coincide precisely with how we finish up regardless.</p>
<p>Sadly, Rust has no such construct, so we're going to need to roll our own! The
way to do this is to store the algorithm's state in a separate struct with a
destructor for the &quot;finally&quot; logic. Whether we panic or not, that destructor
will run and clean up after us.</p>
<pre><code class="language-rust ignore">struct Hole&lt;'a, T: 'a&gt; {
    data: &amp;'a mut [T],
    /// `elt` is always `Some` from new until drop.
    elt: Option&lt;T&gt;,
    pos: usize,
}

impl&lt;'a, T&gt; Hole&lt;'a, T&gt; {
    fn new(data: &amp;'a mut [T], pos: usize) -&gt; Self {
        unsafe {
            let elt = ptr::read(&amp;data[pos]);
            Hole {
                data: data,
                elt: Some(elt),
                pos: pos,
            }
        }
    }

    fn pos(&amp;self) -&gt; usize { self.pos }

    fn removed(&amp;self) -&gt; &amp;T { self.elt.as_ref().unwrap() }

    unsafe fn get(&amp;self, index: usize) -&gt; &amp;T { &amp;self.data[index] }

    unsafe fn move_to(&amp;mut self, index: usize) {
        let index_ptr: *const _ = &amp;self.data[index];
        let hole_ptr = &amp;mut self.data[self.pos];
        ptr::copy_nonoverlapping(index_ptr, hole_ptr, 1);
        self.pos = index;
    }
}

impl&lt;'a, T&gt; Drop for Hole&lt;'a, T&gt; {
    fn drop(&amp;mut self) {
        // fill the hole again
        unsafe {
            let pos = self.pos;
            ptr::write(&amp;mut self.data[pos], self.elt.take().unwrap());
        }
    }
}

impl&lt;T: Ord&gt; BinaryHeap&lt;T&gt; {
    fn sift_up(&amp;mut self, pos: usize) {
        unsafe {
            // Take out the value at `pos` and create a hole.
            let mut hole = Hole::new(&amp;mut self.data, pos);

            while hole.pos() != 0 {
                let parent = parent(hole.pos());
                if hole.removed() &lt;= hole.get(parent) { break }
                hole.move_to(parent);
            }
            // Hole will be unconditionally filled here; panic or not!
        }
    }
}
</code></pre>
<a class="header" href="print.html#poisoning" id="poisoning"><h1>Poisoning</h1></a>
<p>Although all unsafe code <em>must</em> ensure it has minimal exception safety, not all
types ensure <em>maximal</em> exception safety. Even if the type does, your code may
ascribe additional meaning to it. For instance, an integer is certainly
exception-safe, but has no semantics on its own. It's possible that code that
panics could fail to correctly update the integer, producing an inconsistent
program state.</p>
<p>This is <em>usually</em> fine, because anything that witnesses an exception is about
to get destroyed. For instance, if you send a Vec to another thread and that
thread panics, it doesn't matter if the Vec is in a weird state. It will be
dropped and go away forever. However some types are especially good at smuggling
values across the panic boundary.</p>
<p>These types may choose to explicitly <em>poison</em> themselves if they witness a panic.
Poisoning doesn't entail anything in particular. Generally it just means
preventing normal usage from proceeding. The most notable example of this is the
standard library's Mutex type. A Mutex will poison itself if one of its
MutexGuards (the thing it returns when a lock is obtained) is dropped during a
panic. Any future attempts to lock the Mutex will return an <code>Err</code> or panic.</p>
<p>Mutex does not poison for the sake of true safety in the sense that Rust normally cares about. It
poisons as a safeguard against blindly using the data that comes out of a Mutex
that has witnessed a panic while locked. The data in such a Mutex was likely in the
middle of being modified, and as such may be in an inconsistent or incomplete state.
It is important to note that one cannot violate memory safety with such a type
if it is correctly written. After all, it must be minimally exception-safe!</p>
<p>However if the Mutex contained, say, a BinaryHeap that does not actually have the
heap property, it's unlikely that any code that uses it will do
what the author intended. As such, the program should not proceed normally.
Still, if you're double-plus-sure that you can do <em>something</em> with the value,
the Mutex exposes a method to get the lock anyway. It <em>is</em> safe, after all.
Just maybe nonsense.</p>
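<p>A small sketch of poisoning in practice, using only <code>std::sync::Mutex</code>. The function name <code>poison_and_recover</code> is hypothetical. A thread panics while holding the lock, after which <code>lock</code> returns an <code>Err</code> that still lets us reach the data through <code>into_inner</code>.</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a Mutex from another thread, then reads the data anyway.
/// Returns whether the Mutex was poisoned, and a copy of its contents.
fn poison_and_recover() -> (bool, Vec<i32>) {
    let lock = Arc::new(Mutex::new(vec![1, 2, 3]));
    let other = Arc::clone(&lock);

    // The guard is dropped during the panic, which poisons the Mutex.
    let _ = thread::spawn(move || {
        let mut guard = other.lock().unwrap();
        guard.push(4);
        panic!("panicking while holding the lock");
    })
    .join();

    let was_poisoned = lock.is_poisoned();
    let data = match lock.lock() {
        Ok(guard) => guard.clone(),
        // Safe, but the data may be in a logically inconsistent state.
        Err(poisoned) => poisoned.into_inner().clone(),
    };
    (was_poisoned, data)
}
```

<p>Here the push had completed before the panic, so the recovered data happens to be sensible; in general that is the caller's problem to judge.</p>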
<a class="header" href="print.html#concurrency-and-parallelism" id="concurrency-and-parallelism"><h1>Concurrency and Parallelism</h1></a>
<p>Rust as a language doesn't <em>really</em> have an opinion on how to do concurrency or
parallelism. The standard library exposes OS threads and blocking sys-calls
because everyone has those, and they're uniform enough that you can provide
an abstraction over them in a relatively uncontroversial way. Message passing,
green threads, and async APIs are all diverse enough that any abstraction over
them tends to involve trade-offs that we weren't willing to commit to for 1.0.</p>
<p>However the way Rust models concurrency makes it relatively easy to design your own
concurrency paradigm as a library and have everyone else's code Just Work
with yours. Just require the right lifetimes and Send and Sync where appropriate
and you're off to the races. Or rather, off to the... not... having... races.</p>
<a class="header" href="print.html#data-races-and-race-conditions" id="data-races-and-race-conditions"><h1>Data Races and Race Conditions</h1></a>
<p>Safe Rust guarantees an absence of data races, which are defined as:</p>
<ul>
<li>two or more threads concurrently accessing a location of memory</li>
<li>at least one of them is a write</li>
<li>at least one of them is unsynchronized</li>
</ul>
<p>A data race has Undefined Behavior, and is therefore impossible to perform
in Safe Rust. Data races are <em>mostly</em> prevented through Rust's ownership system:
it's impossible to alias a mutable reference, so it's impossible to perform a
data race. Interior mutability makes this more complicated, which is largely why
we have the Send and Sync traits (see below).</p>
<p><strong>However Rust does not prevent general race conditions.</strong></p>
<p>This is pretty fundamentally impossible, and probably honestly undesirable. Your
hardware is racy, your OS is racy, the other programs on your computer are racy,
and the world this all runs in is racy. Any system that could genuinely claim to
prevent <em>all</em> race conditions would be pretty awful to use, if not just
incorrect.</p>
<p>So it's perfectly &quot;fine&quot; for a Safe Rust program to get deadlocked or do
something nonsensical with incorrect synchronization. Obviously such a program
isn't very good, but Rust can only hold your hand so far. Still, a race
condition can't violate memory safety in a Rust program on its own. Only in
conjunction with some other unsafe code can a race condition actually violate
memory safety. For instance:</p>
<pre><pre class="playpen"><code class="language-rust no_run">
# #![allow(unused_variables)]
#fn main() {
use std::thread;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

let data = vec![1, 2, 3, 4];
// Arc so that the memory the AtomicUsize is stored in still exists for
// the other thread to increment, even if we completely finish executing
// before it. Rust won't compile the program without it, because of the
// lifetime requirements of thread::spawn!
let idx = Arc::new(AtomicUsize::new(0));
let other_idx = idx.clone();

// `move` captures other_idx by-value, moving it into this thread
thread::spawn(move || {
    // It's ok to mutate idx because this value
    // is an atomic, so it can't cause a Data Race.
    other_idx.fetch_add(10, Ordering::SeqCst);
});

// Index with the value loaded from the atomic. This is safe because we
// read the atomic memory only once, and then pass a copy of that value
// to the Vec's indexing implementation. This indexing will be correctly
// bounds checked, and there's no chance of the value getting changed
// in the middle. However our program may panic if the thread we spawned
// managed to increment before this ran. This is a race condition because
// correct program execution (panicking is rarely correct) depends on the
// order of thread execution.
println!(&quot;{}&quot;, data[idx.load(Ordering::SeqCst)]);
#}</code></pre></pre>
<p>Combined with unsafe code, however, the same race can violate memory safety:</p>
<pre><pre class="playpen"><code class="language-rust no_run">
# #![allow(unused_variables)]
#fn main() {
use std::thread;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

let data = vec![1, 2, 3, 4];

let idx = Arc::new(AtomicUsize::new(0));
let other_idx = idx.clone();

// `move` captures other_idx by-value, moving it into this thread
thread::spawn(move || {
    // It's ok to mutate idx because this value
    // is an atomic, so it can't cause a Data Race.
    other_idx.fetch_add(10, Ordering::SeqCst);
});

if idx.load(Ordering::SeqCst) &lt; data.len() {
    unsafe {
        // Incorrectly loading the idx after we did the bounds check.
        // It could have changed. This is a race condition, *and dangerous*
        // because we decided to do `get_unchecked`, which is `unsafe`.
        println!(&quot;{}&quot;, data.get_unchecked(idx.load(Ordering::SeqCst)));
    }
}
#}</code></pre></pre>
<a class="header" href="print.html#send-and-sync" id="send-and-sync"><h1>Send and Sync</h1></a>
<p>Not everything obeys inherited mutability, though. Some types allow you to
have multiple aliases of a location in memory while mutating it. Unless these types use
synchronization to manage this access, they are absolutely not thread-safe. Rust
captures this through the <code>Send</code> and <code>Sync</code> traits.</p>
<ul>
<li>A type is Send if it is safe to send it to another thread.</li>
<li>A type is Sync if it is safe to share between threads (<code>&amp;T</code> is Send).</li>
</ul>
<p>Send and Sync are fundamental to Rust's concurrency story. As such, a
substantial amount of special tooling exists to make them work right. First and
foremost, they're <a href="safe-unsafe-meaning.html">unsafe traits</a>. This means that they are unsafe to
implement, and other unsafe code can assume that they are correctly
implemented. Since they're <em>marker traits</em> (they have no associated items like
methods), correctly implemented simply means that they have the intrinsic
properties an implementor should have. Incorrectly implementing Send or Sync can
cause Undefined Behavior.</p>
<p>Send and Sync are also automatically derived traits. This means that, unlike
every other trait, if a type is composed entirely of Send or Sync types, then it
is Send or Sync. Almost all primitives are Send and Sync, and as a consequence
pretty much all types you'll ever interact with are Send and Sync.</p>
<p>Major exceptions include:</p>
<ul>
<li>raw pointers are neither Send nor Sync (because they have no safety guards).</li>
<li><code>UnsafeCell</code> isn't Sync (and therefore <code>Cell</code> and <code>RefCell</code> aren't).</li>
<li><code>Rc</code> isn't Send or Sync (because the refcount is shared and unsynchronized).</li>
</ul>
<p><code>Rc</code> and <code>UnsafeCell</code> are very fundamentally not thread-safe: they enable
unsynchronized shared mutable state. However raw pointers are, strictly
speaking, marked as thread-unsafe as more of a <em>lint</em>. Doing anything useful
with a raw pointer requires dereferencing it, which is already unsafe. In that
sense, one could argue that it would be &quot;fine&quot; for them to be marked as thread
safe.</p>
<p>However it's important that they aren't thread-safe to prevent types that
contain them from being automatically marked as thread-safe. These types have
non-trivial untracked ownership, and it's unlikely that their author was
necessarily thinking hard about thread safety. In the case of <code>Rc</code>, we have a nice
example of a type that contains a <code>*mut</code> that is definitely not thread-safe.</p>
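<p>Whether a given type is Send or Sync can be checked at compile time with a pair of empty generic functions. This is a common idiom rather than anything specific to this book; the names <code>assert_send</code>, <code>assert_sync</code>, and <code>check_auto_traits</code> are hypothetical.</p>

```rust
use std::cell::Cell;
use std::sync::Arc;

// These compile only if `T` satisfies the bound; they do nothing at runtime.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn check_auto_traits() {
    // Composed entirely of Send and Sync parts, so derived automatically.
    assert_send::<Vec<i32>>();
    assert_sync::<Vec<i32>>();
    assert_send::<Arc<i32>>();
    assert_sync::<Arc<i32>>();

    // Cell can be sent to another thread, but not shared between threads.
    assert_send::<Cell<i32>>();

    // Uncommenting any of these lines is a compile error:
    // assert_sync::<Cell<i32>>();
    // assert_send::<std::rc::Rc<i32>>();
    // assert_send::<*mut u8>();
}
```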
<p>Types that aren't automatically derived can simply implement them if desired:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct MyBox(*mut u8);

unsafe impl Send for MyBox {}
unsafe impl Sync for MyBox {}
#}</code></pre></pre>
<p>In the <em>incredibly rare</em> case that a type is inappropriately automatically
derived to be Send or Sync, then one can also unimplement Send and Sync:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#![feature(optin_builtin_traits)]

#fn main() {
// I have some magic semantics for some synchronization primitive!
struct SpecialThreadToken(u8);

impl !Send for SpecialThreadToken {}
impl !Sync for SpecialThreadToken {}
#}</code></pre></pre>
<p>Note that <em>in and of itself</em> it is impossible to incorrectly derive Send and
Sync. Only types that are ascribed special meaning by other unsafe code can
possibly cause trouble by being incorrectly Send or Sync.</p>
<p>Most uses of raw pointers should be encapsulated behind a sufficient abstraction
that Send and Sync can be derived. For instance all of Rust's standard
collections are Send and Sync (when they contain Send and Sync types) in spite
of their pervasive use of raw pointers to manage allocations and complex ownership.
Similarly, most iterators into these collections are Send and Sync because they
largely behave like an <code>&amp;</code> or <code>&amp;mut</code> into the collection.</p>
<p>TODO: better explain what can or can't be Send or Sync. Sufficient to appeal
only to data races?</p>
<a class="header" href="print.html#atomics" id="atomics"><h1>Atomics</h1></a>
<p>Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed,
this model is quite complex and known to have <a href="http://plv.mpi-sws.org/c11comp/popl15.pdf">several flaws</a>.
Rather, it is a pragmatic concession to the fact that <em>everyone</em> is pretty bad
at modeling atomics. At the very least, we can benefit from existing tooling and
research around C.</p>
<p>Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to
properly understand in a practical way. If you want all the nitty-gritty
details, you should check out <a href="http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899">C's specification (Section 7.17)</a>.
Still, we'll try to cover the basics and some of the problems Rust developers
face.</p>
<p>The C11 memory model is fundamentally about trying to bridge the gap between the
semantics we want, the optimizations compilers want, and the inconsistent chaos
our hardware wants. <em>We</em> would like to just write programs and have them do
exactly what we said but, you know, fast. Wouldn't that be great?</p>
<a class="header" href="print.html#compiler-reordering" id="compiler-reordering"><h1>Compiler Reordering</h1></a>
<p>Compilers fundamentally want to be able to do all sorts of complicated
transformations to reduce data dependencies and eliminate dead code. In
particular, they may radically change the actual order of events, or make events
never occur! If we write something like</p>
<pre><code class="language-rust ignore">x = 1;
y = 3;
x = 2;
</code></pre>
<p>The compiler may conclude that it would be best if your program did</p>
<pre><code class="language-rust ignore">x = 2;
y = 3;
</code></pre>
<p>This has inverted the order of events and completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on <code>x</code> to actually be
assigned to 1 before <code>y</code> was assigned. We would like the compiler to be
able to make these kinds of optimizations, because they can seriously improve
performance. On the other hand, we'd also like to be able to depend on our
program <em>doing the thing we said</em>.</p>
<a class="header" href="print.html#hardware-reordering" id="hardware-reordering"><h1>Hardware Reordering</h1></a>
<p>On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our hardware might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of each
CPU core it is <em>so very far away</em> and <em>so very slow</em>. Each CPU would rather work
with its local cache of the data and go through all the anguish of
talking to shared memory only when it doesn't actually have that memory in
cache.</p>
<p>After all, that's the whole point of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in some order on <em>one</em> thread occur in the same
order on <em>another</em> thread. To guarantee this, we must issue special instructions
to the CPU telling it to be a bit less smart.</p>
<p>For instance, say we convince the compiler to emit this logic:</p>
<pre><code class="language-text">initial state: x = 0, y = 1

THREAD 1        THREAD 2
y = 3;          if x == 1 {
x = 1;              y *= 2;
                }
</code></pre>
<p>Ideally this program has 2 possible final states:</p>
<ul>
<li><code>y = 3</code>: (thread 2 did the check before thread 1 completed)</li>
<li><code>y = 6</code>: (thread 2 did the check after thread 1 completed)</li>
</ul>
<p>However there's a third potential state that the hardware enables:</p>
<ul>
<li><code>y = 2</code>: (thread 2 saw <code>x = 1</code>, but not <code>y = 3</code>, and then overwrote <code>y = 3</code>)</li>
</ul>
<p>It's worth noting that different kinds of CPU provide different guarantees. It
is common to separate hardware into two categories: strongly-ordered and
weakly-ordered. Most notably x86/64 provides strong ordering guarantees, while ARM
provides weak ordering guarantees. This has two consequences for concurrent
programming:</p>
<ul>
<li>
<p>Asking for stronger guarantees on strongly-ordered hardware may be cheap or
even free because they already provide strong guarantees unconditionally.
Weaker guarantees may only yield performance wins on weakly-ordered hardware.</p>
</li>
<li>
<p>Asking for guarantees that are too weak on strongly-ordered hardware is
more likely to <em>happen</em> to work, even though your program is strictly
incorrect. If possible, concurrent algorithms should be tested on
weakly-ordered hardware.</p>
</li>
</ul>
<a class="header" href="print.html#data-accesses" id="data-accesses"><h1>Data Accesses</h1></a>
<p>The C11 memory model attempts to bridge the gap by allowing us to talk about the
<em>causality</em> of our program. Generally, this is by establishing a <em>happens
before</em> relationship between parts of the program and the threads that are
running them. This gives the hardware and compiler room to optimize the program
more aggressively where a strict happens-before relationship isn't established,
but forces them to be more careful where one is established. The way we
communicate these relationships is through <em>data accesses</em> and <em>atomic
accesses</em>.</p>
<p>Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler on
the assumption that the program is single-threaded. The hardware is also free to
propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Most critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer <em>awful</em> semantics to try to write synchronized code with.
Actually, that's too weak.</p>
<p><strong>It is literally impossible to write correct synchronized code using only data
accesses.</strong></p>
<p>Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an <em>ordering</em> that
specifies what kind of relationship it establishes with other accesses. In
practice, this boils down to telling the compiler and hardware certain things
they <em>can't</em> do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes is:</p>
<ul>
<li>Sequentially Consistent (SeqCst)</li>
<li>Release</li>
<li>Acquire</li>
<li>Relaxed</li>
</ul>
<p>(Note: We explicitly do not expose the C11 <em>consume</em> ordering)</p>
<p>TODO: negative reasoning vs positive reasoning? TODO: &quot;can't forget to
synchronize&quot;</p>
<a class="header" href="print.html#sequentially-consistent" id="sequentially-consistent"><h1>Sequentially Consistent</h1></a>
<p>Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation
cannot be reordered: all accesses on one thread that happen before and after a
SeqCst access stay before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This
does not hold if you start using the weaker atomic orderings.</p>
<p>The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms sequential consistency involves
emitting memory fences.</p>
<p>In practice, sequential consistency is rarely necessary for program correctness.
However sequential consistency is definitely the right choice if you're not
confident about the other memory orders. Having your program run a bit slower
than it needs to is certainly better than it running incorrectly! It's also
mechanically trivial to downgrade atomic operations to have a weaker
consistency later on. Just change <code>SeqCst</code> to <code>Relaxed</code> and you're done! Of
course, proving that this transformation is <em>correct</em> is a whole other matter.</p>
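<p>To make the "single global execution" property concrete, here is a minimal sketch of
the classic two-thread store-buffering test. The function name <code>store_buffering_seqcst</code>
is made up for this illustration. With <code>SeqCst</code> everywhere, every interleaving puts
at least one store before the other thread's load, so the two loads can never both see <code>0</code>.</p>

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: runs the store-buffering test once and returns what
/// each thread read from the *other* thread's variable.
fn store_buffering_seqcst() -> (usize, usize) {
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(0));

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        x1.store(1, Ordering::SeqCst);
        y1.load(Ordering::SeqCst)
    });

    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let t2 = thread::spawn(move || {
        y2.store(1, Ordering::SeqCst);
        x2.load(Ordering::SeqCst)
    });

    // In any interleaving of the four SeqCst operations, whichever store
    // comes first is visible to the other thread's later load.
    (t1.join().unwrap(), t2.join().unwrap())
}
```

<p>Downgrading these four operations to weaker orderings removes that guarantee, which is
exactly the kind of change that is mechanically trivial but hard to prove correct.</p>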
<a class="header" href="print.html#acquire-release" id="acquire-release"><h1>Acquire-Release</h1></a>
<p>Acquire and Release are largely intended to be paired. Their names hint at their
use case: they're perfectly suited for acquiring and releasing locks, and
ensuring that critical sections don't overlap.</p>
<p>Intuitively, an acquire access ensures that every access after it stays after
it. However operations that occur before an acquire are free to be reordered to
occur after it. Similarly, a release access ensures that every access before it
stays before it. However operations that occur after a release are free to be
reordered to occur before it.</p>
<p>When thread A releases a location in memory and then thread B subsequently
acquires <em>the same</em> location in memory, causality is established. Every write
that happened before A's release will be observed by B after its acquisition.
However no causality is established with any other threads. Similarly, no
causality is established if A and B access <em>different</em> locations in memory.</p>
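<p>As a rough sketch of that causality rule, the following passes a value from one thread
to another through a flag. The names <code>publish_and_read</code>, <code>data</code> and
<code>ready</code> are invented for this example. The payload itself is only written and read
with <code>Relaxed</code>; it is the release store and acquire load on the <em>same</em> flag that
make the write visible.</p>

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: thread A publishes 42, the calling thread (B) reads it.
fn publish_and_read() -> usize {
    let data = Arc::new(AtomicUsize::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        d.store(42, Ordering::Relaxed);   // a write that happens before the release
        r.store(true, Ordering::Release); // A releases `ready`
    });

    // B acquires the same location; spin until the release store is observed.
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    // Every write before A's release is now visible to B.
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}
```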
<p>Basic use of release-acquire is therefore simple: you acquire a location of
memory to begin the critical section, and then release that location to end it.
For instance, a simple spinlock might look like:</p>
<pre><pre class="playpen"><code class="language-rust">use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(false)); // value answers &quot;am I locked?&quot;

    // ... distribute lock to threads somehow ...

    // Try to acquire the lock by setting it to true
    while lock.compare_and_swap(false, true, Ordering::Acquire) { }
    // broke out of the loop, so we successfully acquired the lock!

    // ... scary data accesses ...

    // ok we're done, release the lock
    lock.store(false, Ordering::Release);
}
</code></pre></pre>
<p>On strongly-ordered platforms most accesses have release or acquire semantics,
making release and acquire often totally free. This is not the case on
weakly-ordered platforms.</p>
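<p>Here is a slightly fuller sketch of the same spinlock idea, this time actually guarding
some data shared between threads. Everything in it (<code>SpinCounter</code>,
<code>run_spin_counter</code>) is hypothetical, and it assumes a toolchain that has
<code>compare_exchange_weak</code> and <code>std::hint::spin_loop</code>, which are newer than the
<code>compare_and_swap</code> call shown above.</p>

```rust
use std::cell::UnsafeCell;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical type: a counter protected by a hand-rolled spinlock.
struct SpinCounter {
    locked: AtomicBool,
    value: UnsafeCell<u64>,
}

// Sound only because every access to `value` happens while `locked` is held
// (or after all worker threads have been joined).
unsafe impl Sync for SpinCounter {}

impl SpinCounter {
    fn new() -> Self {
        SpinCounter { locked: AtomicBool::new(false), value: UnsafeCell::new(0) }
    }

    fn increment(&self) {
        // Acquire on success: accesses in the critical section stay after this.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // The "scary data access": a plain, non-atomic read-modify-write.
        unsafe { *self.value.get() += 1; }
        // Release: accesses in the critical section stay before this.
        self.locked.store(false, Ordering::Release);
    }
}

/// Hypothetical driver: `threads` threads each increment `iters` times.
fn run_spin_counter(threads: usize, iters: usize) -> u64 {
    let counter = Arc::new(SpinCounter::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || for _ in 0..iters { c.increment(); })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // All workers are joined, so nobody else can touch `value` any more.
    unsafe { *counter.value.get() }
}
```

<p>If the critical sections ever overlapped, increments would be lost and the total would
come up short.</p>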
<a class="header" href="print.html#relaxed" id="relaxed"><h1>Relaxed</h1></a>
<p>Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Even so, relaxed operations are still
atomic. That is, they don't count as data accesses and any read-modify-write
operations done to them occur atomically. Relaxed operations are appropriate for
things that you definitely want to happen, but don't particularly otherwise care
about. For instance, incrementing a counter can be safely done by multiple
threads using a relaxed <code>fetch_add</code> if you're not using the counter to
synchronize any other accesses.</p>
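<p>A minimal sketch of that counter, with the made-up name <code>count_events</code>. No
increment is ever lost because <code>fetch_add</code> is an atomic read-modify-write even when
relaxed; the final value is only read after the threads have been joined.</p>

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: `threads` threads each bump a shared counter
/// `per_thread` times using relaxed operations only.
fn count_events(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic, but establishes no happens-before with anything.
                    c.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}
```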
<p>There's rarely a benefit in making an operation relaxed on strongly-ordered
platforms, since they usually provide release-acquire semantics anyway. However
relaxed operations can be cheaper on weakly-ordered platforms.</p>
<a class="header" href="print.html#example-implementing-vec" id="example-implementing-vec"><h1>Example: Implementing Vec</h1></a>
<p>To bring everything together, we're going to write <code>std::vec::Vec</code> from scratch.
Because all the best tools for writing unsafe code are unstable, this
project will only work on nightly (as of Rust 1.9.0). With the exception of the
allocator API, much of the unstable code we'll use is expected to be stabilized
in a similar form as it is today.</p>
<p>However we will generally try to avoid unstable code where possible. In
particular we won't use any intrinsics that could make the code a little
bit nicer or more efficient, because intrinsics are permanently unstable, although
many intrinsics <em>do</em> become stabilized elsewhere (<code>std::ptr</code> and <code>std::mem</code>
consist of many intrinsics).</p>
<p>Ultimately this means our implementation may not take advantage of all
possible optimizations, though it will be by no means <em>naive</em>. We will
definitely get into the weeds over nitty-gritty details, even
when the problem doesn't <em>really</em> merit it.</p>
<p>You wanted advanced. We're gonna go advanced.</p>
<a class="header" href="print.html#layout" id="layout"><h1>Layout</h1></a>
<p>First off, we need to come up with the struct layout. A Vec has three parts:
a pointer to the allocation, the size of the allocation, and the number of
elements that have been initialized.</p>
<p>Naively, this means we just want this design:</p>
<pre><pre class="playpen"><code class="language-rust">pub struct Vec&lt;T&gt; {
    ptr: *mut T,
    cap: usize,
    len: usize,
}
# fn main() {}
</code></pre></pre>
<p>And indeed this would compile. Unfortunately, it would be incorrect. First, the
compiler will give us too strict variance. So a <code>&amp;Vec&lt;&amp;'static str&gt;</code>
couldn't be used where an <code>&amp;Vec&lt;&amp;'a str&gt;</code> was expected. More importantly, it
will give incorrect ownership information to the drop checker, as it will
conservatively assume we don't own any values of type <code>T</code>. See <a href="ownership.html">the chapter
on ownership and lifetimes</a> for all the details on variance and
drop check.</p>
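<p>To see the variance half of that problem in isolation, here is a small sketch. The type
<code>CovariantPtr</code> and the function <code>shorten</code> are hypothetical. <code>shorten</code>
compiles only because <code>*const T</code> and <code>PhantomData&lt;T&gt;</code> are both covariant
over <code>T</code>; swap the field to <code>*mut T</code> and the compiler should reject it.</p>

```rust
use std::marker::PhantomData;

/// Hypothetical wrapper: covariant over `T` thanks to `*const T`.
struct CovariantPtr<T> {
    ptr: *const T,
    _marker: PhantomData<T>,
}

impl<T> CovariantPtr<T> {
    fn new(ptr: *const T) -> Self {
        CovariantPtr { ptr, _marker: PhantomData }
    }

    fn as_ptr(&self) -> *const T {
        self.ptr
    }
}

/// A value holding `&'static str` may be used where one holding a shorter
/// `&'a str` is expected. With `ptr: *mut T` this conversion is not allowed.
fn shorten<'a>(p: CovariantPtr<&'static str>) -> CovariantPtr<&'a str> {
    p
}
```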
<p>As we saw in the ownership chapter, we should use <code>Unique&lt;T&gt;</code> in place of
<code>*mut T</code> when we have a raw pointer to an allocation we own. Unique is unstable,
so we'd like to not use it if possible, though.</p>
<p>As a recap, Unique is a wrapper around a raw pointer that declares that:</p>
<ul>
<li>We are covariant over <code>T</code></li>
<li>We may own a value of type <code>T</code> (for drop check)</li>
<li>We are Send/Sync if <code>T</code> is Send/Sync</li>
<li>Our pointer is never null (so <code>Option&lt;Vec&lt;T&gt;&gt;</code> is null-pointer-optimized)</li>
</ul>
<p>We can implement all of the above requirements except for the last
one in stable Rust:</p>
<pre><pre class="playpen"><code class="language-rust">use std::marker::PhantomData;
use std::ops::Deref;
use std::mem;

struct Unique&lt;T&gt; {
    ptr: *const T,              // *const for variance
    _marker: PhantomData&lt;T&gt;,    // For the drop checker
}

// Deriving Send and Sync is safe because we are the Unique owners
// of this data. It's like Unique&lt;T&gt; is &quot;just&quot; T.
unsafe impl&lt;T: Send&gt; Send for Unique&lt;T&gt; {}
unsafe impl&lt;T: Sync&gt; Sync for Unique&lt;T&gt; {}

impl&lt;T&gt; Unique&lt;T&gt; {
    pub fn new(ptr: *mut T) -&gt; Self {
        Unique { ptr: ptr, _marker: PhantomData }
    }

    pub fn as_ptr(&amp;self) -&gt; *mut T {
        self.ptr as *mut T
    }
}

# fn main() {}
</code></pre></pre>
<p>Unfortunately the mechanism for stating that your value is non-zero is
unstable and unlikely to be stabilized soon. As such we're just going to
take the hit and use std's Unique:</p>
<pre><pre class="playpen"><code class="language-rust">#![feature(ptr_internals, unique)]

use std::ptr::{Unique, self};

pub struct Vec&lt;T&gt; {
    ptr: Unique&lt;T&gt;,
    cap: usize,
    len: usize,
}

# fn main() {}
</code></pre></pre>
<p>If you don't care about the null-pointer optimization, then you can use the
stable code. However we will be designing the rest of the code around enabling
this optimization. It should be noted that <code>Unique::new</code> is unsafe to call, because
putting <code>null</code> inside of it is Undefined Behavior. Our stable Unique doesn't
need <code>new</code> to be unsafe because it doesn't make any interesting guarantees about
its contents.</p>
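<p>To get a feel for what the null-pointer optimization buys, here is a sketch that compares
two layouts. It uses <code>std::ptr::NonNull</code>, which is an assumption on my part: it is not
the <code>Unique</code> type this chapter builds on and carries none of its ownership meaning,
but it is available on stable toolchains from Rust 1.25 and has the same "never null"
property. The struct and function names are invented.</p>

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical layout using a plain raw pointer (null is a legal value).
#[allow(dead_code)]
struct RawPtrVec<T> {
    ptr: *mut T,
    cap: usize,
    len: usize,
}

/// Hypothetical layout using a pointer the compiler knows is never null.
#[allow(dead_code)]
struct NonNullVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

/// Returns (size of Option<RawPtrVec<u8>>, size of Option<NonNullVec<u8>>).
fn option_sizes() -> (usize, usize) {
    (
        size_of::<Option<RawPtrVec<u8>>>(),
        size_of::<Option<NonNullVec<u8>>>(),
    )
}
```

<p>On current compilers the second number should equal the size of the struct itself,
because <code>None</code> can be encoded as the otherwise impossible null pointer, while the
first needs extra room for a discriminant.</p>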
<a class="header" href="print.html#allocating-memory" id="allocating-memory"><h1>Allocating Memory</h1></a>
<p>Using Unique throws a wrench in an important feature of Vec (and indeed all of
the std collections): an empty Vec doesn't actually allocate at all. So if we
can't allocate, but also can't put a null pointer in <code>ptr</code>, what do we do in
<code>Vec::new</code>? Well, we just put some other garbage in there!</p>
<p>This is perfectly fine because we already have <code>cap == 0</code> as our sentinel for no
allocation. We don't even need to handle it specially in almost any code because
we usually need to check if <code>cap &gt; len</code> or <code>len &gt; 0</code> anyway. The recommended
Rust value to put here is <code>mem::align_of::&lt;T&gt;()</code>. Unique provides a convenience
for this: <code>Unique::empty()</code>. There are quite a few places where we'll
want to use <code>empty</code> because there's no real allocation to talk about but
<code>null</code> would make the compiler do bad things.</p>
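<p>A tiny sketch of what such a placeholder pointer looks like when built by hand. The
function <code>dangling</code> is hypothetical and just spells out the
<code>mem::align_of::&lt;T&gt;()</code> trick described above. The result must never be
dereferenced; it only has to be non-null and well-aligned.</p>

```rust
use std::mem;

/// Hypothetical helper: a non-null, well-aligned pointer that points at no
/// allocation. Only valid as a placeholder while `cap == 0`.
fn dangling<T>() -> *mut T {
    // The alignment of a type is always at least 1, so this is never null,
    // and any address equal to the alignment is trivially aligned.
    mem::align_of::<T>() as *mut T
}
```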
<p>So:</p>
<pre><code class="language-rust ignore">#![feature(alloc, heap_api)]

use std::mem;

impl&lt;T&gt; Vec&lt;T&gt; {
    fn new() -&gt; Self {
        assert!(mem::size_of::&lt;T&gt;() != 0, &quot;We're not ready to handle ZSTs&quot;);
        Vec { ptr: Unique::empty(), len: 0, cap: 0 }
    }
}
</code></pre>
<p>I slipped in that assert there because zero-sized types will require some
special handling throughout our code, and I want to defer the issue for now.
Without this assert, some of our early drafts will do some Very Bad Things.</p>
<p>Next we need to figure out what to actually do when we <em>do</em> want space. For
that, we'll need to use the rest of the heap APIs. These basically allow us to
talk directly to Rust's allocator (jemalloc by default).</p>
<p>We'll also need a way to handle out-of-memory (OOM) conditions. The standard
library calls <code>std::alloc::oom()</code>, which in turn calls the <code>oom</code> langitem.
By default this just aborts the program by executing an illegal cpu instruction.
The reason we abort and don't panic is because unwinding can cause allocations
to happen, and that seems like a bad thing to do when your allocator just came
back with &quot;hey I don't have any more memory&quot;.</p>
<p>Of course, this is a bit silly since most platforms don't actually run out of
memory in a conventional way. Your operating system will probably kill the
application by another means if you legitimately start using up all the memory.
The most likely way we'll trigger OOM is by just asking for ludicrous quantities
of memory at once (e.g. half the theoretical address space). As such it's
<em>probably</em> fine to panic and nothing bad will happen. Still, we're trying to be
like the standard library as much as possible, so we'll just kill the whole
program.</p>
<p>Okay, now we can write growing. Roughly, we want to have this logic:</p>
<pre><code class="language-text">if cap == 0:
    allocate()
    cap = 1
else:
    reallocate()
    cap *= 2
</code></pre>
<p>But Rust's only supported allocator API is so low level that we'll need to do a
fair bit of extra work. We also need to guard against some special
conditions that can occur with really large allocations or empty allocations.</p>
<p>In particular, <code>ptr::offset</code> will cause us a lot of trouble, because it has
the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
not have dealt with this instruction, here's the basic story with GEP: alias
analysis, alias analysis, alias analysis. It's super important to an optimizing
compiler to be able to reason about data dependencies and aliasing.</p>
<p>As a simple example, consider the following fragment of code:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
# let x = &amp;mut 0;
# let y = &amp;mut 0;
*x *= 7;
*y *= 3;
#}</code></pre></pre>
<p>If the compiler can prove that <code>x</code> and <code>y</code> point to different locations in
memory, the two operations can in theory be executed in parallel (by e.g.
loading them into different registers and working on them independently).
However the compiler can't do this in general because if x and y point to
the same location in memory, the operations need to be done to the same value,
and they can't just be merged afterwards.</p>
<p>When you use GEP inbounds, you are specifically telling LLVM that the offsets
you're about to do are within the bounds of a single &quot;allocated&quot; entity. The
ultimate payoff being that LLVM can assume that if two pointers are known to
point to two disjoint objects, all the offsets of those pointers are <em>also</em>
known to not alias (because you won't just end up in some random place in
memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
offsets are the best of all, so it's important that we use them as much as
possible.</p>
<p>So that's what GEP's about, how can it cause us trouble?</p>
<p>The first problem is that we index into arrays with unsigned integers, but
GEP (and as a consequence <code>ptr::offset</code>) takes a signed integer. This means
that half of the seemingly valid indices into an array will overflow GEP and
actually go in the wrong direction! As such we must limit all allocations to
<code>isize::MAX</code> elements. This actually means we only need to worry about
byte-sized objects, because e.g. <code>&gt; isize::MAX</code> <code>u16</code>s will truly exhaust all of
the system's memory. However in order to avoid subtle corner cases where someone
reinterprets some array of <code>&lt; isize::MAX</code> objects as bytes, std limits all
allocations to <code>isize::MAX</code> bytes.</p>
<p>On all 64-bit targets that Rust currently supports we're artificially limited
to significantly less than all 64 bits of the address space (modern x64
platforms only expose 48-bit addressing), so we can rely on just running out of
memory first. However on 32-bit targets, particularly those with extensions to
use more of the address space (PAE x86 or x32), it's theoretically possible to
successfully allocate more than <code>isize::MAX</code> bytes of memory.</p>
<p>However since this is a tutorial, we're not going to be particularly optimal
here, and just unconditionally check, rather than use clever platform-specific
<code>cfg</code>s.</p>
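<p>The unconditional check can be sketched as a small pure function. The name
<code>doubled_size_in_bytes</code> is made up, and this is only one way to phrase the check:
it computes the byte size of the doubled allocation and refuses anything above
<code>isize::MAX</code> bytes, using checked arithmetic so the multiplications themselves
cannot wrap.</p>

```rust
use std::mem;

/// Hypothetical helper: byte size of the allocation after doubling `cap`
/// elements of `T` (or allocating one element if `cap == 0`).
/// Returns `None` if that would exceed `isize::MAX` bytes.
fn doubled_size_in_bytes<T>(cap: usize) -> Option<usize> {
    let new_cap = if cap == 0 { 1 } else { cap.checked_mul(2)? };
    let bytes = new_cap.checked_mul(mem::size_of::<T>())?;
    if bytes <= isize::MAX as usize {
        Some(bytes)
    } else {
        None
    }
}
```

<p>A caller would treat <code>None</code> as "capacity overflow" and bail out before ever
talking to the allocator.</p>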
<p>The other corner-case we need to worry about is empty allocations. There will
be two kinds of empty allocations we need to worry about: <code>cap = 0</code> for all T,
and <code>cap &gt; 0</code> for zero-sized types.</p>
<p>These cases are tricky because they come
down to what LLVM means by &quot;allocated&quot;. LLVM's notion of an
allocation is significantly more abstract than how we usually use it. Because
LLVM needs to work with different languages' semantics and custom allocators,
it can't really intimately understand allocation. Instead, the main idea behind
allocation is &quot;doesn't overlap with other stuff&quot;. That is, heap allocations,
stack allocations, and globals don't randomly overlap. Yep, it's about alias
analysis. As such, Rust can technically play a bit fast and loose with the notion of
an allocation as long as it's <em>consistent</em>.</p>
<p>Getting back to the empty allocation case, there are a couple of places where
we want to offset by 0 as a consequence of generic code. The question is then:
is it consistent to do so? For zero-sized types, we have concluded that it is
indeed consistent to do a GEP inbounds offset by an arbitrary number of
elements. This is a runtime no-op because every element takes up no space,
and it's fine to pretend that there are infinitely many zero-sized types allocated
at <code>0x01</code>. No allocator will ever allocate that address, because they won't
allocate <code>0x00</code> and they generally allocate to some minimal alignment higher
than a byte. Also generally the whole first page of memory is
protected from being allocated anyway (a whole 4k, on many platforms).</p>
<p>However what about for positive-sized types? That one's a bit trickier. In
principle, you can argue that offsetting by 0 gives LLVM no information: either
there's an element before the address or after it, but it can't know which.
However we've chosen to conservatively assume that it may do bad things. As
such we will guard against this case explicitly.</p>
<p><em>Phew</em></p>
<p>Ok with all the nonsense out of the way, let's actually allocate some memory:</p>
<pre><code class="language-rust ignore">use std::alloc::oom;

fn grow(&amp;mut self) {
    // this is all pretty delicate, so let's say it's all unsafe
    unsafe {
        // current API requires us to specify size and alignment manually.
        let align = mem::align_of::&lt;T&gt;();
        let elem_size = mem::size_of::&lt;T&gt;();

        let (new_cap, ptr) = if self.cap == 0 {
            let ptr = heap::allocate(elem_size, align);
            (1, ptr)
        } else {
            // as an invariant, we can assume that `self.cap &lt; isize::MAX`,
            // so this doesn't need to be checked.
            let new_cap = self.cap * 2;
            // Similarly this can't overflow due to previously allocating this
            let old_num_bytes = self.cap * elem_size;

            // check that the new allocation doesn't exceed `isize::MAX` at all
            // regardless of the actual size of the capacity. This combines the
            // `new_cap &lt;= isize::MAX` and `new_num_bytes &lt;= usize::MAX` checks
            // we need to make. We lose the ability to allocate e.g. 2/3rds of
            // the address space with a single Vec of i16's on 32-bit though.
            // Alas, poor Yorick -- I knew him, Horatio.
            assert!(old_num_bytes &lt;= (::std::isize::MAX as usize) / 2,
                    &quot;capacity overflow&quot;);

            let new_num_bytes = old_num_bytes * 2;
            let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
                                        old_num_bytes,
                                        new_num_bytes,
                                        align);
            (new_cap, ptr)
        };

        // If allocate or reallocate fail, we'll get `null` back
        if ptr.is_null() { oom(); }

        self.ptr = Unique::new(ptr as *mut _);
        self.cap = new_cap;
    }
}
</code></pre>
<p>Nothing particularly tricky here. Just computing sizes and alignments and doing
some careful multiplication checks.</p>
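<p>For comparison, here is a sketch of the same growth logic written against the stable
<code>std::alloc</code> functions instead of the <code>heap</code> API used above. This is an
assumption about your toolchain (it needs a Rust version where <code>alloc</code>,
<code>realloc</code>, <code>handle_alloc_error</code> and <code>Layout::array</code> are stable),
and <code>grow_buffer</code> is a hypothetical free function rather than the method from this
chapter.</p>

```rust
use std::alloc::{self, Layout};
use std::mem;

/// Hypothetical helper: doubles a buffer holding `cap` elements of `T`
/// (or allocates room for one element if `cap == 0`).
///
/// Safety: if `cap != 0`, `old_ptr` must have been returned by this function
/// for exactly `cap` elements. `old_ptr` is ignored when `cap == 0`.
unsafe fn grow_buffer<T>(old_ptr: *mut T, cap: usize) -> (*mut T, usize) {
    assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");

    let new_cap = if cap == 0 {
        1
    } else {
        cap.checked_mul(2).expect("capacity overflow")
    };

    // `Layout::array` checks the size multiplication for overflow.
    let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
    assert!(new_layout.size() <= isize::MAX as usize, "capacity overflow");

    let new_ptr = if cap == 0 {
        unsafe { alloc::alloc(new_layout) }
    } else {
        // Cannot fail: this layout was already used for the old allocation.
        let old_layout = Layout::array::<T>(cap).unwrap();
        unsafe { alloc::realloc(old_ptr as *mut u8, old_layout, new_layout.size()) }
    };

    // Both functions signal failure by returning null; abort like std does.
    if new_ptr.is_null() {
        alloc::handle_alloc_error(new_layout);
    }

    (new_ptr as *mut T, new_cap)
}
```

<p>The caller is responsible for eventually handing the buffer back with
<code>std::alloc::dealloc</code> and the matching layout.</p>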
<a class="header" href="print.html#push-and-pop" id="push-and-pop"><h1>Push and Pop</h1></a>
<p>Alright. We can initialize. We can allocate. Let's actually implement some
functionality! Let's start with <code>push</code>. All it needs to do is check if we're
full to grow, unconditionally write to the next index, and then increment our
length.</p>
<p>To do the write we have to be careful not to evaluate the memory we want to write
to. At worst, it's truly uninitialized memory from the allocator. At best it's the
bits of some old value we popped off. Either way, we can't just index to the memory
and dereference it, because that will evaluate the memory as a valid instance of
T. Worse, <code>foo[idx] = x</code> will try to call <code>drop</code> on the old value of <code>foo[idx]</code>!</p>
<p>The correct way to do this is with <code>ptr::write</code>, which just blindly overwrites the
target address with the bits of the value we provide. No evaluation involved.</p>
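<p>A minimal sketch of that difference, using <code>std::mem::MaybeUninit</code> to stand in for
a slot of uninitialized buffer memory. That type is an assumption (it requires a newer
toolchain than this chapter targets), and <code>init_slot</code> is a made-up name.</p>

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical helper: initializes an uninitialized slot with `value`
/// and hands the now-valid value back.
fn init_slot(value: String) -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    unsafe {
        // `*slot.as_mut_ptr() = value` would first try to drop the garbage
        // "old value" in the slot. `ptr::write` just overwrites the bits.
        ptr::write(slot.as_mut_ptr(), value);
        slot.assume_init()
    }
}
```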
<p>For <code>push</code>, if the old len (before push was called) is 0, then we want to write
to the 0th index. So we should offset by the old len.</p>
<pre><code class="language-rust ignore">pub fn push(&amp;mut self, elem: T) {
    if self.len == self.cap { self.grow(); }

    unsafe {
        ptr::write(self.ptr.as_ptr().offset(self.len as isize), elem);
    }

    // Can't fail, we'll OOM first.
    self.len += 1;
}
</code></pre>
<p>Easy! How about <code>pop</code>? Although this time the index we want to access is
initialized, Rust won't just let us dereference the location of memory to move
the value out, because that would leave the memory uninitialized! For this we
need <code>ptr::read</code>, which just copies out the bits from the target address and
interprets it as a value of type T. This will leave the memory at this address
logically uninitialized, even though there is in fact a perfectly good instance
of T there.</p>
<p>For <code>pop</code>, if the old len is 1, we want to read out of the 0th index. So we
should offset by the new len.</p>
<pre><code class="language-rust ignore">pub fn pop(&amp;mut self) -&gt; Option&lt;T&gt; {
    if self.len == 0 {
        None
    } else {
        self.len -= 1;
        unsafe {
            Some(ptr::read(self.ptr.as_ptr().offset(self.len as isize)))
        }
    }
}
</code></pre>
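<p>The "logically uninitialized" part of <code>ptr::read</code> is easy to get wrong, so here is
a small sketch of it outside of Vec. The function <code>take_first</code> is hypothetical. After
the read there are two bitwise copies of the first <code>String</code>, so the array must not be
dropped as a whole; <code>ManuallyDrop</code> suppresses that, and the element we did not move
out is dropped by hand.</p>

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical helper: moves the first element out of an array by value.
fn take_first(pair: [String; 2]) -> String {
    // Prevent the array's destructor from running on slot 0 a second time.
    let mut pair = ManuallyDrop::new(pair);
    unsafe {
        // Copy the bits out; slot 0 is now logically uninitialized.
        let first = ptr::read(&pair[0]);
        // Slot 1 is still initialized and still ours to clean up.
        ptr::drop_in_place(&mut pair[1]);
        first
    }
}
```

<p>Vec's <code>pop</code> gets the same effect more cheaply: decrementing <code>len</code> is what
marks the slot as no longer owned.</p>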
<a class="header" href="print.html#deallocating" id="deallocating"><h1>Deallocating</h1></a>
<p>Next we should implement Drop so that we don't massively leak tons of resources.
The easiest way is to just call <code>pop</code> until it yields None, and then deallocate
our buffer. Note that calling <code>pop</code> is unneeded if <code>T: !Drop</code>. In theory we can
ask Rust if <code>T</code> <code>needs_drop</code> and omit the calls to <code>pop</code>. However in practice
LLVM is <em>really</em> good at removing simple side-effect free code like this, so I
wouldn't bother unless you notice it's not being stripped (in this case it is).</p>
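<p>If you did want to ask, the query looks roughly like this. It assumes
<code>std::mem::needs_drop</code> is available on your toolchain, and the wrapper name
<code>pop_loop_needed</code> is invented for the example.</p>

```rust
use std::mem;

/// Hypothetical helper: does dropping a buffer of `T` require visiting
/// every element, or can we go straight to freeing the memory?
fn pop_loop_needed<T>() -> bool {
    // Allowed to answer `true` conservatively, so treat it as a hint only.
    mem::needs_drop::<T>()
}
```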
<p>We must not call <code>heap::deallocate</code> when <code>self.cap == 0</code>, as in this case we
haven't actually allocated any memory.</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; Drop for Vec&lt;T&gt; {
    fn drop(&amp;mut self) {
        if self.cap != 0 {
            while let Some(_) = self.pop() { }

            let align = mem::align_of::&lt;T&gt;();
            let elem_size = mem::size_of::&lt;T&gt;();
            let num_bytes = elem_size * self.cap;
            unsafe {
                heap::deallocate(self.ptr.as_ptr() as *mut _, num_bytes, align);
            }
        }
    }
}
</code></pre>
<a class="header" href="print.html#deref" id="deref"><h1>Deref</h1></a>
<p>Alright! We've got a decent minimal stack implemented. We can push, we can
pop, and we can clean up after ourselves. However there's a whole mess of
functionality we'd reasonably want. In particular, we have a proper array, but
none of the slice functionality. That's actually pretty easy to solve: we can
implement <code>Deref&lt;Target=[T]&gt;</code>. This will magically make our Vec coerce to, and
behave like, a slice in all sorts of conditions.</p>
<p>All we need is <code>slice::from_raw_parts</code>. It will correctly handle empty slices
for us. Later once we set up zero-sized type support it will also Just Work
for those too.</p>
<pre><code class="language-rust ignore">use std::ops::Deref;

impl&lt;T&gt; Deref for Vec&lt;T&gt; {
    type Target = [T];
    fn deref(&amp;self) -&gt; &amp;[T] {
        unsafe {
            ::std::slice::from_raw_parts(self.ptr.as_ptr(), self.len)
        }
    }
}
</code></pre>
<p>And let's do DerefMut too:</p>
<pre><code class="language-rust ignore">use std::ops::DerefMut;

impl&lt;T&gt; DerefMut for Vec&lt;T&gt; {
    fn deref_mut(&amp;mut self) -&gt; &amp;mut [T] {
        unsafe {
            ::std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len)
        }
    }
}
</code></pre>
<p>Now we have <code>len</code>, <code>first</code>, <code>last</code>, indexing, slicing, sorting, <code>iter</code>,
<code>iter_mut</code>, and all other sorts of bells and whistles provided by slice. Sweet!</p>
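<p>To see how little is needed for this, here is a standalone sketch. The type
<code>RawView</code> is hypothetical: it stores only a raw pointer and a length borrowed from an
existing slice, yet picks up the slice API through <code>Deref</code> just like our Vec does.</p>

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::slice;

/// Hypothetical type: a (pointer, length) pair borrowed from a slice.
struct RawView<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a [T]>,
}

impl<'a, T> RawView<'a, T> {
    fn new(data: &'a [T]) -> Self {
        RawView { ptr: data.as_ptr(), len: data.len(), _marker: PhantomData }
    }
}

impl<'a, T> Deref for RawView<'a, T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        // Sound because `ptr` and `len` came from a slice that outlives 'a.
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Uses `iter`, which RawView never defined itself.
fn sum_through_deref(data: &[i32]) -> i32 {
    let view = RawView::new(data);
    view.iter().sum()
}
```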
<a class="header" href="print.html#insert-and-remove" id="insert-and-remove"><h1>Insert and Remove</h1></a>
<p>Something <em>not</em> provided by slice is <code>insert</code> and <code>remove</code>, so let's do those
next.</p>
<p>Insert needs to shift all the elements at the target index to the right by one.
To do this we need to use <code>ptr::copy</code>, which is our version of C's <code>memmove</code>.
This copies some chunk of memory from one location to another, correctly
handling the case where the source and destination overlap (which will
definitely happen here).</p>
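<p>Here is the overlapping shift on its own, as a sketch over a plain <code>i32</code> buffer.
The function <code>insert_into_prefix</code> is hypothetical; <code>len</code> plays the role of
Vec's length and the slice's own length plays the role of the capacity.</p>

```rust
use std::ptr;

/// Hypothetical helper: inserts `elem` at `index` into the first `len`
/// elements of `buf`, shifting the tail right by one.
fn insert_into_prefix(buf: &mut [i32], len: usize, index: usize, elem: i32) {
    assert!(len < buf.len(), "no spare capacity");
    assert!(index <= len, "index out of bounds");
    unsafe {
        let p = buf.as_mut_ptr();
        // Source and destination overlap, so this must be `copy` (memmove),
        // not `copy_nonoverlapping` (memcpy).
        ptr::copy(p.add(index), p.add(index + 1), len - index);
        ptr::write(p.add(index), elem);
    }
}
```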
<p>If we insert at index <code>i</code>, we want to shift the <code>[i .. len]</code> to <code>[i+1 .. len+1]</code>
using the old len.</p>
<pre><code class="language-rust ignore">pub fn insert(&amp;mut self, index: usize, elem: T) {
    // Note: `&lt;=` because it's valid to insert after everything
    // which would be equivalent to push.
    assert!(index &lt;= self.len, &quot;index out of bounds&quot;);
    if self.cap == self.len { self.grow(); }

    unsafe {
        if index &lt; self.len {
            // ptr::copy(src, dest, len): &quot;copy from source to dest len elems&quot;
            ptr::copy(self.ptr.as_ptr().offset(index as isize),
                      self.ptr.as_ptr().offset(index as isize + 1),
                      self.len - index);
        }
        ptr::write(self.ptr.as_ptr().offset(index as isize), elem);
        self.len += 1;
    }
}
</code></pre>
<p>Remove behaves in the opposite manner. We need to shift all the elements from
<code>[i+1 .. len + 1]</code> to <code>[i .. len]</code> using the <em>new</em> len.</p>
<pre><code class="language-rust ignore">pub fn remove(&amp;mut self, index: usize) -&gt; T {
    // Note: `&lt;` because it's *not* valid to remove after everything
    assert!(index &lt; self.len, &quot;index out of bounds&quot;);
    unsafe {
        self.len -= 1;
        let result = ptr::read(self.ptr.as_ptr().offset(index as isize));
        ptr::copy(self.ptr.as_ptr().offset(index as isize + 1),
                  self.ptr.as_ptr().offset(index as isize),
                  self.len - index);
        result
    }
}
</code></pre>
<a class="header" href="print.html#intoiter" id="intoiter"><h1>IntoIter</h1></a>
<p>Let's move on to writing iterators. <code>iter</code> and <code>iter_mut</code> have already been
written for us thanks to The Magic of Deref. However there are two interesting
iterators that Vec provides that slices can't: <code>into_iter</code> and <code>drain</code>.</p>
<p>IntoIter consumes the Vec by-value, and can consequently yield its elements
by-value. In order to enable this, IntoIter needs to take control of Vec's
allocation.</p>
<p>IntoIter needs to be DoubleEnded as well, to enable reading from both ends.
Reading from the back could just be implemented as calling <code>pop</code>, but reading
from the front is harder. We could call <code>remove(0)</code> but that would be insanely
expensive. Instead we're going to just use ptr::read to copy values out of
either end of the Vec without mutating the buffer at all.</p>
<p>To do this we're going to use a very common C idiom for array iteration. We'll
make two pointers; one that points to the start of the array, and one that
points to one-element past the end. When we want an element from one end, we'll
read out the value pointed to at that end and move the pointer over by one. When
the two pointers are equal, we know we're done.</p>
<p>Note that the order of read and offset is reversed for <code>next</code> and <code>next_back</code>.
For <code>next_back</code> the pointer is always after the element it wants to read next,
while for <code>next</code> the pointer is always at the element it wants to read next.
To see why this is, consider the case where every element but one has been
yielded.</p>
<p>The array looks like this:</p>
<pre><code class="language-text">          S  E
[X, X, X, O, X, X, X]
</code></pre>
<p>If E pointed directly at the element it wanted to yield next, it would be
indistinguishable from the case where there are no more elements to yield.</p>
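<p>The two-pointer idiom is easier to trust once you've seen it run, so here is a sketch
over a borrowed <code>i32</code> slice. The type <code>Cursor</code> is hypothetical and only copies
values out, so there is no ownership to worry about yet. Note that <code>next_front</code> reads
and then moves, while <code>next_back</code> moves and then reads.</p>

```rust
use std::marker::PhantomData;
use std::ptr;

/// Hypothetical type: walks a slice from both ends with two raw pointers.
struct Cursor<'a> {
    start: *const i32, // points at the next element to yield from the front
    end: *const i32,   // points one past the next element to yield from the back
    _marker: PhantomData<&'a [i32]>,
}

impl<'a> Cursor<'a> {
    fn new(data: &'a [i32]) -> Self {
        let start = data.as_ptr();
        // Forming a one-past-the-end pointer is allowed.
        let end = unsafe { start.add(data.len()) };
        Cursor { start, end, _marker: PhantomData }
    }

    fn next_front(&mut self) -> Option<i32> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            let value = ptr::read(self.start);
            self.start = self.start.add(1);
            Some(value)
        }
    }

    fn next_back(&mut self) -> Option<i32> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            self.end = self.end.sub(1);
            Some(ptr::read(self.end))
        }
    }
}
```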
<p>Although we don't actually care about it during iteration, we also need to hold
onto the Vec's allocation information in order to free it once IntoIter is
dropped.</p>
<p>So we're going to use the following struct:</p>
<pre><code class="language-rust ignore">struct IntoIter&lt;T&gt; {
    buf: Unique&lt;T&gt;,
    cap: usize,
    start: *const T,
    end: *const T,
}
</code></pre>
<p>And this is what we end up with for initialization:</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; Vec&lt;T&gt; {
    fn into_iter(self) -&gt; IntoIter&lt;T&gt; {
        // Can't destructure Vec since it's Drop
        let ptr = self.ptr;
        let cap = self.cap;
        let len = self.len;

        // Make sure not to drop Vec since that will free the buffer
        mem::forget(self);

        unsafe {
            IntoIter {
                buf: ptr,
                cap: cap,
                start: ptr.as_ptr(),
                end: if cap == 0 {
                    // can't offset off this pointer, it's not allocated!
                    ptr.as_ptr()
                } else {
                    ptr.as_ptr().offset(len as isize)
                }
            }
        }
    }
}
</code></pre>
<p>Here's iterating forward:</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; Iterator for IntoIter&lt;T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                let result = ptr::read(self.start);
                self.start = self.start.offset(1);
                Some(result)
            }
        }
    }

    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) {
        let len = (self.end as usize - self.start as usize)
                  / mem::size_of::&lt;T&gt;();
        (len, Some(len))
    }
}
</code></pre>
<p>And here's iterating backwards.</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; DoubleEndedIterator for IntoIter&lt;T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                self.end = self.end.offset(-1);
                Some(ptr::read(self.end))
            }
        }
    }
}
</code></pre>
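<p>As a rough illustration of the two-pointer scheme above, here is a minimal sketch that compiles on stable Rust. It borrows a slice instead of owning a buffer and restricts itself to <code>T: Copy</code> so that <code>ptr::read</code> can never create a second owner of a value. The names <code>PtrPair</code> and <code>outside_in</code> are made up for this example, and zero-sized types are deliberately rejected.</p>

```rust
use std::marker::PhantomData;
use std::mem;
use std::ptr;

/// Hypothetical borrowed stand-in for the `start`/`end` pair in `IntoIter`.
/// The lifetime ties both pointers to the slice they were derived from.
pub struct PtrPair<'a, T: Copy> {
    start: *const T,
    end: *const T,
    _slice: PhantomData<&'a [T]>,
}

impl<'a, T: Copy> PtrPair<'a, T> {
    pub fn new(slice: &'a [T]) -> Self {
        // This sketch does not handle zero-sized types.
        assert!(mem::size_of::<T>() != 0, "ZSTs are not supported here");
        let start = slice.as_ptr();
        // One-past-the-end pointer of the same slice.
        let end = unsafe { start.add(slice.len()) };
        PtrPair { start, end, _slice: PhantomData }
    }
}

impl<'a, T: Copy> Iterator for PtrPair<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                // Read first, then advance the front pointer.
                let value = ptr::read(self.start);
                self.start = self.start.add(1);
                Some(value)
            }
        }
    }
}

impl<'a, T: Copy> DoubleEndedIterator for PtrPair<'a, T> {
    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                // `end` points one past the last element, so step back first.
                self.end = self.end.sub(1);
                Some(ptr::read(self.end))
            }
        }
    }
}

/// Alternates between the front and the back until the two pointers meet.
pub fn outside_in(data: &[i32]) -> Vec<i32> {
    let mut iter = PtrPair::new(data);
    let mut out = Vec::new();
    while let Some(front) = iter.next() {
        out.push(front);
        if let Some(back) = iter.next_back() {
            out.push(back);
        }
    }
    out
}
```

<p>Note the asymmetry: the forward direction reads and then advances, while the backward direction retreats and then reads, because <code>end</code> always sits one element past the last unread value.</p>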
<p>Because IntoIter takes ownership of its allocation, it needs to implement Drop
to free it. However it also wants to implement Drop to drop any elements it
contains that weren't yielded.</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; Drop for IntoIter&lt;T&gt; {
    fn drop(&amp;mut self) {
        if self.cap != 0 {
            // drop any remaining elements
            for _ in &amp;mut *self {}

            let align = mem::align_of::&lt;T&gt;();
            let elem_size = mem::size_of::&lt;T&gt;();
            let num_bytes = elem_size * self.cap;
            unsafe {
                heap::deallocate(self.buf.as_ptr() as *mut _, num_bytes, align);
            }
        }
    }
}
</code></pre>
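<p>The standard library's own <code>into_iter</code> has the same observable behavior, which makes it easy to check what we are aiming for. The sketch below uses <code>std::vec::Vec</code> rather than the Vec built in this chapter, and the <code>Noisy</code> type and the function name are invented for the example.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical helper type: bumps a shared counter when it is dropped.
pub struct Noisy(Rc<Cell<usize>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops seen after yielding one element,
///          drops seen after dropping the iterator).
pub fn drops_for_partial_iteration() -> (usize, usize) {
    let counter = Rc::new(Cell::new(0));
    let v: Vec<Noisy> = (0..3).map(|_| Noisy(counter.clone())).collect();

    let mut iter = v.into_iter();
    // The yielded element now belongs to the caller, who drops it here.
    drop(iter.next());
    let after_one = counter.get();

    // The two elements that were never yielded are dropped with the iterator.
    drop(iter);
    (after_one, counter.get())
}
```

<p>Calling <code>drops_for_partial_iteration()</code> should report one drop after the first <code>next</code> and three once the iterator itself is gone.</p>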
<a class="header" href="print.html#rawvec" id="rawvec"><h1>RawVec</h1></a>
<p>We've actually reached an interesting situation here: we've duplicated the logic
for specifying a buffer and freeing its memory in Vec and IntoIter. Now that
we've implemented it and identified <em>actual</em> logic duplication, this is a good
time to perform some logic compression.</p>
<p>We're going to abstract out the <code>(ptr, cap)</code> pair and give them the logic for
allocating, growing, and freeing:</p>
<pre><code class="language-rust ignore">struct RawVec&lt;T&gt; {
    ptr: Unique&lt;T&gt;,
    cap: usize,
}

impl&lt;T&gt; RawVec&lt;T&gt; {
    fn new() -&gt; Self {
        assert!(mem::size_of::&lt;T&gt;() != 0, &quot;TODO: implement ZST support&quot;);
        RawVec { ptr: Unique::empty(), cap: 0 }
    }

    // unchanged from Vec
    fn grow(&amp;mut self) {
        unsafe {
            let align = mem::align_of::&lt;T&gt;();
            let elem_size = mem::size_of::&lt;T&gt;();

            let (new_cap, ptr) = if self.cap == 0 {
                let ptr = heap::allocate(elem_size, align);
                (1, ptr)
            } else {
                let new_cap = 2 * self.cap;
                let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
                                            self.cap * elem_size,
                                            new_cap * elem_size,
                                            align);
                (new_cap, ptr)
            };

            // If allocate or reallocate fail, we'll get `null` back
            if ptr.is_null() { oom() }

            self.ptr = Unique::new_unchecked(ptr as *mut _);
            self.cap = new_cap;
        }
    }
}


impl&lt;T&gt; Drop for RawVec&lt;T&gt; {
    fn drop(&amp;mut self) {
        if self.cap != 0 {
            let align = mem::align_of::&lt;T&gt;();
            let elem_size = mem::size_of::&lt;T&gt;();
            let num_bytes = elem_size * self.cap;
            unsafe {
                heap::deallocate(self.ptr.as_ptr() as *mut _, num_bytes, align);
            }
        }
    }
}
</code></pre>
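<p>For readers without a nightly compiler, the same idea can be sketched on stable Rust. This is only an approximation of the code above: <code>NonNull</code> stands in for <code>Unique</code>, the free functions in <code>std::alloc</code> stand in for <code>heap::*</code>, and <code>RawBuf</code> and <code>capacities_after_growing</code> are names made up for the example.</p>

```rust
use std::alloc::{self, Layout};
use std::mem;
use std::ptr::NonNull;

/// Stable-Rust sketch of the `(ptr, cap)` pair.
pub struct RawBuf<T> {
    ptr: NonNull<T>,
    cap: usize,
}

impl<T> RawBuf<T> {
    pub fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "this sketch does not support ZSTs");
        // A dangling pointer plays the role of "not allocated yet".
        RawBuf { ptr: NonNull::dangling(), cap: 0 }
    }

    pub fn cap(&self) -> usize {
        self.cap
    }

    pub fn grow(&mut self) {
        let new_cap = if self.cap == 0 {
            1
        } else {
            self.cap.checked_mul(2).expect("capacity overflow")
        };
        // `Layout::array` returns an error if the total size is too large.
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");

        let raw = unsafe {
            if self.cap == 0 {
                alloc::alloc(new_layout)
            } else {
                let old_layout = Layout::array::<T>(self.cap).unwrap();
                alloc::realloc(self.ptr.as_ptr() as *mut u8, old_layout, new_layout.size())
            }
        };

        // A null return means the allocation failed.
        self.ptr = match NonNull::new(raw as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawBuf<T> {
    fn drop(&mut self) {
        // Nothing was allocated while `cap` is still 0.
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe { alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout) }
        }
    }
}

/// Records the capacity before and after each call to `grow`.
pub fn capacities_after_growing(times: usize) -> Vec<usize> {
    let mut buf: RawBuf<u64> = RawBuf::new();
    let mut caps = vec![buf.cap()];
    for _ in 0..times {
        buf.grow();
        caps.push(buf.cap());
    }
    caps
}
```

<p>Growing four times from empty should give the capacities 0, 1, 2, 4, 8, matching the doubling strategy used throughout this chapter.</p>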
<p>And change Vec as follows:</p>
<pre><code class="language-rust ignore">pub struct Vec&lt;T&gt; {
    buf: RawVec&lt;T&gt;,
    len: usize,
}

impl&lt;T&gt; Vec&lt;T&gt; {
    fn ptr(&amp;self) -&gt; *mut T { self.buf.ptr.as_ptr() }

    fn cap(&amp;self) -&gt; usize { self.buf.cap }

    pub fn new() -&gt; Self {
        Vec { buf: RawVec::new(), len: 0 }
    }

    // push/pop/insert/remove largely unchanged:
    // * `self.ptr -&gt; self.ptr()`
    // * `self.cap -&gt; self.cap()`
    // * `self.grow -&gt; self.buf.grow()`
}

impl&lt;T&gt; Drop for Vec&lt;T&gt; {
    fn drop(&amp;mut self) {
        while let Some(_) = self.pop() {}
        // deallocation is handled by RawVec
    }
}
</code></pre>
<p>And finally we can really simplify IntoIter:</p>
<pre><code class="language-rust ignore">struct IntoIter&lt;T&gt; {
    _buf: RawVec&lt;T&gt;, // we don't actually care about this. Just need it to live.
    start: *const T,
    end: *const T,
}

// next and next_back literally unchanged since they never referred to the buf

impl&lt;T&gt; Drop for IntoIter&lt;T&gt; {
    fn drop(&amp;mut self) {
        // only need to ensure all our elements are read;
        // buffer will clean itself up afterwards.
        for _ in &amp;mut *self {}
    }
}

impl&lt;T&gt; Vec&lt;T&gt; {
    pub fn into_iter(self) -&gt; IntoIter&lt;T&gt; {
        unsafe {
            // need to use ptr::read to unsafely move the buf out since it's
            // not Copy, and Vec implements Drop (so we can't destructure it).
            let buf = ptr::read(&amp;self.buf);
            let len = self.len;
            mem::forget(self);

            IntoIter {
                start: buf.ptr.as_ptr(),
                end: buf.ptr.as_ptr().offset(len as isize),
                _buf: buf,
            }
        }
    }
}
</code></pre>
<p>Much better.</p>
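<p>The <code>ptr::read</code> plus <code>mem::forget</code> dance is worth seeing in isolation. Below is a small sketch on an unrelated type; <code>Owner</code>, its fields, and <code>into_payload_demo</code> are hypothetical and exist only to show that the destructor does not run and that every field has to be moved out by hand.</p>

```rust
use std::cell::Cell;
use std::mem;
use std::ptr;
use std::rc::Rc;

/// Hypothetical type with a destructor, standing in for `Vec`.
pub struct Owner {
    payload: String,
    drops: Rc<Cell<usize>>,
}

impl Drop for Owner {
    fn drop(&mut self) {
        // Counts how many times the destructor actually runs.
        self.drops.set(self.drops.get() + 1);
    }
}

impl Owner {
    /// Moves `payload` out of a type that implements `Drop`.
    pub fn into_payload(self) -> String {
        unsafe {
            // Bitwise-copy each field out of `self`...
            let payload = ptr::read(&self.payload);
            let drops = ptr::read(&self.drops);
            // ...then forget `self` so its destructor (and the fields'
            // destructors) never run on the originals.
            mem::forget(self);
            // Fields we do not return must be dropped manually,
            // otherwise they would leak.
            drop(drops);
            payload
        }
    }
}

/// Returns the extracted payload and the number of times `Owner::drop` ran.
pub fn into_payload_demo() -> (String, usize) {
    let drops = Rc::new(Cell::new(0));
    let owner = Owner { payload: String::from("buffer"), drops: drops.clone() };
    let payload = owner.into_payload();
    (payload, drops.get())
}
```

<p>The counter stays at zero because <code>mem::forget</code> suppresses <code>Owner::drop</code>, which is exactly why <code>into_iter</code> can hand the buffer to <code>IntoIter</code> without freeing it twice.</p>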
<a class="header" href="print.html#drain-1" id="drain-1"><h1>Drain</h1></a>
<p>Let's move on to Drain. Drain is largely the same as IntoIter, except that
instead of consuming the Vec, it borrows the Vec and leaves its allocation
untouched. For now we'll only implement the &quot;basic&quot; full-range version.</p>
<pre><code class="language-rust ignore">use std::marker::PhantomData;

struct Drain&lt;'a, T: 'a&gt; {
    // Need to bound the lifetime here, so we do it with `&amp;'a mut Vec&lt;T&gt;`
    // because that's semantically what we contain. We're &quot;just&quot; calling
    // `pop()` and `remove(0)`.
    vec: PhantomData&lt;&amp;'a mut Vec&lt;T&gt;&gt;,
    start: *const T,
    end: *const T,
}

impl&lt;'a, T&gt; Iterator for Drain&lt;'a, T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
</code></pre>
<p>-- wait, this is seeming familiar. Let's do some more compression. Both
IntoIter and Drain have the exact same structure, so let's just factor it out.</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
struct RawValIter&lt;T&gt; {
    start: *const T,
    end: *const T,
}

impl&lt;T&gt; RawValIter&lt;T&gt; {
    // unsafe to construct because it has no associated lifetimes.
    // This is necessary to store a RawValIter in the same struct as
    // its actual allocation. OK since it's a private implementation
    // detail.
    unsafe fn new(slice: &amp;[T]) -&gt; Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if slice.len() == 0 {
                // if `len = 0`, then this is not actually allocated memory.
                // Need to avoid offsetting because that will give wrong
                // information to LLVM via GEP.
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
            }
        }
    }
}

// Iterator and DoubleEndedIterator impls identical to IntoIter.
#}</code></pre></pre>
<p>And IntoIter becomes the following:</p>
<pre><code class="language-rust ignore">pub struct IntoIter&lt;T&gt; {
    _buf: RawVec&lt;T&gt;, // we don't actually care about this. Just need it to live.
    iter: RawValIter&lt;T&gt;,
}

impl&lt;T&gt; Iterator for IntoIter&lt;T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next() }
    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) { self.iter.size_hint() }
}

impl&lt;T&gt; DoubleEndedIterator for IntoIter&lt;T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next_back() }
}

impl&lt;T&gt; Drop for IntoIter&lt;T&gt; {
    fn drop(&amp;mut self) {
        for _ in &amp;mut self.iter {}
    }
}

impl&lt;T&gt; Vec&lt;T&gt; {
    pub fn into_iter(self) -&gt; IntoIter&lt;T&gt; {
        unsafe {
            let iter = RawValIter::new(&amp;self);

            let buf = ptr::read(&amp;self.buf);
            mem::forget(self);

            IntoIter {
                iter: iter,
                _buf: buf,
            }
        }
    }
}
</code></pre>
<p>Note that I've left a few quirks in this design to make upgrading Drain to work
with arbitrary subranges a bit easier. In particular we <em>could</em> have RawValIter
drain itself on drop, but that won't work right for a more complex Drain.
We also take a slice to simplify Drain initialization.</p>
<p>Alright, now Drain is really easy:</p>
<pre><code class="language-rust ignore">use std::marker::PhantomData;

pub struct Drain&lt;'a, T: 'a&gt; {
    vec: PhantomData&lt;&amp;'a mut Vec&lt;T&gt;&gt;,
    iter: RawValIter&lt;T&gt;,
}

impl&lt;'a, T&gt; Iterator for Drain&lt;'a, T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next() }
    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) { self.iter.size_hint() }
}

impl&lt;'a, T&gt; DoubleEndedIterator for Drain&lt;'a, T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next_back() }
}

impl&lt;'a, T&gt; Drop for Drain&lt;'a, T&gt; {
    fn drop(&amp;mut self) {
        for _ in &amp;mut self.iter {}
    }
}

impl&lt;T&gt; Vec&lt;T&gt; {
    pub fn drain(&amp;mut self) -&gt; Drain&lt;T&gt; {
        unsafe {
            let iter = RawValIter::new(&amp;self);

            // this is a mem::forget safety thing. If Drain is forgotten, we just
            // leak the whole Vec's contents. Also we need to do this *eventually*
            // anyway, so why not do it now?
            self.len = 0;

            Drain {
                iter: iter,
                vec: PhantomData,
            }
        }
    }
}
</code></pre>
<p>For more details on the <code>mem::forget</code> problem, see the
<a href="leaking.html">section on leaks</a>.</p>
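<p>For comparison, the standard library's full-range <code>drain(..)</code> can be exercised from both ends in the same way. This sketch uses <code>std::vec::Vec</code>, not the Vec from this chapter, and <code>drain_demo</code> is a name chosen for the example.</p>

```rust
/// Takes one element from each end of a full-range drain, then lets the
/// drain go out of scope. Returns the taken elements and the final length.
pub fn drain_demo() -> (Vec<i32>, usize) {
    let mut v = vec![1, 2, 3, 4];
    let mut taken = Vec::new();
    {
        let mut drain = v.drain(..);
        taken.push(drain.next().unwrap()); // front element
        taken.push(drain.next_back().unwrap()); // back element
        // `drain` is dropped at the end of this scope, which removes
        // the elements that were never yielded.
    }
    (taken, v.len())
}
```

<p>The vector ends up empty even though only two of the four elements were consumed, because dropping the drain finishes the job.</p>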
<a class="header" href="print.html#handling-zero-sized-types" id="handling-zero-sized-types"><h1>Handling Zero-Sized Types</h1></a>
<p>It's time. We're going to fight the specter that is zero-sized types. Safe Rust
<em>never</em> needs to care about this, but Vec is very intensive on raw pointers and
raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things:</p>
<ul>
<li>The raw allocator API has undefined behavior if you pass in 0 for an
allocation size.</li>
<li>Raw pointer offsets are no-ops for zero-sized types, which will break our
C-style pointer iterator.</li>
</ul>
<p>Thankfully we abstracted out pointer-iterators and allocation handling into
RawValIter and RawVec respectively. How mysteriously convenient.</p>
<a class="header" href="print.html#allocating-zero-sized-types" id="allocating-zero-sized-types"><h2>Allocating Zero-Sized Types</h2></a>
<p>So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? <code>Unique::empty()</code> of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to <code>ptr::read</code> and
<code>ptr::write</code>: they won't actually look at the pointer at all. As such we never need
to change the pointer.</p>
<p>Note however that our previous reliance on running out of memory before overflow is
no longer valid with zero-sized types. We must explicitly guard against capacity
overflow for zero-sized types.</p>
<p>Due to our current architecture, all this means is writing 3 guards, one in each
method of RawVec.</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; RawVec&lt;T&gt; {
    fn new() -&gt; Self {
        // !0 is usize::MAX. This branch should be stripped at compile time.
        let cap = if mem::size_of::&lt;T&gt;() == 0 { !0 } else { 0 };

        // Unique::empty() doubles as &quot;unallocated&quot; and &quot;zero-sized allocation&quot;
        RawVec { ptr: Unique::empty(), cap: cap }
    }

    fn grow(&amp;mut self) {
        unsafe {
            let elem_size = mem::size_of::&lt;T&gt;();

            // since we set the capacity to usize::MAX when elem_size is
            // 0, getting to here necessarily means the Vec is overfull.
            assert!(elem_size != 0, &quot;capacity overflow&quot;);

            let align = mem::align_of::&lt;T&gt;();

            let (new_cap, ptr) = if self.cap == 0 {
                let ptr = heap::allocate(elem_size, align);
                (1, ptr)
            } else {
                let new_cap = 2 * self.cap;
                let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
                                            self.cap * elem_size,
                                            new_cap * elem_size,
                                            align);
                (new_cap, ptr)
            };

            // If allocate or reallocate fail, we'll get `null` back
            if ptr.is_null() { oom() }

            self.ptr = Unique::new_unchecked(ptr as *mut _);
            self.cap = new_cap;
        }
    }
}

impl&lt;T&gt; Drop for RawVec&lt;T&gt; {
    fn drop(&amp;mut self) {
        let elem_size = mem::size_of::&lt;T&gt;();

        // don't free zero-sized allocations, as they were never allocated.
        if self.cap != 0 &amp;&amp; elem_size != 0 {
            let align = mem::align_of::&lt;T&gt;();

            let num_bytes = elem_size * self.cap;
            unsafe {
                heap::deallocate(self.ptr.as_ptr() as *mut _, num_bytes, align);
            }
        }
    }
}
</code></pre>
<p>That's it. We support pushing and popping zero-sized types now. Our iterators
(that aren't provided by slice Deref) are still busted, though.</p>
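<p>The claims about zero-sized types can be checked on stable Rust. In this sketch <code>NonNull::dangling()</code> plays the role of <code>Unique::empty()</code>, <code>Token</code> is a type invented for the example, and the capacity check is against the standard library's Vec rather than ours.</p>

```rust
use std::mem;
use std::ptr::{self, NonNull};

/// A zero-sized marker type, defined here purely for illustration.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Token;

/// Returns (size of `Token`, a value read through a dangling pointer,
///          the capacity std's Vec reports for a ZST).
pub fn zst_facts() -> (usize, Token, usize) {
    let dangling: NonNull<Token> = NonNull::dangling();
    let value = unsafe {
        // Reads and writes of a ZST touch no memory, so a non-null,
        // aligned pointer is enough even though nothing was allocated.
        ptr::write(dangling.as_ptr(), Token);
        ptr::read(dangling.as_ptr())
    };
    // std's Vec does not allocate for ZSTs and reports the maximum capacity.
    let cap = Vec::<Token>::with_capacity(10).capacity();
    (mem::size_of::<Token>(), value, cap)
}
```

<p>The reported capacity of <code>usize::MAX</code> is the same trick used in <code>RawVec::new</code> above: a ZST buffer is treated as already being as large as it can ever get.</p>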
<a class="header" href="print.html#iterating-zero-sized-types" id="iterating-zero-sized-types"><h2>Iterating Zero-Sized Types</h2></a>
<p>Zero-sized offsets are no-ops. This means that our current design will always
initialize <code>start</code> and <code>end</code> as the same value, and our iterators will yield
nothing. The current solution to this is to cast the pointers to integers,
increment, and then cast them back:</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; RawValIter&lt;T&gt; {
    unsafe fn new(slice: &amp;[T]) -&gt; Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if mem::size_of::&lt;T&gt;() == 0 {
                ((slice.as_ptr() as usize) + slice.len()) as *const _
            } else if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
            }
        }
    }
}
</code></pre>
<p>Now we have a different bug. Instead of our iterators not running at all, our
iterators now run <em>forever</em>. We need to do the same trick in our iterator impls.
Also, our size_hint computation code will divide by 0 for ZSTs. Since we'll
basically be treating the two pointers as if they point to bytes, we'll just
map size 0 to divide by 1.</p>
<pre><code class="language-rust ignore">impl&lt;T&gt; Iterator for RawValIter&lt;T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                let result = ptr::read(self.start);
                self.start = if mem::size_of::&lt;T&gt;() == 0 {
                    (self.start as usize + 1) as *const _
                } else {
                    self.start.offset(1)
                };
                Some(result)
            }
        }
    }

    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) {
        let elem_size = mem::size_of::&lt;T&gt;();
        let len = (self.end as usize - self.start as usize)
                  / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}

impl&lt;T&gt; DoubleEndedIterator for RawValIter&lt;T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                self.end = if mem::size_of::&lt;T&gt;() == 0 {
                    (self.end as usize - 1) as *const _
                } else {
                    self.end.offset(-1)
                };
                Some(ptr::read(self.end))
            }
        }
    }
}
</code></pre>
<p>And that's it. Iteration works!</p>
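<p>To see why the integer casts are needed, the sketch below first shows that stepping a pointer to a zero-sized type leaves its address unchanged, and then counts iterations using the cast-and-increment trick. Both function names are made up for the example, and <code>()</code> is used as the zero-sized type.</p>

```rust
use std::ptr::NonNull;

/// True when stepping a `*const ()` by one element leaves the address unchanged.
pub fn zst_offset_is_noop() -> bool {
    let p: *const () = NonNull::<()>::dangling().as_ptr();
    // `wrapping_add(1)` advances by `1 * size_of::<()>()` bytes, which is zero.
    p.wrapping_add(1) == p
}

/// Counts the steps between a start and end "pointer" that are really
/// just integers in disguise, as in the ZST branch of `RawValIter`.
pub fn count_zst_steps(len: usize) -> usize {
    let mut start: *const () = NonNull::<()>::dangling().as_ptr();
    // Encode the length in the distance between the two addresses.
    let end = (start as usize + len) as *const ();
    let mut steps = 0;
    while start != end {
        // Treat the pointer as a byte counter and bump it by one.
        start = (start as usize + 1) as *const ();
        steps += 1;
    }
    steps
}
```

<p>The pointers produced this way are never dereferenced in any meaningful sense. They only serve as a counter that happens to have a pointer type.</p>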
<a class="header" href="print.html#the-final-code" id="the-final-code"><h1>The Final Code</h1></a>
<pre><pre class="playpen"><code class="language-rust">#![feature(ptr_internals)]
#![feature(allocator_api)]
#![feature(unique)]

use std::ptr::{Unique, self};
use std::mem;
use std::ops::{Deref, DerefMut};
use std::marker::PhantomData;
use std::alloc::{GlobalAlloc, Layout, Global, oom};

struct RawVec&lt;T&gt; {
    ptr: Unique&lt;T&gt;,
    cap: usize,
}

impl&lt;T&gt; RawVec&lt;T&gt; {
    fn new() -&gt; Self {
        // !0 is usize::MAX. This branch should be stripped at compile time.
        let cap = if mem::size_of::&lt;T&gt;() == 0 { !0 } else { 0 };

        // Unique::empty() doubles as &quot;unallocated&quot; and &quot;zero-sized allocation&quot;
        RawVec { ptr: Unique::empty(), cap: cap }
    }

    fn grow(&amp;mut self) {
        unsafe {
            let elem_size = mem::size_of::&lt;T&gt;();

            // since we set the capacity to usize::MAX when elem_size is
            // 0, getting to here necessarily means the Vec is overfull.
            assert!(elem_size != 0, &quot;capacity overflow&quot;);

            let (new_cap, ptr) = if self.cap == 0 {
                let ptr = Global.alloc(Layout::array::&lt;T&gt;(1).unwrap());
                (1, ptr)
            } else {
                let new_cap = 2 * self.cap;
                let ptr = Global.realloc(self.ptr.as_ptr() as *mut _,
                                         Layout::array::&lt;T&gt;(self.cap).unwrap(),
                                         Layout::array::&lt;T&gt;(new_cap).unwrap().size());
                (new_cap, ptr)
            };

            // If allocate or reallocate fail, oom
            if ptr.is_null() {
                oom()
            }

            self.ptr = Unique::new_unchecked(ptr as *mut _);
            self.cap = new_cap;
        }
    }
}

impl&lt;T&gt; Drop for RawVec&lt;T&gt; {
    fn drop(&amp;mut self) {
        let elem_size = mem::size_of::&lt;T&gt;();
        if self.cap != 0 &amp;&amp; elem_size != 0 {
            unsafe {
                Global.dealloc(self.ptr.as_ptr() as *mut _,
                               Layout::array::&lt;T&gt;(self.cap).unwrap());
            }
        }
    }
}

pub struct Vec&lt;T&gt; {
    buf: RawVec&lt;T&gt;,
    len: usize,
}

impl&lt;T&gt; Vec&lt;T&gt; {
    fn ptr(&amp;self) -&gt; *mut T { self.buf.ptr.as_ptr() }

    fn cap(&amp;self) -&gt; usize { self.buf.cap }

    pub fn new() -&gt; Self {
        Vec { buf: RawVec::new(), len: 0 }
    }
    pub fn push(&amp;mut self, elem: T) {
        if self.len == self.cap() { self.buf.grow(); }

        unsafe {
            ptr::write(self.ptr().offset(self.len as isize), elem);
        }

        // Can't fail, we'll OOM first.
        self.len += 1;
    }

    pub fn pop(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.len == 0 {
            None
        } else {
            self.len -= 1;
            unsafe {
                Some(ptr::read(self.ptr().offset(self.len as isize)))
            }
        }
    }

    pub fn insert(&amp;mut self, index: usize, elem: T) {
        assert!(index &lt;= self.len, &quot;index out of bounds&quot;);
        if self.cap() == self.len { self.buf.grow(); }

        unsafe {
            if index &lt; self.len {
                ptr::copy(self.ptr().offset(index as isize),
                          self.ptr().offset(index as isize + 1),
                          self.len - index);
            }
            ptr::write(self.ptr().offset(index as isize), elem);
            self.len += 1;
        }
    }

    pub fn remove(&amp;mut self, index: usize) -&gt; T {
        assert!(index &lt; self.len, &quot;index out of bounds&quot;);
        unsafe {
            self.len -= 1;
            let result = ptr::read(self.ptr().offset(index as isize));
            ptr::copy(self.ptr().offset(index as isize + 1),
                      self.ptr().offset(index as isize),
                      self.len - index);
            result
        }
    }

    pub fn into_iter(self) -&gt; IntoIter&lt;T&gt; {
        unsafe {
            let iter = RawValIter::new(&amp;self);
            let buf = ptr::read(&amp;self.buf);
            mem::forget(self);

            IntoIter {
                iter: iter,
                _buf: buf,
            }
        }
    }

    pub fn drain(&amp;mut self) -&gt; Drain&lt;T&gt; {
        unsafe {
            let iter = RawValIter::new(&amp;self);

            // this is a mem::forget safety thing. If Drain is forgotten, we just
            // leak the whole Vec's contents. Also we need to do this *eventually*
            // anyway, so why not do it now?
            self.len = 0;

            Drain {
                iter: iter,
                vec: PhantomData,
            }
        }
    }
}

impl&lt;T&gt; Drop for Vec&lt;T&gt; {
    fn drop(&amp;mut self) {
        while let Some(_) = self.pop() {}
        // deallocation is handled by RawVec
    }
}

impl&lt;T&gt; Deref for Vec&lt;T&gt; {
    type Target = [T];
    fn deref(&amp;self) -&gt; &amp;[T] {
        unsafe {
            ::std::slice::from_raw_parts(self.ptr(), self.len)
        }
    }
}

impl&lt;T&gt; DerefMut for Vec&lt;T&gt; {
    fn deref_mut(&amp;mut self) -&gt; &amp;mut [T] {
        unsafe {
            ::std::slice::from_raw_parts_mut(self.ptr(), self.len)
        }
    }
}





struct RawValIter&lt;T&gt; {
    start: *const T,
    end: *const T,
}

impl&lt;T&gt; RawValIter&lt;T&gt; {
    unsafe fn new(slice: &amp;[T]) -&gt; Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if mem::size_of::&lt;T&gt;() == 0 {
                ((slice.as_ptr() as usize) + slice.len()) as *const _
            } else if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
            }
        }
    }
}

impl&lt;T&gt; Iterator for RawValIter&lt;T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                let result = ptr::read(self.start);
                self.start = if mem::size_of::&lt;T&gt;() == 0 {
                    (self.start as usize + 1) as *const _
                } else {
                    self.start.offset(1)
                };
                Some(result)
            }
        }
    }

    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) {
        let elem_size = mem::size_of::&lt;T&gt;();
        let len = (self.end as usize - self.start as usize)
                  / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}

impl&lt;T&gt; DoubleEndedIterator for RawValIter&lt;T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; {
        if self.start == self.end {
            None
        } else {
            unsafe {
                self.end = if mem::size_of::&lt;T&gt;() == 0 {
                    (self.end as usize - 1) as *const _
                } else {
                    self.end.offset(-1)
                };
                Some(ptr::read(self.end))
            }
        }
    }
}




pub struct IntoIter&lt;T&gt; {
    _buf: RawVec&lt;T&gt;, // we don't actually care about this. Just need it to live.
    iter: RawValIter&lt;T&gt;,
}

impl&lt;T&gt; Iterator for IntoIter&lt;T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next() }
    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) { self.iter.size_hint() }
}

impl&lt;T&gt; DoubleEndedIterator for IntoIter&lt;T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next_back() }
}

impl&lt;T&gt; Drop for IntoIter&lt;T&gt; {
    fn drop(&amp;mut self) {
        for _ in &amp;mut *self {}
    }
}




pub struct Drain&lt;'a, T: 'a&gt; {
    vec: PhantomData&lt;&amp;'a mut Vec&lt;T&gt;&gt;,
    iter: RawValIter&lt;T&gt;,
}

impl&lt;'a, T&gt; Iterator for Drain&lt;'a, T&gt; {
    type Item = T;
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next() }
    fn size_hint(&amp;self) -&gt; (usize, Option&lt;usize&gt;) { self.iter.size_hint() }
}

impl&lt;'a, T&gt; DoubleEndedIterator for Drain&lt;'a, T&gt; {
    fn next_back(&amp;mut self) -&gt; Option&lt;T&gt; { self.iter.next_back() }
}

impl&lt;'a, T&gt; Drop for Drain&lt;'a, T&gt; {
    fn drop(&amp;mut self) {
        // pre-drain the iter
        for _ in &amp;mut self.iter {}
    }
}

# fn main() {}
</code></pre></pre>
<a class="header" href="print.html#implementing-arc-and-mutex" id="implementing-arc-and-mutex"><h1>Implementing Arc and Mutex</h1></a>
<p>Knowing the theory is all fine and good, but the <em>best</em> way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.</p>
<p>TODO: ALL OF THIS OMG</p>
<a class="header" href="print.html#foreign-function-interface" id="foreign-function-interface"><h1>Foreign Function Interface</h1></a>
<a class="header" href="print.html#introduction" id="introduction"><h1>Introduction</h1></a>
<p>This guide will use the <a href="https://github.com/google/snappy">snappy</a>
compression/decompression library as an introduction to writing bindings for
foreign code. Rust is currently unable to call directly into a C++ library, but
snappy includes a C interface (documented in
<a href="https://github.com/google/snappy/blob/master/snappy-c.h"><code>snappy-c.h</code></a>).</p>
<a class="header" href="print.html#a-note-about-libc" id="a-note-about-libc"><h2>A note about libc</h2></a>
<p>Many of these examples use <a href="https://crates.io/crates/libc">the <code>libc</code> crate</a>, which provides various
type definitions for C types, among other things. If you’re trying these
examples yourself, you’ll need to add <code>libc</code> to your <code>Cargo.toml</code>:</p>
<pre><code class="language-toml">[dependencies]
libc = &quot;0.2.0&quot;
</code></pre>
<p>and add <code>extern crate libc;</code> to your crate root.</p>
<a class="header" href="print.html#calling-foreign-functions" id="calling-foreign-functions"><h2>Calling foreign functions</h2></a>
<p>The following is a minimal example of calling a foreign function which will
compile if snappy is installed:</p>
<pre><code class="language-rust ignore">extern crate libc;
use libc::size_t;

#[link(name = &quot;snappy&quot;)]
extern {
    fn snappy_max_compressed_length(source_length: size_t) -&gt; size_t;
}

fn main() {
    let x = unsafe { snappy_max_compressed_length(100) };
    println!(&quot;max compressed length of a 100 byte buffer: {}&quot;, x);
}
</code></pre>
<p>The <code>extern</code> block is a list of function signatures in a foreign library, in
this case with the platform's C ABI. The <code>#[link(...)]</code> attribute is used to
instruct the linker to link against the snappy library so the symbols are
resolved.</p>
<p>Foreign functions are assumed to be unsafe so calls to them need to be wrapped
with <code>unsafe {}</code> as a promise to the compiler that everything contained within
truly is safe. C libraries often expose interfaces that aren't thread-safe, and
almost any function that takes a pointer argument isn't valid for all possible
inputs since the pointer could be dangling, and raw pointers fall outside of
Rust's safe memory model.</p>
<p>When declaring the argument types to a foreign function, the Rust compiler
cannot check if the declaration is correct, so specifying it correctly is part
of keeping the binding correct at runtime.</p>
<p>The <code>extern</code> block can be extended to cover the entire snappy API:</p>
<pre><code class="language-rust ignore">extern crate libc;
use libc::{c_int, size_t};

#[link(name = &quot;snappy&quot;)]
extern {
    fn snappy_compress(input: *const u8,
                       input_length: size_t,
                       compressed: *mut u8,
                       compressed_length: *mut size_t) -&gt; c_int;
    fn snappy_uncompress(compressed: *const u8,
                         compressed_length: size_t,
                         uncompressed: *mut u8,
                         uncompressed_length: *mut size_t) -&gt; c_int;
    fn snappy_max_compressed_length(source_length: size_t) -&gt; size_t;
    fn snappy_uncompressed_length(compressed: *const u8,
                                  compressed_length: size_t,
                                  result: *mut size_t) -&gt; c_int;
    fn snappy_validate_compressed_buffer(compressed: *const u8,
                                         compressed_length: size_t) -&gt; c_int;
}
# fn main() {}
</code></pre>
<a class="header" href="print.html#creating-a-safe-interface" id="creating-a-safe-interface"><h1>Creating a safe interface</h1></a>
<p>The raw C API needs to be wrapped to provide memory safety and make use of higher-level concepts
like vectors. A library can choose to expose only the safe, high-level interface and hide the unsafe
internal details.</p>
<p>Wrapping the functions which expect buffers involves using raw-pointer accessors such as <code>as_ptr</code> and <code>as_mut_ptr</code> to manipulate Rust
vectors as pointers to memory. Rust's vectors are guaranteed to be a contiguous block of memory. The
length is the number of elements currently contained, and the capacity is the total size in elements of
the allocated memory. The length is less than or equal to the capacity.</p>
<pre><code class="language-rust ignore"># extern crate libc;
# use libc::{c_int, size_t};
# unsafe fn snappy_validate_compressed_buffer(_: *const u8, _: size_t) -&gt; c_int { 0 }
# fn main() {}
pub fn validate_compressed_buffer(src: &amp;[u8]) -&gt; bool {
    unsafe {
        snappy_validate_compressed_buffer(src.as_ptr(), src.len() as size_t) == 0
    }
}
</code></pre>
<p>The <code>validate_compressed_buffer</code> wrapper above makes use of an <code>unsafe</code> block, but it makes the
guarantee that calling it is safe for all inputs by leaving off <code>unsafe</code> from the function
signature.</p>
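<p>To try the pattern without installing snappy, a Rust function with a C ABI can stand in for the foreign one. In the sketch below <code>fake_validate</code> is entirely hypothetical (its acceptance rule is arbitrary and has nothing to do with snappy), and <code>usize</code> is used where the real binding uses <code>size_t</code>.</p>

```rust
use std::os::raw::c_int;
use std::slice;

/// Hypothetical stand-in for a C validation routine. It returns 0 when the
/// buffer is non-empty and starts with a non-zero byte, and 1 otherwise.
unsafe extern "C" fn fake_validate(buf: *const u8, len: usize) -> c_int {
    if len == 0 {
        return 1;
    }
    // The caller promises that `buf` points to `len` readable bytes.
    let bytes = unsafe { slice::from_raw_parts(buf, len) };
    if bytes[0] != 0 { 0 } else { 1 }
}

/// Safe wrapper: the pointer and length always come from the same live
/// slice, so no caller can violate the callee's contract.
pub fn validate(src: &[u8]) -> bool {
    unsafe { fake_validate(src.as_ptr(), src.len()) == 0 }
}
```

<p>The wrapper is safe to expose because a <code>&amp;[u8]</code> cannot be dangling and cannot lie about its length. All of the obligations that the raw function places on its caller are discharged inside the wrapper.</p>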
<p>The <code>snappy_compress</code> and <code>snappy_uncompress</code> functions are more complex, since a buffer has to be
allocated to hold the output too.</p>
<p>The <code>snappy_max_compressed_length</code> function can be used to allocate a vector with the maximum
required capacity to hold the compressed output. The vector can then be passed to the
<code>snappy_compress</code> function as an output parameter. An output parameter is also passed to retrieve
the true length after compression for setting the length.</p>
<pre><code class="language-rust ignore"># extern crate libc;
# use libc::{size_t, c_int};
# unsafe fn snappy_compress(a: *const u8, b: size_t, c: *mut u8,
#                           d: *mut size_t) -&gt; c_int { 0 }
# unsafe fn snappy_max_compressed_length(a: size_t) -&gt; size_t { a }
# fn main() {}
pub fn compress(src: &amp;[u8]) -&gt; Vec&lt;u8&gt; {
    unsafe {
        let srclen = src.len() as size_t;
        let psrc = src.as_ptr();

        let mut dstlen = snappy_max_compressed_length(srclen);
        let mut dst = Vec::with_capacity(dstlen as usize);
        let pdst = dst.as_mut_ptr();

        snappy_compress(psrc, srclen, pdst, &amp;mut dstlen);
        dst.set_len(dstlen as usize);
        dst
    }
}
</code></pre>
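<p>The output-parameter pattern can also be exercised without snappy. In this sketch <code>fake_compress</code> is a hypothetical C-ABI stand-in that simply copies its input, the 16 bytes of slack are an arbitrary choice, and the wrapper checks the status code before calling <code>set_len</code>, which is an addition to the pattern shown above.</p>

```rust
use std::os::raw::c_int;
use std::ptr;

/// Hypothetical stand-in for a C compression routine. On entry `*dst_len`
/// holds the capacity of `dst`; on success it is overwritten with the number
/// of bytes written. Returns 0 on success and 1 if `dst` is too small.
unsafe extern "C" fn fake_compress(
    src: *const u8,
    src_len: usize,
    dst: *mut u8,
    dst_len: *mut usize,
) -> c_int {
    unsafe {
        if *dst_len < src_len {
            return 1;
        }
        // "Compress" by copying the input unchanged.
        ptr::copy_nonoverlapping(src, dst, src_len);
        *dst_len = src_len;
        0
    }
}

/// Safe wrapper around the stand-in, returning `None` on failure.
pub fn compress(src: &[u8]) -> Option<Vec<u8>> {
    // Reserve more than needed, as a "max compressed length" query might ask.
    let mut dst_len = src.len() + 16;
    let mut dst: Vec<u8> = Vec::with_capacity(dst_len);
    unsafe {
        let status = fake_compress(src.as_ptr(), src.len(), dst.as_mut_ptr(), &mut dst_len);
        if status != 0 {
            return None;
        }
        // Only the first `dst_len` bytes were initialized by the callee,
        // so that is the length we are allowed to claim.
        dst.set_len(dst_len);
    }
    Some(dst)
}
```

<p>The important invariant is that <code>set_len</code> is only ever called with a length the callee reported as written, and never with one that exceeds the capacity reserved up front.</p>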
<p>Decompression is similar, because snappy stores the uncompressed size as part of the compression
format and <code>snappy_uncompressed_length</code> will retrieve the exact buffer size required.</p>
<pre><code class="language-rust ignore"># extern crate libc;
# use libc::{size_t, c_int};
# unsafe fn snappy_uncompress(compressed: *const u8,
#                             compressed_length: size_t,
#                             uncompressed: *mut u8,
#                             uncompressed_length: *mut size_t) -&gt; c_int { 0 }
# unsafe fn snappy_uncompressed_length(compressed: *const u8,
#                                      compressed_length: size_t,
#                                      result: *mut size_t) -&gt; c_int { 0 }
# fn main() {}
pub fn uncompress(src: &amp;[u8]) -&gt; Option&lt;Vec&lt;u8&gt;&gt; {
    unsafe {
        let srclen = src.len() as size_t;
        let psrc = src.as_ptr();

        let mut dstlen: size_t = 0;
        snappy_uncompressed_length(psrc, srclen, &amp;mut dstlen);

        let mut dst = Vec::with_capacity(dstlen as usize);
        let pdst = dst.as_mut_ptr();

        if snappy_uncompress(psrc, srclen, pdst, &amp;mut dstlen) == 0 {
            dst.set_len(dstlen as usize);
            Some(dst)
        } else {
            None // SNAPPY_INVALID_INPUT
        }
    }
}
</code></pre>
<p>Then, we can add some tests to show how to use them.</p>
<pre><code class="language-rust ignore"># extern crate libc;
# use libc::{c_int, size_t};
# unsafe fn snappy_compress(input: *const u8,
#                           input_length: size_t,
#                           compressed: *mut u8,
#                           compressed_length: *mut size_t)
#                           -&gt; c_int { 0 }
# unsafe fn snappy_uncompress(compressed: *const u8,
#                             compressed_length: size_t,
#                             uncompressed: *mut u8,
#                             uncompressed_length: *mut size_t)
#                             -&gt; c_int { 0 }
# unsafe fn snappy_max_compressed_length(source_length: size_t) -&gt; size_t { 0 }
# unsafe fn snappy_uncompressed_length(compressed: *const u8,
#                                      compressed_length: size_t,
#                                      result: *mut size_t)
#                                      -&gt; c_int { 0 }
# unsafe fn snappy_validate_compressed_buffer(compressed: *const u8,
#                                             compressed_length: size_t)
#                                             -&gt; c_int { 0 }
# fn main() { }

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn valid() {
        let d = vec![0xde, 0xad, 0xd0, 0x0d];
        let c: &amp;[u8] = &amp;compress(&amp;d);
        assert!(validate_compressed_buffer(c));
        assert!(uncompress(c) == Some(d));
    }

    #[test]
    fn invalid() {
        let d = vec![0, 0, 0, 0];
        assert!(!validate_compressed_buffer(&amp;d));
        assert!(uncompress(&amp;d).is_none());
    }

    #[test]
    fn empty() {
        let d = vec![];
        assert!(!validate_compressed_buffer(&amp;d));
        assert!(uncompress(&amp;d).is_none());
        let c = compress(&amp;d);
        assert!(validate_compressed_buffer(&amp;c));
        assert!(uncompress(&amp;c) == Some(d));
    }
}
</code></pre>
<a class="header" href="print.html#destructors-1" id="destructors-1"><h1>Destructors</h1></a>
<p>Foreign libraries often hand off ownership of resources to the calling code.
When this occurs, we must use Rust's destructors to provide safety and guarantee
the release of these resources (especially in the case of panic).</p>
<p>For more about destructors, see the <a href="../std/ops/trait.Drop.html">Drop trait</a>.</p>
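<p>As a minimal sketch of this pattern, the wrapper below releases a resource in its <code>Drop</code>
implementation. The <code>resource_create</code> and <code>resource_destroy</code> functions are hypothetical
stand-ins written in Rust so that the example is self-contained; with a real library they would be
declared in an <code>extern</code> block instead. The <code>LIVE</code> counter exists only to make the effect observable.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts resources that have been created but not yet destroyed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a foreign constructor that hands ownership
// of a heap allocation to the caller.
extern "C" fn resource_create() -> *mut i32 {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(0))
}

// Hypothetical stand-in for the matching foreign destructor.
extern "C" fn resource_destroy(raw: *mut i32) {
    if !raw.is_null() {
        LIVE.fetch_sub(1, Ordering::SeqCst);
        // The pointer came from `Box::into_raw` in `resource_create`.
        unsafe { drop(Box::from_raw(raw)); }
    }
}

/// Safe owner of the foreign resource.
pub struct Resource {
    raw: *mut i32,
}

impl Resource {
    pub fn new() -> Resource {
        Resource { raw: resource_create() }
    }
}

impl Drop for Resource {
    // Runs when the value goes out of scope, including during unwinding.
    fn drop(&mut self) {
        resource_destroy(self.raw);
    }
}

pub fn live_resources() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

<p>Because the release happens in <code>drop</code>, callers of <code>Resource::new()</code> do not need to remember a
matching cleanup call, and the resource is still released if a panic unwinds through the owning scope.</p>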
<a class="header" href="print.html#callbacks-from-c-code-to-rust-functions" id="callbacks-from-c-code-to-rust-functions"><h1>Callbacks from C code to Rust functions</h1></a>
<p>Some external libraries require the usage of callbacks to report back their
current state or intermediate data to the caller.
It is possible to pass functions defined in Rust to an external library.
The requirement for this is that the callback function is marked as <code>extern</code>
with the correct calling convention to make it callable from C code.</p>
<p>The callback function can then be sent through a registration call
to the C library and afterwards be invoked from there.</p>
<p>A basic example is:</p>
<p>Rust code:</p>
<pre><pre class="playpen"><code class="language-rust no_run">extern fn callback(a: i32) {
    println!(&quot;I'm called from C with value {0}&quot;, a);
}

#[link(name = &quot;extlib&quot;)]
extern {
   fn register_callback(cb: extern fn(i32)) -&gt; i32;
   fn trigger_callback();
}

fn main() {
    unsafe {
        register_callback(callback);
        trigger_callback(); // Triggers the callback.
    }
}
</code></pre></pre>
<p>C code:</p>
<pre><code class="language-c">typedef void (*rust_callback)(int32_t);
rust_callback cb;

int32_t register_callback(rust_callback callback) {
    cb = callback;
    return 1;
}

void trigger_callback() {
  cb(7); // Will call callback(7) in Rust.
}
</code></pre>
<p>In this example, Rust's <code>main()</code> will call <code>trigger_callback()</code> in C,
which will, in turn, call back to <code>callback()</code> in Rust.</p>
<a class="header" href="print.html#targeting-callbacks-to-rust-objects" id="targeting-callbacks-to-rust-objects"><h2>Targeting callbacks to Rust objects</h2></a>
<p>The previous example showed how a global function can be called from C code.
However, it is often desirable for the callback to target a specific
Rust object. This could be the object that represents the wrapper for the
respective C object.</p>
<p>This can be achieved by passing a raw pointer to the object down to the
C library. The C library can then include the pointer to the Rust object in
the notification. This will allow the callback to unsafely access the
referenced Rust object.</p>
<p>Rust code:</p>
<pre><pre class="playpen"><code class="language-rust no_run">#[repr(C)]
struct RustObject {
    a: i32,
    // Other members...
}

extern &quot;C&quot; fn callback(target: *mut RustObject, a: i32) {
    println!(&quot;I'm called from C with value {0}&quot;, a);
    unsafe {
        // Update the value in RustObject with the value received from the callback:
        (*target).a = a;
    }
}

#[link(name = &quot;extlib&quot;)]
extern {
   fn register_callback(target: *mut RustObject,
                        cb: extern fn(*mut RustObject, i32)) -&gt; i32;
   fn trigger_callback();
}

fn main() {
    // Create the object that will be referenced in the callback:
    let mut rust_object = Box::new(RustObject { a: 5 });

    unsafe {
        register_callback(&amp;mut *rust_object, callback);
        trigger_callback();
    }
}
</code></pre></pre>
<p>C code:</p>
<pre><code class="language-c">typedef void (*rust_callback)(void*, int32_t);
void* cb_target;
rust_callback cb;

int32_t register_callback(void* callback_target, rust_callback callback) {
    cb_target = callback_target;
    cb = callback;
    return 1;
}

void trigger_callback() {
  cb(cb_target, 7); // Will call callback(&amp;rustObject, 7) in Rust.
}
</code></pre>
<a class="header" href="print.html#asynchronous-callbacks" id="asynchronous-callbacks"><h2>Asynchronous callbacks</h2></a>
<p>In the previously given examples the callbacks are invoked as a direct reaction
to a function call to the external C library.
The control over the current thread is switched from Rust to C to Rust for the
execution of the callback, but in the end the callback is executed on the
same thread that called the function which triggered the callback.</p>
<p>Things get more complicated when the external library spawns its own threads
and invokes callbacks from there.
In these cases access to Rust data structures inside the callbacks is
especially unsafe and proper synchronization mechanisms must be used.
Besides classical synchronization mechanisms like mutexes, one possibility in
Rust is to use channels (in <code>std::sync::mpsc</code>) to forward data from the C
thread that invoked the callback into a Rust thread.</p>
<p>If an asynchronous callback targets a specific object in the Rust address space,
it is also absolutely necessary that no more callbacks are performed by the
C library after the respective Rust object gets destroyed.
This can be achieved by unregistering the callback in the object's
destructor and designing the library in a way that guarantees that no
callback will be performed after deregistration.</p>
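<p>The channel approach can be sketched as follows. Everything here is an assumption made for
illustration: <code>fake_library_start</code> is a hypothetical stand-in for a C library that invokes the
callback from its own thread (simulated with <code>std::thread</code>), and <code>on_event</code>,
<code>SendPtr</code> and <code>collect_events</code> are names invented for this example.</p>

```rust
use std::os::raw::c_void;
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Callback invoked on the foreign thread. `target` points to a
// `Sender<i32>` that is kept alive until all callbacks have finished.
extern "C" fn on_event(target: *mut c_void, value: i32) {
    let tx = unsafe { &*(target as *const Sender<i32>) };
    // Forward the value to the Rust side; ignore a closed channel.
    let _ = tx.send(value);
}

// Raw pointers are not `Send`, so wrap the pointer to move it into the
// simulated library thread. This is sound here only because the sender
// is not touched by any other thread while the worker is running.
struct SendPtr(*mut c_void);
unsafe impl Send for SendPtr {}

impl SendPtr {
    fn get(&self) -> *mut c_void {
        self.0
    }
}

// Hypothetical stand-in for a C library that calls back from its own thread.
fn fake_library_start(
    target: *mut c_void,
    cb: extern "C" fn(*mut c_void, i32),
) -> thread::JoinHandle<()> {
    let target = SendPtr(target);
    thread::spawn(move || {
        for value in 1..4 {
            cb(target.get(), value);
        }
    })
}

pub fn collect_events() -> Vec<i32> {
    let (tx, rx) = channel::<i32>();
    // Hand the sender to the "library" as an opaque pointer.
    let raw = Box::into_raw(Box::new(tx)) as *mut c_void;
    let worker = fake_library_start(raw, on_event);
    // Joining guarantees that no callback runs after this point.
    worker.join().expect("worker thread panicked");
    // Reclaim and drop the sender so that the receiver sees the end of the stream.
    unsafe { drop(Box::from_raw(raw as *mut Sender<i32>)); }
    rx.iter().collect()
}
```

<p>The sender is freed only after the worker thread has been joined, which mirrors the requirement
that no callback may run once the targeted object has been destroyed.</p>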
<a class="header" href="print.html#linking" id="linking"><h1>Linking</h1></a>
<p>The <code>link</code> attribute on <code>extern</code> blocks provides the basic building block for
instructing rustc how it will link to native libraries. There are two accepted
forms of the link attribute today:</p>
<ul>
<li><code>#[link(name = &quot;foo&quot;)]</code></li>
<li><code>#[link(name = &quot;foo&quot;, kind = &quot;bar&quot;)]</code></li>
</ul>
<p>In both of these cases, <code>foo</code> is the name of the native library that we're
linking to, and in the second case <code>bar</code> is the type of native library that the
compiler is linking to. There are currently three known types of native
libraries:</p>
<ul>
<li>Dynamic - <code>#[link(name = &quot;readline&quot;)]</code></li>
<li>Static - <code>#[link(name = &quot;my_build_dependency&quot;, kind = &quot;static&quot;)]</code></li>
<li>Frameworks - <code>#[link(name = &quot;CoreFoundation&quot;, kind = &quot;framework&quot;)]</code></li>
</ul>
<p>Note that frameworks are only available on macOS targets.</p>
<p>The different <code>kind</code> values are meant to differentiate how the native library
participates in linkage. From a linkage perspective, the Rust compiler creates
two flavors of artifacts: partial (rlib/staticlib) and final (dylib/binary).
Native dynamic library and framework dependencies are propagated to the final
artifact boundary, while static library dependencies are not propagated at
all, because the static libraries are integrated directly into the subsequent
artifact.</p>
<p>A few examples of how this model can be used are:</p>
<ul>
<li>
<p>A native build dependency. Sometimes some C/C++ glue is needed when writing
some Rust code, but distribution of the C/C++ code in a library format is
a burden. In this case, the code will be archived into <code>libfoo.a</code> and then the
Rust crate would declare a dependency via <code>#[link(name = &quot;foo&quot;, kind = &quot;static&quot;)]</code>.</p>
<p>Regardless of the flavor of output for the crate, the native static library
will be included in the output, meaning that distribution of the native static
library is not necessary.</p>
</li>
<li>
<p>A normal dynamic dependency. Common system libraries (like <code>readline</code>) are
available on a large number of systems, and often a static copy of these
libraries cannot be found. When this dependency is included in a Rust crate,
partial targets (like rlibs) will not link to the library, but when the rlib
is included in a final target (like a binary), the native library will be
linked in.</p>
</li>
</ul>
<p>On macOS, frameworks behave with the same semantics as a dynamic library.</p>
<a class="header" href="print.html#unsafe-blocks" id="unsafe-blocks"><h1>Unsafe blocks</h1></a>
<p>Some operations, like dereferencing raw pointers or calling functions that have been marked
unsafe, are only allowed inside unsafe blocks. Unsafe blocks isolate unsafety and are a promise to
the compiler that the unsafety does not leak out of the block.</p>
<p>Unsafe functions, on the other hand, advertise their unsafety to the world. An unsafe function is written like
this:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
unsafe fn kaboom(ptr: *const i32) -&gt; i32 { *ptr }
#}</code></pre></pre>
<p>This function can only be called from an <code>unsafe</code> block or another <code>unsafe</code> function.</p>
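<p>A short sketch of the difference: the safe function below wraps a call to <code>kaboom</code> in an
<code>unsafe</code> block. The name <code>read_value</code> is invented for this example. The wrapper can be safe
because a reference always yields a valid, aligned pointer.</p>

```rust
// Callers must guarantee that `ptr` is valid for reads.
unsafe fn kaboom(ptr: *const i32) -> i32 {
    unsafe { *ptr }
}

/// Safe wrapper: a `&i32` always converts to a valid pointer, so the
/// unsafety does not leak out of this function.
pub fn read_value(value: &i32) -> i32 {
    unsafe { kaboom(value as *const i32) }
}
```

<p>Callers of <code>read_value</code> need no <code>unsafe</code> block of their own, whereas every direct call to
<code>kaboom</code> does.</p>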
<a class="header" href="print.html#accessing-foreign-globals" id="accessing-foreign-globals"><h1>Accessing foreign globals</h1></a>
<p>Foreign APIs often export a global variable which could do something like track
global state. In order to access these variables, you declare them in <code>extern</code>
blocks with the <code>static</code> keyword:</p>
<pre><code class="language-rust ignore">extern crate libc;

#[link(name = &quot;readline&quot;)]
extern {
    static rl_readline_version: libc::c_int;
}

fn main() {
    println!(&quot;You have readline version {} installed.&quot;,
             unsafe { rl_readline_version as i32 });
}
</code></pre>
<p>Alternatively, you may need to alter global state provided by a foreign
interface. To do this, statics can be declared with <code>mut</code> so we can mutate
them.</p>
<pre><code class="language-rust ignore">extern crate libc;

use std::ffi::CString;
use std::ptr;

#[link(name = &quot;readline&quot;)]
extern {
    static mut rl_prompt: *const libc::c_char;
}

fn main() {
    let prompt = CString::new(&quot;[my-awesome-shell] $&quot;).unwrap();
    unsafe {
        rl_prompt = prompt.as_ptr();

        println!(&quot;{:?}&quot;, rl_prompt);

        rl_prompt = ptr::null();
    }
}
</code></pre>
<p>Note that all interaction with a <code>static mut</code> is unsafe, both reading and
writing. Dealing with global mutable state requires a great deal of care.</p>
<a class="header" href="print.html#foreign-calling-conventions" id="foreign-calling-conventions"><h1>Foreign calling conventions</h1></a>
<p>Most foreign code exposes a C ABI, and Rust uses the platform's C calling convention by default when
calling foreign functions. Some foreign functions, most notably the Windows API, use other calling
conventions. Rust provides a way to tell the compiler which convention to use:</p>
<pre><code class="language-rust ignore">extern crate libc;

#[cfg(all(target_os = &quot;windows&quot;, target_arch = &quot;x86&quot;))]
#[link(name = &quot;kernel32&quot;)]
#[allow(non_snake_case)]
extern &quot;stdcall&quot; {
    fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -&gt; libc::c_int;
}
# fn main() { }
</code></pre>
<p>This applies to the entire <code>extern</code> block. The list of supported ABI constraints
is:</p>
<ul>
<li><code>stdcall</code></li>
<li><code>aapcs</code></li>
<li><code>cdecl</code></li>
<li><code>fastcall</code></li>
<li><code>vectorcall</code>
This is currently hidden behind the <code>abi_vectorcall</code> gate and is subject to change.</li>
<li><code>Rust</code></li>
<li><code>rust-intrinsic</code></li>
<li><code>system</code></li>
<li><code>C</code></li>
<li><code>win64</code></li>
<li><code>sysv64</code></li>
</ul>
<p>Most of the ABIs in this list are self-explanatory, but the <code>system</code> ABI may
seem a little odd. This constraint selects whatever the appropriate ABI is for
interoperating with the target's libraries. For example, on 32-bit Windows with an x86
architecture, this means that the ABI used would be <code>stdcall</code>. On x86_64,
however, Windows uses the <code>C</code> calling convention, so <code>C</code> would be used. This
means that in our previous example, we could have used <code>extern &quot;system&quot; { ... }</code>
to define a block for all Windows systems, not only x86 ones.</p>
<a class="header" href="print.html#interoperability-with-foreign-code" id="interoperability-with-foreign-code"><h1>Interoperability with foreign code</h1></a>
<p>Rust guarantees that the layout of a <code>struct</code> is compatible with the platform's
representation in C only if the <code>#[repr(C)]</code> attribute is applied to it.
<code>#[repr(C, packed)]</code> can be used to lay out struct members without padding.
<code>#[repr(C)]</code> can also be applied to an enum.</p>
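<p>To illustrate the effect of padding, the sketch below declares the same two fields with and
without <code>packed</code>. The struct names are invented for this example, and the padded size quoted in the
comment assumes a target where <code>u32</code> is 4-byte aligned, which holds on common platforms.</p>

```rust
use std::mem::size_of;

// Laid out as a C compiler would: padding is inserted after `tag`
// so that `value` is properly aligned.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u32,
}

// Same fields, but laid out back to back with no padding.
#[repr(C, packed)]
pub struct Packed {
    pub tag: u8,
    pub value: u32,
}

/// Returns (padded size, packed size) in bytes.
/// On targets where `u32` has 4-byte alignment this is (8, 5).
pub fn layout_sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
```

<p>Fields of a packed struct may be misaligned, so they should be read by value rather than through
references.</p>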
<p>Rust's owned boxes (<code>Box&lt;T&gt;</code>) use non-nullable pointers as handles which point
to the contained object. However, they should not be manually created because
they are managed by internal allocators. References can safely be assumed to be
non-nullable pointers directly to the type.  However, breaking the borrow
checking or mutability rules is not guaranteed to be safe, so prefer using raw
pointers (<code>*</code>) if that's needed because the compiler can't make as many
assumptions about them.</p>
<p>Vectors and strings share the same basic memory layout, and utilities are
available in the <code>vec</code> and <code>str</code> modules for working with C APIs. However,
strings are not terminated with <code>\0</code>. If you need a NUL-terminated string for
interoperability with C, you should use the <code>CString</code> type in the <code>std::ffi</code>
module.</p>
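<p>A small sketch of the conversion in both directions, using only <code>std::ffi</code>. The helper names
<code>to_c_string</code> and <code>c_to_string</code> are invented for this example.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Copies a Rust string into a NUL-terminated buffer.
/// Returns `None` if the input contains an interior NUL byte,
/// which cannot be represented in a C string.
pub fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Reads a NUL-terminated C string into an owned Rust `String`,
/// replacing invalid UTF-8 sequences.
///
/// Callers must guarantee that `ptr` points to a valid NUL-terminated string.
pub unsafe fn c_to_string(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}
```

<p>The <code>CString</code> must outlive any pointer obtained from <code>as_ptr()</code>, so it is usually bound to a
variable before the pointer is passed to C.</p>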
<p>The <a href="https://crates.io/crates/libc"><code>libc</code> crate on crates.io</a> includes type aliases and function
definitions for the C standard library in the <code>libc</code> module, and Rust links
against <code>libc</code> and <code>libm</code> by default.</p>
<a class="header" href="print.html#variadic-functions" id="variadic-functions"><h1>Variadic functions</h1></a>
<p>In C, functions can be 'variadic', meaning they accept a variable number of arguments. This can
be achieved in Rust by specifying <code>...</code> within the argument list of a foreign function declaration:</p>
<pre><code class="language-rust no_run">extern {
    fn foo(x: i32, ...);
}

fn main() {
    unsafe {
        foo(10, 20, 30, 40, 50);
    }
}
</code></pre>
<p>Normal Rust functions can <em>not</em> be variadic:</p>
<pre><code class="language-rust ignore">// This will not compile

fn foo(x: i32, ...) { }
</code></pre>
<a class="header" href="print.html#the-nullable-pointer-optimization" id="the-nullable-pointer-optimization"><h1>The &quot;nullable pointer optimization&quot;</h1></a>
<p>Certain Rust types are defined to never be <code>null</code>. This includes references (<code>&amp;T</code>,
<code>&amp;mut T</code>), boxes (<code>Box&lt;T&gt;</code>), and function pointers (<code>extern &quot;abi&quot; fn()</code>). When
interfacing with C, pointers that might be <code>null</code> are often used, which would seem to
require some messy <code>transmute</code>s and/or unsafe code to handle conversions to/from Rust types.
However, the language provides a workaround.</p>
<p>As a special case, an <code>enum</code> is eligible for the &quot;nullable pointer optimization&quot; if it contains
exactly two variants, one of which contains no data and the other contains a field of one of the
non-nullable types listed above.  This means no extra space is required for a discriminant; rather,
the empty variant is represented by putting a <code>null</code> value into the non-nullable field. This is
called an &quot;optimization&quot;, but unlike other optimizations it is guaranteed to apply to eligible
types.</p>
<p>The most common type that takes advantage of the nullable pointer optimization is <code>Option&lt;T&gt;</code>,
where <code>None</code> corresponds to <code>null</code>. So <code>Option&lt;extern &quot;C&quot; fn(c_int) -&gt; c_int&gt;</code> is a correct way
to represent a nullable function pointer using the C ABI (corresponding to the C type
<code>int (*)(int)</code>).</p>
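<p>The guarantee can be checked with <code>std::mem::size_of</code>. In the sketch below, the names
<code>Callback</code>, <code>double</code>, <code>call_or_default</code> and <code>option_is_pointer_sized</code> are invented for
illustration.</p>

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// A non-nullable function pointer using the C ABI.
pub type Callback = extern "C" fn(c_int) -> c_int;

pub extern "C" fn double(x: c_int) -> c_int {
    x * 2
}

/// `None` plays the role of a null function pointer.
pub fn call_or_default(cb: Option<Callback>, x: c_int) -> c_int {
    match cb {
        Some(f) => f(x),
        None => x,
    }
}

/// Wrapping a non-nullable pointer type in `Option` adds no space,
/// because `None` is represented by the null value.
pub fn option_is_pointer_sized() -> bool {
    size_of::<Option<Callback>>() == size_of::<Callback>()
        && size_of::<Option<&i32>>() == size_of::<&i32>()
        && size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>()
}
```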
<p>Here is a contrived example. Let's say some C library has a facility for registering a
callback, which gets called in certain situations. The callback is passed a function pointer
and an integer and it is supposed to run the function with the integer as a parameter. So
we have function pointers flying across the FFI boundary in both directions.</p>
<pre><code class="language-rust ignore">extern crate libc;
use libc::c_int;

# #[cfg(hidden)]
extern &quot;C&quot; {
    /// Registers the callback.
    fn register(cb: Option&lt;extern &quot;C&quot; fn(Option&lt;extern &quot;C&quot; fn(c_int) -&gt; c_int&gt;, c_int) -&gt; c_int&gt;);
}
# unsafe fn register(_: Option&lt;extern &quot;C&quot; fn(Option&lt;extern &quot;C&quot; fn(c_int) -&gt; c_int&gt;,
#                                            c_int) -&gt; c_int&gt;)
# {}

/// This fairly useless function receives a function pointer and an integer
/// from C, and returns the result of calling the function with the integer.
/// In case no function is provided, it squares the integer by default.
extern &quot;C&quot; fn apply(process: Option&lt;extern &quot;C&quot; fn(c_int) -&gt; c_int&gt;, int: c_int) -&gt; c_int {
    match process {
        Some(f) =&gt; f(int),
        None    =&gt; int * int
    }
}

fn main() {
    unsafe {
        register(Some(apply));
    }
}
</code></pre>
<p>And the code on the C side looks like this:</p>
<pre><code class="language-c">void register(int (*f)(int (*)(int), int)) {
    ...
}
</code></pre>
<p>No <code>transmute</code> required!</p>
<a class="header" href="print.html#calling-rust-code-from-c" id="calling-rust-code-from-c"><h1>Calling Rust code from C</h1></a>
<p>You may wish to compile Rust code in a way so that it can be called from C. This is
fairly easy, but requires a few things:</p>
<pre><pre class="playpen"><code class="language-rust">#[no_mangle]
pub extern fn hello_rust() -&gt; *const u8 {
    &quot;Hello, world!\0&quot;.as_ptr()
}
# fn main() {}
</code></pre></pre>
<p>The <code>extern</code> makes this function adhere to the C calling convention, as
discussed above in &quot;<a href="ffi.html#foreign-calling-conventions">Foreign Calling
Conventions</a>&quot;. The <code>no_mangle</code>
attribute turns off Rust's name mangling, so that it is easier to link to.</p>
<a class="header" href="print.html#ffi-and-panics" id="ffi-and-panics"><h1>FFI and panics</h1></a>
<p>It’s important to be mindful of <code>panic!</code>s when working with FFI. A <code>panic!</code>
across an FFI boundary is undefined behavior. If you’re writing code that may
panic, you should run it in a closure with <a href="../std/panic/fn.catch_unwind.html"><code>catch_unwind</code></a>:</p>
<pre><pre class="playpen"><code class="language-rust">use std::panic::catch_unwind;

#[no_mangle]
pub extern fn oh_no() -&gt; i32 {
    let result = catch_unwind(|| {
        panic!(&quot;Oops!&quot;);
    });
    match result {
        Ok(_) =&gt; 0,
        Err(_) =&gt; 1,
    }
}

fn main() {}
</code></pre></pre>
<p>Please note that <a href="../std/panic/fn.catch_unwind.html"><code>catch_unwind</code></a> will only catch unwinding panics, not
those that abort the process. See the documentation of <a href="../std/panic/fn.catch_unwind.html"><code>catch_unwind</code></a>
for more information.</p>
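<p>As a further sketch, a panic can be turned into an error code while a successful result is
returned through an output parameter. The function <code>checked_div</code> and its error codes are invented for
this example. Integer division panics on division by zero, which makes it a convenient stand-in for
code that may panic.</p>

```rust
use std::panic::catch_unwind;

/// Divides `a` by `b` and writes the quotient to `out`.
/// Returns 0 on success, 1 if the computation panicked,
/// and -1 if `out` is null. `out` is left untouched on failure.
pub extern "C" fn checked_div(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return -1;
    }
    // The closure may panic; the panic is caught here instead of
    // unwinding across the FFI boundary.
    match catch_unwind(|| a / b) {
        Ok(quotient) => {
            unsafe { *out = quotient; }
            0
        }
        Err(_) => 1,
    }
}
```

<p>The default panic hook still prints a message to standard error when the panic occurs. To export
such a function under its unmangled name, it would additionally carry the <code>no_mangle</code> attribute, as in
the example above.</p>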
<a class="header" href="print.html#representing-opaque-structs" id="representing-opaque-structs"><h1>Representing opaque structs</h1></a>
<p>Sometimes, a C library wants to provide a pointer to something, but not let you
know the internal details of the thing it wants. The simplest way is to use a
<code>void *</code> argument:</p>
<pre><code class="language-c">void foo(void *arg);
void bar(void *arg);
</code></pre>
<p>We can represent this in Rust with the <code>c_void</code> type:</p>
<pre><code class="language-rust ignore">extern crate libc;

extern &quot;C&quot; {
    pub fn foo(arg: *mut libc::c_void);
    pub fn bar(arg: *mut libc::c_void);
}
# fn main() {}
</code></pre>
<p>This is a perfectly valid way of handling the situation. However, we can do a bit
better: a <code>void *</code> offers no type safety. Some C libraries instead declare a <code>struct</code> whose
details and memory layout are private, which gives some amount
of type safety. These structures are called ‘opaque’. Here’s an example, in C:</p>
<pre><code class="language-c">struct Foo; /* Foo is a structure, but its contents are not part of the public interface */
struct Bar;
void foo(struct Foo *arg);
void bar(struct Bar *arg);
</code></pre>
<p>To do this in Rust, let’s create our own opaque types with <code>enum</code>:</p>
<pre><pre class="playpen"><code class="language-rust">pub enum Foo {}
pub enum Bar {}

extern &quot;C&quot; {
    pub fn foo(arg: *mut Foo);
    pub fn bar(arg: *mut Bar);
}
# fn main() {}
</code></pre></pre>
<p>By using an <code>enum</code> with no variants, we create an opaque type that we can’t
instantiate. And because our <code>Foo</code> and <code>Bar</code> types are
different, we get type safety between the two of them, so we cannot
accidentally pass a pointer to <code>Foo</code> to <code>bar()</code>.</p>

                    </main>

                    <nav class="nav-wrapper" aria-label="Page navigation">
                        <!-- Mobile navigation buttons -->
                        

                        

                        <div style="clear: both"></div>
                    </nav>
                </div>
            </div>

            <nav class="nav-wide-wrapper" aria-label="Page navigation">
                

                
            </nav>

        </div>

        

        

        
        <script type="text/javascript">
            document.addEventListener('DOMContentLoaded', function() {
                window.print();
            })
        </script>
        

        

        
        <script src="searchindex.js" type="text/javascript" charset="utf-8"></script>
        
        
        <script src="elasticlunr.min.js" type="text/javascript" charset="utf-8"></script>
        <script src="mark.min.js" type="text/javascript" charset="utf-8"></script>
        <script src="searcher.js" type="text/javascript" charset="utf-8"></script>
        

        <script src="clipboard.min.js" type="text/javascript" charset="utf-8"></script>
        <script src="highlight.js" type="text/javascript" charset="utf-8"></script>
        <script src="book.js" type="text/javascript" charset="utf-8"></script>

        <!-- Custom JS scripts -->
        

    </body>
</html>