Understanding Rust Ownership and Borrowing

Learn Rust ownership and borrowing with practical examples. Understand moves, references, lifetimes, the borrow checker, and memory safety.

What is Ownership?

Every value in Rust has exactly one owner (this article ignores smart pointers such as Rc and Arc, which do allow shared ownership). For now, just grasping the concept of ownership is a massive step for beginners, and it builds the confidence to write lean, professional Rust code that is performant and idiomatic.

So let's get stuck in:

A value always has a type, and that type largely determines whether the value is stored in stack memory or heap memory. Memory is organised into stack and heap to separate values whose size is fixed and known at compile time from those whose size is not. The stack is a highly organised Last-In First-Out (LIFO) region: allocating and freeing is just a matter of moving the stack pointer, so access is very fast, with no fragmentation and no reallocation. The heap has no such benefits. Every allocation must be requested from an allocator and released later, which makes it the slower, relatively disorganised option for values that can grow, shrink, or whose size is only known at runtime, as opposed to the privileged fixed-size primitives (we will get more into this later).
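As a rough illustration of the stack/heap split, the sketch below compares the inline size of an i32 with that of a String. The helper names inline_size and handle_size are made up for this example. The point it tries to show is that a String's inline part is a small fixed-size handle, regardless of how much text it owns on the heap.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: bytes a value of type `T` occupies inline
/// (on the stack, when it is a local variable).
fn inline_size<T>() -> usize {
    size_of::<T>()
}

/// Hypothetical helper: inline size of a String's handle.
/// The text it owns is not counted, because that lives on the heap.
fn handle_size(s: &String) -> usize {
    size_of_val(s)
}

fn main() {
    // An i32 is fixed-size and stored inline.
    let n: i32 = 42;

    // A String is a small fixed-size handle stored inline;
    // the characters themselves live in a heap allocation.
    let short = String::from("Rust");
    let long = String::from("Rust Unwrapped: ownership and borrowing");

    println!("n = {n}, i32 inline size: {} bytes", inline_size::<i32>());
    println!("short handle: {} bytes, text: {} bytes", handle_size(&short), short.len());
    println!("long handle:  {} bytes, text: {} bytes", handle_size(&long), long.len());
}
```

Both handles should report the same size even though the two strings hold different amounts of text.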

Of course values in memory must persist, and provide the optional capability of being mutable.

As variables are assigned values, they consume memory; once those variables are no longer needed, that memory can be deallocated. How this happens is the subject of most of this article. To never deallocate memory is to leak it, and an unchecked leak has only one finality - app death.

All programming languages make a conscious choice as to how they avoid running out of memory. This choice boils down to one of 3 approaches:

Developer (ir)responsibility (wing and a prayer)
Outsourced responsibility to a scheduled Garbage Collection (GC) clean-up process
Shift-left, let’s lean into this problem and own it: RAII

Languages where memory management is left to the whim of the developer are notoriously correlated with memory vulnerabilities - we all make mistakes after all! (Microsoft and Google are the most famous examples: both have publicly attributed a large share of their security vulnerabilities to memory-safety bugs in C/C++, and both are increasingly adopting Rust for new systems code.)

Languages which outsource the responsibility to a runtime process benefit from abstracting the complexity away from the developer. The GC process is delighted to deallocate (collect) unused objects (garbage) from memory. It runs periodically and works out which objects are no longer in use (some collectors trace references from a set of roots, others keep a reference count per object); once nothing refers to an object, sayonara object. The problem with garbage collection, however, is that it consumes significant resources and is uninvited and unpredictable as to when it runs. This can be oppressive on latency-critical software, or on software running with minimal resources, such as edge devices, where performance and power use must be tightly managed.

Figure: Garbage Collection process

Third up is Resource Acquisition is Initialisation (RAII). This is a popular choice that truly “owns” the problem of memory management. It does this by assigning a memory resource lifetime to the value at the time of initialisation (creation), hence the almost perfectly descriptive name.

Acquisition of a resource (memory) is tightly coupled to the initialisation of a variable (let x: i32 = 42;). What the name leaves out is that resource acquisition also comes with guaranteed deallocation, because at the point of initialisation you are also fixing the scope of the variable, whether that scope is the main function, a user-defined function, or just a code sub-block { } (a static variable lives for the whole program). When that scope ends, the lifetime expires and the value is dropped. These scopes are entirely visible to the compiler ahead of runtime, so this is not a difficult task for the compiler to manage.
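To make scope-based clean-up visible, here is a minimal sketch. Tracked is a hypothetical type invented for this example; it implements the standard Drop trait and records a message when it is dropped, so we can see that each value is released exactly where its scope ends.

```rust
use std::cell::RefCell;

/// Hypothetical type that records a message when it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Runs two nested scopes and returns the events in the order they happened.
fn run_scopes() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
            log.borrow_mut().push("end of inner block".to_string());
        } // `_inner` goes out of scope here and is dropped
        log.borrow_mut().push("end of outer block".to_string());
    } // `_outer` goes out of scope here and is dropped
    log.into_inner()
}

fn main() {
    for event in run_scopes() {
        println!("{event}");
    }
}
```

No explicit free or delete call appears anywhere: the compiler inserts the drop at the closing brace of each scope.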

So why outsource housekeeping to runtime? Why ignore the protection and management of your most finite and precious resource (memory)? Rust does neither - RAII lifetimes are the perfect solution to avoiding leaky memory and delivering software that is lean and secure.

So where does Ownership come into the picture?

The term ownership is largely synonymous with responsibility. Every value (data in memory) in Rust is owned by, that is, under the responsibility of, one and only one symbol (there is no concept of “objects” in Rust). The owning symbol dictates the lifetime of the value as described above.

Ownership is a responsibility, and with it comes perfectly well-maintained resources in memory - nothing old or stale, no leaks, just all perfect and valid. The borrow checker (part of the compiler) enforces this with strict dues, kinda like ground fees on leasehold property.

Trivial things like binding an existing value to a different variable (symbol), or passing a value as an argument to a function, result in a move: the original symbol surrenders ownership of the value to the new symbol (a value can have only one owner). The lifetime of the value is now that of the proud new owner. The original (previous) owner is decoupled from the value and, as you will quickly discover in Rust, any attempt to use that symbol afterwards is rejected at compile time with a “use of moved value” or “borrow of moved value” error (E0382).

Move Semantics

fn main() {
    let s1: String = String::from("Rust Unwrapped");
    let s2: String = s1; // ownership of value moved to s2
    // s1 is now invalid: the String was moved to s2, not dropped
    // the following line will not compile (error[E0382]: borrow of moved value: `s1`)
    println!("String s1: {}", s1); // s1 cannot be used
}


Think of this process as move semantics.
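The same move happens when a value is passed to a function. The sketch below uses two hypothetical functions, consume and shout, to show ownership moving into a function and, if the function chooses, being handed back to the caller through the return value.

```rust
/// Hypothetical function: takes ownership of `s`.
/// The String is dropped when this function returns.
fn consume(s: String) -> usize {
    s.len()
}

/// Hypothetical function: takes ownership, modifies the value,
/// and hands ownership back to the caller.
fn shout(mut s: String) -> String {
    s.push('!');
    s
}

fn main() {
    let s1 = String::from("Rust Unwrapped");

    // s1 is moved into `shout`; the returned String is owned by s2.
    let s2 = shout(s1);
    // println!("{s1}"); // would not compile: borrow of moved value: `s1`
    println!("{s2}");

    // s2 is moved into `consume` and dropped inside it.
    let len = consume(s2);
    println!("length was {len}");
}
```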

Now for the gotcha - this behaviour of ownership is 100% true, however… depending on the type, move semantics might be superseded by a different behaviour: copy.

Copy is a behaviour mutually exclusive to move. It is a marker trait (Copy) implemented by most primitive types, and your own types can opt in too, provided every field is itself Copy.

https://doc.rust-lang.org/std/index.html#primitives

For these types (floats, ints, bool, char, shared references such as &str, plus arrays and tuples whose elements are all Copy, and a couple of other more niche types), when you bind a value to a new variable or pass it as a function argument, the value in memory is actually copied (duplicated), so the original symbol retains ownership of its copy and the new symbol gets its own copy to own. Note that str itself is unsized and therefore not Copy; only the reference &str is.
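A short sketch of copy behaviour follows. The Point struct and the functions double and shift_right are hypothetical names for this example; Point derives Copy, which is allowed because all of its fields are Copy. In each case the original binding stays usable after being assigned or passed by value.

```rust
/// Hypothetical user-defined type. Every field is Copy,
/// so the struct is allowed to derive Copy as well.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Takes its argument by value; for a Copy type the caller keeps its own copy.
fn double(n: i32) -> i32 {
    n * 2
}

/// Modifies its own copy of the point; the caller's copy is untouched.
fn shift_right(mut p: Point) -> Point {
    p.x += 1;
    p
}

fn main() {
    let a = 21;
    let b = double(a); // `a` is copied, not moved
    println!("a = {a}, b = {b}");

    let pair = (1, 2.5);
    let pair2 = pair; // tuple of Copy elements: copied
    println!("{:?} {:?}", pair, pair2);

    let p = Point { x: 0, y: 0 };
    let q = shift_right(p); // `p` is copied into the function
    println!("{:?} {:?}", p, q);
}
```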

From a certain perspective this behaviour is quite wasteful, because we are duplicating resources and potentially consuming more memory needlessly. The beauty of Copy types, however, is that they are (and must be) Sized: the size is known to the compiler and fixed (it cannot grow or shrink). They are typically quite small (scalar types), but not always: the list also includes compound types (those that combine multiple values into a single type), namely arrays and tuples, and those have the potential to be large. Arbitrary copying of such values can be an expensive use of finite stack memory - the envy of the heap! The solution (details are out of scope for this article) is that, if we so choose, we can opt out of the default copy behaviour by “boxing” the value with the Box<T> type. This stores the value on the heap and, because Box<T> is not Copy, effectively demotes it back to move semantics.

Attack of the Clones - When confronted with the borrow checker’s move error, the compiler can unfortunately lead the developer down an undesirable path of bad habits. Did you notice that the error message states that the type does not implement the Copy trait, and suggests that you “consider cloning the value”? This is tempting and seems a quick fix - our saviour from compiler errors seems to be to .clone() the value.
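As a sketch of the boxing idea, the example below contrasts a plain array, which is copied on assignment, with the same array inside a Box, which is moved. The constant N and the functions sum and move_keeps_heap_address are hypothetical names chosen for illustration.

```rust
/// Hypothetical buffer size for this example.
const N: usize = 4096;

/// Takes ownership of a boxed array and adds up its bytes.
fn sum(data: Box<[u8; N]>) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Moves a boxed array to a new binding and checks that the heap
/// data stayed at the same address (only the pointer was moved).
fn move_keeps_heap_address() -> bool {
    let boxed: Box<[u8; N]> = Box::new([1u8; N]);
    let before: *const [u8; N] = &*boxed;
    let moved = boxed; // move: `boxed` can no longer be used
    let after: *const [u8; N] = &*moved;
    before == after
}

fn main() {
    // A plain array of Copy elements is itself Copy:
    // this assignment duplicates all N bytes.
    let on_stack = [1u8; N];
    let copy = on_stack;
    println!("both usable: {} {}", on_stack[0], copy[0]);

    // A boxed array follows move semantics instead.
    println!("address unchanged by move: {}", move_keeps_heap_address());
    println!("sum: {}", sum(Box::new([1u8; N])));
}
```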

Note the caveat though - if the performance cost is acceptable.

So what's the hit?

Calling .clone() on a String manually forces a deep copy of the heap value: a new block of heap memory is allocated, the contents are copied into it, and a new handle on the stack points to it.

Because this situation crops up so often, you can pretty quickly end up packing out your code with endless .clone() calls, with smug developers happy to be on the compiler's good side - blissfully ignoring the cost of the caveat.

So let’s surface some alternatives which do not cause memory bloat and code shrapnel, and which allow our Rust code to live up to its famed lean, high-performance glory…
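The cost of a clone can be made visible by comparing heap addresses. In this minimal sketch, clone_is_separate_allocation is a hypothetical helper; it uses the standard as_ptr method to check that a cloned String has equal contents but lives in its own heap buffer.

```rust
/// Hypothetical helper: clones a non-empty String and reports whether the
/// clone has equal contents but its own, separate heap buffer.
fn clone_is_separate_allocation(original: &String) -> bool {
    let copy = original.clone(); // allocates and copies every byte
    copy == *original && copy.as_ptr() != original.as_ptr()
}

fn main() {
    let s1 = String::from("Rust Unwrapped");
    let s2 = s1.clone();

    // Same text, two different heap addresses.
    println!("s1 buffer: {:p}", s1.as_ptr());
    println!("s2 buffer: {:p}", s2.as_ptr());
    println!("separate allocation: {}", clone_is_separate_allocation(&s1));
}
```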

References - If ownership means singular responsibility for defining a lifetime, borrowing (referencing) is a truly beautiful complement. References are like views (for those from a database background) - looking in, but with no responsibility.

I like to picture a reference as being like watching a reality TV show - think Osbournes or Kardashians - from the comfort of our own living rooms millions of us at the same time can get a glimpse into the original, actual lives of celebs - share in their laughs, cry when they cry - as if we were actually there (kinda)…

In Rust a shared reference is a read-only view of the value, and like the millions of viewers there are no limits to the number of concurrent shared references (reads) on the resource. The beauty also is that those holding a reference have zero responsibilities to clean up or deallocate memory - because we are not the owner. A value can have unlimited shared references, but only one owner.

Instead of cloning our string, we will initialise s2 as a reference to s1. References are defined with the ampersand & symbol.

Shared References

#[allow(unused)]
fn main() {
    let s1: String = String::from("Rust Unwrapped");
    let s2 = &s1; // s2 is a shared reference to s1
    let s3 = &s1;
    println!("Strings s1 and s3: {}, {}", s1, s3);
    // now we will print the address that s2 and s3 point to
    // (the String value s1 itself, which in turn owns the heap buffer);
    // this shows they are both pointing to the same address:
    println!("Memory pointer location of s2 and s3: {:p}, {:p}", s2, s3);
}


To prove that the two shared references s2 and s3 both point to the same memory location we can use the :p formatter in our println! macro.

See that they both reference the same hex location.

Zero space waste, no cloning, no dupes. Lean.
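Borrowing works the same way across function calls. In the sketch below, word_count and same_target are hypothetical helpers: the first reads a string through a shared reference instead of taking ownership or cloning, and the second uses std::ptr::eq to confirm that two references point at the same value.

```rust
/// Hypothetical helper: reads the text through a shared reference.
/// The caller keeps ownership, and nothing is cloned.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Hypothetical helper: true if both references point at the same String.
fn same_target(a: &String, b: &String) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let s1 = String::from("Rust Unwrapped");

    // Lend s1 to the function; s1 is still usable afterwards.
    let n = word_count(&s1);
    println!("\"{s1}\" has {n} words");

    // Any number of shared references may exist at the same time.
    let s2 = &s1;
    let s3 = &s1;
    println!("same target: {}", same_target(s2, s3));
}
```

Taking &str rather than &String is a common choice for read-only parameters, since a &String converts to &str automatically at the call site.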

Ownership and Borrowing Demystified