By shadowing a variable with `let`, it's possible to perform a few transformations on a value and still have the variable be immutable once those transformations are complete.
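A minimal sketch of this shadowing pattern. The helper name `normalize` is an assumption made for illustration and is not part of the notes:

```rust
// Hypothetical helper: trims and lowercases its input using shadowing.
fn normalize(input: &str) -> String {
    let value = input; // &str
    let value = value.trim(); // shadowed: still &str, surrounding whitespace removed
    let value = value.to_lowercase(); // shadowed again: now an owned String
    // `value` was never declared `mut`, so `value.push('!')` would not compile here.
    value
}

fn main() {
    println!("{}", normalize("  Hello  "));
}
```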
An array isn't as flexible as the `vector` type, though. A vector is a similar collection type provided by the standard library that *is allowed to grow or shrink in size*.
`&String` can be used in place of a `&str` (string slice) through *deref coercion*. The `&String` gets converted to a `&str` that borrows the entire string.
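As a rough illustration, a function that takes `&str` accepts both string literals and borrowed `String` values. The helper `first_word` is a hypothetical name used only for this sketch:

```rust
// Hypothetical helper that accepts any string slice.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world");
    // `&owned` has type `&String`; it is converted to `&str` at the call site.
    println!("{}", first_word(&owned));
    // A string literal is already a `&str`.
    println!("{}", first_word("goodbye world"));
}
```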
Some languages have garbage collection that constantly looks for no longer used memory as the program runs; in other languages, the programmer must explicitly allocate and free the memory.
Rust uses a third approach: memory is managed through a system of ownership with a set of rules that the compiler checks at compile time.
Adding data is called *pushing* onto the stack, and removing data is called *popping* off the stack.
All data stored on the stack must have a known, fixed size. Data with an unknown size at compile time or a size that might change must be stored on the heap instead.
The heap is less organized: when you put data on the heap, you request a certain amount of space. The memory allocator finds an empty spot in the heap that is big enough, marks it as being in use, and returns a **pointer**, which is the address of that location. This process is called *allocating on the heap* and is sometimes abbreviated as just *allocating*.
Pushing to the stack is faster than allocating on the heap because the allocator never has to search for a place to store new data; that location is always at the top of the stack.
Comparatively, allocating space on the heap requires more work, because the allocator must first find a big enough space to hold the data and then perform bookkeeping to prepare for the next allocation.
Accessing data in the heap is slower than accessing data on the stack because you have to follow a pointer to get there. Contemporary processors are faster if they jump around less in memory.
Keeping track of what parts of code are using what data on the heap, minimizing the amount of duplicate data on the heap, and cleaning up unused data on the heap so you don't run out of space are all problems that ownership addresses.
Copies happen implicitly, for example as part of an assignment `y = x`. The behavior of `Copy` is not overloadable; it is always a simple bit-wise copy.
The semantics for passing a value to a function are similar to those for assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does.
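```rs
fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s); // s's value moves into the function and so is no longer valid here
}
// Assumed definition, not shown in the notes: takes ownership and drops the value on return.
fn takes_ownership(_some_string: String) {}
```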
A reference to a trait type is called a **trait object**. Like any other reference, a trait object points to some value, has a lifetime, and can be either `mut` or shared.
What makes a trait object different is that Rust usually doesn’t know the type of the referent at compile time.
So a trait object includes a little extra information about the referent’s type.
In memory, a trait object is a fat pointer consisting of a pointer to the value, plus a pointer to a table representing that value’s type.
> **Note**: Rust automatically converts ordinary references into trait objects when needed
```rs
let trait_object: &mut dyn Trait = &mut source;
let trait_object: Box<dyn Trait> = Box::new(source); // same for Rc<T>, Arc<T>, ...
```
A generic type parameter can only be substituted with *one concrete type* at a time, whereas trait objects allow for *multiple* concrete types to fill in for the trait object at runtime.
If homogeneous collections are needed, using generics and trait bounds is preferable because the definitions will be monomorphized at compile time to use the concrete types.
The code that results from *monomorphization* uses **static dispatch**, which is when the compiler knows what method will be called at compile time.
This is opposed to **dynamic dispatch**, which is when the compiler can't tell at compile time which method will be called. At runtime Rust uses the pointers inside the trait object to know which method to call.
There is a runtime cost when this lookup happens that doesn't occur with static dispatch.
Dynamic dispatch also prevents the compiler from choosing to inline a method's code, which in turn prevents some optimizations.
It's only possible to make *object-safe* traits into trait objects. A trait is object safe if all the methods defined in the trait have the following properties:
- the return type isn't `Self`
- there are no generic type parameters
A *match expression* is made up of *arms*. An arm consists of a *pattern* and the code that should be run if the value given to the beginning of the match expression fits that arm's pattern. Rust takes the value given to match and looks through each arm's pattern in turn.
> **Note**: the arms of a `match` must be exhaustive, otherwise the code does not compile.
| Pattern type | Example | Notes |
|---|---|---|
| Variable | `count` | Match any value and move or copy it into the variable |
| Bind with subpattern | `variable @ <pattern>` | Match the pattern and move or copy the value into the variable |
| Enum | `Some(value)` | |
| Tuple | `(key, value)` | |
| Array | `[first, second, third]` | |
| Slice | `[first, .., last]` | |
| Struct | `Point { x, y, .. }` | |
| Reference | `&value` | |
| Multiple patterns | `'a' \| 'A'` | `match`, `if let`, `while let` only |
| Guard expression | `<pattern> if <condition>` | `match` only |
> **Note**: `..` in slices matches *any number* of elements. `..` in structs *ignores* all remaining fields
```rs
// unpack a struct into local variables
let Struct { local_1, local_2, local_3, .. } = source;
// ...unpack a function argument that's a tuple
fn distance_to((x, y): (f64, f64)) -> f64 { }
// iterate over keys and values of a HashMap
for (key, value) in &hash_map { }
// automatically dereference an argument to a closure
// (handy because sometimes other code passes you a reference
// when you'd rather have a copy)
let sum = numbers.fold(0, |a, &num| a + num);
```
Patterns that always match are special in Rust. They’re called **irrefutable patterns**, and they’re the only patterns allowed in `let`, in function arguments, after `for`, and in closure arguments.
A **refutable pattern** is one that might not match, like `Ok(x)`. Refutable patterns can be used in `match` arms, because match is designed for them: if one pattern fails to match, it’s clear what happens next.
Refutable patterns are also allowed in `if let` and `while let` expressions:
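A small sketch of both forms. The helper names `drain_sum` and `parse_or_zero` are assumptions made for illustration:

```rust
// Hypothetical helper: drains a stack with `while let` and sums the values.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    // `Some(top)` is refutable: the loop ends when `pop` returns `None`.
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

// Hypothetical helper: uses `if let` to handle only the `Ok` case.
fn parse_or_zero(text: &str) -> i32 {
    if let Ok(n) = text.parse::<i32>() {
        n
    } else {
        0
    }
}

fn main() {
    println!("{}", drain_sum(vec![1, 2, 3]));
    println!("{}", parse_or_zero("42"));
    println!("{}", parse_or_zero("not a number"));
}
```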
The `Option` type is used in many places because it encodes the very common scenario in which a value could be something or it could be nothing. Expressing this concept in terms of the type system means the compiler can check whether you've handled all the cases you should be handling; this functionality can prevent bugs that are extremely common in other programming languages.
`Result<T, E>` is the type used for returning and propagating errors. It is an enum with the variants, `Ok(T)`, representing success and containing a value, and `Err(E)`, representing error and containing an error value.
Ending an expression with `?` will result in the unwrapped success (`Ok`) value, unless the result is `Err`, in which case `Err` is returned early from the enclosing function.
`?` can only be used in functions whose return type is compatible with the value it is applied to (`Result` for a `Result`, `Option` for an `Option`), because of the early return that it performs.
> **Note**: When `None` is used the type of `Option<T>` must be specified, because the compiler can't infer the type that the `Some` variant will hold by looking only at a `None` value.
> **Note**: error values that have the `?` operator called on them go through the `from` function, defined in the `From` trait in the standard library, which is used to convert errors from one type into another
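A minimal sketch of that conversion, assuming a hypothetical error type `ConfigError` and function `parse_port` that do not appear in the notes:

```rust
use std::num::ParseIntError;

// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
enum ConfigError {
    BadNumber(String),
}

// `?` calls `From::from` on the error value, so this impl is what allows a
// `ParseIntError` to be returned from a function whose error type is `ConfigError`.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::BadNumber(err.to_string())
    }
}

fn parse_port(text: &str) -> Result<u16, ConfigError> {
    let port = text.trim().parse::<u16>()?; // returns `Err` early on failure
    Ok(port)
}

fn main() {
    println!("{:?}", parse_port("8080"));
    println!("{:?}", parse_port("eighty"));
}
```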
When working with multiple error types, it is useful to return a "generic error" type. All the standard library error types can be represented by `Box<dyn std::error::Error + Send + Sync + 'static>`.
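A short sketch of a function that can fail with two unrelated standard library errors. The alias `GenericError` and the function `sum_of_two` are hypothetical names:

```rust
use std::error::Error;

// Alias for a boxed error type that any standard library error converts into.
type GenericError = Box<dyn Error + Send + Sync + 'static>;

// Hypothetical function that can fail with two different error types.
fn sum_of_two(a: &str, b: &str) -> Result<f64, GenericError> {
    let x: i32 = a.parse()?; // a ParseIntError is boxed by `?`
    let y: f64 = b.parse()?; // a ParseFloatError is boxed by `?`
    Ok(f64::from(x) + y)
}

fn main() {
    match sum_of_two("2", "0.5") {
        Ok(total) => println!("total = {}", total),
        Err(err) => println!("error: {}", err),
    }
}
```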
Vectors store more than one value in a single data structure that puts all the values next to each other in memory. Vectors can only store values of the *same type*.
Like any other struct, a vector is freed when it goes out of scope. When the vector gets dropped, all of its contents are also dropped.
```rs
let v: Vec<Type> = Vec::<Type>::new(); // empty vec init
```
Rust's closures are anonymous functions that can be saved in a variable or passed as arguments to other functions.
Unlike functions, closures can capture values from the scope in which they're defined.
Closures are usually short and relevant only within a narrow context rather than in any arbitrary scenario.
Within these limited contexts, the compiler is reliably able to infer the types of the parameters and the return type, similar to how it's able to infer the types of most variables.
The first time a closure is called with an argument, the compiler infers the type of the parameter and the return type of the closure.
Those types are then locked into the closure, and the compiler reports a type error if a different type is used with the same closure.
```rs
// closure definition
let closure = |param1, param2| <expr>;
let closure = |param1, param2| {/* multiple lines of code */};
let closure = |num: i32| <expr>; // parameter types can be annotated explicitly
// closure usage
let result = closure(arg1, arg2);
```
### Storing Closures Using Generic Parameters and the Fn Traits
To make a struct that holds a closure, the type of the closure must be specified, because a struct definition needs to know the types of each of its fields.
Each closure instance has its own unique anonymous type: that is, even if two closures have the same signature, their types are still considered different.
To define structs, enums, or function parameters that use closures, generics and trait bounds are used.
The `Fn` traits are provided by the standard library. All closures implement at least one of the traits: `Fn`, `FnMut`, or `FnOnce`.
```rs
struct ClosureStruct<T> where T: Fn(u32) -> u32 {
    closure: T,
}
```
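One possible use of such a struct is caching the result of an expensive closure. The sketch below assumes a hypothetical `Cacher` type that stores the first computed value:

```rust
// Hypothetical wrapper that stores a closure and caches its first result.
struct Cacher<T>
where
    T: Fn(u32) -> u32,
{
    closure: T,
    value: Option<u32>,
}

impl<T> Cacher<T>
where
    T: Fn(u32) -> u32,
{
    fn new(closure: T) -> Self {
        Cacher { closure, value: None }
    }

    // Runs the closure on the first call and reuses the result afterwards.
    fn value(&mut self, arg: u32) -> u32 {
        match self.value {
            Some(v) => v,
            None => {
                let v = (self.closure)(arg);
                self.value = Some(v);
                v
            }
        }
    }
}

fn main() {
    let mut doubled = Cacher::new(|x| x * 2);
    println!("{}", doubled.value(21));
}
```

Note that this simple version ignores the argument on later calls; a more complete cache would key the stored values by argument.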
### Capturing the Environment with Closures
When a closure captures a value from its environment, it uses memory to store the values for use in the closure body.
Because functions are never allowed to capture their environment, defining and using functions will never incur this overhead.
Closures can capture values from their environment in three ways, which directly map to the three ways a function can take a parameter:
- taking ownership
- borrowing mutably
- and borrowing immutably.
These are encoded in the three `Fn` traits as follows:
- `FnOnce` consumes the variables it captures from its enclosing scope. The closure takes ownership of these variables and moves them into the closure when it is defined.
- `FnMut` can change the environment because it mutably borrows values.
- `Fn` borrows values from the environment immutably.
When a closure is created, Rust infers which trait to use based on how the closure uses the values from the environment.
All closures implement `FnOnce` because they can all be called at least once. Closures that don't move the captured variables also implement `FnMut`, and closures that don't need mutable access to the captured variables also implement `Fn`.
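The sketch below shows how the way a closure uses its captures decides which traits it implements. The three helper functions are hypothetical and exist only to require a specific trait bound:

```rust
// Each helper accepts only closures implementing the named trait.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // `Fn` closures can be called repeatedly through a shared reference
}

fn call_fn_mut<F: FnMut()>(mut f: F) {
    f();
    f(); // `FnMut` closures can be called repeatedly but need `mut` access
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // an `FnOnce` closure is consumed by the call
}

fn main() {
    let name = String::from("ferris");

    // Only reads `name`: implements `Fn` (and therefore `FnMut` and `FnOnce`).
    let total_len = call_fn(|| name.len());

    // Mutates `count`: implements `FnMut` (and `FnOnce`) but not `Fn`.
    let mut count = 0;
    call_fn_mut(|| count += 1);

    // Moves `name` out of the closure body: implements only `FnOnce`.
    let moved = call_fn_once(move || name);

    println!("{} {} {}", total_len, count, moved);
}
```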
To force the closure to take ownership of the values it uses in the environment, use the `move` keyword before the parameter list.
This technique is mostly useful when passing a closure to a new thread to move the data so it's owned by the new thread.
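A minimal sketch of the thread case, using a hypothetical helper `sum_in_thread`:

```rust
use std::thread;

// Hypothetical helper: sums a vector on a separate thread.
fn sum_in_thread(data: Vec<i32>) -> i32 {
    // `move` transfers ownership of `data` into the closure, so the new
    // thread does not borrow from this function's stack frame.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3, 4]));
}
```

Without `move`, the closure would only borrow `data`, and `thread::spawn` would reject it because the borrow might outlive the calling function.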
The *iterator pattern* makes it possible to perform some task on a sequence of items in turn. An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished.
In Rust, iterators are *lazy*, meaning they have no effect until a method that consumes the iterator is called.
```rs
// iterator trait
pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    // methods with default implementations elided
}
```
Calling the `next` method on an iterator changes internal state that the iterator uses to keep track of where it is in the sequence.
In other words, calling `next` consumes, or uses up, the iterator: each call eats up one item.
Methods that call `next` are called *consuming adaptors*, because calling them uses up the iterator.
Other methods defined on the `Iterator` trait, known as *iterator adaptors*, change iterators into different kinds of iterators.
It's possible to chain multiple calls to iterator adaptors to perform complex actions in a readable way.
But because all iterators are lazy, a call to one of the consuming adaptor methods is needed to get the results.
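The following sketch implements `Iterator` for a hypothetical `Counter` type and then chains adaptors on it. The names `Counter` and `sum_of_even_squares` are assumptions made for illustration:

```rust
// Hypothetical iterator that yields 1, 2, ..., limit.
struct Counter {
    current: u32,
    limit: u32,
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.current < self.limit {
            self.current += 1;
            Some(self.current)
        } else {
            None // the sequence has finished
        }
    }
}

// `filter` and `map` are lazy iterator adaptors; `sum` is the consuming
// adaptor that drives the chain by calling `next`.
fn sum_of_even_squares(limit: u32) -> u32 {
    Counter { current: 0, limit }
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .sum()
}

fn main() {
    println!("{}", sum_of_even_squares(5));
}
```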
A **pointer** is a general concept for a variable that contains an address in memory. This address refers to, or "points at" some other data.
The most common kind of pointer in Rust is a *reference*. References are indicated by the `&` symbol and borrow the value they point to.
They don't have any special capabilities other than referring to data. Also, they don't have any overhead and are the kind of pointer used most often.
**Smart pointers**, on the other hand, are *data structures* that not only act like a pointer but also have additional metadata and capabilities.
The different smart pointers defined in the standard library provide functionality beyond that provided by references.
In Rust, which uses the concept of ownership and borrowing, an additional difference between references and smart pointers is that references are pointers that only borrow data;
in contrast, in many cases, smart pointers *own* the data they point to.
Smart pointers are usually implemented using structs. The characteristic distinguishing a smart pointer from a struct is that smart pointers implement the `Deref` and `Drop` traits.
The `Deref` trait allows an instance of the smart pointer struct to behave like a reference so it's possible to write code that works with either references or smart pointers.
The `Drop` trait makes it possible to customize the code that is run when an instance of the smart pointer goes out of scope.
The most common smart pointers in the standard library are:
- `Box<T>`: for allocating values on the heap
- `Rc<T>`: a reference counting type that enables multiple ownership
- `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`: a type that enforces the borrowing rules at runtime instead of compile time
### Using `Box<T>` to Point to Data on the Heap
The most straightforward smart pointer is `Box<T>`. Boxes store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data.
Boxes don't have performance overhead, other than storing their data on the heap instead of on the stack. But they don't have many extra capabilities either.
`Box<T>` use cases:
- Using a type whose size can't be known at compile time in a context that requires an exact size
- Transferring ownership of a large amount of data but ensuring the data won't be copied when you do so
- Owning a value when only the trait it implements matters, rather than its specific type
```rs
let _box = Box::new(pointed_value);
```
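The first use case can be sketched with a recursive type. The `List` enum and `sum` function below are illustrative assumptions, not part of the notes:

```rust
// A cons list: without `Box`, `List` would contain itself directly and
// the compiler could not compute a finite size for it.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Walks the list and adds up its elements.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );
    println!("{}", sum(&list));
}
```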
### `Deref` Trait & Deref Coercion
Implementing the `Deref` trait makes it possible to customize the behavior of the dereference operator, `*`.
By implementing `Deref` appropriately, a smart pointer can be treated like a regular reference.
```rs
use std::ops::Deref;

struct CustomSmartPointer<T>(T);

impl<T> CustomSmartPointer<T> {
    fn new(x: T) -> Self {
        CustomSmartPointer(x)
    }
}

impl<T> Deref for CustomSmartPointer<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0 // return a reference to the wrapped value
    }
}

let s = CustomSmartPointer::new(value);
let v = *s; // compiles as written only if the wrapped type is `Copy`
// same as
let v = *(s.deref());
```
*Deref coercion* is a convenience that Rust performs on arguments to functions and methods.
It works only on types that implement the `Deref` trait and converts such a type into a reference to another type.
Deref coercion was added to Rust so that programmers writing function and method calls don't need to add as many explicit references and dereferences with `&` and `*`.
When the `Deref` trait is defined for the types involved, Rust will analyze the types and use `Deref::deref` as many times as necessary to get a reference to match the parameter's type.
Just as the `Deref` trait overrides the `*` operator on *immutable references*, the `DerefMut` trait overrides the `*` operator on *mutable references*.
Rust does *deref coercion* when it finds types and trait implementations in three cases:
- From `&T` to `&U` when `T: Deref<Target=U>`
- From `&mut T` to `&mut U` when `T: DerefMut<Target=U>`
- From `&mut T` to `&U` when `T: Deref<Target=U>`
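The three cases above can be sketched with standard library types only. The helpers `greet` and `shout` are hypothetical names:

```rust
fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn shout(text: &mut str) {
    text.make_ascii_uppercase();
}

fn main() {
    let boxed: Box<String> = Box::new(String::from("Rust"));
    // &Box<String> -> &String -> &str: two `Deref::deref` calls are inserted.
    println!("{}", greet(&boxed));

    let mut owned = String::from("quiet");
    // &mut String -> &mut str via `DerefMut`.
    shout(&mut owned);
    // &mut String -> &str: a mutable reference can coerce to an immutable one.
    println!("{}", greet(&mut owned));
}
```

The reverse direction, from `&T` to `&mut U`, is never performed, because it could break the borrowing rules.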
### `Drop` Trait
`Drop` makes it possible to customize what happens when a value is about to go out of scope. It's possible to provide an implementation for the `Drop` trait on any type.
```rs
struct CustomSmartPointer<T>(T);

impl<T> Drop for CustomSmartPointer<T> {
    fn drop(&mut self) {
        // clean up memory
    }
}

fn main() {
    let var1 = CustomSmartPointer(value); // dropped when var1 goes out of scope
    let var2 = CustomSmartPointer(value);
    drop(var2); // dropped early by using std::mem::drop
}
```
Rust automatically calls `drop` when an instance goes out of scope. Variables are dropped in the reverse order of their creation.
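To make the drop order observable, the sketch below records each drop in a shared log. The `Noisy` type and `drop_order` function are assumptions made for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the three values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Noisy { name: "first", log: Rc::clone(&log) };
        let _second = Noisy { name: "second", log: Rc::clone(&log) };
        let early = Noisy { name: "early", log: Rc::clone(&log) };
        drop(early); // explicit early drop with `std::mem::drop`
    } // `_second` is dropped here, then `_first`
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```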
Rust provides the *reference-counted* pointer types `Rc<T>` and `Arc<T>`.
The `Rc<T>` and `Arc<T>` types are very similar; the only difference between them is that an `Arc<T>` is safe to share between
threads directly (the name Arc is short for *atomic* reference count) whereas a plain `Rc<T>` uses faster non-thread-safe code to update its reference count.
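A short sketch contrasting the two types. The function names are hypothetical:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Single-threaded sharing: cloning an `Rc` only bumps the reference count.
fn rc_owner_count() -> usize {
    let shared = Rc::new(String::from("shared"));
    let _second_owner = Rc::clone(&shared);
    Rc::strong_count(&shared)
}

// Cross-thread sharing needs `Arc`; an `Rc` here would be rejected at
// compile time because `Rc<T>` is not `Send`.
fn arc_sum_across_threads() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", rc_owner_count());
    println!("{}", arc_sum_across_threads());
}
```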
*Interior mutability* is a design pattern in Rust that makes it possible to mutate data even when there are immutable references to that data;
normally, this action is disallowed by the borrowing rules.
To mutate data, the pattern uses unsafe code inside a data structure to bend Rust's usual rules that govern mutation and borrowing.
With references and `Box<T>`, the borrowing rules' invariants are enforced at compile time. With `RefCell<T>`, these invariants are enforced at runtime.
With references, breaking these rules produces a compile-time error. With `RefCell<T>`, the program panics and exits.
The advantages of checking the borrowing rules at compile time are that errors will be caught sooner in the development process, and there is no impact on runtime performance because all the analysis is completed beforehand.
For those reasons, checking the borrowing rules at compile time is the best choice in the majority of cases, which is why this is Rust's default.
The advantage of checking the borrowing rules at runtime instead is that certain memory-safe scenarios are then allowed, whereas they are disallowed by the compile-time checks.
Static analysis, like the Rust compiler, is inherently conservative.
When creating immutable and mutable references, the `&` and `&mut` syntax is used, respectively.
With `RefCell<T>`, the `borrow` and `borrow_mut` methods are used, which are part of the safe API that belongs to `RefCell<T>`.
The `borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut` returns the smart pointer type `RefMut<T>`.
Both types implement `Deref`, so they can be treated like regular references.
The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart pointers are currently active.
Every time `borrow` is called, the `RefCell<T>` increases its count of how many immutable borrows are active.
When a `Ref<T>` value goes out of scope, the count of immutable borrows goes down by one.
Just like the compile-time borrowing rules, `RefCell<T>` permits many immutable borrows or one mutable borrow at any point in time.
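The runtime check can be observed without panicking by using the non-panicking `try_borrow` and `try_borrow_mut` methods. The function `borrow_checks` is a hypothetical name for this sketch:

```rust
use std::cell::RefCell;

// Returns three flags: whether two immutable borrows coexist, whether a
// mutable borrow succeeds while they are alive, and whether it succeeds
// after they are gone.
fn borrow_checks() -> (bool, bool, bool) {
    let cell = RefCell::new(vec![1, 2, 3]);

    let first = cell.borrow();
    let second = cell.try_borrow(); // many immutable borrows may coexist
    let shared_ok = second.is_ok();

    // `borrow_mut` would panic here; `try_borrow_mut` reports the conflict
    // as an `Err` instead.
    let mut_while_shared = cell.try_borrow_mut().is_ok();

    drop(second);
    drop(first); // both `Ref` guards are gone, so no borrows remain active

    let mut_after = cell.try_borrow_mut().is_ok();
    (shared_ok, mut_while_shared, mut_after)
}

fn main() {
    println!("{:?}", borrow_checks());
}
```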
A common way to use `RefCell<T>` is in combination with `Rc<T>`. `Rc<T>` permits multiple owners of some data, but it only gives immutable access to that data.
By having an `Rc<T>` that holds a `RefCell<T>`, it's possible to get a value that can have multiple owners and that can be mutated.
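A minimal sketch of this combination, using a hypothetical `shared_counter` function:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared counter with two owners.
fn shared_counter() -> i32 {
    let counter = Rc::new(RefCell::new(0));
    let other_owner = Rc::clone(&counter);

    *counter.borrow_mut() += 1; // mutate through the first owner
    *other_owner.borrow_mut() += 10; // mutate through the second owner

    let total = *counter.borrow();
    total
}

fn main() {
    println!("{}", shared_counter());
}
```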
The standard library has other types that provide interior mutability:
- `Cell<T>`, which is similar to `RefCell<T>` except that instead of giving references to the inner value, the value is copied in and out of the `Cell<T>`.
- `Mutex<T>`, which offers interior mutability that's safe to use across threads.
### Reference Cycles Can Leak Memory
Rust's memory safety guarantees make it difficult, but not impossible, to accidentally create memory that is never cleaned up (known as a memory leak).
Memory leaks are possible when using `Rc<T>` and `RefCell<T>`: it's possible to create references where items refer to each other in a cycle.
This creates memory leaks because the reference count of each item in the cycle will never reach 0, and the values will never be dropped.
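The sketch below builds such a cycle on purpose and inspects the reference counts. The `Node` type and `make_cycle` function are assumptions made for illustration, and the example intentionally leaks its two nodes:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical node type that can point at another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Builds two nodes that point at each other and returns their strong counts.
fn make_cycle() -> (usize, usize) {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a

    // Each node is owned by its local variable and by the other node.
    (Rc::strong_count(&a), Rc::strong_count(&b))
    // When `a` and `b` go out of scope, each count drops to 1, not 0,
    // so neither `Node` is ever freed.
}

fn main() {
    println!("{:?}", make_cycle());
}
```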
The `extern` keyword is used in two places in Rust:
- in conjunction with the `crate` keyword to make Rust code aware of other Rust crates in the project
- in foreign function interfaces (FFI).
`extern` is used in two different contexts within FFI. The first is in the form of external blocks, which declare the interfaces of foreign functions that Rust code can call.
```rs
#[link(name = "my_c_library")]
extern "C" {
    fn my_c_function(x: i32) -> bool;
}
```
This code would attempt to link with `libmy_c_library.so` on unix-like systems and `my_c_library.dll` on Windows at runtime, and panic if it can't find something to link to.
Rust code could then use `my_c_function` as if it were any other unsafe Rust function.
Working with non-Rust languages and FFI is inherently unsafe, so wrappers are usually built around C APIs.
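A minimal sketch of such a wrapper, assuming the C standard library's `abs` is linked by default (true on most common targets) and using a hypothetical wrapper name `safe_abs`:

```rust
use std::os::raw::c_int;

// Declaration of `abs` from the C standard library. The `unsafe extern`
// spelling needs Rust 1.82 or newer; older toolchains use a plain
// `extern "C"` block.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

// Safe wrapper: callers never need to write `unsafe` themselves.
fn safe_abs(input: c_int) -> c_int {
    // `abs(INT_MIN)` is undefined behaviour in C, so saturate instead
    // (a choice made for this sketch).
    if input == c_int::MIN {
        return c_int::MAX;
    }
    // SAFETY: `abs` is defined for every other `int` value.
    unsafe { abs(input) }
}

fn main() {
    println!("{}", safe_abs(-3));
}
```

The wrapper keeps the `unsafe` block in one place and documents the precondition the foreign function relies on.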
The mirror use case of FFI is also done via the extern keyword:
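A minimal sketch of such an export, assuming a hypothetical function name `callable_from_c`:

```rust
// Exported with the C ABI and an unmangled symbol name so C code can
// declare it as `bool callable_from_c(int32_t x);`.
// Toolchains older than Rust 1.82 spell the attribute `#[no_mangle]`.
#[unsafe(no_mangle)]
pub extern "C" fn callable_from_c(x: i32) -> bool {
    x % 3 == 0
}

fn main() {
    // From Rust it is still an ordinary function.
    println!("{}", callable_from_c(9));
}
```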
If compiled as a dylib, the resulting `.so` could then be linked to from a C library, and the function could be used as if it was from any other library