
Caching Expensive Functions in Rust

Using the cached crate to speed up expensive function calls

Memo-what?

The term memoization can sound intimidating at first. But it boils down to a very simple concept: only do hard things once.

Imagine you have a function that takes a while to complete. This might be because it requires many CPU cycles or because it needs to fetch something from the network.

use log::{info, LevelFilter};
use simple_logging;

fn main() {
    simple_logging::log_to_stderr(LevelFilter::Info);

    for _ in 1..=5 {
        let res = slow_loris();
        info!("Result: {:?}", res)
    }
}

// A small helper macro to sleep for n seconds
macro_rules! sleep {
    ($n:expr) => {
        std::thread::sleep(std::time::Duration::from_secs($n as u64))
    };
}

fn slow_loris() -> i32 {
    sleep!(3);

    42
}

With 5 calls, each lasting 3 seconds, this takes 15 seconds to fetch all the results:

[00:00:03.005] (1f6388c00) INFO   Result: 42
[00:00:06.010] (1f6388c00) INFO   Result: 42
[00:00:09.014] (1f6388c00) INFO   Result: 42
[00:00:12.015] (1f6388c00) INFO   Result: 42
[00:00:15.018] (1f6388c00) INFO   Result: 42

But we know that the function will always return the same result! We don’t want to have to wait for it to calculate the same answer again. How can we speed this up?

Well, we could save the result, right? Wrap the function in some kind of if check and create an Optional variable that we populate the first time through. Then, when we enter the function, we can check the variable and, if it has a value, just skip the function entirely and return the cached value.

Congratulations! You now understand function memoization.
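That hand-rolled idea can be sketched in a few lines using the standard library's OnceLock, which acts as exactly the kind of write-once "Optional" slot described above (one possible approach; not how the cached crate is implemented internally):

```rust
use std::sync::OnceLock;
use std::thread::sleep;
use std::time::Duration;

fn slow_loris() -> i32 {
    // A process-wide, thread-safe slot for the result.
    static CACHE: OnceLock<i32> = OnceLock::new();

    // Only the first call runs the closure; every later call
    // skips the work and returns the stored value.
    *CACHE.get_or_init(|| {
        sleep(Duration::from_secs(3));
        42
    })
}
```

OnceLock::get_or_init does the if check, the locking, and the write for us. Doing this by hand for every slow function gets tedious, though, which is where the crate below comes in.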

The cached crate

Let’s install the cached crate and add a macro to our slow_loris function:

use cached::proc_macro::once;

// ...

#[once]
fn slow_loris() -> i32 {
    sleep!(3);

    42
}

And that’s all it takes to bring our 15-second runtime down to 3 seconds:

[00:00:03.005] (1f6388c00) INFO   Result: 42
[00:00:03.005] (1f6388c00) INFO   Result: 42
[00:00:03.005] (1f6388c00) INFO   Result: 42
[00:00:03.005] (1f6388c00) INFO   Result: 42
[00:00:03.005] (1f6388c00) INFO   Result: 42

The first time the function is called, it will be run and the result will be cached. After that, the cached result will always be returned.

Handling arguments

But often, your function will have one or more arguments which affect the result. Memoization can still be useful if the function is called frequently with repeated arguments. Enter the #[cached] macro. By default, this macro generates a cache key from the values of all function arguments. So let’s tweak our sample program slightly and pass an argument.

fn main() {
    simple_logging::log_to_stderr(LevelFilter::Info);

    for _name in ["a", "b", "c"] {
        for i in 1..=5 {
            let result = process(i);
            info!("Result: {:?}", result)
        }
    }
}

use cached::proc_macro::cached;

#[cached]
fn process(i: i32) -> i32 {
    sleep!(i);

    15 * i
}

Now, cached assumes that the argument will affect the return value and inserts each result in the cache under the value of the argument.

[00:00:01.003] (1f6388c00) INFO   Result: 15
[00:00:03.005] (1f6388c00) INFO   Result: 30
[00:00:06.010] (1f6388c00) INFO   Result: 45
[00:00:10.011] (1f6388c00) INFO   Result: 60
[00:00:15.014] (1f6388c00) INFO   Result: 75
[00:00:15.014] (1f6388c00) INFO   Result: 15  <--- cache kicks in
[00:00:15.014] (1f6388c00) INFO   Result: 30
[00:00:15.014] (1f6388c00) INFO   Result: 45
[00:00:15.014] (1f6388c00) INFO   Result: 60
[00:00:15.014] (1f6388c00) INFO   Result: 75
[00:00:15.015] (1f6388c00) INFO   Result: 15
[00:00:15.015] (1f6388c00) INFO   Result: 30
[00:00:15.015] (1f6388c00) INFO   Result: 45
[00:00:15.015] (1f6388c00) INFO   Result: 60
[00:00:15.015] (1f6388c00) INFO   Result: 75
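Conceptually, the macro is maintaining a map from argument values to results. A simplified, standard-library-only sketch of what #[cached] is doing for us (the real macro generates a more polished version of this):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

fn process(i: i32) -> i32 {
    // One shared map per function, keyed on the argument value.
    static CACHE: Mutex<Option<HashMap<i32, i32>>> = Mutex::new(None);

    let mut guard = CACHE.lock().unwrap();
    let cache = guard.get_or_insert_with(HashMap::new);

    // Cache hit: skip the expensive work entirely.
    if let Some(&cached) = cache.get(&i) {
        return cached;
    }

    // Cache miss: do the work, then remember the answer.
    let result = 15 * i;
    cache.insert(i, result);
    result
}
```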

And what if we want to key results on some of the function arguments but not others? We can do that too, using the macro’s key and convert attributes. Now the function will only be run if it’s called with a value for i that it has never seen before:

fn main() {
    simple_logging::log_to_stderr(LevelFilter::Info);

    for name in ["a", "b", "c"] {
        for i in 1..=5 {
            let result = process(i, name);
            info!("Result: {:?}", result)
        }
    }
}

#[cached(key = "i32", convert = "{i}")]
fn process(i: i32, name: &str) -> i32 {
    println!("Processing: {} '{}'", i, name);
    sleep!(i);

    15 * i
}

Working with functions that return Result

By default, cached saves the return value of your function exactly as-is. But if your function returns a Result, you probably only want to cache successes. The result = true attribute does this: return values are checked before they’re added to the cache, and Err results are discarded. That way, if a given set of arguments produces an Err, the function will be run again the next time it’s called.

Other ways to use cached

All of the above caches are in-memory, but you can also have cached store its cache on disk or in a Redis server. And the cache is by default unbounded, meaning it will store anything you give it until the program exits—or runs out of memory, falls over, and then exits! To set a size limit on the cache, use a SizedCache like so:

use cached::SizedCache;

#[cached(
    ty = "SizedCache<(i32, i32), i32>",
    create = "{ SizedCache::with_size(10) }"
)]
fn multiply(i: i32, j: i32) -> i32 {
    println!("Processing: {}*{}", i, j);
    sleep!(j);

    i * j
}

Handling multiple threads or tasks

If you use #[cached] on a function and then call it from multiple threads or tasks, the results will still be correct, but you can end up doing the expensive work more than once: two callers that arrive with the same uncached argument will each run the function, and both will then write their result into the cache. To make concurrent callers with the same key wait for the first call to finish and reuse its cached result, use the sync_writes attribute:

#[cached(
    key = "i32",
    convert = "{i}",
    sync_writes = true
)]
fn process(i: i32, name: &str) -> i32 {
    println!("Processing: {} '{}'", i, name);
    sleep!(i);

    15 * i
}

When not to use memoization

Note that this sort of caching can itself hurt performance if the function is called with many different inputs (generating many cache misses), since every call then pays for a cache lookup and insert on top of the work itself. The locking needed for sync_writes could also be slower than rolling your own synchronization if the actual critical region is smaller than the whole function.

In the end, performance is an empirical art. Humans aren’t very good judges of what a computer will be able to do quickly, so it’s important to use benchmarking to figure out how changes are affecting your code’s performance. Here is a good introduction to the criterion benchmarking library.

The good news is that cached makes memoizing a function very quick. If your benchmarks tell you it didn’t help, or actually slowed things down, you won’t have wasted very long setting it up!