Going further

Let’s finish fulfilling the API requirements. Yesterday’s post was fairly lengthy, being about discovering things, so today’s will be shorter.

There are just more code blocks, so don’t worry about the scroll bar looking worse.

Picking up where we left off

Yesterday, we finished the /fib API route; today, let’s implement the other route, /inv. The front-end is fairly simple, so we can essentially copy-paste the code.

#[get("/inv/{n}")]
async fn get_rev(web::Path(n): web::Path<usize>, app_data: web::Data<AppState>) -> String {
    // TODO: do not unwrap directly
    let mut cache = app_data.cache.lock().unwrap();
    // TODO: u64 doesn't work, figure out why
    format!("{}", cache.reverse(n))
}

Oh, and let’s not forget to register the service in main

// ...
HttpServer::new(move || {
    App::new()
        .app_data(data.clone())
        .service(status)
        .service(get_fib)
        .service(get_rev)
// ...
})

Alright, there’s no point in trying to call this route, as SimpleCache::reverse is currently a stub, which would panic. (You might be curious, like I was, as to how the server would react to this, but I will come to that later.)

TODO or not to… done

Let’s briefly touch on that // TODO: u64 doesn't work, figure out why": Actix expects something serializable as output, and it doesn’t serialize numbers, but it does serialize strings. Since HTTP is a text-based protocol, the string conversion would have to be performed eventually, using a String is fine. (Though, allocating data on the heap instead of just passing a number seems wasteful imo, but this is probably not going to be our biggest performance drag.)

Now, without further ado…

Implementing reverse()

I’d like to point out that I went back to the original specification, in French, of the API; and I found the wording to be actually somewhat confusing… my interpretation seems coherent, though, so I’ll keep rolling with it. (In a true agile context, I’d have asked the client for confirmation, but I’m confident about my interpretation, and this would be a fairly lightweight change, so I’m not going to bother them.)

Let’s first discuss how to implement it logically… we want to find the closest Fibonacci number to a given integer. All integers must be either a Fibonacci number, or sandwiched between two. Different cases make for more complex code, so let’s try to uniformize.

We can treat the first case as being sandwiched between itself and the next Fibonacci number. (We could instead consider itself and the previous Fibonacci number, but that isn’t always guaranteed to exist.) We can express the “sandwiching” as fib(k) < n < fib(k+1); accounting for the “is a Fibonacci” case amounts to turning it into fib(k) ≤ n < fib(k+1).

Once we found the k satisfying this, we compute the distances, dk = n - fib(k) and dk+1 = fib(k+1) - n; distances normally use abs(), but the inequality guarantees that these are positive. And finally, if dk < dk+1, we return k; if dk > dk+1, we return k+1.

…What about dk == dk+1, though? Well, can it even happen? Yeah: fib(4) = 3, fib(5) = 5, so 4 is exactly in the middle; more generally, there will be such a number when fib(k+1) - fib(k) is even, so when fib(k-1) is even, which is true of every third term. (This is quite easy to prove, try it at home!)

The specification explicitly mentions returning “the integer such that […]”, so we may not return both numbers. It may be argued that the specification is wrong, as the answer is not always unique. It might also be poor wording from the client; again, this would warrant contacting them again, but this is not a real setting so I’m taking a few shortcuts.

We’ll arbitrarily pick fib(k) over fib(k+1) in this case. The function and front-end could easily be extended to return both if the client were to really want that behavior. Plus, both behaviors could be provided via alternative routes (e.g. /rev/{n} and /rev/{n}/all), so it’s no big deal.

Writing the code

As we described, there are two parts: finding k, and then picking between k and k+1. Let’s write a skeleton!

fn reverse(&mut self, n: u64) -> usize {
    // First, find k such that cache[k] ≤ n < cache[k + 1]
    let k = 0;
    todo!();

    let (ldelta, rdelta) = (n - self.0[k], self.0[k + 1] - n);
    if ldelta < rdelta {
        k
    } else {
        k + 1
    }
}

Well, I outright wrote the second part, since it was that simple. rustc warns that code after todo!() is unreachable, but that’s to be expected. Let’s get to the meat!

We could ask for self.fib(k) in a loop, but we’re not barbaric, we can do better. The cache’s Vec is sorted, so we can search for k using dichotomy! …assuming the end of the cache is greater than n.

Okay, let’s assume it is, and we’ll deal with the other case later.

// First, find k such that cache[k-1] ≤ n < cache[k]
// Does the cache contain such a k?
let k = if self.0.last().unwrap() < &n {
    todo!()
} else {
    // Yes: look for the two closest to `n` using dichotomy
    // Invariants: cache[a] ≤ n; n < cache[b]
    let (mut a, mut b) = (0, self.0.len() - 1);
    // Continue until we narrowed down to a single range
    while a != b - 1 {
        let mid = (a + b) / 2;
        if self.0[mid] <= n {
            a = mid;
        } else {
            b = mid;
        }
    }
    a
};

Oh yeah, it’s fine to unwrap() here, because we know the Vec can’t be empty. On to the other case!

If all cached entries are lower than n, we need to find the one that’s higher… But due to how Fibonacci numbers are computed, we’ll have to proceed sequentially. At least we know directly how to proceed…

let k = if self.0.last().unwrap() < &n {
    // No: populate the cache until `n ≤ cache[k+1]`
    let mut k = self.0.len() - 1;
    loop {
        let sum = self.0[k - 1] + self.0[k];
        self.0.push(sum);
        k += 1;
        if n < sum {
            break;
        }
    }
    k - 1
} else {
	// ...

But, hm, wait. If we do see n (as self.0.last()), why bother computing an extra number? Can we change the invariant to instead be cache[k] < n ≤ cache[k+1]? Yes, we can!

    fn reverse(&mut self, n: u64) -> usize {
        // First, find k such that cache[k-1] < n ≤ cache[k]
        // Does the cache contain such a k?
        let k = if self.0.last().unwrap() < &n {
            // No: populate the cache until `n ≤ cache[k]`
            let mut k = self.0.len() - 1;
            loop {
                let sum = self.0[k - 1] + self.0[k];
                self.0.push(sum);
                k += 1;
                if n <= sum {
                    break;
                }
            }
            k
        } else {
            // Yes: look for the two closest to `n` using dichotomy
            // Invariants: cache[a] < n; n ≤ cache[b]
            let (mut a, mut b) = (0, self.0.len() - 1);
            // Continue until we narrowed down to a single range
            while a != b - 1 {
                let mid = (a + b) / 2;
                if self.0[mid] < n {
                    a = mid;
                } else {
                    b = mid;
                }
            }
            b
        };

        let (ldelta, rdelta) = (n - self.0[k - 1], self.0[k] - n);
        // If ldelta == rdelta, arbitrarily return either
        if ldelta < rdelta {
            k - 1
        } else {
            k
        }
    }
}

This is slightly clever, by ensuring that k is never 0: the upper code path begins with k being at least 1, so it must return at least 2; the lower code path cannot lower b below 1. Overall, this makes the code slightly lazier, without changing much of the logic. Small win, but small effort, too.

Demo time!

Right, this seems sensible. That was more complex logic, so I’m not sure I can trust it… Sounds like time to write a test!

// Check that `reverse` returns a suitable result for the first ~150 params
#[test]
fn rev_first_150() {
	let mut cache = SimpleCache::new();

    for n in 0..150 {
        // Get `reverse`'s answer
        let k = cache.reverse(n);

        // Let's not have assertions mutate the cache being tested
        let mut ref_cache = SimpleCache::new();
        // Get the corresponding Fibonacci number
        let closest = ref_cache.fib(k);
        // `n` must be between two Fibonacci numbers, so compute the distance between it and each of them
        let (delta, j, other_delta) = if n < closest {
            (closest - n, k - 1, n - ref_cache.fib(k - 1))
        } else {
            (n - closest, k + 1, ref_cache.fib(k + 1) - n)
        };
        // It's entirely possible that the deltas are identical!
        // e.g. 4 is exactly in the middle of fib(4) = 3, and fib(5) = 5, so both 4 and 5 would be valid
        assert!(
            delta <= other_delta,
            "reverse({}) returned {} (delta {}), but fib({}) = {} only has a delta of {}",
            n,
            k,
            delta,
            j,
            ref_cache.fib(j),
            other_delta
        );
    }
}

If I can’t be sure that my results are correct, I can just ask a computer to do it for me! :D Note that the test checks exactly that the result fulfills the criterion specified in the API. Note also that the cache used to compute fibs is different from the cache being tested: since fib has side-effects, I don’t want them to accidentally interfere with the reverse that I’m testing. In fact, it would interfere, by preventing the “upper bound not in cache” path from ever running! (Except maybe once.)

But, now that I’m thinking about it, one reverse could effect the later ones… I should test the same, but creating a new, pristine SimpleCache on every iteration. Since the tests would be almost identical, instead of copy-pasting the code, let’s instead factor out the common part.

// Check that `reverse` returns a suitable result for the first ~150 params
fn rev_first_150_common<F: FnMut(u64) -> usize>(mut f: F) {
    for n in 0..150 {
        // Get `reverse`'s answer
        let k = f(n);

        // ... the rest is the same
    }
}

#[test]
fn rev_first_150() {
    let mut cache = SimpleCache::new();
    rev_first_150_common(|n| cache.reverse(n));
}

#[test]
fn rev_first_150_independent() {
    rev_first_150_common(|n| SimpleCache::new().reverse(n));
}

Eyy, better. Oh, and while I’m at it, I should do the same to the fib test, though I’ll omit the code for *ahem* brevity. And since this is getting fairly messy, and I don’t need those function when not testing, I should move all of this to a dedicated module, the idiomatic solution:

#[cfg(test)]
mod tests {
    use super::*;

    // Check the first few in the sequence, which should be a good indication that the algorithm is good
    fn first_13_common<F: FnMut(usize) -> u64>(f: F) {
        let expected = vec![0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144];
        let first_13: Vec<u64> = (0..expected.len()).map(f).collect();
        assert_eq!(first_13, expected);
    }

    // Etc etc
}

(The #[cfg(test)] line means to only compile the module when in the test configuration.)

Let’s see if everything works out:

Great!

Cleanup

Let’s clean up the front-end code a bit: I want to encapsulate the AppState a little bit better. We’ll put it in a separate module, so that the “main” code won’t be able to poke directly at its internals. We’ll also abstract all the mutex locking to get the cache, in a single get_cache getter.

mod app_state {
    use fibrs_lib::caches::SimpleCache;
    use fibrs_lib::Cache;
    use std::sync::{Mutex, MutexGuard};

    #[derive(Debug)]
    pub struct AppState {
        cache: Mutex<SimpleCache>,
    }

    impl AppState {
        pub fn new() -> Self {
            Self {
                cache: Mutex::new(SimpleCache::new()),
            }
        }
        pub fn get_cache(&self) -> MutexGuard<SimpleCache> {
            // TODO: do not unwrap directly
            self.cache.lock().unwrap()
        }
    }
}
use app_state::AppState;

#[get("/fib/{n}")]
async fn get_fib(web::Path(n): web::Path<usize>, app_data: web::Data<AppState>) -> String {
    let k = app_data.get_cache().fib(n);
    format!("{}", k)
}

// Etc.

Tubular!

Panics?

Let’s reveal a dark secret I’ve been keeping all along. Our API is vulnerable to a serious DoS attack.

What!? This can’t be… what’s worse, asking for anything now returns an error!

Let’s watch the server’s output, requesting /fib/100 then /fib/50.

‼️ Now, isn’t that interesting… On that note, we can see that a panic handling one request does not take the whole server down! I believe that’s Actix catching panics in the worker threads, but I may be wrong.

That said, we do have two problems here.

Overflow

Rust has an interesting stance towards integer overflow. Consider that integer overflow can both be a feature, and a pain in the neck… writing explicit overflow checks correctly is tricky (if (x + 1 < x), if (x + 1 > INT_MAX) and a few others don’t work), and thinking of overflow everywhere is tricky as well.

So, in Rust, an over/underflow causes a panic. Yep, you heard that right! But, you can explicitly opt into overflow, either case-by-case, or for a variable. That said, since panic checks are fairly slow, they are disabled in release builds, so addition is still native-fast in production.

Anyway, the problem is that fib(100) is just too big for a u64 to contain, and that’s causing a panic in debugging builds. In production, we’d just get a wrong value…

Mutex poisoning

Okay, we understand why /fib/100 crashes, but why do all subsequent requests panic as well? The error message actually tells the first half of the story:

thread 'actix-rt:worker:1' panicked at 'called `Result::unwrap()` on an `Err` value: "PoisonError { inner: .. }"', src/main.rs:24:31

And the relevant code:

pub fn get_cache(&self) -> MutexGuard<SimpleCache> {
    // TODO: do not unwrap directly
    self.cache.lock().unwrap() // <-- This is where the panic occurs
}

Aha, self.cache.lock() is failing. But why? The method’s documentation claims that:

If another user of this mutex panicked while holding the mutex, then this call will return an error once the mutex is acquired.

The top of the page tells the second half of the story. To avoid broken invariants due to panics, the mutex then becomes “poisoned”, preventing anyone from locking it, unless explicitly acknowledging that the invariants may have been broken.

We’d need to recover from this state somehow.

We’ll deal with both of these problems… next time!

↑