1. Why invalidations?

Our system currently avoids the need for invalidations by making the
assumption that all still-valid results become invalid immediately
after they are received from the database. Such results are still
useful, because running transactions in the past allows them to be
used until they become too stale for the freshness limit. However,
this limits the scalability of the system.

Note the distinction between requiring the programmer to provide
explicit invalidations -- which is what memcached does and we'd like
to avoid for usability reasons -- and using invalidations at all. We'd
like to have invalidations automatically generated by the database as
things change.

The need for invalidations can be seen from some of our benchmark
results:
  - cache effectiveness is ultimately limited by the freshness
    requirement rather than the cache size (after about 64MB)
  - we see cache hit rates in the 25-75% range depending on freshness
    requirement (don't have the actual numbers handy). Users of
    memcached strive for hit rates well into the 90s; 99% is not
    uncommon.
  - we see little to no measurable performance benefit on workloads
    where the database does not fit in memory.

The first observation reflects the fact that cached data becomes
useless after the freshness requirement (e.g. 30 seconds)
elapses. Thus, only so much useful data can be stored in the cache:
the data generated by the application in the last 30 seconds is not
that large.

Thus, for read-only workloads with large data sets, it's not
surprising that memcached does better. With invalidations, a cache
entry is valid for however long it's up to date (plus the freshness
requirement, in our system). This will often be much longer.

Competition between our cache and the database's buffer cache explains
why we don't see a benefit for disk-bound workloads. If we assume that
the workload is I/O-bound, then only reducing the number of I/Os will
improve performance. Our cache only stores data for ~30 seconds, so
it's likely that a database with a reasonably-sized buffer cache will
have the tuples in cache. We do save some computation costs --
especially for application-level objects derived from multiple queries
-- but if database I/O is the bottleneck that doesn't help much.


2. Implementing invalidations: the cache server side

Regardless of whether our implementations are happening explicitly or
being generated automatically, the server side will look roughly the
same.

Recall our existing cache interface:
  - put(key, value, validity interval)
     stores a mapping in the cache with a given validity interval
     (derived from the database's results)
  - get(key, timestamp) -> value
     returns the value for the key in the cache, valid at the given
     timestamp
  The latter is actually get(key, interval) -> value, interval
  which returns any value valid during the given interval; lazy
  timestamp selection requires this, but it's OK to think about the
  simpler version above that just takes a timestamp.

We'll extend this slightly. When inserting into the cache, the
inserter needs to specify whether the item is still valid (as far as
it knows), and a set of *invalidation tags*:
   put(key, value, validity interval, stillValid, invalTags)

There's also a new invalidation operation:
   invalidate(timestamp, invalTags)
This provides the latest timestamp and the set of invalidation tags
such that any entry that is no longer valid has at least one of these
tags. Invalidate operations need to arrive in timestamp-order. in
practice this means that they need to come from/through the database
and be multicast to all nodes; more on this later.

For queries, the cache treats any item flagged as still-valid as
having its upper validity bound equal to the timestamp of the last
invalidation received. When it receives an invalidation request for a
particular timestamp and set of invalidation tags, it sets the upper
bound of any entry matching one of the invalidation tags, clearing its
still-valid flag, and also extends the effective bound of any
still-valid transaction to that timestamp.

For a more visual explanation, see Figure 2 of our paper, but imagine
that some of the intervals extend out to the dotted 'now' line. These
represent items that are still valid. When an invalidation is
received, the dotted line moves right, and the upper bounds of any
items not invalidated move with it.  (I realize that describing a
figure through email does not exactly qualify as a "more visual
explanation"!)

Invalidation tags can be hierarchical in order to support multiple
granularities. This probably won't make much sense until we discuss
where these tags come from, and some examples of their use. But the
rule is that invalidating a tag invalidates any entry tagged with that
tag, any of its ancestors, or any of its descendants.


3. Explicit invalidations

Suppose we want to implement explicit invalidations in the style of
memcached. (This isn't great, in that our usability argument centers
on no requiring them, but it's no worse than existing systems.)

We have a bit of a naming challenge. With memcached, you'd say
lookup(key), so it's natural to say invalidate(key) with the same
key. But we are using a memoizing interface and having our library
generate the keys itself, so what do we put in as the argument to
invalidate(...)?

A solution is to associate an invalidation tag with the value produced
by a function call, say function:arguments, e.g. 'getItem:id1234'. Then,
we can pass the corresponding argument to an invalidation call. Not
too dissimilar from how you'd do it in memcached.

We can take advantage of the fact that invalidation tags are
hierarchical here. If we invalidate 'getItem:id1234', that'll
invalidate one specific set of arguments, but we can also just
invalidate 'getItem', which would invalidate the call with *any*
arguments -- useful for global changes to the system state, or ones
too complicated to figure out the dependencies for. It's also possible
to tag a cached value with 'getItem', which means it'll be invalidated
by any invalidation starting with 'getItem', though it's not
especially clear to me how this would be useful in this context.

How do we send the invalidations to the cache nodes? I think the only
workable way is to send it through the database. This seems strange,
because the database isn't involved in generating the invalidations,
but it's important to get their ordering right. The invalidation needs
to be delivered with the correct timestamp, which is determined by the
database at commit time, and invalidations also need to be delivered
in-order (or at least can't be processed out of order, so later ones
would have to be delayed). Given that not every commit would trigger
an invalidation, the only other way I've found to do things correctly
is a sort of two-phase commit where database commits are blocked until
the client confirms that it's sent the invalidations to the cache
nodes. This is obviously a horrible idea.

One final note is that, although this approach shares with memcached
the need for the programmer to identify places to insert
invalidations, we have an important benefit. We can always fall back
on the current no-invalidation approach (assuming the data is invalid
immediately) for certain cached values. This might be useful if it's
too hard for the programmer to figure out all the places it needs to
be invalidated, or if we know it changes so frequently that
invalidating isn't going to give much benefit.


4. Semi-automatic invalidations

I expect to have the cache server-side part of invalidations
implemented shortly. I also have a plan for automatically generating
fine-grained invalidations based on revocable read locks (I will write
another note about this soon), but it requires significant work on the
database side. Though we've made exciting progress on it, I'm
skeptical about whether we'll have something usable by OSDI.

So, is there a middle ground? Something that would preserve the
performance benefits of invalidations while being easier for a
programmer to use than explicit invalidations, and practical to
implement in the next two weeks?

One obvious possibility would be to do invalidations at a table
level. This is pretty easily doable. We need to know what tables a
query looks at, which is not hard to get (we could even get away with
asking the programmer to specify this, I think, not that we couldn't
figure out ourselves). And we need to know what tables an update
transaction modifies, which is also pretty obvious.

Table-level invalidations aren't sufficiently fine-grained to satisfy
me as a real solution, but it's worth noting that we would get some
benefit from it. The RUBiS benchmark has some tables that *never*
change, some of them quite large, and these would effectively be
cacheable forever.

A refinement on this would be to use two levels of granularity: tables
and primary-keys. This is a hackish solution based largely on the
observation that a lot of applications, including our benchmark, spend
a lot of time looking up objects by their key. If we can identify such
queries (possibly by programmer annotation), we can tag them with a
table:key invalidation tag. Then, when inserting or modifying a single
row, we can simply generate an invalidation for the associated
key. For more complex queries, or more complex modifications that
touch many rows, we can fall back on table-level invalidations. I have
some ideas about how to throw this together quickly, and I think it's
a viable option for performance/usability.


In my next note, I'll describe the more elegant scheme I came up with
for automatically generating fine-grained invalidations. I'll also
describe some related work I'm doing on making Postgres provide
serializability, and we'll take a brief foray into the criminal
justice system of Wisconsin (for reasons that will become clear later).