caching-vs-replication [drkp 20091215]

The purpose of this note is to try to straighten out my thinking about
the fundamental differences between database-level replication (like
the related database work) and application-level caching (our system)
and why they must lead to very different designs.


0. Summary

Database-level systems replicate tuples, which have the following
properties:
  - the set of objects replicated is finite
  - the set of objects changed at any time is finite and relatively
    small, and the changes easy to identify and apply to the replicas

Our system deals with application-level objects, meaning:
  - the set of objects can be infinite or at least very large
  - the set of objects changed at any time can also be very large,
    possibly infinite; identifying these objects isn't
    straightforward, and applying these changes is most likely
    infeasible.

...so it is necessary to use a true cache of recently-computed
objects, rather than maintain full replicas of the object space.
We need to use some new mechanisms to deal with this.

1. Database replication

As mentioned above, the objects being replicated are the set of
database tuples, which is finite, so it's certainly possible for the
system to maintain a complete copy of all of them. (Indeed, the
database is exactly a copy itself.) Thus, a replication approach is
possible.

Moreover, finding and applying the set of updates to these objects and
applying them on the replicas is easy. We can simply re-execute any
update queries that the database performed. (If we want to be a bit
more clever, we can do writeset extraction to find block-level
changes, and do batching for write-absorption).

This reflects itself in the solutions we see in the database
replication systems. Straightforward database replication, with no
staleness, is a viable option for read-dominated workloads. Relaxing
freshness, as in FAS and Ganymed, reduces the cost of applying the
updates to the slave replicas.


2. Application-level objects

Application-level objects are fundamentally different, necessitating a
caching approach rather than replication. 

The space of possible objects is large. In our system, which deals
with memoized functions, there are an infinite number of arguments
that can be passed to a cacheable function, and thus an infinite
number of cacheable values. In a different model, they might not be
infinite, but still there can be many more application objects than
database rows. So the cache can't be complete: it can't contain every
possible application object.

Moreover, we can't use the same strategy to keep the cache up to
date. Identifying the set of objects changed by an operation requires
invalidations (explicit or automatic). Computing the values of these
changed objects is infeasible, since there can be many of them and
they can require significant computation each. This is in contrast to
database replication, where a small set of updated rows can simply be
sent to each replica.

So our application-level cache really needs to be passively caching
recently-computed values rather than actively maintaining a complete
snapshot of the world, i.e. it is truly a cache rather than a
replication system.


3. Caching

What are the implications of having an incomplete cache of
recently-used objects vs. a complete replica of all objects?

First, we need to be able to allow a transaction to use cached objects
that were computed at different times: if a database is being
continuously updated, it's unlikely that more than a few objects will
be computed on the same snapshot. This means we need to start thinking
in terms of objects that are valid at multiple times, i.e. validity
intervals. Thus, we introduce a mechanism for computing validity
intervals on objects (recognizing that they are valid at more than a
single timestamp), and a scheme for ensuring consistency by choosing
objects with acceptable validity.

Second, the availability of cached objects depends on what timestamp
is chosen for the transaction, since objects are valid over different
ranges of time. This isn't the case in database replication systems:
it doesn't matter what timestamp they pick, because they have full
snapshots. We deal with this problem using our lazy timestamp
selection algorithm.