Invalidation schemes
* Implicit invalidation - All intervals are conservatively bounded and
  we use extensions.  This results in everything in our cache
  effectively expiring after the freshness interval has passed, which
  severely limits its scalability.
* Explicit invalidation - Intervals may or may not be conservatively
  bounded.  For those that are not, the programmer must explicitly
  bound every key affected by a read/write transaction.
** With updates - If a cache entry is convenient to compute from
   information available during an update, actively put that entry
   into the cache.  Because this is eager, it may waste work, but when
   the cache entry is used, this saves a trip to the database.  This
   makes the cache much more like a very limited distributed database
   where applications fall back on the central database if necessary.
* Automatic invalidation
** Database selected - Every RO query includes the set of keys that
   will be based on the results and thus that should be shot down when
   something in the database changes.  If the result of the query is
   unbounded, the database records these keys, possibly on a table
   level, possibly on a primary key level, or possibly some other way.
   When an update occurs on some object (table, primary key, etc), the
   database shoots down all keys registered with that object.  The
   disadvantage of this is that that key set may grow large and thus
   the shoot downs may require a lot of communication.
*** There may be ways to optimize the shoot down communication.  For
    example, if a key set involved in a query is too large, the
    database could fall back on returning conservatively bounded
    intervals.  We could also use escalation - if many keys in a key
    set come from the same key class, we could substitute them with
    the class and broadcast class invalidations.
** Cache selected - With every RO query that returns an unbounded
   interval, the database also returns a set of abstract identifiers
   representing database objects that went into that query (tables,
   primary keys, etc).  A cache entry includes the union of all of
   these identifiers.  Whenever something changes in the database, it
   broadcasts the identifiers of all affected objects and the cache
   nodes find all of the entries that should be shot down.
*** This works well for tables, since the identifier sets will always
    be small, but what if we're using more fine-grained objects?  For
    example, if we're using both tables and primary keys, an update to
    many table rows needs to broadcast all of the primary keys
    affected.  We might be able to use hierarchical names, so that a
    shoot down of many primary keys can be escalated to the
    containing table and cache entries based on the primary keys will
    be invalidated if the table is invalidated.

Explicit invalidation protocol
* The database still returns truncated intervals, but can mark them as
  unbounded.  Each frame records whether or not its accumulated
  interval is unbounded.  When performing an insertion into the cache,
  if the function is marked as explicit and still has an unbounded
  interval, insert it as unbounded.  Otherwise, insert it as bounded.
  This allows non-explicit functions to call explicit functions
  because the truncated interval is still being tracked and will be
  used for the non-explicit function.
** We have to be careful about the race between queries and
   invalidations.  For example, a RO transaction could get a query
   result that is unbounded, then a RW transaction could invalidate
   it, and then the RO transaction could put its cached result.
* RW transactions PRECOMMIT, which assigns them an xstamp but does not
  release locks, then synchronously (but in parallel) post
  invalidations at that pin number, then they COMMIT, then they tell
  the pin cushion to move the dotted line to their xstamp
* The pin cushion keeps track of the dotted line, which is the bound
  on the highest known usable pin.  When a pin set is requested, it
  only returns pins below this.
** If COMMIT's are forced to happen in the same order as PRECOMMIT's,
   then it's safe to simply move the dotted line up whenever a client
   reports a higher value.  All invalidations for that pin will have
   completed before any other transaction can observe the
   transaction's changes.  This is not true if COMMIT ordering is not
   forced.
* The pin cushion should also return the value of the dotted line
  itself so that unbounded intervals returned from the cache can be
  bounded, possibly in the same manner as unbounded intervals returned
  by the database.
* The PRECOMMIT part of this protocol is really bad.

Explicit invalidation protocol, take 2
* The PRECOMMIT thing is a pain.  What if you're conservative and send
  out invalidations for some pin number known to be <= the xstamp that
  the RW transaction will actually receive?
* RW transactions, before committing, retrieve the latest xstamp from
  the database, send out synchronous invalidations for this pin
  number, commit, then bump the pin cushion's dotted line (possibly to
  the preliminary xstamp they received?).
** The dotted line isn't quite right here.  Transaction A gets
   preliminary xstamp 10 and starts invalidating.  Transaction B gets
   preliminary xstamp 11, invalidates things, then moves the dotted
   line to 11.  Meanwhile, transaction A hasn't finished
   invalidating.
*** Actually, that's fine, because transaction A hasn't actually
    committed yet, which means it will actually get a higher xstamp
    than transaction B, so it's okay that it hasn't finished
    invalidating because its updates won't actually be visible at
    xstamp 11.
* Alternatively, the database could send an in-order stream of commit
  xstamps to the pin cushion and the pin cushion could simply move the
  dotted line according to these.

Experiments
* Clients versus req/sec graph for non-cached (stock or modified
  Postgres?) and for cached with, say, 30 second freshness
** For VLDB, it might make sense to include both stock and modified
   Postgres, even though they're basically the same line, because
   people will be more interested in the database side of things
* Corresponding latency graph
* Peak throughput with varying freshness.  We should probably get this
  by producing more clients vs. req/sec graphs, though we only need
  the region around peak throughput.  Only show the clients
  vs. req/sec for one sample (probably 30 seconds)
* Is there any way to say anything about the size of the cache?  Like
  the working set size versus the number of clients or the freshness?
* It would be nice to try this with a more read-oriented workload
* With and without microcaching, if that turns out to be interesting
* We can say a lot about specific latencies in the system, though
  probably not with a graph.  Maybe two tables showing the cache
  overhead (roWrap/[hM]) for a hit and a miss and the call latency
  (roCall/[hM]) for a hit and a miss (and maybe RW) at peak
  throughput, for the in-memory case and the not in-memory case.

Evaluation
* How did we modify RUBiS?
** Obviously, we ported it to Postgres.
** Obviously, we marked its functions cacheable.  We had to modify a
   handful of non-deterministic SQL queries to make these functions
   actually satisfy the requirements of cacheable functions.  For the
   most part, we cache essentially the entire produced HTML of pages,
   though we also perform fine-grained caching of authentication
   results and optionally of caching of common primary key lookups
   (such as for items and users)
** We disabled image loading (perhaps we should say that we cache
   images?) because images represented a large portion of the workload
   on the web server that never involved the database.  We should be
   clear that this was not because there was some large variety of
   images (web sites do have to deal with image traffic, though most
   probably punt it to Akamai), but merely because RUBiS was stupid
   (in nicer words).
* Compare between databases that fit in RAM versus those that don't.
  Drill down on the shift in underlying latencies that happens and
  perhaps run through the calculations that show that we're not
  actually reducing the DB workload by much when the latency of a hit
  is much than the latency of a miss.



Remaining questions
* Why exactly do we not help if the database server is disk-bound?
** Maybe we do and we just chose the number of threads wrong
** Drill down to exactly which relative latencies differ and how
*** Separate out query latencies by query (group queries that differ
    only by parameters)
* If 60 second freshness is not limited by the web server, how well
  does it do?
* Better pin policies?  Better cache hit policies?
