* Transactional Caching of Application Data using Recent Snapshots

** Caching of Application Data
*** Basic motivation: make database-driven websites faster
*** Solution: use a cache like memcached

*** What is an application data cache?
**** Caches application-level objects
***** Not: database cache, database replica, query cache
*** Interface
**** In memcached etc.'s world: provide a hash table for the app
**** In our world: caches function calls (memoization)
**** Some examples: queries, objects, generated pages, parts of pages

*** Why cache application data?
**** General theme: cache higher-level stuff closer to what the app really wants
**** Reduce DB load
***** DB load is the hardest bottleneck to remove
***** One cache hit can eliminate >1 database queries
**** Reduce application server load by caching computations
***** This matters too! [Rubis]
***** Easier to remove bottleneck but still expensive [Facebook]

** Transactional Caching
*** Define
**** Within a transaction block, accesses to both cache & DB see same view of DB
**** Not possible in memcached (others?)
*** Why transactions?
**** Avoid exposing anomalies to the user (=> correctness)
**** Avoid violating internal invariants (=> simpler code)

** Recent Snapshots
*** Can't have all 3: freshness, transactional consistency, non-blocking
**** Concurrent update between two reads in a xaction
**** And a blocking cache is a non-starter, so choose freshness or consistency
**** Most caches try for freshness
**** But we want consistency
*** Embracing staleness
**** Transactional consistency requires us to have some staleness 
***** e.g. snapshot isolation: snapshot current at transaction start
**** We go a bit further: r/o transactions can run on *previous* snapshots
***** Improves cache hit rate
***** Avoids blocking
***** Avoids invalidations
*** Is this safe?
**** Stale data is already everywhere
***** No guarantee results are current w/ concurrent updates
***** Especially if there's caching
***** Just need to make sure things are causally consistent
**** Avoiding anomalies
***** Ensure that users see their own data
***** More generally, can ensure causal consistency w/ Lamport clocks
***** Apps can specify fixed time bound on staleness
***** R/W transacitons bypass cache 

* Building a TxCache
** Architecture
*** Basic picture [from SOSP]
*** Validity intervals
*** Data flow

** Cache
*** it's a DHT
**** but versioned
**** and without complexity
*** interface:
**** put: key, value @ interval
**** get: key, (ts or interval) -> value, interval, miss

** Database
*** Need two main things:
**** Computing validity intervals for query results
**** Access to previous snapshots
***** Because of cache misses
**** Reuse existing mechanisms: < 1 KLOC

*** Background: concurrency control in Postgres
**** Multiversion: snapshot isolation 
**** No read locks, write locks as usual
**** Every tuple has birth, death timestamps
**** Queries are assigned timestamps, see only "live" tuples
**** Stale tuples are periodically "vacuumed" away once truly dead.

*** Exposing multiversion concurrency
**** Recall: need to run on old DB snapshot if cache miss
**** Snapshot isolation supports this internally (but no interface)
**** Save a transactions snapshot and restore it for a new transaction
***** PIN, UNPIN, BEGIN <snapshotid>
**** No performance impact: just keep old versions around slightly longer

*** Tracking pins -- the pincushion
**** Need to keep track of which snapshots are pinned
**** Separate from database just for simplicity/overhead
**** Interface
***** Insert new pin
***** Look up pins fresher than n seconds
**** Automatic expiration

*** Generating validity intervals
**** Goal: SELECT... -> VALID FROM 5 TO 8 [example throughout]
**** Tuple validity
***** Each tuple has birth/death timestamps -- looks like a validity interval
***** Intersect the returned tuples
****** Might need to combine when joining/aggregating/etc
***** Is this enough?
**** Invalidity mask
***** Tracks tuples *not* returned: "phantoms"
***** Add lifetime of any tuple discarded due to snapshot invisibility
***** Conservative measure:  might have found later they didn't match anyway

** TxCache library
*** Overview
**** Language-independent part
***** Accessing & updating cache; assigning timestamps; ensuring consistency
**** Language-dependent bindings (PHP)
***** Wrapping cacheable functions, assigning names, marshalling values, etc
**** All cache accesses made by lib 
**** All DB access go through lib (but just observing)

*** Basic flow (example here?)
**** beginRO(staleness) -> library assigns timestamp (magically for now)
**** app calls cacheable function -> check cache
***** if hit: return directly to app
***** if miss: invoke app's function implementation
****** start database transaction on correct snapshot, or pin new one
****** run queries on DB as needed, gathering validity intervals
****** marshal result and insert into cache w/ intersection of intervals

*** Lazy timestamp selection
**** How can we choose a timestamp upfront? We don't know what's in the cache!
***** Choose lazily based on cache availability
**** Two things to track:
***** Pin set: set of pinned snapshot timestamps at which transaction can
      be serialized
***** Validity interval: set of times at which cacheable function
      result is valid

**** Algorithm:
***** At xact begin:
****** Pin set = all sufficiently fresh snapshots [from pincushion] + *
***** At cacheable function call:
****** Check cache for any results whose validity intersects pinset
******* If hit: intersect pin set with result validity
******* If miss:
******** Start DB transaction at some snapshot in pin set
********* If *, reify: pin new snapshot and replace * with its ts
******** Set validity = [-inf, inf]
******** On each query, intersect pin set & fn validity w/ query validity


* Evaluation
** RUBiS
*** Auction site [screenshot?] w/ client emulator

*** Porting experience
**** Straightforward (except rubis sucks)
**** Cached nearly-complete HTML
**** Cached common function calls (authenticate, getItem, etc) 

** Configuration
*** 1 DB, 5 Apaches, 2 caches, 4 emulators
*** ~ 1 GB DB
*** obligatory system stats

** Basic evaluation
*** Classic congestion collapse (lock/scheduling overhead on DB)
*** Caching (30 sec freshness) improves peak by reducing DB load

** Statistics
*** Latencies: cache hit/miss, DB query, etc
*** Hit rate
*** Cache server utilization low

** Varying freshness
*** Peak throughput & hit rate vs freshness req
*** Determines amount of cacheable data

** Varying cache size
*** Only need about 64 MB


* Related work
** App-level caches
*** memcached
*** web caches (many)
*** No transactional semantics

** DB replication
*** Many examples w/ different consistency levels
*** Most closely related: Ganymed/FAS/RC-serializability
**** Same idea of using stale data
**** But very different -- full database replica vs cache that can miss
***** can't use for app-level replication

** Relaxing freshness
*** MVCC: snapshot isolation
**** anomalies for r/w transactions
*** Using older snapshots
*** RC-serializability
*** Causal ordering: Lamport clocks, ISIS, etc

* Future work
** Invalidations [maybe?]

* Conclusions
** Should have some of these