* performance testing
*** looking at livejournal and simulating a workload
*** using a web/database benchmark
*** comparisons:
***** with/without cache
***** replicated database
***** memcached?
* acausality could cause users to see stale data after contacting two servers
*** possible solution: lamport cookies

* steps to start a cache transaction:
1. Send a begin-cache-transaction message to the pin cushion w/ the
   freshness requirement
2. Pin cushion looks up a pin number based on the freshness requirement
3. Adds that to the priority queue of pin requirements
4. Returns the pin number
5. The interval from the pin number to infinity becomes the initial
   window of the cache transaction
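
Rough sketch of the steps above, as a toy in-memory pin cushion. All names (PinCushion, add_pin, begin_cache_transaction, initial_window) are made up for illustration. It assumes pin numbers increase over time, that the oldest pin still satisfying the freshness requirement is the one to return, and that the priority queue is a min-heap keyed by pin number. None of that is settled in these notes.

```python
import heapq
import math


class PinCushion:
    """Toy in-memory pin cushion; every name here is hypothetical."""

    def __init__(self):
        self.pins = {}          # pin number -> pin timestamp
        self.requirements = []  # min-heap of (pin number, transaction id)
        self.next_txn_id = 0

    def add_pin(self, pin_number, timestamp):
        # Record a snapshot pinned by a new read-only database transaction.
        self.pins[pin_number] = timestamp

    def begin_cache_transaction(self, freshness, now):
        # Step 2: look up a pin that satisfies the freshness requirement.
        fresh = [p for p, ts in self.pins.items() if ts >= now - freshness]
        if not fresh:
            return None  # nothing fresh enough; a new pin would be needed
        pin = min(fresh)  # assumption: oldest acceptable pin, for the widest window
        # Step 3: remember the requirement while the transaction runs.
        heapq.heappush(self.requirements, (pin, self.next_txn_id))
        self.next_txn_id += 1
        # Step 4: hand the pin number back to the caller.
        return pin


def initial_window(pin):
    # Step 5: the window starts at the pin and is unbounded above.
    return (pin, math.inf)
```

With pins 1, 2, 3 taken at times 100, 110, 120, a request at time 125 with a freshness of 15 would get pin 2 under these assumptions.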

* steps to 

* pin cushion keeps:
1. table of pin numbers, pin timestamps (tell the pin cushion when you
   do a new read-only transaction and pin something)
2. list of running transactions and priority queue of their pin
   requirements
      
* future work:
1. stampeding (lock using hints at one timestamp; everyone whose
   interval includes that timestamp waits.  Only one timestamp,
   because there is only one timestamp you know for sure will be
   covered by your result)
2. batching

* policy choices
*** Which entry to return from the cache
***** Latest
***** Maximal overlap
***** Classification - tag each pin with the "type" of transaction
      that created it and try to pick a pin with the same type as the
      requesting transaction
*** Which pin to use when going to the database
***** Latest
***** Oldest
***** Something from the middle
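
Rough sketch of the "Latest" and "Maximal overlap" choices for which cache entry to return. It assumes each cached entry carries a validity interval (lo, hi) and the transaction has a window of the same shape. The tuple representation and the function names are hypothetical, and "Latest" is read here as "largest upper bound", which is only one possible interpretation.

```python
import math


def overlap(a, b):
    """Length of the intersection of two intervals given as (lo, hi)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo)


def candidates(entries, window):
    # Only entries whose validity interval intersects the window are usable.
    return [e for e in entries if overlap(e, window) > 0]


def pick_latest(entries, window):
    # "Latest": the usable entry whose validity extends furthest forward.
    usable = candidates(entries, window)
    return max(usable, key=lambda e: e[1]) if usable else None


def pick_maximal_overlap(entries, window):
    # "Maximal overlap": the usable entry sharing the most of the window.
    usable = candidates(entries, window)
    return max(usable, key=lambda e: overlap(e, window)) if usable else None
```

The two policies can disagree: an entry that starts late overlaps little of the window but still has the largest upper bound.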

* evaluation
*** Measure the performance overhead of our hacked Postgres over
    regular Postgres.  We probably shouldn't dwell on this, just say
    what the percentage was and maybe a little bit about the setup.
*** Graph total requests/sec versus the number of clients for no cache
    (but with hacked Postgres), x cache nodes, and y cache nodes
*** Graph total requests/sec versus the cache size, keeping the number
    of clients fixed at some number that pegs the system, but doesn't
    go into the overload zone.  This number will probably be
    determined from the previous graph.
*** Somehow measure the impact of freshness on performance and perhaps
    its space trade-off
*** Somehow measure the space overhead of stale data
*** Measure latency
*** Is the database or the application the bottleneck?
**** If the application is, we can also spread it across multiple web
     servers until it isn't.  If this is the case, we should *also*
     try it with the application as the bottleneck, since, unlike a
     database cache, we can help with that, too.
*** Try both the read/write workload and the read-only workload.

* misc things to think about
*** We could allow the client to choose between conservative bounding
    and explicit invalidation.  In fact, we could even do this on a
    per-transaction basis.
*** In order to simulate memcached-like behavior, I think we _need_ to
    add explicit invalidation.  Each cache line would contain only the
    latest entry put into it and it would contain it until it was
    explicitly invalidated.
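
Rough sketch of the memcached-like mode described above: one entry per cache line, kept until explicitly invalidated. The class and method names are hypothetical, and this ignores validity intervals, eviction, and size limits entirely.

```python
class InvalidatingCache:
    """Toy cache: each line holds only the latest entry put into it."""

    def __init__(self):
        self.lines = {}  # key -> latest value

    def put(self, key, value):
        # Overwrite whatever was there; older entries are not retained.
        self.lines[key] = value

    def get(self, key):
        # Returns None on a miss (never put, or invalidated since).
        return self.lines.get(key)

    def invalidate(self, key):
        # Explicit invalidation is the only way an entry goes away.
        self.lines.pop(key, None)
```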

* Problems
** Without Lamport cookies or invalidations, policies basically don't
   have anything to choose between.
*** Never mind.  This was just because the pin policy wasn't
    choosing *.
** How do we measure scalability?  If we can't saturate the cache, it
   doesn't matter how many nodes we throw at it.  But everything in
   the cache effectively expires after the freshness requirement has
   passed, which means we need to be able to saturate it in that much
   time.  It takes us forever just to saturate the buffer cache.

Why isn't it faster?
* Do we not understand where the bottleneck is?
* Add client-side statistics.  # RW transactions, # RO transactions,
  per-function comparison of latency for a cache hit versus latency
  for a cache miss, # RW queries, # RO queries
* More server-side statistics.  Lookup time, put time
* We should probably always be running with txcache, just not using RO
  transactions
* Are we simply measuring the wrong thing?  RUBiS's req/s is *purely*
  a function of latency, but perhaps that's the nature of throughput.
  What would happen if we just hit the web server as hard as possible?
  What would that mean?  How many concurrent requests do you perform?
** If it is all about latency, we should still be decreasing latency
   by decreasing load on the DB (unless that's not the main source of
   latency).  Measure the time we spend in various things, including
   all calls to the cache and calls to the DB.  Profile the cache
   server and the pin cushion to make sure they're actually fast.
* Measure the latency distribution of cache hits versus cache misses
  and the breakdown of cache misses into time spent dealing with the
  cache versus time spent in the actual call.
* Measure the latency of connecting.
* Frans thinks we're not actually overloading the database because
  it's still responding to everything.  He thinks we should look at
  iostat and figure out if IO is really bottlenecked (or CPU, but
  that's not likely).  Perhaps the right way to do this is to generate
  some fixed number of requests per second and see how many of them
  get through?
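
Rough sketch of the last idea: issue requests at a fixed rate regardless of how fast responses come back, then count how many got through. The function name run_open_loop and its parameters are hypothetical; `send` stands in for whatever issues one request (e.g. one HTTP call to the web server) and is assumed to raise on failure.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_open_loop(send, rate, duration, timeout=1.0, workers=32):
    """Issue rate*duration requests on a fixed schedule; return (issued, ok)."""
    total = int(rate * duration)
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for i in range(total):
            # Request i is due at start + i/rate, whether or not earlier
            # requests have finished (this is what makes the loop "open").
            delay = start + i / rate - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            futures.append(pool.submit(send))
        # Give stragglers a bounded amount of time after the last send.
        done, _ = wait(futures, timeout=timeout)
    ok = sum(1 for f in done if f.exception() is None)
    return total, ok
```

If the number that get through falls below the number issued as the rate goes up, the system is actually saturated. Note that the worker pool caps the number of in-flight requests, so `workers` has to be large enough that the pool itself is not the bottleneck.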
