\svnInfo $Id$  

\txcache is designed for systems consisting of a number of application
servers that interact with a database server. Here, the application
servers might be web servers running embedded scripts (\eg{with
  \texttt{mod\_php} or the like}), or they might be dedicated
application servers, as with Enterprise Java Beans. The database
server is a standard relational database; we assume that there is a
single database and the applications use it to store all of their
persistent state.

\begin{figure}[tp]
  \centering
  \includegraphics[scale=0.37]{arch.pdf}
  \caption{Components of a \txcache deployment.}
  \label{fig:architecture}
\end{figure}

\txcache introduces two new components, as shown in
Figure~\ref{fig:architecture}: a cache and a client-side library for
managing it. The cache is partitioned across a set of cache nodes,
which may run on dedicated hardware or share hardware with other
servers. These nodes store cached data as key-value mappings and keep
the data entirely in memory. As Section~\ref{sec:cache} describes in
detail, the cache is versioned: it can store multiple versions of the
same cached object, each tagged with the time interval during which
that version was valid. This versioning is used to ensure consistency:
logically, each transaction reads cached values only from a snapshot
of the database taken at a particular time.
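
As a rough illustration of the versioned lookup described above, the
following Python sketch stores several versions per key and returns the
one valid at a given snapshot time. The names \texttt{VersionedCache},
\texttt{put}, and \texttt{lookup}, the integer timestamps, and the
half-open validity intervals are assumptions made for this example;
they are not \txcache's actual interface.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Version:
    value: Any
    valid_from: int   # first timestamp at which this version is valid
    valid_until: int  # first timestamp at which it is no longer valid


@dataclass
class VersionedCache:
    # Hypothetical in-memory store: key -> all cached versions of that key.
    entries: Dict[str, List[Version]] = field(default_factory=dict)

    def put(self, key: str, value: Any, valid_from: int, valid_until: int) -> None:
        self.entries.setdefault(key, []).append(
            Version(value, valid_from, valid_until))

    def lookup(self, key: str, snapshot: int) -> Optional[Version]:
        # Return the version whose interval contains the snapshot time,
        # or None on a cache miss.
        for version in self.entries.get(key, []):
            if version.valid_from <= snapshot < version.valid_until:
                return version
        return None


cache = VersionedCache()
cache.put("item:7:price", 10, valid_from=0, valid_until=5)
cache.put("item:7:price", 12, valid_from=5, valid_until=9)
```

In this sketch, a transaction reading at snapshot time 3 would see the
value 10, one reading at time 5 would see 12, and a lookup at time 9
would miss.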

A key characteristic of the design is that the cache does not lie
directly between the clients and the database. Unlike query caches or
other middle-tier database caches~\cite{csql,dbcache,timesten},
\txcache does not cache database results directly. Instead, it caches
the results of application computations, which may be derived from
database queries. Applications typically perform some processing on
data they obtain from the database, perhaps converting it into an
internal object representation or generating an HTML page. \txcache can
cache the results of these computations, reducing the load on the
application server as well as the database. This property is
important, as the application server is a bottleneck in many web
applications~\cite{amza02:_bottl_charac_of_dynam_web_site_bench}.

\subsection{Programming Model}
\label{sec:architecture:model}

In addition to providing consistency guarantees, one of our main goals
was to make it effortless to incorporate caching into a new or existing
application. \txcache's library makes it possible to cache
computations simply by designating what should be cached. In this
section, we describe the interface it presents to the programmer and
the requirements for using it.

Programs group their operations into transactions, through the
\txcache library's \command{begin} and \command{commit}
functions. When starting a transaction, the program declares whether
it will be a read-only or read/write transaction. If read-only, the
application can also specify any requirements for how fresh the data
must be; Section~\ref{sec:stale:anomalies} discusses how applications
can use these freshness requirements.  We focus on optimizing
read-only transactions, which are the more common kind in most
workloads.\edatnote{DRKP}{Citation for this?}  Read/write transactions
do not take advantage of caching; \txcache's library forwards them
directly to the database, so they execute exactly as they would on an
unmodified system.
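
The transaction interface described above might be sketched in Python
as follows. Only the names \texttt{begin} and \texttt{commit} come from
the text; the \texttt{Transaction} class, the
\texttt{max\_staleness\_s} parameter, and the choice to reject a
freshness bound on read/write transactions are assumptions made for
illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    read_only: bool
    # Hypothetical freshness bound in seconds; None means "no requirement".
    max_staleness_s: Optional[float] = None
    committed: bool = False

    @property
    def may_use_cache(self) -> bool:
        # Only read-only transactions take advantage of caching;
        # read/write transactions go directly to the database.
        return self.read_only


def begin(read_only: bool, max_staleness_s: Optional[float] = None) -> Transaction:
    # Assumption of this sketch: freshness requirements are meaningful
    # only for read-only transactions, so reject them otherwise.
    if not read_only and max_staleness_s is not None:
        raise ValueError("freshness bound applies only to read-only transactions")
    return Transaction(read_only, max_staleness_s)


def commit(txn: Transaction) -> None:
    txn.committed = True


txn = begin(read_only=True, max_staleness_s=30.0)
# ... cacheable function calls and database queries would go here ...
commit(txn)
```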

Within a transaction, operations can be grouped into \emph{cacheable
  functions}. These are actual functions in the program's code,
annotated to indicate that their results can be cached. Cacheable
functions are essentially memoized: \txcache's library replaces them
with a wrapper function that, when called, checks whether the cache
holds the result of a previous call to the same function with the same
arguments. If so, it returns the cached value.  Otherwise, the function's
actual implementation is executed and the returned value is placed in
the cache.

A cacheable function can consist of database queries and
computation. Caching imposes some fundamental restrictions on these
functions: they must not have side effects, and they must not depend
on any inputs other than their arguments and the state of the
database. For example, it would not make sense to cache a function
that returns the current time. We believe it is reasonable to expect
programmers to identify functions that meet these requirements.

The application must perform all of its database accesses through the
\txcache library interface. However, the library interposes on
database queries only to monitor them for dependency-tracking
purposes, as described in Section~\ref{sec:library}. It does not
attempt to parse or rewrite the SQL queries themselves.

Notably, \txcache does not require applications to explicitly
invalidate cached results when they modify the database, in contrast
to \memcached. This was an important design goal, because adding
explicit invalidations requires global reasoning about the entire
application, hindering modularity: adding caching for an object
requires knowing every place it could possibly change.  For example,
consider placing a new bid on an item in our example auction
site. Clearly, any cached copies of the item's page must be
invalidated, because the price has changed. Some of the other objects
that must be invalidated are less obvious: the item's price also
appears on various search result pages, and on the home pages of all
users who bid on it. Finding all of these cached objects is not
straightforward, especially in applications so complex that no single
developer is aware of all of them.


%%% Local Variables: 
%%% mode: latex
%%% TeX-PDF-mode: t
%%% TeX-master: "paper.tex"
%%% End: 

% LocalWords:  versioned versioning php timestamp Cacheable cacheable memoized
% LocalWords:  invalidations
