\chapter{From Consistent Reads to Serializability}
\label{cha:serializability-appendix}

The protocol presented in \autoref{cha:consistency-protocol} uses
validity intervals to provide a \emph{consistent reads} property: all
data seen by a transaction reflects a consistent state of the storage
layer, regardless of whether that data was obtained from the cache or
from the database. The properties we ultimately want are system-wide
transaction isolation guarantees. This appendix explains how the
consistent reads property interacts with the storage layer's
concurrency control mechanisms to provide either serializability or
snapshot isolation, depending on which is supported by the storage
layer.

This appendix begins by reviewing the consistent reads property that
the consistency protocol provides
(\autoref{sec:serializability-appendix:consistent-reads}). We next
consider snapshot isolation databases, and show that the consistent
reads property ensures snapshot isolation
(\autoref{sec:serializability-appendix:si}.) Subsequently, we consider
serializable databases
(\autoref{sec:serializability-appendix:serializability}). We first
show that \txcache provides serializability for databases that provide
a \emph{commitment ordering} property, namely that the order in which
transactions commit matches the apparent serial order of execution. We
then consider databases that provide serializability \emph{without}
this stronger requirement, and explain how to extend the system to
support these databases.

\section{Consistent Reads}
\label{sec:serializability-appendix:consistent-reads}

\txcache ensures that for every read-only transaction, there is a
timestamp $t$ such that the validity intervals of each object read by
the transaction contains $t$. As a result, the consistent reads
property of validity intervals
(\autoref{sec:consistency-protocol:validity-intervals:properties})
means that the transaction sees the effects of all read/write
transactions that committed before time $t$ (and no later
transactions), \ie the transaction sees the same data it would if its
reads were executed on the storage layer at time $t$.

Recall also that read/write transactions do not use the cache, so
their reads are sent directly to the database. (We do not consider
here the extensions in \autoref{cha:rw} that allow read/write
transactions to use the cache.)

\section{Snapshot Isolation Databases}
\label{sec:serializability-appendix:si}

The property described above is similar to the snapshot isolation
level provided by many multiversion databases, such as Oracle and
\postgres. This isolation level is defined as follows:

\begin{itemize}
\item \textbf{snapshot isolation}: a transaction always reads data
  from a snapshot of committed data valid as of its
  \emph{start-timestamp}, which may be any time before the
  transaction's first read; updates by other transactions active after
  the start-timestamp are not visible to the transaction. When a
  transaction is ready to commit, it is assigned a
  \emph{commit-timestamp} larger than any existing start-timestamp or
  commit-timestamp, and allowed to commit only if no concurrent
  transaction modified the same data.
\end{itemize}
(This definition is based on the one given by Berenson
et~al.~\cite{berenson95:_critiq_of_ansi_sql_isolat_level};
Adya~\cite{adya99:_weak_consis} provides an implementation-independent
definition in terms of proscribed phenomena in the serialization
graph.)

When used with a snapshot isolation database, \txcache provides
snapshot isolation for all transactions. Read/write transactions
certainly have snapshot isolation, because they bypass the cache and
execute directly on the database using its normal concurrency control
mechanisms. Read-only transactions can use cached data, but \txcache's
consistent reads property ensures that transactions are assigned a
timestamp and only see the effects of transactions that committed
before that timestamp.

\txcache differs from snapshot isolation databases in that it allows
read-only transactions to see a consistent state of the database from
\emph{before} the transaction started (subject to the
application-specified freshness requirement). That is, a read-only
transaction's start-timestamp might be lower than the latest
commit-timestamp. Our definition of snapshot isolation allows this,
though not all definitions do. The original
definition~\cite{berenson95:_critiq_of_ansi_sql_isolat_level} is
ambiguous; Adya's definition~\cite{adya99:_weak_consis} explicitly
permits the system to use an earlier start time; Elnikety
et~al.~\cite{elnikety05:_datab_replic_using_gener_snaps_isolat} refer
to this property as \emph{generalized} snapshot isolation,
distinguishing it from conventional snapshot isolation. Note, however,
that \txcache only provides this relaxed freshness guarantee for
read-only transactions; read/write transactions still execute with
snapshot isolation under either definition.

\section{Serializability}
\label{sec:serializability-appendix:serializability}

The standard criterion for correct execution of concurrent
transactions is serializability:
\begin{itemize}
\item \textbf{serializability}: the execution of concurrent
  transactions produces the same effect as some serial execution of
  the same transactions, \ie some execution where only one transaction
  executes at a time
\end{itemize}
This definition is implementation-independent; there are many
possible ways to implement serializability. For example, two-phase
locking~\cite{eswaran76:_notion_consis_predic_locks_datab_system},
optimistic concurrency
control~\cite{kung81:_optim_method_for_concur_contr}, timestamp
ordering~\cite{reed78:_namin_and_synch_in_decen_comput_system},
serialization graph
testing~\cite{casanova81:_concur_contr_probl_datab_system}, and
several other techniques can provide serializability.

Many (though not all) systems that provide serializability also
provide the following stronger property:
\begin{itemize}
\item \textbf{commitment ordering}\footnote{The \emph{commitment
    ordering} terminology is due to
  Raz~\cite{raz92:_princ_commit_order}; the same property has also
  been referred to as \emph{dynamic
    atomicity}~\cite{weihl84:_specif_implem_atomic_data_types} and
  \emph{strong recoverability}~\cite{breitbart91:_rigor_trans_sched}.}:
the execution of concurrent transactions produces the same effect as a
serial execution of the transactions \emph{in the order they
  committed}
\end{itemize}
If the database provides serializability with this commitment ordering
property, then \txcache's validity-interval-based consistency protocol
is sufficient to provide serializability
(\autoref{sec:serializability-appendix:co}). If the database does not
provide this property, then we require some additional support from
the database to provide serializability, which might require read-only
transactions to sometimes block
(\autoref{sec:serializability-appendix:non-co}).

\subsection{Serializable Databases with the Commitment Ordering
  Property}
\label{sec:serializability-appendix:co}


Two of the most common concurrency control techniques provide this
property. Strict two-phase locking ensures that if two concurrent
transactions attempt to make conflicting accesses to the same object,
the second access is blocked until the first transaction completes,
thereby ensuring that if one transaction precedes another in the
serial order it also precedes it in the commit order.  Optimistic
concurrency control achieves this property by preventing transactions
from committing if they conflict with a previously-committed
transaction.

When used with a database that provides serializability and the
commitment ordering property, \txcache guarantees serializability,
even for transactions that use cached data. Read/write transactions
bypass the cache, so the database's concurrency control mechanism
ensures that there is a corresponding serial order -- and that this
serial order matches the order in which the transactions committed.
\txcache's consistent reads property ensures that each read-only
transaction sees the effects of all transactions that committed before
that transaction's timestamp. This set of transactions is a
prefix of the serial order, so the read-only transaction can be
serialized at that point in the serial order.

The idea of using snapshot reads for read-only transactions while
using a standard concurrency control mechanisms for read/write
transactions is not a new one. The first instance appears to be a
database system from Prime Computer that uses two-phase locking for
read/write transactions, but uses multiversion storage to ensure that
read-only transactions see a consistent
snapshot~\cite{chan82:_implem_integ_concur_contr_recov_schem,
  chan85:_implem_distr_read_only_trans}. Bernstein and Goodman later
generalized this technique as the ``multiversion mixed
method''~\cite{bernstein83:_multiv_concur_contr}. A similar technique
can be achieved with many databases today by running read/write
transactions with two-phase locking but running read-only transactions
under snapshot isolation.

\subsection{Serializable Databases without the Commitment Ordering Property}
\label{sec:serializability-appendix:non-co}


Not all serializability implementations have the commitment ordering
property described above. For example, approaches based on
serialization graph
testing~\cite{casanova81:_concur_contr_probl_datab_system} do not. One
example is Serializable Snapshot Isolation
(SSI)~\cite{cahill08:_serial_isolat_snaps_datab}, which is relevant
because it is used as serializable isolation level in the \postgres
DBMS~\cite{ports12:_serial_snaps_isolat_postg} that we use in our
implementation of \txcache. This approach guarantees serializability,
but the serial order may not match the order in which transactions
commit.

Applying \txcache to a database like this requires some additional
support from the database.  The reason is that it is more difficult is
that \txcache's consistent reads property ensures that each read-only
transaction sees the effects of all transactions that committed before
a certain time. In other words, the read-only transaction sees the
effects of the transactions in a prefix of the \emph{commit} order,
but this might not be a prefix of the \emph{serial} order.

To see how consistent reads might not be enough to ensure
serializability, consider an example of how the commit order might
differ from the serial order in \postgres's Serializable Snapshot Isolation.
In this example, we simulate a transaction-processing system that maintains two
tables.\footnote{Kevin Grittner suggested this example.} A \emph{receipts} table tracks the day's receipts, with each
row tagged with the associated batch number. A separate \emph{control}
table simply holds the current batch number. There are three
transaction types:

\begin{itemize}
\item \sql{new-receipt}: reads the current batch number from the
  control table, then inserts a new entry in the receipts table tagged
  with that batch number
\item \sql{close-batch}: increments the current batch number in the
  control table
\item \sql{report}: reads the current batch number from the control
  table, then reads all entries from the receipts table with the
  \emph{previous} batch number (\ie to display a total of the previous
  day's receipts)
\end{itemize}

If the system is serializable, the following useful invariant holds:
after a \sql{report} transaction has shown the total for a particular batch,
subsequent transactions cannot change that total. This is because the
\sql{report} shows the previous batch's transactions, so it must follow
a \sql{close-batch} transaction. Every \sql{new-receipt} transaction
must either precede both transactions, making it
visible to the \sql{report}, or follow the \sql{close-batch}
transaction, in which case it will be assigned the next batch number.

\begin{figure}
  \centering
  \input{batch-processing-example}
  \caption{Example of an anomaly that can occur when the commit order does not match
    the serial order}
  \label{fig:example-batch}
\end{figure}

Consider the interleaving of transactions $T_1$ and $T_2$ in the
interleaving shown in \autoref{fig:example-batch}. (Ignore the
read-only transaction $T_R$ for the moment.) In this interleaving, the
\sql{new-receipt} transaction $T_1$ first reads the current batch
number, then uses it to insert a new value into the receipts
table. While it is executing, however, the \sql{close-batch}
transaction $T_2$ increments the batch number and commits first -- so
the batch number that $T_1$ read is no longer current by the time it
commits. This execution would not be permitted by either strict
two-phase locking or optimistic concurrency control. However, this
execution \emph{is} a valid serializable history -- it is equivalent
to the serial execution $\tup<T_1, T_2>$ -- and \postgres's
serializable mode does allow it.

Note that although there is an equivalent serial order $\tup<T_1,
T_2>$ for these two transactions, they commit in the opposite
order. This means that the database may have a temporarily
inconsistent state in the interval between when they commit, which is
only a problem if some other transaction observes the inconsistent
state. For example, the read-only transaction $T_R$ in
\autoref{fig:example-batch} reads both the current batch number and
the list of receipts for the previous batch; it sees $T_2$'s
incremented batch number but not $T_1$'s new receipt, violating
serializability. To prevent serializability violations like this, the
database needs to monitor the data read by each transaction, even
read-only ones. If executed directly on the database, \postgres would
detect the potential serializability violation here, and prevent it by
aborting one of the transactions.

This concurrency control approach is problematic for \txcache because
the database needs to be aware of the data read by each transaction,
even if that transaction is read-only. With \txcache, some of the data
accessed by a transaction might come from the cache, and the database
would be unaware of this. Validity intervals aren't sufficient to
maintain consistency because they reflect the commit order of
transactions, not their serial order.

% Accordingly, \txcache requires
% that the storage layer support serializability using a mechanism that
% ensures the commit order matches the serial order.

However, it is desirable to support databases such as \postgres that
do not have this commitment ordering property. We can support
these databases by observing that the consistency protocol in
\autoref{cha:consistency-protocol} ensures that the data read by any
read-only transaction is consistent with the database state at some
timestamp \emph{that corresponds to a pinned snapshot}. Therefore, it
is sufficient if pinned snapshots are taken only at points where the
commit order matches the equivalent serial order of execution, \ie
points where it is not possible for a subsequent transaction to commit
and be assigned an earlier position in the serial order. \postgres
already has a way to identify these points, referred to as \emph{safe
  snapshots}, as they are useful for certain in-database optimizations
for read-only transactions~\cite{ports12:_serial_snaps_isolat_postg}.
We can ensure that \txcache interoperates correctly with \postgres's
serializable mode as long as we ensure that all pinned snapshots are
also safe snapshots; this is done by having the \command{pin} command
wait until the next safe snapshot is available. One implication of
this is that read-only transactions may have to block while they wait
for a safe snapshot to become available. 

%%% Local Variables: 
%%% mode: latex
%%% TeX-command-default: "Make"
%%% TeX-PDF-mode: t
%%% TeX-master: "main.tex"
%%% End: 
% LocalWords:  timestamp serializability serializable SSI OCC
%  LocalWords:  multiversion