\ifdraft
\svnInfo $Id$
\fi

\section{Architecture}
\label{sec:implementation:architecture}

Our implementation of Arpeggio is divided into three layers, as shown
in Figure~\ref{fig:implementation:architecture:modules}; for maximum
flexibility, the core indexing functionality is separated from the
underlying DHT implementation and from the user interface. At the
lowest level, the routing layer encapsulates the functions provided by
the DHT. The indexing layer implements the algorithms described in
Chapter~\ref{chap:indexing}, making it possible to index and search
file metadata. Finally, the user interface layer has several
components, making it possible to access the indexing functionality
through various means.

\begin{figure}[tbp]
  \centering
  \includegraphics[scale=0.8]{figures/architecture}
  \caption{Arpeggio implementation architecture: key modules}
  \label{fig:implementation:architecture:modules}
\end{figure}

\section{Routing Layer}
\label{sec:implementation:routing}

The routing layer of Arpeggio provides the DHT's \proc{Lookup}
primitive. It is separated into its own layer, with an extremely
narrow RPC interface, for two reasons. First, this separation makes it
possible to substitute a different DHT instead of Chord, as the upper
layers of the system are independent from the details of the DHT
algorithm. Second, for ease of implementation, it allows the Arpeggio
codebase to be separated from that of the DHT.


Our implementation of Arpeggio's routing layer consists of a daemon
called \texttt{cd} (\emph{Chord daemon})\footnote{In retrospect, it
  should have been clear that choosing a name identical to that of a
  common UNIX shell command was not conducive to sanity.}. This daemon
is implemented in C++, using the \texttt{libasync} event-driven
framework~\cite{mazieres01:_toolk_for_user_level_file_system}, and
links against the MIT Chord codebase. It accepts remote procedure
calls and presents the interface described in
Table~\ref{tab:implementation:routing:cd-interface}. This interface
essentially consists of the bare \proc{Lookup} primitive. Notably, it
does not include the higher-level \proc{Get-Block}/\proc{Put-Block}
DHT storage layer functions; implementing a replicated storage layer
is left to the indexing layer in order to enable techniques such as
index-side filtering
(Section~\ref{sec:indexing:design-evolution:index-side-filtering}).

\begin{table}[tb]
  \centering
  \caption{\texttt{cd} RPC interface}
  \label{tab:implementation:routing:cd-interface}
  \begin{tabular}{|l|l|}
    \hline
    \textbf{Function} & \textbf{Description} \\
    \hline
    \texttt{newchord} & Create a Chord instance and join a ring\\
    \texttt{unnewchord} & Shut down the Chord instance\\
    \texttt{lookup} & Look up the successors of a key\\
    \texttt{getsucclist} & Return the list of successors of our node\\
    \texttt{getpredlist} & Return the list of predecessors of our node\\
    \hline
  \end{tabular}
\end{table}

In order to enable replication, the RPC interface also includes
functions for obtaining the current successor and predecessor
lists. These are used for implementing index replication, since in
Chord a node will only replicate data that is stored on an adjacent
node. For use with other DHTs, these functions could be replaced by
their more general equivalent: finding the set of nodes on which data
should be replicated. 

\section{Indexing Layer}
\label{sec:implementation:indexing}

The indexing layer consists of the Arpeggio daemon, \texttt{arpd},
which implements the Arpeggio indexing algorithms. \texttt{arpd}
accepts high-level commands such as ``add metadata to index'' from the
user interface, via a simple RPC interface described in
Table~\ref{tab:implementation:indexing:arpd-ui-interface}. 

\begin{table}[tbp]
  \caption{\texttt{arpd} RPC interface presented to UI}
  \label{tab:implementation:indexing:arpd-ui-interface}
  \centering
  \begin{tabular}{|l|l|}
    \hline
    \textbf{Function} & \textbf{Description} \\
    \hline
    \texttt{insert} & Add a metadata block to the appropriate
    indexes\\
    \texttt{search} & Perform a search for files matching a given
    query\\
    \hline
  \end{tabular}
\end{table}

\begin{table}[tbp]
  \centering
  \caption{\texttt{arpd}--\texttt{arpd} RPC interface}
  \label{tab:implementation:indexing:arpd-arpd-interface}

  \begin{tabular}{|l|l|}
    \hline
    \textbf{Function} & \textbf{Description} \\
    \hline
    \texttt{add} & Add a metadata block to all appropriate indexes on
    this node\\
    \texttt{search} & Search a particular index for metadata blocks
    matching a query\\
    \texttt{gatewayInsert} & Request that an index gateway refresh
    indexes if necessary\\
    \texttt{offerBlocks} & Inform a remote node about blocks that it
    should replicate\\
    \texttt{requestBlocks} & Request a set of blocks (by hash) from a
    remote node\\
    \hline  
  \end{tabular}
\end{table}

\texttt{arpd} is implemented in Python, using the Twisted asynchronous
framework~\cite{fettig06:_twist_networ_progr_essen}. Each
\texttt{arpd} node presents several interfaces, as shown in
Figure~\ref{fig:implementation:architecture:modules}. \texttt{arpd}
communicates with the Chord daemon, \texttt{cd}, via a local RPC
connection over a Unix domain socket in order to perform Chord
lookups; it communicates with other instances of \texttt{arpd} on
other nodes in the network via Twisted's Perspective Broker remote
method invocation system~\cite{introd_to_persp_broker}. Finally, it
operates a HTTP interface for the user interface.

To participate in Arpeggio indexing, each node in the network operates
an \texttt{arpd} daemon. Upon startup, \texttt{arpd} starts the Chord
daemon, \texttt{cd}, and establishes a local RPC connection to it over
a Unix domain socket. \texttt{arpd} then creates a Chord node, and
connects it to a user-specified well-known node for bootstrap
purposes. Once the connection is established, the local \texttt{arpd}
begins communicating with its neighbors in the Chord ring in order to
begin synchronizing the data it is now responsible for replicating.

Each node acts as a client, an index gateway, and an index
server. Upon receiving a request to insert a file into the index,
\texttt{arpd} performs a Chord lookup for the key corresponding to the
file. It contacts the \texttt{arpd} instance on that node, which acts
as an index gateway, generating the keyword sets and contacting
whichever index servers are necessary. As an index server, the node
receives inserted blocks, answers queries, and periodically checks for
block expiration and participates in the index replication protocol.

Each index entry is stored on $k$ replicas, where $k$ is a parameter
that can be adjusted based on a tradeoff between index data durability
and maintenance cost; our implementation currently uses $k=3$. Nodes
periodically participate in a replication protocol to ensure that each
of the replicas responsible for an index has all of the data
associated with that index. This protocol operates as follows:
periodically, or when churn is detected, a node contacts its $k$
successors and predecessors and sends it a ``\proc{Offer-Blocks}''
request, providing the hashes of every index entry it has that the
other replica should have. Note that this can be computed locally,
because of the use of consistent hashing to map indexes to responsible
nodes. The remote node uses the hashes to see if any index entries are
missing; if so, it sends a ``\proc{Request-Blocks}'' RPC to the
original node to request their contents. As an optimization, the
expiration time for each block is included in the
``\proc{Offer-Blocks}'' request so that if an index entry has simply
been refreshed, its contents do not need to be retransmitted.

This replication protocol is greatly simplified compared to some
alternatives, such as the Merkle-tree-based synchronization protocol
used by DHash~\cite{cates03:_robus_and_effic_data_manag}. We have
chosen this option primarily for simplicity of implementation. \marnote{Other protocol?}

\section{User Interface}
\label{sec:implementation:interface}

The implementation of Arpeggio is designed such that the user
interface is separate from the indexing implementation. This means
that a variety of user interfaces can be combined with the indexing
core, similar in spirit to the giFT file sharing
program~\cite{gift_projec}. This allows it to be used with different
file sharing programs, or for general searching. For example, a set of
command-line tools can be used to add files to the index or perform
searches; these can coexist with plugins in BitTorrent or Gnutella
clients.

At a high level, the interface requires two components: a means for
the user to perform searches, and a mechanism for registering shared
files with the index. Here, we present an interface that meets these
requirements with a web-based search interface, and a plugin for a
popular BitTorrent client that indexes shared torrents.

\subsection{Web Search Interface}
\label{sec:implementation:interface:web}

By embedding a HTTP server into the \texttt{arpd} daemon, the web
interface allows the user to perform searches simply by pointing a web
browser at the local host on a certain port. The interface is very
simple, and consists simply of a search form that allows keyword
queries, as shown in
Figure~\ref{fig:implementation:interface:web:screenshot}; the field
names for metadata entries are configurable.

We choose to use a web interface because of its near-universality; it
can be used in conjunction with any file sharing client, or by
itself. In particular, users of BitTorrent clients are often
accustomed to performing their searches on tracker websites, and many
clients, such as \texttt{ktorrent}~\cite{ktorrent}, include integrated web
browsers for accessing search sites.

Each result returned via the web interface includes a link for
downloading the file. As described in
Section~\ref{sec:content-distribution:bittorrent}, this is a magnet
link containing the file's hash, which many clients can use to
initiate a download.

\begin{figure}[tbp]
  \centering
  \includegraphics[scale=0.5]{figures/web-interface-screenshot}
  \caption{Screenshot of web search interface}
  \label{fig:implementation:interface:web:screenshot}
\end{figure}

\subsection{Azureus Indexing Plugin}
\label{sec:implementation:interface:azureus}

Besides searching, the other necessary component of a user interface
is a system for registering shared content with the index. This can be
achieved by adapting a file sharing client to notify the Arpeggio
indexing daemon of which files it has available. We have implemented
this functionality as a plugin for the popular Azureus BitTorrent
client.

The Arpeggio plugin for Azureus operates in the background, without
user interaction; its only interface is a status display shown in
Figure~\ref{fig:implementation:interface:azureus:screenshot}. The
plugin consists of a set of \emph{metadata extractors} that examine
the torrent files and generate metadata entries, plus a core that
combines the generated metadata and sends it to the local
\texttt{arpd} daemon. These metadata extractor plugins can generate
metadata that is always available, such as the keywords from the
torrent's name and the files contained within or the hash of the file,
or metadata entries for certain types of files, such as the ID3 tags
of a MP3 file or their equivalent for other media files.  The plugin
communicates with \texttt{arpd} over a local
XML-RPC~\cite{winer99:_xml_rpc_specif} connection. It periodically
sends requests to re-register the metadata for files being shared, in
order to keep the index entries from expiring.

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{figures/azureus-screenshot}
  \caption{Azureus indexing plugin interface}
  \label{fig:implementation:interface:azureus:screenshot}
\end{figure}

\section{Deployability}
\label{sec:implementation:deployability}

Though Arpeggio has been implemented, ensuring that it is widely
adopted requires some further issues to address. The system must be
easy for end-users to use, and provide an incentive for users to use
it over the current alternatives.

The current implementation of Arpeggio uses the MIT Chord
implementation in its routing daemon (\texttt{cd}). This presents a
problem as the MIT Chord implementation is a research prototype, and
cannot easily be used by end users. It requires many dependencies and
can be difficult to build; it also runs only on Unix-based systems,
not under Windows. To address this problem, \texttt{cd} can be
replaced with a more robust implementation, using a either a different
implementation of Chord or a different distributed hash table, such as
the Kademlia implementations currently used by a number of BitTorrent
clients.

Since many BitTorrent clients already implement their own DHT, it is
natural to wonder whether Arpeggio can use the same DHT, eliminating
the need for clients to run two separate DHTs. However, it is not
straightforward to do so because Arpeggio's indexing system requires
the nodes to support network-side processing features such as
index-side filtering. One option for integrating it with an existing
DHT would be to use a protocol for subgroup lookup such as
Diminished~Chord~\cite{karger04:_dimin_chord}, or OpenDHT's ReDiR
service. With Diminished~Chord's ``lookup key in subgroup'' primitive,
which efficiently finds the immediate successor of a particular key
among the set of nodes in a subgroup, a single DHT could be used for
routing even if only some nodes are participating in our indexing
system.

Though it violates the peer-to-peer nature of the system, it is
possible to configure the user interface tools to contact a remote
\texttt{arpd} daemon rather than one running locally. This could be
useful if some users are unable to run their own indexing daemon.

In order for users to begin using Arpeggio, it must provide an
improvement over existing services. As currently implemented, it
provides largely the same functionality as existing BitTorrent
websites. Its advantage is that it operates cooperatively, and hence
can scale better and respond better to failures than a centralized
solution. However, there is a network effect involved: if no users are
registering files with the system, then no new users will have any
incentive to use it for their searches. To aid in deployment, the
index can be bootstrapped using data from existing indexes of
files. For the purposes of evaluating our system, we developed a tool
(Section~\ref{sec:evaluation:corpora:bittorrent}) for periodically
scanning the top BitTorrent search websites for new torrents; this
could be adapted to register the torrents it discovers with the
Arpeggio index, creating an initial index. Such a tool could be run by
any user of the network, and would only need to be run by a single
user. As users begin to use the system, the index registration plugins
in their BitTorrent clients would register the files they are sharing,
eventually eliminating the need for this measure.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "thesis"
%%% TeX-command-default: "rubber"
%%% End: 

% LocalWords:  DHT metadata DHT's Lookup RPC Azureus BitTorrent plugin ktorrent
% LocalWords:  arpd Kademlia