\ifdraft
\svnInfo $Id$
\fi

We have introduced the Arpeggio content sharing system, which provides
both a metadata indexing system and a content distribution
system. Arpeggio differs from other peer-to-peer networks in that both
aspects are implemented using a DHT-based lookup algorithm, allowing
it to be both fully decentralized and scalable. We extend
the standard DHT interface to support not only lookup-by-key but also
search-by-keyword queries. To make distributed indexing scale
to the size of a peer-to-peer file sharing network, we introduce
network-side processing techniques such as index-side filtering, index
gateways, and expiration. For metadata indexing, where the average
number of keywords per file is small, we improve query load balancing
by indexing based on keyword sets rather than individual keywords. 
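
As a rough illustration of why this remains affordable, suppose (purely
as an assumption for this sketch) that a file with $m$ metadata keywords
is indexed under every nonempty keyword subset of size at most $K$; the
symbols $m$, $K$, and $I$ are introduced here only for illustration. The
number of index entries for that file is then
\[
  I(m) = \sum_{i=1}^{K} {m \choose i},
\]
which equals $2^m - 1$ when $K \ge m$ and grows as $O(m^K)$ for fixed
$K$. For example, $m = 5$ and $K = 3$ give $5 + 10 + 10 = 25$ entries,
so the indexing overhead stays modest as long as $m$ is small.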

For the content-distribution side of the system, we provide two
options. We introduce a new content-distribution system based on
indirect storage via Chord subrings that uses chunking to leverage
file similarity, thereby improving availability and transfer speed.
It further enhances availability through postfetching, which uses cache
space on other peers to replicate rare but in-demand
files. Alternatively, for ease of deployment, we present a simpler
option that uses the trackerless BitTorrent system to handle many
of the content-distribution tasks.

We have implemented Arpeggio as a core indexing module that
can be used with a variety of user interfaces. Currently available are
a web interface that allows convenient searches for BitTorrent files,
and a plugin for a popular BitTorrent client that automatically
registers shared files with the distributed index.

Our simulation studies evaluate the feasibility of using Arpeggio to
index files from Gnutella, various BitTorrent search sites, and the
FreeDB CD database. We find that keyword-set indexing improves query
load balancing, and that Arpeggio scales well under a
Gnutella-based synthetic workload.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "thesis"
%%% TeX-command-default: "rubber"
%%% End: 

% LocalWords:  metadata DHT lookup BitTorrent trackerless postfetching subrings
% LocalWords:  plugin Gnutella FreeDB
