\documentclass[11pt]{article}
%\usepackage{fullpage}

\begin{document}
\title{Efficient Searching in a \textrm{Chord}-based Peer-to-Peer
  System \vspace{.5em} \\
  \large UROP Proposal: Spring 2005}
\author{Dan Ports \\
  \texttt{drkp@mit.edu}}
\maketitle


\section*{Project Overview}
\label{sec:overview}


The peer-to-peer model has become a popular architecture for internet
services such as file-sharing networks. Rather than store data on a
centralized server or group of servers, data is stored on \emph{every}
node in the network, and nodes communicate directly with each
other to request data. The challenge, then, is to locate the nodes
containing the desired data.

The Chord distributed hash lookup primitive, developed at MIT, solves
one important problem that makes it useful for developing peer-to-peer
systems: given an identifier for a piece of data, it can determine the
node responsible for storing that piece of data. It does so
efficiently and has other desirable properties: it is entirely
decentralized, and it copes well with nodes joining and leaving the
network.
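
To make this guarantee concrete, the following is a brief sketch of the
key-to-node mapping from the published Chord design; the symbols $m$,
$N$, $\mathcal{N}$, and $\mathrm{successor}$ are notation introduced
here for illustration. Nodes and keys are both hashed to $m$-bit
identifiers arranged on a circle modulo $2^m$, and a key with
identifier $k$ is assigned to the first node whose identifier equals or
follows $k$ on that circle:
\[
  \mathrm{successor}(k) = \arg\min_{n \in \mathcal{N}} \; (n - k) \bmod 2^m ,
\]
where $\mathcal{N}$ is the set of identifiers of the nodes currently in
the network. In a network of $N$ nodes, each node keeps routing state
for only $O(\log N)$ other nodes, and a lookup resolves
$\mathrm{successor}(k)$ using $O(\log N)$ messages with high
probability.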

In the case of a file-sharing network, this means that Chord is very
effective for finding out which node has a copy of a particular file.
It cannot, however, search for \emph{any} file that
matches some search criteria. This problem is difficult to solve in
general, but it becomes much simpler when only a small set of metadata
(e.g.\ the file name) is searchable rather than the whole document.
This project's goal is to add metadata-searching capabilities to a
Chord-based network.


\section*{Project Plan}
\label{sec:plan}

This project began during the summer of 2004. It is still in progress,
and I will continue to work on it this semester.

I have been working directly with Professor David Karger (32-G592,
258-6167, \texttt{karger@mit.edu}) of the Computer Science and
Artificial Intelligence Laboratory on this project, along with Austin
Clements, another UROP student.

We have been designing and implementing a framework for searching an
index stored in a Chord-based distributed hash table. For pragmatic
reasons, this has also involved infrastructure work related to the
implementation of the Chord algorithm and distributed hash table
services. Once these components are complete, we will be able to
implement peer-to-peer applications using this system.
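
To illustrate what searching such an index can involve, consider a
simple inverted index. This is a sketch under assumed notation ($M(f)$,
$I(w)$, $h$, and $Q$ are introduced here for illustration) and is not
necessarily the design this project will adopt. If $M(f)$ denotes the
set of metadata keywords of a file $f$, the index entry for a keyword
$w$ is
\[
  I(w) = \{\, f : w \in M(f) \,\},
\]
which could be stored in the distributed hash table under the key
$h(w)$ for some hash function $h$. A query consisting of a set of
keywords $Q$ then matches the files in
\[
  \bigcap_{w \in Q} I(w),
\]
so answering it takes one lookup per keyword followed by an
intersection of the retrieved sets.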

The first application will be a file-sharing program, similar in
function to Napster, Gnutella, etc., but made more scalable through
the use of our distributed index system. We also intend to address
other issues relevant to file-sharing networks, including how to
effectively transfer files as well as index them, and how to ensure
adequate availability of files. The result will be a usable program
suitable for public release. We will test it to identify the
performance characteristics of our index system.

A preliminary paper\footnote{A. T. Clements, D.  R. K. Ports, D. R.
  Karger. \emph{Arpeggio: Metadata Searching and Content Sharing with
    Chord}. Proc. 4th International Workshop on Peer-to-Peer Systems,
  to appear.} related to our design for the Arpeggio content-sharing
system has been accepted for publication and will appear in print
shortly. We presented this paper at the Fourth International
Workshop on Peer-to-Peer Systems in Ithaca, NY, in February 2005.

If time permits, we may also develop other peer-to-peer applications
using the index system, such as a distributed ``yellow-pages''
directory system, or other indices for name resolution.


\end{document}