Recent years have seen a dramatic increase in the popularity of peer-to-peer file sharing systems. These systems, which allow users to locate and obtain files shared by other users, allow data to be transferred more efficiently than the traditional client-server model by utilizing peers' upload bandwidth and break the complexity barrier of content publication. Moreover, they can be implemented without the need for a central server. However, many current file sharing networks trade-off scalability for correctness, resulting in either systems that scale well but sacrifice completeness of search results or vice-versa. The unique abilities of distributed hash tables are ideal for constructing peer-to-peer networks because they overcome the difficulties of quickly and correctly locating peers. However, the lookup query primitives provided by DHTs are not immediately sufficient to perform complex search queries of the data stored in the network, nor it is clear how to construct interfaces for search queries without sacrificing scalability or query completeness. We will present the design for Arpeggio, a novel file-sharing system that uses the indexing primitives of the Chord distributed hash table to construct a system that facilitates the finding and sharing of bulk data. This design retains many of the advantages of a central index such as completeness and speed of queries, while providing the scalability and other benefits of full decentralization. By utilizing the properties of distributed hash tables, Arpeggio provides guarantees on network scalability, the correctness of search results, the speed of searches, and the availability of data. The amount of network traffic scales approximately O(n log n) with the number of files present in the network, which itself scales approximately linearly with the number of users, thus achieving scalability. The system can consistently locate even rare files scattered throughout the network, thus achieving perfect recall. Searches generally require O(1) queries to be sent to the network, and the response length is O(n) in the number of files that actually match the query. Multiple forms of caching are employed to improve file availability by ensuring both that the supply of files in high-demand is actively increased to meet demand and that rare files are made available to those requesting them.