

\section{Conclusion}
\label{sec:conclusion}

In this paper, we have reviewed algorithms from the following families:
hierarchical, partitioning, density-based, and
grid-based. Partitioning algorithms are very popular in practice, but they
can have long run times and may return a local minimum instead of
the global one. They also have difficulty dealing with outliers and
non-spherical clusters.  Other algorithms, such as OPTICS, are also
fairly slow, particularly when faced with high-dimensional data.
However, this sacrifice in efficiency is made in the interest of more
accurate results when clusters are arbitrarily shaped.

Many of these algorithms are parameterized, which introduces the
difficulty of determining appropriate values for those parameters.
For example, DBSCAN can perform quite well on arbitrarily shaped
clusters, but only if its parameters are well-suited to the
densities of the clusters in the data.  While most of these algorithms
are meant for numerical data, Boolean data can be clustered as well,
as demonstrated by ROCK.  CURE can deal with elongated clusters, but
not with completely arbitrary shapes.  DBSCAN and WaveCluster go
further and work well with arbitrarily shaped clusters.

There are many other clustering algorithms that are not included
here, but these were chosen because they give a good cross-section of the
clustering options that are currently available.  Each of the
algorithms we have discussed here has its own strengths and
weaknesses, and while they all aim to solve the same problem, they vary
greatly in implementation.  No single algorithm discussed here is
the best across the board.  Anyone in need of a clustering
algorithm should consider all of these options to determine
which type of algorithm would work best for their particular
situation.

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "paper"
%%% End: 
