\documentclass{article}
\input{6897-preamble}

\begin{document}
\psetnum{7}
\date{2005/03/30}
\begin{pset}
  We give an algorithm for sorting $n$ $w$-bit integers on a word RAM
  with word size $w$ in expected time $\bigO{n \lg \frac{w}{\lg
      n}}$. This is accomplished using a technique reminiscent of van
  Emde Boas.

  \begin{theorem}
    The problem of sorting $n$ $w$-bit integers can be reduced in
    expected $\bigO{n}$ time to a set of sorting problems on
    $\nicefrac w2$-bit integers, with $n$ input integers in total.
  \end{theorem}

  \begin{proof}
    Let $x_1, \ldots x_n$ be the input integers, and define $x_i^H$
    and $x_i^L$ to be the high- and low-order $\nicefrac w2$ bits of
    $x_i$ respectively. We begin by building a hash table (call it
    $I$) keyed on the high-order bits $x_i^H$. The hash table can be
    chosen to be sufficiently large that in expectation there will be
    no collisions. Then (assuming no collisions) $I[k]$ contains all
    $x_i$ with high-order bits equal to $k$. While doing so, we also
    maintain a list of the values $y$ such that $I[y]$ is non-empty.
    We sort these high-order bits $y$ with a recursive call.

    With this ordering of the high-order bits, we can proceed to sort
    the integers. For each $y$ in increasing order, we scan through
    $I[y]$ to identify the minimum, output it, and then remove it.
    Then we recursively sort all $x \in I[y]$ (except the removed
    minimum) by the low-order bits $x^L$, and output them in that
    order. Since this is done in increasing order based on the
    high-order bits, and then on the low-order bits, it produces the
    correct sorted output.

    Note that all integers sorted by recursive calls are $\nicefrac
    w2$ bits long, and there are a total of $n$ of them over all the
    recursive calls: if there are $h$ different high-order prefixes
    over all the input integers, then there will be $h$ high-order
    bits to sort in the first step, and a total of $n-h$ low-order
    bits over the remaining steps (since the minimum of each of the
    $h$ sets is removed). Identifying the minimum can of course be
    done in linear time because the total number of elements in the
    sets is $n$.
  \end{proof}

  \begin{corollary}
    On a word RAM with word size $w$, $n$ $w$-bit integers can be
    sorted in expected time $\bigO{n \lg \frac{w}{\lg n}}$.
  \end{corollary}

  \begin{proof}
    We use the recursion described above to reduce the input to a set
    of smaller problems with half the word length. We repeatedly apply
    this recursion until the word size reaches $\lg n$. At this point,
    we can apply radix-sort to sort the subproblem in linear time.

    By the theorem above, the recurrence produced is
    \[ T(n,w) =
    \begin{cases}
      \bigO{n} + \sum_i T\left(n_i, \frac{w}{2}\right) \text{ where }
      \sum_i n_i
      = n & \text{ if } w > \lg n \\
      \bigO{n} & \text{ if } w \le \lg n
    \end{cases}
    \]

    Consider the resulting recursion tree. The branching factor may
    vary (depending on the number of unique high-order prefixes in
    each subproblem), but at each level the total size of the
    subproblems will still be $n$, and so the amount of work done at
    each level will be $\bigO{n}$. Moreover, at each level the word
    size is reduced by half (independent of the value of $n$). The
    recursion is performed until the word size is decreased from $w$
    to $\lg n$, so the tree has height $\lg \frac{w}{\lg n}$. So the
    total time required is $\bigO{n  \lg \frac{w}{\lg n}}$.
  \end{proof}
\end{pset}
\end{document}
 