\documentclass{article}
\input{6897-preamble}
\usepackage{clrscode}

\begin{document}
\psetnum{4}
\date{2005/03/01}

\begin{pset}
  We represent the graph as a forest of trees, where each tree
  corresponds to a set. In each tree, the vertices in the represented
  tree are represented as \emph{leaf} nodes in the tree, and the root
  of the represented tree is stored as a label in the root of the
  tree. We also tag each node with its height value, where leaves have
  height 0 and other nodes have height one greater than the maximum
  height of their children.

  In order to achieve the desired runtime bounds, we will ensure
  that the trees have height $\bigO{\log_b n}$ by ensuring that the
  tree always has branching factor at least $b$ and is sufficiently
  ``bushy'': every internal node in each tree (except perhaps the root)
  has at least $b$ children that have height one less than their
  parent. The root has only children that have height one less than
  the height of the root. Notice that the invariants are trivially met
  in the initial case where every node is its own tree.

  To perform a \proc{Union} of two nodes $\mathfrak{x}$ and
  $\mathfrak{y}$, we first follow the chain of parent pointers until
  we reach the roots of their trees. Call the associated roots
  $x$ and $y$, and let $h_x$ and $h_y$ be their heights, and $c_x$ and
  $c_y$ be the number of children they have. There
  are two cases:

  \begin{enumerate}
  \item $h_x = h_y$. Then assume \emph{w.l.o.g.\ }that $c_x \le c_y$.
    If $c_x \le b$, then we can simply change the parent pointers of
    each of $x$'s children to point to $y$, and add the children of
    $x$ to the list of children of $y$. Then $y$ is the new root, and
    the interior node $x$ can be discarded. Note that the invariant is
    maintained because $x$'s children have height $h_x - 1 = h_y - 1$.

    If instead $c_x > b$, we cannot change the parent pointers for all
    of $x$'s children because this would exceed our desired $\bigO{b}$
    runtime. Instead we create a new root $z$, and make both $x$ and
    $y$ children of $z$. Note that this maintains the invariant tree
    properties because both $x$ and $y$ have more than $b$ children
    and are the same height. It also requires manipulating only two
    pairs of parent/child pointers and creating one new node.
    
  \item $h_x \neq h_y$. Then assume \emph{w.l.o.g.} that $h_x < h_y$.
    If $h_y = 1$ and $h_x = 0$, then we simply make $x$ a child of
    $y$. Otherwise, $y$ has at least one child of height $h_y - 1$
    which is an internal node (call it $z$) and hence $z$ has at least
    $b$ children; $x$'s children all have height $h_x - 1$. If $x$ has
    no more than $b$ children, then we can simply discard $x$ and make
    these children of $z$; this preserves the invariant since $z$'s
    height is at least that of $x$, and $z$ is an internal node and
    already has at least $b$ children of height $h_y - 2$. It
    requires manipulating at most $b$ pointers.

    If instead $c_x > b$, then we will add $x$ itself to
    $y$'s tree. If $h_x = h_y - 1$, then $x$ can be added as a child
    of $y$ while preserving the invariant. Otherwise, $h_x < h_y -
    1$. Then we can make $x$ a child of $z$, which preserves the
    invariant for the same reason as above. In either case, only one
    pair of parent/child pointers needs to be changed.
  \end{enumerate}

  As we perform these manipulations, we also update the height value
  associated with each node, and ensure that the correct root label is
  placed on the root node; due to space constraints, we omit the
  details.
  
  In each of these cases, at most $\bigO{b}$ pointers are manipulated,
  so performing the operation meets the desired runtime bound. The
  invariants are maintained by a \proc{Union} operation in each case,
  so the height of the tree is kept at most $\bigO{\log_b n}$.
  
  To perform a \proc{Find} on a node $a$, we follow the chain of
  parent pointers from $a$ to the root of its tree, then read the root
  of the represented tree containing $a$ from the label in this
  node. Since the tree is constructed to have maximum height
  $\bigO{\log_b n}$, this gives the desired $\bigO{\log_b n}$
  runtime. 
\end{pset}
\end{document}
