Another modification to the power of two choices technique is to keep
state, using memory from the previous decision in order to make the
next one.  In 2002, Mitzenmacher, Prabhakar, and Shah proposed what
they termed the \emph{$(d, m)-$memory policy}
\cite{mitzenmacher02:_balls_and_bins_with_memor}.  In this process,
$d$ bins are chosen uniformly at random, just as in the standard
power-of-two-choices method.  The difference here is that there are
also $m$ bins already selected and stored in memory.  The least full
of all $m + d$ bins is chosen, and the ball is placed there.  Then,
the $m$ emptiest bins out of all $m + d$ are stored in memory for the
next time a ball has to be assigned to a bin.

For the purpose of comparison, the simple power-of-two-choices method
in which no memory is kept will be referred to as the
\emph{$d$-random} policy.  The asymmetric case in which the leftmost
bin is chosen out of a tied pair will be termed the \emph{$d$-left}
policy.  The $(1,1)$-memory policy compares favorably with both the
2-left and 2-random policies.  While 2-random has a bound of
$\frac{\log \log n}{\log 2} + \bigO{1}$, (1,1)-memory achieves a bound
of $\frac{\log \log n}{2 \log \phi} + \bigO{1}$, where $\phi = \frac{1
  + \sqrt{5}}{2}$.  This means that replacing one of the random
choices with a choice from memory improves the 2-random technique.
For $d > 2$, the improvement is even more marked.  The maximum load
under the $(d, 1)$-memory policy is reduced to $\frac{\log \log
  n}{\log (2d -1)} + \bigO{1}$.  This means that having one bin saved
in memory from the previous round is nearly equivalent to having a
choice between $2d$ random bins.  So, for the price of one bin being
kept as state, we have an improvement equivalent to the one that would
result from doubling the number of choices.  Intuitively, this is
because the one bin saved in memory is the best of $d$ randomly chosen
bins, so choosing between it and $d$ others is like choosing the best
of $2d $total.  The fact that the improvement is similar to
multiplying $d$ by a constant, rather than adding a constant, can
easily justify the extra resources that would have to be devoted to
save a small amount of state.

The analysis performed by the authors deals almost exclusively with
cases where $m = 1$, as the scheme is feasible but the analysis
becomes much more complex for $m > 1$.  The $(d, 1)$ case has
significant enough results on its own, though, that the more complex
analysis is unnecessary. We will prove the bound that results from
using the $(1,1)$-memory policy to assign balls.

\begin{theorem}
  If $n$ balls are assigned to $n$ bins using the $(1,1)$-memory
  policy, the maximum load of any bin is $\frac{\log \log
  n}{\log (2d -1)} + \bigO{1}$ with high probability.
\end{theorem}

\begin{proof}
  We begin by defining some notation. Time $t$ will refer to the time
  before the $t$th ball is dropped in a bin.  The fraction of bins with
  a load of at least $i$ at time $t$ will be denoted by the term $s_i(t)$,
  and $p_i(t)$ will be the probability that at time $t$ the bin in memory
  has a load of at least $i$.  Given these definitions, we can determine
  the following relation:
  \begin{equation}
    \label{eq:memory-markov}
    \Exp{s_i(t+1) - s_i(t) \;\st\; s(t)} = \frac{s_{i-1}(t) \cdot p_{i-1}(t) -
    s_i(t) \cdot p_i(t)}{n}
  \end{equation}
  This is because $s_i$ increases by $\nicefrac1n$ whenever the number
  of bins with at least $i$ balls increases by 1, and $s_i$ will only
  increase if some bin that had $i-1$ balls before gains an $i$th
  ball.  This will only happen in the case where both the randomly
  selected bin and the one in memory had $i-1$ balls but not $i$ in
  them.
  
  To make analysis simpler, this equation will be converted into a
  differential equation using the fluid-limit method of
  Section~\ref{sec:power-of-two-choices:fluid-limit}.  First, $t$ will
  be scaled so that time runs from 0 to 1, and $t$ changes by
  increments of $\Delta t = \nicefrac1n$ for the $n$ items that need
  to be assigned.  This gives us
  \begin{equation}
    \label{eq:memory-markov-scaled}
    \frac{\Exp{s_i(t + \Delta t) - s_i(t) \;\st\; s(t)}}{t} =
    s_{i-1}(t) \cdot p_{i-1}(t) -  s_i(t) \cdot p_i(t)
  \end{equation}

  Taking the limit as $\Delta t \to 0$, this then becomes the
  following differential equation:
  \begin{equation}
    \label{eq:memory-diffeq}
    \der{s_i}{t} = s_{i-1} \cdot p_{i-1} - s_i \cdot  p_i
  \end{equation}
  
  We also need an equation to govern the $p_i(t)$ term.  There are
  three cases in which the next bin placed in memory will have a load
  of size $i$ or more.  First, it could be that the current bin in
  memory has a load of $i-1$, but all of the randomly chosen bins have
  loads of $i$ or more.  Second, the bin in memory could have a load
  of $i$ or more, but one of the randomly selected ones has a load of
  $i-1$.  Finally, if all of the randomly chosen bins and the one in
  memory have loads of at least $i$ items already, then the one that
  gets placed in memory will also.  In order to get an equation for
  $p_i(t)$ we need to sum the probabilities of each of these three
  scenarios.  This will give us the equation for $p_i(t)$ shown below:
  \begin{equation}
    \label{eq:memory-pi}
    p_i(t+1) = \left(p_{i-1}(t) - p_i(t)\right) s_i(t) + p_i(t)
    \left(s_{i-1}(t) - s_i(t)\right) + p_i(t) \cdot s_i(t)
  \end{equation}
  
  Using large deviations theory, the authors demonstrate that the
  Markov chain for the value of $p_i(t)$ changes state at a much
  faster rate than the one for the $s_i(t)$ values.  Therefore, in the
  equation for $p_i$ we can instead find the equilibrium values for
  $p_i$ given that the $s_i$ values are fixed.  This gives us
  \begin{equation}
    \label{eq:memory-pi-with-fixed-si}
    p_i = p_{i-1} \cdot s_i + p_i \left(s_{i-1} - s_i\right),
  \end{equation}
  which can also be stated as
  \begin{equation}
    \label{eq:memory-pi-with-fixed-si-2}
    p_i = \frac{p_{i-1} \cdot s_i}{1 - s_{i-1} + s_i}.
  \end{equation}
  This $p_i$ equation can then be used to substitute for $p_i$ in
  \eqref{eq:memory-diffeq}. Solving the resulting system of
  differential equations shows that maximum load of any bin is
  $\frac{\log \log n}{\log (2d -1)}$, and applying Kurtz's theorem to
  return to the discrete, probabilistic case gives us the same result
  with high probability with only a $\bigO{1}$ additive factor.
\end{proof}

\begin{remark}
  To bound the $(d, 1)$-memory case, we can use following equations instead:  
  \begin{align}
    \der{s_i}{t} &= s_{i-1}^d \cdot p_{i-1} - s_i^d p_i \\
    p_i = \frac{p_{i-1} \cdot s_i^d}{1 - d \left( s_{i-1} - s_i \right)
      s_{i-1}^{d-1}}
  \end{align}
  
  Solving these equations in the same manner as above will give a
  high-probability maximum load for the system of \[\frac{\log \log
    n}{\log f(d)} + \bigO{1},\] where $f(d) = \frac{1}{2}\left(2d +
    1+\sqrt{4d^2 + 1}\right)$, which falls bnetween $2d$ and $2d+1$.
  Therefore, this bound is slightly better than the bound for a
  $2d$-random system.
\end{remark}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "paper"
%%% End: