\documentclass{article}
\input{6046preamble}


\begin{document}
\pset{6-3}{Dan Ports}{No collab}{Recitation \#8}

\begin{ppart}{Part a}
\paragraph{Algorithm}
Let $A$ and $B$ be given as two sets containing elements in the range $\{0\ldots m\}$. We will compute the set $C = \{x+y : x\in A y\in B\}$. To do this, we first create two polynomials of degree $m$, represented as arrays of coefficients $X[0 \ldots m]$ and $Y[0 \ldots m]$. We initialize the array $X$ to contain zeros, and iterate through the set $A$; for each element we let $x$ be the element's value then set $X[x] = 1$. When we are complete, $X[x] = 1$ if an element with value $x$ is in $A$, and $0$ otherwise. We initialize $Y$ in the same way, using the set $B$. We next use the FFT-based fast polynomial multiplication algorithm to multiply these two polynomials. The output is a polynomial represented as an array $Z[0 \ldots 2m]$ of coefficients. We iterate through this array; if $Z[x] \neq 0$, we know $x \in C$ and output $x$.
\paragraph{Proof of correctness}
We assume the correctness of the fast polynomial multiplication algorithm. It simply remains to show that the reduction of the sums problem to a polynomial multiplication is valid. From algebra, it is clear that the polynomial $Z = X Y$ has a term of degree $n$ with coefficient $(x_0 y_n + x_1 y_{n-1} + \dots + x_{n-1} y_1 + x_n y_0)$. Since the coefficients of $x$ and $y$ are only zero or one, $Z$ has a non-zero term of degree $n$ if and only if there exist an $i$ and $j$ such that $i + j = n$ and $x_i$ and $y_j$ are non-zero. We generated the polynomials such that $x_i$ and $y_j$ are only non-zero if $A$ contains $i$ and $B$ contains $j$, so our algorithm always gives the correct result.
\paragraph{Runtime analysis}
Initializing the coefficient arrays takes constant time to set them to zero, plus $\Theta(\abs{A} + \abs{B})$ time to iterate through the two arrays $A$ and $B$. We apply the FFT fast polynomial multiplication algorithm to two polynomials of degree $m$, which requires $\Theta(m \lg m)$ time. Finally, we iterate through the array $Z$, which also takes $\Theta(m)$ time. The algorithm thus overall requires $\Theta(m \lg m)$ time.
\end{ppart}

\begin{ppart}{Part b}
\paragraph{Algorithm}
Given an array $A$ of $n$ integers which take on values from $1$ to $m$ ($n \le m$), we show that it is possible to determine whether there are three distinct indices $i,j,k$ such that $A[i] + A[j] + A[k]$ is equal to some target sum $x$ in $\Theta(m \log m)$ time. In order to solve this problem, we will generate and make use of a polynomial defined by $B(x) = x^{A[1]} + x^{A[2]} + \cdots + x^{A[n]}$. As before, we will represent this polynomial as an array $B$ of coefficients that has length $m$. We can generate the array $B$ (in $\Theta(n) = O(m)$ time) by initializing it to zeros and iterating through the array, incrementing $B[A[i]]$ by one for every $i$. We wil also make use of a polynomial $C(x) = B(x^2) = x^{2 A[1]} + x^{2 A[2]} + \cdots + x^{2 A[n]}$. We represent this as an array $C$ of coefficients with length $2m$. We can generate it in the same way, with the same $O(m)$ time bound; we simply increment $C[2 A[i]]$ for every $i$.
\paragraph{}
We will show (below) that, given a target value $t$, there exist three different indices $i,j,k$ such that $A[i] + A[j] + A[k] = t$ if and only if the polynomial $F(x) = B^3(x) - B(x^2)B(x) = B^3(x) - C(x)B(x)$ has a non-zero coefficient of degree $t$. We can generate this polynomial by using the FFT to multiply $B$ by itself, then multiplying the result by $B$ again; this only requires $\Theta(m \log m)$ time. We put the resulting coefficients in an array $D[0 \ldots 3m]$. We also generate $E = C(x)B(x)$ by multiplying $B$ and $C$ (again in $\Theta(m \log m)$ time) and putting the results in an array $E[0\ldots 3m]$. We then generate the array $F$ in $\Theta(n)$ time by iterating from $i=0$ to $i=3m$ and filling it with $F[i] = D[i] - 3 E[i]$. Finally, we check (in constant time) whether $F[t] > 0$; if it is we indicate that a matching triple of indices exists, and if it is not we indicate that no such triple exists.
\paragraph{Proof of correctness}
We assume the correctness of the fast polynomial multiplication algorithm and the results of part a. We will justify the correctness of this algorithm by showing that there exist three different indices $i,j,k$ such that $A[i] + A[j] + A[k] = t$ if and only if the polynomial $F(x) = B^3(x) - B(x^2)B(x) = B^3(x) - C(x)B(x)$ has a non-zero coefficient of degree $t$. We first consider the simpler problem in which the indices $i,j,k$ are not required to be different. In this case, we simply compute $B^3$ by multiplying $B$ by itself twice and check whether $B^3[t] > 0$. The proof of this statement follows trivially from the results of part a.
\paragraph{}
This argument does not apply when we require $i,j,k$ to be distinct, because it would find cases in which $i=j=k$ and $A[i] + A[i] + A[i] = t$, or $i=j\neq k$ and $A[i]+A[i]+A[k]=t$, for example. Subtracting three times the polynomial $E(x) = B(x)C(x)$ handles this constraint. This is because $C(x) = B(x^2)$ represents the numbers that can be achieved by summing two indices that are the same, that is $A[i] + A[i]$ for any $i$. When we multiply by $B(x)$, we obtain a representation for the numbers that can be achieved by summing two identical indices and one other: $A[i] + A[i] + A[j]$ for any $i,j$ including $i=j$. We need to subtract three times this polynomial because there are three different ways to assign the indices (i.e. $(i=j=1, k=2)$ is equivalent to $(i=k=1, j=2)$ and $(j=k=1, i=2)$). The resulting polynomial is $F(x)$, so checking $F[t]$ gives the correct result. (The correctness of the procedures used to generate the polynomials $B,D,E,F$ is easy to see.)
\paragraph{Runtime analysis}
The arrays $B$ and $C$ are each generated by iterating through the input array $A$, which requires time $\Theta(n) = O(m)$. Generating $D$ and $E$ requires polynomial multiplication in time $\Theta(m \log m)$ via the FFT. $F$ is easily generated in $\Theta(n)$ time. Checking $F[t]$ requires constant time. Thus, the algorithm requires $\Theta(m \log m)$ time overall.
\end{ppart}

\end{document}

