\documentclass{article}
\input{6046preamble}


\begin{document}
\pset{5-2}{Dan Ports}{No collab}{Recitation \#8}

\begin{ppart}{Part a}
\paragraph{}
We first show that if a graph has an Euler tour, then in-degree($v$)=out-degree($v$) for every vertex $v\in V$. A Euler tour, by definition, passes through all edges, and thus through all vertices since we are given that the graph is connected. Choosing an arbitrary starting vertex $x$, the tour starts at $x$, passes through every other vertex (and possibly also $x$ again) at least once, and returns to $x$. Each time the tour passes through one vertex, it must enter the vertex and then exit it; there is also one edge that leaves $x$ at the beginning and one that arrives at $x$ at the end of the tour. So the Euler tour contains one edge entering each vertex for each edge leaving the vertex; since every edge in the graph must be part of the Euler tour, we must have in-degree($v$)=out-degree($v$) for every $v\in V$.
\paragraph{}
To show that any graph with in-degree(v)=out-degree(v) has an Euler tour, we use the algorithm in part b. This algorithm finds a Euler tour, and it requires only that in-degree(v)=out-degree(v), so it suffices as a proof.
\end{ppart}

\begin{ppart}{Part b}
\paragraph{Algorithm}
Assume we are given a graph as a list of vertices $V$ and a list of directed edges $E$, and it satisfies in-degree($v$)=out-degree($v$) $\forall v \in V$.
 \paragraph{}
Next we use a modified depth-first search to generate a list of cycles $\{C_1, C_2, ...\}$ in this graph. We modify the given DFS algorithm (CLRS, p. 541) so that it can mark edges as either visited or unvisited. We initialize each edge (in $\Theta(\abs{E})$ time) to be unvisited. We also ignore the ``coloring'' process done by the given DFS algorithm.
We modify the DFS-Visit routine so that, when we DFS-Visit a node $u$, we select the first unmarked edge (call it $e$) from $u$ to some other vertex (call it $v$), mark that edge $e$, and add $e$ to our current cycle $C_i$. We then DFS-Visit the vertex $v$, and continue this process until we reach a vertex that has no remaining unmarked edges. We will show (below) that this only happens when we complete a cycle.
\paragraph{}
To use this modified DFS-Visit to find cycles, we start by initializing $i \leftarrow 1$ and the list of found cycles to be empty. We begin by choosing some arbitrary unmarked edge $e$ and apply DFS-Visit to its source vertex. When this terminates, $C_1$ will contain a cycle beginning with edge $e$, and every edge in the cycle will be marked. If there are no remaining unmarked edges, then the cycle we just found contains every edge, and it is an Euler tour, so we are done. Otherwise we increment $i \leftarrow i+1$, choose another unmarked edge $e$, and DFS-Visit the source vertex of $e$. We repeat this process until there are no remaining unmarked edges. We then have a set of edge-disjoint cycles $\{C_1,C_2,...C_n\}$ that contain every edge of the graph.
\paragraph{}
We show that we can combine this set of cycles into an Euler tour. Because the graph is connected, we can find two cycles $C_i$ and $C_j$ that share a vertex (call it $v$). If this were not true, then some vertex would not be reachable from some other vertex, contradicting the connectedness of the graph. Choosing arbitrary starting vertices $i \in C_i$ and $j \in C_j$, we can write the two cycles as paths $i \rightarrow v \rightarrow i$ and $j \rightarrow v \rightarrow j$. We can then replace $C_i$ and $C_j$ with a new cycle that follows the path $i \rightarrow v \rightarrow j \rightarrow v \rightarrow i$. This new cycle contains every edge in $C_i \cup C_j$. We can repeat this process, combining cycles until we have only one remaining cycle that contains all edges in the graph. This cycle is an Euler tour, so we return it.
\paragraph{Proof of correctness}
We must show that, when we call DFS-Visit on some vertex $u$, it recursively follows a chain of edges (marking them and adding them to the current cycle), and can only terminate when it has returned to the vertex $u$. We show this first for the initial case, when all edges are unmarked. We know that in-degree(v) = out-degree(v) $\forall v \in V$, and so unmarked-in-degree(v) = unmarked-out-degree(v) (because all edges are unmarked). When we start by DFS-Visiting a vertex $u$, we choose an edge $e$ from $u$ to $v$ (we know one exists because we chose $u$ accordingly), add it to the cycle, and mark it. This decreases unmarked-out-degree($u$) and unmarked-in-degree($v$) by 1. For every other vertex $v$ we encounter along this path, we decrement unmarked-in-degree($v$) when we mark the edge into $v$ and decrement unmarked-out-degree($v$) when we mark the edge out of $v$. Thus we maintain the invariant that, for all $v \neq u$ unmarked-in-degree($v$) = unmarked-out-degree($v$). Assuming $v \neq u$, we know there will be an unmarked edge out of $v$ because unmarked-in-degree($v$) has to be non-zero for there to be an unmarked edge into $v$, and unmarked-in-degree($v$) = unmarked-out-degree($v$). The only time this is not true is when $v = u$, in which case unmarked-out-degree($u$) = unmarked-in-degree($u$)-1 because we already marked one edge out of $u$ when we started the traversal. Since the DFS-Visit recursion only terminates when there are no remaining unmarked edges, the recursion can only terminate when it returns to the starting vertex $u$, thus finding a cycle. We have also shown that the unmarked-in-degree($v$) = unmarked-out-degree($v$) $\forall v \in V$ invariant is maintained, so the same proof applies for later calls to DFS-Visit.
\paragraph{}
We have just shown that repeatedly calling DFS-Visit as specified in the algorithm produces cycles. Each application of DFS-Visit will identify and mark a cycle, so it will mark some elements on each iteration. Thus the number of unmarked edges is strictly decreasing, so it will eventually reach zero and every edge will be marked. Since we continue to call DFS-Visit until there are no remaining unmarked edges, when we finish the set $\{C_1, C_2, ... C_n\}$ contains all edges in $V$. Also, since we can never add a marked edge to a new cycle, each of the cycles $C_i$ is edge-disjoint. We showed a process for combining these cycles into one large cycle two paragraphs above, and argued its correctness. The resulting path is a cycle that contains every edge, so it is an Euler tour, which is the correct result. 

\paragraph{Runtime analysis}
We will construct several additional data structures to make this algorithm efficient. First, we construct an adjacency list A of unmarked edges, which is an array of doubly-linked lists, one for each vertex (noting that each edge is initially unmarked). For each vertex $v\in V$, $A_v$ is a list of pointers to all unmarked edges in $E$ whose source endpoint is $v$. Generating these adjacency lists takes time $\Theta(\abs{E})$. Given a vertex $v$, this allows us to find some unmarked edge that starts at $v$ in constant time, and allows us to unmark an edge in constant time (assuming we have a pointer to the edge's entry in the adjacency list, which we accomplish by including with each edge a pointer to its location in the adjacency list).
\paragraph{}
We will represent a cycle as a data structure containing a doubly-linked list of pointers to edges and a size-$\abs{V}$ array that allows us to quickly check whether a vertex is in the cycle. Specifically, the $v$th element of the array is null if $v$ is not in the cycle, and a pointer to an edge in the cycle beginning at $v$ if it is. It takes constant time to initialize this cycle, since we assume that we can initialize an array to zero in constant time, and initializing a linked list is trivial. This structure allows us to add an edge to the cycle in constant time (add it to the end of the list, and set the corresponding pointer in the array). We can also check (using the array) whether a given vertex is included in the cycle, in constant time.
\paragraph{}
These data structures allow us to perform the DFS-Visit cycle-identification procedure in time $\Theta(\abs{E})$. Each call to DFS-Visit requires choosing some arbitrary unmarked edge, marking that edge, and adding it to the current cycle, all of which we have shown above can be done in $\Theta(1)$ time. Each call to DFS-Visit marks an unmarked edge, and there are only $\abs{E}$ unmarked edges to begin with, so DFS-Visit is called a total of $\Theta(\abs{E})$ times at constant runtime per call, so the overall runtime of the cycle-identification procedure is $\Theta(\abs{E})$.
\paragraph{}
These data structures also allow us to merge the cycles in $\Theta(\abs{E})$ time. To merge a cycle $C_j$ into a cycle $C_i$, we iterate through each edge $e$ in $C_j$'s list, checking at each point whether the endpoint of $e$ is a vertex in $C_i$. If it is, we use $C_i$'s array to find the edge $f$ that begins at the endpoint of $e$. We then continue to iterate through $C_j$, splicing each edge in before $f$ and setting the appropriate pointer in $C_i$'s array. Once we have iterated through $C_j$, we have updated the cycle $C_i$ so that it contains every edge that was formerly in $C_i \cup C_j$, and we can discard $C_j$. Each iteration of this merge procedure takes constant time, so the total runtime of merging $C_j$ into $C_i$ is $\Theta(\abs{C_j})$. Thus, the total runtime required to merge all cycles ${C_1, C_2, ... C_n}$ is $\Theta(\sum_{i=1}^n{\abs{C_i}}) = \Theta(\abs{E})$
\paragraph{}
Each of the two major parts of the algorithm has runtime $\Theta(\abs{E})$, so the algorithm's overall runtime is $\Theta(\abs{E})$.
\end{ppart}
\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 
