\documentclass{report}
\input{6111-preamble}
\usepackage{graphicx,epsfig,verbatim,lscape,listings}


\begin{document}
\pagenumbering{roman}
\title{\Huge{\textbf{Frequency Domain Analysis and Processing of Audio Signals}}}
%\subtitle{Using Finite Impulse Response Convolution}
\author{\LARGE{\vspace{,5em}\textbf{Dan R. K. Ports}}\\\vspace{.5em} \LARGE{\textbf{Albert C. Chiou}}\\
\LARGE{\textbf{Andrew L. Clough}}}
\date{\Large{December 12, 2003}}
\maketitle

\begin{abstract}
\normalsize{ The frequency domain is a natural representation for
audio signals. Our project explores the frequency domain
representation of an audio signal by allowing the user to view the
frequency spectrum of a supplied analog input signal. It also performs
processing of the audio signal in the frequency domain, making
possible a variety of audio effects: frequency shifting, harmonic
generation, peak sharpening, etc. Hence, the project implements --- in
the FFT / Conversion module --- analog/digital and digital/analog
sampling and transforms via the Fast Fourier Transform and Inverse
Fast Fourier Transform. The Processing module implements the frequency
domain processing. Finally, the Video module displays frequency
spectra of the input and output signals on a video monitor using the
MC6847 video processor chip.

Issues of testing and debugging are discussed. Though the project was
not fully implemented due to some numeric error and integration
issues, all components were implemented and tested in simulation, and
most were also realized in hardware.

}
\end{abstract}
\newpage

\tableofcontents
\newpage
\listoffigures
\listoftables
\lstlistoflistings
\newpage
\pagenumbering{arabic}


\chapter{Introduction}
\section{Overview}

Audio signals are typically described in the time domain: by measuring
the amplitude of the signal at regular time intervals. This is the
representation seen when viewing a signal on the oscilloscope, for
example. But the frequency domain is another natural representation
for an audio signal. Instead, the signal is described in terms of its
frequency components, built up from exponential functions. This
representation proves useful because we frequently think of audio
signals in terms that relate to their frequency components: describing
the pitch of a musical note, or the quality of harmonics, for
example. 

This project is intended to explore audio signals in the frequency
domain. An incoming audio signal is first sampled and converted to the
frequency domain, using the Fast Fourier Transform. Its spectrum is
displayed on a video monitor. The signal is processed in the frequency
domain, using a user-selected filter. A variety of interesting effects
become possible by manipulating the frequency domain
representation. The spectrum of the output signal is also displayed,
and the output signal is converted back to the time domain by the
Inverse Fast Fourier Transform and output as an analog signal.

\section{Operation}
Operation of this device proceeds nearly automatically. The reset
button on the FFT module must be held down first while the reset
button on the Processing module is pressed. This synchronizes the
starting states to begin parallel bus communication. 

The FFT module
will immediately begin sampling the input signal and computing the
Fast Fourier Transform, as well as transmitting data over the parallel
bus, and receiving processed results from the Processing module. The
output signal will be automatically transformed via the inverse FFT
and output via the digital-to-analog converter.

This device accepts a differential voltage input (V$_\mathrm{in}$)
with amplitude at most $\pm 1.28$V and processes it. The output is
generated with a $1.28$V offset, i.e. a range of 0V -- 2.56V
referenced to ground. Sampling is performed at approximately 22.143
kHz, so input signals can contain frequency components up to
approximately 11 kHz without causing aliasing. Additional analog
stages were originally planned to allow for different input and output
voltage ranges, but this was not implemented due to time constraints.

The Processing module provides a variety of audio effects that can be
applied to the input in the frequency domain. It can perform frequency
shifting, generate harmonics, generate half-harmonics, and sharpen
frequency peaks. The active effect can be selected by means of a group
of switches on the Processing kit; initially, the parameters to each
effect were to be configurable via a knob, but this functionality
became impossible to implement due to a failure of the lab kit's NuBus
interface.

The Video module automatically displays the frequency spectra of the
input and output signals on a video monitor.


\section{Design Overview}

The design of this system was divided into three components, each
designed and implemented by one person, using one lab kit. 

The Fast Fourier Transform / Conversion module, implemented by Dan
Ports, performs the necessary conversions between the analog and
digital and time and frequency domains. It uses an analog-to-digital
converter and digital-to-analog converter to sample the input signal
and generate the output signal; samples are stored in RAM buffers. It
can perform the Fast Fourier Transform and inverse Fast Fourier
Transform to convert the input and output signals from the time and
frequency domains, respectively. Finally, it also exchanges data with
the other two kits, transmitting frequency coefficients for the input
signal to the Processing and Video modules, and receiving frequency
coefficients for the output signal from the Processing module.

The alterations to the time domain data are accomplished in the
Processor module.  Here the incoming data is saved into a RAM, then read
out in a slightly different order, while sometimes being multiplied by
a constant factor.  In this way the data can be frequency shifted up
or down, sharpened, or have harmonics added to it.

The Video module displays the frequency spectra of the input and
output signals on a video monitor. This is accomplished by reading
data from the three-way parallel bus (described below) into a RAM,
then using a finite state machine to generate a video
representation in RAM. This is translated to a video signal by a
MC6847 chip and an output stage.

\section{Interface Protocol}
\label{interfaceprotocol}
The three modules of this kit must communicate frequency data between
each other. The FFT module must send frequency coefficients for
the input signal to the Processing and Video modules, and the
Processing module must send the frequency coefficients for the output
signal to the FFT and Video modules. This communication is
implemented by using a three-way 8-bit parallel bus.

The parallel bus requires eight data bits, plus five control
signals. These signals are connected between all three lab kits. The
signals are listed in Table~\ref{bustable}. All bus lines are
interfaced to the FPGAs using Schmitt-triggered line drivers and
receivers (e.g. 74LS244 or 74LS245).

\begin{table}[bthp]
\begin{center}
\begin{tabular}{|c|c|c|}
\hline
Signal & Source & Description \\
\hline
DATA & FFT or Processing & 8-bit wide data bus \\
FFTSENDING & FFT & High when FFT is transmitting \\
PROCSENDING & Processing & High when Processing module is transmitting \\
FFTRDYACK & FFT & Data ready (if FFT sending) or acknowledge (if not) \\
PROCRDYACK & Processing & Data ready (if Processing sending) or
acknowledge (if not) \\
ACK2 & Video & Acknowledge from video module\\
\hline
\end{tabular}
\end{center}
\caption{Parallel Bus Signals}
\label{bustable}
\end{table}

Each transmission consists of 2048 bytes: 1024 frequency coefficients,
with the real byte transmitted first, followed by the imaginary
byte. A transmission cycle begins when the FFT module sets FFTSENDING
high. It then places the first byte on the data bus, and sets
FFTRDYACK high. The two other modules read the data bus when they
sense FFTRDYACK is high. Once they are finished, they set their
respective acknowledge signals (PROCRDYACK and ACK2) high. The
FFT module will set FFTRDYACK low again when both acknowledge lines
are high, and wait for both acknowledge lines to go low again. It then
places the next byte on the data bus, and repeats the cycle. After the
final byte has been transmitted and acknowledged, it sets FFTSENDING
low again. 

This signals the Processing module to set PROCSENDING high, taking
control of the data bus. It then repeats essentially the same cycle,
except that PROCRDYACK becomes its ready line and FFTRDYACK becomes
the FFT's acknowledge line: placing the first byte on the data bus,
setting PROCRDYACK high, waiting for FFTRDYACK and ACK2 to go high,
setting PROCRDYACK low, waiting for the two acknowledge signals to go
low, and repeating. When it completes, it sets PROCSENDING again and
waits for the FFT to start another cycle.


\chapter{Conversion / FFT Module (D.~Ports)}
\section{Overview}
The Conversion / FFT Module is used to convert between the analog,
time domain input and output signals that interface to the external
world, and the digital, frequency domain representations used by the
Processing and Video Display modules. Hence, it must perform several
functions. It must convert the input signal from analog to digital,
and the output signal from digital to analog. It must transform the
input and output signals between the time and frequency domains; this
is implemented using the Fast Fourier Transform (FFT). Finally, it
must exchange frequency coefficent data with the other components of
this project via a parallel interface. These are implemented using
three subsystems which operate concurrently: an Analog/Digital (AD)
subsystem, a FFT subsystem (which also implements the inverse FFT),
and a parallel interface subsystem. In addition, a control unit and a
memory manager are used to allow the three subsystems to operate at
the same time without any concurrent execution issues.

The design of the module is represented by the FPGA Block Diagram and
Hardware Wiring Diagram in Figures~\ref{blockdiagram} and
\ref{wiringdiagram}.

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8in]{blockdiagram}
\end{center}
\caption{FPGA Block Diagram}
\label{blockdiagram}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8in]{wiringdiagram}
\end{center}
\caption{Hardware Wiring Diagram}
\label{wiringdiagram}
\end{figure}
\end{landscape}


\section{Control / Memory}
The three subsystems of this module operate concurrently with a
reasonably complex memory usage scheme. They therefore require complex
control logic and memory management to ensure that they are operating
at the correct times and using the correct buffers.
\subsection{Clock}
All synchronous components in this system operate on the rising edge
of a single, global clock signal. Because a 10 MHz clock signal was
too fast to satisfy the access time requirements of the SRAM chip, a 5
 MHz clock was chosen instead. This clock signal is generated by
dividing the standard 10 MHz 25\% duty cycle NuBus clock integral to
the lab kit by half using a 74LS163 counter.
\subsection{RAM}
Eight logical buffers, each containing 2048 locations of eight bits
each are used. These eight buffers are arranged as four ``rotating
buffers''; i.e. pairs of buffers such that the previous computation's
results are read from one buffer while the current computation's
results are computed and stored into the other buffer. After each
cycle of operation (one FFT/inverse FFT computation), the input and
output buffers are swapped.

One rotating buffer is used for storing the samples read from the
analog-to-digital converter; it also serves as the input to the FFT. A
second rotating buffer is used to hold the source data for the
digital-to-analog converter and results of the inverse FFT. Another
buffer is used to hold the results of the FFT of the input signal; it
serves as the source for the parallel bus's output to the other
kits. A final rotating buffer is used to hold the data received via
the parallel bus and acts as the source for the inverse FFT.

The FPGAs used in this implementation were not capable of synthesizing
a RAM unit of appropriate size (8 buffers $\times$ 2048 locations
$\times$ 8 bits$=$16384 bytes) to hold the eight buffers. Hence, an
external RAM chip was used. A Sharp LH52256C-70LL 32K $\times$ 8 SRAM
chip was selected. With twice as much capacity available as necessary,
one address line was tied to ground. The remaining address lines were
controlled by the memory manager implemented in the FPGA. The output
enable signal was tied low, and pulses on the write enable signal were
used to indicate when data was available for writing. The write enable
signal was gated with the clock to avoid spurious writes due to
glitches; however, this required the clock period to be increased so
that one half period would be greater than the 70ns access time of the
RAM.

\subsection{Memory Manager}
A memory manager is used to control access to memory. The memory
manager has one external port that connects to the RAM module, and
provides three logical ports that can be accessed by the A/D, FFT, and
interface controllers. These three ports are multiplexed in such a way
that each controller can send a request simultaneously, and they will
be serviced sequentially as the RAM becomes available.

Each logical port consists of a 14-bit address bus, 8-bit input and
output data buses, a positive-true write enable signal, a request
signal, and a done signal. The request signal is used to indicate to
the controller that a request is available and should be executed; the
done signal is high for one clock cycle after the request has been
serviced. The request signal must return low on the next clock cycle
after the done signal is asserted, or the request will be repeated.

Internally, the memory manager maintains two state variables. One
maintains the current port information, corresponding to the number of
the port whose request is currently being serviced, or zero if no
requests are in progress. This is used to multiplex the RAM address,
data, and write enable lines. The write enable line is also gated with
the clock, so that glitches can be avoided.


The port to be accessed on the next clock cycle is determined
combinationally by inspecting the request lines from each port. A
port's request line will be ignored if its done line is currently
high. The other state variable used by the memory manager is used to
help determine which port should be accessed first. Each cycle, a
different port is given priority; this ensures that processing does
not stall while one FSM continually requests memory access. An
internal state variable is incremented each clock cycle, and this is
used to determine the precedence of port request lines.

\begin{figure}[bpht]
\begin{center}
\includegraphics[width=6in]{memmgrtiming}
\end{center}
\caption{Memory Manager Timing Diagram}
\label{memmgrtiming}
\end{figure}

\subsection{Control Unit}
The control unit handles the task of ensuring that all three
subcomponents are operating simultaneously. It also controls their
buffer selection signals. It alternates between the input and output
buffers in the rotating buffers described above, and also selects the
appropriate signals for the FFT and inverse FFT modes.

The control unit is implemented as a synchronous finite state
machine. Its state transitions are shown in the state diagram in
Figure~\ref{controlstatediagram}.

\begin{figure}[pbth]
\begin{center}
\includegraphics[height=8.5in]{controlstate}
\end{center}
\caption{Control FSM State Diagram}
\label{controlstatediagram}
\end{figure}

The control unit begins in the s\_Start state; it also returns to this
state on a reset signal. In this state, it asserts the start signal to
each of the three FSMs. It waits until each of the three busy signals
from the FSMs become high, then moves to the s\_Wait1 state, in which
it waits for the FFT to terminate. When the FFTBusy signal goes low
again, it moves to the s\_StartIFFT state. In this state, it selects the
appropriate input and output buffers to perform the inverse FFT, sets
the FFTInverse signal high, and asserts the StartFFT signal again. As
soon as the FFTBusy signal goes high, it then moves to the s\_Wait2
stage, in which it waits for all three busy signals to go
low. Finally, it moves to the s\_Rotate state, in which it alternates
between buffer states as described in Table~\ref{bufferstatetable}. It
then returns to the s\_Start state and starts another cycle.

\begin{table}[bthp]
\begin{center}
\begin{tabular}{|c|c|c|}   
\hline
Buffer Variable & State 0 & State 1 \\
\hline
ADC Output & 000 & 111 \\
DAC Output & 001 & 010 \\
FFT Input & 111 & 000 \\
FFT Output & 110 & 101 \\
IFFT Input & 100 & 011 \\
IFFT Output & 010 & 001 \\
Interface Send & 101 & 110 \\  
Interface Receive & 011 & 100 \\
\hline
\end{tabular}
\caption{Buffer Variable Rotation States}
\label{bufferstatetable}
\end{center}
\end{table}

\begin{figure}[bpht]
\begin{center}
\includegraphics[width=6in]{controltiming}
\end{center}
\caption{Control FSM Timing Diagram}
\label{controlfsmtiming}
\end{figure}

\section{Analog / Digital Subsystem}
The Analog/Digital (A/D) subsystem converts analog input values to
digital representations for processing, and converts the output values
back to an analog signal. All A/D processing is done using blocks of
1024 samples. A finite state machine controller operates the external
analog to digital and digital to analog converters.

\subsection{Analog / Digital Control FSM}
The A/D Control FSM controls the analog-to-digital (ADC) and
digital-to-analog (DAC) converters. It is controlled by the major
control FSM, which provides a Start signal to indicate when a
conversion cycle should be performed, and selects the ADC Output and
DAC Input buffers. The A/D FSM then performs a series of 1024
conversions, during which the FSM outputs a busy signal. When the
conversion cycle is completed, the FSM sets its busy signal low and
returns to an idle state to await the next start signal.

The state transitions for the FSM are depicted in
Figure~\ref{adfsmstatediagram}. The FSM begins in the s\_Idle state,
in which it waits for a start signal from the Control FSM. When the
start signal is received, it moves to the s\_WaitForTimer state, and
waits for a pulse from the clock divider; this indicates that it is
time to perform one sample. It then loads the DAC output value from
the DAC input buffer (in the s\_RAMRead state), and outputs it to the
DAC (in the s\_DACWrite state). The next two states, s\_ADCEnable and
s\_ADCWait together activate the ADC and wait for it to perform a
conversion. Once conversion is completed, the s\_ADCRead state reads
the output of the ADC into an internal buffer; it is then written to
the ADC output RAM buffer in the s\_RAMWrite1 state.

The address counter is then incremented in the s\_AddrInc1 state. As 
the analog input and output values can take only real values (not
complex), the imaginary component of the buffer clearly must be
ignored. Thus, the next memory location in the ADC buffer must be
filled with a zero, and the next memory location in the DAC buffer
must be ignored. This is performed by writing a zero in the
s\_RAMWrite2 state, and incrementing the address counter again in the
s\_AddrInc2 state.  Finally, the FSM returns to the s\_WaitForTimer
state to await the next sample timer pulse, or instead to the s\_Idle
state after 1024 input samples have been recorded and the address
counter returns to zero.

\begin{figure}[pbht]
\begin{center}
\includegraphics[height=8in]{adfsmstate}
\end{center}
\caption{A/D Control FSM State Diagram}
\label{adfsmstatediagram}
\end{figure}

\begin{figure}[bpht]
\begin{center}
\includegraphics[width=6in]{adfsmtiming}
\end{center}
\caption{A/D Control FSM Timing Diagram}
\label{adfsmtimingdiagram}
\end{figure}


\subsection{Analog-to-Digital Converter}
An analog-to-digital converter
(ADC) is required to convert the incoming analog signal to a digital
representation. The Analog Devices AD670 chip is used to fulfill this
requirement. The chip accepts a differential voltage input, and is
configured for a bipolar $\pm$1.28V range. The two's complement output
format is selected, using the wiring configuration shown in
Figure~\ref{wiringdiagram}.

The A/D control FSM handles the task of starting ADC conversions when
appropriate and monitors the status line.


\subsection{Digital-to-Analog Converter}
The digital-to-analog converter (DAC) converts the computed output value to an analog voltage. This is implemented with an Analog Devices AD558 chip. The V$_\mathrm{out}$, V$_\mathrm{out}$ Sense, and V$_\mathrm{out}$ Select outputs are tied together to select the +2.56V range. The DAC is powered by the lab kit's +5 Analog supply, and referenced to ground. The output range is 0 -- +2.56V, so the zero value has a 1.28V offset relative to ground.

The DAC and ADC share a data bus. Because both chips are controlled by the Controller FSM, only one is active at any given time, and bus contention can never occur.

\subsection{Divider}
The purpose of the divider is to generate a 22.143 kHz sample timer
signal from the 5 MHz clock signal. The output of the divider is a
signal that is high for one pulse of the system clock, each time a
sample needs to be performed. This is done by maintaining an 8-bit
counter that is incremented each clock pulse. When the counter reaches
225, the output is set high. On the next clock pulse (when the counter
is 226), the counter is reset to zero and the output is set low again.

The divider also has a reset signal that allows the counter to be
reset to zero. This is connected to the RESET input.

\begin{figure}[bthp]
\begin{center}
\includegraphics[scale=.5]{divider-timing}
\end{center}
\caption{Divider timing diagram (ratio changed from 226:1 to 10:1)}
\label{dividertiming}
\end{figure}

\section{Fast Fourier Transform Subsystem}
\subsection{Theory}
\subsubsection{Discrete Fourier Transform}
The discrete Fourier transform (DFT) converts a discrete-time
representation of a signal into a discrete frequency representation:
the sampled spectrum of the input signal. The DFT of a signal of
length $N$ is defined by the analysis equation
\begin{equation}
\mathrm{DFT}\left\{x[n]\right\}[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1}x[n]W_N^{kn}
\label{dftanalysis}
\end{equation}
and also satisfies the synthesis equation
\begin{equation}
x[n] = \frac{1}{\sqrt{n}}
\sum_{k=0}^{N-1}\mathrm{DFT}\left\{x[n]\right\}[k]  W_N^{-kn}
\label{dftsynthesis}
\end{equation}
in which $W_N^k$ is known as the $k$th twiddle factor and is defined
by
\begin{equation}
W_N^k = e^{-\frac{2\pi j k}{N}}
\label{twiddledef}
\end{equation}

The similarity between the analysis and synthesis equations is
noteworthy. The inverse DFT is simply the same as the forward DFT,
except that the twiddle factor $W_N^{kn}$ is replaced by
$W_N^{-kn}$. Applying the definition of the twiddle factor, it is not
difficult to see that the only change necessary to perform an inverse
DFT is to invert the sign of the imaginary component of the twiddle
factors.

\subsubsection{Fast Fourier Transform}
The Fast Fourier Transform (FFT) is an efficient method for computing
the DFT. The specific form of the FFT used in this implementation
is a radix-2 in-place decimation-in-time FFT, a variant of the
algorithm originally presented by Cooley and Tukey in 1965.

This algorithm relies on the fact that the DFT of $x[n]$ can be
written as
\begin{equation}
X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1}x[n]W_N^{kn} = \frac{1}{\sqrt{N}}
\sum_{i=0}^{\frac{N}{2}-1}x[2i]W_N^{2ik} + \frac{1}{\sqrt{N}} W_N^k
\sum_{i=0}^{\frac{N}{2}-1}x[2i + 1]W_N^{2ik}
\end{equation}
i.e. that the DFT can be written as the sum of the DFTs of the even
and odd halves of the input sequence, with one multiplied by a twiddle
factor. This allows a DFT of size $N$ to be decomposed into two DFTs
of size $\frac{N}{2}$. The recursive application of this procedure is
the radix-2 decimation-in-time FFT.

Of course, since recursion is difficult to implement in hardware, an
iterative approach is used instead. This performs the DFT from the
bottom up, beginning by computing a number of two-point DFTs, and then
using these to compute four-point DFTs, and so on until the full DFT
is computed.

The elementary computation used is a butterfly, so named because of
its graphical representation. It takes two points
from the input or an intermediate stage, and determines their new
values in the next stage:
\begin{align}
X[p]' &= X[p] + W_N^r X[q]\\
X[q]' &= X[p] - W_N^r X[q]
\end{align}
The values of $p$, $q$, and $r$ depend on the arrangement of the
butterflies in each stage. However, it is noteworthy that the
next-stage values of $X[p]$ and $X[q]$ (denoted $X[p]'$ and $X[q]'$
respectively) depend only on the previous stage's values of $X[p]$ and
$X[q]$. Therefore, the FFT computation can be performed in-place, with
only one buffer required.

% butterfly diagram?

The arrangement of butterflies in each pass requires some reasonably
complex logic. There are $\log_2 N$ passes (in this case, $N = 1024$ and
$\log_2 N = 10$). In each pass, $\frac{N}{2}$ butterflies are
performed. However, they are divided into a different number of
blocks: in the $p$th pass, $\frac{N}{2^{p+1}}$ blocks are performed,
each consisting of $2^p$ butterflies.

The iterative implementation is best understood by examining the
pseudocode in Listing~\ref{fftpseudocode}, written in a C-like language that supports
complex multiplications\footnote{Based on an procedure given by
Engineering Productivity Tools at http://www.eptools.com/tn/T0001/PT04.HTM}.

\lstinputlisting[float=bhtp,language=c,label=fftpseudocode,caption=Pseudocode for FFT Algorithm]{fftcode.c}

\subsection{Sources of Error}
\label{ffterror}
While the DFT and FFT are mathematically ideal algorithms that can be
proven to give the correct result for any input sequence, this
implementation is only a numeric approximation to the ideal algorithm,
subject to several sources of imprecision.

Our implementation of the FFT was verified to be working in simulation
and gave somewhat acceptable results for some input sequences, but was
subject to considerable error that made the results unacceptable. For
more information about the results, see Section~\ref{convtesting}.

\subsubsection{Finite-Precision Arithmetic}
As a mathematical ideal, the FFT should be performed using complex
numbers with infinite precision. Of course, this is not possible to
implement in hardware. Instead, we use 8-bit representations of the
real and imaginary components of a complex number, providing one bit
of sign information and 7 bits of magnitude. 

This necessarily introduces some error into the calculations. In
particular, the error was observed to be substantially greater when
moving from the 16-point FFT in testing to the 1024-point FFT. This is
not surprising, because more stages of computation are required, in
which the errors are compounded. Indeed, the Cooley-Tukey FFT
algorithm has an error bound of $O(\epsilon \log N)$, where $\epsilon$
is the numeric precision in use.\footnote{Wikipedia --- Fast Fourier
Transform, http://en.wikipedia.org/wiki/Fast\_Fourier\_transform}

The error is especially sensitive in this application, because an
inverse FFT is performed to the values generated by the FFT. This
allows the errors to be compounded even more than they would be by the
1024-point FFT alone. In addition, any error introduced by the
processing stage will also have an effect.

\subsubsection{Scaling, Overflow, and Underflow}
A second problem is the limited range of the fixed-point values used
in the computation. Each variable can hold only values of magnitude
less than or equal to 128 (0x7F or 0x80). The input values produced by
the ADC can reach these extremes, and there are very few assumptions
that can be made in this application about the properties of the input
signal.

Suppose the input signal is a large constant value distributed evenly
over the time domain. Then the FFT of this signal will be a large
impulse concentrated at one point. This is particularly true if the
FFT analysis and synthesis equations are expressed with the scaling
factors typically used in engineering applications:
\begin{align}
X[k] &= \sum_{n=0}^{N-1}x[n]W_N^{kn}\\
x[n] &= \frac{1}{N}
\sum_{k=0}^{N-1}X[k]  W_N^{-kn}
\end{align}
Here, the FFT can have a greater magnitude than the input
signal. Since the input signal can already reach the extremes of the
memory limits, overflow is clearly a problem. To avoid this, we
express the analysis and synthesis equations using a scaling factor of
$\frac{1}{\sqrt{N}}$, as is commonly done in theory:
\begin{align}
X[k] &= \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1}x[n]W_N^{kn}\\
x[n] &= \frac{1}{\sqrt{N}}
\sum_{k=0}^{N-1}X[k]  W_N^{-kn}
\end{align}
This is implemented by shifting each intermediate result right one bit
during every alternate pass. But this is discarding data, so it
results in an increased imprecision, and underflow problems become
apparent for input signals of small magnitude. We thus find ourselves
in the paradoxical situation of having problems with both overflow and
underflow simultaneously. This problem cannot easily be solved short
of increasing the width of the data values, which was not practical
due to concerns with available FPGA resources.

\subsubsection{Twiddle Factor Error}
The twiddle factor values are constants, limited in precision by the
width of the value. An 8-bit two's complement fixed point
representation was chosen, which provided the twiddle factors with the
same numeric precision as the sample values. However, as the twiddle
factors are used in every butterfly calculation, the numeric
instability caused by error in the twiddle factors is substantial. The
Cooley-Tukey algorithm is very sensitive to errors in the twiddle
factors.\footnote{Wikipedia --- Fast Fourier Transform,
http://en.wikipedia.org/wiki/Fast\_Fourier\_transform} 
As there were 512 twiddle factors used in the computation of the
1024-point FFT, and an 8-bit representation was used for the real and
imaginary components of each value, providing only 256 possible values
for each component, it is clear that some error is inevitable. This
had a negative effect on the results of the FFT.

In addition, one of the most commonly used twiddle factors, $W_N^0 =
1$, could only be represented as 0x7F, a value of
$\frac{127}{128}$. This particular special case could have been
specifically accounted for if time had been available, but it
demonstrates the problem with error in the twiddle factors. It did
cause an observable scaling down of the results. 

\subsection{FFT FSM}
The Fast Fourier Transform is implemented using a finite state machine
that handles all the logic of loading and storing values from RAM,
performing butterfly arithmetic, and ensuring that the butterflies are
performed in the correct order. The FFT FSM is controlled by the
Control FSM. When it receives a start signal, the FFT FSM first copies
the input values from the input buffer to the output buffer (buffers
are selected by the Control FSM). It then perform a series of ten
passes of blocks of butterflies. When these complete, the DFT of the
input signal will be stored in the output buffer.

The state transitions for the FFT FSM are shown in the state diagram
in Figure~\ref{fftstatediagram}.

The FSM begins in the s\_Idle state, in which it waits for the start
signal. When this is received, it begins to copy from the input buffer
to the output buffer in bit-reversed order. All address bits are
reversed, except for the first three, which specify the buffer, and
the last one, which specifies the real or imaginary component. This
process is achieved by the s\_CopySel and s\_CopyWr states. After
all 2048 values are copied, the computation of the FFT begins.

The execution of the FFT is performed using ten passes, each made up
of 1 to 512 blocks of 512 to 1 butterflies, as in the pseudocode in
Listing~\ref{fftpseudocode}. Hence, the s\_PassStart, s\_PassEnd,
s\_BlockStart, s\_BlockEnd, and s\_BflyEnd states determine when to
start and end passes and blocks, and update the necessary state
variables upon each transition --- see the state diagram in
Figure~\ref{fftstatediagram} for details.

The s\_BflyLoad and s\_BflyWrite states perform the actual butterfly
computation. In the s\_BflyLoad state, the real component of
outbuf[BaseT+k], the imaginary component of outbuf[BaseT+k], the real
component of outbuf[BaseB+k], and the imaginary component of
outbuf[BaseB+k] are loaded into buffers tre, tim, bre, and bim
respectively. These are then combined combinationally with the twiddle
factors via the complex multiplier, adders, and subtractors, to
generate the output values. In the s\_BflyWrite state, the output
values are written to the same locations. The FSM then moves to the
s\_BflyEnd state, in which it increments the k counter, and determines
whether to perform another butterfly in the same block, or move to the
next block. 

\begin{figure}[pbth]
\begin{center}
\includegraphics[height=8in]{fftstate}
\end{center}
\caption{FFT FSM State Diagram}
\label{fftstatediagram}
\end{figure}

Timing diagrams for the FFT FSM are provided in
Figures~\ref{fft-timing-1}, \ref{fft-timing-2}, \ref{fft-timing-3},
\ref{fft-timing-4}, \ref{ifft-timing-1}, and \ref{ifft-timing-2}.

\subsection{Complex Multiplier}
A complex combinational multiplier is used to multiply the twiddle
factors and the sample values. The multiplication is performed
according to the identity
\begin{equation}
(a+bj)(c+dj) = (ac - bd) + j(bc + ad)
\end{equation}
Hence, four combinational multipliers (described below) are
instantiated in order to multiply each of the product terms, and an
adder and subtractor unit combine the results to form the real and
complex parts of the product.

While this approach is expensive in terms of FPGA resource allocation,
it allows the computation to be performed rapidly, in less than one
clock cycle. This allowed for greater optimization of the butterfly
execution loop.

\begin{figure}[bhtp]
\begin{center}
\includegraphics[scale=.5]{complexmult}
\end{center}
\caption{Complex Multiplier Timing Diagram}
\label{complexmulttiming}
\end{figure}

\subsection{Multiplier}
Four combinational multipliers were instantiated as part of the
complex multiplier unit. Each of these multipliers accepted two 8-bit
two's complement input values, and generated as output their product
in a 16-bit two's complement representation.

The combinational multiplier was implemented using Altera's
parameterized module library: specifically, the LPM\_MULT component.

\subsection{Twiddle Factor Generator}
A twiddle factor generator was used to generate the values of the
twiddle factors. The twiddle factor generator takes as input the index
of the twiddle factor requested, and outputs the real and imaginary
components of the appropriate twiddle factor. The imaginary component
can then be negated by the FFT FSM if necessary in order to perform
the inverse FFT.

Twiddle factor real and imaginary coefficients are represented as
8-bit two's complement values, in a fixed-point representation where
they are scaled by a factor of 256. The values are stored in two ROMs;
the twiddle factor generator selects the appropriate address line for
each ROM and passes the output through.

A twiddle factor generator capable of generating 512 twiddle factors
was used in the implementation of the 1024-point FFT. Two additional
generators, with 8 and 16 points respectively, were also created for
test purposes.

\begin{figure}[bhtp]
\begin{center}
\includegraphics[width=6in]{twiddler512timing}
\end{center}
\caption{Twiddle Factor Generator Timing Diagram}
\label{twiddlertiming}
\end{figure}

\subsection{Twiddle Factor ROMs}
Two ROMs are used to store the values of the twiddle factors. These
ROMs are implemented using the FPGA's onboard memory, using the Altera
LPM\_ROM parameterized module. Each ROM (a sine rom and a cosine rom,
used for the imaginary and real components of the twiddle factors) is
a 512-location by 8-bit unclocked ROM.

The values for each ROM were generated by a C program whose source
code is available in Appendix~\ref{twiddlifier}. 

\section{Parallel Interface Subsystem} 
The parallel interface subsystem handles the task of exchanging data
between this module and the Processing and Video modules. When
directed to do so by the Control FSM, it transmits one set of samples
from an input buffer to the other two modules, then receives another
set of samples from the Processing module into an output buffer.

\subsection{Interface FSM}
The Interface FSM implements the parallel interface, handling the
tasks of handshaking with the other two modules via the control
lines, as well as transmitting and receiving the appropriate data. It
is controlled by the Control FSM, and starts transmitting and
receiving one block of information when it receives the Start signal.

The state transitions are shown in the state diagram in
Figure~\ref{interfacefsmstatediagram}. It implements the protocol
described in Section~\ref{interfaceprotocol}.

\begin{figure}[pbth]
\begin{center}
\includegraphics[height=8in]{interfacefsm}
\end{center}
\caption{Interface FSM State Diagram}
\label{interfacefsmstatediagram}
\end{figure}

The FSM begins in the s\_Idle state, and waits for a Start pulse from
the Control FSM. In the s\_SendStart state, it asserts the FFTSending
signal. Through the s\_SendAddrSelect and s\_SendRAMWait states, it
loads a value from the RAM. In the s\_SendReady state, the FSM then
outputs the value on the data bus and asserts the FFTRdyAck line in
order to indicate that the data is ready. It waits until both the
ProcRdyAck and Ack2 lines have been set high, indicating that the
other modules have received and acknowledged the data, then moves to
the s\_SendAck stage, setting the FFTRdyAck line low again. When the
other two modules set their acknowledge lines low again, indicating
that they are ready for the next data, the FSM returns to the
s\_SendAddrSelect state. If the counter overflows and reaches 0 again,
then sending is complete, and the FSM moves to the s\_SendComplete
state.

In the s\_SendComplete state, the FFTSending line is set low,
indicating that the FFT has finished its transmission, and the FSM
waits for the Processing module to begin its transmission cycle by
setting ProcSending high. When this signal is received, it moves to
the s\_RecvWait state. In this state, the FSM reads the data bus when
the ProcRdyAck signal goes high, then moves to the s\_WriteAddrSelect
and s\_Write states, writing the appropriate value to RAM. It then
moves to the s\_RecvAck state, setting the FFTRdyAck signal high to
acknowledge reception of the data, and waits for ProcRdyAck to be set
low before moving to s\_RecvWait and waiting for the next data to
become available. If instead ProcSending is set low again, the FSM
detects that the Processing module has finished transmitting, and it
returns to the s\_Idle state. 

\begin{landscape}
\begin{figure}[bthp]
\begin{center}
\includegraphics[width=8in]{interfacefsmtiming}
\end{center}
\caption{Interface FSM Timing Diagram}
\label{interfacefsmtiming}
\end{figure}
\end{landscape}

\subsection{Synchronizer}
The input signals to the module from the other modules are
asynchronous with respect to the local system
clock. A synchronizer is used to generate synchronized versions of the
The synchonized output is
guaranteed not to change except on rising edges of the clock.

The implementation of the synchronizer is simply a D flip-flop, as in
Figure~\ref{syncschematic}.
\begin{figure}[bthp]
\begin{center}
%\epsfig{file=6111-lab2-sync.ps,bbllx=323,bblly=381,bburx=440,bbury=440}
%\includegraphics[bb=223 281 540 540,clip=true]{6111-lab2-sync}
\includegraphics[scale=.5]{6111-lab2-sync.eps}
\end{center}
\caption{Synchronizer logic diagram}
\label{syncschematic}
\end{figure}

\begin{figure}[bthp]
\begin{center}
\includegraphics[scale=.5]{sync-timing}
\end{center}
\caption{Synchronizer timing diagram}
\label{synctiming}
\end{figure}

\subsection{Line Drivers / Receivers}
To isolate the FPGA from the transmission lines, line drivers and line
receivers were used. These were realized using 74LS244 and 74LS245
chips. Schmitt triggering of the line receivers provided hysteresis,
which was useful in minimizing transmission line effects. These are
depicted in the wiring diagram in Figure~\ref{wiringdiagram}.

\section{Testing and Debugging}
\label{convtesting}
All components were tested first in simulation using MAX+PLUS II
software, then in hardware when the FFT was programmed.

\subsection{Simulation}
Testing of most components was relatively straightforward: for numeric
components, a representative sample of test inputs was provided, and
the correct results were verified. To test FSMs, input signals were
provided that placed the FSMs into each of their states. The correct
outputs and internal state variables were observed. All possible
states and state transitions were validated. Correct output was observed.

Testing of the FFT was difficult, as a 1024-point FFT was too complex
to simulate. Thus, the FFT FSM was modified to compute a 16-point FFT
instead, which was considerably simpler to simulate. In addition, it
was modified to access a smaller RAM. A test jig was constructed,
which consisted of the FFT FSM (containing its subcomponents required
for complex multiplication and twiddle factor generation), integrated
with a 128 location by 8 bit RAM. This was used to test the FFT FSM,
initially without the memory manager. After these tests were
completed, a separate test jig was constructed consisting of the
memory manager and RAM to test the memory manager, and finally a test
jig containing the FFT FSM, memory manager, and RAM. Extensive testing
was performed with this configuration.

Several FFTs and inverse FFTs were performed in simulation, with mixed
results. The order of butterflies was observed to be correct, and
processing was otherwise satisfactory; however, the results were not
always acceptable. Some amount of numeric error was inevitable, but
some results were unacceptable. A demonstration of a FFT being
performed, then an inverse FFT being performed on the results is
presented in Figures~\ref{fft-timing-1}, \ref{fft-timing-2},
\ref{fft-timing-3}, \ref{fft-timing-4}, \ref{ifft-timing-1}, and
\ref{ifft-timing-2}. The input sequence provided consists of
alternating values of 0x20 (real) and zero. Note that the results of
the FFT are correct, and the inverse FFT returns the original
sequence. There is, however, some observable error that causes the
magnitude to be slightly reduced; this is due to the causes described
in Section~\ref{ffterror}.

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{fft-timing-1}
\end{center}
\caption{FFT Timing Diagram --- start of copy phase}
\label{fft-timing-1}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{fft-timing-2}
\end{center}
\caption{FFT Timing Diagram --- start of FFT computation}
\label{fft-timing-2}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{fft-timing-3}
\end{center}
\caption{FFT Timing Diagram --- first part of results}
\label{fft-timing-3}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{fft-timing-4}
\end{center}
\caption{FFT Timing Diagram --- second part of results}
\label{fft-timing-4}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{ifft-timing-1}
\end{center}
\caption{FFT Timing Diagram --- first part of inverse FFT results}
\label{ifft-timing-1}
\end{figure}
\end{landscape}

\begin{landscape}
\begin{figure}
\begin{center}
\includegraphics[width=8.5in]{ifft-timing-2}
\end{center}
\caption{FFT Timing Diagram --- second part of inverse FFT results}
\label{ifft-timing-2}
\end{figure}
\end{landscape}

\subsection{Hardware Testing}
After testing in simulation was complete, the hardware was
tested. First, a test program was programmed onto the 10K70 FPGA to
ensure that the FPGA was operating correctly and all output pins were
available. This test was completed successfully, though it was
discovered that the L1-0 pin on the lab kit was not operating normally.

Next, the RAM module was tested. As some problems were observed with
the original RAM chips selected (two Cypress CY7C185 8K $\times$ 8
chips), a test program for verifying RAM operation was programmed onto
the FPGA. This test program wrote values to several RAM addresses,
then read them back. A timing diagram is provided in
Figure~\ref{ramtesttiming}. The RAM address and data buses were
monitored using the logic analyzer. Eventually the two RAM chips were
replaced with a single Sharp LH52256C-70LL 32K $\times$ 8
chip. Correct RAM operation was observed.

\begin{figure}[bthp]
\begin{center}
\includegraphics[width=6in]{ramtest}
\end{center}
\caption{RAM Test Program Timing Diagram}
\label{ramtesttiming}
\end{figure}


The next stage of testing called for the FPGA to be programmed with
the project code. Two test modes were provided in the Control FSM, and
wired to the lab kit's switches. The first test was a test of the
Analog / Digital subsystem. With this mode enabled, the FFT, inverse
FFT, and interface were disabled. The Control FSM directed the A/D FSM
to continuously sample input into a buffer, then output the input
samples. The input was connected to a function generator and the
output connected to an oscilloscope; correct operation was
verified. Oscilloscope traces are provided in Figures~\ref{adtest1}
and \ref{adtest2}. While this is not a particularly exciting output,
it demonstrates the correct operation of a number of components: not
merely the ADC, DAC, and the A/D Control FSM, but also the Control
FSM, memory manager, and RAM, since the sample points must be
buffered. The logic analyzer was used to verify that RAM accesses were
occurring correctly.

\begin{figure}[pbht]
\begin{center}
\includegraphics[scale=.5]{p_a0}
\end{center}
\caption{A/D Test Mode --- sine wave input}
\label{adtest1}
\end{figure}

\begin{figure}[pbht]
\begin{center}
\includegraphics[scale=.5]{p_a1}
\end{center}
\caption{A/D Test Mode --- sine wave input, demonstrating sampling effects}
\label{adtest2}
\end{figure}

Once analog/digital testing was completed, the Control FSM's FFT test
mode was selected. In this mode, the Interface FSM was
bypassed. The A/D FSM reads input samples into a buffer, which are
then transformed by the FFT FSM. The results of the FFT are then
passed through the inverse FFT, and output by the DAC. Thus, the
output signal should approximate the input signal. Again, the function
generator was used as an input, and the oscilloscope used to monitor
the output. The logic analyzer was used to verify that RAM accesses
were working correctly, and correctly being multiplexed between the
FFT and A/D FSMs.

The results of this test were not satisfactory. An output was
generated, but it did not reliably reproduce the original
signal. However, it did bear some resemblance to the original signal,
as in Figure~\ref{ffttest}, where the sinusoidal input generates an
output that has a somewhat wavelike shape. This problem was likely due to the
numeric error described in Section~\ref{ffterror}.

\begin{figure}[bpht]
\begin{center}
\includegraphics[scale=.5]{p_f0}
\end{center}
\caption{FFT Test Mode Input and Output}
\label{ffttest}
\end{figure}

Next both test modes were disabled, and the output was observed on the
logic analyzer. As expected, the module stalled because the Interface
FSM was waiting for handshaking on the control lines of the parallel
bus, and the other two modules were not connected. Debounced switches
were wired to the control line inputs, and the correct outputs were
observed on the logic analyzer.

The two other modules were then connected to the parallel bus. After
resetting this module and the Processing module, data transmission was
observed for a few moments. However, the two modules could not
communicate reliably for more than a few hundred
milliseconds. Afterwards, both the FFTSending and ProcSending lines of
the bus were being held high, which is a forbidden state. It was not
clear what the cause of this deadlock was, and not enough time was
available to debug it entirely. A probable cause is that transmission
line effects led to glitches that caused the two FSMs to become out of
synchronization and caused each to believe that it was time for it to
transmit.


\chapter{Processing Module (A. Clough)}
\input{processing.tex}

\chapter{Video Module (A. Chiou)}
\input{video.tex}

\chapter{Conclusion}
All three modules of this project were successfully designed,
implemented, and tested in simulation. Some problems occurred when
these implementations were translated from the simulator to the
hardware, and when the three modules were integrated together.

The FFT module was successfully implemented in simulation. The
analog/digital tests passed in hardware, indicating that the
analog/digital hardware and controller was working correctly, as well
as the control unit, memory manager, and RAM. A 16-point FFT was
successfully performed in simulation, but numeric instability made the
1024-point FFT unreliable in the actual implementation.

The Processing module was also successfully implemented in simulation,
and verified to be working in hardware using the logic analyzer,
though problems with the lab kit forced some features to be removed,
due to a malfunctioning NuBus interface.

The Video module was also successfully implemented in simulation. It
was tested in hardware using a video monitor. A test circuit verified
the correct operation of the analog component, and a simple memory
test verified the correct operation of the RAM and memory
multiplexors. Address testing determined the correct format for
addressing RAM locations.

Integrating the three modules proved to be a problem. It was not
possible to make the parallel interface bus function reliably. The bus
would transmit some data, but then stall in a forbidden state in which
both the FFTSENDING and PROCSENDING lines were high. This suggests
that both the FFT module and Processing module interface FSMs believed
that they were transmitting data. This result was unexpected, as the
interface FSMs were tested in simulation and no such error was
observed. The problem is still unclear because not enough time to
perform extensive testing was available. Ultimately, we theorized that
transmission line effects were causing glitches on the bus control
lines that caused the two FSMs to become out of sync. Line drivers and
receivers were added to minimize these effects, which caused the bus
to become more functional for brief periods. However, these problems
were not solvable in the time available. If more time were available,
we would modify the interface FSMs of each module to make them more
robust to this sort of error.

As a result, our project was not entirely functional, since all three
components needed to be working and integrated in hardware to have an
operational result. Nonetheless, a valid design was completed, and
working code was implemented and validated, and testing was
performed. The correct operation of nearly every sub-module was
verified, which represented a substantial accomplishment. Indeed, the
only problems that remained were some numeric error in the FFT, which
was unavoidable, and some integration issues that were not able to be
debugged thoroughly. We are confident that, given enough time and
resources, these obstacles could be overcome while maintaining our
original, valid design.

On a more personal level, though we are obviously disappointed that
the full extent of our project could not be fully realized, we now
recognize the full complexity of the problem we were attempting to
solve. Even so, we were able to create a workable design for
approaching the problem, and an implementation of this design,
including some rather complex components. We were each able to
implement our respective designs, verifying their correct operation in
simulation, and translating most of them to hardware. Though some
problems were encountered that prevented the project from working
successfully, the majority of the project was nearly operational. We
consider this a substantial accomplishment. Moreover, the process of
designing and implementing this project proved quite educational,
demonstrating the process of taking a problem, generating a design to
approach it, and translating that design to a hardware
implementation. We found this to be quite rewarding.


\appendix
\chapter{Source Code}
\section{Conversion / FFT Module}
\subsection{top.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{top.vhd}
\newpage

\subsection{adfsm.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{adfsm.vhd}
\newpage

\subsection{complexmult.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{complexmult.vhd}
\newpage

\subsection{control.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{control.vhd}
\newpage

\subsection{divider.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{divider.vhd}
\newpage

\subsection{fft.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{fft.vhd}
\newpage

\subsection{interfacefsm.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{interfacefsm.vhd}
\newpage

\subsection{mult.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{mult.vhd}
\newpage

\subsection{synchronizer.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{synchronizer.vhd}
\newpage

\subsection{synchronizer1.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{synchronizer1.vhd}
\newpage

\subsection{twiddler8.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddler8.vhd}
\newpage

\subsection{twiddler16.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddler16.vhd}
\newpage

\subsection{twiddler512.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddler512.vhd}
\newpage

\subsection{twiddlerom8c.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom8c.vhd}
\newpage

\subsection{twiddlerom8s.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom8c.vhd}
\newpage

\subsection{twiddlerom16c.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom16c.vhd}
\newpage

\subsection{twiddlerom16s.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom16c.vhd}
\newpage

\subsection{twiddlerom512c.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom16c.vhd}
\newpage

\subsection{twiddlerom512s.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{twiddlerom16c.vhd}
\newpage

\subsection{ffttestjig.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{ffttestjig.vhd}
\newpage

\subsection{memmgrtestjig.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{memmgrtestjig.vhd}
\newpage

\subsection{fftmgrtestjig.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{fftmgrtestjig.vhd}
\newpage

\subsection{testram.vhd}
\lstinputlisting[numbers=left,language=vhdl,breaklines=true]{testram.vhd}
\newpage

\subsection{twiddlify.c}
\label{twiddlifier}
\lstinputlisting[numbers=left,language=c,breaklines=true]{twiddlify.c}
\newpage

\subsection{twiddle8c.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle8c.dat}
\newpage

\subsection{twiddle8c.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle8c.ntl}
\newpage

\subsection{twiddle8s.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle8s.dat}
\newpage

\subsection{twiddle8s.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle8s.ntl}
\newpage

\subsection{twiddle16c.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle16c.dat}
\newpage

\subsection{twiddle16c.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle16c.ntl}
\newpage

\subsection{twiddle16s.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle16s.dat}
\newpage

\subsection{twiddle16s.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle16s.ntl}
\newpage

\subsection{twiddle512c.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle512c.dat}
\newpage

\subsection{twiddle512c.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle512c.ntl}
\newpage

\subsection{twiddle512s.dat}
\lstinputlisting[numbers=left,breaklines=true]{twiddle512s.dat}
\newpage

\subsection{twiddle512s.ntl}
\lstinputlisting[numbers=left,breaklines=true]{twiddle512s.ntl}
\newpage

\subsection{top.acf}
\lstinputlisting[numbers=left,breaklines=true]{top.acf}
\newpage

\subsection{top.pin}
\lstinputlisting[numbers=left,breaklines=true]{top.pin}
\newpage

\section{Processing Module}
\subsection{P4Overshell.vhd}
\newpage
\newpage
\subsection{MassiveFSM.vhd}
\newpage
\newpage
\newpage
\newpage
\newpage
\subsection{P4Control.vhd}
\newpage
\newpage
\subsection{P4MathShell.vhd}
\newpage
\newpage
\subsection{P4Arithmetic.vhd}
\newpage
\newpage
\newpage

\section{Video Module}

\subsection{top.vhd}
\lstinputlisting[numbers=left,breaklines=true,language=vhdl]{top-video.vhd}

\newpage
\subsection{readfsm.vhd}
\lstinputlisting[numbers=left,breaklines=true,language=vhdl]{readfsmtest.vhd}

\newpage
\subsection{ramfsm.vhd}
\lstinputlisting[numbers=left,breaklines=true,language=vhdl]{ramfsmtest.vhd}

\newpage
\subsection{writefsmtest2.vhd}
\lstinputlisting[numbers=left,breaklines=true,language=vhdl]{writefsmtest2.vhd}

\newpage
\subsection{buff.vhd}
\lstinputlisting[numbers=left,breaklines=true,language=vhdl]{buff.vhd}


\end{document}

% LocalWords:  combinationally