\documentclass{article}
\include{6033-preamble}
\usepackage{graphicx,fullpage}

\begin{document}
%\title {Design Project 2 Executive Summary}
%\author {Dan Ports, Amanda Smith, Sarah Lieberman}
%\maketitle

\begin{center}
\par
\huge{AARDVARK}
\vskip 1em
{\lineskip .75em
\Large{Dan Ports, Amanda Smith, and Sarah Lieberman}\par
\Large{May 6, 2004}\par
\vskip 1em
\large{6.033 Design Project 2 Executive Summary}\par
\large{Recitation \#9: Karger/Lesniewski-Laas}
}\end{center}


\section{Overview}
       The Mars Rover project can most easily be split into three
components: communication protocol, mission execution, and mission distribution.
The problems associated with communication protocol are relatively
straightforward and include message authentication and error detection and
correction. Mission execution covers issues relating to the actual completion of missions by rovers, and includes such issues as sensor failure and data relay between rovers. Mission distribution, possibly the most difficult problem in the project, contains the algorithm which the command center uses to distribute missions to rovers, as well as the mechanisms it uses to detect and deal with rover failure, distribute amendments, and coordinate data uploads.

\section{Communications}\
	The basic communication protocol is an at-least-once delivery system, with possible duplicate messages, using a standard packet structure, and only broadcasting one packet at a time to prevent misordering of packets. The packet structure contains the name of the sending device, a message nonce, a message type, a message (optional), a signature, and a checksum. The name is simply an integer coded into each rover and the command center, and the nonce is simply a large integer used as a message identifier. The type is an integer code such as "01" for an acknowledgement, "06" for mission data, and "10" for a request to resend a message. The message is any additional content, such as mission assignments or nonces in acknowledgements. Finally, the devices use a shared-secret key and a SHA1 hash as a checksum to provide authentication and guard against transmission errors, respectively. Encryption and other protections against malicious devices are not needed, because the only other devices on Mars are non-malicious Poodle rovers. 
	To detect the presence of another device, a device sends out queries until it recieves an acknowledgement. The device then opens communication, which stops both devices from moving if it is acknowledged. The device will send out messages and packets of data one at a time, waiting for an acknowledgement (or a resend request) before sending another, thus ensuring at-least-once delivery; duplicate messages are fine since the only communications are data transmission rather than action requests. When the devices end communication, they send out end messages and then resume other operations. If one device does not recieve a message from another in a certain amount of time, it will try resending its last packet or sending a query; if this fails repeatedly, it will assume a dead device and end communication.


\section{Mission Execution}
\label{sec:mission-execution}

Each rover is capable of autonomously executing missions according to
parameters received from the control center. It does so by maintaining
a queue of instructions.  There are four types of instructions:
\begin{itemize}
\item travel to a certain location
\item perform an experiment (e.g. taking a photo)
\item deliver a message to another rover
\item return to the control center
\end{itemize}

When a rover performs a experiment, there is a non-zero probability of
sensor failure. However, to ensure that a correct result is returned
with 95\% probability, the rover will repeat the experiment the
required number of times. The rover will have a preprogrammed number of trials to perform with each sensor to ensure a 95\% chance of at least one accurate result, based on the failure rate of the sensor.

When two rovers are sent together to perform a mission, or a rover travels within communication range of another rover, they will attempt to transmit data to each other. A rover will reject any data which it does not have room to store in favor of keeping its own mission data. Each data copy will have a version number so that if the command center recives multiple copies of data from the same mission, it will know which is the most complete and up-to-date. This will reduce the probability of data loss due to rover failure.

\section{Mission Distribution}
\label{sec:mission-distribution}

The control center receives missions and amendments from Earth and
allocates them to rovers. Hence, it must maintain a table of
missions. Each mission comprises a set of tasks to be executed at a
certain, and is identified with a mission ID and revision number.
The control center also associates states with each mission: whether
it has been assigned, and whether it has been partially or fully
completed.

The control center also maintains a table of rovers. A rover can
be:

\begin{itemize}
\item \emph{available}, meaning it is present at the control
  center and waiting to be assigned a mission
\item \emph{busy}, meaning it is currently executing a mission
\item \emph{lame}, meaning it is currently executing a mission that
  has become moot due to an amendment
% is there a better term for this?
\item \emph{dead}, meaning that the control center has not heard from
  the rover and has presumed it to have failed
\end{itemize}
In addition, the control center keeps track of what mission, including
revision number (if any) the rover is currently executing, the time it
started, and when the rover is expected back (see below).

When a mission is received, it is added to the mission table in the
unassigned state. Scattered and scattered-cluster missions are broken
up into their individual or cluster components, and these are treated
as separate missions.\footnote{It may also be necessary to ensure that
  the components of a scattered or scattered-cluster mission are
  executed in order. If so, this can also be done by adding extra
  information to the missions table.} This forces rovers to return to
the control center to offload their data between parts of the
scattered mission, minimizing the amount of data that will be lost if
the rover fails.

Whenever rovers are in the available state, missions
are assigned to them. Since all missions are treated as having equal
priority, the goal is to maximize the number of missions that are
executed before all rovers fail. Hence, we use a greedy algorithm that
ranks missions in terms of their expected time to
complete\footnote{This algorithm is simple but sub-optimal. It could
  (and probably will) be improved by taking in to account factors such
  as batching of nearby missions to minimize travel time.} This is
computed as the sum of the round-trip travel time required plus for
each experiment the product of the expected time required to execute
the experiment and the expected number of times the experiment will
need to be repeated. When a rover is available, it assigns the
``cheapest'' mission to the rover by uploading the proper instruction
and setting the state tables appropriately.

When an amendment is received from Earth, it updates the mission table
with the new parameters and increments the mission's revision number.
If the mission is still unassigned, no further processing is
necessary. If the mission has already been assigned or completed, then
the modified portions of the mission will be rescheduled as though it
were a new mission. If a ``lame'' rover is still executing the
mission, then it may be advantageous to dispatch another rover to tell
it to abort.\footnote{Batching would be especially useful here --- the
  two rovers could then proceed to execute the amended mission and/or
  other missions without returning to the control center.} This can be
implemented by treating the message-delivery as a mission whose
expected time to completion for scheduling purposes takes into account
both the time required for delivering the message and the time saved
by aborting the mission. If the lame rover completes the old mission,
the revision number information will indicate to the control center
that it is outdated and should probably be ignored.

Rovers can fail unpredictably. The control center can only detect
failures when it fails to receive any response from the rover. There
must therefore be a time after which the control center assumes the
rover has failed if no response has been received. We compute this
time by summing the required travel time and the bound of required
time to perform and repeat the experiments in 95\% of cases. If the
mission has not been completed by this time, it is marked unassigned
and the rover assigned to it is marked dead. If it turns out that the
rover was in fact still operating, and it returns after being marked
dead, the experiment results will be downloaded, the mission marked
completed, and any new rover assigned to that mission treated as lame,
as though a mission amendment had been received.


\section{Conclusion}
\label{sec:conclusion}
This design should be able to ensure that resources are efficiently allocated in order to ensure that as much valid scientific data as possible gets sent back to the scientists at NASA.  In order to achieve this goal, we took into consider the various complications that could hinder the process.  These complications include mission amendments, the failure of sensors and rovers, and poor communications.  The mission distribution algorithm is made flexible by the variable amount of rovers that can get sent out to perform a mission.  This flexibility is an asset because it allows the system to efficiently allocate resources under a wide range of work loads.  Also, the priority values that can be assigned to missions make the overall system fairly customizable.  With simple changes in the procedure for value assignment, the overall system can be customized to suit a variety of needs.  Some of the design choices we have made depend on assumptions that would be valid in the real world, whereas other assumptions might only be valid given this design project’s model of the world.  However, we believe that we have kept these assumptions to a reasonable level and that within the world of this design project the design would soundly deal with the issues with which it is presented.  


\end{document}