% -*- TeX-master: "dp2.tex" -*-
% $Id: communications.tex,v 1.2 2004/05/06 14:10:42 dan Exp $

The most important communications between the devices on Mars are the
distribution of missions and the transmission of mission data. The
network layer is already specified as a ``black box'' with the functions
\proc{sendMsg}(\id{msg}, \id{length}) and
\proc{deliver\_message}(\id{packet}, \id{length}), so only packet
segmenting must be dealt with. The link layer is also specified,
because it is just a broadcasting radio receiver, although there is an
issue of how to detect nearby devices which are available to receive
messages. Thus, the majority of the communications design is in the
end-to-end layer.


\subsection{Protocol Design}
\label{sec:protocol}


The first task is to design a standard packet
protocol. The Mars devices follow this protocol for the format of the
packets (in the end-to-end layer; the network and link layers may, of
course, add headers and trailers):

\begin{itemize}
\item Device name
\item Message nonce
\item Message type
\item Message arguments (for message types one, five, six, seven, and ten)
\item Signature
\item Checksum
\end{itemize}

\paragraph{Device name}
The first part of the packet, the device name, is trivial. The rovers
can be given numbers 1--50 as their names, and the command center can
have the name 0, thus providing each device with a unique identifier.

\paragraph{Message nonce}
The nonce is an integer chosen at random by the sending rover or
command center, so that it is unique among outstanding messages. Since
there are at most fifty-one messages being broadcast
at the same time, a nonce can simply be an integer of several bytes.

\begin{table}[htbp]
  \centering
  \begin{tabular}{|r|l|}
    \hline
    \textbf{ID} & \textbf{Message Type} \\
    \hline
    01 & Acknowledge \\
    02 & ``Who's here?'' \\
    03 & Begin communication \\
    04 & End communication \\
    05 & Mission assignment \\
    06 & Mission data \\
    07 & Data / assignment size \\
    08 & Not enough memory \\
    09 & Already have a mission \\
    10 & Resend message \\
    \hline
  \end{tabular}

  \caption{Message types}
  \label{tab:message-types}
\end{table}


\paragraph{Message type and arguments}
The message type is a single integer specifying the type of message,
such as an acknowledgement or error (see
Table~\ref{tab:message-types}). Message type one, the acknowledgement,
and message type ten, the request to resend, are followed by the
nonces of the messages to which they refer. Message types five and
six, which flag mission assignments and data respectively, will be
followed by the expected sort of message in a standard format
(specified in the section on mission distribution).  Message type
seven, which flags a data size message, is followed by an integer
specifying the amount of data the device wishes to send in bytes. This is
so the device receiving the data or assignment will know how much
memory to allocate, or send a type eight message in reply if there is
not enough memory.  The rest of the message types do not require
additional arguments.
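
As an illustration only, the message types of
Table~\ref{tab:message-types} could be written down in Python as
follows. The identifier names and the helper
\texttt{takes\_arguments} are hypothetical; only the numeric IDs and
the set of types that carry arguments come from the text above.

```python
from enum import IntEnum


class MessageType(IntEnum):
    """Message type IDs, copied from the message-type table."""
    ACKNOWLEDGE = 1
    WHO_IS_HERE = 2
    BEGIN_COMMUNICATION = 3
    END_COMMUNICATION = 4
    MISSION_ASSIGNMENT = 5
    MISSION_DATA = 6
    DATA_SIZE = 7
    NOT_ENOUGH_MEMORY = 8
    ALREADY_HAVE_MISSION = 9
    RESEND_MESSAGE = 10


# Types that are followed by arguments; every other type carries none.
TYPES_WITH_ARGUMENTS = frozenset({
    MessageType.ACKNOWLEDGE,         # nonce of the acknowledged message
    MessageType.MISSION_ASSIGNMENT,  # assignment in the standard format
    MessageType.MISSION_DATA,        # data in the standard format
    MessageType.DATA_SIZE,           # number of bytes about to be sent
    MessageType.RESEND_MESSAGE,      # nonce of the message to resend
})


def takes_arguments(msg_type: MessageType) -> bool:
    """Return True if this message type is followed by arguments."""
    return msg_type in TYPES_WITH_ARGUMENTS
```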

\paragraph{Signature}
The next item, the signature, is to
provide authentication so that the messages from Poodle rovers will
not be mistaken for messages from NASA rovers. Since the Poodle rovers
are not malicious, there is no need for message encryption or
protection against attacks. The most logical solution is to have all
devices sign their messages using a shared secret key. The only
requirement for this key is that it is different from any key the
Poodle rovers might be using; a 54-bit key would be more than enough
protection. Finally, the packet ends with a SHA-1 hash of the
message, used as a checksum. Whenever a device receives a message, it
first recalculates the checksum and then checks the signature. The
checksum is checked first because a transmission error may have
corrupted the signature itself. If the checksum doesn't match, the
device sends the sender a type ten (resend message) message. If the
checksum matches but the signature is not the correct, expected
signature of another NASA device, it disregards the message.
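
The packet layout and the receive-side checks might be sketched in
Python as follows. This is a sketch under stated assumptions, not part
of the design: the field widths (one byte for the device name, four
for the nonce, one for the message type), the use of HMAC-SHA1 as the
signature, the choice to let the checksum cover the signature, and the
function names are all invented for illustration.

```python
import hashlib
import hmac
import struct

# Assumed widths: device name (1 byte), nonce (4 bytes), message type (1 byte).
HEADER = struct.Struct(">BIB")
DIGEST_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def build_packet(key: bytes, device: int, nonce: int, msg_type: int,
                 args: bytes = b"") -> bytes:
    """Assemble name, nonce, type, arguments, signature, and checksum."""
    body = HEADER.pack(device, nonce, msg_type) + args
    # Assumption: the shared-key signature is an HMAC-SHA1 over the body.
    signature = hmac.new(key, body, hashlib.sha1).digest()
    # Assumption: the checksum covers everything that precedes it.
    checksum = hashlib.sha1(body + signature).digest()
    return body + signature + checksum


def check_packet(key: bytes, packet: bytes) -> str:
    """Return 'ok', 'resend' (bad checksum), or 'ignore' (bad signature)."""
    if len(packet) < HEADER.size + 2 * DIGEST_LEN:
        return "resend"  # too short to be intact
    signed, checksum = packet[:-DIGEST_LEN], packet[-DIGEST_LEN:]
    # The checksum is verified first, since the signature may be corrupted.
    if hashlib.sha1(signed).digest() != checksum:
        return "resend"
    body, signature = signed[:-DIGEST_LEN], signed[-DIGEST_LEN:]
    expected = hmac.new(key, body, hashlib.sha1).digest()
    if not hmac.compare_digest(signature, expected):
        return "ignore"  # intact, but not signed with the NASA key
    return "ok"
```

A packet signed with a different key passes the checksum test but
fails the signature test, so it is ignored without a resend request.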


\subsection{Message Delivery Procedure}
\label{sec:message-delivery}


In delivering a message from one device to another, the first step
must be ensuring that another device is in range of the radio. This is
accomplished by sending a type two (``who's here'') message.  Any device
which receives this message will broadcast an acknowledgement (ack;
signified by type one). If the device doesn't receive an answer and
expects that it should (for instance, if a rover knows it is in range
of the command center), it can re-broadcast the query after a
reasonable period of time until it receives an ack. Repeated queries
and acks cause little overhead, and a rover need not keep any state
except for the knowledge that it is expecting a query reply. Note that
a rover
%% will only send an ``ack'' if it is not moving, and
% not anymore! -drkp
will not move while communicating with another device. This will
prevent rovers from moving out of radio range while another device is
attempting to communicate with them. When a suitable device has been
located in radio range, mission data can be transmitted.
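
The query loop described above can be sketched as follows. The two
callables stand in for the radio and are hypothetical, as are the
function name, the default retry interval, and the optional attempt
limit; the text specifies only that the query is repeated after a
reasonable period until an ack arrives.

```python
from typing import Callable, Optional


def find_neighbor(
    send_query: Callable[[], None],
    poll_ack: Callable[[float], Optional[int]],
    retry_interval: float = 5.0,
    max_attempts: Optional[int] = None,
) -> Optional[int]:
    """Broadcast type-two queries until a device acks.

    send_query broadcasts one "Who's here?" message. poll_ack waits up
    to the given number of seconds and returns the name of a device
    that acked, or None if nothing arrived. Returns the responder's
    name, or None if max_attempts queries went unanswered.
    """
    attempts = 0
    # The only state kept is that a reply is still expected.
    while max_attempts is None or attempts < max_attempts:
        send_query()
        attempts += 1
        responder = poll_ack(retry_interval)
        if responder is not None:
            return responder
    return None
```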


\subsubsection{Rover -- Command Center Communications}

The command center will wait for queries. When a
query is received, the command center acknowledges the query, and the
rover will send a message of type three (begin communication). The
rover will not move until the communication is finished. The rover
will then send a type seven message along with the size of the data it
plans to transmit, which the command center will acknowledge.

After receiving an acknowledgement, the rover will begin transmitting
the data in 1500-byte packets, each flagged type six. The data will
simply be segmented into pieces of 1500 bytes minus the size of the
packet headers and trailers. The rover will only transmit one packet at a
time and wait for an acknowledgement or a request to resend a packet
before sending a new one. If it does not receive an acknowledgement or
resend request containing the nonce of the message it just sent after
a timeout period (for instance, two or three times the average amount
of time necessary to send, receive, and acknowledge a 1500-byte
packet), it discards that nonce and resends the packet with a new
nonce. Thus the devices only keep one piece of state (the current
nonce) per communication.
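
A minimal sketch of the segmentation and the one-packet-at-a-time
sender is given below. The callables \texttt{transmit} and
\texttt{wait\_reply}, the 32-bit nonce width, and the function names
are assumptions. The sketch also assumes that a resend request is
handled like a timeout, with a fresh nonce, which the text does not
state.

```python
import secrets
from typing import Callable, Iterator, Optional

PACKET_SIZE = 1500  # bytes per packet, from the text


def segment(data: bytes, overhead: int) -> Iterator[bytes]:
    """Split data into pieces of PACKET_SIZE minus header/trailer overhead."""
    payload = PACKET_SIZE - overhead
    if payload <= 0:
        raise ValueError("overhead leaves no room for data")
    for start in range(0, len(data), payload):
        yield data[start:start + payload]


def send_stop_and_wait(
    data: bytes,
    overhead: int,
    transmit: Callable[[int, bytes], None],
    wait_reply: Callable[[int], Optional[str]],
) -> int:
    """Send each segment and wait for its ack before sending the next.

    wait_reply(nonce) returns "ack", "resend", or None on timeout.
    Returns the total number of transmissions, including resends.
    """
    transmissions = 0
    for chunk in segment(data, overhead):
        while True:
            # The current nonce is the only state kept per communication.
            nonce = secrets.randbits(32)  # assumed 4-byte nonce
            transmit(nonce, chunk)
            transmissions += 1
            if wait_reply(nonce) == "ack":
                break  # move on to the next segment
            # Timeout or resend request: discard the nonce and try again.
    return transmissions
```

For example, with 100 bytes of overhead a 2500-byte data set is sent
as one 1400-byte piece followed by one 1100-byte piece.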

Because devices transmit only one packet at a time and wait for an
acknowledgement before transmitting a new packet, each device receives
packets in order, and so no order information need be
included with a packet of data. Since message transmission time is
trivial compared to traveling and experimentation time, using more
complicated protocols to transmit multiple packets would add
complexity without much performance benefit.

A rover will continue transmitting its mission data until it has
received an acknowledgement for the last packet. It will then send a
type four (end communication) message. The command center can either
acknowledge this, in which case the rover will sit idle until the
command center contacts it (using the same query/begin system
described above), or send a type seven (data / assignment size)
message, containing the size of the mission assignment it wishes to
send to the rover. This is also how the command center contacts rovers
when they first reach Mars. The rover will acknowledge the size
message and allocate memory space, unless it already has a mission (an
error condition which should not happen), in which case it will send a
type nine message. The rover should have enough space free: it has
just uploaded all of its data to the command center, so it can delete
data, starting with the oldest.  The command center will transmit its mission
assignment using the same protocol that the rover used to transmit its
data. After it has finished transmitting the mission assignment, the
command center will send a type four (end communication) message, and
the rover will respond with an ack. This frees the rover to begin its
travel to the mission site.
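
The rover's choice of reply to a type seven message can be sketched as
a small decision function. The function and parameter names are
hypothetical, and the order of the two checks is an assumption; the
numeric reply types come from Table~\ref{tab:message-types}.

```python
ACKNOWLEDGE = 1
NOT_ENOUGH_MEMORY = 8
ALREADY_HAVE_MISSION = 9


def reply_to_size_message(announced_size: int, free_memory: int,
                          is_assignment: bool, has_mission: bool) -> int:
    """Pick the reply type for a data / assignment size message.

    announced_size and free_memory are in bytes. is_assignment is True
    when the size message announces a mission assignment rather than
    mission data.
    """
    if is_assignment and has_mission:
        # Error condition that should not happen in normal operation.
        return ALREADY_HAVE_MISSION
    if announced_size > free_memory:
        return NOT_ENOUGH_MEMORY
    # Enough room: the caller allocates the memory and acknowledges.
    return ACKNOWLEDGE
```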


\subsubsection{Rover -- Rover Communications}

Two rovers can share mission results with each other (described in
Section~\ref{sec:data-exchange}). This is accomplished using exactly
the same protocol as a rover sharing data with the command center,
except that a rover can, after receiving the ``data size'' message,
send a message stating it does not have enough memory for the data, at
which point the other rover can decide to send only part of its data
and try again, or terminate communication.


\subsubsection{Command Center -- Earth Communications}

The other special case is communication between the command center and
Earth. This also follows the same protocol as described above, except
that the command center does not wait for acknowledgements before
sending the packets. Earth will know how many packets to expect upon
receiving the ``data size'' packet, so the command center can simply
number each packet it sends. It can even send packets multiple times,
defending against packet loss, if it runs out of new data to send
during a transmission period. The command center will not delete data it
holds until it receives acknowledgements for every non-duplicate
packet; thus the command center must keep extra state for
communications with Earth. It will resend packets to Earth as
necessary.
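
The extra state kept for Earth communications might look like the
following sketch, in which the command center tracks which numbered
packets are still unacknowledged. The class and method names are
hypothetical, and numbering packets from zero is an assumption.

```python
from typing import Iterable, List, Tuple


class EarthUplink:
    """Tracks which numbered packets Earth has not yet acknowledged."""

    def __init__(self, segments: Iterable[bytes]) -> None:
        self.segments = list(segments)
        # Every packet number is outstanding until its ack arrives.
        self.unacked = set(range(len(self.segments)))

    def packets_to_send(self) -> List[Tuple[int, bytes]]:
        """Return (number, segment) pairs still awaiting an ack, in order.

        These may be sent back to back, and sent again in spare
        transmission time, without waiting for acknowledgements.
        """
        return [(n, self.segments[n]) for n in sorted(self.unacked)]

    def on_ack(self, number: int) -> None:
        """Record an acknowledgement; duplicate acks are harmless."""
        self.unacked.discard(number)

    def may_delete_data(self) -> bool:
        """Data may be deleted only once every packet has been acked."""
        return not self.unacked
```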


\subsubsection{Failures}

If a rover fails in the middle of a communication, then the device at
the other end will notice after a suitably long period of time that it
has not received any messages. It will then attempt several queries.
If it times out again, then the device will assume that the rover has
died. If the rover was communicating with the command center, the
command center will mark it dead; if the rover was communicating with
another rover,
then that rover will simply move on to other tasks.
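
The failure rule can be summarised as a small state function. The
state names, the parameter names, and the idea of passing the
thresholds as arguments are assumptions; the text says only that the
period is ``suitably long'' and that ``several'' queries are tried.

```python
from enum import Enum


class PeerState(Enum):
    ALIVE = "alive"      # messages are still arriving
    PROBING = "probing"  # silent too long; queries are being sent
    DEAD = "dead"        # silent, and all queries went unanswered


def peer_state(seconds_silent: float, silence_timeout: float,
               unanswered_queries: int, max_queries: int) -> PeerState:
    """Classify the device at the other end of a communication."""
    if seconds_silent < silence_timeout:
        return PeerState.ALIVE
    if unanswered_queries < max_queries:
        return PeerState.PROBING
    return PeerState.DEAD
```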