Assignment 2

Due: 2:15pm April 15th, by email to cs303@cs.stanford.edu
(submission instructions below)


1. Analyzing systems data with R.

Last week, many of you had some trouble with the first graphing assignment. To give you a little more practice (and build your confidence!), this week you will use R to recreate some of the plots from the Roofnet measurement paper. 

One of the very nice things about the Roofnet paper is that all of the experimental data is easy to download as a bzipped tarball, which has led to several follow-on papers that examine the results more deeply. In the standard Roofnet dataset, one file describes all transmitted packets and another describes all received packets. The receive file is over 1GB in size, which is more than R handles comfortably, so for this assignment we've preprocessed the dataset to make it easier to work with in R. We've also cleaned up a few rough edges, such as a few nodes that have receive records but no send records.

The dataset can be found here and is laid out as follows:

 roofnet-links/
     1/      1Mbps measurements
     2/      2Mbps measurements
     5.5/    5.5Mbps measurements
     11/     11Mbps measurements

Within each measurement directory, there are two kinds of files: comma-separated value (.csv) files of packets received on a single, directional link, and text (.txt) files that state how many packets a given sender sent.

The csv files have a header describing the fields. The four fields you care about are:

 src: the source of the packet
 dst: the destination (who received the packet)
 noise: the noise value as described in the paper
 signal: the signal strength value as described in the paper

Note that the src and dst fields are constant within a given csv, and are also in the file name. So generally you'll only work with the noise and signal values. Recall that because signal and noise are on a logarithmic scale, signal-to-noise ratio (SNR) is signal - noise.
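
As a minimal sketch of the SNR calculation, the snippet below builds a small data frame by hand in place of reading one of the csv files. The numeric values are made up for illustration and are not taken from the dataset; only the column names come from the field list above.

```r
# Hypothetical stand-in for one link's csv; the values are invented.
link <- data.frame(
  src    = c(1, 1, 1),
  dst    = c(2, 2, 2),
  noise  = c(15, 15, 16),
  signal = c(30, 27, 33)
)

# Both columns are on a logarithmic scale, so the ratio is a difference.
link$snr <- link$signal - link$noise
print(link$snr)  # 15 12 17
```

With a real file, the data frame would come from read.csv() on the link's csv instead of data.frame().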

The txt files have just two entries: the sender ID and how many packets were sent. You can compute the packet reception ratio along a link by counting the entries in the corresponding csv and dividing that count by the number of packets the source sent.
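
The reception ratio calculation can be sketched as follows. To keep the example self-contained it parses inline text rather than files on disk. The csv rows are invented, and the txt layout (sender ID and count separated by whitespace on one line) is an assumption, so check it against the actual files before relying on it.

```r
# Invented contents standing in for one link's csv (one row per received packet).
csv_text <- "src,dst,noise,signal
7,9,10,30
7,9,11,28
7,9,9,31"

# Assumed txt layout: "<sender ID> <packets sent>" separated by whitespace.
txt_text <- "7 4"

received  <- read.csv(text = csv_text)
sent_info <- scan(text = txt_text, quiet = TRUE)
sent      <- sent_info[2]

# Reception ratio = packets received on the link / packets the source sent.
ratio <- nrow(received) / sent
print(ratio)  # 0.75
```

For the real data, replace the text = arguments with the paths to a link's csv and its sender's txt file.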

Collaboration policy:  If you wish, you may collaborate with one other student. Please clearly state in your answer with whom you have collaborated. Each student should turn in her own copy of the assignment.

Part a)
Recreate the four plots in Figure 14 of the Roofnet paper using R. Check that they look correct. Note that this R program can be written in about 20 lines (and probably fewer).

Part b)
This question recaps what was discussed in class regarding the methodology of this paper:

Compute the minimum SNR observed for a packet on each link and plot a histogram of these values. You should see some links that have packet SNRs far lower than the curves in Figure 12 suggest is possible. Why did these values occur? Do they represent outliers that should be excluded from the analysis, or is something else at work? Make a (brief) argument for your case.
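
One possible shape for the per-link minimum, shown on invented data, is sketched below. The combined data frame and its link column are hypothetical conveniences for the example, since the real dataset keeps one csv per link. The sketch covers only the mechanics and says nothing about why low values appear.

```r
# Invented packets from two links; "link" is a hypothetical label column.
packets <- data.frame(
  link   = c("1-2", "1-2", "1-2", "3-4", "3-4"),
  signal = c(30, 25, 28, 40, 35),
  noise  = c(10, 12, 9, 11, 10)
)
packets$snr <- packets$signal - packets$noise

# Smallest SNR seen on each link.
min_snr <- tapply(packets$snr, packets$link, min)
print(min_snr)

# Histogram of the per-link minima.
hist(min_snr, xlab = "Minimum SNR per link", main = "")
```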

2. Start your blogs!

When conducting any sort of research, a good habit to get into is the keeping of a lab notebook. In this class, you'll be keeping a virtual lab notebook in the form of a blog. (Please see Blogger for templates.)

Please start your blog and make your first post describing the experiment you'll be doing for the class. (Hopefully you were able to home in on one of your proposals during the in-class discussion last week.) Submit the URL of your page.

3. Read the paper to be discussed next week.

The paper we'll discuss next week is:

Dow et al.: Parallel Prototyping Leads to Better Design Results, More Divergence, and Increased Self-Efficacy.

It should be fairly self-explanatory. In your submission, please include a brief paragraph giving your reactions to the paper. This might cover your thoughts on the paper's findings, its methodology, or what you might have done differently (or the same). Did you find anything confusing? Did you particularly like any parts of it? This is simply an opportunity to show that you read the paper and gave it some thought.

Submission guidelines:

Your submission should consist of an email to the staff mailing list with:

1. Five attachments. The first four attachments must be named suid-1.pdf, suid-2.pdf, suid-5.5.pdf, and suid-11.pdf, where suid is your SUID and each PDF is the plot from problem 1a) for the corresponding bit rate. The fifth attachment must be named suid-roofnet.r and contain the R code for part a).

2. The text of the email should contain the answer to 1b) (and the names of any collaborators).

3. The text should contain a link to your "lab notebook" blog.

4. A paragraph containing your comments on the parallel prototyping paper.