Assignment 4: Analyzing Input & Online Clickthrough Data

Due: by class time on May 5th. Email to cs303@cs

Introduction:
In this assignment, you will use R to analyze the input data from the first day of class, and the ad clickthrough data from the Dow paper.

Part 1:
On the first day of class, we gathered a bunch of target acquisition data using various input devices. We learned about Fitts's Law, which models target acquisition time as a linear function of the index of difficulty, log2(A/W + 1), where A is the movement amplitude and W is the target width. The slope and intercept of that line depend on the device. We're going to create some graphs and find parameters using this csv file of all the data. Here's the model I fit for the mouse:
> mouse = lm(Duration ~ log2(A/W + 1), subset=(X.Subject%/%100==1))
The slope of the linear model (146.19 in the case of the mouse) has units of milliseconds per bit. Converting to seconds and taking the reciprocal gives the number of bits per second that you, as participants, were able to specify with each device. (With that slope, the mouse performed at about 6.84 bits/second.) First, submit code that generates one graph plotting the overall data and the regression for each device, and then graphs that show each device individually. Use this colortable to color-code each point by the device used.
> colortable<-c("red","orange","yellow","green","blue","purple","pink","gray")
Here's an example for the mouse. Next, compute the bits per second for each device, and submit a writeup answering the following: Which device was the slowest, and how many bits per second did it achieve? Which was the fastest, and at how many bits per second? Which two were most similar, and what were their values? Which result was most surprising to you?
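To make the slope-to-throughput conversion concrete, here is a minimal sketch on synthetic data. The column names Duration, A, W, and X.Subject come from the model line above; everything else (the `trials` data frame, the `device` and `ID` columns, the `bits_per_second` helper, the made-up slopes, and the idea that the hundreds digit of X.Subject codes every device the way it does the mouse) is an assumption for illustration, not the class data or the official solution.

```r
# Synthetic stand-in for the class data: every number here is invented.
set.seed(1)
n <- 200
trials <- data.frame(
  X.Subject = sample(c(101:105, 201:205), n, replace = TRUE),
  A = sample(c(128, 256, 512), n, replace = TRUE),   # movement amplitude
  W = sample(c(16, 32, 64), n, replace = TRUE)       # target width
)
# Assumption: the hundreds digit of X.Subject is the device code (1 = mouse).
trials$device <- trials$X.Subject %/% 100
trials$ID <- log2(trials$A / trials$W + 1)           # index of difficulty, in bits
made_up_slope <- c(150, 300)                         # ms per bit, one per fake device
trials$Duration <- 200 + made_up_slope[trials$device] * trials$ID +
  rnorm(n, sd = 50)

colortable <- c("red","orange","yellow","green","blue","purple","pink","gray")

# The slope is in ms/bit, so 1000 / slope is bits per second.
bits_per_second <- function(fit) 1000 / unname(coef(fit)[2])

# Fit one linear model per device and convert each slope.
devices <- sort(unique(trials$device))
fits <- lapply(devices, function(d) {
  lm(Duration ~ ID, data = trials[trials$device == d, ])
})
bps <- setNames(sapply(fits, bits_per_second), devices)

# Overall plot: points colored by device, one regression line per device.
pdf(file.path(tempdir(), "fitts-sketch.pdf"))
plot(Duration ~ ID, data = trials, col = colortable[trials$device],
     xlab = "Index of difficulty (bits)", ylab = "Duration (ms)")
for (i in seq_along(devices)) abline(fits[[i]], col = colortable[devices[i]])
invisible(dev.off())

print(round(bps, 2))
```

Fitting on a precomputed ID column rather than writing log2(A/W + 1) inside the formula is just a convenience here; it makes the plot's x-axis match the regression directly.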

Part 2:
The Dow et al. paper measures clickthrough data. We'll start with some exploratory analysis of this dataset. Plot the following, using blue for serial and red for parallel in each graph:
a) days on the x-axis, clicks on the y-axis
b) days on the x-axis, impressions on the y-axis
c) days on the x-axis, click-through rate on the y-axis
Eyeballing the graphs: what changes after about 5 days, and why? (Think about the motives of ad purveyors.) For the first five days (the "clean" days), was the click-through rate of parallel ads significantly different (at the .05 level) from the click-through rate of serial ads? Use Pearson's chi-squared test. Submit code that generates the exploratory graphs and performs the chi-squared test, plus a writeup with a short answer to the question about the change.
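As a rough sketch of what the chi-squared comparison might look like, the snippet below builds a 2x2 table of clicked vs. not-clicked impressions for the two conditions and hands it to chisq.test. The totals are invented for illustration and are not the Dow et al. numbers; the names `clicks`, `impressions`, and `counts` are placeholders.

```r
# Invented totals for the "clean" days; NOT the real data.
clicks      <- c(serial = 120,   parallel = 165)
impressions <- c(serial = 50000, parallel = 52000)

# Rows: outcome; columns: condition. Each impression is either a click or not.
counts <- rbind(clicked = clicks, not_clicked = impressions - clicks)

# Pearson's chi-squared test of independence. For a 2x2 table, chisq.test
# applies Yates' continuity correction by default; pass correct = FALSE
# to get the uncorrected Pearson statistic.
test <- chisq.test(counts)

print(counts)
print(test)
cat("Significant at the .05 level:", test$p.value < 0.05, "\n")
```

Note that the test needs counts, not rates: feeding it click-through rates directly would throw away the sample sizes that determine significance.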

Some functions I found helpful for this part were: sum, colSums, rowSums, subset, rbind, plot, and lines.
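Here is one way a few of those functions might fit together, sketched on fake data. The `ads` data frame and its column names (day, condition, clicks, impressions) are assumptions; the real dataset may be laid out differently, so treat this as a pattern rather than a drop-in solution.

```r
# Fake daily data for two conditions; all values are randomly generated.
set.seed(2)
days <- 1:10
ads <- rbind(
  data.frame(day = days, condition = "serial",
             impressions = rpois(10, 5000), clicks = rpois(10, 12)),
  data.frame(day = days, condition = "parallel",
             impressions = rpois(10, 5000), clicks = rpois(10, 16))
)

serial   <- subset(ads, condition == "serial")
parallel <- subset(ads, condition == "parallel")

# plot() draws the first series and sets up the axes; lines() adds the second.
pdf(file.path(tempdir(), "clicks-sketch.pdf"))
plot(serial$day, serial$clicks, type = "l", col = "blue",
     xlab = "Day", ylab = "Clicks", ylim = range(ads$clicks))
lines(parallel$day, parallel$clicks, col = "red")
invisible(dev.off())

# Total clicks and impressions per condition over the first five days.
clean <- subset(ads, day <= 5)
totals <- rbind(
  serial   = colSums(subset(clean, condition == "serial",
                            select = c(clicks, impressions))),
  parallel = colSums(subset(clean, condition == "parallel",
                            select = c(clicks, impressions)))
)
print(totals)
```

Setting ylim from the combined data keeps the second series from being clipped when its values fall outside the range of the first.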

Collaboration policy: You must complete part 1 individually. You may collaborate on part 2.

Submission guidelines:
Your submission should consist of an email to the staff mailing list. Include attachments named: suid-1.r, suid-2.r, and suid-writeup.pdf. Thanks!

Here is sample code for part 1 and part 2.