EECS-395/495 Internet-Scale Experimentation, Winter 2016

Announcements

Remember to check this page (and Piazza) regularly!

Administrative Information

Professor

Fabián E. Bustamante
Technological Institute, L465
+1 847 491-2745

Location and Time

Lectures: Tuesdays and Thursdays 12:30-1:50PM
Tech Institute LG66

Professor Office Hours: by appointment

Catalog Description

Internet-scale Experimentation is a graduate-level seminar exploring the challenges of large-scale networked system experimentation and measurements.

Course Prerequisites

EECS 340 "Intro to Computer Networking" or EECS 345 "Distributed Systems".

If you have taken similar courses somewhere else or have not taken any of these courses, but would like to register for this seminar, please contact me.

In compliance with Section 504 of the 1973 Rehabilitation Act and the Americans with Disabilities Act, Northwestern University is committed to providing equal access to all programming. Students with disabilities seeking accommodations are encouraged to contact the office of Services for Students with Disabilities (SSD) at +1 847 467-5530 or by email. SSD is located in the basement of Scott Hall. Additionally, I am available to discuss disability-related needs during office hours or by appointment.

How would you ...

  1. Evaluate the effectiveness of a feature you added to your startup's new app?
  2. Understand the tense relationship between Netflix and ISPs?
  3. Characterize the impact of population growth on urban spaces?
  4. Understand what determines the quality of experience of Internet users?
  5. Measure the consequences of network censorship on user experience?

The answers to this seemingly disparate set of questions share a common requirement: carrying out experimentation at Internet scale.

Internet-scale Experimentation is a graduate-level seminar exploring the challenges of large-scale networked system experimentation and measurements. Over the last few decades, networked systems have become an integral part of everyday life and a critical piece of our economic, educational, health and defense systems. This fact is normally brought up as evidence of the success and broader impact of our field of work.

The other, typically avoided, side of the story is the complications that this scale creates for experimentalists. Today it is virtually impossible to run a randomized controlled experiment at even a fraction of the scale of many of our systems. Despite this, as we explore new ideas in these uncharted territories, we are reasonably asked to provide better evidence of the effects of interventions. In this seminar we will discuss ongoing projects on networked systems experimentation and their applications, in wired and wireless settings, that address some of these challenges.

The class consists of two major components: reading and reviewing papers, and doing a research project of your own. For the research part of the course, you will have the chance to work with (and expand) some existing platforms and datasets as you formulate and try to answer these and other interesting questions at Internet scale.

Topics

  • Introductory notes: Internet architecture, practical issues and good practices for Internet-scale experimentation
  • Experimental platforms: Experimental design, context of experiments
  • Experimental design
  • End-to-end and up-the-stack
  • Network infrastructure
  • Traffic
  • Applications and distributed services: DNS, Web, P2P, VoD, OSN, ...
  • Botnets and other maladies
  • Security and ethical issues

Course Organization

The course is organized as a series of paper discussions and a single term-long project.

Most class meetings will be centered around two paper presentations and discussion. You should read each paper before coming to class and be prepared to discuss it.

I will post a question on Piazza about each paper 24 hours before class. Your answer need only be long enough to demonstrate that you understand the paper; a paragraph or two should be enough. I will check your answers to make sure they make sense, and they will count for part of the paper discussion grade. Please make sure to post your answers as private!

The class will run as a mini-conference with you as the Program Committee members. We will use the papers included in the schedule as our set of submissions. Each of you will write reviews for 3-4 of them. We will discuss the papers in a two-part PC meeting (around midterm and the end of the quarter) to decide which papers "should be accepted" for publication.

While there is no textbook for the course, a great book on Internet measurement is:
M. Crovella and B. Krishnamurthy, Internet Measurement: Infrastructure, Traffic and Applications, Wiley 2006.

Communication Channels

There are a number of communication channels set up for this class:

  • We will use the course web site to post announcements related to the course. You should check this regularly for schedule changes, clarifications and corrections to assignments, and other course-related announcements.
  • We will use Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the TA, and me. Rather than emailing questions to the teaching staff, I encourage you to post your questions on our Piazza for ISE.
  • There is always email for questions that would be inappropriate to post on the newsgroup/discussion-board. When using email to contact the staff please start your subject line with "ISE: helpful-comment" to ensure a prompt response.

Grading

I use a criterion-referenced method to assign your grade; in other words, your grade will be based on how well you do relative to predetermined performance levels, rather than in comparison with the rest of the class. Thus, if a test has 100 possible points, anyone with a score of 90 or greater will get an A, those with scores of 80 or greater will get a B, those with scores of 70 or greater will get a C, and so on. Notice that this means that if everyone works hard and scores 90 or above, everyone gets an A.

Total scores (between 0 and 100) will be determined, roughly, as follows:

  • Paper discussion participation (and questions) 10%
  • Paper review and PC meeting participation 20%
  • Paper presentation 20%
  • Project 50%
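As a rough illustration of the arithmetic, the minimal Python sketch below combines the four components into a total score and maps it to a letter using the cutoffs described above. The component names, the assumption that each component is scored on a 0-100 scale, and the "below C" label for totals under 70 are illustrative choices, not part of the course policy; the weights themselves are only approximate, as noted above.

```python
# Approximate weights from the list above (percent of the total score).
# The dictionary keys are illustrative names, not official labels.
WEIGHTS = {
    "discussion": 10,    # paper discussion participation (and questions)
    "reviews": 20,       # paper review and PC meeting participation
    "presentation": 20,  # paper presentation
    "project": 50,       # term project
}


def total_score(component_scores):
    """Combine per-component scores (each assumed 0-100) into a 0-100 total."""
    weighted = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return weighted / 100


def letter_grade(total):
    """Map a total to a letter using fixed, criterion-referenced cutoffs."""
    if total >= 90:
        return "A"
    if total >= 80:
        return "B"
    if total >= 70:
        return "C"
    # Cutoffs below 70 are not spelled out in the text ("and so on").
    return "below C"


# Hypothetical student, for illustration only.
example = {"discussion": 100, "reviews": 90, "presentation": 80, "project": 90}
print(total_score(example), letter_grade(total_score(example)))
```

Because the cutoffs are fixed in advance, one student's letter does not depend on anyone else's total.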

Work in progress!

Week    Dates          Topics and Reading
1       01/07, 01/12   Introductory notes: Class organization, Internet architecture, practical issues and good practices for Internet-scale experimentation
2       01/14, 01/19   Measurement and Experimentation Platforms
3       01/21, 01/25   Internet topology
4       01/28, 02/02   Routing and reachability
5       02/09, 02/11   Internet Traffic
6       02/12          Outages and Reliability
7       02/16, 02/18   Broadband | Mobile
X       02/19          First PC Meeting
8       02/23, 02/25   Applications
9       03/01, 03/03   Censorship
10      03/08, 03/10   Security | Second PC Meeting (03/10)
FINALS  03/??          Finals Week - Final project presentations

Reading, writing and presenting papers

Reading and Answering Questions

We will be reading two or more papers per week. The papers will be first presented to the group by one or more students and then discussed in a round-table manner.

To ensure lively discussions, you will be responsible for reading the assigned papers before each class. I will post a question about each paper 24 hours before class. Your answer need only be long enough to demonstrate that you understand the paper; a paragraph or two should be enough. I will check your answers to make sure they make sense, and they will count for part of the paper discussion grade.

You may find the following documents useful:

Writing reviews

At one time or another, every researcher is asked to review papers submitted for publication at a conference or journal, a process known as peer review. We will work on this skill by running a mini-conference, WINE (We do INternet Experimentation) 2016 (access is restricted to Northwestern).

All class members will be part of the "Program Committee" for our mini-conference and we will consider all papers listed in our schedule as our submissions.

Each paper will receive three reviews and each PC member will be responsible for writing 3-4 reviews (you are welcome to write additional reviews). We will discuss all papers in a two-part PC meeting (around midterm and near the end of the quarter) to decide which papers "should be accepted" for publication. Each paper discussion will be led by one of the reviewers (assigned by the PC chair).

You may find the following documents useful:

To enter your reviews, go to WINE 2016 (access is restricted to Northwestern).

Presenting

Most class meetings will be centered around a paper presentation and discussion. Each student will be responsible for presenting one of the papers in the schedule (so, if you haven't yet, please email me three ranked options).

Giving a good presentation is hard work. Please make sure to allocate enough time to prepare for yours. There are some good pointers around that you may want to look at.

Here is an incomplete list of dos and don'ts:

  • Don't try to present the whole work; remember the talk is just a taster.
  • Think of your primary audience to decide what/what not to expand on.
  • Use examples to motivate the work and approach, and illustrate the key points.
  • Don't put too much on a slide - prune and then prune again.
  • Don't put too much on a slide - just one figure/graph per slide!
  • Don't put too much on a slide - don't waste the header/slide title!
  • Careful with use of animation - not for show, just for clarity
  • Please put numbers in your slides
  • Seriously consider dropping the typical "overview/roadmap" slide
  • Saying enough without saying too much - enough depth to convey your ideas, not so much as to overwhelm your audience

Projects

There will be one single project on which you will work throughout the quarter - this is a critical component of the course. Your goal is to propose and tackle a research problem that requires the use of Internet-scale experimentation.

Projects must be written up in a term paper (due during finals week) and teams will present their results at the end of the course in a systems class mini-conference. Project ideas will be suggested by the instructor, but you are strongly encouraged to come up with your own ideas. Based on the topic of your project, you will be assigned a project leader to help you through the quarter (you will meet weekly with them).

This is the schedule of meetings and deliverables (this is mainly to ensure steady progress):

  • Form a group: First week.
  • Project meeting with instructor: Second week.
  • Project proposal posted on Piazza: Third week. You should read the CSP project startup or look at the questions that any project proposal should answer (the Heilmeier "Catechism").
  • Midterm presentation and report: Fifth week. The presentation should be 4 slides long, including (1) Project name and team members, (2) Revised statement of project goals and list of new/interesting concepts to be investigated, (3) List of issues addressed and pending, and (4) Updated project milestones, highlighting accomplishments to date, and schedule for the rest of the quarter.
  • Final presentation: Last class
  • Final report: Monday of finals week.
  • Review of another group's report: Wednesday of finals week.

The final report has to conform to the format used by the Workshop on Hot Topics in Networks. Reports should be no longer than 6 pages (you can use appendices or a webpage to document details). The following structure is suggested:

  • Abstract: What did you do, why is it important and what are your high-level results?
  • Problem statement: What is the problem you tried to solve?
  • Prior work: How have others addressed the problem before, and why was that not enough?
  • Research approach: What was your approach to solving the problem? What did you design and build? What was your experimental methodology?
  • Results: What were your results? How did you evaluate your work? What were your figures of merit?
  • Lessons learned and future work: If you knew what you know now, what would you do differently? What questions are left for future work?
  • Summary and conclusions.

Resources

RIPE Atlas slides