Cs294-spring08

Administrative Info

  • David Patterson (patterson@cs), Armando Fox (fox@cs), Will Sobel (wsobel@eecs)
  • Mailing list: saas@lists.eecs.berkeley.edu
  • Course number: CS294-23
  • CCN: 27399
  • Where/when: MW 2:30-4, 405 Soda
    • One-time special meeting: Friday 1/25, 2:30, 405 Soda - Rob O'Callahan visiting from Mozilla
  • Units: 2 units without a project (but scribing some lectures), 3 units with a project

Projects (and Weds 3/12 appointment times, in 413 Soda)

  • 2:30 Michael Armbrust (marmbrus@cs), Gunho Lee (gunho@cs) - Characterizing the variance of EC2 machine performance, network topology, etc. and developing recommendations for deployment
  • 2:45 Matei Zaharia (matei@eecs) and Andy Konwinski (konwinski@berkeley) - try to identify a CS262 project that could be joint with this class. Possibility: using the modified/improved Hadoop and the Hadoop trace data as a case study of Michael's findings; how should Hadoop users deploy on EC2. Correlate results for Hadoop to Michael's findings specifically.
  • 3:00 Raluca Sauciuc (sauciuc@eecs) - Concolic testing for Javascript code - improved code coverage compared to current testing techniques. Suggestion: look at widely-used libraries such as Script.aculo.us, Prototype, Google Gears, Adobe Air.
  • 3:15 Ryan Waliani (rwaliani@gmail) - TBD contribute to one of the existing RAD Lab projects in machine learning analysis of data - to discuss on Weds.
  • 3:30 Sonesh Surana (sonesh@cs)
  • 3:45 Jingtao Wang (jingtaow@eecs)

Web 2.0 "service building blocks"

  • Building blocks of Web 2.0 server side
    • Overview of:
      1. Ad serving
      2. checkout/financial txns
      3. geocoding
      4. messaging/IM
    • storage: S3, Dynamo and/or Bigtable paper/speaker
    • data-parallel batch processing (Hadoop/MapReduce)
    • collab filtering: Jordan to suggest overview paper
    • social networking - Facebook
    • search: talk from Lucene
    • content distribution: Coral, Akamai
  • Networking in Web 2.0 (Thacker from MSR)
  • "Classic" vs "utility computing" SaaS
    • Classic: Salesforce, Google Docs; can subdivide into whether requires interactive client latency or not
    • Utility: EC2, Planetlab, Emulab; Virtualization (per-node VMs as well as large scale utility computing)
  • Client side issues
    • Client architecture: binary (Google Earth) vs "Web 1.0" client vs "Web 2.0" rich client (Google Maps/Docs, ?? Dave has reference from HPTS comparing these)
  • Web 2.0 Application case studies
    • Craig Harper - Apisphere - mobile IM with geocoding

Project ideas

In general, each project can have a Phase 1 where it's deployed/benchmarked/measured on our small-scale local cluster(s) and a Phase 2 using large scale EC2 or similar.

  • Is there really no 80/20 rule for how Web pages stress browsers?
  • Is it really true that "most" Javascript is not executed more than once (hence no gain to JIT)? What about using a separate core (on multicore machine) to JIT, making it "free"?
  • How "compliant" (CSS, XHTML, JS extensions, etc.) are different Web sites? Could we provide a browser plugin that figures this out in the background and reports back to a UCB-hosted database, and peer pressure on the publishers of those sites would result in site improvements?
  • Repeatability/consistency of experiments on EC2: what are the distributions of actual allocated CPU, internal latency, etc. (eg: EC2 vs. VM on RAD Lab-owned hardware vs. bare metal)? How could these findings influence the design of VM monitors?
  • Paper design of declarative datacenter markup
  • RoR scalability: how fast can it respond to load spikes?
  • SCADS scaling
  • Performance modeling (Peter/Charles)
  • Using AjaxScope to measure client Javascript performance (eg, implement the "random sampling of instrumentation" idea used in Liblit's work for Javascript apps)
  • Design considerations for new Web 2.0 service (with or without prototype)
  • Propose a way to model a domain and propose a standard for representing some new Web 2.0 service element (eg, analogous to OpenSocial) - eg, content distribution for streaming media
  • Social network service with complex ACL to allow only certain third parties to gain access to the data. Also capable of storing encrypted personal data.
  • application simulator (aka DummyApp) [Peter/George]
    • each server runs multiple worker threads, each of which can execute the following operations: disk, memory, cpu, network, sleep, call, fork/call
    • these servers can be grouped into web application tiers (web servers, app server, storage, ...) and the workload generator generates requests that specify which operations should be executed at each tier
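
The DummyApp idea above could be prototyped along the following lines. This is only a minimal sketch under stated assumptions, not the actual DummyApp design: the names `Server`, `LOCAL_OPS`, `demo`, the request layout (a per-tier `plan` plus a `trace` list), and the tier names are all hypothetical. Only the `cpu`, `memory`, `sleep`, and a blocking `call` operation are modeled; `disk`, `network`, and `fork/call` are left out.

```python
import queue
import threading
import time


def do_cpu(n):
    """Burn CPU for n loop iterations."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def do_memory(n):
    """Allocate n bytes and touch one byte per 4 KiB page."""
    buf = bytearray(n)
    for i in range(0, n, 4096):
        buf[i] = 1
    return len(buf)


def do_sleep(ms):
    """Idle for ms milliseconds."""
    time.sleep(ms / 1000.0)


# Local operations only; "disk" and "network" are omitted from this sketch.
LOCAL_OPS = {"cpu": do_cpu, "memory": do_memory, "sleep": do_sleep}


class Server:
    """One simulated server: a request queue drained by worker threads."""

    def __init__(self, tier, downstream=None, num_workers=2):
        self.tier = tier
        self.downstream = downstream  # server in the next tier, if any
        self.inbox = queue.Queue()
        for _ in range(num_workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, request):
        """Enqueue a request; the returned Event is set when it completes."""
        done = threading.Event()
        self.inbox.put((request, done))
        return done

    def _work(self):
        while True:
            request, done = self.inbox.get()
            try:
                self._handle(request)
            finally:
                done.set()

    def _handle(self, request):
        # Run the operations that the request lists for this server's tier.
        for op, arg in request["plan"].get(self.tier, []):
            request["trace"].append((self.tier, op))
            if op == "call":
                # Blocking call into the next tier (assumes one exists).
                self.downstream.submit(request).wait()
            else:
                LOCAL_OPS[op](arg)


def demo():
    """Send one request through web -> app -> storage and return its trace."""
    storage = Server("storage")
    app = Server("app", downstream=storage)
    web = Server("web", downstream=app)
    request = {
        "plan": {
            "web": [("cpu", 1000), ("call", None)],
            "app": [("memory", 8192), ("call", None)],
            "storage": [("sleep", 1)],
        },
        "trace": [],
    }
    web.submit(request).wait()
    return request["trace"]
```

Calling `demo()` returns the list of (tier, operation) pairs in the order they ran. A workload generator in this style would simply produce many such `plan` dictionaries and submit them to the first tier; a non-blocking `fork/call` could be approximated by calling `submit` without waiting on the returned event.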

Project ideas would be critiqued by one or two invited domain experts.

Papers not yet slotted on any particular date

Approximate syllabus by week

Links to guest speaker talk titles and abstracts

Links to discussion and scribe notes

  1. 1/23 Intro (slides coming soon)
  2. 1/25 (special meeting) Rob O'Callahan
    1. Time/Place: 2:30PM in the Wozniak Lounge, 4th Floor of Soda Hall
    2. Title: Inside Firefox
    3. Scribe: Kuang
    4. Scribe Notes: Armando's, Kuang's
  3. 1/28 Read and discuss: AjaxScope: A Platform for Remotely Monitoring the Client-Side Behavior of Web 2.0 Applications (Emre Kiciman et al., Microsoft Research, SOSP 2007)
    1. Scribe: David Poll
    2. Scribe Notes: Link
  4. 1/30 Chuck Thacker, Microsoft Inc.
    1. Time/Place: 2:00PM in the Wozniak Lounge, 4th Floor of Soda Hall
    2. Title: Rethinking Data Centers
    3. Scribe: David Poll
    4. Scribe Notes: Link
    5. Scribe: Jeremy Schiff
    6. Scribe Notes: Link
  5. 2/4 Thorsten von Eicken, RightScale Inc.
    1. Time/Place: 2:00PM in room 465 (RAD Lab), 4th Floor of Soda Hall
    2. Title: The Future of Software: In the Cloud
    3. Scribe: Bryce Lee
    4. Scribe Notes: Link
  6. 2/6 Class discussion of last 4 speakers (leading to possible projects)
    1. Scribe: Bryce Lee
    2. Scribe Notes: Link
  7. 2/11 Charles Gordon, Amazon Inc.
    1. Time/Place: 2:30pm in 465 (RAD Lab)
    2. Title: IMDb's Architecture and Plans to Improve it
    3. Scribe: Simon Tan
    4. Scribe Notes: Link
  8. 2/13 Chris Olston, Yahoo Inc.
    1. Time/Place: 2:00PM in the Wozniak Lounge, 4th Floor of Soda Hall
    2. Title: "Processing Web-Scale Data with Pig"
    3. Scribe: Simon Tan
    4. Scribe Notes: Link
  9. 2/18 NO CLASS - President's Day
  10. 2/20 Andrew Fikes - Google Inc.
    1. Time/Place: 2:00PM in the Wozniak Lounge, 4th Floor of Soda Hall
    2. Title: Google's Scalable Architecture: GFS, Bigtable, and MapReduce
    3. To read: BigTable: A Distributed Storage System for Structured Data (Fay Chang et al., Google, OSDI '06)
    4. Slides: Andrew-Fikes Google Big-table slides.pdf
    5. Scribe: Andy Konwinski
    6. Scribe Notes: Link
  11. 2/25 Bob Felderman, Google Inc.
    1. Time/Place: 2:00PM, room 465 (RAD Lab)
    2. Title: Datacenter Networking: Feeds, Speeds and Needs
    3. Scribe: Andy Konwinski
    4. Scribe Notes: Link
  12. 2/27 Class discussion of last 4 speakers (leading to projects)
    1. Scribe: Jeff Tang
    2. Scribe Notes: Link
  13. 3/3 Andrew Gordon, Microsoft Research
    1. Time/Place: 2:00PM, room 465 (RAD Lab)
    2. Title: Service Combinators for Farming Virtual Machines
    3. Scribe: Jeff Tang
    4. Scribe Notes: Link
  14. 3/5 Ari Steinberg, Facebook
    1. Time/Place: 2:00PM, In the Wozniak Lounge (4th Floor of Soda Hall)
    2. Title: Facebook Architecture and Abstractions
    3. Scribe: Kurtis Heimerl
    4. Scribe Notes: Link
  15. 3/10 Parallel In Class Project Discussions
    1. Scribe: Sameer Iyengar
    2. Scribe Notes: Link
  16. 3/12 Individual Meeting
  17. 3/17 NO CLASS due to CS Visit Day
  18. 3/19 Project Checkpoint / Midcourse correction
    1. Scribe: Kurtis Heimerl
    2. Scribe Notes: Kurtis's
    3. Scribe: Jeremy Schiff
    4. Scribe Notes: Jeremy's
  19. 3/24 NO CLASS - Spring Break
  20. 3/26 NO CLASS - Spring Break
  21. 3/31 <TBD>
  22. 4/2 Parallel In Class Project Discussions
    1. Scribe: Sameer Iyengar
    2. Scribe Notes: Link
    3. Scribe: Jingtao Wang
    4. Scribe Notes: Link
  23. 4/7 <TBD>
  24. 4/9 Project Checkpoint / Midcourse correction (Cancelled)
  25. 4/14 Project Checkpoint / Midcourse correction
    1. Scribe: Jingtao Wang
    2. Scribe Notes: Link
  26. 4/16 <TBD>
  27. 4/21 <TBD>
    1. Scribe: Kuang Chen
    2. Scribe Notes: Link
  28. 4/23 Early deadline for potential OSDI submissions (5/8 is OSDI drop dead date)
  29. 4/28 No Class
  30. 4/30 Poster Session - Link
  31. 5/5 Term paper deadline Midnight (unless OSDI, then 5/8)
  32. 5/7 (Last day of classes) What did we learn? Reflections on SaaS