Thursday, January 28, 2010

The dumbing down of operating systems

Yesterday's announcement of the Apple iPad -- basically an iPhone with a larger screen and no camera -- ushered in a new era for OS design. That is, an OS originally designed for a phone (the "iPhone OS", which is itself a stripped-down version of Mac OS X) is now being ported over to larger, much more capable devices. I have no doubt that the iPhone OS will be very successful on the iPad. It would not surprise me to see it running on laptops and desktops in the near future. This simplified OS eliminates most of the complexity that makes people hate computers: the quagmire of configuration options, keeping up with upgrades, and of course the constant battle against viruses and malware. When I first saw the iPad, I immediately recognized that this is going to be the perfect computer for "non-computer users", like my dear mother-in-law, who pretty much only uses her Windows PC to read email and surf the web. (I'm not sure she's ever saved a file to the hard drive.) But it's also going to be the perfect computer for those of us who just want something that works. Like the original Macintosh, the iPad raises the bar for user-centric computer system design. It is elegant in its minimalism and simplicity.

Still, this trend of dumbing down the OS raises some interesting questions. The iPad OS lacks many features that operating systems have had for decades: multitasking, an open application API and development tools, multiple protection domains -- heck, it doesn't even have a proper filesystem. Arguably, it is this lack of features that makes the iPhone and iPad platforms so attractive to application developers and users alike. This trend suggests very strongly that the feature-rich, complex OSs that we love so much are going to look too baroque, top-heavy, and expensive to survive in a field where the "OS" is little more than GUI gloss over a pretty basic system. (Something tells me that in a few years we're going to get iPad OS v2.0, which builds in some of these "advanced" features that date back to the 1960s.)

Basically, what I'm saying is that the iPad would appear to make most current OS research irrelevant. Discuss.

(Update: Check out this petition that claims that the iPad's DRM is "endangering freedom." I'm not sure I'd go that far, but an interesting perspective nonetheless.)

Tuesday, January 26, 2010

The coming e-book armageddon

I may be jumping the gun a bit here, but with this WSJ article on the potential pricing models for e-books on the Apple Tablet, I am very worried about the future of e-books. Books used to be these paper things that you could get just about anywhere, read without a specialized device, lend to your friends, borrow from a library, scribble in, or use to prop a door open. It seems clear that paper books are about to go the same route as the compact disc, but the e-book industry is getting it all wrong by tying itself to proprietary, closed formats that only work on certain devices from certain vendors. It seems like I hear about a new e-book reader every day on sites like Engadget, but every vendor has the same problem: getting content and agreements with the publishers to support their device. Amazon has done a great job establishing those relationships for the Kindle, and now it seems Apple is going to have to do the same legwork for their tablet. It seems inevitable that we're going to end up with a VHS-versus-Betamax style format war between the two platforms. This does not bode well for the e-book industry.

Let me make it clear that I love e-books. I've been doing a ton of reading on the Kindle App for the iPhone, having recently read Denis Johnson's Tree of Smoke and Cormac McCarthy's The Road (and a few other not-so-great books that don't need my endorsement here) exclusively on the iPhone. (Yes, the tiny screen works -- I can actually read faster with a narrower horizontal view of the text.) I particularly like having a half-dozen or so full-length books in my pocket at all times: if I weren't so busy reviewing papers for program committees I'd probably do more "fun" reading this way. But I really don't know if the Kindle e-books have any staying power. I certainly won't be able to hand them down to my son when he's old enough to read them. And will I have to buy them again, in a different format, for a different device, ten (or five or two) years from now when the next hot e-reader device comes on the market? What a pain.

Publishers should have learned their lesson from the music industry. Anybody remember the early digital Walkman that required you to encode your music in a DRM format and "check out" music from your library on the PC to the portable device? It was a total failure. Although the copy-protected AAC files that Apple sells are tied to its own players (the FairPlay DRM, not AAC itself, is the closed part), iPods are quite happy to play MP3s, and arguably MP3 won the format war. I know full well that MP3 is far from the "best" format from a technical perspective, but pretty much every device I own knows how to play it, and it seems likely that it will stick around for a while. Better yet, there are plenty of open source implementations that can encode and decode the format, so my music library is not locked down.

My final worry is what closed e-book formats mean for accessibility of books and (to risk hyperbole) the wealth of human knowledge. In a decade, maybe only the rich kids with fancy Kindles or iSlates will be able to read the good stuff like Harry Potter, while the poor kids with access only to crummy public libraries will have to make do with yellowing, dog-eared paper (!) copies of Encyclopedia Brown novels. If books really are going all electronic, I think we need to think now about how to avoid creating a technology gap that cuts certain people off from access.

Thursday, January 21, 2010

Sensor networks, circa 1967

What was the first sensor network? Thinking back, I bet most people would guess the early demos by Berkeley and UCLA at places like the Intel Developer's Forum; the Twentynine Palms air-dropped mote demo; and Great Duck Island. These were all around 2002 or so. It turns out this is off by about 35 years -- the first bona fide wireless sensor network was actually deployed in Vietnam, along the Ho Chi Minh Trail, in 1967. It was called Igloo White.

I've been doing a lot of reading about Igloo White lately, and while most of the information on the program is still classified, there are a bunch of articles, books, and a few websites that provide some useful details. Igloo White was a system designed to use seismic and acoustic sensors to detect PAVN movements along the Ho Chi Minh Trail from North Vietnam, through Laos, and into South Vietnam (thereby skirting the DMZ). It consisted of wireless sensors, typically dropped from helicopters and aircraft. In all, nearly 30,000 sensors were deployed during the war. (The image above shows an airman about to drop an Igloo White sensor from a helicopter.)

You can read much more about Igloo White elsewhere, but here are some juicy tidbits that caught my attention. The sensors themselves were shaped like large artillery shells and buried themselves in the ground, with a radio antenna designed to look like the surrounding jungle foliage. They used lithium batteries with an expected lifetime of 30 days. Each sensor performed simple local thresholding and triggered a movement alarm when a certain level of activity was detected. (All of this was done with analog circuitry: at the time, a digital computer was the kind of thing that filled a good portion of a room.)
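
(As an aside, the detection logic itself is simple enough to sketch in a few lines today. The snippet below is a hypothetical modern software analogue of that thresholding-and-alarm step, not a description of the actual Igloo White circuitry; the parameters are made up.)

    # Hypothetical software analogue of threshold-based movement detection.
    # The real sensors did this in analog hardware; parameters here are made up.
    def detect_movement(samples, threshold=0.5, min_hits=10):
        hits = 0
        for s in samples:
            if abs(s) > threshold:
                hits += 1
                if hits >= min_hits:
                    return True   # sustained activity: raise the movement alarm
            else:
                hits = 0          # isolated spikes don't count as movement
        return False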

The sensors would transmit a 2W signal on the UHF band, which would be picked up by orbiting EC-121R aircraft that flew 24/7 on rotating 18-hour missions over various parts of the sensor field. The personnel on board would listen to the transmitted acoustic signals and attempt to classify the targets. They could even identify specific trucks based on the acoustic signature of the engine. Detections were relayed to a top-secret SIGINT center at Nakhon Phanom, Thailand, where the data was stored on IBM 360 computers and processed largely by human analysts. The analysts would then call in air strikes against the targets. Note that in many cases the bombing runs occurred at night, using ground-based radar for bomb guidance, so the pilots never even saw what they were hitting. Presumably they could go from detection to, ahem, interdiction in less than five minutes.

The most incredible thing about Igloo White was the sheer amount of resources that were poured into the program: the whole thing went from concept to reality in little more than a year, and cost more than a billion dollars. The system relied entirely on human analysts to interpret the sensor data, which was often noisy; later in the program the North Vietnamese became adept at spoofing the sensors. The number of people involved in deploying the sensors, monitoring the signals, and interpreting the data was almost inconceivable; the operations center in Thailand was the largest building in Southeast Asia at the time, complete with air conditioning, positive pressure, and airlocks to prevent contaminants from damaging the computers inside. This is not the kind of thing you're going to do with two grad students and a half-million in NSF funding!

The I-REMBASS system from L3 Communications represents the state-of-the-art in (deployed) military sensor networks. And, of course, much of the modern work on sensor nets came out of DARPA programs such as NEST. Still, it's fascinating to see the history behind this technology.

Saturday, January 16, 2010

The Paper Formatting Gestapo

I've recently had two conference submissions "rejected with prejudice" for violating the formatting requirements. In both cases, it was because the lead grad student on the paper didn't read the call for papers carefully enough. (I'll take my share of the blame for not checking it, but sometimes it's pretty hard to check without pulling out a ruler and counting lines of text by hand.) Now, I totally agree with those papers being rejected: it's essential that papers adhere to the formatting requirements. On the other hand, it certainly does not help that every conference uses slightly different guidelines. Here's a typical formatting convention:
Submitted papers must be no longer than 14 single-spaced pages, including figures, tables, and references. Papers should be formatted in 2 columns, using 10 point type on 12 point leading, in a text block of 6.5" by 9".
Another one:
Submissions must be full papers, at most 14 single-spaced 8.5" x 11" pages, including figures, tables, and references, two-column format, using 10-point type on 12-point (single-spaced) leading, with a maximum text block of 6.5" wide x 9" deep with 0.25" intercolumn space.
Yet another (from SIGCOMM 2009):
  • Submissions MUST be no more than fourteen (14) pages in 10 point Times Roman (or equivalent font). This length includes everything: figures, tables, references, appendices and so forth.
  • Submissions MUST follow ACM guidelines: double column, with each column 9.25" by 3.33", 0.33" space between columns.
  • Each column MUST contain no more than 55 lines of text.
  • NOTE: For the submission, you may use the following LaTeX template to ensure compliance.
  • The final copy will be 12 pages using the SIGCOMM standard 9 pt format; this is less than what you might be able to fit in 14 pages at 10pt, and so there is no value in pushing the envelope.
  • Provide an abstract of fewer than 200 words.
  • Number the pages.
  • Do not identify the papers' authors, per the Anonymity Guidelines.
  • On the front page, in place of the authors' names, the paper MUST indicate: the paper ID number assigned during the paper registration process and the total number of pages in the submission.
  • The paper MUST be submitted in PDF format. Other formats (including Postscript) will not be accepted. We must be able to display and print your submission exactly as we receive it, using only standard tools (Adobe Acrobat Reader), with no loading of special fonts.
  • Make sure that the paper prints well on black-and-white printers, not color printers. This is especially true for plots and graphs in the paper.
  • Make sure that the output has been formatted for printing on LETTER (8.5" by 11") size paper.
  • Make sure that symbols and labels used in the graphs are readable as printed, and not only with a 20x on-screen magnification.
  • Try to limit the file size to less than 15 MB.
This is getting silly. Even conferences sponsored by the same organization (say, USENIX or ACM) have different formatting guidelines. In one case, our paper was rejected because it was a resubmission from a previous conference that used an oh-so-slightly different format.

Now, I am all for having a consistent, and firm, formatting requirement for conference submissions. It's really unfair to authors who take pains to adhere to the guidelines when someone else violates them by squeezing too much text into the paper. But isn't it time that we define a single standard for conference paper formatting that everyone uses?

Yes, I know there are various templates out there, but most of these are for the final proceedings, which can vary substantially from the submission format. Even worse, many of the "standard" templates floating around out there don't adhere to the submission guidelines anyway. What would help tremendously would be to have a canonical standard template (in various formats, e.g., LaTeX and Word) that everyone uses. Then it would be trivial to tell if someone had tweaked the formatting since their paper wouldn't look the same as the rest. This is the model used by EWSN 2010 (which required submissions to be in the Springer LNCS format) and it worked very well -- all of the submissions had consistent formatting.

A word on automatic format checkers. Both of the conferences we submitted to were supposed to be using an automated format checker that should have flagged the paper as violating the requirements when we submitted it. In both cases, this failed, and the program chairs (quite rightly!) rejected the paper once they discovered that the formatting was wrong. Unfortunately we were not careful enough and assumed that passing the automated check meant that we had done everything correctly. I like Geoff Voelker's Banal system, but it doesn't always work (mostly the fault of the underlying Ghostscript tool that it's based on). Even doing this kind of thing manually is a big pain -- Adobe Acrobat Pro lets you measure things like text blocks and font sizes, but it's a lot of manual effort.
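
For what it's worth, even a crude script catches the most common violations (page count and paper size) before you submit. Here is a rough sketch that shells out to pdfinfo from poppler-utils; the 14-page and US-letter limits are just examples pulled from the calls quoted above, and a real checker would also need to verify fonts, columns, and line counts.

    # Rough pre-submission sanity check: page count and paper size only.
    # Assumes poppler-utils' pdfinfo is installed; the limits are examples.
    import subprocess, sys

    MAX_PAGES = 14
    LETTER_PTS = (612.0, 792.0)  # 8.5" x 11" at 72 points per inch

    def check(pdf_path):
        info = subprocess.run(["pdfinfo", pdf_path],
                              capture_output=True, text=True).stdout
        fields = dict(line.split(":", 1) for line in info.splitlines() if ":" in line)
        pages = int(fields["Pages"].strip())
        # e.g. "Page size:      612 x 792 pts (letter)"
        width, height = (float(x) for x in
                         fields["Page size"].split("pts")[0].strip().split("x"))
        if pages > MAX_PAGES:
            print(f"FAIL: {pages} pages (limit is {MAX_PAGES})")
        if abs(width - LETTER_PTS[0]) > 1 or abs(height - LETTER_PTS[1]) > 1:
            print(f"FAIL: page size {width} x {height} pts is not US letter")

    if __name__ == "__main__":
        check(sys.argv[1])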

Finally, as a TPC chair I have been on both sides of this, and I always hate rejecting papers due to formatting violations, especially when I know the authors have done good work. Dealing with formatting problems is a huge time sink when you're running a conference, and a standard format would save everyone -- authors, program chairs, reviewers, publication chairs -- a lot of trouble. I think it's time for the systems and networking community to simply define a single standard and get all conferences to use it.

Tuesday, January 12, 2010

The CS Grad Student Lab Manual

Should CS grad students be required to receive formal training in lab technique?

In most scientific disciplines, a great deal of attention is paid to proper experimental design, data collection, and analysis. A grad student in chemistry or biology learns to adhere to a fairly rigid set of procedures for running an experiment (and documenting the procedure). In CS, we largely assume that grad students (not to mention professors and undergrads) somehow magically know how to do these things properly.

When I was a grad student, I more or less figured out how to run benchmarks, collect and record data, document experimental setup, analyze the raw data, and produce meaningful figures on my own. Sure, I had some mentorship from the more senior grad students in my group (and no small amount of pushback from my advisor when a graph would not make sense to him). But in reality, there was very little oversight in terms of how I ran my experiments and collected results. I logged benchmark output to various homebrew ASCII file formats and cobbled together Perl scripts to churn the output. This evolved considerably over time, adding support for gzipped log files (when they got too big), automatic generation of gnuplot scripts to graph the results, and elaborate use of Makefiles to automate the benchmark runs. Needless to say, I am absolutely certain that all of my scripts were free of bugs and that the results published in my papers are 100% accurate.
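
(For what it's worth, the moral equivalent of all that Perl fits in a few lines of Python: a little driver that runs a benchmark command several times and logs the raw output along with the metadata you will wish you had recorded later. This is a sketch, not a recommendation, and the benchmark command is hypothetical.)

    # Sketch of a benchmark driver: run a command N times, log raw output
    # plus the metadata needed to reproduce the run. "./mybench" is made up.
    import json, platform, subprocess, time

    def run_benchmark(cmd, runs=5, logfile="results.log"):
        meta = {"cmd": cmd, "host": platform.node(),
                "date": time.strftime("%Y-%m-%d %H:%M:%S")}
        with open(logfile, "a") as f:
            f.write(json.dumps({"meta": meta}) + "\n")
            for i in range(runs):
                start = time.time()
                out = subprocess.run(cmd, capture_output=True, text=True)
                f.write(json.dumps({"run": i,
                                    "elapsed_sec": time.time() - start,
                                    "stdout": out.stdout,
                                    "returncode": out.returncode}) + "\n")

    # run_benchmark(["./mybench", "--iterations", "1000"])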

In my experience, grad students tend to come up with their own procedures, and few of them are directly verifiable. Sometimes I find myself digging into scripts written by one of my students to understand how the statistics were generated. As an extreme example, at one point Sean Rhea (whom I went to grad school with) logged all of his benchmark results directly to a MySQL database and used a set of complex SQL queries to crunch the numbers. For our volcano sensor network deployments, we opted to log everything using XML and wrote some fairly hairy Python code to parse the logs and generate statistics. The advantage of XML is that the data is self-describing and can be manipulated programmatically (your code walks the document tree). It also decouples the logic of reading and writing the logs from the code that manipulates the data. More recently, students in my group have made heavy use of Python pickle files for data logging, which have the advantage of being absolutely trivial to use, but the disadvantage that changes to the Python data structures can make old log files unusable.
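
To make the tradeoff concrete, here is a toy example (not code from any of our deployments) of logging the same record both ways:

    # Toy illustration of the logging tradeoff; not our deployment code.
    import pickle
    import xml.etree.ElementTree as ET

    result = {"node": 7, "rtt_ms": 42.3}

    # Pickle: trivial to use, but tied to today's Python data structures;
    # rename a field or class and old log files may refuse to load.
    with open("result.pkl", "wb") as f:
        pickle.dump(result, f)

    # XML: more work up front, but self-describing: any program (or human)
    # can walk the tree years later without the original code.
    root = ET.Element("result")
    for key, value in result.items():
        ET.SubElement(root, key).text = str(value)
    ET.ElementTree(root).write("result.xml")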

Of course, all of these data management approaches assume sound experimental technique. Obvious things include running benchmarks on an "unloaded" machine, doing multiple runs to average out measurement noise, and using high-resolution timers (such as CPU cycle counters) when possible. However, some of these things are more subtle. I'll never forget my first benchmarking experience as an undergrad at Cornell -- measuring the round-trip latency of the U-Net interface that I implemented on top of Fast Ethernet. My initial set of runs said that the RTT was around 6 microseconds -- below the fabled Culler Constant! -- beating the pants off of the previous implementation over ATM. I was ecstatic. Turns out my benchmark code had a small bug and was not doing round-trip ping-pongs but rather having both ends transmit simultaneously, thereby measuring the packet transmission overhead only. Duh. Fortunately, the results were too good to be true, and we caught the bug well before going to press, but what if we hadn't noticed?
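
The difference between the two measurements is easy to state in code. Below is a sketch of a correct ping-pong loop over plain UDP sockets (nothing like the original U-Net benchmark, and it assumes the peer echoes every packet back); the buggy version amounts to dropping the recvfrom() and just blasting packets, which measures per-packet send overhead rather than round-trip latency.

    # Sketch of a round-trip "ping-pong" latency measurement over UDP.
    # Assumes the peer at (peer, port) echoes each packet straight back.
    import socket, time

    def ping_pong_rtt(peer, port=9999, msgs=1000):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        payload = b"x" * 32
        start = time.perf_counter()
        for _ in range(msgs):
            sock.sendto(payload, (peer, port))
            sock.recvfrom(64)   # wait for the echo; omit this and you are
                                # only timing the send path
        return (time.perf_counter() - start) / msgs   # average RTT in seconds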

Should the CS systems community come up with a set of established procedures for running benchmarks and analyzing results? Maybe we need a "lab manual" for new CS grad students, laying out the best practices. What do people think?

Startup Life: Three Months In

I've posted a story to Medium on what it's been like to work at a startup, after years at Google. Check it out here.