Break-out Topics and Talks
Wednesday, October 19, 2011
SESSION I, 10:55am - 12:00 noon
- Security (CSE 305)
- Computing for Social Good (CSE 403)
- Network Systems Here, There, and Everywhere (CSE 691)

SESSION II, 1:25 - 2:30pm
- Personal Robotics (CSE 305)
- Big Data Analytics (CSE 403)
- Games for learning, science, and collaborative problem solving (CSE 691)

KEYNOTE TALK, 2:45 - 3:30pm
- Cloud + Big Data = Massive Change, Massive Opportunity (Atrium)

SESSION III, 3:45 - 4:50pm
- The Future of Search (CSE 305)
- Programming Better (CSE 403)
- The Bug, the Graph, and the Green: Making Computer Systems Better at UW (CSE 691)

POSTER SESSION/RECEPTION, 5:00 - 8:00pm
Session I
Security (CSE 305)
- 10:55-11:00: Introduction and Overview: Yoshi Kohno
- 11:00-11:15: Comprehensive Experimental Analyses of Automotive Attack Surfaces, Karl Koscher
Last year we demonstrated that with access to a car's internal network, an attacker can control many critical components, including the locks, engine, and brakes. This year, we'll talk about how one can get access to a car's network without direct physical access, through malicious media files, Bluetooth, cellular links, and more.
- 11:15-11:30: User-Driven Access Control: Rethinking Permission Granting in Modern Operating Systems, Franziska Roesner
Modern client platforms, such as iOS, Android, and web browsers, run each application in an isolated environment with limited privileges. A pressing open problem in such systems is how to allow users to grant applications access to user-owned resources, e.g., to privacy- and cost-sensitive devices like the camera or to the user's data that resides with various applications. A key challenge is to enable such access in a way that is non-disruptive to users while still maintaining least-privilege restrictions on applications. We propose user-driven access control, whereby permission granting is built into existing user actions, rather than added as an afterthought via manifests or prompts. To this end, we introduce access control gadgets (system-controlled UI elements that applications may embed) for controlling access to user-owned resources.
- 11:30-11:45: Televisions, Video Privacy, and Powerline Electromagnetic Interference, Sidhant Gupta
Modern consumer electronics that make use of switched mode power supplies (SMPS) for increased efficiency produce electromagnetic interference (EMI). This EMI can be sensed using a single low-cost sensor installed at any available electrical outlet in a home. Though such EMI is considered 'noise', our investigation shows that this 'noise' is a rich source of information! In this talk, we will show results from an extensive study that we performed investigating information leakage over the power line infrastructure. In particular, we analyzed EMI from eight televisions (TVs) spanning multiple makes and models. In addition to being of scientific interest, our findings contribute to the overall debate of whether or not measurements of residential powerlines reveal significant information about the activities within a home. We found that the power supplies of modern TVs produce discernible EMI signatures that are indicative of the video content being displayed.
- 11:45-12:00: Security Threat Discovery Cards: A Toolkit for Computer Security Analyses, Tamara Denning
We developed the Security Threat Discovery Cards to help security non-experts investigate the security threats to a system. The Security Threat Discovery Cards are a physical toolkit that scaffolds broad formative thinking about the resources, motivations, and methods of a potential adversary and the system assets that can have human impact.
Computing for Social Good (CSE 403)
- 10:55-11:00: Introduction and Overview: Alan Borning
- 11:00-11:20: Is This What You Meant? Improving Civic Discussion on the Web with Reflect and ConsiderIt, Travis Kriplean
Frustrated by the state of our political discourse? By people speaking past each other and not listening? In this talk, I present two novel approaches for improving large public discussions on the web: ConsiderIt and Reflect. ConsiderIt guides people to reflect on the tradeoffs of an issue through the creation of a personal pro/con list, with the twist that they can adopt points others have already contributed. ConsiderIt then surfaces the most salient pros and cons overall, while also enabling users to drill down into the key points for those who support, oppose or are undecided. Reflect, on the other hand, modifies online comment boards by creating a space next to every comment where people can succinctly summarize and restate points they hear the commenter making. Commenters thus know whether they are being heard and understood, and readers can demonstrate understanding. I will share insights about what happened when Reflect was deployed on the technology news discussion site Slashdot and when ConsiderIt was deployed as the Living Voters Guide during the 2010 U.S. election.
- 11:20-11:40: A Point-of-Care Diagnostic System: Automated Analysis of Immunoassay Test Data on a Cell Phone, Nicola Dell (PDF slides)
Many of the diagnostic tests administered in well-funded clinical laboratories are inappropriate for point-of-care testing in low-resource environments. As a result, inexpensive, portable immunoassay tests have been developed to facilitate the rapid diagnosis of many diseases common to developing countries. However, manually analyzing the test results at the point of care may be complex and error-prone for untrained users reading test results by eye, and providing methods for automatically processing these tests could significantly increase their utility. In this paper, we present a mobile application that automatically quantifies immunoassay test data on a smart phone. The speed and accuracy demonstrated by the application suggest that cell-phone based analysis could aid disease diagnosis at the point of care.
- 11:40-12:00: Design of a Phone-Based Clinical Decision Support System for Resource-Limited Settings, Yaw Anokwa
While previous work has shown that clinical decision support systems (CDSS) improve patient care in resource-limited settings, access to such systems at the point of care is limited. Moreover, even when CDSS are available, compliance with care suggestions remains low. In this talk, I will describe a multi-method approach used to document the types of failures that can affect CDSS implementations. I will present ODK Clinic, a phone-based system created to address these common failures. Informed by early results of a deployment at one of the largest HIV treatment programs in sub-Saharan Africa, I will end with findings relevant to implementers of mobile systems for health care providers in resource-limited settings.
Network Systems Here, There, and Everywhere (CSE 691)
- 10:55-11:00: Introduction and Overview: David Wetherall
- 11:00-11:20: HomeOS, Colin Dixon
Networked devices for the home, including remotely controllable locks, lights, thermostats, cameras, and motion sensors, are now readily available and inexpensive. In theory, this makes scenarios like remotely monitoring cameras from a smartphone or customizing climate control based on occupancy patterns possible and affordable. However, in practice today, such smarthome scenarios are limited to expert hobbyists and the rich. We present HomeOS, a platform that bridges this gap by allowing developers to easily write applications that use diverse sets of commodity devices and enabling non-expert home users to configure and secure their technology. It is based on programming abstractions that are independent of device protocols, management primitives that match how home users want to manage their homes, and a kernel that is agnostic of device and application details to enable easy extensibility. HomeOS already has tens of applications and supports a wide range of devices. It has been running in 12 real homes for 4-8 months, and 41 students have built new applications and drivers independent of our efforts.
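To make the idea of protocol-independent programming abstractions concrete, here is a minimal hypothetical sketch. The interface and method names (LightRole, MotionSensorRole, onMotion, setBrightness) are illustrative stand-ins, not the actual HomeOS API: the application programs against device roles, and per-protocol drivers supply the implementations.

```java
// Hypothetical sketch of a HomeOS-style application; these names are NOT the real HomeOS API.
public class NightLightApp {
    interface LightRole { void setBrightness(int percent); }
    interface MotionSensorRole { void onMotion(Runnable callback); }

    static void install(MotionSensorRole sensor, LightRole light) {
        // The application only sees the role interfaces, so the same code works
        // whether the light speaks Z-Wave, UPnP, or a vendor-specific protocol.
        sensor.onMotion(() -> light.setBrightness(60));
    }

    public static void main(String[] args) {
        // Toy in-memory "drivers" standing in for real devices.
        LightRole light = percent -> System.out.println("light -> " + percent + "%");
        MotionSensorRole sensor = callback -> callback.run();  // fire immediately for the demo
        install(sensor, light);  // prints "light -> 60%"
    }
}
```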
- 11:20-11:40: Augmenting Data Center Networks with Multi-Gigabit Wireless Links, Daniel Halperin (PDF slides)
Today's data centers perform well in the average case, but network hotspots can prove to be a major bottleneck for some workloads because of the oversubscribed design of the network. We explore using the emerging 60 GHz wireless technology to provide dense, reconfigurable, and extremely fast data center links to relieve hotspots and improve performance. We conduct extensive measurements that show that 60 GHz technology is well-suited to use in the data center environment, and find that our dynamic "wireless flyways" system can provide significant performance improvement on workload traces from real production data centers.
- 11:40-12:00: Improving Internet Availability through Automatic Diagnosis and Repair, David Choffnes (PDF slides)
Everyone has experienced Internet outages and service degradation that make the network unusable, and we are far from the five nines of reliability that critical services (e.g., remote health monitoring) require. In fact, we found that outages are common and, most notably, that long-term network problems have a large impact on overall network availability because repairs happen on a human time scale. To address this problem, we are developing a system, called LIFEGUARD, that automatically locates and repairs network outages, replacing the slow, manual troubleshooting process that operators have to rely on today. We build on our recent work on reverse traceroute to pinpoint the network causing an outage. Once we locate the failure, we use a technique known as BGP poisoning to instruct other networks to avoid the problem, enabling immediate re-routing. We designed our approach to work in today's Internet and to allow an edge network to repair problems that affect it without requiring the intervention of the network causing the problem.
Session II
Personal Robotics (CSE 305)
- 1:25-1:30: Introduction and Overview: Joshua Smith
- 1:30-1:50: Seashell effect pretouch sensing, Liang-Ting Jiang
We have created a new type of proximity sensor for robotic manipulation using the same effect that allows you to "hear the sea" when you hold a seashell to your ear: as the seashell moves closer or farther from your head, the sound changes. We have built a small cavity and microphone into each finger of our PR2 robot, which allows the robot to detect the object just before contact. The method works with almost any material, unlike our previous electric field pretouch system. We also demonstrated two applications of this sensor for detecting compliant objects and compensating for incomplete object information before robotic grasping.
- 1:50-2:10: Pile Manipulation, Lillian Chang (PDF slides)
We are investigating strategies for robot interaction with piles of objects and materials in cluttered scenes. Our experiments demonstrate that by combining perception and manipulation, a robot is able to take a pile of unknown objects, sort through them, and put them in a tidy array. The robot pushes each potential object numerous times to determine whether it is a single object, or part of a pile. The accumulation of motion evidence from the pushes determines when items have been separated and reduces the grasp errors that occur with non-singulated piles.
- 2:10-2:30: Brain Computer Interface, Mike Chung
Recent advances in neuroscience and robotics have allowed initial demonstrations of brain-computer interfaces (BCIs) for controlling humanoid robots. However, previous BCIs have relied on high-level control based on fixed pre-wired behaviors. On the other hand, low-level control can be strenuous and impose a high cognitive load on the BCI user. To address these problems, we propose an adaptive hierarchical approach to brain-computer interfacing: users teach the BCI system new skills on-the-fly; these skills can later be invoked directly as high-level commands, relieving the user of tedious lower-level control. We explore the application of hierarchical BCIs to the task of controlling a PR2 humanoid robot and teaching it new skills.
Big Data Analytics (CSE 403)
- 1:25-1:30: Introduction and Overview: Magda Balazinska
- 1:30-1:50: PerfXPlain: Debugging the Performance of MapReduce computations, Magda Balazinska (PDF slides)
From industry to science and government, everyone has a "big data" problem. While there exist increasingly many tools to analyze very large datasets, it is often difficult to get high performance from these tools. The problem is especially challenging when a tool processes data in a large cluster of machines. In the PerfXPlain project, we developed a system that helps users better understand the performance they are getting from a parallel data processing engine such as MapReduce. In PerfXPlain, users ask questions about the relative performances (i.e., runtimes) of pairs of MapReduce jobs, and PerfXPlain explains whether the performance difference comes from configuration parameters, from some properties of the jobs, from the input data, or from the cluster load conditions. PerfXPlain provides a new query language for articulating performance queries and a machine-learning-based algorithm for generating explanations from a log of past MapReduce job executions.
- 1:50-2:10: A Latency and Fault-Tolerance Optimizer for Online Parallel Query Plans, Prasang Upadhyaya
We address the problem of making online, parallel query plans fault-tolerant: i.e., provide intra-query fault-tolerance without blocking. We develop an approach that not only achieves this goal but does so through the use of different fault-tolerance techniques at different operators within a query plan. Enabling each operator to use a different fault-tolerance strategy leads to a space of fault-tolerance plans amenable to cost-based optimization. We develop FTOpt, a cost-based fault-tolerance optimizer that automatically selects the best strategy for each operator in a query plan in a manner that minimizes the expected processing time with failures for the entire query. We implement our approach in a prototype parallel query-processing engine. Our experiments demonstrate that (1) there is no single best fault-tolerance strategy for all query plans, (2) often hybrid strategies that mix-and-match recovery techniques outperform any uniform strategy, and (3) our optimizer correctly identifies winning fault-tolerance configurations.
- 2:10-2:30: Reverse Data Management, Alexandra Meliou (PDF slides)
Database research mainly focuses on forward-moving data flows: source data is subjected to transformations and evolves through queries, aggregations, and view definitions to form a new target instance, possibly with a different schema. This Forward Paradigm underpins most data management tasks today, such as querying, data integration, data mining, etc. We contrast this forward processing with Reverse Data Management (RDM), where the action needs to be performed on the input data, on behalf of desired outcomes in the output data. Some data management tasks already fall under this paradigm, for example updates through views, data generation, data cleaning and repair. RDM is, by necessity, conceptually more difficult to define, and computationally harder to achieve. Today, however, as increasingly more of the available data is derived from other data, there is an increased need to be able to modify the input in order to achieve a desired effect on the output, motivating a systematic study of RDM. We define the Reverse Data Management problem, and classify RDM problems into four categories. We illustrate known examples of RDM problems and classify them under these categories. Finally, we introduce a new type of RDM problem, How-To Queries.
Games for learning, science, and collaborative problem solving (CSE 691)
- 1:25-1:30: Introduction and Overview: Steve Tanimoto (PDF slides)
- 1:30-1:45: Revealing the Problem-Solving Process to Solvers, Tyler Robison (PDF slides)
The process of collaboratively solving a problem can be made transparent by showing solvers not only their current state, but the entire problem-solving process of their team. However, this may increase the complexity of the problem-solving task. This work in progress examines how this complexity can be managed, and how the problem-solving and collaboration can be enhanced, by computing and displaying assessments of the team's progress, collaboration and exploration style. We are currently investigating these tools in the context of a team of solvers working together to achieve a high score in a city-building game.
- 1:45-2:00: Roles in Online Collaborative Problem-Solving, Sandra B. Fan
Working in a team can be hard. Working in a team online can be even harder. In this talk, we discuss new features we are exploring to help make online collaborative problem solving more effective. Our system, CoSolve, is a web platform for modeling problems and allowing large teams of people to explore solutions to these problems. When working in a team, however, not all members may feel free to offer their ideas or critique others' ideas, while others naturally take over and steamroll others' opinions. In our new roles feature of CoSolve, we explore a way to ensure that everyone gets an equal chance to participate.
- 2:00-2:15: Training Visual Perception through Games, Yun-En Liu (PDF Slides and Demo)
Visual perception skills cover your ability to process information you can see. Having better peripheral vision in football, keeping track of surrounding cars while driving, having faster reflexes in action games - these are all examples of visual perception skills. Being able to improve them would be useful for nearly every task we encounter that involves sight. Our goal is to train these skills using a free, online Flash game where players hunt for vampires hidden amongst a crowd of unsuspecting humans, with the vampires' characteristics constantly changing. Moreover, we will use the game as a research platform to understand how different people improve at different visual perception skills, with the eventual goal of creating an adaptive game that can bring relative novices to the same performance level as experts with sufficient game play.
- 2:15-2:30: Game Optimization through Telemetry and Large-Scale Experimentation, Erik Andersen (PDF slides)
Internet telemetry presents exciting new opportunities for exploring user behavior on a massive scale. This talk will describe how we used A/B testing and analytics to optimize player engagement and retention in three video games: Refraction, Hello Worlds, and Foldit. We will present highly counterintuitive design insights that we gained from analyzing the play data of more than 110,000 people. We will show how A/B testing can determine the optimal amount of resources to invest in aesthetic improvements such as music, sound effects, and animations. We will explain how optional game objectives can distract players, driving many of them to quit prematurely, and why secondary objectives should support primary objectives. Finally, we will explore the impact of tutorial design on games of varying complexity and learnability, and why investment in tutorials is not always justified.
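As a rough illustration of the A/B methodology (not the actual telemetry code used in Refraction, Hello Worlds, or Foldit), a game client might randomly assign each new player to an experimental condition and tag every logged event with that assignment, so that engagement and retention can later be compared across conditions; the condition names here are hypothetical.

```java
import java.util.Random;

// Minimal A/B telemetry sketch: assign a condition once, then tag every event with it.
public class AbTestTelemetry {
    enum Condition { WITH_MUSIC, WITHOUT_MUSIC }   // hypothetical experiment arms

    private static final Random RNG = new Random();

    static Condition assign(String playerId) {
        Condition c = RNG.nextBoolean() ? Condition.WITH_MUSIC : Condition.WITHOUT_MUSIC;
        log(playerId, "assigned", c);
        return c;
    }

    static void log(String playerId, String event, Condition c) {
        // In a deployed game this record would be sent to a telemetry server, not printed.
        System.out.printf("%d,%s,%s,%s%n", System.currentTimeMillis(), playerId, c, event);
    }

    public static void main(String[] args) {
        String player = "player-42";
        Condition c = assign(player);
        log(player, "level_1_complete", c);
        log(player, "quit", c);
    }
}
```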
Session III
The Future of Search (CSE 305)
- 3:45-3:50: Introduction and Overview: Oren Etzioni
- 3:50-4:20: The Future of Search, Oren Etzioni
A few months ago I wrote that "we could soon view keyword search with the same nostalgia and amusement we reserve for vinyl records and electric typewriters." These words were meant to be provocative, but perhaps they seem less outlandish with the recent release of Siri's new incarnation in the iPhone 4S. This talk will provide an overview of information extraction and related natural-language processing technologies that are the basis for a new generation of search engines.
- 4:20-4:50: Extracting a Calendar from Twitter, Alan Ritter
Recently there has been an explosion in the number of users posting short "status update" messages on social networking websites such as Facebook and Twitter. Although noisy and informal, this new style of text represents a valuable source of information not available elsewhere: it provides the most up-to-date information on events unfolding in the world. Discovering useful information in this stream of raw unedited text is not easy, however, as large numbers of irrelevant and redundant messages can easily lead to information overload. No person can read each of the hundreds of millions of Tweets written each day, motivating the need for automatic text processing techniques to extract and aggregate the most important information. Of course, this dynamically changing source of real-time information is already being mined using keyword extraction techniques; for example, the "trends" displayed on Twitter's website provide a list of phrases which are frequent in the current stream of messages. In order to move beyond a flat list of phrases, we have been investigating the feasibility of applying Natural Language Processing (NLP) and Information Extraction techniques to produce more structured representations of events, which enable various queries and visualizations of the data. A key challenge is the noisy and informal nature of this data; unlike edited texts such as news articles, status messages contain frequent misspellings and abbreviations, inconsistent capitalization, unique grammar, and so on. To deal with these issues, we have been building a manually annotated corpus of Tweets, which is used to train Twitter-specific NLP tools. As a demonstration of their utility, the resulting tools are combined to produce a calendar of popular events occurring in the near future, which can be viewed at http://statuscalendar.com.
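As a rough sketch of the final aggregation step only (the hard part, extracting event phrases and resolving dates with Twitter-specific NLP tools, is not shown), popular calendar entries can be obtained by counting how often each extracted phrase is mentioned for each resolved date; the class and method names below are illustrative, not from the actual system.

```java
import java.util.*;
import java.util.stream.Collectors;

// Aggregates (date, event phrase) pairs and reports the most-mentioned events per date.
public class EventCalendar {
    // date string -> (event phrase -> mention count)
    private final Map<String, Map<String, Integer>> countsByDate = new TreeMap<>();

    // Called once per (resolved date, extracted event phrase) pair from a tweet.
    void record(String date, String eventPhrase) {
        countsByDate.computeIfAbsent(date, d -> new HashMap<>())
                    .merge(eventPhrase, 1, Integer::sum);
    }

    // The k most-mentioned event phrases for a given date.
    List<String> topEvents(String date, int k) {
        return countsByDate.getOrDefault(date, Collections.emptyMap()).entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        EventCalendar cal = new EventCalendar();
        cal.record("2011-10-31", "halloween party");
        cal.record("2011-10-31", "halloween party");
        cal.record("2011-10-31", "zombie concert");
        System.out.println(cal.topEvents("2011-10-31", 2));  // [halloween party, zombie concert]
    }
}
```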
Programming Better (CSE 403)
- 3:45-3:50: Introduction and Overview: Michael Ernst
- 3:50-4:10: Building and Using Pluggable Type-Checkers, Werner Dietl
This talk describes practical experience building and using pluggable type-checkers. A pluggable type-checker refines (strengthens) the built-in type system of a programming language. This permits programmers to detect and prevent, at compile time, defects that would otherwise have been manifested as run-time errors. We built a series of pluggable type checkers using the Checker Framework, and evaluated them on 2 million lines of code, finding hundreds of bugs in the process. We also observed 28 first-year computer science students use a checker to eliminate null pointer errors in their course projects. Overall, we found that the type checkers were easy to write, easy for novices to productively use, and effective in finding real bugs and verifying program properties, even for widely tested and used open source projects.
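For example, the Checker Framework's nullness checker catches possible null dereferences at compile time. In the sketch below (the class and method names are hypothetical, and annotation package names and the exact javac invocation vary across Checker Framework versions), the commented-out line is rejected by the checker, while the guarded call is accepted.

```java
import org.checkerframework.checker.nullness.qual.Nullable;

public class Directory {
    // The return type is annotated @Nullable, so the nullness checker requires
    // every caller to handle the null case before dereferencing the result.
    static @Nullable String lookupEmail(String name) {
        return name.equals("alice") ? "alice@example.org" : null;
    }

    public static void main(String[] args) {
        String email = lookupEmail("bob");
        // System.out.println(email.length());   // rejected at compile time: possible null dereference
        if (email != null) {
            System.out.println(email.length());  // accepted: null has been ruled out
        }
    }
}
```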
- 4:10-4:30: Inferring Invariant-Constrained Models from Existing Instrumentation, Ivan Beschastnikh (PDF slides)
Computer systems are often difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process and documentation is often incomplete and out of sync with the implementation. Synoptic is a tool that helps developers by inferring a concise and accurate system model. Unlike most related work, Synoptic does not require developer-written scenarios, specifications, negative execution examples, or other complex user input. Synoptic processes the logs most systems already produce and requires developers only to specify a set of regular expressions for parsing the logs. Synoptic has two unique features. First, the model it produces satisfies three kinds of temporal invariants mined from the logs, improving accuracy over related approaches. Second, Synoptic uses refinement and coarsening to explore the space of models. This improves model efficiency and precision, compared to using just one approach. We empirically evaluated Synoptic through two user experience studies and found that Synoptic-generated models helped developers to verify known bugs, diagnose new bugs, and increase their confidence in the correctness of their systems. None of the developers in our evaluation had a background in formal methods, but all were able to easily use Synoptic and detect implementation bugs in as little as a few minutes.
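The temporal invariants mentioned above are simple relations over event types, such as "a is always followed by b". The sketch below is only an illustration of mining one such relation from parsed log traces, not Synoptic's implementation.

```java
import java.util.*;

// Illustrative check of one temporal invariant: event `a` is always eventually
// followed by event `b` within every trace. Each trace is the sequence of event
// types parsed (e.g., by regular expressions) from one execution's log.
public class AlwaysFollowedBy {
    static boolean holds(List<List<String>> traces, String a, String b) {
        for (List<String> trace : traces) {
            for (int i = 0; i < trace.size(); i++) {
                if (trace.get(i).equals(a)
                        && !trace.subList(i + 1, trace.size()).contains(b)) {
                    return false;   // found an occurrence of `a` with no later `b`
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> traces = Arrays.asList(
                Arrays.asList("open", "read", "close"),
                Arrays.asList("open", "close"));
        System.out.println(holds(traces, "open", "close"));  // true
        System.out.println(holds(traces, "read", "open"));   // false
    }
}
```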
- 4:30-4:50: Verification games: Crowd-sourced software verification, Michael Ernst (PDF slides)
Software verification is typically an arduous process performed by highly-skilled, highly-paid software engineers. We would like to change that by making software verification as fun as playing a game that does not require any knowledge of computing. We are building a system that converts a program into a game. When the player solves a level, the game board state can be converted into a proof of correctness for the program. I will demo the game and explain how solving it is equivalent to software verification.
The Bug, the Graph, and the Green: Making Computer Systems Better at UW (CSE 691)
- 3:45-3:50: Introduction and Overview: Luis Ceze
- 3:50-4:05: Good-Enough Computing: Saving Energy by Making Mistakes, Adrian Sampson (PDF slides)
Computer systems traditionally spend energy to guarantee consistent, correct behavior. However, many modern applications do not require perfect correctness. An image renderer, for example, can tolerate occasional pixel errors -- human viewers are likely not to notice a few "mistakes" in a rendered image. This kind of application can benefit from relaxed correctness if it can be used safely. This talk gives an overview of several projects at UW that explore the benefits, especially in energy efficiency, of computers that can occasionally make mistakes. A programming language, EnerJ, makes this "approximate computing" safe to use. Approximation-aware CPU architectures and discrete accelerators can be extremely low-power. Disciplined approximate computing has potential in software stacks and network communication.
- 4:05-4:20: Aviso: Reliably Executing Broken Concurrent Software, Brandon Lucia
Programming errors make computer systems unreliable, and concurrent software is especially difficult to write correctly. As concurrent programming becomes the norm, making systems reliable only gets harder. In this talk I will discuss Aviso, a technique we developed that avoids failures in concurrent programs. Aviso monitors deployed software as it runs. When a crash occurs, Aviso looks back at a history of the program's execution and finds sequences of events that are likely to have caused the crash. Aviso encodes each sequence as a state machine that it can use to prevent the sequence from occurring in future program runs. If a sequence of events caused a failure, its corresponding state machine will prevent that failure. We call these state machines "SHIELDs" -- State-machines for Hiding Interactions Ending in Likely Defects. Aviso automatically vets SHIELDs to find the ones that most increase reliability. Aviso can distribute effective SHIELDs to many instances of the same program to share failure avoidance capability. Aviso works entirely without the programmer's help, from when the program crashes to when Aviso distributes the SHIELD that avoids the crash. Using our high-performance software runtime implementation, we have shown that Aviso can dramatically improve the reliability of real software with low performance overhead.
- 4:20-4:35: Crunching Large Graphs with Commodity Processors, Jacob Nelson
Crunching large graphs is the basis of many emerging applications, such as social network analysis and bioinformatics. Graph analytics algorithms exhibit little locality and therefore present significant performance challenges. Hardware multithreading systems (e.g., Cray XMT) show that with enough concurrency, we can tolerate long memory latencies. Unfortunately, this solution has not been available with commodity parts. Our goal is to develop a latency-tolerant system built on top of a commodity cluster. This talk will describe our approach and highlight a number of the components of our system, including a runtime that supports a large number of lightweight contexts, lightweight synchronization, and a memory manager that provides a high-bandwidth distributed shared memory. - 4:35-4:50: Injectable virtual machines: Hunting concurrency bugs for fun and profit, Mark Oskin
In this talk I'll describe Jinx, a commercial product by the UW spinoff Corensic. Jinx is a software quality tool designed to ferret out elusive concurrency errors in software. The user experience is deceptively simple: enable Jinx, and suddenly rare concurrency bugs in applications are happening all the time. Underneath the hood, however, lies much technology and several unusual systems concepts, including:
*Injectable Hypervisor* Traditional hypervisors load onto a bare-metal machine and then load operating systems into virtual machines they manage. Jinx reverses this process, and worms itself in as a hypervisor underneath an already-booted operating system. Not only does this create new implementation challenges, such as the lack of control over which devices the guest operating system believes are in the system, but also new systems concepts, such as the co-operative nature of resource management between the hypervisor and the guest. Thus, unlike traditional hypervisors, injectable hypervisors are a new beast, much closer to a co-operating system for the existing, already-booted OS.
*Deterministic Multiprocessing* Traditionally, multiprocessors (and multicores) execute non-deterministically: given the same input, a multithreaded program can produce different output each time it is executed. Jinx can programmatically make the entire guest virtual machine deterministic using technology developed here at the University of Washington. It does this by layering an abstraction over the paging system, in effect shunting a deterministic coherence protocol into the virtual address mapping process.
*Concurrency bug amplification* The microarchitecture of multiprocessor/multicore machines has a natural bug-masking effect. Caches give threads significant false atomicity: once a processor gains exclusive access to a cache line, it can execute load/store operations on it far more quickly than a remote processor, which must coax the owning cache to cough up exclusive control of the line. Jinx executes the guest virtual machine by periodically looking for shared memory interactions in copy-on-write snapshots of it. It then uses the knowledge about these shared memory interactions to carefully orchestrate them in the real virtual machine the user is interacting with. This orchestration is currently designed to ferret out buggy thread interactions, but one can imagine other possibilities, such as making them never occur.
Given the brevity of this talk, the focus will be on injectable hypervisors and concurrency bug amplification.
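To illustrate the kind of rarely-manifesting bug that amplification targets (this is a toy example, not Corensic code), the following check-then-act race almost always behaves correctly, but occasionally loses a write when two threads interleave between the null check and the assignment:

```java
// Toy check-then-act race: both threads may see `buffer == null`, each creates
// its own StringBuilder, and one thread's append can be lost when the other
// thread's assignment overwrites the shared reference.
public class LazyInitRace {
    private static StringBuilder buffer;   // shared, unsynchronized

    static void append(String s) {
        if (buffer == null) {
            buffer = new StringBuilder();
        }
        buffer.append(s);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> append("a"));
        Thread t2 = new Thread(() -> append("b"));
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(buffer);   // usually "ab" or "ba"; under a rare interleaving one append is lost
    }
}
```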