Riptide: Fast End-to-End Binarized Neural Networks
Proceedings of the 3rd MLSys Conference, 2020.
, Automatic generation of high-performance quantized machine learning kernels
Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020.
, Probing the physical limits of reliable DNA data retrieval
Nature Communications, Nature Publishing Group 11:1, 2020.
, Stabilizing synthetic DNA for long-term data storage with earth alkaline salts
Chemical Communications, Royal Society of Chemistry, 2020.
, PLink: Efficient Cloud-based Training with Topology-aware Dynamic Hierarchical Aggregation
Proceedings of the 3rd MLSys Conference, 2020.
, Genotype extraction and false relative attacks: Security risks to third-party genetic genealogy services beyond identity inference.
Network and Distributed System Security Symposium (NDSS) 10:29, 2020.
, Molecular digital data storage using DNA
Nature Reviews Genetics, Nature Publishing Group, 2019.
, Perceptual Compression for Video Storage and Processing Systems
Proceedings of the ACM Symposium on Cloud Computing, ACM, 2019.
, Scaling Microfluidics to Complex, Dynamic Protocols
2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), IEEE, 2019.
, High density DNA data storage library via dehydration with digital microfluidic retrieval
Nature Communications, Nature Publishing Group 10:1, 2019.
, Puddle: A Dynamic, Error-Correcting, Full-Stack Microfluidics Platform
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019.
, DNA Data Storage and Hybrid Molecular–Electronic Computing
Proceedings of the IEEE, 2019.
, A Hardware–Software Blueprint for Flexible Deep Learning Specialization
IEEE Micro, IEEE 39:5, 2019.
, Combining Data Longevity with High Storage Capacity—Layer-by-Layer DNA Encapsulated in Magnetic Nanoparticles
Advanced Functional Materials, Wiley Online Library, 2019.
, Demonstration of end-to-end automation of DNA data storage
Scientific Reports, Nature Publishing Group 9:1, 2019.
, DNA assembly for nanopore data storage readout
Nature Communications, Nature Publishing Group 10:1, 2019.
, LightDB: A DBMS for Virtual Reality Video
Proceedings of VLDB, 2018.
, MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators
Design, Automation and Test in Europe, 2018.
Best paper award.
, Correlation Manipulating Circuits for Stochastic Computing
Design, Automation and Test in Europe, 2018.
, Troubleshooting Transiently-Recurring Errors in Production Systems with Blame-Proportional Logging
USENIX Annual Technical Conference (USENIX ATC), 2018.
, TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Symposium on Operating Systems Design and Implementation (OSDI), 2018.
, Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training
Proceedings of the ACM Symposium on Cloud Computing, ACM, 2018.
, Random access in large-scale DNA data storage
Nature Biotechnology, 2018.
ISBN: 1364298980840
, A content-addressable DNA database with learned sequence encodings
International Conference on DNA Computing and Molecular Programming, Springer, 2018.
, DNA data storage and hybrid molecular–electronic computing
Proceedings of the IEEE, IEEE 107:1, 2018.
, Architecture Considerations for Stochastic Computing Accelerators
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 37:11, 2018.
, Learning to optimize tensor programs
Advances in Neural Information Processing Systems (NeurIPS), 2018.
, Iterative Search for Reconfigurable Accelerator Blocks with a Compiler in the Loop
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018.
, Clustering Billions of Reads for DNA Data Storage
NIPS, 2017.
, A Taxonomy of General Purpose Approximate Computing Techniques
IEEE Embedded Systems Letters, 2017.
, The Evolution of Bitcoin Hardware
Computer, IEEE, 2017.
, Celerity: An Open Source RISC-V Tiered Accelerator Fabric
HOTCHIPS, 2017.
, Specializing a Planet's Computation: ASIC Clouds
IEEE Micro, 2017.
, IncBricks: Towards In-network Computation with an In-Network Cache
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
Selected as Honorable Mention for IEEE Micro Top Picks.
, Energy-Efficient Hybrid Stochastic-Binary Neural Networks for Near-Sensor Computing
Design, Automation and Test in Europe, 2017.
, Approximate Storage for Encoded and Encrypted Videos
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
, Similarity Search on Automata Processors
International Parallel and Distributed Processing Symposium (IPDPS), 2017.
, A Hardware-Friendly Bilateral Solver for Real-Time Virtual Reality Video
High Performance Graphics (HPG), 2017.
, Computer Security, Privacy, and DNA Sequencing: Compromising Computers with Synthesized DNA, Privacy Leaks, and More
USENIX Security, 2017.
, Customizing Progressive JPEG for Efficient Image Storage
USENIX HotStorage, 2017.
, Moonwalk: NRE Optimization in ASIC Clouds or, accelerators will use old silicon
Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
, VisualCloud Demonstration: A DBMS for Virtual Reality
SIGMOD, 2017.
, A DNA-Based Archival Storage System
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2016.
Selected for IEEE Micro Top Picks 2016.
, High-Density Image Storage Using Approximate Memory Cells
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2016.
, Disciplined Inconsistency with Consistency Types
ACM Symposium on Cloud Computing (SOCC), 2016.
, ASIC Clouds: Specializing the Datacenter
International Symposium on Computer Architecture (ISCA), 2016.
, Optimizing Synthesis with Metasketches
POPL, 2016.
, Approximate Semantics for Wirelessly Networked Applications
arxiv, 2015.
Earlier version appeared in Workshop on Approximate Computing Across the Stack (WACAS w/ ASPLOS) 2014
, Latency-Tolerant Software Distributed Shared Memory
USENIX Annual Technical Conference (ATC), 2015.
Best Paper Award
, REACT: A Framework for Rapid Exploration of Approximate Computing Techniques
Workshop on Approximate Computing Across the Stack (WAX w/ PLDI), 2015.
, Approximate Program Synthesis
Workshop on Approximate Computing Across the Stack (WAX w/ PLDI), 2015.
, Claret: Using Data Types for Highly Concurrent Distributed Transactions
Workshop on Principles and Practice of Consistency (PaPoC'15 w/ EuroSys), 2015.
, Approximate Computing: Making Mobile Systems More Efficient
Pervasive Computing, IEEE, 2015.
, Debugging Approximate Programs via Dynamic Analysis
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2015.
, SNNAP: Approximate Computing on Programmable SoCs via Neural Acceleration
International Symposium on High-Performance Computer Architecture (HPCA), 2015.
, Data Provenance Tracking for Concurrent Programs
International Symposium on Code Generation and Optimization (CGO), 2015.
, NCAM: Near-Data Processing for Nearest Neighbor Search
MEMSYS, 2015.
, Probability Type Inference for Flexible Approximate Programming
OOPSLA, 2015.
, "Hardware–Software Co-Design: Not Just a Cliche"
SNAPL, 2015.
, Robust chemical preservation of digital information on DNA in silica with error-correcting codes
Angewandte Chemie International Edition, Wiley Online Library 54:8, 2015.
, CortexSuite: A Synthetic Brain Benchmark Suite
International Symposium on Workload Characterization (IISWC), 2014.
, Symbolic Execution of Multithreaded Programs from Arbitrary Program Contexts
SPLASH-OOPSLA, 2014.
, Alembic: Automatic Locality Extraction via Migration
SPLASH-OOPSLA, 2014.
, Exploring Energy Scalability in Coprocessor-Dominated Architectures for Dark Silicon
Transactions on Embedded Computing Systems (TECS), 2014.
, Expressing and Verifying Probabilistic Assertions
Conference on Programming Language Design and Implementation (PLDI), 2014.
, How to Answer “Haven’t We Done This Already?”, and Challenges/Opportunities in Approximate Computing
APPROX Workshops w/ PLDI, 2014.
, Mercury: An Integrated, 3D-Stacked Server Design for Increasing Physical Density of Key-Value Stores
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2014.
, Low-Level Detection of High-Level Data Races with LARD
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2014.
, QualityTime: A Simple Online Technique for Quantifying Multicore Execution Efficiency
International Symposium on Performance Analysis of Systems and Software (ISPASS), 2014.
, General-Purpose Code Acceleration with Limited-Precision Analog Computation
International Symposium on Computer Architecture (ISCA), 2014.
Selected for IEEE Micro Top Picks Honorable Mention 2015.
, Approximate Storage in Solid-State Memories
International Symposium on Microarchitecture (MICRO), 2013.
, EnerJ, the Language of Good-Enough Computing
IEEE Spectrum Feature Article, 2013.
, A Landscape of the New Dark Silicon Design Regime
Micro, IEEE, 2013.
, DNA-based Molecular Architecture with Spatially Localized Components
International Symposium on Computer Architecture (ISCA), 2013.
, Input-Covering Schedules for Multithreaded Programs
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS), 2013.
, Cooperative Empirical Failure Avoidance for Multithreaded Programs
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2013.
, DDOS: Taming Nondeterminism in Distributed Systems
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2013.
, Flat Combining Synchronized Global Data Structures
International Conference on PGAS Programming Models, 2013.
, DR-SNUCA: An Energy-Scalable Dynamically Partitioned Cache
International Conference on Computer Design (ICCD), 2013.
, Exploring Storage Class Memory with Key Value Stores
Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, 2013.
, Skadu: Efficient Vector Shadow Memories for Poly-scopic Program Analysis
Conference on Code Generation and Optimization (CGO), 2013.
, Bitcoin and the Age of Bespoke Silicon
International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), 2013.
, A Study of Virtual Memory Usage and Implications for Large Memory
Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, 2013.
, Time Cube: A Manycore Embedded Processor with Interference-Agnostic Progress Tracking
International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS), 2013.
, Neural Acceleration for General-Purpose Approximate Programs
International Symposium on Microarchitecture (MICRO), 2012.
Selected for IEEE Micro Top Picks 2012.
, The Kremlin Oracle for Sequential Code Parallelization
Micro, IEEE 32:4, 2012.
, Do We Need a Crystal Ball for Task Migration?
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2012.
, Towards Neural Acceleration for General-Purpose Approximate Computing
Workshop on Energy Efficient Design w/ International Symposium on Computer Architecture (WEED w/ ISCA), 2012.
, RADISH: Always-On Sound and Complete RAce Detection In Software and Hardware
International Symposium on Computer Architecture (ISCA), 2012.
, Addressing Dark Silicon Challenges with Disciplined Approximate Computing
Dark Silicon Workshop w/ International Symposium on Computer Architecture (DaSi w/ ISCA), 2012.
, Automatic Empirical Failure Avoidance for Concurrent Software
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS), 2012.
, Architecture Support for Disciplined Approximate Programming
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2012.
, The Case For Merging Execution- and Language-level Determinism with MELD
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS), 2012.
, GreenDroid: An Architecture for the Dark Silicon Age
Asia and South Pacific Design Automation Conference (ASPDAC), 2012.
, Is Dark Silicon Useful? Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse
Design Automation Conference (DAC), 2012.
, GreenDroid: Exploring the next evolution for smartphone application processors
IEEE Communications Magazine, 2011.
, The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future
Micro, IEEE, 2011.
, EnerJ: Approximate Data Types for Safe and General Low-Power Computation
Conference on Programming Language Design and Implementation (PLDI), 2011.
, Isolating and Understanding Concurrency Errors Using Reconstructed Execution Fragments
Conference on Programming Language Design and Implementation (PLDI), 2011.
, On the Impact of Memory Models on Software Reliability in Multiprocessors
Symposium on Principles of Distributed Computing (PODC), 2011.
, Data-Race Exceptions Have Benefits Beyond the Memory Model
Workshop on Memory System Performance and Correctness w/ Conference on Programming Language Design and Implementation (MSPC w/ PLDI), 2011.
, Crunching Large Graphs with Commodity Processors
USENIX Hot Topics on Parallelism (HotPar), 2011.
, Operating System Implications of Fast, Cheap, Non-Volatile Memory
USENIX Hot Topics on Operating Systems (HotOS), 2011.
, Dense Approximate Storage in Phase-Change Memory
Wild and Crazy Ideas w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WACI w/ ASPLOS), 2011.
, RCDC: A Relaxed Consistency Deterministic Computer
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2011.
, The Deterministic Execution Hammer: How Well Does it Actually Pound Nails?
Workshop on Determinism and Correctness in Parallel Programming w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WoDet w/ ASPLOS), 2011.
, Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures
Workshop on Interaction between Compilers and Computer Architectures w/ International Symposium on High-Performance Computer Architecture (INTERACT w/ HPCA), 2011.
, Checked Load: Architectural Support for JavaScript Type-Checking on Mobile Processors
International Symposium on High-Performance Computer Architecture (HPCA), 2011.
, System Introspection with Hardware Watchmachines
Fun Ideas and Thoughts w/ Conference on Programming Language Design and Implementation (PLDI FIT), 2011.
, Efficient Complex Operators for Irregular Codes
International Symposium on High-Performance Computer Architecture (HPCA), 2011.
, Reducing the Energy Cost of Irregular Code Bases in Soft Processor Systems
IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2011.
, Kismet: Parallel Speedup Estimates for Serial Programs
Conference on Object-Oriented Programming, Systems, Language and Applications (OOPSLA), 2011.
, Parkour: Parallel Speedup Estimates from Serial Code
USENIX Workshop on Hot Topics in Parallelism (HOTPAR), 2011.
, An Evaluation of Selective Depipelining for FPGA-based Energy-Reducing Irregular Code Coprocessors
Conference on Field Programmable Logic and Applications (FPL), 2011.
, Kremlin: Rethinking and Rebooting gprof for the Multicore Age
Proceedings of the Conference on Programming Language Design and Implementation (PLDI), 2011.
, QsCores: Configurable Co-processors to Trade Dark Silicon for Energy Efficiency in a Scalable Manner
International Symposium on Microarchitecture (MICRO), 2011.
, Kremlin: Like gprof, but for Parallelization
Principles and Practice of Parallel Programming (PPoPP), 2011.
, A Limit Study of JavaScript Parallelism
International Symposium on Workload Characterization (IISWC), 2010.
, Deterministic Process Groups in dOS
Symposium on Operating Systems Design and Implementation (OSDI), 2010.
, Lock Prediction
USENIX Hot Topics on Parallelism (HotPar), 2010.
, CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.
, DMP: Deterministic Shared Memory Multiprocessing
IEEE Micro Top Picks in Computer Architecture, 2010.
, Encyclopedia of Parallel Computing, Editor: David Padua
Springer, 2010.
, Bridging the Parallelization Gap: Automating Parallelism Discovery and Planning
USENIX Workshop on Hot Topics in Parallelism (HOTPAR), 2010.
, GreenDroid: A Mobile Application Processor for a Future of Dark Silicon
HOTCHIPS, 2010.
, Conservation cores: reducing the energy of mature computations
Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.
, Finding Concurrency Bugs with Context-Aware Communication Graphs
International Symposium on Microarchitecture (MICRO), 2009.
, The Bulk Multicore Architecture for Improved Programmability
Communications of the ACM, 2009.
, The Case for System Support for Concurrency Exceptions
USENIX Hot Topics on Parallelism (HotPar), 2009.
, Two Hardware-based Approaches for Deterministic Multiprocessor Replay
Research Highlights, Communications of the ACM, 2009.
, Concurrency Discovery for Very Large Windows of Execution
Workshop on Parallel Execution of Sequential Programs on Multi-core Architectures w/ International Symposium on Computer Architecture (PESPMA w/ ISCA), 2009.
, DMP: Deterministic Shared Memory Multiprocessing
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2009.
Selected for IEEE Micro Top Picks 2009.
, Self-Powered Processors
Wild and Crazy Ideas w/ International Conference on Architectural Support for Programming Languages and Operating Systems (WACI w/ ASPLOS), 2009.
, SoftSig: Software-Exposed Hardware Signatures for Memory Disambiguation
IEEE Micro Top Picks in Computer Architecture, 2009.
, Atom-Aid: Detecting and Surviving Atomicity Violations
IEEE Micro Top Picks in Computer Architecture, 2009.
, Using Checkpoint-Assisted Value Prediction to Hide L2 Misses
ACM Transactions on Architecture and Code Optimization (TACO), 2009.
, Programming and Debugging Shared Memory Programs with Data Coloring
Workshop on Compilers for Parallel Computing (CPC), 2009.
, SD-VBS: The San Diego Vision Benchmark Suite
IEEE International Symposium on Workload Characterization (IISWC), 2009.
, Explicitly Parallel Programming with Shared-Memory is Insane: At Least Make it Deterministic!
Workshop on Software and Hardware Challenges of Manycore Platforms w/ International Symposium on Computer Architecture (SHCMP w/ ISCA), 2008.
, DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently
International Symposium on Computer Architecture (ISCA), 2008.
, Atom-Aid: Detecting and Surviving Atomicity Violations
International Symposium on Computer Architecture (ISCA), 2008.
Selected for IEEE Micro Top Picks 2008.
, Concurrency Control with Data Coloring
Workshop on Memory Systems Performance and Correctness w/ International Conference on Architectural Support for Programming Languages and Operating Systems (MSPC w/ ASPLOS), 2008.
, BulkSC: Bulk Enforcement of Sequential Consistency
International Symposium on Computer Architecture (ISCA), 2007.
, Implicit Parallelism with Ordered Transactions
Principles and Practice of Parallel Programming (PPoPP), 2007.
, Colorama: Architectural Support for Data-Centric Synchronization
International Symposium on High-Performance Computer Architecture (HPCA), 2007.
, Tiled Microprocessors
Massachusetts Institute of Technology, 2007.
, Runtime Checking for Program Verification
RV, 2007.
, Scalable Cache Miss Handling for High Memory Level Parallelism
International Symposium on Microarchitecture (MICRO), 2006.
, Bulk Disambiguation of Speculative Threads in Multiprocessors
International Symposium on Computer Architecture (ISCA), 2006.
, POSH: A TLS Compiler that Exploits Program Structure
Principles and Practice of Parallel Programming (PPoPP), 2006.
, Are We Ready for High Memory-Level Parallelism?
Workshop on Memory Performance Issues w/ International Symposium on High-Performance Computer Architecture (WMPI w/ HPCA), 2006.
Also appears in SIGMICRO Newsletter selection from WMPI-2006.
, Energy-Efficient Thread-Level Speculation on a CMP
IEEE Micro Top Picks in Computer Architecture, 2006.
, Scalar Operand Networks
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2005.
, Thread-Level Speculation on a CMP Can Be Energy Efficient
International Conference on Supercomputing (ICS), 2005.
Selected for IEEE Micro Top Picks 2005.
, Tasking with Out-of-Order Spawn in TLS Chip Multiprocessors: Microarchitecture and Compilation
International Conference on Supercomputing (ICS), 2005.
Selected for IEEE Micro Top Picks 2005.
, CAVA: Hiding L2 Misses with Checkpoint-Assisted Value Prediction
IEEE Computer Architecture Letters (CAL), 2004.
, Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams
International Symposium on Computer Architecture (ISCA), 2004.
, An Overview Of The Blue Gene/L System Software Organization
Parallel Processing Letters, 2003.
, An Overview Of The Blue Gene/L System Software Organization
International Conference on Parallel and Distributed Computing (Euro-Par), 2003.
, Energy Characterization of a Tiled Architecture Processor with On-Chip Networks
International Symposium on Low Power Electronics and Design (ISLPED), 2003.
, Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures
International Symposium on High Performance Computer Architecture (HPCA), 2003.
, A 16-issue Multiple-Program-Counter Microprocessor with Point-to-Point Scalar Operand Network
IEEE International Solid-State Circuits Conference (ISSCC), 2003.
, Full Circle: Simulating Linux Clusters on Linux Clusters
LCI International Conference on Linux Clusters (CWCE), 2003.
Selected as one of the top 3 papers in the conference.
, Blue Gene/L, a system-on-a-chip
IEEE International Conference on Cluster Computing (CC), 2002.
, An Overview of the Blue Gene/L Supercomputer
IEEE Supercomputing (SC), 2002.
, The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs
IEEE Micro, 2002.
, Cellular Supercomputing with System-on-a-Chip
International Solid State Circuits Conference (ISSCC), 2002.
, An environment for easy cross synchronization of multimedia Web based material
Frontiers in Education, 2000.
, Baring it all to Software: Raw Machines
IEEE Computer, 1997.
, The Raw Benchmark Suite: Computation Structures for General Purpose Computing
IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 1997.
, The RAW compiler project
Proceedings of the Second SUIF Compiler Workshop, 1997.