2016 IARPA Advanced Processor Developments

Wednesday, Oct. 19, 2016 from 1:00-5:00 PM, Hilton San Diego
IARPA Special Workshop; not part of the official ICRC program, but open to all attendees.

This workshop reviews recent IARPA research on advanced analog processors and their plausible potential to extend speed and power performance beyond what conventional algorithms achieve on conventional CMOS, and in some cases beyond the reach of classical technology entirely. Applications of interest include machine learning training and inference, streaming filtering and selection, streaming imagery analysis, and combinatorial optimization for fault diagnosis.


Time        Speaker                        Title
1:00-1:30   A. Jamie Kerman, MIT LL        Advanced quantum annealing hardware development for IARPA QEO
1:35-2:05   Bettina Heim, ETH              Beyond the reach of classical
2:10-2:40   Helmut Katzgraber, TAMU        Predicting quantum advantage and raising the bar for quantum architectures
2:45-3:00   BREAK
3:00-3:30   John Realpe-Gómez, NASA Ames   Quantum annealing for sampling and machine learning applications
3:35-4:05   Zeb Barber, S2 Corporation     S2 Optical Processors for Multi-Terabit Streaming Filtering and Selection and 2D Imagery Analysis
4:10-4:40   John Paul Strachan, HPE        The dot-product engine: using memristor crossbars as computational accelerators
4:45        Adjourn


Speaker:      Andrew J. Kerman, Ph.D.
                     Quantum Information and Integrated Nanosystems Group
                     MIT Lincoln Laboratory, LI-276

Title:             Advanced quantum annealing hardware development for IARPA QEO

Abstract
Classical optimization methods are a vital resource in virtually every large-scale, complex human endeavor, from timely delivery of our mail to the planning of deep-space exploration missions. Any technology which can provide substantial improvement in optimization efficiency or effectiveness therefore has the potential for enormous practical impact. Because of this, quantum approaches to optimization, such as that pioneered by D-Wave Systems, have generated strong interest from the academic community, private industry, government, and even the popular press. This is true despite the fact that, unlike quantum computation, where an exponential speedup over the best known classical methods exists for a few problems (most notably integer factoring via Shor’s algorithm), the computational power of quantum optimization remains virtually unknown theoretically.

Extensive investigations of the behavior and performance of the D-Wave quantum annealing machines have been carried out since their introduction in 2010, across a wide variety of problems (though only at the relatively small problem sizes that the present-day hardware can accommodate). Although there is still no clear indication from this research whether quantum annealing can have an important practical impact, much has been learned about the limitations of the existing machines and their technology. Furthermore, many avenues have been identified which may enable significantly broader investigation into the true potential of quantum annealing. Starting in 2014, the IARPA Quantum Enhanced Optimization (QEO) activity has been laying the technological and theoretical foundations for such an investigation. In this presentation, I will describe some of the hardware and architectural concepts and technology being developed in QEO, and discuss their potential to go far beyond current capabilities.


Speaker:      Bettina Heim
                     ETH

Title:             Beyond the reach of classical

Abstract
TBD


Speaker:      Helmut G. Katzgraber
                     Department of Physics & Astronomy, Texas A&M University
                     Santa Fe Institute

Title:             Predicting quantum advantage and raising the bar for quantum architectures

Abstract
Because of recent evidence that native random spin-glass problems are not well suited for benchmarking purposes, efforts in the search for quantum speedup have shifted to carefully tailored problems, such as Google Inc.’s weak-strong clusters model. Here we present a framework to detect (sub)classes of random and application problems where quantum annealing might excel over classical heuristics. We illustrate our approach with random spin-glass instances, as well as application instances ranging from circuit fault diagnosis and minimum vertex covers to constraint-satisfaction problems and graph partitioning, to name a few. This means that we now possess the capability to predict whether a particular application could benefit from quantum optimization, or whether classical hardware might be better suited to tackle it. Finally, we illustrate our optimization algorithm portfolio on Google’s weak-strong cluster instances.
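
For reference, both the random spin-glass benchmarks and the application instances above are cast in the same native form that quantum annealers and the classical heuristics minimize; a minimal statement of that Ising cost function (standard notation, not specific to any one solver) is:

    \[
      H(\mathbf{s}) \;=\; -\sum_{\langle i,j \rangle} J_{ij}\, s_i s_j \;-\; \sum_i h_i\, s_i,
      \qquad s_i \in \{-1,+1\}.
    \]

For random spin-glass instances the couplers J_ij (and optionally the fields h_i) are drawn at random, while application problems such as circuit fault diagnosis or minimum vertex cover are mapped onto the same form through a suitable encoding.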


Speaker:      John Realpe-Gómez
                     Quantum Artificial Intelligence Laboratory, NASA Ames Research Center

Title:             Quantum annealing for sampling and machine learning applications

Abstract
Increasing the efficiency of sampling from Boltzmann distributions would have a significant impact in deep learning, probabilistic programming, and other machine learning applications. Quantum annealers hold the potential to speed up this task, but several limitations still bar these state-of-the-art technologies from being used effectively. One of the main limitations is that, while there is evidence that the device can sample from Boltzmann-like distributions, it does so with an unknown, instance-dependent effective temperature that differs from its physical temperature. Unless this unknown temperature can be unveiled, it is unlikely that a quantum annealer can be used effectively as a Boltzmann sampler. Here, we discuss our recent algorithmic advances to overcome this challenge with a simple effective-temperature estimation algorithm. We provide a systematic study assessing the impact of the effective temperatures on the learning of a restricted Boltzmann machine specialized to available quantum hardware, which can serve as a building block for deep learning architectures. We also provide a comparison to k-step contrastive divergence (CD-k) with k up to 100, a type of Monte Carlo algorithm based on a Markov chain of length k. Although assuming a suitable fixed effective temperature allows quantum annealing sampling to outperform one-step contrastive divergence (CD-1), only when using an instance-dependent effective temperature do we find a performance close to that of CD-100. We will also discuss follow-up work on how to overcome other limitations of the device, such as the limited connectivity of quantum hardware, high imprecision, and uncertainty in model parameters. We validate these methods by training a classical Boltzmann machine for image generation, through a quantum-classical hybrid that integrates a quantum annealing sampler into a classical stochastic gradient descent algorithm. Finally, we discuss the potential of this framework for validating speed-ups when sampling from quantum hardware.
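
As a point of reference for the classical baseline mentioned above, the following is a minimal NumPy sketch of a k-step contrastive divergence (CD-k) update for a binary restricted Boltzmann machine; it is not the quantum-assisted training procedure itself, and all array names, shapes, and the learning rate are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample_bernoulli(p):
        return (rng.random(p.shape) < p).astype(float)

    def cd_k_update(v0, W, b, c, k=1, lr=0.01):
        """One CD-k gradient step for a binary RBM.

        v0 : (batch, n_visible) data batch
        W  : (n_visible, n_hidden) weights
        b  : (n_visible,) visible biases
        c  : (n_hidden,) hidden biases
        """
        # Positive phase: hidden probabilities driven by the data.
        ph0 = sigmoid(v0 @ W + c)

        # Negative phase: k steps of block Gibbs sampling (the Markov chain of length k).
        vk = v0.copy()
        for _ in range(k):
            hk = sample_bernoulli(sigmoid(vk @ W + c))
            vk = sample_bernoulli(sigmoid(hk @ W.T + b))
        phk = sigmoid(vk @ W + c)

        # Gradient estimates: data statistics minus model (chain) statistics.
        batch = v0.shape[0]
        dW = (v0.T @ ph0 - vk.T @ phk) / batch
        db = (v0 - vk).mean(axis=0)
        dc = (ph0 - phk).mean(axis=0)

        return W + lr * dW, b + lr * db, c + lr * dc

    # Illustrative call: one update on a random binary batch (shapes are arbitrary).
    v0 = sample_bernoulli(np.full((16, 12), 0.5))
    W = 0.01 * rng.standard_normal((12, 8))
    b, c = np.zeros(12), np.zeros(8)
    W, b, c = cd_k_update(v0, W, b, c, k=100)   # k=1 gives the CD-1 baseline, k=100 gives CD-100

In the quantum-assisted approach described in the abstract, samples drawn from the annealer (with the model rescaled by the estimated effective temperature) would stand in for the Gibbs chain in the negative phase.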


Speaker:      Dr. Zeb Barber, Director
                     MSU Spectrum Lab

Title:             S2 Optical Processors for Multi-Terabit Streaming Filtering and Selection and 2D Imagery Analysis

Abstract
Spatial-spectral (S2) holographic technology enables high-bandwidth (>25 GHz) real-time processing of analog and digital signals against arbitrarily large time-bandwidth product (100 kbit length) programmable FIR filters (sliding dot-products). The spatial degrees of freedom allow efficient multiplexing of independent signals, or processing across multiple signal dimensions, enabling many-terabit data streams to be processed in real time. Simple text searches for long phrases at rates up to 200 Gbps in a single channel are demonstrated, and application to 2D image analysis is discussed.
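
To make the "sliding dot-product" operation concrete, here is a minimal NumPy sketch of pattern matching by correlation against a programmable filter; the pattern, stream, and bit-to-amplitude mapping are illustrative, and the S2 hardware performs the analogous operation optically, with far longer filters and at far higher rates.

    import numpy as np

    def sliding_dot_product(stream_bits, pattern_bits):
        """Correlate a bit stream against a programmable pattern (FIR filter).

        Maps {0,1} -> {-1,+1} so that a perfect match scores len(pattern_bits).
        """
        s = 2 * np.asarray(stream_bits, dtype=float) - 1
        p = 2 * np.asarray(pattern_bits, dtype=float) - 1
        return np.correlate(s, p, mode="valid")

    # Illustrative search: locate the pattern inside a longer stream.
    pattern = [1, 0, 1, 1, 0, 1]
    stream = [0] * 12 + pattern + [0] * 12
    scores = sliding_dot_product(stream, pattern)
    matches = np.flatnonzero(scores == len(pattern))
    print(matches)  # -> [12], the offset where the pattern begins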


Speaker:      John Paul Strachan
                     Hewlett Packard Enterprise

Title:             The Dot-Product Engine: Using Memristor Crossbars as Computational Accelerators

Abstract
The future acceleration of many computational workloads is expected to depend on novel architectures, circuits, and devices. We describe an effort utilizing memristor crossbar arrays to accelerate vector-matrix multiplication, which underpins many applications in image and signal processing, neural networks, and scientific computations. Significant improvement over CPUs, GPUs, and custom ASICs is anticipated using such systems. We describe our work spanning device-level engineering of memristors for this application, integration with CMOS circuits, multi-level tuning of individual memristors to 64 levels (6 bits), a hardware platform for direct demonstration of single time-step dot-product operations in fabricated memristor arrays, and simulations forecasting ultimate performance and bottlenecks, with comparisons to alternative CMOS approaches. We further describe some specific applications including image convolutions, machine learning inference algorithms, and signal processing.
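
As an illustration of the dot-product-engine idea, here is a minimal NumPy sketch of an idealized crossbar performing a vector-matrix multiply with conductances quantized to 64 levels (the 6-bit tuning mentioned above); the conductance range, matrix size, and the omission of wire resistance and device noise are simplifying assumptions.

    import numpy as np

    def quantize_conductances(W, g_min=1e-6, g_max=1e-4, levels=64):
        """Map a weight matrix onto discrete memristor conductance levels.

        64 levels corresponds to the 6-bit tuning mentioned in the abstract;
        g_min/g_max are illustrative conductance bounds (siemens).
        """
        w_min, w_max = W.min(), W.max()
        # Linearly map weights into [g_min, g_max], then snap to the nearest level.
        scaled = (W - w_min) / (w_max - w_min + 1e-12)
        step = (g_max - g_min) / (levels - 1)
        return g_min + np.round(scaled * (levels - 1)) * step

    def crossbar_vmm(voltages, G):
        """Analog vector-matrix multiply in a crossbar.

        Each column current is the sum of V_i * G_ij (Ohm's law summed by
        Kirchhoff's current law), i.e. one dot product per output column,
        computed in a single time step.
        """
        return voltages @ G

    # Illustrative use: a random weight matrix and input vector.
    rng = np.random.default_rng(1)
    W = rng.standard_normal((128, 64))
    x = rng.random(128)            # input encoded as read voltages
    G = quantize_conductances(W)
    currents = crossbar_vmm(x, G)  # 64 output currents, one per column

In practice, signed weights are typically represented with pairs of columns or a reference column, since conductances are non-negative; the sketch ignores that detail along with parasitic resistances and programming noise.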