Demystifying Differentiable Programming: Shift/Reset the Penultimate Backpropagator
Deep learning has seen tremendous success over the past decade in computer vision, machine translation, and gameplay. This success rests crucially on gradient-descent optimization and the ability to “learn” parameters of a neural network by backpropagating observed errors. However, neural network architectures are growing increasingly sophisticated and diverse, which motivates an emerging quest for even more general forms of differentiable programming, where arbitrary parameterized computations can be trained by gradient descent. In this paper, we take a fresh look at automatic differentiation (AD) techniques, and especially aim to demystify the reverse-mode form of AD that generalizes backpropagation in neural networks. We uncover a tight connection between reverse-mode AD and delimited continuations, which permits implementing reverse-mode AD purely via operator overloading and without managing any auxiliary data structures. We further show how this formulation of AD can be fruitfully combined with multi-stage programming (staging), leading to an efficient implementation that combines the performance benefits of deep learning frameworks based on explicit reified computation graphs (e.g., TensorFlow) with the expressiveness of pure library approaches (e.g., PyTorch).
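To make the key idea concrete, here is a minimal, self-contained sketch (not the paper's actual code): reverse-mode AD implemented purely with operator overloading, where the delimited continuation that shift/reset would capture is instead passed explicitly, so the example runs in plain Scala without the continuations plugin. The names NumR, grad, and the example function are illustrative assumptions, not the paper's API.

    // Sketch of reverse-mode AD via operator overloading and continuations.
    object ReverseADSketch {
      // Differentiable scalar: x holds the value, d accumulates the adjoint.
      class NumR(val x: Double, var d: Double = 0.0) {
        // Each operator receives the rest of the computation as a continuation k.
        // Calling k runs the forward pass; after k returns, the output adjoint
        // y.d is propagated back to the operands (the backward pass).
        def +(that: NumR)(k: NumR => Unit): Unit = {
          val y = new NumR(x + that.x)
          k(y)
          this.d += y.d
          that.d += y.d
        }
        def *(that: NumR)(k: NumR => Unit): Unit = {
          val y = new NumR(x * that.x)
          k(y)
          this.d += that.x * y.d
          that.d += this.x * y.d
        }
      }

      // Differentiate a CPS-style function: seed the final adjoint with 1.0,
      // let the backward propagation unwind, then read the input's adjoint.
      def grad(f: NumR => (NumR => Unit) => Unit)(x0: Double): Double = {
        val z = new NumR(x0)
        f(z)(y => y.d = 1.0)
        z.d
      }

      def main(args: Array[String]): Unit = {
        // d/dx (x*x + x) = 2x + 1, so the gradient at x = 3 is 7.
        val g = grad(x => k => (x * x) { x2 => (x2 + x)(k) })(3.0)
        println(g) // prints 7.0
      }
    }

With shift/reset, the explicit continuation parameter disappears: each overloaded operator captures the rest of the computation automatically, which is the connection between reverse-mode AD and delimited continuations that the paper develops.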