TensorFlow: A system for large-scale machine learning
Venue
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16)
Publication Year
2016
Authors
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
BibTeX
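The entry below is reconstructed from the metadata on this page (title, authors, venue, year); page numbers are omitted rather than guessed.

@inproceedings{abadi2016tensorflow,
  author    = {Mart\'{\i}n Abadi and Paul Barham and Jianmin Chen and Zhifeng Chen and Andy Davis and Jeffrey Dean and Matthieu Devin and Sanjay Ghemawat and Geoffrey Irving and Michael Isard and Manjunath Kudlur and Josh Levenberg and Rajat Monga and Sherry Moore and Derek G. Murray and Benoit Steiner and Paul Tucker and Vijay Vasudevan and Pete Warden and Martin Wicke and Yuan Yu and Xiaoqiang Zheng},
  title     = {{TensorFlow}: A system for large-scale machine learning},
  booktitle = {12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16)},
  year      = {2016}
}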
Abstract
TensorFlow is a machine learning system that operates at large scale and in
heterogeneous environments. TensorFlow uses dataflow graphs to represent
computation, shared state, and the operations that mutate that state. It maps the
nodes of a dataflow graph across many machines in a cluster, and within a machine
across multiple computational devices, including multicore CPUs, general-purpose
GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This
architecture gives flexibility to the application developer: whereas in previous
"parameter server" designs the management of shared state is built into the system,
TensorFlow enables developers to experiment with novel optimizations and training
algorithms. TensorFlow supports a variety of applications, with particularly strong
support for training and inference on deep neural networks. Several Google services
use TensorFlow in production; we have released it as an open-source project, and it
has become widely used for machine learning research. In this paper, we describe
the TensorFlow dataflow model in contrast to existing systems, and demonstrate the
compelling performance that TensorFlow achieves for several real-world
applications.
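As a concrete illustration of the dataflow model the abstract describes, here is a minimal sketch using TensorFlow's v1-style graph API (via tf.compat.v1 in current releases). The variable name w, the loss, the learning rate, and the device string are illustrative choices, not details from the paper.

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Shared, mutable state is itself a node in the dataflow graph (a Variable).
w = tf.Variable(0.0, name="w")

# Operations can be pinned to a device explicitly; the runtime places the rest.
with tf.device("/device:CPU:0"):
    loss = (w - 3.0) ** 2          # pure dataflow computation that reads w

# The training update is also a graph operation: one that mutates the shared state.
grad = tf.gradients(loss, [w])[0]
train_step = tf.assign_sub(w, 0.1 * grad)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_step)       # each call executes the update subgraph
    print(sess.run(w))             # converges to ~3.0

Because state, computation, and state mutation are all expressed as ordinary graph nodes, a training algorithm is just another subgraph, which is what lets developers experiment with optimizations that a fixed parameter-server design would bake into the system.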
