GReTA: Hardware Optimized Graph Processing for GNNs
Published in Proceedings of the Workshop on Resource-Constrained Machine Learning (ReCoML 2020), March 2020.
Abstract
Graph Neural Networks (GNNs) are a class of deep neural networks that learn directly from irregular, graph-structured data. However, GNN inference relies on sparse operations that are difficult to implement on accelerator architectures optimized for dense, fixed-size computation. This paper proposes GReTA (Gather, Reduce, Transform, Activate), a graph processing abstraction that is efficient for an accelerator to execute yet flexible enough to express GNN inference. We demonstrate GReTA’s advantages by designing and synthesizing a custom accelerator ASIC for GReTA and implementing several GNN models on it (GCN, GraphSAGE, G-GCN, and GIN). Across several benchmark graphs, our implementation reduces 97th-percentile latency by a geometric mean of 15× and 21× compared to CPU and GPU baselines, respectively.
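To make the abstraction concrete, the sketch below maps a single GNN layer onto the four GReTA phases. This is a minimal illustration assuming a simple edge-list graph representation; the function name greta_layer, the phase signatures, and the choice of a sum reduction with a ReLU activation are assumptions for illustration, not the paper’s actual interface.

```python
import numpy as np

def greta_layer(edges, feats, weight, reduce_fn=np.add):
    """One GNN layer expressed as Gather, Reduce, Transform, Activate.

    edges:  list of (src, dst) pairs (hypothetical edge-list format)
    feats:  [num_nodes, d_in] node feature matrix
    weight: [d_in, d_out] dense transform matrix
    """
    num_nodes, d_in = feats.shape
    acc = np.zeros((num_nodes, d_in))
    for src, dst in edges:
        # Gather: fetch the source node's features for this edge.
        msg = feats[src]
        # Reduce: combine gathered messages per destination node.
        acc[dst] = reduce_fn(acc[dst], msg)
    # Transform: dense per-node computation (here, a weight matrix).
    out = acc @ weight
    # Activate: elementwise non-linearity on the transformed result.
    return np.maximum(out, 0.0)  # ReLU

# Usage example: a 3-node graph with two edges into node 1.
edges = [(0, 1), (2, 1)]
feats = np.random.randn(3, 4)
weight = np.random.randn(4, 8)
h = greta_layer(edges, feats, weight)  # shape (3, 8)
```

The Gather/Reduce loop is the sparse, graph-dependent portion of the layer, while Transform and Activate are dense, fixed-size computations; separating the two is what lets an accelerator handle each with dedicated hardware.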
BibTeX entry
@inproceedings{greta-recoml20,
  author    = {Kevin Kiningham and Philip Levis and Christopher Re},
  title     = {{GReTA: Hardware Optimized Graph Processing for GNNs}},
  booktitle = {Proceedings of the Workshop on Resource-Constrained Machine Learning (ReCoML 2020)},
  year      = {2020},
  month     = mar
}