AHEAD: Automated Framework for Hardware Accelerated Iterative Data Analysis

Title: AHEAD: Automated Framework for Hardware Accelerated Iterative Data Analysis
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Songhori, E., A. Mirhoseini, and F. Koushanfar
Conference Name: IEEE/ACM Design, Automation & Test in Europe (DATE)
Date Published: March 2015
Keywords: API, Dense Matrix, FISTA, FPGAs, Gram Matrix, HLS, Iterative Solver, Least Squares, Sparse Approximation
Abstract

This paper introduces AHEAD, a novel domain-specific framework for automated hardware-based acceleration of massive data analysis applications with a dense (non-sparse) correlation matrix. Because matrix inversion does not scale, iterative computation is often used to converge to a solution. AHEAD addresses two sets of domain-specific matrix computation challenges. The first is the I/O and memory bandwidth constraints that limit the performance of hardware accelerators. The second is the difficulty of handling large data, which stems from the complexity of the known matrix transformations and the inseparability of non-sparse correlations. The inseparability problem translates to an increased communication cost with the accelerators. To optimize performance within these limits, AHEAD learns the dependency structure of the domain data and suggests a scalable matrix transformation. The transformation minimizes the memory access required for matrix computation within an error threshold and thus optimizes the mapping of domain data to the available bandwidth-constrained accelerator resources. To facilitate automation, AHEAD also provides an Application Programming Interface (API) so that users can customize the framework for an arbitrary iterative analysis algorithm and hardware mapping. A proof-of-concept implementation of AHEAD is demonstrated on the widely used compressive sensing and general ℓ1-regularized least squares solvers. On a massive light field imaging data set with 4.6B non-zeros, AHEAD attains up to 320x iteration speed improvement using reconfigurable hardware accelerators compared with the conventional solver, and about 4x improvement compared with our transformed matrix solver on a general-purpose processor (without hardware acceleration).
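
For readers unfamiliar with the solver class named in the abstract and keywords, the following is a minimal sketch of a textbook FISTA iteration for ℓ1-regularized least squares, minimizing 0.5*||Ax - y||^2 + lam*||x||_1, driven by a precomputed Gram matrix G = A^T A. It is not the authors' implementation and does not include AHEAD's learned matrix transformation or any hardware mapping. The function names (`fista_gram`, `soft_threshold`, `matvec`), the pure-Python list representation, and the row-sum bound used for the step size are assumptions made for illustration. The sketch is meant only to show why each iteration is dominated by a dense matrix-vector product, which is the memory-access cost the abstract refers to.

```python
import math


def soft_threshold(v, t):
    """Elementwise proximal operator of t * ||.||_1 (shrink toward zero by t)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def matvec(M, v):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def fista_gram(A, y, lam, iters=200):
    """Textbook FISTA for 0.5*||Ax - y||^2 + lam*||x||_1 using G = A^T A.

    Illustrative only: A is a dense list of rows, y a list, lam >= 0.
    """
    cols = list(zip(*A))
    # Precompute the Gram matrix and A^T y once; the loop never touches A again.
    G = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Aty = [sum(a * b for a, b in zip(c, y)) for c in cols]

    # Upper bound on the largest eigenvalue of the symmetric matrix G
    # (maximum absolute row sum), used as the Lipschitz constant of the gradient.
    L = max(sum(abs(g) for g in row) for row in G)

    x = [0.0] * len(cols)
    if L == 0:
        return x

    z = x[:]   # extrapolated point
    t = 1.0    # momentum parameter
    for _ in range(iters):
        # Gradient of the smooth term at z: G z - A^T y (one dense matvec).
        grad = [gz - b for gz, b in zip(matvec(G, z), Aty)]
        # Gradient step followed by soft-thresholding.
        x_new = soft_threshold([zi - gi / L for zi, gi in zip(z, grad)], lam / L)
        # Nesterov-style momentum update.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

For example, with A set to the 2x2 identity, y = [3.0, -0.5], and lam = 1.0, the minimizer is the soft-thresholded y, and `fista_gram` settles at approximately [2.0, 0.0]. For a dense problem with n unknowns, the product with G costs on the order of n^2 memory reads per iteration, which is why reducing memory access matters for accelerator mapping.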

URL: http://dl.acm.org/citation.cfm?id=2757032
Attachment: AHEAD.pdf (352.43 KB)
