Overview
We intend to build a hardware accelerator for the singular value decomposition (SVD) and possibly the randomized SVD. The accelerator will be a peripheral that computes basic matrix operations such as matrix multiplication and transposition. These operations can then be composed in hardware to compute portions of a Jacobi SVD algorithm using the Kogbetliantz method, which is highly parallelizable. This approach was implemented by Ma, Kaye, et al. in 2006 [3]. We will also consult supplementary sources to determine the final implementation [1, 2]. We will optimize the algorithm specifically for the SoCKit by splitting the workload appropriately between the onboard ARM processor and the FPGA.

Evaluation
We will evaluate the project by comparing the performance of our joint hardware-software implementation against a pure software implementation. The software implementation will either be obtained from an existing source or written by us, and it will serve as the starting point for the port.
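
One way such a comparison could be measured is sketched below in C. It times repeated calls to an implementation through a function pointer and reports the ratio of the two mean run times. All names here (`svd_fn`, `mean_seconds`, `speedup`, `dummy_workload`) are hypothetical, and the choice of the C11 `timespec_get` wall clock is an assumption; wall-clock time is used rather than CPU time so that time spent waiting on the FPGA would be counted.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical signature wrapping one full SVD run on prepared input. */
typedef void (*svd_fn)(void *ctx);

/* Current wall-clock time in seconds (C11 timespec_get). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Mean wall-clock time of `reps` calls to `fn`. */
double mean_seconds(svd_fn fn, void *ctx, int reps)
{
    assert(fn != NULL && reps > 0);
    double start = now_seconds();
    for (int r = 0; r < reps; ++r)
        fn(ctx);
    return (now_seconds() - start) / reps;
}

/* Speedup of the hardware-software version over pure software. */
double speedup(double sw_seconds, double hw_seconds)
{
    assert(hw_seconds > 0.0);
    return sw_seconds / hw_seconds;
}

/* Stand-in workload that only counts its invocations; a real harness
 * would call the software or the accelerated SVD here instead. */
void dummy_workload(void *ctx)
{
    int *calls = ctx;
    ++*calls;
}
```

Running both implementations on the same input matrices and repeating each measurement several times would help keep the comparison fair.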

Project Requirements
Our project requires only the Cyclone SoCKit board. We will use both its ARM processor and its FPGA to optimize the performance of the SVD algorithm.

Milestones
1. Implement an SVD algorithm in C
2. Define interface between onboard processor and FPGA
3. Implement an SVD algorithm split between the FPGA and ARM processor
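
As an illustration of what milestone 2 might produce, the following C sketch shows a memory-mapped register interface as seen from the processor. Every register name, index, and bit position here is hypothetical and stands in for a layout that has yet to be defined; the sketch takes the register base as a pointer so that it can be exercised against an ordinary array before any hardware exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices (32-bit words from the peripheral base). */
enum {
    SVD_REG_CTRL   = 0, /* write SVD_CTRL_START to begin a computation */
    SVD_REG_STATUS = 1, /* SVD_STATUS_DONE is set when results are ready */
    SVD_REG_ROWS   = 2, /* number of matrix rows */
    SVD_REG_COLS   = 3, /* number of matrix columns */
    SVD_REG_COUNT  = 4
};

#define SVD_CTRL_START  0x1u
#define SVD_STATUS_DONE 0x1u

/* Program the matrix dimensions, then start the accelerator. */
void svd_start(volatile uint32_t *regs, uint32_t rows, uint32_t cols)
{
    assert(regs != NULL && rows > 0 && cols > 0);
    regs[SVD_REG_ROWS] = rows;
    regs[SVD_REG_COLS] = cols;
    regs[SVD_REG_CTRL] = SVD_CTRL_START; /* written last so sizes are valid */
}

/* Return nonzero once the (hypothetical) done bit is set. */
int svd_done(const volatile uint32_t *regs)
{
    assert(regs != NULL);
    return (regs[SVD_REG_STATUS] & SVD_STATUS_DONE) != 0;
}
```

On the board, `regs` would point at wherever the peripheral is mapped into the processor's address space; matrix data transfer is left out of the sketch because the proposal does not yet specify how it will be done.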

References