Generating Efficient FFT GPU Code with Lift
The Fast Fourier Transform is a well-known algorithm used in many high-performance applications, ranging from signal processing to convolutional neural networks.
In this paper, we encode FFTs by building high-level abstractions from a set of functional parallel patterns in the Lift language. These abstractions are derived from, and closely resemble, the mathematical definitions of FFTs. We leverage the Lift performance-portable code generator to generate high-performance GPU code for FFTs. No FFT-specific patterns are required, which demonstrates the expressive power of the generic parallel patterns used in Lift.
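To illustrate the idea of composing an FFT purely from generic patterns, here is a minimal sketch in Python. It is not Lift code and not the paper's encoding: the helpers `split`, `join`, and `transpose` are hypothetical stand-ins chosen for this example, and the algorithm shown is the textbook radix-2 decimation-in-time FFT, assumed here only because it is the simplest variant to write down. A direct `dft` that follows the mathematical definition is included for comparison.

```python
import cmath
from functools import reduce

# Generic, FFT-agnostic building blocks (hypothetical Python stand-ins,
# not the Lift API).
def split(n, xs):
    """Cut xs into consecutive chunks of length n."""
    return [xs[i:i + n] for i in range(0, len(xs), n)]

def join(xss):
    """Flatten one level of nesting."""
    return [x for xs in xss for x in xs]

def transpose(xss):
    """Swap the two outermost dimensions of a nested list."""
    return [list(col) for col in zip(*xss)]

def dft(xs):
    """Direct DFT from the definition: X_k = sum_j x_j * exp(-2*pi*i*j*k/N)."""
    n = len(xs)
    return [
        reduce(
            lambda acc, jx: acc + jx[1] * cmath.exp(-2j * cmath.pi * jx[0] * k / n),
            enumerate(xs),
            0j,
        )
        for k in range(n)
    ]

def fft(xs):
    """Radix-2 decimation-in-time FFT; len(xs) must be a power of two."""
    n = len(xs)
    if n == 1:
        return list(xs)
    # Splitting into pairs and transposing yields the even-indexed and
    # odd-indexed subsequences without any FFT-specific primitive.
    evens, odds = transpose(split(2, xs))
    e, o = fft(evens), fft(odds)
    # Multiply the odd half by the twiddle factors exp(-2*pi*i*k/N).
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * ok for k, ok in enumerate(o)]
    # Butterfly step: element-wise sum and difference, then concatenate.
    top = [a + b for a, b in zip(e, twiddled)]
    bottom = [a - b for a, b in zip(e, twiddled)]
    return join([top, bottom])
```

For example, `fft([1, 0, 0, 0])` yields four values equal to 1, since a unit impulse has a flat spectrum, and `fft` agrees with `dft` up to floating-point rounding on power-of-two inputs.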
Our experimental results show that our approach achieves performance close to or better than AMD’s OpenCL implementation clFFT on two different GPU models, but that Nvidia’s highly optimized cuFFT implementation still performs better on Nvidia GPUs.
Presentation Slides: Generating_Efficient_FFT_GPU_Code_with_Lift-presentation.pdf (756 KiB)
Sun 18 Aug (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
Session: 10:50 - 12:10
10:50 | 26m | Talk | Generating Efficient FFT GPU Code with Lift | FHPNC |
11:16 | 26m | Talk | Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis | FHPNC |
11:43 | 26m | Talk | Functional Approach to Acceleration of Monte Carlo Simulation for American Option Pricing (extended abstract) | FHPNC | Wojciech Michal Pawlak, Martin Elsman, Cosmin Oancea (University of Copenhagen, Denmark)