Generating Efficient FFT GPU Code with Lift
The Fast Fourier Transform (FFT) is a well-known algorithm used in many high-performance applications, ranging from signal processing to convolutional neural networks.
In this paper, we encode FFTs by building high-level abstractions based on a set of functional parallel patterns in the Lift language. These abstractions are derived from, and closely resemble, the mathematical definitions of FFTs. We leverage the Lift performance-portable code generator to generate high-performing GPU code for FFTs. No FFT-specific patterns are required for this, which shows the expressive power of the generic parallel patterns used in Lift.
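To illustrate what it can mean to express an FFT purely with generic patterns, here is a minimal sketch in Python rather than Lift. It assumes a recursive radix-2 Cooley-Tukey decomposition and a power-of-two input length, and it uses only `map`, `zip`, and list slicing. The names `fft` and `dft` are hypothetical and chosen for this illustration; the code does not reproduce the paper's actual Lift encoding or its generated GPU code.

```python
import cmath


def dft(xs):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(xs)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * j * k / n) for j, x in enumerate(xs))
        for k in range(n)
    ]


def fft(xs):
    """Radix-2 Cooley-Tukey FFT built from generic map/zip steps.

    Assumption for this sketch: len(xs) is a power of two.
    """
    n = len(xs)
    if n == 1:
        return [complex(xs[0])]

    # Split the input into even- and odd-indexed halves and transform each.
    evens = fft(xs[0::2])
    odds = fft(xs[1::2])

    # Map over the odd half, multiplying each element by its twiddle factor.
    twiddled = list(
        map(lambda ko: cmath.exp(-2j * cmath.pi * ko[0] / n) * ko[1], enumerate(odds))
    )

    # Butterfly: zip the halves, then map an add and a subtract over the pairs.
    top = list(map(lambda p: p[0] + p[1], zip(evens, twiddled)))
    bottom = list(map(lambda p: p[0] - p[1], zip(evens, twiddled)))

    # Join the two halves into the final result.
    return top + bottom
```

For example, `fft([1, 0, 0, 0])` transforms a unit impulse into four ones, and for any power-of-two input the result should agree with `dft` up to floating-point rounding. Every step is an ordinary split, map, zip, or join, which is the sense in which no FFT-specific primitive is needed.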
Our experimental results show that our approach achieves performance close to or better than AMD’s OpenCL implementation clFFT on two different GPU models, but that Nvidia’s highly optimized cuFFT implementation still performs better on Nvidia GPUs.
Presentation slides: Generating_Efficient_FFT_GPU_Code_with_Lift-presentation.pdf (756 KiB)
Sun 18 Aug. Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
|10:50 - 11:16|
|Generating Efficient FFT GPU Code with Lift|
FHPNC
|11:16 - 11:43|
|Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis|
FHPNC
|11:43 - 12:10|
|Functional Approach to Acceleration of Monte Carlo Simulation for American Option Pricing (extended abstract)|
Wojciech Michal Pawlak (University of Copenhagen, Denmark), Martin Elsman (University of Copenhagen, Denmark), Cosmin Oancea (University of Copenhagen, Denmark)