Simulate Diffraction Patterns Using CUDA FFT Libraries

This example shows how to use GPU Coder™ to leverage the CUDA® Fast Fourier Transform library (cuFFT) to compute a two-dimensional FFT on an NVIDIA® GPU. The two-dimensional Fourier transform is used in optics to calculate far-field diffraction patterns. You can observe these diffraction patterns when a monochromatic light source passes through a small aperture, such as in Young's double-slit experiment. This example also shows how to use GPU pointers as inputs to an entry-point function when generating CUDA MEX, source code, static libraries, dynamic libraries, and executables. Passing GPU pointers improves the performance of the generated code by minimizing the number of cudaMemcpy calls.

Prerequisites

  • CUDA-enabled NVIDIA GPU with compute capability 3.2 or higher.

  • NVIDIA CUDA toolkit and driver.

  • Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products. For setting up the environment variables, see Setting Up the Prerequisite Products.
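The compute capability requirement can be checked from MATLAB before generating any code. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed (it provides gpuDevice); the variable names dev, cc, and meetsRequirement are illustrative and not part of the original example.

```matlab
% Query the currently selected GPU (assumes Parallel Computing Toolbox).
dev = gpuDevice;

% ComputeCapability is returned as text such as '7.5'; convert it to a number.
cc = str2double(dev.ComputeCapability);

% Compare against the minimum stated in the prerequisites.
meetsRequirement = cc >= 3.2;
fprintf('GPU: %s, compute capability %s, meets requirement: %d\n', ...
    dev.Name, dev.ComputeCapability, meetsRequirement);
```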

Verify GPU Environment

To verify that the compilers and libraries necessary for running this example are set up correctly, use the coder.checkGpuInstall function.

envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Define the Coordinate System

Before simulating the light that has passed through an aperture, you must define your coordinate system. To get the correct numeric behavior when you call fft2, you must carefully arrange $x$ and $y$ so that the zero value is in the correct place. N2 is half the size in each dimension.

N2 = 1024;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);
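As a quick sanity check on the grid, the following sketch confirms that each axis has 2*N2 samples and that the zero coordinate falls at index N2+1, which is where fftshift places the zero-frequency component for an even-length dimension. The names x, numSamples, and zeroIdx are illustrative and do not appear in the original example.

```matlab
N2 = 1024;
x = -1:1/N2:(N2-1)/N2;          % same vector that meshgrid expands along each axis

numSamples = numel(x);          % number of samples per axis
[~, zeroIdx] = min(abs(x));     % index of the sample closest to zero

fprintf('Samples per axis: %d, zero coordinate at index %d\n', numSamples, zeroIdx);
```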

Simulate the Diffraction Pattern for a Rectangular Aperture

Simulate the effect of passing a parallel beam of monochromatic light through a small rectangular aperture. The two-dimensional Fourier transform describes the light field at a large distance from the aperture. Form the aperture as a mask based on the coordinate system. The light source is a double-precision version of the aperture. Find the far-field light signal by using the fft2 function.

aperture       = (abs(gx) < 4/N2) .* (abs(gy) < 2/N2);
lightsource    = double(aperture);
farfieldsignal = fft2(lightsource);
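One simple way to check the transform is to look at its zero-frequency term. Because fft2 is unnormalized, farfieldsignal(1,1) should equal the sum of the input, which here is the number of grid points inside the aperture. The sketch below rebuilds the inputs so that it runs on its own; openSamples, dcTerm, and dcError are illustrative names.

```matlab
N2 = 1024;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);
lightsource = double((abs(gx) < 4/N2) .* (abs(gy) < 2/N2));
farfieldsignal = fft2(lightsource);

openSamples = sum(lightsource(:));       % grid points inside the aperture
dcTerm = farfieldsignal(1,1);            % zero-frequency component of the transform
dcError = abs(dcTerm - openSamples);     % expected to be at round-off level

fprintf('Open samples: %d, DC term: %g, difference: %g\n', ...
    openSamples, real(dcTerm), dcError);
```

With this grid the aperture spans 7 samples in x and 3 samples in y, so 21 points are open.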

Display the Light Intensity for a Rectangular Aperture

The visualize.m function displays the light intensity for a rectangular aperture. Calculate the far-field light intensity from the magnitude squared of the light field. To aid visualization, use the fftshift function.

type visualize
function visualize(farfieldsignal, titleStr)

farfieldintensity = real( farfieldsignal .* conj( farfieldsignal ) );
imagesc( fftshift( farfieldintensity ) );
axis( 'equal' ); axis( 'off' );
title(titleStr);

end

str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern in MATLAB');
visualize(farfieldsignal,str);
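The intensity computation in visualize.m can be checked without plotting. The sketch below compares the product with the complex conjugate against abs(...).^2, and then verifies Parseval's relation for the unnormalized transform that fft2 computes: the summed intensity equals the number of input elements times the summed squared input. The variable names are illustrative, and the inputs are rebuilt so that the snippet runs on its own.

```matlab
N2 = 1024;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);
lightsource = double((abs(gx) < 4/N2) .* (abs(gy) < 2/N2));
farfieldsignal = fft2(lightsource);

intensityA = real(farfieldsignal .* conj(farfieldsignal));   % form used in visualize.m
intensityB = abs(farfieldsignal).^2;                         % equivalent formulation
maxDiff = max(abs(intensityA(:) - intensityB(:)));

% Parseval's relation for an unnormalized 2-D DFT:
% sum of intensity = numel(input) * sum(|input|^2)
lhs = sum(intensityA(:));
rhs = numel(lightsource) * sum(abs(lightsource(:)).^2);
relErr = abs(lhs - rhs) / rhs;

fprintf('Max difference between forms: %g, Parseval relative error: %g\n', maxDiff, relErr);
```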

Generate CUDA MEX for the Function

You do not have to create an entry-point function. You can directly generate code for the MATLAB® fft2 function. To generate CUDA MEX for the MATLAB fft2 function, set the EnableCUFFT property in the configuration object and use the codegen function. GPU Coder replaces fft, ifft, fft2, ifft2, fftn, and ifftn function calls in your MATLAB code with the appropriate cuFFT library calls. For two-dimensional transforms and higher, GPU Coder creates multiple 1-D batched transforms. These batched transforms have higher performance than single transforms. After generating the MEX function, you can verify that it has the same functionality as the original MATLAB function. Run the generated fft2_mex and plot the results.

cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource} fft2

farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);
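In addition to comparing the plots, the two results can be compared numerically. The following sketch assumes that fft2_mex was generated by the codegen command above and that farfieldsignal and farfieldsignalGPU are still in the workspace. The names maxAbsErr and relErr are illustrative, and what counts as an acceptable difference is a judgment call because cuFFT and the MATLAB FFT implementation can differ at round-off level.

```matlab
% Assumes farfieldsignal (MATLAB fft2) and farfieldsignalGPU (fft2_mex)
% were computed from the same lightsource, as in the steps above.
maxAbsErr = max(abs(farfieldsignalGPU(:) - farfieldsignal(:)));
relErr = maxAbsErr / max(abs(farfieldsignal(:)));

fprintf('Maximum relative difference between MATLAB and MEX results: %g\n', relErr);
```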

Simulate Young's Double-Slit Experiment

Young's double-slit experiment shows light interference when an aperture comprises two parallel slits. A series of bright points is visible where constructive interference takes place. In this case, form the aperture representing two slits. Restrict the aperture in the $y$ direction to ensure that the resulting pattern is not entirely concentrated along the horizontal axis.

slits          = (abs(gx) <= 10/N2) .* (abs(gx) >= 8/N2);
aperture       = slits .* (abs(gy) < 20/N2);
lightsource    = double(aperture);
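The slit geometry determines the spacing of the bright points. As a rough sketch, two slits whose centers are d samples apart on an axis of N samples produce an interference factor with a period of about N/d frequency bins. The code below recovers d from the same inequalities used to build the aperture; slitCols, separation, and fringePeriod are illustrative names, and the N/d estimate is a textbook approximation rather than something stated in the original example.

```matlab
N2 = 1024;
x = -1:1/N2:(N2-1)/N2;

% Columns where light passes, using the same conditions as the slits mask.
slitCols = find((abs(x) <= 10/N2) & (abs(x) >= 8/N2));

leftCenter  = mean(slitCols(x(slitCols) < 0));   % center of the left slit (index)
rightCenter = mean(slitCols(x(slitCols) > 0));   % center of the right slit (index)
separation  = rightCenter - leftCenter;          % center-to-center distance in samples

fringePeriod = numel(x) / separation;            % approximate fringe spacing in FFT bins
fprintf('Slit separation: %d samples, fringe period: about %.1f bins\n', ...
    separation, fringePeriod);
```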

Display the Light Intensity for Young's Double-Slit

Because the size, type, and complexity of the input remain the same, reuse the generated fft2_mex function and display the intensity as before.

farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);

Generate CUDA MEX Using GPU Pointer as Input

In the CUDA MEX generated above, the input provided to MEX is copied from CPU to GPU memory, the computation is performed on the GPU and the result is copied back to the CPU. Alternatively, CUDA code can be generated such that it accepts GPU pointers directly. For MEX targets, GPU pointers can be passed from MATLAB® to CUDA MEX using gpuArray. For other targets, GPU memory must be allocated and inputs must be copied from CPU to GPU inside the handwritten main function, before they are passed to the entry-point function.

lightsource_gpu = gpuArray(lightsource);
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource_gpu} fft2 -o fft2_gpu_mex

Only numeric and logical input matrix types can be passed as GPU pointers to the entry-point function. Inputs of unsupported data types can be passed as CPU inputs. During code generation, if at least one of the inputs provided to the entry-point function is a GPU pointer, the outputs returned from the function are also GPU pointers. However, if the data type of an output is not supported as a GPU pointer, such as a struct or a cell array, the output is returned as a CPU pointer. For more information on passing GPU pointers to entry-point functions, see Support for GPU Arrays.

Notice the difference in the generated CUDA code when using the lightsource_gpu GPU input. The generated code avoids copying the input from CPU to GPU memory and avoids copying the result back from GPU to CPU memory. This results in fewer cudaMemcpy calls and improves the performance of the generated CUDA MEX.
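The effect of the avoided copies can be measured directly. The sketch below is one way to do that, assuming both fft2_mex and fft2_gpu_mex have been generated as shown above, that lightsource and lightsource_gpu are in the workspace, and that Parallel Computing Toolbox is available for gputimeit. The names tHost and tGpu are illustrative, and actual timings depend on the GPU and system, so none are shown here.

```matlab
% Time the MEX that takes a CPU input; this includes host-to-device and
% device-to-host copies inside the generated code.
tHost = timeit(@() fft2_mex(lightsource));

% Time the MEX that takes a gpuArray input; the data stays on the GPU.
% gputimeit synchronizes with the GPU before reporting the time.
tGpu = gputimeit(@() fft2_gpu_mex(lightsource_gpu));

fprintf('CPU-input MEX: %.4f s, GPU-input MEX: %.4f s\n', tHost, tGpu);
```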

Verify Results of CUDA MEX Using GPU Pointer as Input

To verify that the CUDA MEX generated with a gpuArray input has the same functionality, run the generated fft2_gpu_mex, gather the results on the host, and plot them.

farfieldsignal_gpu = fft2_gpu_mex(lightsource_gpu);
farfieldsignal_cpu = gather(farfieldsignal_gpu);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU using gpuArray');
visualize(farfieldsignal_cpu,str);
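A numeric check can complement the plot. The following sketch assumes the variables from the steps above are still in the workspace. It confirms that the MEX output is a gpuArray, as expected when an input is a GPU pointer, and compares the gathered result with the MATLAB fft2 result for the same double-slit input. The names reference, relErr, and isGpuOutput are illustrative.

```matlab
% Assumes lightsource, farfieldsignal_gpu, and farfieldsignal_cpu exist
% in the workspace from the preceding steps.
isGpuOutput = isa(farfieldsignal_gpu, 'gpuArray');   % output stays on the GPU

reference = fft2(lightsource);                       % MATLAB result on the CPU
relErr = max(abs(farfieldsignal_cpu(:) - reference(:))) / max(abs(reference(:)));

fprintf('Output is gpuArray: %d, maximum relative difference: %g\n', isGpuOutput, relErr);
```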