Cheetah - SKA - PSS - Prototype Time Domain Search Pipeline
ska::cheetah::fft::cuda::Fft Class Reference

A CUDA-specific implementation of the fft module.


Public Types

typedef cheetah::Cuda Architecture
 
typedef panda::nvidia::DeviceCapability< 2, 0, panda::nvidia::giga/2 > ArchitectureCapability
 
typedef panda::PoolResource< Architecture > ResourceType
 

Public Member Functions

 Fft (fft::Config const &algo_config)
 Construct an Fft instance.
 
 Fft (Fft const &)=delete
 
 Fft (Fft &&)=default
 
template<typename T , typename InputAlloc , typename OutputAlloc >
void process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, T, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
 Perform a real-to-complex 1D FFT.
 
template<typename T , typename InputAlloc , typename OutputAlloc >
void process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, T, OutputAlloc > &output)
 Perform a complex-to-real 1D FFT.
 
template<typename T , typename InputAlloc , typename OutputAlloc >
void process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
 Perform a complex-to-complex 1D forward FFT.
 
template<typename T , typename InputAlloc , typename OutputAlloc >
void process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
 Perform a complex-to-complex 1D inverse FFT.
 

Detailed Description

A CUDA-specific implementation of the fft module.

This class provides an interface to NVIDIA's cuFFT library. It wraps plan generation and execution and exposes a single overloaded method, process(), to simplify performing FFTs.

For performance reasons it is best to repeat transforms of the same size and type on a given instance, because this avoids reallocating the FftPlan object and regenerating the cuFFT plan.

It is therefore recommended to create separate instances for forward and inverse transforms, and to keep each instance alive between repeated transforms of the same size.
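The cost model behind this recommendation can be illustrated with a small host-only sketch. It assumes that an instance holds a single plan and rebuilds it whenever the requested transform type, size or batch count changes. The names `PlanCache`, `TransformType` and `request` are hypothetical and are not part of the cheetah API; the actual caching logic lives in FftPlan and may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; not the cheetah FftPlan API.
enum class TransformType { R2C, C2R, C2C };

// Models an object that keeps exactly one plan and rebuilds it on any change.
class PlanCache
{
    public:
        // Returns true if a new plan had to be built, false if the cached plan was reused.
        bool request(TransformType type, std::size_t size, std::size_t batch)
        {
            assert(size > 0 && batch > 0);
            if (_valid && type == _type && size == _size && batch == _batch) {
                return false; // same transform as the previous call: reuse the plan
            }
            // A real implementation would destroy the old cuFFT plan and create a new one here.
            _type = type;
            _size = size;
            _batch = batch;
            _valid = true;
            ++_rebuilds;
            return true;
        }

        // Number of times a plan has been (re)built so far.
        std::size_t rebuilds() const { return _rebuilds; }

    private:
        bool _valid = false;
        TransformType _type = TransformType::C2C;
        std::size_t _size = 0;
        std::size_t _batch = 0;
        std::size_t _rebuilds = 0;
};
```

Under this assumption, alternating forward and inverse transforms on one object rebuilds the plan on every call, whereas two dedicated objects each build their plan once.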

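Note that cuFFT transforms are unnormalized, so iFFT(FFT(A)) = size(A)*A both for complex transforms and for a real-to-complex transform followed by a complex-to-real transform, where size(A) is the number of elements in the original time series. Divide the result by size(A) to recover A.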
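The scaling of an unnormalized round trip can be checked on the host without a GPU. The sketch below uses a naive O(N^2) DFT written for illustration, not cuFFT, for the complex-to-complex case; `naive_dft` and `rescale` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Unnormalized DFT: sign = -1 for the forward transform, +1 for the inverse.
// No 1/N factor is applied in either direction.
std::vector<Complex> naive_dft(std::vector<Complex> const& in, int sign)
{
    assert(sign == 1 || sign == -1);
    std::size_t const n = in.size();
    double const pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            double const angle = sign * 2.0 * pi * static_cast<double>(j * k)
                                 / static_cast<double>(n);
            sum += in[j] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Divide every element by the series length to undo the round-trip scaling.
std::vector<Complex> rescale(std::vector<Complex> data)
{
    double const n = static_cast<double>(data.size());
    for (auto& value : data) {
        value /= n;
    }
    return data;
}
```

Applying `naive_dft` forward and then inverse returns the input multiplied by its length; passing that result through `rescale` recovers the original values to within rounding error.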

Definition at line 41 of file Fft.cuh.

Constructor & Destructor Documentation

◆ Fft()

ska::cheetah::fft::cuda::Fft::Fft ( fft::Config const &  algo_config)

Construct an Fft instance.

Parameters
[in] algo_config  A configuration object for the Fft instance

Definition at line 8 of file Fft.cu.

9  : utils::AlgorithmBase<Config, fft::Config>(algo_config.cuda_config(),algo_config)
10 {
11 }

Member Function Documentation

◆ process() [1/4]

template<typename T , typename InputAlloc , typename OutputAlloc >
void ska::cheetah::fft::cuda::Fft::process ( ResourceType &  gpu,
data::TimeSeries< cheetah::Cuda, T, InputAlloc > const &  input,
data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &  output 
)

Perform a real-to-complex 1D FFT.

The output object is resized to input.size()/2 + 1 elements, the expected output size of the transform. Reusing an output object that already has the correct size on each call therefore gives the best performance. The method also updates the metadata of the output from the metadata of the input: the frequency step is set to 1/(sampling interval * input size).

Parameters
[in] gpu  A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context.
[in] input  A real TimeSeries instance to be transformed
[out] output  A complex FrequencySeries instance for the transform output
Template Parameters
T  The base value type for the transform (float or double)
InputAlloc  The allocator type of the input
OutputAlloc  The allocator type of the output
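The output size and frequency step computed by this overload can be reproduced with plain host arithmetic. The sketch below mirrors the two expressions in the function body; the helper names `r2c_output_size` and `frequency_step_hz` are hypothetical and do not exist in cheetah.

```cpp
#include <cassert>
#include <cstddef>

// Number of complex bins produced by a real-to-complex transform of n real samples.
// Only the non-redundant half of the Hermitian-symmetric spectrum is stored.
std::size_t r2c_output_size(std::size_t n)
{
    assert(n > 0);
    return n / 2 + 1;
}

// Frequency resolution in Hz of a transform over n samples spaced dt_seconds apart.
double frequency_step_hz(double dt_seconds, std::size_t n)
{
    assert(dt_seconds > 0.0 && n > 0);
    return 1.0 / (dt_seconds * static_cast<double>(n));
}
```

For example, 1024 samples at a sampling interval of 64 microseconds give 513 output bins spaced 1/(64e-6 * 1024) Hz apart, which is roughly 15.26 Hz.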

Definition at line 81 of file Fft.cu.

84 {
85  typedef detail::CufftHelper<T> Cufft;
86  typedef typename Cufft::RealType RealType;
87  typedef typename Cufft::ComplexType ComplexType;
88  PANDA_LOG_DEBUG << "GPU ID: "<<gpu.device_id();
89  //update the size of the output buffer to match output transform size
90  output.resize(input.size()/2 + 1);
91  //Calculate the new frequency step that the output will have
92  output.frequency_step((1.0f/(input.sampling_interval().value() * input.size())) * data::hz);
93  RealType const* in = thrust::raw_pointer_cast(input.data());
94  ComplexType* out = (ComplexType*) thrust::raw_pointer_cast(output.data());
95  CUFFT_ERROR_CHECK(Cufft::r2c(_plan.plan<T>(R2C,input.size(),1), in, out));
96 }
cufftHandle const & plan(FftType fft_type, std::size_t size, std::size_t batch)
Get (or create) a cufft plan.
Definition: FftPlan.cu:9

◆ process() [2/4]

template<typename T , typename InputAlloc , typename OutputAlloc >
void ska::cheetah::fft::cuda::Fft::process ( ResourceType &  gpu,
data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, InputAlloc > const &  input,
data::TimeSeries< cheetah::Cuda, T, OutputAlloc > &  output 
)

Perform a complex-to-real 1D FFT.

The output object is resized to 2*(input.size() - 1) elements, the expected output size of the transform. Reusing an output object that already has the correct size on each call therefore gives the best performance. The method also updates the metadata of the output from the metadata of the input: the sampling interval is set to 1/(frequency step * output size).

Parameters
[in] gpu  A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context.
[in] input  A complex FrequencySeries instance to be transformed
[out] output  A real TimeSeries instance for the transform output
Template Parameters
T  The base value type for the transform (float or double)
InputAlloc  The allocator type of the input
OutputAlloc  The allocator type of the output
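Because the output length is derived only from the number of complex bins, a real-to-complex transform followed by this overload restores the original length only when that length was even. The sketch below works through the size arithmetic on the host; the helper names are hypothetical and mirror the resize expressions in the two function bodies.

```cpp
#include <cassert>
#include <cstddef>

// Complex bins produced by a real-to-complex transform of n real samples.
std::size_t r2c_output_size(std::size_t n)
{
    assert(n > 0);
    return n / 2 + 1;
}

// Real samples produced by a complex-to-real transform of the given number of bins.
// This matches output.resize(2*(input.size() - 1)) and so always yields an even length.
std::size_t c2r_output_size(std::size_t bins)
{
    assert(bins >= 1);
    return 2 * (bins - 1);
}

// Length of a real series after a forward transform followed by an inverse transform.
std::size_t round_trip_size(std::size_t n)
{
    return c2r_output_size(r2c_output_size(n));
}
```

An even length such as 1024 survives the round trip (1024 -> 513 -> 1024), while an odd length such as 1025 maps to the same 513 bins and comes back as 1024 samples.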

Definition at line 99 of file Fft.cu.

102 {
103  typedef detail::CufftHelper<T> Cufft;
104  typedef typename Cufft::RealType RealType;
105  typedef typename Cufft::ComplexType ComplexType;
106  PANDA_LOG_DEBUG << "GPU ID: "<<gpu.device_id();
107  //update the size of the output buffer to match output transform size
108  output.resize(2*(input.size() - 1));
109  //Calculate the new sampling time that the output will have
110  output.sampling_interval((1.0f/(input.frequency_step().value()*output.size())) * data::seconds);
111  ComplexType const* in = (ComplexType*) thrust::raw_pointer_cast(input.data());
112  RealType* out = thrust::raw_pointer_cast(output.data());
113  CUFFT_ERROR_CHECK(Cufft::c2r(_plan.plan<T>(C2R,input.size(),1), in, out));
114 }

◆ process() [3/4]

template<typename T , typename InputAlloc , typename OutputAlloc >
void ska::cheetah::fft::cuda::Fft::process ( ResourceType &  gpu,
data::TimeSeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &  input,
data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &  output 
)

Perform a complex-to-complex 1D forward FFT.

The output object is resized to input.size() elements, the expected output size of the transform. Reusing an output object that already has the correct size on each call therefore gives the best performance. The method also updates the metadata of the output from the metadata of the input: the frequency step is set to 1/(sampling interval * input size).

Parameters
[in] gpu  A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context.
[in] input  A complex TimeSeries instance to be transformed
[out] output  A complex FrequencySeries instance for the transform output
Template Parameters
T  The base value type for the transform (float or double)
InputAlloc  The allocator type of the input
OutputAlloc  The allocator type of the output

Definition at line 117 of file Fft.cu.

120 {
121  typedef detail::CufftHelper<T> Cufft;
122  typedef typename Cufft::ComplexType ComplexType;
123  PANDA_LOG_DEBUG << "GPU ID: "<<gpu.device_id();
124  //update the size of the output buffer to match output transform size
125  output.resize(input.size());
126  //Calculate the new frequency step that the output will have
127  output.frequency_step((1.0f/(input.sampling_interval().value() * input.size())) * data::hz);
128  ComplexType const* in = (ComplexType*) thrust::raw_pointer_cast(input.data());
129  ComplexType* out = (ComplexType*) thrust::raw_pointer_cast(output.data());
130  CUFFT_ERROR_CHECK(Cufft::c2c(_plan.plan<T>(C2C,input.size(),1), in, out, CUFFT_FORWARD));
131 }

◆ process() [4/4]

template<typename T , typename InputAlloc , typename OutputAlloc >
void ska::cheetah::fft::cuda::Fft::process ( ResourceType &  gpu,
data::FrequencySeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &  input,
data::TimeSeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &  output 
)

Perform a complex-to-complex 1D inverse FFT.

The output object is resized to input.size() elements, the expected output size of the transform. Reusing an output object that already has the correct size on each call therefore gives the best performance. The method also updates the metadata of the output from the metadata of the input: the sampling interval is set to 1/(frequency step * output size).

Parameters
[in] gpu  A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context.
[in] input  A complex FrequencySeries instance to be transformed
[out] output  A complex TimeSeries instance for the transform output
Template Parameters
T  The base value type for the transform (float or double)
InputAlloc  The allocator type of the input
OutputAlloc  The allocator type of the output
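The metadata update in this overload is the inverse of the one applied by the forward complex-to-complex overload, so a forward transform followed by an inverse transform restores the original sampling interval. The sketch below checks this on the host. `TimeMeta`, `FreqMeta` and the two functions are hypothetical stand-ins for the metadata carried by data::TimeSeries and data::FrequencySeries, and they model only the size and step bookkeeping, not the transform itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical metadata holders used for illustration only.
struct TimeMeta { std::size_t size; double sampling_interval_s; };
struct FreqMeta { std::size_t size; double frequency_step_hz; };

// Forward complex-to-complex: same length, frequency step = 1 / (dt * N).
FreqMeta forward_c2c_meta(TimeMeta const& in)
{
    assert(in.size > 0 && in.sampling_interval_s > 0.0);
    return FreqMeta{in.size, 1.0 / (in.sampling_interval_s * static_cast<double>(in.size))};
}

// Inverse complex-to-complex: same length, sampling interval = 1 / (df * N).
TimeMeta inverse_c2c_meta(FreqMeta const& in)
{
    assert(in.size > 0 && in.frequency_step_hz > 0.0);
    return TimeMeta{in.size, 1.0 / (in.frequency_step_hz * static_cast<double>(in.size))};
}
```

Composing the two functions returns the starting size and, up to floating-point rounding, the starting sampling interval.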

Definition at line 134 of file Fft.cu.

137 {
138  typedef detail::CufftHelper<T> Cufft;
139  typedef typename Cufft::ComplexType ComplexType;
140  PANDA_LOG_DEBUG << "GPU ID: "<<gpu.device_id();
141  //update the size of the output buffer to match output transform size
142  output.resize(input.size());
143  //Calculate the new sampling time that the output will have
144  output.sampling_interval((1.0f/(input.frequency_step().value()*output.size())) * data::seconds);
145  ComplexType const* in = (ComplexType*) thrust::raw_pointer_cast(input.data());
146  ComplexType* out = (ComplexType*) thrust::raw_pointer_cast(output.data());
147  CUFFT_ERROR_CHECK(Cufft::c2c(_plan.plan<T>(C2C,input.size(),1), in, out, CUFFT_INVERSE));
148 }

The documentation for this class was generated from the following files: