Cheetah - SKA - PSS - Prototype Time Domain Search Pipeline
A cuda-specific implementation of the fft module.
Public Types | |
typedef cheetah::Cuda | Architecture |
typedef panda::nvidia::DeviceCapability< 2, 0, panda::nvidia::giga/2 > | ArchitectureCapability |
typedef panda::PoolResource< Architecture > | ResourceType |
Public Member Functions | |
Fft (fft::Config const &algo_config) | |
Construct an Fft instance. | |
Fft (Fft const &)=delete | |
Fft (Fft &&)=default | |
template<typename T , typename InputAlloc , typename OutputAlloc > | |
void | process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, T, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output) |
Perform a real-to-complex 1D FFT. | |
template<typename T , typename InputAlloc , typename OutputAlloc > | |
void | process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, T, OutputAlloc > &output) |
Perform a complex-to-real 1D FFT. | |
template<typename T , typename InputAlloc , typename OutputAlloc > | |
void | process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output) |
Perform a complex-to-complex 1D forward FFT. | |
template<typename T , typename InputAlloc , typename OutputAlloc > | |
void | process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output) |
Perform a complex-to-complex 1D inverse FFT. | |
A cuda-specific implementation of the fft module.
This class provides an interface to Nvidia's cuFFT library. It wraps plan generation and execution, and exposes a single overloaded method, process(), to simplify performing FFTs.
For best performance, repeat transforms of the same size and type: in that case the FftPlan object is not reallocated and the cuFFT plan is not regenerated.
It is therefore recommended to create separate instances for forward and inverse transforms, and to keep each instance alive between repeated transforms of the same size.
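The cost model behind this recommendation can be sketched in plain C++. This is a minimal illustration only: PlanCache, TransformType and ensure() are hypothetical names, not part of Cheetah or cuFFT, and the sketch assumes that a plan is regenerated whenever the requested size or type differs from the previous call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; not Cheetah or cuFFT types.
enum class TransformType { R2C, C2R, C2C_Forward, C2C_Inverse };

class PlanCache {
public:
    // Returns true if a new plan had to be generated for this request.
    bool ensure(TransformType type, std::size_t size)
    {
        assert(size > 0);
        if (_valid && type == _type && size == _size) {
            return false; // same size and type: the existing plan is reused
        }
        // A real wrapper would destroy the old plan and create a new one
        // here, which is the expensive step this sketch only counts.
        _type = type;
        _size = size;
        _valid = true;
        ++_rebuilds;
        return true;
    }

    // Number of times a plan was (re)generated so far.
    std::size_t rebuilds() const { return _rebuilds; }

private:
    TransformType _type = TransformType::R2C;
    std::size_t _size = 0;
    bool _valid = false;
    std::size_t _rebuilds = 0;
};
```

Under these assumptions, one instance that alternates between forward and inverse transforms rebuilds its plan on every call, whereas two instances (one per direction) each build their plan once.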
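Note that cuFFT transforms are unnormalized: iFFT(FFT(A)) = size(A)*A, both for complex transforms and for a real-to-complex transform followed by a complex-to-real transform (where size(A) is the number of real samples). Divide the result by size(A) to recover the original data.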
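The scaling of the complex round trip can be checked without a GPU. The sketch below uses a naive O(N^2) unnormalized DFT as a stand-in for cuFFT; dft() and rescale() are hypothetical helpers written for this illustration and are not part of Cheetah or cuFFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive unnormalized DFT: sign = -1 is the forward transform,
// sign = +1 is the inverse. No 1/N factor is applied in either direction.
std::vector<std::complex<double>> dft(std::vector<std::complex<double>> const& in, int sign)
{
    assert(sign == 1 || sign == -1);
    std::size_t const n = in.size();
    double const pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            double const angle = sign * 2.0 * pi * static_cast<double>(k * j)
                                 / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Undo the factor of N picked up by a forward + inverse round trip.
void rescale(std::vector<std::complex<double>>& data)
{
    double const n = static_cast<double>(data.size());
    for (auto& value : data) {
        value /= n;
    }
}
```

Calling dft(dft(a, -1), +1) returns every element of a multiplied by a.size(); applying rescale() to that result recovers a up to rounding error.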
ska::cheetah::fft::cuda::Fft::Fft (fft::Config const &algo_config)
void ska::cheetah::fft::cuda::Fft::process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, T, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
Perform a real-to-complex 1D FFT.
The output object is resized to match the expected output size of the transform, so it is most efficient to reuse an output object that already has the correct size on each call to this method. The method also updates the metadata of its output based on the metadata of its input.
[in] | gpu | A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context. |
[in] | input | A real TimeSeries instance to be transformed |
[out] | output | A complex FrequencySeries instance for the transform output |
T | The base value type for the transform (float or double) |
InputAlloc | The allocator type of the input |
OutputAlloc | The allocator type of the output |
Definition at line 81 of file Fft.cu.
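As a guide to what "the expected output size" means for this overload: cuFFT's real-to-complex transform stores only the non-redundant half of the spectrum, so n real samples produce n/2 + 1 complex bins (integer division). The helper below is a hypothetical illustration of that arithmetic and assumes the Cheetah wrapper sizes its output by the same convention.

```cpp
#include <cassert>
#include <cstddef>

// Number of complex bins produced by a real-to-complex transform of
// n_real samples, following cuFFT's R2C layout (non-redundant half only).
// Hypothetical helper; not part of Cheetah.
std::size_t r2c_output_size(std::size_t n_real)
{
    assert(n_real > 0);
    return n_real / 2 + 1;
}
```

For example, r2c_output_size(8) and r2c_output_size(9) both give 5, so an output FrequencySeries that already holds 5 elements would not need resizing in either case.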
void ska::cheetah::fft::cuda::Fft::process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, T, OutputAlloc > &output)
Perform a complex-to-real 1D FFT.
The output object is resized to match the expected output size of the transform, so it is most efficient to reuse an output object that already has the correct size on each call to this method. The method also updates the metadata of its output based on the metadata of its input.
[in] | gpu | A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context. |
[in] | input | A complex FrequencySeries instance to be transformed |
[out] | output | A real TimeSeries instance for the transform output |
T | The base value type for the transform (float or double) |
InputAlloc | The allocator type of the input |
OutputAlloc | The allocator type of the output |
Definition at line 99 of file Fft.cu.
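For the complex-to-real direction the output length is not uniquely determined by the input length: a half-spectrum of m complex bins can correspond to either 2*(m-1) or 2*m-1 real samples. The sketch below only enumerates the two candidates; which one this method selects is not stated here, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// The two real-sample counts whose real-to-complex transform yields
// n_complex bins. Hypothetical helper; not part of Cheetah.
struct RealLengths {
    std::size_t even; // 2 * (n_complex - 1)
    std::size_t odd;  // 2 * n_complex - 1
};

RealLengths c2r_candidate_sizes(std::size_t n_complex)
{
    assert(n_complex > 0);
    return RealLengths{2 * (n_complex - 1), 2 * n_complex - 1};
}
```

With 5 complex bins the candidates are 8 and 9 real samples. Note that for a single bin the even candidate is 0, which is not a usable transform length.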
void ska::cheetah::fft::cuda::Fft::process (ResourceType &gpu, data::TimeSeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::FrequencySeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
Perform a complex-to-complex 1D forward FFT.
The output object is resized to match the expected output size of the transform, so it is most efficient to reuse an output object that already has the correct size on each call to this method. The method also updates the metadata of its output based on the metadata of its input.
[in] | gpu | A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context. |
[in] | input | A complex TimeSeries instance to be transformed |
[out] | output | A complex FrequencySeries instance for the transform output |
T | The base value type for the transform (float or double) |
InputAlloc | The allocator type of the input |
OutputAlloc | The allocator type of the output |
Definition at line 117 of file Fft.cu.
void ska::cheetah::fft::cuda::Fft::process (ResourceType &gpu, data::FrequencySeries< cheetah::Cuda, thrust::complex< T >, InputAlloc > const &input, data::TimeSeries< cheetah::Cuda, typename data::ComplexTypeTraits< cheetah::Cuda, T >::type, OutputAlloc > &output)
Perform a complex-to-complex 1D inverse FFT.
The output object is resized to match the expected output size of the transform, so it is most efficient to reuse an output object that already has the correct size on each call to this method. The method also updates the metadata of its output based on the metadata of its input.
[in] | gpu | A PoolResource object of architecture type Cuda. The object contains information about the selected GPU and the current context. |
[in] | input | A complex FrequencySeries instance to be transformed |
[out] | output | A complex TimeSeries instance for the transform output |
T | The base value type for the transform (float or double) |
InputAlloc | The allocator type of the input |
OutputAlloc | The allocator type of the output |
Definition at line 134 of file Fft.cu.