Thursday, February 2, 2012

On CFT, DFT and their normalization

For me to do a FFT is a bit uncommon. In fact I rather prefer to do a DFT, to avoid having spurious lobes and to try to skip the process of windowing and all that. It is always with care that I approach it, as although one has a basic knowledge of its workings, it is kind of an old lore with dust when one does not use it very often. Well the thing is, although one knows a bit of Fourier analysis, to handle an FFT algorithm is sometimes handling a black box and the worse happens when we think we know what we're doing, but in truth we don't.

I'm writing this because I wanted to compare a the DFT and the CFT of a specific function, to which the first apparent difficulty was to normalize the DFT output. It is trivial to find how one should normalize the power spectral densities (PSD) after doing an FFT with MATLAB, OCTAVE or Python. But to normalize already on the DFT output, that is kind of difficult to find. But alas, I knew my function and my CFT, so this turned to be quite simple. The function is an Agnesi shape (or a version of the Cauchy distribution):
$$ f(x)=h\,\left[1+\left(\frac{x}{a}\right)^2\right]^{-1} $$
to which the continuous Fourier transform yields:
$$ F(\xi)=\int\limits_{-\infty}^{\infty} f(x)\,e^{-2\pi\,i\,x\,\xi}\,dt = \pi\,a\,h\,e^{-2\pi\,|\xi|\,a} $$
So it is easy to find that the amplitude at the zero frequency is equivalent to the area below the spatial (or time) domain curve:
$$ F(\xi=0)=\pi\,a\,h\;\equiv\;\int\limits_{-\infty}^{\infty} f(x)\,dx $$
Thus I want the output of my DFT to have a zero coefficient with the same properties, which gives the normalization I must apply.

Advancing to the DFT, one should first define what is exactly the discrete signal. Defining a time array with N equaly spaced bins of size $\Delta t$:
$$ \mathsf{t}=\{0,\,1,\,\ldots,\,N-1\} \times \Delta t \quad\Rightarrow\quad \mathsf{t}_n=n\,\Delta t\,, \;n=0,\,1,\,\ldots,\,N-1 $$
and the respective sampled signal:
$$ \mathsf{f}=\{ f_0,\,f_1,\,\ldots,\,f_{N-1}\} $$
The sampling frequency is the inverse of the sampling period, $f_s={\Delta t}^{-1}$ .

The DFT is thus defined as:
$$ \operatorname{DFT}(\mathsf{f}) = \mathsf{\widehat{F}}_k = \sum\limits_{n=0}^{N-1} \mathsf{f}_n \, e^{-2\pi\,i\,\frac{n}{N}\,k} \;,\quad k = 0,\,1,\,\ldots,\,Nk-1 $$
As I prefer to have the same size for the array returned by the DFT, $Nk$, as the number of samples I have available in my signal, $N$, I sometimes perform a DFT instead of the more faster FFT (when this number is not a power of two).

This is the same definition as the implementation in Numpy, as well as MATLAB and OCTAVE. As it is stated in the Numpy function reference list:
The values in the result follow so-called "standard" order: If A = fft(a, n), then A[0] contains the zero-frequency term (the mean of the signal), which is always purely real for real inputs. Then A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency.

By making some simple tests, one finds that the DFT coefficients should be multiplied by the sampling period $\Delta t$ (or divided by the sampling frequency $fs$). This is because the zero frequency coefficient of the DFT is equal to the sum of the elements in the array $\mathsf{f}$. If one wants to be consistent with the results of the CFT (recall that $F(0)$ is equal to the area below function $f(t)$), multiplying the DFT by $\Delta t$ yields a zero coefficient that is also the area beneath the discrete array $\mathsf{f}$, using the most basic numeric integration method:
$$ \mathsf{\widehat{F}}_0 = \sum\limits_{n=0}^{N-1} \mathsf{f}_n \quad\Rightarrow\quad \mathsf{F}_0 = \mathsf{\widehat{F}}_0 \, \Delta t \quad\Rightarrow\quad \mathsf{F}_n = \mathsf{\widehat{F}}_n \, \Delta t $$
Now for the tricky part: when I did my DFT, I did it on an array where the x coordinate was defined as follows:
$$ \mathsf{x} = \{0,\,1,\,\ldots,\,N-1\} \times \Delta x + \mathrm{xmin} \\ \ldots = \{\mathrm{xmin}, \, \mathrm{xmin}-\Delta x,\,\ldots, \, -\Delta x,\, 0,\, \Delta x,\, \ldots,\, \mathrm{xmax}-\Delta x,\, \mathrm{xmax}\} $$
and my f array as:
$$ \mathsf{f}_n = f(\mathsf{x}_n) $$
This yielded the following results for the DFT (both real and imaginary parts):



where,the wavenumber is simply $k = 2 \pi \xi$.
At first, not realizing the correctness of the DFT output, I used python functions fftshift and ifftshift, such that:


F = fft(ifftshift(f), n=NDFT)


which is the same has having the following signal:



Although this would give a DFT that agreed with the CFT, conceptually it was not right.

After further analysis my mistake was evident: although the CFT is done on both directions, i.e., it is done using an improper integral spanning from minus infinity to infinity, the DFT is done on a finite signal, where the first index is 0 and the last is $N-1$. Thus when performing the DFT, is over a signal shifted by a distance $\mathrm{xmin}$. Analytically this is the same as:
$$ g(x) = f(x-\gamma) \quad\Rightarrow\quad G(\xi) = e^{-2\pi\,i\,\gamma\,\xi} \, F(\xi) $$
When viewing from the discrete perspective, the shifting is the same as if an index $l$ was applied, thus:
$$ \mathsf{g}_n = \mathsf{f}_{n-l} \quad\Rightarrow\quad \mathsf{G}_k = e^{-2 \pi\,i\,k\,\frac{l}{N}} \, \mathsf{F}_{k} $$
where:
$$ l = \begin{cases} \frac{N-1}{2} \,, & \text{if} \; \mod(N, 2) \ne 0 \\ \frac{N}{2}-0.5 \,, & \text{if} \; \mod(N, 2) = 0 \end{cases} $$
As I want to compare the DFT with the CFT performed on the Agnesi shape without any shifting, I will obtain $\mathsf{G}$ and from it compute $\mathsf{F}$, such that:
$$ \mathsf{F}_k = e^{+2\pi\,i\,k\,\frac{l}{N}} \, \mathsf{G}_{k} $$
which in python code, is equivalent to:

F = np.fft.fft(f, n=NDFT)

if 0==(NDFT % 2):
    l = N/2-0.5
else:
    l = (N-1)/2

F = F * np.exp(2*np.pi * 1j * np.arange(0, NDFT) * l/float(N))

With this correction the DFT yields:





Power Spectral Density (PSD)


Regarding the power spectral density, I obtained:


When searching for ways to normalize the PSD, a common way is (same as in the MATLAB User's Guide):


PSD = np.abs(F)**2/f.size/FS


or using the period (spatial resolution in this case):


PSD = np.abs(F)**2 * DX/f.size



Resorting to the definition of the energy spectral density:
$$ \Phi(\xi) = F(\xi) \cdot \text{conj}\left(F(\xi)\right) \equiv |F(\xi)|^2 $$
and the PSD for an analytic continuous signal:
$$ S(\xi) = \lim\limits_{T\rightarrow\infty} \frac{1}{T} \Phi(\xi) = \lim\limits_{T\rightarrow\infty} \frac{ F(\xi) \cdot \text{conj}\left(F(\xi)\right) }{T} $$
Thus, when looking back to the way the DFT was normalized, in order to resemble the CFT output, generically one can write:
$$ \text{DFT}(\mathsf{f}) = \widehat{\mathsf{F}} \quad \Rightarrow\quad |\widehat{\mathsf{F}}|^2 \, \frac{\Delta t}{N} \;\equiv\; \frac{ \; |\widehat{\mathsf{F}} \, \Delta t|^2 }{\Delta t \, N} \;=\; \frac{|\mathsf{F}|^2}{T} $$
In the above notation, the hat was used to represent the unprocessed output from the DFT, while $\mathsf{F}$ without the hat is the DFT normalized by $\Delta t$. If one obtains the DFT and use it directly to get the PSD, as it is done in the MATLAB reference page, it becomes difficult to distinguish why should the PSD be normalized in such a way, i.e., dividing by $f_s\,N$. Normalizing the DFT output it becomes clear that the PSD is obtained by computing the energy spectra and dividing it by the integral time span of the discrete finite signal. It becomes easier to understand how the PSD relates to the energy spectra and the DFT.



Amplitude or Spectral Density (ASD)


When plotting the frequency content of a signal it is common to resort to the amplitude instead of the PSD. This is useful when dealing with a signal containing several sinusoids at different frequencies, such as:
$$ f(t) = a_1 \sin(2 \pi \, \omega_1 \, t) + a_2 \sin(2 \pi \, \omega_2 \, t) \ldots + a_m \sin(2 \pi \, \omega_m \, t) \ldots + a_M \sin(2 \pi \, \omega_M \, t) $$
Then it would be advantageous if, when viewing the spectrum of $f(t)$, besides identifying the frequencies $\omega_m$, one would also see the correct amplitudes $a_m$ on the vertical axis. The amplitude spectra is given by the magnitude of the DFT, being also related with the energy spectra:
$$ \mathsf{A}_k = \frac{ \text{abs}(\, \widehat{\mathsf{F}}_k ) }{N} \equiv \frac{ \text{abs}(\, \mathsf{F}_k ) }{N \, \Delta t} $$
where the division by $N$ is present to normalize the amplitude,
such that the zero coefficient becomes equal to the mean of the array $\mathsf{f}_n$. Using the DFT output normalized by multiplying with $\Delta t$, again one observes that the normalizing quantity of the ASD becomes the integral time span of the signal. Analytically, the ASD is related to both the energy and power spectra by:
$$ A(\xi) = \frac{ |F(\xi)| }{N \, \Delta t} = \frac{\sqrt{\Phi(\xi)}}{T} \equiv \sqrt{ \frac{ \mathrm{S}_f(\xi) }{T} } $$
Now, the DFT is composed by the zero frequency, a set of floor(NDFT/2) coefficients for positive frequencies an equal set of floor(NDFT/2) coefficients for negative frequencies. The amplitude $a_m$ is thus returned in the ASD, but half at the positive frequency $\omega_m$, while the other half is in the negative frequency $-\omega_m$. AS one is only interested in plotting the positive frequencies, to have the correct value of $a_m$ when plotting the ASD:
$$ \widetilde{\mathsf{A}}_k = 2\,\mathsf{A}_k \;,\quad \text{for}\quad k>0 \\ \widetilde{\mathsf{A}}_0 = \mathsf{A}_0 $$
To finish this post, an example for a signal composed of two sinusoids:



This figure was generated with the following python code: