PyDAQ 4: Fourier transform `(home)`¶

In the previous exercise (PyDAQ 3), you whistled a filter; you measured the frequency response by looking at $V_\text{rms}$ for varying frequencies. Sampling the signal via the data acquisition board gives the representation of the signal in the time domain. However, as you have seen in the SCR exercises, we are often not so much interested in the amplitudes of the individual samplings, but in the frequencies that a signal is composed of.

In Analysis, you learned that the Fourier transform converts a function in the time domain $f(t)$ to a function in the frequency domain $\hat{f}(\omega)$, just the way a prisma splits a white light beam into all its constitutive colours, displaying the frequency spectrum of the white light. Remember that the equation for the Fourier transform is:

$$ \hat{f}(\omega) = \int_{-\infty}^\infty f(t) e^{-i\omega t}\text{d}t $$

In Electronic Instrumentation, the signals are not infinite and not continuous, because they are sampled using a data acquisition board. The above formula cannot be used, because we cannot measure $f(t)$ on every infinitesimally small $t$. Therefore, we will use the discrete Fourier transform (DFT), which looks like this:

$$ \hat{f}[\Omega] = \sum_{n=0}^{N-1} f[n] e^{-i \frac{2\pi}{N}\Omega n}, \Omega=0,...,N-1 $$

In Systems and Signals, you will learn more about the discrete-time Fourier transform (DTFT) and the discrete Fourier transform (DFT). In principle, they still decompose the signal into the frequencies that it is composed of.

Discrete Fourier transform in Python¶

We will mimic a voltage measurement with the data acquisition board. In the code below, time corresponds to the timepoints of a measurement and voltage corresponds to the measured voltage. We will measure Nsamples=350 samples equally spaced over a time inverval duration of dur = 2 seconds. You can calculate that this corresponds to a sampling rate of 175 Hz.

Exercise 1: Run the following cell and observe that the voltage signal is composed of two signals. (a) How do you calculate the sampling rate? (b) What are the frequencies of the two signals? (c) What are the amplitudes?

import numpy as np
import matplotlib.pyplot as plt

Nsamples = 350
duration = 2 # seconds

print('The sampling rate equals: '+str(Nsamples/duration)+' Hz.')

time = np.linspace(0,duration,Nsamples)
freq1 = 5.0 # Hz
freq2 = 18.0 # Hz
voltage = 1.4 * np.sin( freq1 * 2.0 * np.pi * time) + 0.5 * np.sin( freq2 * 2.0 * np.pi * time)

plt.plot(time,voltage,'x-')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.grid()

The sampling rate equals: 175.0 Hz.

> Write your answer here
a. ...
b. The two frequencies are ...
c. Amplitudes are ...

As you see, the time domain does not display so clearly that the signal is composed of two frequencies with two separate amplitudes. The goal of this exercise is to retrieve these frequencies and amplitudes using the DFT.

Exercise 2: Plot the DFT of voltage using np.fft.fft() (fft means fast Fourier transform). (a) How many peaks do you expect? (b) Why do we take the absolute value of the DFT, instead of plotting the DFT right away? (c) Is the frequency axis correct?

Fourier_voltage = np.fft.fft( voltage )
plt.plot(np.abs(Fourier_voltage))
plt.xlabel('omega in Hz?')
plt.ylabel('amplitude in V?')

Text(0, 0.5, 'amplitude in V?')

> Write your answer here
a. how many peaks:
b. why absolute value:
c. frequency axis correct?

There are several peculiarities in this plot.

Firstly, the frequency axis is not correct. We can easily calculate the frequency axis using np.fft.fftfreq(Nsamples, d=duration/Nsamples). It will give a list with Nsamples//2 frequencies in the range $-\frac{f_\text{sampling}}{2} < \omega \leq \frac{f_\text{sampling}}{2}$.

Secondly, we see four peaks instead of two. This is because Python calculates the DFT for positive and negative frequencies. The elements 1:Nsamples//2 of the list Fourier_voltage correspond to strictly positive $\omega$ and the elements (Nsamples+1)//2: correspond to strictly negative $\omega$, where the final element is for $\omega$ closest to zero and the element (Nsamples+1)//2 for the most negative $\omega$. The element 0 corresponds to $\omega=0$. You could also derive this information from the fftfreq function (above). For our purposes, we should take the absolute values of the negative and positive frequencies and add them on top of each other.

Thirdly, the height of the peaks does not correspond to the amplitudes in the original signal. To solve this, you should know that we have to divide the DFT by the number of samples. The reason for this is explained in the appendix at the end of this document.

Exercise 3: Solve the above peculiarities and plot the DFT in the frequency range $\omega \in [0,\frac{f_\text{sampling}}{2}]$. Your code should also work when Nsamples is odd.

Fourier_voltage = np.fft.fft( voltage )

DFT_frequencies = ................

DFT_plot = ..............
...........

plt.plot( DFT_frequencies, DFT_plot )
plt.xlabel('Frequency (Hz)')
plt.ylabel('Amplitude')
plt.grid()

If you did this correctly, you should see two peaks at 5 and 18 Hz with amplitudes 1.4 and 0.5. This is exactly the frequency spectrum that we created in exercise 1. If your peaks are twice as low, you did not place the positive and negative frequencies on top of each other correctly. They are probably off by one index. If you have even smaller peaks, you probably did not take the absolute value of the DFT first before placing the frequencies on top of each other.

Exercise 4: Use your code to calculate the DFT of the following signal. You should obtain the same plot with an additional additional peak at $\omega=0$ with a height of 0.8.

If you have an error about incompatible shapes, your code is probably not handling the odd Nsamples well.
If the additional peak at $\omega=0$ is not there, you are probably not plotting the 0-th element of np.fft.fft() or you are still using the old voltage variable above instead of the new one below.
If the additional peak has height 0.4 or 1.6 instead of 0.8, remember that you should not alter the 0-th element that you retrieve from np.fft.fft(). You should only add up the strictly negative $\omega$ to the strictly positive $\omega$. The reason for this is explained in the appendix.
If some other peaks are twice as high or twice as low, you did not place the positive and negative frequencies on top of each other correctly.
If the relative heights of the peaks are not corresponding to the relative heights in the original signal, you may have added the signals before taking the absolute value.

Nsamples = 351
time = np.linspace(0,duration,Nsamples)
voltage = 1.4 * np.sin( 5.0 * 2.0 * np.pi * time) + 0.5 * np.cos( 18.0 * 2.0 * np.pi * time) + 0.8

If everything worked out well, you can use your code to make a function calculate_fft(). The input parameters are:

signal: the signal in the time domain
dur: the duration of the signal in seconds
rate: the sampling rate of the signal in Hz

As you may see, we already wrote the backbone of this function. It allows you to call the function with only dur or rate. The code will calculate the mising one automatically. If you specify dur and rate, the code will check if they are coherent with the length of the signal.

Exercise 5: Complete this function that calculates the DFT. You will use this function later in the course.

def calculate_fft( signal, dur=None, rate=None ) -> tuple:
    
    import numpy as np
    
    Nsamples = len(signal)

    # CHECK IF ENOUGH VARIABLES ARE GIVEN AND CALCULATE MISSING VARIABLES
    assert not (dur is None and rate is None), "You have to define at least one of these two variables: 'dur' or 'rate'."
    dur = dur or Nsamples/rate
    rate = rate or Nsamples/dur
    assert dur == Nsamples/rate, "The sample duration 'dur' (" + str(dur)+" sec) and sampling frequency 'rate' ("+str(rate)+" Hz) do not correspond to the length of 'signal' ("+str(Nsamples)+")."
     
    
    # CALCULATE THE FREQUENCY AXIS
    frequency_axis = ..........
    
    # CALCULATE THE FOURIER TRANSFORM
    DFT = ...........
    ...........
    
    
    return frequency_axis, DFT

Exercise 6: Test your function professionally using this piece of test code. For the purpose of this course, it is okay if you do not pass the negativeOffset test, because we will not use this function to measure negative offsets.

import unittest

class TestCode(unittest.TestCase):
    
    def test_1_evenNumberOfSamples_FourierCoefficients(self):
        _,y = calculate_fft( signal=[1, 0, -1, 0, 1, 0, -1, 0], dur=1, rate=8)
        self.assertTrue( (y == [0., 0., 1., 0.]).all() )
        
        _,y = calculate_fft( signal=[0, 0], dur=1, rate=2 )
        self.assertEqual( y, [0.] )
        
        _,y = calculate_fft( signal=[0, 0, 0, 0], dur=4, rate=1 )
        self.assertTrue( (y == [0., 0.]).all() )
        
    def test_2_evenNumberOfSamples_FrequencyAxis(self):
        omega,_ = calculate_fft( signal=[1, 0, -1, 0, 1, 0, -1, 0], dur=1, rate=8)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )
        
        omega,_ = calculate_fft( signal=[0, 0], dur=1, rate=2 )
        self.assertEqual( omega, [0.] )
        
        omega,_ = calculate_fft( signal=[0, 0, 0, 0], dur=4, rate=1 )
        self.assertTrue( (omega == [0., 0.25]).all() )
        
    def test_3_oddNumerOfSamples_FourierCoefficients(self):
        _,y = calculate_fft( signal=[2, -1, -1, 2, -1, -1, 2, -1, -1], dur=1 , rate=9)
        self.assertTrue( (y == [0., 0., 0., 2.]).all() )
        
        _,y = calculate_fft( signal=[2, -1, -1, 2, -1, -1, 2, -1, -1], dur=2 , rate=4.5)
        self.assertTrue( (y == [0., 0., 0., 2.]).all() )
        
        _,y = calculate_fft( signal=[1, 0, -1], dur=1, rate=3 )
        self.assertEqual( y, [0.] )
        
    def test_4_oddNumerOfSamples_FrequencyAxis(self):
        omega,_ = calculate_fft( signal=[2, -1, -1, 2, -1, -1, 2, -1, -1], dur=1 , rate=9)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )
        
        omega,_ = calculate_fft( signal=[2, -1, -1, 2, -1, -1, 2, -1, -1], dur=2 , rate=4.5)
        self.assertTrue( (omega == [0., 0.5, 1., 1.5]).all() )
        
        omega,_ = calculate_fft( signal=[1, 0, -1], dur=1, rate=3 )
        self.assertEqual( omega, [0.] )

    def test_5_specifyingDurOnly(self):
        omega,_ = calculate_fft( signal=[0, 0, 0, 0, 0, 0, 0, 0], dur=1)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )
        omega,_ = calculate_fft( signal=[0, 0, 0, 0, 0, 0, 0, 0, 0], dur=1)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )
        
    def test_6_specifyingRateOnly(self):
        omega,_ = calculate_fft( signal=[0, 0, 0, 0, 0, 0, 0, 0], rate=8)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )
        omega,_ = calculate_fft( signal=[0, 0, 0, 0, 0, 0, 0, 0, 0], rate=9)
        self.assertTrue( (omega == [0., 1., 2., 3.]).all() )

    def test_7_offset(self):
        _,y = calculate_fft( signal=[1, 1, 1], dur=1, rate=3)
        self.assertEqual( y[0], 1. )
        self.assertNotEqual( y[0], 2. )
        _,y = calculate_fft( signal=[2, 1, 2, 1], dur=1, rate=4)
        self.assertEqual( y[0], 1.5 )
        self.assertNotEqual( y[0], 3. )
        
    def test_8_negativeOffset___optional_test(self):
        _,y = calculate_fft( signal=[-1, -1, -1], dur=1, rate=3)
        self.assertEqual( y[0], -1. )    
        _,y = calculate_fft( signal=[-1, -1, -1, -1], dur=1, rate=4)
        self.assertEqual( y[0], -1. )    

if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], 
                  verbosity=2, exit=False)

test_1_evenNumberOfSamples_FourierCoefficients (__main__.TestCode) ... ok
test_2_evenNumberOfSamples_FrequencyAxis (__main__.TestCode) ... ok
test_3_oddNumerOfSamples_FourierCoefficients (__main__.TestCode) ... ok
test_4_oddNumerOfSamples_FrequencyAxis (__main__.TestCode) ... ok
test_5_specifyingDurOnly (__main__.TestCode) ... ok
test_6_specifyingRateOnly (__main__.TestCode) ... ok
test_7_offset (__main__.TestCode) ... ok
test_8_negativeOffset___optional_test (__main__.TestCode) ... ok

----------------------------------------------------------------------
Ran 8 tests in 0.021s

OK

If something is wrong in test 3 and test 4, your code does not handle inputs with an odd number of samples correctly. It is important that you correct this, because you will be using this function later on in the course.

If something is wrong in test 7, you should look at the zeroth Fourier coefficient in your output. It might be twice as large because of how you added the positive and negative frequencies. In that case, you should divide the zeroth coefficient by 2 or think of another way to solve this. If test 8 failed, you can ignore this. You will not have to detect negative offsets with this function.

If you pass the tests 1 to 7, your code will also pass the automatic grading. Test 8 is optional.

You can also test your function on the following data. If everything works correctly, you are done with this exercise.

Nsamples = 80
time = np.linspace(0,2,Nsamples)
voltage = 1.4 * np.sin( 5.0 * 2.0 * np.pi * time) + 0.5 * np.sin( np.pi * 2.0 * np.pi * time)

x,y = calculate_fft( signal=voltage,dur=2)
plt.plot(x,y)

[<matplotlib.lines.Line2D at 0x7fd8781ef400>]

Appendix on the Discrete Fourier transform [optional]¶

This appendix is meant for students who would like to get more information about Fourier series and the Fourier transform. It is not part of the exams for Electronic Instrumentation. However, the information may be useful for courses like Systems and Signals, although it might also be difficult to understand without having attended such a course first.

Brief note on Fourier series¶

In Analysis 2, you learned how to calculate the (continuous-time) Fourier series of a function. We started with the Fourier coefficients in the following form. We will assume that the signal is periodic with $T = \frac{2\pi}{\omega_0}$. We can rewrite these coefficients into a more condensed form (with $c_k$) using Euler's formula.

$$ f(t) \sim \frac{a_0}{2} + \sum_{k=1}^\infty a_k \cos{ \frac{2\pi k t}{T} } + \sum_{k=1}^\infty b_k \sin{ \frac{2\pi k t}{T} } = \sum_{k=-\infty}^\infty c_k e^{i \frac{2\pi k}{T} t} $$

If you work out this out, you will obtain that $a_{k} = c_{k} + c_{-k}$ and $b_k = ic_k -ic_{-k}$ (for $k>1$) and $a_0 = 2c_0$. You will also find that $c_k$ can be calculated using an integral over one period $P$:

$$ c_k = \frac{1}{T} \int_P f(t) e^{-i\frac{2\pi k}{T} t} \text{d}t $$

The Fourier series with $a_k$ and $b_k$ mean exactly the same as the Fourier series with $c_k$. It is a good exercise to convert the notations into each other yourself.

If we assume that we have measured a signal during one period with $N$ equially spaced sample points, we can say that $t \approx \frac{nT}{N}$, $\text{d}t \approx \frac{T}{N}$ and obtain the following approximation for $c_k$.

$$ c_k \approx \frac{1}{T} \sum_{n=0}^{N-1} \left( f(\frac{T}{N}n) e^{-i\frac{2\pi k}{N} n} \frac{T}{N} \right) = \frac{1}{N} \sum_{n=0}^{N-1} f[n] e^{-i\frac{2\pi k}{N} n} $$

For large enough $N$, this is a good approximation.

Why divide the DFT by Nsamples?¶

Since computers can only store a discrete list of values, it is not possible that np.fft.fft() returns a continuous function. Instead, numpy calculates the following thing, which we call the discrete Fourier transform (DFT):

$$ \hat{f}[\Omega] = \sum_{n=0}^{N-1} f[n] e^{-i \frac{2\pi}{N}\Omega n} $$

You may see that this is surprisingly similar to the approximation of the Fourier coefficients $c_k$ mentioned above. Therefore, we should say that numpy actually returns $N$ Fourier coefficients $c_k$ multiplied by $N$, calculated assuming that the given function values come from on an equally spaced grid in the interval $[0,T)$. If we do not specify the frequency axis, numpy will take $T=2\pi$. We used the notation $c_k = c(k)$ to make it more readible. The fractions in the indices should be rounded down.

$$ \texttt{np.fft.fft}( f ) = \left[N\cdot c(0),\ N\cdot c(1),\ N\cdot c(2)\ \cdots\ N\cdot c({\frac{N}{2}}),\ N\cdot c({-\frac{N+1}{2}}),\ N\cdot c({-\frac{N+1}{2}+1})\ \cdots\ N\cdot c({-2}),\ N\cdot c({-1}) \right] $$

As you see, the first half of the list corresponds to positive frequencies and the other half to strictly negative frequencies.

You can see that this is true in the following example. We can expand $\sin{t} =\frac{e^{it} - e^{-it}}{2i} $ and $\cos{t} = \frac{e^{it} + e^{-it}}{2}$ into complex exponentials, revealing the Fourier coefficients $c_k$. Note that numpy takes $T = 2\pi$ by default if we do not calculate the frequency axis.

\begin{gather*} f(t) = 0.6 \cdot \sin(t) +0.2 \cdot \cos(2t)+ 4.5 \\ = 0.6 \cdot \frac{e^{it} - e^{-it}}{2i} + 0.2 \cdot \frac{e^{i2t} + e^{-i2t}}{2} + 4.5 \\ = 0.3 i e^{-it} - 0.3 i e^{it} + 0.1 e^{-2it} + 0.1 e^{2it} + 4.5 e^{0} \\ = c_{-1} e^{-it} + c_{1} e^{it} + c_{-2} e^{-i2t} + c_{2} e^{i2t} + c_0 \\ = \sum_{k=-\infty}^\infty c_k e^{i \frac{2\pi k}{T} t} \end{gather*}

This shows that we should have $c_{1} = - c_{-1} =-0.3i$, $c_{2} = c_{-2} = 0.1$, $c_0 = 4.5$ and $c_k=0$ for all remaining values of $k$. If N=10, we would expect the following list from np.fft.fft: [45 + 0i, 0-3i, 0+1i, 0+0i, 0+0i, 0+0i, 0+0i, 0-1i, 0+3i]. You can check that this is approximately correct.

N = 10
time = np.linspace(0,2*np.pi,N)
y = 0.6*np.sin( time ) + 0.2*np.cos( 2*time ) + 4.5
print( np.fft.fft(y) )

[45.2       +0.00000000e+00j  1.09230095-2.56119820e+00j
  0.50753201+9.81852325e-01j -0.35310189+7.76663109e-03j
 -0.23754   +2.25345350e-02j -0.21838214+1.11022302e-16j
 -0.23754   -2.25345350e-02j -0.35310189-7.76663109e-03j
  0.50753201-9.81852325e-01j  1.09230095+2.56119820e+00j]

If you pick higher N, you will see that the pattern becomes more precise. This has to do with the Nyquist-Shannon sampling theorem and nonideal sampling. Here, we used N=1e6. Note how all the elements of the list increase with $N$.

N = 1000000
time = np.linspace(0., 2.*np.pi,N)
y = 0.6*np.sin( time ) + 0.2*np.cos( 2.*time ) + 4.5
print( np.round(np.fft.fft(y)) )

[ 4.5e+06+0.e+00j  1.0e+00-3.e+05j  1.0e+05+1.e+00j ... -0.0e+00-0.e+00j
  1.0e+05-1.e+00j  1.0e+00+3.e+05j]

That is the reason that you have to divide the values of np.fft.fft() by $N$. To repeat, np.fft.fft() does not precisely calculate the DTFT, but yields an approximation for Fourier coefficients $c_k$ multiplied by $N$ instead.

What happens if we put the coefficients on top of each other?¶

In essense, the Fourier series of a signal decomposes the signal into a weighted sum of constitutive complex exponentials. The angle $\angle c_k$ (remember that $c_k$ is a complex number) corresponds to the phase shift of the constitutive complex exponential $e^{i \frac{2\pi k}{T}t}$ in the signal. The magnitude $\mid c_k \mid$ corresponds to the amplitude of this complex exponential.

Let us take the previous example again:

$$f(t) = 0.6 \cdot \sin(t) +0.2 \cdot \cos(2t)+ 4.5 = 0.3 i e^{-it} - 0.3 i e^{it} + 0.1 e^{-2it} + 0.1 e^{2it} + 4.5 e^{0}$$

The Fourier coefficients are given by $c_{1} = - c_{-1} =-0.3i$, $c_{2} = c_{-2} = 0.1$, $c_0 = 4.5$ and $c_k=0$ for all remaining values of $k$. We see that $\angle c_2 = \angle c_{-2} = 0$, because they correspond to the cosine part. Also, $\angle c_{-1} = -\angle c_1 = \angle 0.3 i = \frac{\pi}{2}$ correspond to the sine part. Indeed, we know that the sine and the cosine differ $90^{\circ}$ in phase.

If we take the magnitude of a Fourier coefficient, we essentially get rid of this phase shift of the complex exponential. Adding the coefficients for the positive and the negative frequencies means that we add the amplitudes of all parts of a signal irrespective of the phase.

What makes the fast Fourier transform fast?¶

In this section, we will discuss the Cooley-Turkey algorithm. We will start with the definition of the discrete Fourier transform (DFT), where $k=0,...,N-1$.

$$ \hat{f}[k] = \sum_{n=0}^{N-1} f[n] e^{-i \frac{2\pi}{N}k n} $$

Each summation takes approximately $N$ operations and we have approximately $N$ values of $k$. Therefore, we need $\mathcal{O}( N^2 )$ operations for the entire DFT. (It is common to use Big O notation for algorithm complexity)

Let us assume that $N$ is a power of two. Therefore, we can easily divide $N$ by 2 without worrying about border cases. We will divide the summation into two parts, namely for even and off values of $n$.

$$ \hat{f}[k] = \sum_{m=0}^{N/2-1} f[2m] e^{-i\frac{2\pi}{N} k(2m)} + \sum_{m=0}^{N/2-1} f[2m+1] e^{-i\frac{2\pi}{N} k(2m+1)} $$

We will factor out $e^{-i\frac{2\pi}{N}k}$ and rewrite the sums as two DFTs again. We will call those new DFTs $E_k$ (for even terms) and $O_k$ (for odd terms).

$$ \hat{f}[k] = \sum_{m=0}^{N/2-1} f[2m] e^{-i\frac{2\pi}{N/2} km} + e^{-i\frac{2\pi}{N}k}\sum_{m=0}^{N/2-1} f[2m+1] e^{-i\frac{2\pi}{N/2} km} = E_k + e^{-i\frac{2\pi}{N}k} O_k $$

Now, we will use the above formula to calculate $\hat{f}[k+\frac{N}{2}]$. We will see that some terms will vanish because of the complex exponentials.

\begin{gather*} \hat{f}[k+\frac{N}{2}] = \sum_{m=0}^{N/2-1} f[2m] e^{-i\frac{2\pi}{N/2} (k+\frac{N}{2})m} + e^{-i\frac{2\pi}{N}(k+\frac{N}{2})}\sum_{m=0}^{N/2-1} f[2m+1] e^{-i\frac{2\pi}{N/2} (k+\frac{N}{2})m} \\ = \sum_{m=0}^{N/2-1} f[2m] e^{-i\frac{2\pi}{N/2} km}e^{-i2\pi m} + e^{-i\frac{2\pi}{N}k} e^{-i\pi}\sum_{m=0}^{N/2-1} f[2m+1] e^{-i\frac{2\pi}{N/2} km}e^{-i2\pi m} \\ = \sum_{m=0}^{N/2-1} f[2m] e^{-i\frac{2\pi}{N/2} km} - e^{-i\frac{2\pi}{N}k}\sum_{m=0}^{N/2-1} f[2m+1] e^{-i\frac{2\pi}{N/2} km} = E_k - e^{-i\frac{2\pi}{N}k}O_k \end{gather*}

As you can see, we can use the intermediate results $E_k$ and $O_k$ twice. Furthermore, $E_k$ and $O_k$ are themselves DFTs with half the amount of sample points. Therefore, we can apply the same trick recursively and reduce the number of operations. In an Algorithms and Data Structures course, you will learn that this recursion reduces the total time complexity to $\mathcal{O}(N\log N)$.