Next: 3. Time-Scale Analysis Up: Wavelets and Filter Banks Previous: 1. Analysis and Synthesis   Contents


2. Time-Frequency Analysis

In many applications, such as speech processing, we are interested in the frequency content of a signal locally in time. That is, the signal parameters (frequency content etc.) evolve over time. Such signals are called non-stationary. For a non-stationary signal $x(t)$, the standard Fourier Transform is of limited use because it integrates over all time: information which is localized in time, such as spikes and high-frequency bursts, cannot be easily detected from it.

Time-localization can be achieved by first windowing the signal so as to cut out only a well-localized slice of $x(t)$ and then taking its Fourier Transform. This gives rise to the Short Time Fourier Transform (STFT), or Windowed Fourier Transform. The magnitude of the STFT is called the spectrogram. By restricting to a discrete set of frequencies and times we can obtain an orthogonal basis of functions.

2.1 The Short Time Fourier Transform

The Short Time Fourier Transform of a signal $x(t)$ using a window function $g(t)$ is defined as follows.


\begin{displaymath}STFT(f, s) = \int_{-\infty}^{\infty} x(t) g(t-s) e^{-j 2\pi f t} dt \end{displaymath}

Think of the window $g(t)$ as sliding along the signal $x(t)$: for each shift $s$ we compute the usual Fourier Transform of the product function $x(t) g(t-s)$. For example, if $g(t)$ is the box of width 1/2 then we have (see the Matlab m-file fig1.m):

\epsfig {file=fig1.eps, angle=270, width=5in}
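The sliding-window computation can be sketched numerically. The following is a minimal Python sketch, not the contents of fig1.m: it approximates the STFT integral by a midpoint Riemann sum, and it assumes a box window centred at the origin. The names `box` and `stft` and the step size `dt` are illustrative choices, not part of the original notes.

```python
import cmath
import math

def box(t, width=0.5):
    """Box window of the given width, centred at t = 0 (an assumed convention)."""
    return 1.0 if abs(t) <= width / 2 else 0.0

def stft(x, f, s, width=0.5, dt=1e-3):
    """Approximate STFT(f, s) = integral of x(t) g(t - s) exp(-j 2 pi f t) dt.

    Uses a midpoint Riemann sum over the support of the shifted window.
    """
    n = int(round(width / dt))
    total = 0j
    for k in range(n):
        t = s - width / 2 + (k + 0.5) * dt  # midpoint of the k-th sub-interval
        total += x(t) * box(t - s, width) * cmath.exp(-2j * math.pi * f * t) * dt
    return total

def tone(t):
    """Test signal: a complex exponential at 3 Hz (a hypothetical example)."""
    return cmath.exp(2j * math.pi * 3.0 * t)
```

For the 3 Hz test tone, `abs(stft(tone, 3.0, 1.0))` comes out close to the window width, and it drops to nearly zero at 5 Hz, which lies one multiple of 1/width away from the tone.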

In the frequency domain we can use the convolution theorem to recognize $STFT(f, s)$ as the convolution of $X(f)$ with the Fourier transform of $g(t-s)$ (which is $e^{-j2\pi f s} G(f)$).
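Written out explicitly, with $\nu$ introduced here as the integration variable, this convolution reads:

\begin{displaymath}STFT(f, s) = \int_{-\infty}^{\infty} X(\nu) \, G(f - \nu) \, e^{-j 2\pi (f - \nu) s} \, d\nu \end{displaymath}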

Recall that we have the Fourier Transform pair:


\begin{displaymath}\mbox{box}(t) = \left\{
\begin{array}{ll}
1 & \mbox{ for } \vert t\vert \le 1/2 \\
0 & \mbox{ otherwise}
\end{array} \right. \qquad
{\mathcal F}\left\{ \mbox{box}(t)\right\} = \mbox{sinc}(f)
\end{displaymath}

In the case where $g(t)$ is a box of width $T$, that is, $g(t) = \mbox{box}(t/T)$, we have $G(f) = T \mbox{sinc}(fT)$. That is, the nulls of $G(f)$ are at the non-zero multiples of $1/T$. See the figure below where the box has width $T = 1/250$.
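As a short worked check, assume the box is centred at the origin (as the transform pair with $\mbox{sinc}(f)$ requires) and take $\mbox{sinc}(u) = \sin(\pi u)/(\pi u)$. Then the transform can be computed directly:

\begin{displaymath}G(f) = \int_{-T/2}^{T/2} e^{-j 2\pi f t} \, dt = \frac{\sin(\pi f T)}{\pi f} = T \, \mbox{sinc}(fT) \end{displaymath}

The numerator vanishes when $fT$ is an integer, so the nulls are at $f = k/T$ for $k = \pm 1, \pm 2, \ldots$, while $G(0) = T$.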

In the case where the signal is a pure sinusoid of frequency $f_{1}$ the windowed transform will be the sinc function shifted by $f_{1}$. In the figure below the box has width $T = 1/250$ and the first sinusoid has frequency $f_{1} = 800$ Hz.

In the case where the signal consists of two sinusoids of frequencies $f_{1}$ and $f_{2}$ the windowed transform will be the superposition of two shifted sinc functions. The individual frequencies cannot be resolved unless $\vert f_{1} - f_{2}\vert > 1 /T$. In fact, for adequate separation we should have $\vert f_{1} - f_{2}\vert > 2 /T$.

That is, the ``frequency resolution'' of this analysis is $1/T$.

In the following figure a signal is the sum of two sinusoids with frequencies $f_{1} = 800$ Hz and $f_{2} = 1200$ Hz. The window size is $T = 1/250$. We get two distinct peaks in the frequency response (see fig2.m).

\epsfig {file=fig2.eps, angle=270, width=5in}
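The two-sinusoid case can also be checked numerically. The sketch below is not fig2.m. It assumes unit-amplitude complex exponentials (so the mirror images at negative frequencies are ignored) and a box window centred at the origin, so that the windowed transform is a sum of shifted copies of $T \mbox{sinc}(fT)$. The helper `has_dip` is a hypothetical and fairly crude resolution test: it only asks whether the magnitude midway between the two frequencies is lower than at the frequencies themselves.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def windowed_magnitude(f, tones, T):
    """Magnitude of sum_k T sinc((f - f_k) T) for unit complex tones f_k."""
    return abs(sum(T * sinc((f - fk) * T) for fk in tones))

def has_dip(f1, f2, T):
    """True if the spectrum midway between f1 and f2 is lower than at both."""
    tones = (f1, f2)
    mid = windowed_magnitude((f1 + f2) / 2, tones, T)
    return mid < min(windowed_magnitude(f1, tones, T),
                     windowed_magnitude(f2, tones, T))
```

With $T = 1/250$, `has_dip(800, 1200, 1/250)` is true, while moving the second tone to 1000 Hz (a spacing below $1/T$) makes it false: the two sinc lobes merge into a single peak.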

In the case where the signal consists of two spikes close together in time we can resolve the spikes if the window size $T$ is smaller than the time difference between the spikes.

This analysis shows the ``trade-off'' between time resolution and frequency resolution: if we use a window of length $T$ then we have a ``time-resolution'' of $T$ but our frequency resolution is $1/T$.

2.2 The spectrogram

The magnitude of the Short Time Fourier Transform is called the spectrogram. We can make two-dimensional plots of the spectrogram with time on the horizontal axis, frequency on the vertical axis and amplitude given by a gray-scale colour. Alternatively, we can make three-dimensional plots where we plot amplitude on the third axis. The Matlab command specgram can be used to generate these plots.

In the following example (see fig3.m), a signal $x(t)$ is the sum of two sinusoids of frequencies $f_{1} = 500$ Hz and $f_{2} = 1500$ Hz and two impulses at times $t_{1} = 125$ ms and $t_{2} = 130$ ms. We use a window width of $T = 2.5$ ms ($1/T = 400$ Hz).

\epsfig {file=fig3a.eps, angle=270, width=4in}

\epsfig {file=fig3b.eps, angle=270, width=4in}

\epsfig {file=fig3c.eps, angle=270, width=4in}

The resolution in frequency is $1/T = 400$ Hz. The time resolution is $T = 2.5$ ms. As the plots show, we can resolve both the sinusoids and the impulses.

Now suppose that we move the two frequencies closer together. Let's use a signal $x(t)$ which is the sum of two sinusoids of frequencies $f_{1} = 500$ Hz and $f_{2} = 1000$ Hz and two impulses at times $t_{1} = 125$ ms and $t_{2} = 130$ ms with a window width of $T = 2.5$ ms (see fig4.m).

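As the spectrograms now show, we cannot resolve the frequencies, but we can still resolve the spikes. The separation $\vert f_{1} - f_{2}\vert = 500$ Hz is now well below the $2/T = 800$ Hz suggested above for adequate separation, while the window is still shorter than the 5 ms between the impulses.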

\epsfig {file=fig4a.eps, angle=270, width=4in}

\epsfig {file=fig4b.eps, angle=270, width=4in}

Now suppose that we change the window size to $T = 8$ ms, so that $1/T = 125$ Hz. As the spectrograms below show, we can resolve the frequencies but not the spikes, since the window is now longer than the 5 ms between them (see fig4cd.m).

\epsfig {file=fig4c.eps, angle=270, width=4in}

\epsfig {file=fig4d.eps, angle=270, width=4in}
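The time side of these examples can be checked with a few lines of Python. This is a sketch under stated assumptions, not one of the m-files: the box window is taken to be centred on each shift, and the shift grid (`step_ms`) and signal duration (`t_end_ms`) are hypothetical values chosen for illustration. The function slides the window along the time axis and reports the largest number of impulses it ever contains at once; the impulses can be separated only if that number is 1.

```python
def max_spikes_in_window(spike_times_ms, width_ms, step_ms=0.5, t_end_ms=250.0):
    """Slide a box of the given width along the time axis and return the
    largest number of spikes that fall inside it at any single shift."""
    half = width_ms / 2
    best = 0
    steps = int(round(t_end_ms / step_ms))
    for k in range(steps + 1):
        s = k * step_ms  # window centre for this shift
        inside = sum(1 for t in spike_times_ms if abs(t - s) <= half)
        best = max(best, inside)
    return best
```

For impulses at 125 ms and 130 ms, a 2.5 ms window never contains more than one impulse, whereas an 8 ms window contains both for some shifts, which is why the two impulses merge in the spectrogram.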

2.3 An Orthogonal Basis of Functions

We can obtain an orthogonal basis of functions related to the Short Time Fourier Transform when the window function $g(t)$ is the box of width $T$, as follows. Instead of computing $STFT(f, s)$ for all frequencies $f$ and all time shifts $s$ we restrict the calculation to $f_{n} = n /T$ and $s_{m} = mT$. To see that this corresponds to orthogonal functions define:


\begin{displaymath}v_{n,m}(t) = e^{j 2 \pi n t /T} g(t - mT) \end{displaymath}

Then we have:


\begin{displaymath}STFT(n /T, mT) = \langle x(t), v_{n,m}(t) \rangle \end{displaymath}

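Since $v_{n,m}(t)$ is non-zero only for $mT \le t \le (m+1)T$ (taking the box $g$ to be supported on $0 \le t \le T$), functions with different $m$ do not overlap and are therefore orthogonal. For the same $m$ and different $n$, orthogonality follows from the orthogonality of the complex exponentials $e^{j 2 \pi n t /T}$ over an interval of length $T$.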
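As a worked check, assume the box $g$ is supported on $0 \le t \le T$, and let $\delta$ denote the Kronecker delta (notation introduced here). For two functions on the same interval the inner product is:

\begin{displaymath}\langle v_{n,m}(t), v_{n',m}(t) \rangle = \int_{mT}^{(m+1)T} e^{j 2 \pi (n - n') t /T} \, dt = T \, \delta_{n,n'} \end{displaymath}

Combined with the disjoint supports for $m \ne m'$, this gives $\langle v_{n,m}, v_{n',m'} \rangle = T \, \delta_{n,n'} \delta_{m,m'}$. The functions are orthogonal with squared norm $T$, and dividing each by $\sqrt{T}$ would make them orthonormal.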

Because the Fourier series gives us analysis and synthesis on each interval $mT$ to $(m+1)T$, it follows that we have analysis and synthesis in general. That is:


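\begin{displaymath}\begin{array}{ll}
\mbox{Analysis:} & \displaystyle c_{n,m} = \frac{1}{T} \int_{-\infty}^{\infty} x(t) e^{-j 2 \pi n t/T} g(t-mT) \, dt \\
\mbox{Synthesis:} & \displaystyle x(t) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty}
c_{n,m} e^{j 2 \pi n t/T}g(t-mT)
\end{array}\end{displaymath}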

In summary, if we restrict the STFT calculation to a discrete set of frequencies and times we can regard the STFT values as the coordinates of our signal $x(t)$ with respect to an orthogonal basis. Hence we can recover our signal $x(t)$ from these STFT values.
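A discrete analogue makes the recovery concrete. The sketch below is an illustration under assumptions rather than the method of these notes: the signal is sampled, each interval of length $T$ holds `N` samples, and the infinite sum over $n$ is replaced by the `N` frequencies of a discrete Fourier transform on each block. The function names `analyze` and `synthesize` are hypothetical, and the signal length is assumed to be a multiple of `N`.

```python
import cmath
import math

def analyze(x, N):
    """Coefficients c[m][n] = (1/N) sum_k x[mN + k] exp(-j 2 pi n k / N)."""
    blocks = len(x) // N
    return [[sum(x[m * N + k] * cmath.exp(-2j * math.pi * n * k / N)
                 for k in range(N)) / N
             for n in range(N)]
            for m in range(blocks)]

def synthesize(c, N):
    """Rebuild x[mN + k] = sum_n c[m][n] exp(+j 2 pi n k / N), block by block."""
    return [sum(c_m[n] * cmath.exp(2j * math.pi * n * k / N) for n in range(N))
            for c_m in c
            for k in range(N)]
```

Applying `synthesize(analyze(x, N), N)` to a sampled signal returns the original samples up to rounding error, including an isolated spike, because each block is expanded in its own orthogonal set of exponentials.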


Dr. W. J. Phillips
2003-04-03