Schroeder sweep

A sound with maximal power and minimal peak-to-peak amplitude

The algorithm presented here, describes how to calculate a signal from a given power spectrum (or amplitude spectrum) such that the signal has a maximal root mean square (RMS) value and a minimal peak-to-peak amplitude at the same time. It is shown first why these properties are of importance in acoustical (probe tube) measurements (to determine the directional transfer function of the ear).

Importance of Low Peak Factors in Acoustics

In ear canal probe recordings the peak-to-peak amplitudes are to be minimized for physiological reasons. When sound waveforms contain peak-to-peak amplitudes which exceed a certain value, the middle ear exhibits a reflex by contracting. This effect, known as the 'acoustic reflex', is in particular undesirable because of the inconvenience it causes for a subject. This effect occurs for sound waveforms with peak amplitude corresponding to 90 dB SPL at the eardrum. For acoustical waveforms, a high RMS value (or sound level) with respect to the background noise level is desired to achieve a maximal signal-to-noise ratio (SNR). Typical background level for an artificial free field situations like an anechoic room is 35 dB. In such a situation, one desires to minimize the peak-to-peak amplitude and maximize the RMS at the same time. In other words, how can we minimize the peak factor, which is defined by:

\( peak factor = \frac{A_{max}-A_{min}}{2\sqrt{2}A_{RMS}}\)

with \(A_{min}\), and, \(A_{max}\) the minimum amplitude and maximum amplitude of the signal, respectively, \(A_{max}-A_{min}\) the peak-to-peak amplitude and \(A_{RMS}\) the RMS value of the signal.

Figure 1 shows the peak factors of several waveforms, including the sine, square, sawtooth, and the pulse and noise.


This figure and the waveforms are constructed in Matlab according to the following. First, clean up Matlab, and initialize some variables, such as waveform frequency and time.

%% Initialization
% Clear
close all
clear all
% Variables
x  = 0:0.01:10; % time samples
f  = .5; % Frequency 1 Hz

Then make a sine waveform.

%% Cosine
y  = sin(2*pi*f*x); % wave form

And here is a function to determine the peakfactor.

function pf = peakfactor(y)
% Obtain peakfactor PF from signal Y
Amax = max(y); % minimum amplitude
Amin = min(y); % maximum amplitude
Arms = sqrt(mean(y.^2)); % or Arms = pa_rms(y), root-mean-square
pf  = (Amax-Amin)./(2*sqrt(2)*Arms);

Let's determine the peak-factor and do the plotting in a function, calling it graphics, as all waveforms have to go through the same procedure.

function graphics(x,y)
% Make some nice visual graphics
pf  = peakfactor(y); % determine the peak-factor with a sub-function
% Graphics
% labels
% axis
ylim([-1.4 1.4]);
axis square;
box off;
% text
str = ['Peak Factor = ' num2str(pf,3)];
h  = pa_text(0.5,0.9,str); set(h,'HorizontalAlignment','center'); % Set the text and center it.
% Ticks

Finally we can plot the sine waveform.

graphics(x,y); % Some graphics in an easy sub-function

And do the same for a square,

%% Square
y  = ones(size(x)); % high wave form
sel = rem(x,1/f)*f>=0.5; % select the latter half of the period (1/f)
y(sel) = -1; % low
graphics(x,y); % Some graphics in an easy sub-function

a sawtooth,

%% Sawetooth
y = cumsum(y); % use the square wave to produce a sawtooth
y = y-1; y = y./max(y); y = 2*(y-0.5); % and normalize:
graphics(x,y); % Some graphics in an easy sub-function

a pulse,

%% Pulse
y  = -ones(size(x)); % high wave form
y(100) = 1000;
graphics(x,y); % Some graphics in an easy sub-function

and a noise.

%% Noise
y = 1/3*randn(size(x)); % See also PA_GENGWN
graphics(x,y); % Some graphics in an easy sub-function
title('Gaussian noise');

For those who want the save the plot, here's some code.

%% Print
cd('C:\DATA'); % See also PA_DATADIR
print('-depsc','-painter',mfilename); % I try to avoid bitmaps, 
% so I can easily modify the figures later on in for example Illustrator
% For web-and Office-purposes, I later save images as PNG-files

The Mathematical Problem

Starting point is a given power (or amplitude) spectrum, which is specified at only discrete frequencies (harmonically related). For a given frequency spectrum the periodic signal can easily be determined by the inverse discrete fourier transform. However, here, only the amplitude of the frequency spectrum is known and the phase spectrum is still to be determined.

Setting all phases to zero for a broadbanded spectrum results in a sharply peaked pulse at \( t_0 \) (see Fig. 1 Pulse). However, pulses yield a low SNR. Furthermore, pulses are awkward from a practical point of view: speakers can expose a nonlinear behavior (e.g. explode). Another possibility is to choose phases uniformly random, like in white noise, which indeed results in a noisy signal (Fig. 1, Gaussian noise). But this doesn't really resolve the problem: although not as extreme as a pulse, this still yields a 'high' peak-to-peak amplitude with respect to the RMS value. So, random phases don't seem very useful as well.

The solution

The solution of minimizing the peak factor is based on a simple intuitive concept concerning an asymptotic relationship between the power spectra of frequency-modulated signals and their instantaneous frequencies. Consider a frequency-modulated (FM) signal whose instantaneous frequency switches every half second from 1000 to 2000 Hz and back again. It is clear that the power spectrum of this signal must have two pronounced peaks, one around 1000 Hz and the other around 2000 Hz, representing approximately equal powers. More generally consider a periodic phase-modulated signal whose instantaneous frequency during the fundamental period \(T\) attains \(N\) different harmonically related frequencies \(\frac{k}{T}\) (\(k1,2,...,N\)), the signal power at frequency\(\frac{k}{T}\) will be approximately proportional to the time interval during which the instantaneous frequency is at \(\frac{k}{T}\).

This relationship between the distribution of instantaneous frequencies of FM signals and their power spectra has also been known in a somewhat different form as 'Woodwards theorem'. The idea underlying Woodward's theorem will be applied here as follows. FM signals have low peak factors. Therefore, if we succeed in constructing an FM signal with the desired harmonic power spectrum, the phase angles of its harmonic components will be a good choice for a low-peak-factor signal. This will not be exactly possible for arbitrary power spectra. However, since Woodwards theorem applies to \(N\to\infty\), the larger \(N\),the better the result. Although Woodwards theorem applies strictly to stochastic processes given by random phase constants of FM signals, this approach appears to yield very good results. This will be discussed somewhat further.

The algorithm

How it works

The problem can be restated as follows. Consider periodic signal \(x(t)\) with period \(T\), finite bandwidth and Fourier series:

\(x(t) \sum_{k1}^N{\sqrt{\frac{p_k}{2}\cos{(\frac{2\pi kt}{T}+\theta_k)}}} \)

where \(p_k\) is the relative power of the \(k^{th}\) harmonic (\(\sum_{k1}^N{p_k}1\)) and \(\theta_k\) its phase angle. How then, should we choose \(\theta_k\), such that the peak factor is minimized? Schroeder's procedure is as follows:

Let a cosine function attain each harmonic frequency \(\frac{2\pi k}{T}\) successively for a time interval, which is proportional to the relative power \(p_k\). Then, \(\theta_k\) is the phase of the \(k^{th}\) frequency in that cosine function.

Implementation is as follows. Consider the following periodic phase-modulated signal with [[wp>Piecewise_linear_function|piece-wise linear]] phase:
\(s(t) \cos{(\psi (t))}, \psi (t)\int_0^{\tau} \dot \psi (\tau)d\tau \)


\(\dot \psi (t) \frac{2\pi k}{T}, t_{k-1}\leq t < t_k\)

Note: the integral form of the argument of \(s(t)\) guarantees continuity. Instantaneous phase of \(s(t)\) in the \(n_{th}\) time interval is given by:

\(\psi (t) \phi_n+\frac{2\pi n}{T}t, t_{n-1}<t<t_n\)

where \(\phi_n\) is a phase constant that in the limiting case \(N\to \infty\) corresponds to the phase angle of the \(n^{th}\) Fourier component of \(s(t)\).
Demanding continuity for \(\psi (t)\) at \(t t_{n-1}\) yields:

\(\phi_n+\frac{2\pi n}{T}t_{n-1} \phi_{n-1}+\frac{2\pi (n-1)}{T}t_{n-1} \)


\(\phi_n \phi_{n-1} - \frac{2\pi}{T}t_{n-1} \phi_{n-1} - 2 \pi \sum_{k-1}^n p_k \)

So we finally have:

\(\phi_{n+1} \phi_{n} - 2 \pi P_n, P_n \sum_{k-1}^n p_k \)



For a flat spectrum (Fig. 2a, c.f. Fig. 1, Gaussian noise and Pulse) this algorithm scores a peak factor of 1.34. Furthermore, simulations show that choosing phases randomly yields a peak factor of 2.30 \(\pm\) 0.15. Looking at the Schroder signals in a qualitative way: the algorithm appears to produce sweeplike signals. This is to be expected as the algorithm tries to produce a cosine of which only the frequency varies in a spectrum-dependent way (Fig. 2b, d).

Matlab implementation

A Schroeder-sweep wav-file can be made by


Here follows the code to create a single Schroeder sweep in Matlab, and Figure 2.


function pa_gensweep_example
% Generate Schroeder Sweep
%% Initialization
Nsample  = 2^10; %1024 samples
Fstart  = 0; % Start-frequency Hz
Fs     = 50000; % sample-frequency
Fstop   = Fs; % Hz
%% Obtain single sweep
% Lowpeak generates a sweep of 2N samples for
% a magitude spectrum of N samples
N           = round(0.5*Nsample); % samples
% OK, now let's fill that magnitude spectrum
Magnitude   = zeros(1,N); % magnitude
nstart      = max(1,round(N*Fstart/Fs));
nstop       = min(N,round(N*Fstop/Fs));
Magnitude(nstart:nstop) = ones(1,(nstop-nstart+1));
x           = lowpeak(Magnitude); % Determine phase response with sub-function
%% Peak factor
pf = pa_peakfactor(x);
%% Graphics
close all
t = (1:length(x))/Fs*1000;
ylabel('Amplitude (au)');
xlabel('Time (ms)');
xlim([min(t) max(t)]);
str = ['Peak Factor = ' num2str(pf,3)];
h  = pa_text(0.5,0.9,str); set(h,'HorizontalAlignment','center'); % Set the text and center it.
%% Spectrogram Time-Frequency representation
nfft    = 2^11; % number of fft points
window    = 2^4; % resolution
noverlap  = 2^2; % smoothing
caxis([-100 -60])
ylim([0 20000])
ylabel('Frequency (kHz)');
%% Determine spectrum, see also PA_GETPOWER
% Time vector of 1 second
% Use next highest power of 2 greater than or equal to length(x) to calculate FFT.
nfft      = 2^(nextpow2(length(x)));
% Take fft, padding with zeros so that length(fftx) is equal to nfft
fftx      = fft(x,nfft);
% Calculate the numberof unique points
NumUniquePts  = ceil((nfft+1)/2);
% FFT is symmetric, throw away second half
fftx      = fftx(1:NumUniquePts);
% Take the magnitude of fft of x and scale the fft so that it is not a function of
% the length of x
mx        = abs(fftx)/length(x);
ph        = angle(fftx);
% Take the square of the magnitude of fft of x -> magnitude 2 power
mx        = mx.^2;
% Since we dropped half the FFT, we multiply mx by 2 to keep the same energy.
% The DC component and Nyquist component, if it exists, are unique and should not
% be mulitplied by 2.
if rem(nfft, 2) % odd nfft excludes Nyquist point
  mx(2:end)  = mx(2:end)*2;
  mx(2:end -1) = mx(2:end -1)*2;
% This is an evenly spaced frequency vector with NumUniquePts points.
f    = (0:NumUniquePts-1)*Fs/nfft;
mx    = 20*log10(mx); % Power (dB)
sel    = isinf(mx);
mx(sel) = min(mx(~sel));
%% Graphics of spectrum
subplot(221) % same as subplot(2,2,1)
plot(f,mx,'k-','LineWidth',2); % Power
title('Power Spectrum');
xlabel('Frequency (kHz)');
ylabel('Power (dB)');
xlim([0 25]*1000);
plot(f,unwrap(ph),'k-','LineWidth',2); % Phase needs to be unwrapped to observe systematic changes
hold on
title('Phase Spectrum');
xlabel('Frequency (kHz)');
xlim([0 25]*1000);
for ii = 1:4
  axis square;
  box off
  pa_text(0.1,0.9, char(96+ii));
%% Print
cd('C:\DATA'); % See also PA_DATADIR
print('-depsc','-painter',mfilename); % I try to avoid bitmaps, 
% so I can easily modify the figures later on in for example Illustrator
% For web-and Office-purposes, I later save images as PNG-files
function Signal = lowpeak(Magnitude)
%    Returns the time signal X with minimized peakfactor
%    given a frequency magnitude response M. The algorithm is
%    extracted from Schroeder et al (1970).
%    M  - frequency magnitude response containing the frequency
%         response in equidistant bins which correspond to frequency
%         0 to the Nyquist frequency (best to make the length of M
%         a power of 2). The algorithm calculates a frequency phase
%         response that together with M, yields a low-peak time signal.
%    X  - Time signal with minimized peak factor
%    >> sweep = lowpeak (ones(1,512));
Nbin       = length(Magnitude);
TotalPower = sum(Magnitude.^2);
NormFactor = 1.0/TotalPower;
TwoPi      = 2*pi;
Phi        = 0.0;
Power      = 0.0;
Spectrum   = zeros (1, Nbin);
for j=1:Nbin
  Spectrum(j) = Magnitude(j) * exp (1i*Phi);
  Power    = Power + NormFactor*Magnitude(j).^2;
  Phi      = Phi - TwoPi*Power;
Spectrum    = [Spectrum -conj([Spectrum(1) Spectrum((Nbin):-1:2)])];
Signal      = imag(ifft(Spectrum));
function pf = pa_peakfactor(y)
% Obtain peakfactor PF from signal Y
Amax = max(y); % minimum amplitude
Amin = min(y); % maximum amplitude
Arms = sqrt(mean(y.^2)); % or Arms = pa_rms(y), root-mean-square
pf  = (Amax-Amin)./(2*sqrt(2)*Arms);