Audio Signal Delay Project

Hao Huang
Columbia University
Ft. Washington Ave., AP-1310
New York, NY 10032
USA
huangha@flux.cpmc.columbia.edu

Abstract

The main aim of this project is to build an analyzer for measuring the delay between the two channels of the audio input, using the Fast Fourier Transform (FFT) correlation function.  Given a signal from the phone line and a signal transmitted over the Internet, this analyzer also provides the ability to measure one-way delay.

Requirements

The analyzer, including the FFT library, was written in C and compiled with gcc. The FFT library is based on Numerical Recipes in C by W. H. Press et al.

The analyzer uses the standard math library and the netutil library of Prof. Henning Schulzrinne's research group, found in the ~hgs/src/nevot/sun5 directory on the Columbia CS cluster. The netutil library in turn requires the audio library provided by Sun in the /usr/demo/SOUND/lib/ directory on the Columbia CS cluster. Change the Makefile if the directories on your machine are different.

Because the package uses the netutil library, its portability is limited by that library. Currently, the package supports only U-law, A-law, 8-bit linear, 16-bit linear and 32-bit linear encoding (all linear data are treated as signed). To support other encodings, users can add a decoding block to the decodeAudioData function in aufileutil.c.
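
As an illustration of what a decoding block does, the following is a minimal sketch of G.711 U-law expansion, an encoding the package already supports. The interface of decodeAudioData is not described here, so the function names ulaw_to_linear and decode_ulaw_block, and the choice of double as the output type, are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one G.711 U-law byte to a signed 16-bit linear sample. */
int16_t ulaw_to_linear(uint8_t u)
{
    unsigned v = (uint8_t)~u;              /* U-law bytes are stored inverted */
    int t = (int)((v & 0x0F) << 3) + 0x84; /* 4-bit mantissa plus bias (132) */

    t <<= (v >> 4) & 0x07;                 /* scale by the 3-bit segment number */
    return (int16_t)((v & 0x80) ? (0x84 - t) : (t - 0x84));
}

/* Hypothetical decoding block: expand n encoded bytes into doubles. */
void decode_ulaw_block(const uint8_t *in, double *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (double)ulaw_to_linear(in[i]);
}
```

A block for a new encoding would follow the same pattern: convert each encoded sample to a signed linear value before the correlation is computed.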

Installation

Download the archive here.

Uncompress the archive with gzip and tar. Edit the Makefile if necessary. Type "make" to build the delay program. To build the test driver for the FFT library, use "make fft.t".

Operation

delay                   User Commands                   delay
 

NAME
    delay - measure the delay between two audio channels

SYNOPSIS
    delay [-hd] [-e ##] [-f ##] [-p ##] [-s ##] [-t ##] [-i in] [-o out]

DESCRIPTION
    The delay utility is an audio signal delay analyzer.  It reads
    the data from the input device or a file, calculates correlation
    between the two channels using FFT, and reports the delay between
    them.

    In correlation, we compare two sets of data directly superposed,
    and with one of them shifted left or right.   The relation that
    holds when two functions, g(t) and h(t), are interchanged is:
        Corr(g,h)(t) = Corr(h,g)(-t)
    The discrete correlation of two sampled functions g[k] and h[k],
    each periodic with period N, is defined by
        Corr(g,h)[j] = Sum[k=0 to N-1](g[j+k] * h[k])
    The discrete correlation theorem says that this discrete correlation
    of two real functions g and h is one member of the discrete Fourier
    transform pair
        Corr(g,h)[j] <==> G[k] H[k]*
    where G[k] and H[k] are the discrete Fourier transforms of g and h,
    and * denotes complex conjugation.
    We can compute correlation using FFT as follows: FFT the two data
    sets, multiply one resulting transform by the complex conjugate
    of the other, and inverse transform the product.  The result, say
    r, will formally be a complex vector of length N with all its
    imaginary parts zero since the original data sets were both real.
    The components of r are the values of the correlation at different
    lags.

    When the delay is more than half the length of the collected
    sample, the results tend to vary.  Normally, more than half of
    them are correct.  Most incorrect results fluctuate near 0 (no
    delay) because the two sets of data are considered unrelated.
    Specify a larger sampling time if the results jump between a long
    delay and 0.

OPTIONS
    The following options are supported:

    -h             Print a short description of the usage.

    -d             Print out decoded data in addition to the
                   correlation when -o is specified.

    -e ##          Specify the encoding of the audio input. Currently,
                   delay supports PCMU (U-law), PCMA (A-law) and L16
                   (linear 16-bit signed).  The default encoding is
                   L16.  This option has no effect when reading from
                   a file because the file header provides the
                   encoding information.

    -f ##          Sampling frequency in Hz. The default is 8000 Hz.
                   This option has no effect when reading from a file
                   because the file header provides the frequency
                   information.

    -p ##          Period, in seconds, for checking the delay.  When
                   reading data from the audio input device, the
                   program can check the delay periodically.  The
                   default is 60 seconds.  Use 0 or less to suppress
                   looping (run only once).  This option has no effect
                   when reading from a file because the program runs
                   only once in that case.

    -s ##          Skip the first ## seconds of data when reading from
                   a file.  The default is 0.  This option has no
                   effect when reading from the audio input device.

    -t ##          Sampling time in seconds.  The default is 2 seconds.
                   The FFT library requires the size of the data set
                   to be a power of 2.  Therefore, the delay program
                   automatically reads a data length that is a power
                   of 2 just above the specified length.  If the delay
                   is larger than half of the sampling length, the
                   results tend to vary.  Normally, more than half of
                   them are correct.  Most incorrect results fluctuate
                   near 0 (no delay) because the two sets of data are
                   considered unrelated.  Specify a larger sampling
                   time if the results jump between a long delay and 0.

    -i in          Read from the given audio file.  By default, this
                   program reads from the audio input device.

    -o out         Output the lag/correlation pairs to the file out.
                   Use "-" for stdout.  By default, this program does
                   not output these data.  This option has no effect
                   when the program loops to check the delay
                   periodically from the audio input device.
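
    The power-of-2 rounding described under -t can be sketched as
    follows.  The names next_pow2 and fft_length are hypothetical, and
    whether delay keeps a length that is already an exact power of 2
    or moves to the next one is an assumption here (this sketch keeps
    it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of 2 that is >= n.  Returns 1 for n == 0 and
 * 0 if no such power fits in size_t. */
size_t next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n) {
        if (p > SIZE_MAX / 2)
            return 0;   /* doubling would overflow */
        p <<= 1;
    }
    return p;
}

/* Number of samples to read for a requested duration and sampling rate. */
size_t fft_length(double seconds, double rate_hz)
{
    assert(seconds >= 0.0 && rate_hz > 0.0);
    return next_pow2((size_t)(seconds * rate_hz));
}
```

    With the defaults (2 seconds at 8000 Hz) the request is 16000
    samples, which rounds up to 16384 samples, or 2048 msec of data.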

OUTPUT
    This program simply prints the delay between the two channels.

    When the -o option is specified, the output is in the following
    format:
         Lag (msec)      Correlation
         -1024.000000    -1.408000e+03
         -1023.875000    -2.560000e+02
         -1023.750000    1.220000e+02

    If, in addition to the -o option, the -d option is also specified,
    the output is in the following format:
         Lag (msec)      Correlation     Left Channel    Right Channel
         -1024.000000    -1.408000e+03   0.000000e+00    0.000000e+00
         -1023.875000    -2.560000e+02   0.000000e+00    0.000000e+00
         -1023.750000    1.220000e+02    0.000000e+00    0.000000e+00
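
    The lag column can be related to the correlation vector with a
    sketch like the following.  It assumes the usual wrap-around
    storage, in which indices below N/2 hold non-negative lags and the
    remaining indices hold negative lags, and it assumes the reported
    delay is the lag of the largest correlation value.  The names
    peak_index and lag_msec are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the largest value in r[0..n-1]. */
size_t peak_index(const double *r, size_t n)
{
    size_t best = 0;

    assert(r != NULL && n > 0);
    for (size_t j = 1; j < n; j++)
        if (r[j] > r[best])
            best = j;
    return best;
}

/* Convert index j of a length-n circular correlation to a signed lag
 * in milliseconds.  Indices below n/2 are non-negative lags; the rest
 * wrap around to negative lags. */
double lag_msec(size_t j, size_t n, double rate_hz)
{
    double lag;

    assert(j < n && rate_hz > 0.0);
    lag = (j < n / 2) ? (double)j : (double)j - (double)n;
    return 1000.0 * lag / rate_hz;
}
```

    For 16384 samples at 8000 Hz, this mapping gives lags from
    -1024.000 msec upward in steps of 0.125 msec, which matches the
    first lines of the sample output above.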

EXAMPLES
    Read from a recorded audio file "data.au" and analyze the delay:
        example% delay -i data.au

    Read from a recorded audio file "data.au", analyze the delay, and
    write the correlation to the file "result":
        example% delay -i data.au -o result

    Read from a recorded audio file "data.au", analyze the delay, print
    the correlation along with the decoded data to standard output,
    and view them with a pager:
        example% delay -d -i data.au -o - | more

    Read from the audio input device with all default values:
        example% delay

    Read from the audio input device and calculate only once:
        example% delay -p 0

    Read from the audio input device with U-law encoding:
        example% delay -e PCMU
        example% delay -e PCMU

EXIT STATUS
    The following exit values are returned:

    0         The program terminates successfully.

    > 0       An error occurred.

NOTES
    This program can be used to measure one-way Internet delay.
    Split one audio signal into two channels, one transmitted over
    the telephone network and the other over the Internet.  The
    receiver connects these channels to the audio input of the
    computer, the first channel to the left and the second to the
    right.  The result from the program is then the one-way Internet
    delay relative to transmission over the switched telephone
    network.

Setup Diagram

Acknowledgment

I would like to thank Jonathan Lennox for his help and advice.  I would also like to thank Prof. Henning Schulzrinne for giving me this opportunity, for his guidance, and for his kind consideration for my medical condition during this project.

References

1
William H. Press et al., Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, 1988.
2
Audio Format FAQ, PART 1
3
Audio Format FAQ, PART 2

Last updated: 1998-05-13 by Hao Huang