MDCT is short for Modified Discrete Cosine Transform. Like the Discrete Cosine Transform (DCT), it is a variation of the Discrete Fourier Transform that takes real-valued data rather than complex numbers as input and produces real rather than complex output. The DCT transforms a block of N values in the time domain into N values in the frequency domain.

In audio coding the MDCT is used in preference to the DCT. The reason has to do with how the data is encoded: it is divided into blocks, and each block is encoded separately. Mathematically, the DCT and its inverse function (the IDCT) represent the same data in two different ways, so applying a DCT and then an IDCT gives you back exactly what you started with.
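To make the round trip concrete, here is a minimal sketch in Python. It assumes one common unnormalised form of the DCT (the DCT-II) with a matching inverse; the function names dct and idct and the sample block are made up for illustration, and a real codec would use a fast algorithm rather than these direct sums.

```python
import math

def dct(x):
    # Direct-sum DCT-II: N real inputs give N real outputs.
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                for k in range(n))
            for m in range(n)]

def idct(X):
    # Matching inverse (a scaled DCT-III) for the unnormalised dct above.
    n = len(X)
    return [(X[0] + 2 * sum(X[m] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                            for m in range(1, n))) / n
            for k in range(n)]

block = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
restored = idct(dct(block))
# restored matches block up to floating-point rounding
```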

However, in computer systems you are working with integers, and the output of a DCT is commonly quantized (that is, scaled down by dividing and taking the integer part), so precision is lost between the input of a DCT and the output of the inverse function. Because each block loses precision independently of its neighbours, a continuous waveform fed through successive DCTs comes out discontinuous, with jagged breaks between blocks. In video and JPEG images, where the DCT is commonly used, the discontinuities are considered less important, although you can see the consequence as artifacts in highly compressed images: the square edges of the blocks into which the image is divided become visible.
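The effect can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DCT form, the block size of 8, the step size of 4.0 and the names quantize and code_in_blocks are all chosen for the example, not taken from any particular codec.

```python
import math

def dct(x):
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                for k in range(n))
            for m in range(n)]

def idct(X):
    n = len(X)
    return [(X[0] + 2 * sum(X[m] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                            for m in range(1, n))) / n
            for k in range(n)]

def quantize(coeffs, step):
    # Divide and keep the integer part, then scale back up as a decoder would.
    return [int(c / step) * step for c in coeffs]

def code_in_blocks(signal, n, step):
    # Each block of n samples is transformed, quantized and decoded on its own.
    out = []
    for start in range(0, len(signal), n):
        out.extend(idct(quantize(dct(signal[start:start + n]), step)))
    return out

ramp = [0.5 * k for k in range(16)]      # a smooth, steadily rising input
decoded = code_in_blocks(ramp, 8, 4.0)   # two independently coded blocks
errors = [d - r for d, r in zip(decoded, ramp)]
boundary_jump = decoded[8] - decoded[7]  # step across the block edge
```

The input rises by 0.5 per sample everywhere. Comparing boundary_jump with the differences between neighbouring values inside one decoded block shows how the join between blocks can stand out once each block has been quantized separately.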

The MDCT is very similar to the DCT, but it allows blocks of input values to overlap. Even if the MDCT output is quantized or otherwise loses precision, applying the corresponding inverse function (the IMDCT) produces output blocks that also overlap, and adding the overlapping parts together smooths out the joins. The MDCT takes twice as many input values as it produces outputs, because the first half of each input block is the same as the second half of the previous input block.
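The overlapping of input blocks can be sketched as follows. The helper name overlapping_blocks and the sample values are made up for illustration; the point is only that each block of n values starts n/2 values after the previous one.

```python
def overlapping_blocks(samples, n):
    # Blocks of n values, each starting n/2 values after the previous one.
    half = n // 2
    return [samples[start:start + n]
            for start in range(0, len(samples) - n + 1, half)]

samples = list(range(12))
blocks = overlapping_blocks(samples, 8)
# blocks[0] covers samples 0..7 and blocks[1] covers samples 4..11,
# so the first half of blocks[1] repeats the second half of blocks[0].
```

Since every block of n inputs yields n/2 outputs and a new block starts every n/2 samples, the transform still produces one output value per new input sample overall.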

The formula for an MDCT is:

X(m) = Σ[k = 0 ... n-1] f(k) x(k) cos( pi (2k + 1 + n/2) (2m + 1) / (2n) ),      m = 0 ... (n/2 - 1)

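Where x(k) is the input sequence (the time domain values), n is the number of input values in the block, X(m) is the output (frequency domain), and f(k) is a windowing function with the right properties, e.g. f(k) = sin( pi (k + 1/2) / n ). The half-sample offset makes the window symmetric, which the cancellation between overlapping blocks relies on.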

Note that slight variations of the above MDCT function are also used, and a number of different windowing functions are in common use.
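As a rough sketch, the formula can be written directly in Python. Everything beyond the MDCT sum itself is an assumption made for this example: the sine window with a half-sample offset, the 4/n scaling of the inverse (chosen to match that window), the zero padding at both ends, and the names window, mdct, imdct and roundtrip. The direct sums are slow and only meant to show the structure.

```python
import math

def window(k, n):
    # Sine window with a half-sample offset (one common choice, assumed here).
    return math.sin(math.pi * (k + 0.5) / n)

def mdct(block):
    # n windowed inputs give n/2 outputs, following the formula above.
    n = len(block)
    return [sum(window(k, n) * block[k]
                * math.cos(math.pi * (2 * k + 1 + n / 2) * (2 * m + 1) / (2 * n))
                for k in range(n))
            for m in range(n // 2)]

def imdct(coeffs):
    # n/2 coefficients give n windowed outputs, meant to be overlap-added.
    n = 2 * len(coeffs)
    return [(4.0 / n) * window(k, n)
            * sum(coeffs[m]
                  * math.cos(math.pi * (2 * k + 1 + n / 2) * (2 * m + 1) / (2 * n))
                  for m in range(n // 2))
            for k in range(n)]

def roundtrip(signal, n):
    # Assumes len(signal) is a multiple of n/2; pads half a block at each end.
    half = n // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - n + 1, half):
        rebuilt = imdct(mdct(padded[start:start + n]))
        for k in range(n):
            out[start + k] += rebuilt[k]  # add the overlapping halves together
    return out[half:half + len(signal)]

signal = [math.sin(0.3 * i) for i in range(16)]
rebuilt = roundtrip(signal, 8)
# Without quantization, rebuilt matches signal up to floating-point rounding.
```

A single inverse-transformed block does not equal its input on its own; the original values only reappear once the overlapping halves of neighbouring blocks are added together.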


The MDCT is used to encode data in MP3 audio coding. MDCTs can be calculated efficiently using Fast Fourier Transforms, although for small transforms there are often more efficient methods specific to the transform size.
