Hindawi Publishing Corporation EURASIP Journal on Applied Signal Processing Volume 2006, Article ID 19329, Pages 1-16 DOI 10.1155/ASP/2006/19329

Cosine-Modulated Multitone for Very-High-Speed Digital Subscriber Lines

Lekun Lin and Behrouz Farhang-Boroujeny

Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT 84112-9206, USA

Received 17 November 2004; Revised 24 June 2005; Accepted 22 July 2005

In this paper, the use of cosine-modulated filter banks (CMFBs) for multicarrier modulation in the application of very-high-speed digital subscriber lines (VDSLs) is studied. We refer to this modulation technique as cosine-modulated multitone (CMT). CMT has the same transmitter structure as discrete wavelet multitone (DWMT). However, the receiver structure in CMT is different from its DWMT counterpart. DWMT uses linear combiner equalizers, which typically have more than 20 taps per subcarrier. CMT, on the other hand, adopts a receiver structure that uses only two taps per subcarrier for equalization. This paper has the following contributions. (i) A modification that reduces the computational complexity of the receiver structure of CMT is proposed. (ii) Although traditionally CMFBs are designed to satisfy perfect-reconstruction (PR) property, in transmultiplexing applications, the presence of channel destroys the PR property of the filter bank, and thus other criteria of filter design should be adopted. We propose one such method. (iii) Through extensive computer simulations, we compare CMT with zipper discrete multitone (z-DMT) and filtered multitone (FMT), the two modulation techniques that have been included in the VDSL draft standard. Comparisons are made in terms of computational complexity, transmission latency, achievable bit rate, and resistance to radio ingress noise.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

In recent years, multicarrier modulation (MCM) has attracted considerable attention as a practical and viable technology for high-speed data transmission over spectrally shaped noisy channels [1-6]. The most popular MCM technique uses the properties of the discrete Fourier transform (DFT) in an elegant way so as to achieve a computationally efficient realization. Cyclic prefix (CP) samples are added to each block of data to resolve and compensate for channel distortion. This modulation technique has been accepted by standardization bodies in both wired (digital subscriber lines—DSL) [7-10] and wireless [11,12] channels. While the terminology discrete multitone (DMT) is used in the DSL literature to refer to this MCM technique, in wireless applications, the terminology orthogonal frequency-division multiplexing (OFDM) has been adopted. The difference is that in DSL applications, MCM signals are transmitted at baseband, while in wireless applications, MCM signals are upconverted to a radio frequency (RF) band for transmission.

Zipper DMT (z-DMT) is the latest version of DMT that has been proposed as an effective frequency-division duplexing (FDD) method for very-high-speed DSL (VDSL) applications. Two variations of z-DMT have been proposed:

(i) synchronous zipper [13,14] and (ii) asynchronous zipper [15]. The synchronous zipper requires synchronization of all modems sharing the same cable (a bundle of twisted pairs). Because so many modems would have to be synchronized, this requirement has been found too restrictive, and the synchronous zipper has been identified as an infeasible solution. The asynchronous zipper, on the other hand, at the cost of some loss in performance, requires only synchronization of the pairs of modems that communicate with each other. The unsynchronized modems on the same cable then introduce some undesirable crosstalk noise. Since the asynchronous z-DMT is the one that has been adopted in the VDSL draft standard [16], in the rest of this paper all references to z-DMT are with respect to its asynchronous version.

To synchronize a pair of modems in z-DMT, cyclic suffix (CS) samples are used. Moreover, to suppress the sidelobes of DFT filters and thus allow more effective FDD, extensions are made to the CP and CS samples and pulse-shaping filters are applied [15]. All these add to the system overhead, and thus reduce the bandwidth efficiency of z-DMT.

Radio frequency interference (RFI) is a major challenge that any VDSL modem has to deal with. RF signals generated by amateur radios (HAM signals) coincide with the VDSL band [3, 4]. Thus, there is a potential of interference between VDSL and HAM signals. The first solution to separate HAM and VDSL signals is to prohibit VDSL transmission over the HAM bands. This solution, along with the pulse-shaping method adopted in z-DMT, will solve the problem of egress interference from VDSL signals into HAM signals. However, the poor sidelobe behavior of DFT filters and the very high level of RFI still result in interference which degrades the performance of z-DMT significantly. RFI cancellers are thus needed to improve the performance of z-DMT. There are a number of methods in the literature that cancel RFI by treating the ingress as a tone with no or very small variation in amplitude over each data block of DMT [17-19]. Such methods have been found to be limited in performance. Another method is to pick up a reference RFI signal from the common-mode component of the twisted-pair signals and use it as input to an adaptive filter for synthesizing and subtracting the RFI from the received signal [20]. This method, which may be implemented in analog or digital form, can suppress RFI by as much as 20 to 25 dB [19]. Our understanding from the limited literature available on RFI cancellation is that a combination of these two methods will result in the best performance in any DMT-based transceiver. Thus, the comparisons given in the later sections of this paper consider such an RFI canceller setup for z-DMT.

Since RFI cancellation is rather difficult to implement, there is a current trend in the industry to adopt filter-bank-based MCM techniques. These can deal with RFI more efficiently, thanks to much superior stopband suppression behavior of filter banks compared to DFT filters. We note that z-DMT has made an attempt to improve on stopband suppression. However, as we show in Section 6, z-DMT is still much inferior to filter bank solutions.

Filtered multitone (FMT) is a filter bank solution that has been proposed by IBM [21-23] and has been widely studied recently. In order to avoid interference among various subcarriers, FMT adopts a filter bank with very sharp transition bands and allocates sufficient excess bandwidth, typically in the range from 0.05 to 0.125. This introduces significant intersymbol interference (ISI) that is dealt with by using a separate decision feedback equalizer (DFE) for each subcarrier [23]. Such DFEs are computationally very costly as they require a relatively large number of feedforward and feedback taps. Nevertheless, the advantages offered by this solution, especially with respect to suppression of ingress RFI, have justified its application, and thus FMT has been included as an annex to the VDSL draft standard [16].

Cosine-modulated filter banks (CMFBs) working at maximally decimated rate, on the other hand, are well understood and widely used for signal compression [24]. Moreover, the use of filter banks for realization of transmultiplexer systems [24] as well as their application to MCM [25] have been recognized by many researchers. In particular, the use of CMFB for multicarrier data transmission in DSL channels has been widely addressed in the literature, under the common terminology of discrete wavelet multitone (DWMT), for example, see [25-32]. In DWMT, it is proposed that channel equalization in each subcarrier be performed by combining the signals from the desired band and its adjacent bands. These equalizers, which have been referred to as postcombiner equalizers, impose a significant load on the computational complexity of the receiver. This complexity and the lack of an in-depth theoretical understanding of DWMT have kept industry lukewarm about it in the past.

A revisit to CMFB-MCM/DWMT has been made recently [33-36]. In the first work [33], an in-depth study of DWMT was performed, assuming that the channel could be approximated by a complex constant gain over each subcarrier band. This study, which is also intuitively sound, revealed that the coefficients of each postcombiner equalizer are closely related to the underlying prototype filter of the filter bank. Furthermore, there are only two parameters per subcarrier that need to be adapted, namely, the real and imaginary parts of the inverse of the channel gain. In a further study [34, 35], it was noted that by properly restructuring the receiver, each postcombiner equalizer could be replaced by a two-tap filter. It was also shown that there is no need for cross-filters (as used in the postcombiner equalizers in DWMT), thanks to the (near-)perfect reconstruction property of CMFB. Moreover, a constant-modulus blind equalization algorithm (CMA) was developed [34, 35]. In [36], a receiver structure that combines signals from a CMFB and a sine-modulated filter bank (SMFB) is also proposed to avoid cross-filters. This structure, which is fundamentally similar to the one in [34, 35], approaches the receiver design from a slightly different angle. The complexity of the CMFB/SMFB receiver is discussed in [37], where an efficient structure is proposed. In a further development [38], it is noted that CMFB/SMFB can be configured for transmission of complex-modulated signals (such as quadrature amplitude modulated, QAM, signals). This is useful for data transmission over RF channels, but is not relevant to xDSL channels, which are fundamentally baseband.

In this paper, we extend the application of CMFB-MCM to VDSL channels. The following contributions are made. The receiver structure proposed in [34, 35] is modified in order to minimize its computational complexity. Moreover, we discuss the problem of prototype filter design in transmultiplexer systems. We note that the traditional perfect-reconstruction (PR) designs are not appropriate in this application, and thus develop a near-PR (NPR) design strategy. There are some similarities between our design strategy and that of [39], where prototype filters are designed for FMT. We contrast CMFB-MCM against z-DMT and FMT and make an attempt to highlight the relative advantages that each of these three methods offers. In order to distinguish between the proposed method and DWMT, we refer to it as cosine-modulated multitone (CMT). We believe the term "cosine-modulated filter bank" (and thus CMT) is more reflective of the nature of this modulation technique than the term "wavelet." The term wavelet is commonly used in conjunction with filter banks in which the bandwidth of each subband varies in proportion to its center frequency. In CMFB, all subbands have the same bandwidth. Moreover, the modulator and demodulator blocks that we use are directly developed from a pair of synthesis and analysis CMFB, respectively. We should also acknowledge that there have been some attempts to develop communication systems that use wavelets with variable bandwidths, for example, see [40] and the references therein.

Figure 1: Block schematic of a CMFB-based transmultiplexer.

An important class of filter-bank-based transmultiplexer systems that avoid ISI and ICI completely has been studied recently, for example, [41, 42]. Similar to DMT, where cyclic prefix samples are used to avoid ISI and ICI, here also redundant samples are added (e.g., through precoding) for the same purpose. Such systems, thus, similar to DMT and FMT, suffer from bandwidth loss/inefficiency. Moreover, since the designed filter banks, in general, are not based on a prototype filter, they cannot be realized in any simple manner, for example, in a polyphase DFT structure. Hence, they do not seem attractive for applications such as DSL where filter banks with a large number of subbands have to be adopted.

The rest of this paper is organized as follows. We present an overview of CMFB-MCM/CMT in Section 2. In Section 3, we propose a novel structure of the CMT receiver which reduces its complexity significantly compared to the previous reports [34, 35]. In Section 4, we develop an NPR prototype filter design scheme. Computational complexities and latency issues are discussed and comparisons with z-DMT and FMT are made in Section 5. This is followed by a presentation of a wide range of computer simulations, in Section 6, where we compare z-DMT, FMT, and CMT under different practical conditions. The concluding remarks are made in Section 7.

2. COSINE-MODULATED MULTITONE

Figure 1 presents a block diagram of a CMFB-based transmultiplexer system. At the transmitter, the data symbol streams $s_k(n)$ are first expanded to a higher rate by inserting $M-1$ zeros after each sample. Modulation and multiplexing of data streams are then done using a synthesis filter bank. At the receiver, an analysis filter bank followed by a set of decimators is used to demodulate and extract the transmitted symbols. The delay $\delta$ at the receiver input is required to adjust the total delay introduced by the system to an integer multiple of $M$. When $\delta$ is selected correctly, the channel noise $v(n)$ is zero, and the channel is perfect, that is, $H(z) = 1$, a well-designed transmultiplexer delivers a delayed replica of the data symbols $s_k(n)$ at its outputs, that is, $\hat{s}_k(n) = s_k(n - \Delta)$, where $\Delta$ is an integer. However, due to the channel distortion, the recovered symbols suffer from intersymbol interference (ISI) and intercarrier interference (ICI). Equalizers are thus used to combat the channel distortion. As noted above, postcombiner equalizers that span across the adjacent subbands and along the time axis were originally proposed for this purpose [25]. Such equalizers are rather complex; typically, 20 or more taps per subcarrier are used. A recent development [34, 35] has shown that with a modified analysis filter bank, each subcarrier can be equalized by using only two taps. In the rest of this section, we present a review of this modified CMFB-based transmultiplexer and explain how such simple equalization can be established. As noted above, we call this new scheme CMT.

In CMT, the transmitter follows the conventional implementation of the synthesis CMFB [24]. For the receiver, we resort to a nonsimplified structure of the analysis CMFB. Figure 2 presents a block diagram of this nonsimplified structure for an $M$-band analysis CMFB; see [24] for development of this structure. $G_k(z)$, $0 \le k \le 2M-1$, are the polyphase components of the filter bank prototype filter $P(z)$, namely,

$$P(z) = \sum_{k=0}^{2M-1} z^{-k} G_k(z^{2M}). \tag{1}$$

The coefficients $d_0, d_1, \ldots, d_{2M-1}$ are chosen in order to equalize the group delay of the filter bank subchannels. This gives $d_k = e^{j\theta_k} W_{2M}^{(k+0.5)N/2}$ for $k = 0, 1, \ldots, M-1$, and $d_k = d^*_{2M-1-k}$ for $k = M, M+1, \ldots, 2M-1$, where $\theta_k = (-1)^k(\pi/4)$, $W_{2M} = e^{-j2\pi/2M}$, $*$ denotes conjugate, and $N$ is the order of $P(z)$.

Let $Q_0^a(z), Q_1^a(z), \ldots, Q_{2M-1}^a(z)$ denote the transfer functions between the input $x(n)$ and the analyzed outputs $u_0^o(n), u_1^o(n), \ldots, u_{2M-1}^o(n)$, respectively. We recall from the theory of CMFB that $Q_k^a(z) = d_k P_0(zW_{2M}^{k+0.5})$ for $k = 0, 1, \ldots, 2M-1$, see [24]. The CMFB analysis filters are generated by adding the pairs of $Q_k^a(z)$ and $Q_{2M-1-k}^a(z)$, for $k = 0, 1, \ldots, M-1$. This leads to $M$ analysis filters [24]

$$F_k^a(z) = Q_k^a(z) + Q_{2M-1-k}^a(z), \quad k = 0, 1, \ldots, M-1. \tag{2}$$
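The pairing in (2) can be checked numerically. The following is a minimal sketch, assuming that the prototype written $P_0$ above is the prototype $P(z)$ of (1), and using a short sine-window prototype chosen purely for illustration (it is not the prototype designed in this paper). The function name `analysis_filters` is hypothetical. The sketch builds the $2M$ complex filters from the definition of $d_k$ and confirms that $Q_{2M-1-k}^a$ has the conjugate coefficients of $Q_k^a$, so that each $F_k^a$ is a real, cosine-modulated filter.

```python
import cmath
import math

def analysis_filters(p, M):
    """Coefficients of Q_k^a(z) = d_k P(z W_2M^(k+0.5)) for k = 0..2M-1."""
    N = len(p) - 1  # order of the prototype filter

    def W(x):
        # W_2M^x with W_2M = exp(-j*2*pi/(2M))
        return cmath.exp(-1j * math.pi * x / M)

    d = [0j] * (2 * M)
    for k in range(M):
        theta_k = (-1) ** k * math.pi / 4
        d[k] = cmath.exp(1j * theta_k) * W((k + 0.5) * N / 2)
    for k in range(M, 2 * M):
        d[k] = d[2 * M - 1 - k].conjugate()

    # The n-th coefficient of P(z W^(k+0.5)) is p(n) W^(-(k+0.5) n).
    return [[d[k] * p[n] * W(-(k + 0.5) * n) for n in range(len(p))]
            for k in range(2 * M)]

M = 4
# Illustrative symmetric prototype (an assumption, not the paper's design).
p = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]
q = analysis_filters(p, M)

# F_k^a(z) = Q_k^a(z) + Q_{2M-1-k}^a(z), equation (2)
f = [[q[k][n] + q[2 * M - 1 - k][n] for n in range(len(p))] for k in range(M)]
```

Under these assumptions, the coefficients of $F_k^a$ reduce to $2p(n)\cos\big((\pi/M)(k+0.5)(n-N/2)+(-1)^k\pi/4\big)$, which is the familiar cosine-modulated form.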

Figure 2: The analysis CMFB structure that is proposed for CMT.

The synthesis filters $F_k^s(z)$ are given as [24]

$$F_k^s(z) = Q_k^s(z) + Q_{2M-1-k}^s(z), \quad k = 0, 1, \ldots, M-1, \tag{3}$$

where $Q_k^s(z) = z^{-N} Q_{k,*}^a(z^{-1})$ and the subscript $*$ means conjugating the coefficients.

In a CMT transceiver, the synthesis filters $F_k^s(z)$ are used at the transmitter. However, at the receiver, we resort to using the complex coefficient analysis filters $Q_k^a(z)$. In the absence of channel, and assuming that a pair of synthesis and analysis CMFB with PR are used, we get [24]

$$u_k(n) = \frac{1}{2}\left[s_k(n - \Delta) + j r_k(n)\right], \tag{4}$$

where $r_k(n)$ arises because of ISI from the $k$th subchannel and ICI from other subchannels. The PR property of CMFB allows us to remove the ISI-plus-ICI term $r_k(n)$ and extract the desired symbol $s_k(n-\Delta)$ simply by taking twice the real part of $u_k(n)$. This, of course, is in the absence of channel. The presence of channel affects $u_k(n)$, and $s_k(n - \Delta)$ can no longer be extracted by the above procedure.

In order to include the effect of the channel, we make the simplifying, but reasonable, assumption that the number of subbands is sufficiently large such that the channel frequency response $H(z)$ over the $k$th subchannel can be approximated by a complex constant gain $h_k$. Moreover, we assume that variation of the channel group delay over the band of transmission is negligible. Then, in the presence of channel, we obtain

$$u_k(n) = \frac{1}{2}\left[s_k(n - \Delta) + j r_k(n)\right] \times h_k + v_k(n), \tag{5}$$

where $v_k(n)$ is the channel additive noise after filtering. The numerical results presented in Section 6 show that for a reasonably large value of $M$, the assumption of flat channel gain over each subcarrier is very reasonable. However, for channels with bridged taps, the group delay variation may not be insignificant. Nevertheless, the incurred performance loss, found through simulation, is tolerable. Clearly, the latter loss could be compensated by adjusting the delay in each subcarrier channel separately. But this would be at the cost of a significant increase in the receiver complexity, which may not be justifiable for such a minor improvement.

Considering (5), an estimate of $s_k(n)$ can be obtained as follows:

$$\hat{s}_k(n) = \Re\{w_k^* u_k(n)\} = w_{k,R}u_{k,R}(n) + w_{k,I}u_{k,I}(n), \tag{6}$$

where the subscripts R and I denote the real and imaginary parts of the respective variables. Equation (6) shows that the distorted received signal $u_k(n)$ can be equalized by using a complex tap weight $w_k$ or, equivalently, by using two real tap weights $w_{k,R}$ and $w_{k,I}$. If we define the optimum value of $w_k$, $w_{k,\mathrm{opt}}$, as the one that maximizes the signal-to-noise-plus-interference ratio at the equalizer output, we find that

$$w_{k,\mathrm{opt}} = \frac{2}{h_k^*}. \tag{7}$$
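A small numerical sketch of the two-tap equalizer follows. It assumes the conjugate form $\Re\{w_k^* u_k(n)\}$ with $w_{k,\mathrm{opt}} = 2/h_k^*$ and a noise-free observation following (5). The function names and the numerical values of $s$, $r$, and $h$ are hypothetical and serve only to illustrate that the interference term, which sits in quadrature with the symbol, is removed by the real-part operation.

```python
def optimum_weight(h_k):
    """w_opt = 2 / conj(h_k), the weight that undoes a flat gain h_k."""
    return 2 / h_k.conjugate()

def equalize(u_k, w_k):
    """Two real multiplications: Re{conj(w_k) u_k} = w_R u_R + w_I u_I."""
    return w_k.real * u_k.real + w_k.imag * u_k.imag

# Hypothetical values chosen only for illustration.
s, r = 3.0, -1.7                 # desired symbol and ISI-plus-ICI term
h = 0.4 - 0.9j                   # flat complex channel gain over the subcarrier
u = 0.5 * (s + 1j * r) * h       # noise-free analysis output, as in (5)
s_hat = equalize(u, optimum_weight(h))
```

With these values `s_hat` reproduces `s` to within rounding error, even though the quadrature term `r` is not small.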

At this point, we will make some comments about DWMT and clarify the difference between the proposed receiver and that of DWMT [25]. In DWMT, the analyzed subcarrier signals that are passed to the postcombiner equalizers are the outputs of the $F_k^a(z)$ filters, that is, $2\Re\{u_k(n)\}$. Since these outputs are real-valued, they lack the channel phase information and, hence, a transversal equalizer with input $2\Re\{u_k(n)\}$ will fall short in removing ISI and ICI. To compensate for the loss of phase information, in DWMT, it was proposed that samples of signals from the $k$th subcarrier channel and its adjacent subcarrier channels be combined together for equalization. A theoretical explanation of why this method works can be found in [33]. Hence, the main difference between DWMT and CMT is their respective receiver structures. DWMT uses $F_k^a(z)$ as analysis filters. CMT, on the other hand, uses the analysis filters $Q_k^a(z)$. This (minor) change in the receiver allows CMT to adopt simple equalizers with only two real-valued tap weights per subcarrier band, while DWMT needs equalizers that are an order of magnitude higher in complexity.

3. EFFICIENT REALIZATION OF ANALYSIS CMFB

Efficient implementation of synthesis CMFB using discrete cosine transform (DCT) can be found in [24]. This will be used at the transmitter side of a CMT transceiver. At the receiver, as discussed above, we use a modified structure of analysis CMFB. Thus, efficient implementations that are available for the conventional analysis CMFB, for example, [24], are of no use here. We develop a computationally efficient realization of the analysis CMFB by modifying the structure of Figure 2.

Figure 3: Efficient implementation of the analysis CMFB.

At the receiver, we need to implement filters $Q_0^a(z), Q_1^a(z), \ldots, Q_{M-1}^a(z)$. Recalling that $Q_{2M-1-k}^a(e^{-j\omega}) = [Q_k^a(e^{j\omega})]^*$ and $x(n)$ is real-valued, we argue that these filters can equivalently be implemented by realizing $Q_k^a(z)$ for $k = 0, 2, 4, \ldots, 2M-2$, that is, for even values of $k$ only; $Q_1^a(z)$, for instance, is realized by taking the conjugate of the output of $Q_{2M-2}^a(z)$. We thus note from Figure 2 that

$$Q_{2k}^a(z) = d_{2k}\sum_{l=0}^{2M-1}\left(z^{-1}W_{2M}^{-1/2}\right)^{l} G_l(-z^{2M})\, W_{2M}^{-2kl} = d_{2k}\sum_{l=0}^{M-1}\left[z^{-l}\left(G_l(-z^{2M}) + jz^{-M}G_{l+M}(-z^{2M})\right)W_{2M}^{-l/2}\right]W_M^{-kl}. \tag{8}$$

Using (8) to modify Figure 2 and using the noble identities [24] to move the decimators to the position before the polyphase component filters, we obtain the efficient implementation of Figure 3. This implementation has a computational complexity that is approximately one half of that of the original structure in Figure 2, assuming that the decimators in the latter are also moved to the position before the polyphase component filters. Here, the 2M-point IDFT in Figure 2 is replaced by an M-point IDFT. The block C reorders and conjugates the output samples, wherever needed.

The realization of Figure 3 involves implementation of $M$ polyphase component filters $G_l(-z^2) + jz^{-1}G_{l+M}(-z^2)$, $M$ complex scaling factors $W_{2M}^{-l/2}$, an $M$-point IDFT, and the group delay compensatory coefficients $d_k$. The latter coefficients may be deleted as they can be lumped together with the equalizer coefficients $w_k$.
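The equivalence between direct filtering followed by decimation and the polyphase realization can be checked on a small example. The sketch below is written under several assumptions: the filters are $Q_{2k}^a(z) = d_{2k}P(zW_{2M}^{2k+0.5})$ with $P(z)$ of length $2mM$, the M-point IDFT is evaluated as a plain (unnormalized) sum rather than with a fast algorithm, and the reorder-and-conjugate block C is left out. All function names are hypothetical. After decimation, the $t$th tap of the branch filter $G_l(-z^2) + jz^{-1}G_{l+M}(-z^2)$ is $j^t p(Mt+l)$, which is what the inner loop uses.

```python
import cmath
import math

def d_coeff(k, M, N):
    """Group-delay coefficient d_k as defined in Section 2."""
    if k >= M:
        return d_coeff(2 * M - 1 - k, M, N).conjugate()
    theta = (-1) ** k * math.pi / 4
    return cmath.exp(1j * theta) * cmath.exp(-1j * math.pi * (k + 0.5) * N / (2 * M))

def direct_analysis(x, p, M, n_out):
    """Reference: filter x with Q_{2k}^a(z), then keep every M-th sample."""
    N = len(p) - 1
    out = []
    for k in range(M):
        q = [d_coeff(2 * k, M, N) * p[n] * cmath.exp(1j * math.pi * (2 * k + 0.5) * n / M)
             for n in range(len(p))]
        out.append([sum(q[n] * x[i * M - n]
                        for n in range(len(p)) if 0 <= i * M - n < len(x))
                    for i in range(n_out)])
    return out

def polyphase_analysis(x, p, M, n_out):
    """Decimate first, then branch filters, scaling, IDFT and d_{2k}."""
    N = len(p) - 1
    taps = len(p) // M  # 2m taps per branch; len(p) must be a multiple of M
    v = []
    for l in range(M):
        scale = cmath.exp(1j * math.pi * l / (2 * M))  # W_2M^(-l/2)
        branch = []
        for i in range(n_out):
            acc = 0j
            for t in range(taps):
                idx = (i - t) * M - l  # decimated input of branch l, delayed by t
                if 0 <= idx < len(x):
                    acc += (1j ** t) * p[M * t + l] * x[idx]
            branch.append(scale * acc)
        v.append(branch)
    # Unnormalized M-point IDFT across the branches, followed by d_{2k}.
    return [[d_coeff(2 * k, M, N)
             * sum(v[l][i] * cmath.exp(2j * math.pi * k * l / M) for l in range(M))
             for i in range(n_out)]
            for k in range(M)]
```

In this form every branch runs at the decimated rate, which is the source of the saving; a practical implementation would replace the explicit IDFT sum with an FFT.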

The structure of Figure 3 should be compared with the analysis CMFB/SMFB structure of [37]. On the basis of the operation count (the number of multiplications and additions per unit of time), the two structures are similar. However, they are different in their structural details. While Figure 3 uses an M-point IDFT with complex-valued inputs, the CMFB/SMFB structure uses two separate transforms (a DCT and a DST) with real-valued inputs. Therefore, a preference for one over the other depends on the available hardware or software platform on which the system is to be implemented.

4. PROTOTYPE FILTER DESIGN

Prototype filter design is an important issue in CMT modulation. In CMFB, conventionally, the prototype filter is designed to satisfy the PR property. However, in the application of interest to this paper, the presence of channel results in a loss of the PR property. In this section, we take note of this fact and propose a prototype filter design scheme which instead of designing for PR aims at minimizing the ISI plus ICI and maximizing the stopband attenuation. We thus adopt an NPR design. For this purpose, we develop a cost function in which a balance between the ISI plus ICI and the stopband attenuation is struck through a design parameter. A similar approach was adopted in [39] for designing prototype filter in FMT.

4.1. ISI and ICI

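Referring to Figures 1 and 2, and assuming that only adjacent subchannels overlap, in the absence of channel noise, we obtain

$$U_k^o(z) = z^{-\delta}\left(S_k(z^M)F_k^s(z) + S_{k-1}(z^M)F_{k-1}^s(z) + S_{k+1}(z^M)F_{k+1}^s(z)\right)H(z)Q_k^a(z), \tag{9}$$

where $S_k(z)$ is the z-transform of $s_k(n)$ and z-transforms of other sequences are defined similarly. Substituting (3) in (9) and noting that for $k \neq 0$ and $M-1$, $Q_k^a(z)$ has no (significant) overlap with $Q_{2M-k}^s(z)$, $Q_{2M-1-k}^s(z)$, and $Q_{2M-2-k}^s(z)$, we obtain, for$^{1}$ $k \neq 0$ and $M-1$,

$$U_k^o(z) = z^{-\delta}\left(S_k(z^M)Q_k^s(z) + S_{k-1}(z^M)Q_{k-1}^s(z) + S_{k+1}(z^M)Q_{k+1}^s(z)\right)H(z)Q_k^a(z). \tag{10}$$

We use the notation $[\,\cdot\,]_{\downarrow M}$ to denote the $M$-fold decimation. Recalling that $[U_k^o(z)]_{\downarrow M} = U_k(z)$ and for arbitrary functions $X(z)$ and $Y(z)$, $[X(z^M)Y(z)]_{\downarrow M} = X(z)[Y(z)]_{\downarrow M}$, from (10), we obtain

$$U_k(z) = S_k(z)\left[z^{-\delta}Q_k^s(z)H(z)Q_k^a(z)\right]_{\downarrow M} + S_{k-1}(z)\left[z^{-\delta}Q_{k-1}^s(z)H(z)Q_k^a(z)\right]_{\downarrow M} + S_{k+1}(z)\left[z^{-\delta}Q_{k+1}^s(z)H(z)Q_k^a(z)\right]_{\downarrow M}. \tag{11}$$

Using (7), we get the estimate of $S_k(z)$ (the equalized signal)

$$\hat{S}_k(z) = \Re\left\{\frac{2}{h_k}U_k(z)\right\} = S_k(z)A_k(z) + S_{k-1}(z)B_k(z) + S_{k+1}(z)C_k(z), \tag{12}$$

where

$$A_k(z) = \Re\left\{\frac{2}{h_k}\left[z^{-\delta}Q_k^s(z)H(z)Q_k^a(z)\right]_{\downarrow M}\right\},\quad B_k(z) = \Re\left\{\frac{2}{h_k}\left[z^{-\delta}Q_{k-1}^s(z)H(z)Q_k^a(z)\right]_{\downarrow M}\right\},\quad C_k(z) = \Re\left\{\frac{2}{h_k}\left[z^{-\delta}Q_{k+1}^s(z)H(z)Q_k^a(z)\right]_{\downarrow M}\right\}, \tag{13}$$

and $\Re\{\cdot\}$ when applied to a transfer function means forming a transfer function by taking the real parts of the coefficients of the argument. When applied to a complex number or vector, $\Re\{\cdot\}$ denotes "the real part of."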

If the prototype filter was designed to satisfy the PR condition, in the absence of the channel, we would have $A_k(z) = z^{-\Delta}$, $B_k(z) = 0$, and $C_k(z) = 0$. In the presence of the channel, these properties are lost and accordingly the ISI and ICI powers at the $k$th subchannel are expressed, respectively, as

$$\zeta_{k,\mathrm{ISI}} = (a_k - u)^T(a_k - u), \tag{14}$$

$$\zeta_{k,\mathrm{ICI}} = b_k^T b_k + c_k^T c_k, \tag{15}$$

where $a_k$, $b_k$, and $c_k$ are the column vectors of the coefficients of $A_k(z)$, $B_k(z)$, and $C_k(z)$, respectively, and $u$ is a column vector with $\Delta$th element of 1 and 0 elsewhere.

The above results were given for the case when only the adjacent bands overlap. When each subcarrier band overlaps with more than two of its neighbor subcarrier bands, the above results may be easily extended by defining more polynomials like Bk(z) and Ck(z), and accordingly adding more terms to (15).

$^{1}$ In DSL applications, the subchannels near the origin ($k = 0$) and $\pi$ ($k = M - 1$) do not carry any data [25].

4.2. The cost function

The cost function that we minimize for designing the prototype filter is defined as

$$\zeta = \zeta_s + \gamma(\zeta_{\mathrm{ISI}} + \zeta_{\mathrm{ICI}}), \tag{16}$$

where $\zeta_s$ is the stopband energy of the prototype filter, defined below, and $\gamma$ is a positive parameter which should be selected to strike a balance between the stopband energy and ISI plus ICI. A larger $\gamma$ leads to a smaller ISI plus ICI. Here and in the remaining discussions, for convenience, we drop the subcarrier band index $k$ of $\zeta_{k,\mathrm{ISI}}$ and $\zeta_{k,\mathrm{ICI}}$.

Selecting the frequency grid $\{\omega_0, \omega_1, \ldots, \omega_{L-1}\}$ in the interval $[\omega_s, \pi]$, where $\omega_s$ is the stopband edge of the prototype filter, we define

$$\zeta_s = \frac{1}{L}\sum_{l=0}^{L-1}\left|P(e^{j\omega_l})\right|^2. \tag{17}$$

We also assume that the prototype filter $P(z)$ has a length of $2mM$. This choice of the length follows that of the PR CMFB [24], and is believed to be appropriate since here we design a filter bank with NPR property. Moreover, we follow the PR CMFB convention and design a linear-phase prototype filter. This implies that

$$P(e^{j\omega_l}) = e^{-j\omega_l(mM-0.5)}\sum_{n=0}^{mM-1} 2p(mM+n)\cos\big(\omega_l(n+0.5)\big), \tag{18}$$

where $p(n)$ is the $n$th coefficient of $P(z)$. Rearranging (18), we obtain

$$\begin{bmatrix} e^{j\omega_0(mM-0.5)}P(e^{j\omega_0}) \\ e^{j\omega_1(mM-0.5)}P(e^{j\omega_1}) \\ \vdots \\ e^{j\omega_{L-1}(mM-0.5)}P(e^{j\omega_{L-1}}) \end{bmatrix} = Cp, \tag{19}$$

where $C$ is an $L \times mM$ matrix with the $ij$th element $c_{i,j} = 2\cos(\omega_{i-1}(j - 0.5))$ and $p = [p(mM)\ p(mM+1)\ \cdots\ p(2mM-1)]^T$. Using (19), (17) may be rearranged as

$$\zeta_s = \frac{1}{L}\,p^T C^T C p. \tag{20}$$
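The agreement between the direct definition of the stopband energy and its quadratic form can be verified on a short filter. In this sketch the stopband energy is taken as the average of $|P(e^{j\omega_l})|^2$ over the grid (an assumption about the normalization), the grid is uniform, and the half-filter coefficients and the stopband edge are hypothetical values used only for illustration.

```python
import cmath
import math

def stopband_energy_direct(p_full, omegas):
    """Average of |P(e^{jw})|^2 over the grid, from the full coefficient set."""
    total = 0.0
    for w in omegas:
        P = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(p_full))
        total += abs(P) ** 2
    return total / len(omegas)

def stopband_energy_quadratic(p_half, omegas):
    """(1/L) p^T C^T C p with C[i][j] = 2 cos(w_i (j + 0.5)), j counted from 0."""
    total = 0.0
    for w in omegas:
        row_dot_p = sum(2 * math.cos(w * (j + 0.5)) * pj for j, pj in enumerate(p_half))
        total += row_dot_p ** 2
    return total / len(omegas)

# Hypothetical second half p(mM), ..., p(2mM-1) of a linear-phase prototype.
p_half = [0.30, 0.22, 0.11, 0.04, 0.01, -0.01, -0.005, 0.002]
p_full = p_half[::-1] + p_half   # symmetry p(n) = p(2mM - 1 - n)

L_grid = 64
w_s = math.pi / 4                # hypothetical stopband edge
omegas = [w_s + l * (math.pi - w_s) / (L_grid - 1) for l in range(L_grid)]

zeta_direct = stopband_energy_direct(p_full, omegas)
zeta_quadratic = stopband_energy_quadratic(p_half, omegas)
```

Only the second half of the coefficients enters the quadratic form, which halves the number of unknowns in the design.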

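To calculate $\zeta_{\mathrm{ISI}}$ and $\zeta_{\mathrm{ICI}}$, we note that since $Q_k^s(z)Q_k^a(z)$, $Q_{k-1}^s(z)Q_k^a(z)$, and $Q_{k+1}^s(z)Q_k^a(z)$ are narrowband filters centered around the $k$th subcarrier band and over this band $H(z)$ may be approximated by the constant gain $h_k$, from (13), we obtain

$$a_k = 2\Re\left\{[q_k^s \star q_k^a]_{\downarrow M}\right\}, \tag{21}$$

$$b_k = 2\Re\left\{[q_{k-1}^s \star q_k^a]_{\downarrow M}\right\}, \tag{22}$$

$$c_k = 2\Re\left\{[q_{k+1}^s \star q_k^a]_{\downarrow M}\right\}, \tag{23}$$

where $\star$ stands for convolution and $q_k^s$ and $q_k^a$ are the column vectors of coefficients of $z^{-\delta}Q_k^s(z)$ and $Q_k^a(z)$, respectively.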
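As a sanity check of these expressions, the sketch below evaluates $a_k$, $b_k$, and $c_k$ for an interior subchannel and then the ISI and ICI powers of (14) and (15). It assumes a length-$2M$ sine-window prototype scaled by $1/\sqrt{2M}$, which is a standard PR choice and not the NPR prototype designed in this paper, together with $\delta = 1$ and $\Delta = 2$ so that the total delay is $2M$. The function names are hypothetical. For this PR prototype both powers should vanish to within rounding.

```python
import cmath
import math

def q_analysis(p, M, k):
    """Coefficients of Q_k^a(z) for 0 <= k <= M-1."""
    N = len(p) - 1
    theta = (-1) ** k * math.pi / 4
    return [p[n] * cmath.exp(1j * ((math.pi / M) * (k + 0.5) * (n - N / 2) + theta))
            for n in range(len(p))]

def q_synthesis(p, M, k, delta):
    """Coefficients of z^{-delta} Q_k^s(z): reversed, conjugated, then delayed."""
    return [0j] * delta + [c.conjugate() for c in reversed(q_analysis(p, M, k))]

def convolve(a, b):
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def response(p, M, k_tx, k_rx, delta):
    """2 Re{[q^s_{k_tx} * q^a_{k_rx}] decimated by M}, ideal channel assumed."""
    c = convolve(q_synthesis(p, M, k_tx, delta), q_analysis(p, M, k_rx))
    return [2 * c[n].real for n in range(0, len(c), M)]

M, k, delta, Delta = 4, 1, 1, 2
# Sine-window PR prototype (an assumption made for this illustration only).
p = [math.sin(math.pi * (n + 0.5) / (2 * M)) / math.sqrt(2 * M) for n in range(2 * M)]

a = response(p, M, k, k, delta)
b = response(p, M, k - 1, k, delta)
c = response(p, M, k + 1, k, delta)
u = [1.0 if i == Delta else 0.0 for i in range(len(a))]

zeta_isi = sum((ai - ui) ** 2 for ai, ui in zip(a, u))
zeta_ici = sum(t * t for t in b) + sum(t * t for t in c)
```

Replacing `p` with a prototype that is not PR, or including a non-ideal channel, makes both powers nonzero, which is the situation the cost function of this section is meant to control.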

Equation (21) may be expressed in a matrix form as

$$a_k = 2\Re\{Qq_k^a\}, \tag{24}$$

where the matrix $Q$ is obtained by arranging $q_k^s$ and its shifted copies in a matrix $Q_0$ and decimating $Q_0$ by $M$ in each of the columns. Noting that $q_k^a(n) = p(n)e^{j((\pi/M)(k+0.5)(n-N/2)+(-1)^k(\pi/4))}$, $p(n) = p(2mM - n - 1)$, and defining $D$ as a diagonal matrix with the $n$th diagonal element $d_{n,n} = e^{j((\pi/M)(k+0.5)(n-N/2)+(-1)^k(\pi/4))}$, (24) may be written as

$$a_k = 2\Re\{QD\}\begin{bmatrix} p_r \\ p \end{bmatrix}, \tag{25}$$

where $p_r$ is obtained by reversing the order of elements of $p$. In matrix/vector notation, $p_r = Jp$, where $J$ is the antidiagonal matrix with antidiagonal elements of 1. Using this in (25), we obtain

$$a_k = Ep, \tag{26}$$

where $E = 2\Re\{QD\}\begin{bmatrix} J \\ I \end{bmatrix}$ and $I$ is the identity matrix. Substituting (26) in (14), we obtain

$$\zeta_{\mathrm{ISI}} = (Ep - u)^T(Ep - u). \tag{27}$$

Following similar steps, we obtain

$$\zeta_{\mathrm{ICI}} = p^T F^T F p, \tag{28}$$

where the matrix $F$ is constructed in the same way as $E$, by replacing $q_k^s$ with $q_{k-1}^s$ and $q_{k+1}^s$ in turn and stacking the two resulting matrices.

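Now substituting (20), (27), and (28) in (16), we obtain

$$\zeta = (Gp - v)^T(Gp - v), \tag{29}$$

where $G = \begin{bmatrix} E \\ F \\ (1/\sqrt{\gamma L})\,C \end{bmatrix}$, $v = \begin{bmatrix} u \\ 0 \end{bmatrix}$, and $0$ is a zero column vector with proper length. With this scaling of $G$, the right-hand side of (29) equals the cost function of (16) divided by the positive constant $\gamma$, which does not change the minimizer.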

4.3. Minimization of the cost function

We note that $q_k^s$, and thus $G$, depends on $p$. Hence, the cost function (29) is fourth order in the filter coefficients $p(n)$, and thus its minimization is nontrivial. Rossi et al. [43] proposed an iterative least-squares (ILS) minimization for a similar problem. They formulated the same filter design problem for the case of a PR CMFB. Adopting the method of Rossi et al. [43], we minimize $\zeta$ by using the following procedure.

Step 1. Let p = p0; an initial choice.

Step 2. Construct the matrix G using the current value of p.

Step 3. Form the normal equation $\Psi p = \theta$, where $\Psi = G^T G$ and $\theta = G^T v$.

Step 4. Compute $p' = \Psi^{-1}\theta$.

Step 5. Let $p_0 \leftarrow (p_0 + p')/2$, set $p = p_0$, and go back to Step 2.

Steps 2 to 5 are run for sufficient iterations until the design converges.
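The five steps can be sketched as a generic loop. This is only an outline under stated assumptions: `build_G` is a hypothetical callback that returns the matrix $G$ for the current coefficients (in the actual design it would assemble $E$, $F$, and the scaled $C$), the normal equation is solved by plain Gaussian elimination, and a fixed number of iterations stands in for a convergence test.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def iterative_least_squares(build_G, v, p0, iterations=50):
    p = list(p0)                                           # Step 1
    for _ in range(iterations):
        G = build_G(p)                                     # Step 2
        idx = range(len(p))
        Psi = [[sum(row[i] * row[j] for row in G) for j in idx] for i in idx]
        theta = [sum(row[i] * vi for row, vi in zip(G, v)) for i in idx]  # Step 3
        p_new = solve(Psi, theta)                          # Step 4
        p = [(old + new) / 2 for old, new in zip(p, p_new)]  # Step 5
    return p

# Toy usage with a G that does not depend on p (hypothetical numbers).
G_toy = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
v_toy = [1.0, 2.0, 3.0]
p_toy = iterative_least_squares(lambda p: G_toy, v_toy, [0.0, 0.0])
```

In the toy case the averaging in Step 5 simply halves the distance to the least-squares solution at every pass; when $G$ depends on $p$, the same averaging damps the update between successive linearizations.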

Numerical examples show that this algorithm can converge to a good design if the initial choice p = p0 and the parameter y are selected properly. Compared to other CMFB prototype filter designs, this method is attractive because of its relatively low computational complexity. Other methods such as those based on paraunitary property of PR filter banks [24] are too complicated and hard to apply to filter banks with large number of subbands; the case of interest in this paper. Besides, such design methods are not useful here because we are not interested in designing filter banks with PR property. Because of these reasons, we found the approach of [43] the most appropriate in this paper, and thus elaborate on it further.

In CMT, we are interested in very long prototype filters whose length exceeds a few thousand. This means that in the normal equation $\Psi p = \theta$, $\Psi$ is a very large matrix. Hence, Step 4 in the above procedure may be computationally expensive and sensitive to numerical errors. In our experiments, where we designed filters with length of up to 3072 using the Matlab routine of [43], we did not encounter any numerical inaccuracy problem. However, the design times were excessively long. Since we wished to design many prototype filters, we had to find alternative methods that could run faster. Fortunately, we found the Gauss-Seidel method to be a good alternative.

The Gauss-Seidel method is a general mathematical optimization method that is applicable to a variety of optimization problems [44, 45]. It finds the optimum parameters of interest by adopting an iterative approach. A cost function is chosen and it is optimized by successively optimizing one of the cost function parameters at a time, while the other parameters are fixed. A particular version of Gauss-Seidel reported in [46] can be used to minimize the difference $Gp - v$ in the least-squares sense without resorting to the normal equation $\Psi p = \theta$. Moreover, an accelerated step that improves the convergence rate of the Gauss-Seidel method has been proposed in [46]. Through numerical examples, we found that the accelerated Gauss-Seidel method could be used to replace Step 4 in the above procedure, with the advantage of speeding up the design time by an order of magnitude or more.
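The idea of optimizing one parameter at a time can be illustrated with a plain cyclic coordinate minimization of $\|Gp - v\|^2$. This sketch is not the accelerated variant of [46], whose details are not reproduced here; it only shows how the least-squares problem can be attacked without forming $G^TG$. The function name and the toy data are hypothetical.

```python
def gauss_seidel_least_squares(G, v, p0, sweeps=200):
    """Minimize ||G p - v||^2 by updating one coefficient at a time."""
    p = list(p0)
    # Residual r = v - G p, kept up to date after every coordinate update.
    r = [vi - sum(g * pj for g, pj in zip(row, p)) for row, vi in zip(G, v)]
    for _ in range(sweeps):
        for j in range(len(p)):
            col = [row[j] for row in G]
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue  # this coefficient does not influence the residual
            # Exact minimizer along coordinate j with the others held fixed.
            step = sum(c * ri for c, ri in zip(col, r)) / norm_sq
            p[j] += step
            r = [ri - step * c for ri, c in zip(r, col)]
    return p

# Toy usage (hypothetical numbers).
G_small = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
v_small = [1.0, 2.0, 3.0]
p_small = gauss_seidel_least_squares(G_small, v_small, [0.0, 0.0])
```

Each coordinate update touches only one column of $G$ and the residual, so the memory footprint stays linear in the size of $G$.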

Here, we refer interested readers to [46] for details of the accelerated Gauss-Seidel method. In an appendix at the end of this paper, we have given the script of a Matlab m-file that we have used for the design of the prototype filters. The prototype filter that we have used to generate the simulation results of Section 6 is based on the following parameters: $M = 512$, $m = 3$, $f = 1.2/2M$, $\gamma = 100$, and $K = 2$.

5. COMPUTATIONAL COMPLEXITY AND LATENCY

Computational complexity and latency are two issues of concern in any system implementation. In this section, we present a detailed evaluation of computational complexity and latency of CMT and compare that against z-DMT and FMT.

Table 1: Summary of computational complexity of z-DMT transceiver.

Function | Additions | Multiplications
Modulator (IFFT) | $M(3\log_2 M - 2)$ | $M(\log_2 M - 2)$
Demodulator (FFT) | $M(3\log_2 M - 2)$ | $M(\log_2 M - 2)$
FEQ | $3M$ | $3M$

Table 2: Summary of computational complexity of CMT transceiver.

Function | Additions | Multiplications
Modulator | $M(1.5\log_2 M + 2m)$ | $M(0.5\log_2 M + 2m + 1)$
Demodulator | $M(3\log_2 M + 2m - 2)$ | $M(\log_2 M + 2m)$
Equalizer | $M$ | $2M$

5.1. Computational complexity

The computational blocks involved in z-DMT and their associated operation counts are summarized in Table 1. The number of operations given for each block is based on some of the best available algorithms. In particular, we have considered using the split-radix FFT algorithm [47] for implementation of the modulator and demodulator blocks. We have counted each complex multiplication as three real multiplications and three real additions [47]. The variable M, here, indicates the number of subcarriers in z-DMT. The FEQs are single-tap complex equalizers used to equalize the demodulated data symbols. We have not accounted for possible adaptation of the equalizers. The RFI cancellation is also not accounted for, as it varies with the number of interferers. For instance, when there is no RFI, the computational load introduced by the canceller is limited to channel sounding for detection of RFI, and this can be negligible. On the other hand, when an RFI is detected, the system may momentarily have to take a relatively large computational load to set up the canceller parameters. Thus, the issue here might be that of a peak computational load. Since accounting for this can complicate our analysis, we simply ignore the complexity imposed by the RFI canceller and only comment that this can be a burden to a practical z-DMT system.

Table 2 lists the computational blocks of a CMT transceiver and the number of operations for each block. Here, the modulator and demodulator are the CMFB synthesis and analysis filter banks, respectively. The operation counts of modulation are based on the efficient implementation of synthesis CMFB with DCT in [24], and the operation counts of demodulation are based on Figure 3. Two-tap equalizers, discussed in Section 2, are used to mitigate ISI and ICI at the demodulator outputs. Here also, we have not accounted for possible adaptation of the equalizers. The coefficients at the output of the analysis CMFB of Figure 3 are not accounted for, as they can be combined with the equalizers. The parameters that appear in Table 2 are the number of subcarriers M and the overlapping factor m; the length of the prototype filter P(z) is 2mM.

Table 3: Summary of computational complexity of FMT transceiver.

| Function | Additions | Multiplications |
|---|---|---|
| Modulator | M(3 log2 M + 2m - 4) | M(log2 M + 2m - 2) |
| Demodulator | M(3 log2 M + 2m - 4) | M(log2 M + 2m - 2) |
| Equalizer | M(5Nf + 5Nb - 2) | 3M(Nf + Nb) |

Table 3 lists the computational blocks of an FMT transceiver and the number of operations for each block. The operation counts are based on the efficient realization in [23]. Similar to z-DMT and CMT, here also, the adaptation of the equalizer coefficients is not counted. M is the number of subcarrier channels. The prototype filter length is 2mM. Nf and Nb denote the number of taps in the feedforward and feedback sections of DFE, respectively.

Adding up the number of operations given in each of Tables 1, 2, and 3, and normalizing the results by the block length (2M for z-DMT and FMT, and M for CMT), the per-sample complexities of z-DMT, CMT, and FMT are obtained as

$$
\begin{aligned}
C_{\mathrm{DMT}} &= 4\log_2 M - 1,\\
C_{\mathrm{CMT}} &= 6\log_2 M + 8m + 2,\\
C_{\mathrm{FMT}} &= 4\log_2 M + 4m + 4(N_f + N_b) - 7.
\end{aligned}
\tag{30}
$$

For all comparisons in this paper, the following parameters are used. For z-DMT, we choose M = 2048. This is consistent with the VDSL draft standard [16] and the latest reports on z-DMT [15]. For FMT, we follow [23] and choose M = 128, m = 10, Nf = 26, and Nb = 9. For CMT, we experimentally found that M = 512 and m = 3 are sufficient to get very close to the best results that it can achieve. With these choices, we obtain CDMT = 43, CCMT = 80, and CFMT = 201 operations per sample. It is noted that FMT is significantly more complex than z-DMT and CMT, and the computational complexity of CMT is about 2 times that of the z-DMT. However, we should note that the complexity of z-DMT given here does not include the RFI canceller which, as noted above, can momentarily exhibit a significant computational peak load, whenever a new RFI is detected.
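The per-sample figures quoted above can be checked by summing the table entries directly. The following Python sketch does this under the stated normalization (2M samples per block for z-DMT and FMT, M for CMT); the function names are hypothetical and chosen only for this illustration.

```python
from math import log2

def zdmt_ops_per_sample(M):
    """Sum of Table 1 (IFFT, FFT, FEQ), normalised by the block length 2M."""
    adds = 2 * M * (3 * log2(M) - 2) + 3 * M
    mults = 2 * M * (log2(M) - 2) + 3 * M
    return (adds + mults) / (2 * M)

def cmt_ops_per_sample(M, m):
    """Sum of Table 2 (modulator, demodulator, equalizer), normalised by M."""
    adds = M * (1.5 * log2(M) + 2 * m) + M * (3 * log2(M) + 2 * m - 2) + M
    mults = M * (0.5 * log2(M) + 2 * m + 1) + M * (log2(M) + 2 * m) + 2 * M
    return (adds + mults) / M

def fmt_ops_per_sample(M, m, Nf, Nb):
    """Sum of Table 3 (modulator, demodulator, DFE), normalised by 2M."""
    adds = 2 * M * (3 * log2(M) + 2 * m - 4) + M * (5 * Nf + 5 * Nb - 2)
    mults = 2 * M * (log2(M) + 2 * m - 2) + 3 * M * (Nf + Nb)
    return (adds + mults) / (2 * M)

c_dmt = zdmt_ops_per_sample(2048)
c_cmt = cmt_ops_per_sample(512, 3)
c_fmt = fmt_ops_per_sample(128, 10, 26, 9)
```

With the parameter choices of this section, the three values agree with the 43, 80, and 201 operations per sample stated in the text.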

5.2. Latency

Figure 4: Simulation setup.

In the context of this paper, latency is defined as the time delay that each coded information symbol undergoes in passing through a transceiver. In z-DMT, the following operations have to be accounted for. A block of data symbols has to be collected in an input buffer before being passed to the modulator. This, which we refer to as buffering delay, introduces a delay equivalent to one block of DMT. While the next block of data symbols is being buffered, the modulator processes the previous block of data. This introduces another block of DMT delay. We refer to this as processing delay. The buffering and processing delays together amount to the equivalent of two blocks of DMT at the transmitter. Following the same discussion, we find that the receiver also introduces two blocks of DMT delay. Thus, the total latency introduced by the transmitter and receiver in z-DMT (or DMT, in general) is given by

$$\Delta_{\mathrm{DMT}} = 4T_{\mathrm{DMT}}, \tag{31}$$

where TDMT is the time duration of each z-DMT block. This includes a block of data and the associated cyclic extensions. We also note that the channel introduces some delay. Since this delay is small and common to the three schemes, we ignore it in all the latency calculations. We thus use the following approximation for the purpose of comparisons:

$$\Delta_{\mathrm{DMT}} \approx 4(2M + \mu_{\mathrm{cp}} + \mu_{\mathrm{cs}})T_s, \tag{32}$$

where μcp and μcs are the lengths of the cyclic prefix and cyclic suffix, respectively, and Ts is the sampling interval, which in the case of VDSL is 0.0453 microseconds, corresponding to the sampling frequency of 22.08 MHz.

The latency calculation of CMT is straightforward. The delay introduced by the synthesis and analysis filter banks is determined by the total group delay introduced by them. It is equal to the length of the prototype filter times the sampling interval Ts. This results in a delay of 2mMTs. We should add to this the buffering and processing delays. Since each processing of CMT is performed after collecting a block of M samples, the total buffering plus processing delay in a CMT transceiver is equal to 4MTs. The latency of CMT is thus obtained as

$$\Delta_{\mathrm{CMT}} = (2m + 4)MT_s. \tag{33}$$

The latency calculation of FMT is similar to that of CMT. Delays are introduced by the synthesis filter bank, the analysis filter bank, and the DFEs. The delay introduced by the synthesis and analysis filter banks is 2mMTs. Since each FMT block spans 2M samples (Section 5.1), the total buffering and processing delay of four blocks, 8MTs, should be added to this. The delay introduced by the feedforward section of the DFE is Nf/2 samples. Since the fractionally spaced DFEs work at the rate decimated by M, the introduced delay is MNfTs/2. The latency of FMT is thus

$$\Delta_{\mathrm{FMT}} = \left(2m + 8 + \frac{N_f}{2}\right)MT_s. \tag{34}$$

As noted in Section 5.1, we choose M = 2048 and μcp + μcs = 320 for z-DMT, M = 512 and m = 3 for CMT, and M = 128, m = 10, Nf = 26, and Nb = 9 for FMT. These result in the latency values ΔDMT = 800 microseconds, ΔCMT = 232 microseconds, and ΔFMT = 238 microseconds. We note that the latencies of CMT and FMT are significantly lower than that of z-DMT. This, clearly, is because of the use of a much smaller block size M in CMT and FMT.
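The three latency figures can be reproduced with a few lines of Python. The sketch below assumes the 22.08 MHz sampling frequency given above and, for FMT, a total of 2m + 8 + Nf/2 blocks of M samples, which is the combination that reproduces the quoted 238 microseconds; the function names are illustrative only.

```python
FS_HZ = 22.08e6          # VDSL sampling frequency stated in the text
TS_US = 1e6 / FS_HZ      # sampling interval in microseconds (about 0.0453)

def zdmt_latency_us(M, cyclic_ext):
    """Four z-DMT blocks, each of 2M samples plus cyclic prefix and suffix."""
    return 4 * (2 * M + cyclic_ext) * TS_US

def cmt_latency_us(M, m):
    """Filter-bank delay 2mM plus buffering and processing delay 4M."""
    return (2 * m + 4) * M * TS_US

def fmt_latency_us(M, m, Nf):
    """Filter-bank delay 2mM, buffering and processing 8M, DFE delay M*Nf/2."""
    return (2 * m + 8 + Nf / 2) * M * TS_US

lat_dmt = zdmt_latency_us(2048, 320)   # close to 800 microseconds
lat_cmt = cmt_latency_us(512, 3)       # close to 232 microseconds
lat_fmt = fmt_latency_us(128, 10, 26)  # close to 238 microseconds
```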

6. SIMULATION RESULTS AND DISCUSSION

The system model used for simulations is presented in Figure 4. This setup accommodates NEXT (near-end crosstalk) and FEXT (far-end crosstalk) coupling, background noise, and RFI ingress. The setup assumes that the system is in training mode, and thus transmitted symbols are available at the receiver. Hence, we can measure SNRs at various subcarrier bands, and accordingly find the corresponding bit allocations. The symbol generator output is 4-QAM in the cases of z-DMT and FMT, and antipodal binary for CMT.

To make comparisons with previous works possible, we follow the simulation parameters of [15] as closely as possible. We use a transmission bandwidth of 300 kHz to 11 MHz. The noise sources include a mix of ETSI 'A' [48], 25 NEXT, and 25 FEXT disturbers. Transmit band allocation is also performed according to [15].

6.1. System parameters

The number of subcarriers M and the length of the prototype filter 2mM are the two most important parameters in CMT. Obviously, the system performance improves as one or both of these parameters increase. However, as we may recall from the results of Section 5, both system complexity and latency increase with M and m. It is thus desirable to choose M and m to strike a balance between system performance and complexity. Moreover, for a given pair of M and m, the system performance is affected by the choice of the CMFB prototype filter. An important parameter that affects the performance of CMT is the stopband edge ωs of the prototype filter. The optimum value of ωs is hard to find. On one hand, a small ωs is desirable, as it limits the bandwidth of each subcarrier and makes the assumption of constant channel gain over each subband more accurate. On the other hand, a larger ωs improves the stopband attenuation of the prototype filter, and this in turn reduces the ICI and the noise interference from the nonadjacent subbands. However, a large value of ωs also increases the RF ingress noise and the NEXT near the frequency band edges. Unfortunately, because of the complexity of the problem and the variety of parameters that affect the system performance, a good compromise choice of M, m, and ωs could only be obtained through extensive numerical tests over a wide variety of channel setups. The details of such results will be reported in [49]. Here, we summarize our observations. The choice of M = 512 was generally found sufficient to satisfy the approximation "constant channel gain over each subband." With M = 512, the choices m = 3 (thus, a prototype filter length of 3072) and ωs = 1.2π/M result in a system whose performance is very close to the optimum, where the optimum performance is that of an ideal system with nonoverlapping subcarrier bands; see Figure 5.

Figure 5: Comparison of bit rates of z-DMT, CMT, and FMT on TP1 lines of different lengths. Horizontal axis: length of TP1 (m). Curves: upper bound, z-DMT, CMT proposed design, FMT, and CMT PR design.

In our study, we also explored the choices of m = 2 and m = 1. The results, obviously, were not as good as those of m = 3; however, for most cases, they were still superior to z-DMT and FMT. Here, because of space limitations, we only present results and compare CMT with z-DMT and FMT when, in CMT, M = 512, m = 3, and ωs = 1.2π/M. Details of other cases will be reported in [49].

For z-DMT, the number of subcarriers is set equal to 2048, following the VDSL draft standard [16]. As in [15], we have selected the length of the CP equal to 100, determined the length of the CS according to the channel group delay, and set the numbers of pulse-shaping and windowing samples equal to 140 and 70, respectively.

Following the parameters of [23], we use an FMT system with M = 128 subchannels and a prototype filter of length 2mM, with m = 10. The excess bandwidth α is set equal to 0.125. Per-subcarrier equalization is performed by employing a Tomlinson-Harashima precoder with Nb = 9 taps and a T/2-spaced linear equalizer with Nf = 26 taps.

6.2. Crosstalk dominated channels

The DSL environment is crosstalk dominated due to the bundling of wire pairs in binder cables. Here, we consider the performance of z-DMT, CMT, and FMT when both NEXT and FEXT are present. Since the three modulation schemes are frequency-division duplexed (FDD) systems, NEXT is significant only near the frequency band edges where there is a change in transmit direction. FEXT, on the other hand, affects the entire transmit band.

In our simulations, NEXT and FEXT are generated according to the coupling equations provided in [16] for a 50-pair binder cable as

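$$
\begin{aligned}
\mathrm{PSD}_{\mathrm{NEXT}} &= K_{\mathrm{NEXT}}\, S_d(f) \left(\frac{N_d}{49}\right)^{0.6} f^{3/2},\\
\mathrm{PSD}_{\mathrm{FEXT}} &= K_{\mathrm{FEXT}}\, S_d(f)\, |H(f)|^2\, d \left(\frac{N_d}{49}\right)^{0.6} f^{2},
\end{aligned}
\tag{35}
$$

where KNEXT and KFEXT are constants with values of 8.818 × 10^-14 and 7.999 × 10^-20, respectively, Sd(f) is the PSD of a disturber, Nd is the number of disturbers, H(f) is the channel frequency response, and d is the channel length in meters.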
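As an illustration of how such coupling models are evaluated, the Python sketch below assumes the commonly used form for a 50-pair binder: a disturber scaling of (Nd/49)^0.6, a frequency dependence of f^1.5 for NEXT and f^2 for FEXT, and FEXT proportional to the loop length and to the squared channel magnitude. The function names, argument names, and unit conventions are assumptions and should be checked against [16] before use.

```python
K_NEXT = 8.818e-14   # NEXT coupling constant quoted in the text
K_FEXT = 7.999e-20   # FEXT coupling constant quoted in the text

def next_psd(sd, f_hz, n_disturbers):
    """NEXT PSD for a disturber PSD sd at frequency f_hz (assumed in Hz)."""
    return K_NEXT * sd * (n_disturbers / 49) ** 0.6 * f_hz ** 1.5

def fext_psd(sd, f_hz, n_disturbers, h_mag, d):
    """FEXT PSD; h_mag is |H(f)| and d is the loop length."""
    return K_FEXT * sd * h_mag ** 2 * d * (n_disturbers / 49) ** 0.6 * f_hz ** 2

# Example: 25 disturbers at 1 MHz on a loop of length 810 with |H(f)| = 0.1.
example_next = next_psd(1.0, 1e6, 25)
example_fext = fext_psd(1.0, 1e6, 25, 0.1, 810)
```

The factor |H(f)|^2 is what makes FEXT fall off with loop attenuation, whereas NEXT depends only on the disturber PSD and frequency.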

Figure 6 presents SNR curves demonstrating the impact of NEXT in degrading the performance of z-DMT, CMT, and FMT. The results correspond to an 810 m TP1 line. The arrows ↓ and ↑ indicate downstream and upstream bands, respectively. The SNR in each subcarrier channel is measured in the time domain by looking at the power of the residual error after subtracting the transmitted symbols. As one would expect, there is a significant performance loss in z-DMT at the points where the transmission direction changes. CMT and FMT, on the other hand, do not show any visible degradation due to NEXT. It is worth noting that the SNR results of z-DMT match closely those reported in [15].

Another observation in Figure 6 that requires some comments is that although CMT has a lower SNR compared to z-DMT and FMT, it may achieve a higher transmission rate because of higher bandwidth efficiency—no cyclic extensions or excess bandwidth.
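The time-domain SNR measurement mentioned above, in which the known transmitted symbols are subtracted from the equalized outputs, can be sketched as follows. This is a generic illustration of the measurement rather than the authors' simulation code, and the function name is hypothetical.

```python
import math

def subcarrier_snr_db(transmitted, received):
    """Estimate the SNR of one subcarrier from known training symbols.

    transmitted and received are equal-length sequences of real (PAM) or
    complex (QAM) symbols; received holds the equalized demodulator outputs.
    """
    n = len(transmitted)
    signal_power = sum(abs(s) ** 2 for s in transmitted) / n
    # Residual error left after removing the known transmitted symbols.
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, transmitted)) / n
    return 10 * math.log10(signal_power / error_power)

# Antipodal symbols with an error of magnitude 0.1 on every sample: about 20 dB.
snr = subcarrier_snr_db([1, -1, 1, -1], [1.1, -0.9, 0.9, -1.1])
```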

Figure 6: SNR curves showing the impact of NEXT on z-DMT, CMT, and FMT, plotted against frequency (MHz). Arrows indicate the direction of data transmission.

Figure 5 presents plots that compare the bit rates of z-DMT, CMT, and FMT on TP1 lines of different lengths. Also shown in this figure are the results of an ideal system where a bank of ideal filters with zero transition bands and a channel with flat gain over each subband are assumed. Moreover, for CMT, we have presented the results both when a prototype filter with the PR property (designed using the code given in [43]) is used and when the design procedure of Section 4 is adopted. As seen, CMT, even with the PR design, outperforms z-DMT and FMT for all the line lengths, with a bit rate that is 5 to 10% higher. Moreover, CMT comes very close to the upper bound of the bit rate determined by the idealized system. A design based on the PR property is already within 5% of the upper bound. The filter design proposed in Section 4 reduces this gap to around 2 to 3%. Another observation in Figure 5 that requires comment is that the performance of FMT is worse than that reported for FMT in [23], especially when the line is longer than 1000 m. This is because we use a different noise model than [23]. We follow [15] and use ETSI 'A' as the background noise, while -140 dBm/Hz white Gaussian noise is used in [23].

Bit allocation for each subcarrier is done based on the following formula [4, 50]:

$$b_i = \log_2\!\left(1 + \frac{\mathrm{SNR}_i \cdot \gamma_{\mathrm{code}}}{\Gamma \cdot \gamma_{\mathrm{margin}}}\right), \tag{36}$$

where SNRi is the signal-to-noise ratio at the ith subcarrier, γcode = 3 dB is the coding gain, Γ = 9.8 dB is the SNR gap between the Shannon capacity and QAM modulation to achieve a BER of approximately 10^-7, and γmargin = 6 dB is the system margin. Since in CMT data symbols are PAM, we treat each pair of adjacent PAM symbols as one QAM symbol and apply (36).
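The gap-based bit-loading rule is easiest to evaluate in decibels, where the coding gain adds to the SNR and the gap and margin subtract from it. The Python sketch below assumes the values quoted in the text as defaults; the function name and the choice to return a fractional number of bits (with no rounding or cap on the constellation size) are assumptions of this illustration.

```python
from math import log2

def bits_per_subcarrier(snr_db, coding_gain_db=3.0, gap_db=9.8, margin_db=6.0):
    """Bits carried by one subcarrier under the gap approximation.

    All quantities are in dB; the ratio SNR * coding gain / (gap * margin)
    is formed in dB and then converted back to a linear value.
    """
    effective_db = snr_db + coding_gain_db - gap_db - margin_db
    return log2(1 + 10 ** (effective_db / 10))

# An SNR of 12.8 dB leaves a net 0 dB after gap, margin, and coding gain,
# which corresponds to one bit per (QAM) symbol.
one_bit = bits_per_subcarrier(12.8)
```

Summing this quantity over the active subcarriers and multiplying by the symbol rate gives an achievable bit rate of the kind compared in Figure 5.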

6.3. Channels with bridged taps

So far, the simulated subscriber loops were homogeneous lengths of TP1 cable. Previous reports [30], as well as our simulation studies, have shown that the group delay distortion of such lines is minimal and mostly limited to the very low and very high frequencies of the VDSL band. Nonhomogeneous subscriber lines with bridged taps, on the other hand, exhibit significant group delay distortion. Hence, a study of CMT behavior in VDSL loops with bridged taps is essential to complete our study. We present simulation results for the five test loops that are shown in Figure 7. These are chosen from the test loops provided in [16]. Figure 8 presents the group delays of two of these loops and also that of a 300 m TP1 line with no bridged tap. We note that the line without bridged taps exhibits almost no group delay distortion over most of the channel band, while the group delay distortion increases with the number of bridged taps. We also note that the fast variations of the group delay at certain frequencies coincide with the points where the magnitude gain of the channel is reduced due to signal reflection from the open-ended bridged-tap extensions. This phenomenon is clearly seen by referring to Figure 9, where the subcarrier SNRs of z-DMT, CMT, and FMT are shown for the loop 4 "short."

The following observations are also made by referring to Figure 9. Even though the group delay distortion may bring some degradation to the CMT performance, since it affects the flatness of each subchannel, this degradation is not significant. It is worth noting that the sharp variations of the group delay at frequencies of about 0.6 and 1.3 MHz in Figure 8 coincide with the sharp drops in the SNRs of all three systems in Figure 9. The fact that CMT and z-DMT behave similarly at these points, together with the fact that DMT has no sensitivity to group delay distortion, clearly indicates that the variation of group delay in VDSL channels has little effect in degrading the performance of CMT. Moreover, the bit-rate evaluations presented in Table 4 reveal that even for such extreme lines, CMT is superior to z-DMT and FMT.

6.4. Effect of RFI ingress noise

The RFI noise can badly affect the performance of VDSL systems, as it may appear at a level much higher than the VDSL signal. The RFI has to be suppressed in two stages. The first stage uses an analog RFI suppressor at the receiver input [20]. It has been reported that this technique can result in an RFI suppression of 20 to 25 dB [19]. Unfortunately, this suppression is not sufficient for an acceptable performance of a z-DMT system. It is thus proposed that further suppression of RFI be performed at the demodulator output [17, 18]. Here, we consider the RFI cancellation method proposed in [17]. In this method, the center frequency of the RFI is estimated by locating the peak of the signal within the set of tones in the HAM bands. The method then uses two listener tones, one on each side of the RFI, to estimate this ingress and interpolate the RFI through the transfer function of the receiver window (see [17] for details). In our simulations, we follow [17] and set the listener tones to be at 8-tone spacing from the center frequency of the RFI.

Figure 7: Examples of test loops with bridged taps (VDSL 3 "short", VDSL 4 "short", VDSL 5, VDSL 6, and VDSL 7).

Figure 8: Group delays of the test loops shown in Figure 7 and a TP1 line of length 300 m. Curves: 300 m TP1, VDSL test loop 5, and VDSL test loop 4 "short"; horizontal axis: frequency (MHz).

Figure 9: SNR plots of z-DMT, CMT, and FMT for the VDSL test loop 4 "short." The plots confirm that group delay distortion in this loop has no significant impact on degrading CMT performance when compared with z-DMT. Arrows indicate the direction of data transmission.

In CMT and FMT, the sharp roll-off and the high stopband attenuation of the analysis filters allow cancellation of the RFI without resorting to any additional post-demodulator RFI canceller (i.e., the second stage of the RFI canceller). However, we note that to get an acceptable performance, the first stage of RFI suppression is needed for CMT and FMT systems as well.

Figures 10(a) and 10(b) present a set of results that compare the performance of z-DMT, CMT, and FMT in the presence of RFI. In both cases, the RFI power has been set equal to -35 dBm at the demodulator input. This is assumed to be the residual from a -10 dBm RFI (stipulated in [16]) after the first stage of suppression. The RFI is chosen to be a 4 kHz narrowband signal. In Figure 10(a), the center frequency of the RFI is at 1.9 MHz. This is near the center of the first HAM band. We observe that in this case, the RFI canceller clears the RFI almost perfectly. There is only a slight degradation in SNRs near the band edges. However, the RFI canceller fails when the RFI center frequency moves to a point near one of the VDSL signal band edges. This is shown in Figure 10(b), where the center frequency of the RFI is shifted to 1.82 MHz. The reason for the failure of the RFI canceller in this case is that one of the listener tones used to measure the RFI coincides with the VDSL signal. According to [17], as well as our simulations, any attempt to shift the listener tone nearer to the center frequency of the RFI will result in a significant degradation of the tone estimates, and thus equally results in failure of the RFI canceller.

Table 4: Comparison of bit rates (Mbps) of z-DMT, CMT, and FMT over bridged loops.

| Bridged loop | z-DMT | FMT | CMT |
|---|---|---|---|
| VDSL 3 "short" | 20.39 | 20.08 | 21.99 |
| VDSL 4 "short" | 19.21 | 19.13 | 20.05 |
| VDSL 5 | 24.12 | 23.67 | 25.52 |
| VDSL 6 | 9.84 | 10.58 | 11.96 |
| VDSL 7 | 2.92 | 3.24 | 3.60 |

Figure 10: RFI performance of z-DMT, CMT, and FMT when an RFI with a bandwidth of 4 kHz at the level of -35 dBm is present at the center frequency (a) 1.9 MHz and (b) 1.82 MHz. Curves in each panel: z-DMT w/o RFC, z-DMT with RFC, CMT, and FMT; horizontal axis: frequency (MHz). Arrows indicate the direction of data transmission.

7. CONCLUSIONS

A thorough study of a new multicarrier modulation for VDSL channels was presented. This modulation, which uses cosine-modulated filter banks, was called CMT, an acronym for cosine-modulated multitone. Compared to the earlier publications on the subject [34, 35], the receiver structure of CMT was modified to reduce its computational complexity. A criterion that balances ISI plus ICI against the stopband attenuation was proposed for designing NPR prototype filters for CMT. Numerical results showed that this criterion leads to designs that are superior to those based on the PR criterion. Moreover, CMT was compared with z-DMT and FMT, the two candidate modulation schemes for VDSL [16]. Comparisons were made with respect to computational complexity, latency, achievable bit rates, and resistance to crosstalk and RFI. Except for computational complexity, where CMT was found to be more complex than z-DMT, CMT showed superior performance in all other respects. Compared to FMT, CMT was found to be superior with respect to computational complexity and achievable bit rate. CMT and FMT showed similar resistance to crosstalk and RFI, and had similar latency.

We note that the CMT scheme proposed in this paper is nothing but an amended version of DWMT, a modulation scheme that has been known for a decade [25]. However, because of its relatively high computational complexity, which was a consequence of an inappropriate selection of the receiver structure, DWMT was never accepted by the industry. We hope that this revisit of the scheme, and in particular the simplification of the receiver structure that is proposed in this paper, can initiate new thoughts on reconsideration of this powerful signal processing tool in xDSL applications.

APPENDIX

A. PROTOTYPE FILTER DESIGN

The Matlab function below can be used to design a prototype filter based on the design criterion discussed in Section 4. Note that to guarantee the stability of the design, the stopband edge frequency fs = ωs/2π should be limited to the range 1/(2M) to 3.8/(2M). Also, the parameter γ is initialized to 1 and progressively increased to a specified maximum as the design proceeds. We have experimentally found that this procedure always leads to a good design within a small number of iterations (see Algorithm 1).

Algorithm 1: Near perfect reconstruction prototype filter design.

    function h=PFDesign(M, Lh, fs, gammaf, K)
    % h: prototype filter, M: number of sub-channels, Lh: prototype filter length,
    % fs: stopband edge frequency, 1/(2M)<fs<3.8/M, gammaf: final gamma,
    % K: step size of gamma
    %
    % Initialize h, and generate C.
    epsilon=5E-6; gamma=1; L=2*Lh; n=[0:Lh-1]; k=M/2;
    f=linspace(fs,1,L)';
    C=2*cos(pi*(f*([1:Lh/2]-0.5)));
    S=C'*C/L;
    p=[1;-C(1:end,2:end)\C(1:end,1)];
    p=p/(2*sqrt(p'*p));
    h=[flipud(p);p];
    %
    % Generate vector v.
    L_u=ceil(2*Lh/M-1); delay=Lh/M-1;
    s=ceil(2*fs*M); % s is the number of adjacent sub-channels needed to calculate ICI
    u=[zeros(delay,1);1;zeros(s*L_u-delay-1,1)];
    v=[zeros(L,1);u];
    for i=1:100
      gamma=min(gamma*K, gammaf); pold=p;
      %
      % Generate the matrix G.
      h_k=2*h'.*exp(j*(pi*(k+0.5)*(n-(Lh-1)/2)/M+(-1)^k*pi/4));
      h_k=[zeros(1,Lh-1),fliplr(h_k),zeros(1,Lh-1)];
      H=zeros(L_u,Lh);
      for m=1:L_u
        H(m,:)=h_k(end-Lh-m*M+2:end-m*M+1);
      end
      Hi=zeros(0,0);
      for x=0:s-1
        temp=H.*repmat(2*cos(pi*(k+0.5+x)*(n-(Lh-1)/2)/M-(-1)^(k+x)*pi/4),L_u,1);
        Hi=[Hi;real(temp)];
      end
      Hi=Hi(:,Lh/2+1:Lh)+fliplr(Hi(:,1:Lh/2));
      G=[C/sqrt(gamma/2);Hi];
      %
      % Apply the accelerated Gauss-Seidel method.
      ec=G*p-v; m=0;
      for mm=1:2
        % Partial cyclic sweep over the coefficients of p.
        for m=m+1:m+Lh/2-1
          m=mod(m-1,Lh/2)+1;
          sigma=-(G(:,m))'*ec/((G(:,m))'*G(:,m));
          p(m)=p(m)+sigma; ec=ec+sigma*G(:,m);
        end
        pp=p; ecc=ec;
        % Full sweep, followed by the acceleration (line search) step.
        for r=1:Lh/2
          sigma=-(G(:,r))'*ec/((G(:,r))'*G(:,r));
          p(r)=p(r)+sigma; ec=ec+sigma*G(:,r);
        end
        sigma=((ecc-ec)'*ec+ec'*(ecc-ec))/(2*(ecc-ec)'*(ecc-ec));
        p=p+sigma*(p-pp); ec=ec+sigma*(ec-ecc);
      end
      p=(p+pold)/2; h=[flipud(p);p];
      disp([num2str(p'*S*p),' ', num2str(sum(abs(Hi*p-u).^2))]);
      if max(abs(p-pold))<epsilon&gamma==gammaf
        break;
      end
    end
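The acceleration step in Algorithm 1 can be read as a one-dimensional least-squares line search. The short derivation below is an interpretation of the listing rather than a statement taken from [46]; the symbols e, e_c, and p_c stand for the variables ec, ecc, and pp of the code, and all quantities are assumed real.

```latex
% Residual before extrapolation: e = G p - v, stored copy: e_c = G p_c - v.
% Extrapolating p along the last sweep direction, p + \sigma (p - p_c),
% changes the residual linearly:
\[
  e(\sigma) = G\bigl(p + \sigma (p - p_c)\bigr) - v = e + \sigma\,(e - e_c).
\]
% Minimizing the squared norm over \sigma:
\[
  \frac{d}{d\sigma}\,\bigl\| e + \sigma (e - e_c) \bigr\|^2
  = 2\,(e - e_c)^{T} e + 2\sigma\,\| e - e_c \|^2 = 0
  \quad\Longrightarrow\quad
  \sigma = \frac{(e_c - e)^{T} e}{\| e_c - e \|^2}.
\]
```

This agrees with the symmetrized expression for sigma in the listing, which reduces to the same ratio for real vectors.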

REFERENCES

[1] E. A. Lee and D. G. Messerschmitt, Digital Communication, Kluwer Academic, Boston, Mass, USA, 2nd edition, 1994.

[2] J. G. Proakis, Digital Communications, McGraw-Hill, New York, NY, USA, 3rd edition, 1995.

[3] W. Y. Chen, DSL: Simulation Techniques and Standards Development for Digital Subscriber Lines Systems, Macmillan, Indianapolis, Ind, USA, 1998.

[4] T. Starr, J. M. Cioffi, and P. J. Silverman, Understanding Digital Subscriber Line Technology, Prentice-Hall, Upper Saddle River, NJ, USA, 1999.

[5] R. Steele, Mobile Radio Communications, IEEE Press, New York, NY, USA, 1992.

[6] J. A. C. Bingham, "Multicarrier modulation for data transmission: an idea whose time has come," IEEE Communications Magazine, vol. 28, no. 5, pp. 5-14, 1990.

[7] xDSL Forum, http://www.dslforum.org.

[8] ETSI, http://www.etsi.org.

[9] ANSI T1E1.4 Working Group, http://www.t1.org.

[10] "Network and Customer Installation Interfaces—Asymmetric Digital Subscriber Line (ADSL) Metallic Interface," T1.413-1998, American National Standards Institute, New York, NY, USA, 1998.

[11] "Radio broadcasting systems; Digital Audio Broadcasting (DAB) to mobile, portable and fixed receivers," ETS 300 401, European Telecommunications Standards Institute, 2nd ed., May 1997.

[12] "Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television," ETS 300 744, European Telecommunications Standards Institute, March 1997.

[13] F. Sjoberg, M. Isaksson, R. Nilsson, P. Odling, S. K. Wilson, and P. O. Borjesson, "Zipper: a duplex method for VDSL based on DMT," IEEE Transactions on Communications, vol. 47, no. 8, pp. 1245-1252, 1999.

[14] D. G. Mestdagh, M. Isaksson, and P. Odling, "Zipper VDSL: a solution for robust duplex communication over telephone lines," IEEE Communications Magazine, vol. 38, no. 5, pp. 90-96, 2000.

[15] F. Sjoberg, R. Nilsson, M. Isaksson, P. Odling, and P. O. Borjesson, "Asynchronous zipper [subscriber line duplex method]," in Proc. IEEE International Conference on Communications (ICC '99), vol. 1, pp. 231-235, Vancouver, British Columbia, Canada, June 1999.

[16] Committee T1 Working Group T1E1.4, "VDSL Metallic Interface: Part 1 - Functional requirements and common specifications," Draft Standard, T1E1.4/2000-009R3, February 2001; and "VDSL Metallic Interface: Part 3 - Technical specification of multi-carrier modulation transceiver," Draft Standard, T1E1.4/2000-013R1, May 2000.

[17] R. Nilsson, Digital communication in wireline and wireless environments, Ph.D. thesis, Lulea University of Technology, Lulea, Sweden, March 1999, available at: http://www.sm.luth.se/csee/sp/publications.html#Theses.

[18] B.-J. Jeong and K.-H. Yoo, "Digital RFI canceller for DMT based VDSL," Electronics Letters, vol. 34, no. 17, pp. 1640-1641, 1998.

[19] J. Cioffi, M. Mallory, and J. Bingham, "Digital RF cancellation with SDMT," ANSI Contribution T1E1.4/96-083, April 1996.

[20] J. Cioffi, M. Mallory, and J. Bingham, "Analog RF cancellation with SDMT," ANSI Contribution T1E1.4/96-084, April 1996.

[21] G. Cherubini, E. Eleftheriou, and S. Olcer, "Filtered multitone modulation for VDSL," in Proc. IEEE Global Telecommunications Conference (GLOBECOM '99), vol. 2, pp. 1139-1144, Rio de Janeiro, Brazil, December 1999.

[22] G. Cherubini, E. Eleftheriou, S. Olcer, and J. M. Cioffi, "Filter bank modulation techniques for very high speed digital subscriber lines," IEEE Communications Magazine, vol. 38, no. 5, pp. 98-104, 2000.

[23] G. Cherubini, E. Eleftheriou, and S. Olcer, "Filtered multitone modulation for very high-speed digital subscriber lines," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1016-1028, 2002.

[24] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.

[25] S. D. Sandberg and M. A. Tzannes, "Overlapped discrete multitone modulation for high speed copper wire communications," IEEE Journal on Selected Areas in Communications, vol. 13, no. 9, pp. 1571-1585, 1995.

[26] M. A. Tzannes, M. C. Tzannes, and H. Resnikoff, "The DWMT: A Multicarrier Transceiver for ADSL using M-band Wavelet Transforms," ANSI Contribution T1E1.4/93-067, March 1993.

[27] M. A. Tzannes, M. C. Tzannes, J. Proakis, and P. N. Heller, "DMT systems, DWMT systems and digital filter banks," in Proc. IEEE International Conference on Communications (SUPERCOMM/ICC '94), vol. 1, pp. 311-315, New Orleans, La, USA, May 1994.

[28] M. Hawryluck, A. Yongacoglu, and M. Kavehrad, "Efficient equalization of discrete wavelet multi-tone over twisted pair," in Proc. International Zurich Seminar on Broadband Communications, pp. 185-191, Zurich, Switzerland, February 1998.

[29] N. Neurohr and M. Schilpp, "Comparison of transmultiplexers for multicarrier modulation," in Proc. 4th IEEE International Conference on Signal Processing (ICSP '98), vol. 1, pp. 35-38, Beijing, China, October 1998.

[30] A. Viholainen, J. Alhava, J. Helenius, J. Rinne, and M. Renfors, "Equalization in filter bank based multicarrier systems," in Proc. 6th IEEE International Conference on Electronics, Circuits and Systems (ICECS '99), vol. 3, pp. 1467-1470, Pafos, Cyprus, September 1999.

[31] S. Govardhanagiri, T. Karp, P. Heller, and T. Nguyen, "Performance analysis of multicarrier modulation systems using cosine modulated filter banks," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99), vol. 3, pp. 1405-1408, Phoenix, Ariz, USA, March 1999.

[32] B. Farhang-Boroujeny and W. H. Chin, "Time domain equaliser design for DWMT multicarrier transceivers," Electronics Letters, vol. 36, no. 18, pp. 1590-1592, 2000.

[33] B. Farhang-Boroujeny and L. Lin, "Analysis of post-combiner equalizers in cosine-modulated filter bank-based transmultiplexer systems," IEEE Transactions on Signal Processing, vol. 51, no. 12, pp. 3249-3262, 2003.

[34] B. Farhang-Boroujeny, "Multicarrier modulation with blind detection capability using cosine modulated filter banks," IEEE Transactions on Communications, vol. 51, no. 12, pp. 2057-2070, 2003.

[35] B. Farhang-Boroujeny, "Discrete multitone modulation with blind detection capability," in Proc. 56th IEEE Vehicular Technology Conference (VTC '02), vol. 1, pp. 376-380, Vancouver, British Columbia, Canada, September 2002.

[36] J. Alhava and M. Renfors, "Adaptive sine-modulated/cosine-modulated filter bank equalizer for transmultiplexers," in Proc. European Conference on Circuit Theory and Design (ECCTD '01), Espoo, Finland, August 2001.

[37] A. Viholainen, J. Alhava, and M. Renfors, "Implementation of parallel cosine and sine modulated filter banks for equalized transmultiplexer systems," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), vol. 6, pp. 3625-3628, Salt Lake City, Utah, USA, May 2001.

[38] J. Alhava and M. Renfors, "Exponentially-modulated filter bank-based transmultiplexer," in Proc. IEEE International Symposium on Circuits and Systems (ISCAS '03), vol. 4, pp. IV-233-IV-236, Bangkok, Thailand, May 2003.

[39] B. Borna and T. N. Davidson, "Efficient filter bank design for filtered multitone modulation," in Proc. IEEE International Conference on Communications (ICC '04), vol. 1, pp. 38-42, Paris, France, June 2004.

[40] G. W. Wornell, "Emerging applications of multirate signal processing and wavelets in digital communications," Proceedings of the IEEE, vol. 84, no. 4, pp. 586-603, 1996.

[41] A. Scaglione, G. B. Giannakis, and S. Barbarossa, "Redundant filterbank precoders and equalizers. I. Unification and optimal designs," IEEE Transactions on Signal Processing, vol. 47, no. 7, pp. 1988-2006, 1999.

[42] Y.-P. Lin and S.-M. Phoong, "ISI-free FIR filterbank transceivers for frequency-selective channels," IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2648-2658, 2001.

[43] M. Rossi, J.-Y. Zhang, and W. Steenaart, "Iterative least squares design of perfect reconstruction QMF banks," in Proc. Canadian Conference on Electrical and Computer Engineering (CCECE '96), vol. 2, pp. 762-765, Calgary, Alberta, Canada, May 1996.

[44] A. Bjorck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, Pa, USA, 1996.

[45] C. Brezinski and L. Wuytack, Projection Methods for Systems of Equations, Elsevier, Amsterdam, The Netherlands, 1997.

[46] T. M. Ng, B. Farhang-Boroujeny, and H. K. Garg, "An accelerated Gauss-Seidel method for inverse modeling," Signal Processing, vol. 83, no. 3, pp. 517-529, 2003.

[47] H. S. Malvar, Signal Processing with Lapped Transforms, Artech House, Norwood, Mass, USA, 1992.

[48] ETSI, "Transmission and Multiplexing (TM); Access transmission systems on metallic cables: Very high speed Digital Subscriber Line (VDSL); Part 1: Functional requirements," Technical Specification TS 101 270-1 V1.1.1 (1998-04), 1998.

[49] L. Lin, Multicarrier communications based on cosine modulated filter banks, Ph.D. thesis, University of Utah, Salt Lake City, Utah, USA, submitted.

[50] J. M. Cioffi, "A Multicarrier Primer," ANSI Contribution T1E1.4/91-157, November 1991.

Lekun Lin received the B.S. and M.S. degrees in electrical and telecommunications engineering from Nanjing University of Posts and Telecommunications, Nanjing, China, in 1995 and 1998, respectively. He received the Ph.D. degree from the Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, in 2005. His current research interests are multicarrier communications, digital signal processing, and MIMO systems.

Behrouz Farhang-Boroujeny received the B.S. degree in electrical engineering from Teheran University, Iran, in 1976, the M.Eng. degree from University of Wales, Institute of Science and Technology, UK, in 1977, and the Ph.D. degree from Imperial College, University of London, UK, in 1981. From 1981 to 1989, he was with the Isfahan University of Technology, Isfahan, Iran. From 1989 to 2000, he was with the National University of Singapore. Since August 2000, he has been with the University of Utah, where he is now a Professor and Associate Chair of the Department of Electrical and Computer Engineering. He is an expert in the general area of signal processing. His current scientific interests are adaptive filters, multicarrier communications, detection techniques for space-time coded systems, and signal processing applications to optical devices. In the past, he has worked in and made significant contributions to the areas of adaptive filter theory, acoustic echo cancellation, magnetic/optical recording, and digital subscriber line technologies. He is the author of the book Adaptive Filters: Theory and Applications, John Wiley & Sons, 1998. He received the UNESCO Regional Office of Science and Technology for South and Central Asia Young Scientists Award in 1987. He served as an Associate Editor of IEEE Transactions on Signal Processing from July 2002 to July 2005. He has also been involved in various IEEE activities. He is currently the Chairman of the Signal Processing/Communications Chapter of IEEE in Utah.