Timing Estimation and Resynchronization for Amplify-and-Forward Communication Systems

Xiao Li, Chengwen Xing, Yik-Chung Wu, and S. C. Chan, Member, IEEE

Abstract—This paper proposes a general framework to effectively estimate the unknown timing and channel parameters, as well as design efficient timing resynchronization algorithms for asynchronous amplify-and-forward (AF) cooperative communication systems. In order to obtain reliable timing and channel parameters, a least squares (LS) estimator is proposed for initial estimation and an iterative maximum-likelihood (ML) estimator is derived to refine the LS estimates. Furthermore, a timing and channel uncertainty analysis based on the Cramér-Rao bounds (CRB) is presented to provide insights into the system uncertainties resulting from estimation. Using the parameter estimates and uncertainty information in our analysis, timing resynchronization algorithms that are robust to estimation errors are designed jointly at the relays and the destination. The proposed framework is developed for different AF systems with varying degrees of timing misalignment and channel uncertainties and is numerically shown to provide excellent performance that approaches the synchronized case with perfect channel information.

Index Terms—Amplify-and-forward (AF), asynchronous, channel estimation, Cramér-Rao bound (CRB), relay, resynchronization.

I. Introduction

COOPERATIVE distributed MIMO systems, which suggest the sharing of antennas among several single-antenna terminals to cooperatively transmit data [1], have been advocated by many researchers because of their great potential in achieving comparable link reliability and system capacity to traditional multiple antenna systems [2]-[5].

Nevertheless, this type of system also presents many practical difficulties in system design and implementation due to its distributed nature, the most important of which is timing synchronization. It has been analytically and numerically shown in [6]-[8] that timing asynchronism brings considerable performance degradation to such systems. Furthermore, diversity gain [9] and system capacity [10] may also be lost as a result of the intersymbol interference (ISI) caused by imperfect synchronization. Therefore, appropriate mechanisms that deal with the asynchronous reception in distributed systems become especially essential.

Manuscript received July 20, 2009; accepted December 10, 2009. First published December 31, 2009; current version published March 10, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Roberto Lopez-Valcarce. This work was supported by Grant GRF HKU 7181/07E.

X. Li is with the Department of Electrical and Computer Engineering, University of California, Davis CA 95616-5294 USA (e-mail: eceli@ucdavis.edu).

C. Xing, Y.-C. Wu, and S. C. Chan are with the Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong (e-mail: cwxing@eee.hku.hk; ycwu@eee.hku.hk; scchan@eee.hku.hk).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSP.2009.2039837

As countermeasures, some delay-robust transmission and coding schemes at the transmitter side have been proposed [11]-[16] to bypass the issue. However, these schemes impose restrictions on transmission and reception protocols and hence reduce the flexibility of cooperation strategies. On the other hand, in order for many existing cooperation schemes to work efficiently, it is necessary to achieve perfect timing synchronization, including (but not limited to) cooperative relays [7], [8], distributed space-time coding [17], [18], cooperative eigen-coding scheme and distributed unitary space-time modulation (USTM) [21]. Thus, instead of employing a special scheme at the transmitter side to combat asynchronism, a general algorithm that can effectively resynchronize the timing misalignment is of tremendous value to mitigate the ISI at the destination receiver side.

As a first investigation on designing effective algorithms to resynchronize the received signal at the destination, [22] proposed a general framework to design estimation and timing resynchronization algorithms in decode-and-forward (DF) relay systems. However, for amplify-and-forward (AF) systems, most of the existing works assume perfect timing synchronization [17]-[19], [23]-[28], and the timing resynchronization problem remains largely unexplored because, unlike DF systems, the timing resynchronization algorithm design for AF systems is closely related to the processing at the relays, which varies with the transmission scheme.

This paper develops a general framework for the estimation of timing and channels, as well as the design of joint timing resynchronization algorithms at the relays and the destination for dual-hop cooperative AF relay systems. The contributions of this paper are summarized as follows. First, as a computationally efficient method, a least squares (LS) solution is derived to perform initial timing and channel estimation, and an iterative maximum-likelihood (ML) estimator is proposed to refine the LS estimates. Second, we present a Cramér-Rao bound (CRB) analysis, which provides insights into the system uncertainties resulting from estimation. Third, based on the uncertainty analysis, robust timing resynchronization algorithms are designed jointly at the relays and the destination to minimize the recovered data mean-squared error (MSE) averaged over timing and channel uncertainties. Simulation results show that the symbol error rate (SER) performance of systems using the proposed algorithms approaches that of the ideal case when the uncertainties and timing misalignment are relatively mild.

The rest of this paper is organized as follows. In Section II, the system model for the considered relay system is presented. The joint timing and channel estimation problem is investigated in Section III, followed by a thorough CRB analysis. The design of joint relay and destination timing resynchronization algorithms is discussed in Section IV. Section V provides numerical results to validate the proposed estimation and resynchronization algorithms. Finally, the paper is concluded in Section VI.

1053-587X/$26.00 © 2010 IEEE

Notation: $\Re\{\cdot\}$ and $\Im\{\cdot\}$ take the real and imaginary parts of a complex quantity. The operation $\mathrm{tr}\{\mathbf{A}\}$ takes the trace of matrix $\mathbf{A}$ and the notation $\mathbf{A}^{1/2}$ is the square root of matrix $\mathbf{A}$ by Cholesky decomposition. The operator $\mathrm{diag}(\mathbf{x})$ denotes a diagonal matrix with the elements of $\mathbf{x}$ located along the main diagonal, while $\mathrm{diag}[\mathbf{A}, \mathbf{B}, \ldots]$ represents a block diagonal matrix with $[\mathbf{A}, \mathbf{B}, \ldots]$ being the diagonal blocks. Superscripts $(\cdot)^*$, $(\cdot)^H$ and $(\cdot)^T$ denote the conjugate, the conjugate transpose and the transpose operators respectively, while $\mathbf{I}_K$ indicates a $K \times K$ identity matrix. Notation $\|\mathbf{x}\|$ ($\|\mathbf{x}\|_{\mathbf{W}}$) represents the $L_2$ norm (the weighted $L_2$ norm with $\mathbf{W}$ being the weighting matrix) of vector $\mathbf{x}$, and $E_{\theta}\{\cdot\}$ denotes the expectation with respect to variable $\theta$. Finally, $\odot$ and $\otimes$ stand for the Hadamard and Kronecker product, respectively.

II. System Model

In this paper, an AF cooperative communication system is considered, as shown in Fig. 1. The system consists of a source S, a destination D and K relays $R_k$ scattered in the middle, which are all equipped with single antennas. The propagation channels are assumed to be quasi-static and flat fading [2]. Practically speaking, the receiver at the destination D has no timing and channel information before transmission. Thus, a training period is used to obtain these parameter estimates [19] prior to the transmission of data, and the transmission contains the following two periods.

• Training period: The source S transmits a training sequence to the K relays $R_k$. At each relay, a superimposed training is added onto the original training after some relay processing, such as distributed space-time coding [17] or linear precoding [19]. At the destination D, after sampling, a joint estimation of the multiple timing offsets and channels is performed. With the timing and channel estimates, the design of joint relay and destination timing resynchronization algorithms is carried out at the destination D.

• Data transmission period: After the training period, the source S transmits a data sequence to the K relays $R_k$. After encoding the incoming message using a specific coding scheme, each relay $R_k$ preprocesses the signal using the designed relay resynchronization algorithm and forwards it to the destination D. At D, the incoming signal is further resynchronized with the proposed algorithm to mitigate the distortions brought by the asynchronism. Finally, the signal is demodulated and decoded.

A. Received Signals and Processing at Relays

In the first hop, each relay $R_k$ receives the signal from S over a point-to-point link, for which synchronization is straightforward and all the conventional synchronization techniques can be used [31]-[33]. Thus, without loss of generality, it is assumed that the signal from S is received without timing offsets after matched filtering. The received signal vector $\mathbf{r}_k$ at the $k$th relay $R_k$ can then be written as

$$\mathbf{r}_k = f_k \mathbf{s} + \mathbf{n}_k \tag{1}$$

where $\mathbf{s}$ is a length-$L$ sequence transmitted from the source S. The scalar $f_k$ is the complex channel coefficient from S to $R_k$ and is assumed to be a zero mean, circular complex Gaussian random variable with unknown variance. The term $\mathbf{n}_k$ is a vector containing circular complex Gaussian noise elements with zero mean and covariance $\mathbf{R}_{n_k}$.

Fig. 1. A typical amplify-and-forward (AF) relay system.

Upon reception, $R_k$ processes the incoming signal $\mathbf{r}_k$ according to a specific transmission scheme. This process can be generally written as

$$\mathbf{x}_k = \mathbf{W}_k\left(\boldsymbol{\Omega}_k \mathbf{r}_k + \mathbf{p}_k\right) \tag{2}$$

where the notations $\mathbf{p}_k$, $\boldsymbol{\Omega}_k$ and $\mathbf{W}_k$ are explained in the following.

• Superimposed Training $\mathbf{p}_k$: The sequence $\mathbf{p}_k$ is a separate training with covariance $\mathbf{R}_{p_k}$ imposed by $R_k$, which is only used during the training period for estimation purposes. During the data transmission period, $\mathbf{p}_k$ is set to zero and all the power is used for transmitting data $\mathbf{s}$ from S.

• General Encoding Matrix $\boldsymbol{\Omega}_k$: The operator $\boldsymbol{\Omega}_k$ is a predefined $R \times L$ matrix that generates distributed space-time codes [17], [18]. If distributed space-time coding is not employed during transmission, the encoding matrix is simply chosen as $\boldsymbol{\Omega}_k = \mathbf{I}$.

• General Precoding Matrix $\mathbf{W}_k$: The matrix $\mathbf{W}_k$ is a general $R \times R$ precoding matrix applied after encoding the message using $\boldsymbol{\Omega}_k$. Traditionally, for single antenna relay systems, $\mathbf{W}_k$ is a scaling operation with $\mathbf{W}_k = w_k \mathbf{I}_R$ and it is referred to as distributed beamforming or power allocation. The scalar $w_k$ is designed by normalizing the transmitted power at each relay [17]-[19] or optimizing certain system performance criteria, including (but not limited to) recovered data MSE and received signal-to-noise ratio (SNR) [23]-[26]. In order to make our discussions more general, $\mathbf{W}_k$ is treated as a general (possibly nondiagonal) matrix in Section IV.

Note that, different from traditional multiple antenna systems which perform precoding and encoding across different antennas, the received vector $\mathbf{r}_k$ here contains a block of symbols received in time, and all the encoding $\boldsymbol{\Omega}_k$ and precoding $\mathbf{W}_k$ are applied across time at different relays.

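Then, substituting (1) into (2), we can expand the signal model as

$$\mathbf{x}_k = f_k \mathbf{W}_k \boldsymbol{\Omega}_k \mathbf{s} + \mathbf{W}_k \mathbf{p}_k + \mathbf{W}_k \mathbf{e}_k \tag{3}$$

where $\mathbf{e}_k = \boldsymbol{\Omega}_k \mathbf{n}_k$ is the equivalent noise vector with its covariance matrix calculated as

$$\mathbf{R}_{e_k} = \boldsymbol{\Omega}_k \mathbf{R}_{n_k} \boldsymbol{\Omega}_k^H. \tag{4}$$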
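The relay processing in (1)-(3) can be illustrated numerically. The following is a minimal sketch, assuming white relay noise, QPSK source symbols, an identity encoding matrix and a scaled-identity precoder; the function name `relay_output` and all numerical values are hypothetical choices made only for illustration.

```python
import numpy as np

def relay_output(s, f_k, Omega_k, W_k, p_k, noise_var, rng):
    """Return (x_k, r_k) with r_k = f_k s + n_k and x_k = W_k (Omega_k r_k + p_k)."""
    L = s.shape[0]
    # Circular complex Gaussian noise, covariance noise_var * I_L (white noise is an assumption).
    n_k = np.sqrt(noise_var / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
    r_k = f_k * s + n_k                      # received vector at the relay, as in (1)
    x_k = W_k @ (Omega_k @ r_k + p_k)        # encoding, superimposed training, precoding, as in (2)
    return x_k, r_k

rng = np.random.default_rng(0)
L = R = 8                                                          # illustrative sizes
s = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, L)))   # QPSK symbols
f_k = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
Omega_k = np.eye(R, L)                                             # no space-time coding
W_k = 0.5 * np.eye(R)                                              # scaled-identity precoder
p_k = np.exp(1j * rng.uniform(-np.pi, np.pi, R))                   # unit-modulus training

# With zero noise, the output must match the expanded model (3) with e_k = 0.
x_k, r_k = relay_output(s, f_k, Omega_k, W_k, p_k, 0.0, rng)
expanded = f_k * (W_k @ Omega_k @ s) + W_k @ p_k
```

Setting the noise variance to zero makes the equivalence between (2) and (3) directly checkable; with nonzero noise the difference between the two expressions is the term $\mathbf{W}_k \mathbf{e}_k$.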

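B. Received Signal at Destination

Due to hardware imperfections and diverse relay locations, the signals arriving at D from the relays $R_k$ are not synchronized with each other. Hence, before matched filtering, the received signal (within $0 < t < L_oT$) at the destination D can be expressed as

$$y(t) = \sum_{k=1}^{K} h_k \sum_{i=-L_g}^{L_o+L_g-1} x_k(i)\, g(t - iT - \epsilon_k T) + v(t) \tag{5}$$

where $x_k(i)$ is the $i$th element in vector $\mathbf{x}_k$ in (3) and $g(t)$ stands for the pulse shaping filter. Also, $h_k$ is the complex channel coefficient between $R_k$ and D and has the same statistics as $f_k$. The symbol $\epsilon_k \in [0, 1)$ is the normalized timing offset while $T$ is the symbol duration. The last term $v(t)$ is the zero mean, circular complex Gaussian noise at D. Here, the output code length $R$ is taken to be $R = L_o + 2L_g$, with $L_o$ representing the observation interval and $L_g$ being the effective duration of the tail of $g(t)$ on one side.

Upon reception, the waveform $y(t)$ is sampled at D with an oversampling ratio $Q \geq 2$, so the sample interval is $T_s = T/Q$. After putting the received samples into a vector $\mathbf{y} = [y(0), y(T_s), \ldots, y((L_oQ-1)T_s)]^T$, we have

$$\mathbf{y} = \sum_{k=1}^{K} f_k h_k \mathbf{A}_{\epsilon_k} \mathbf{W}_k \boldsymbol{\Omega}_k \mathbf{s} + \sum_{k=1}^{K} h_k \mathbf{A}_{\epsilon_k} \mathbf{W}_k \mathbf{p}_k + \sum_{k=1}^{K} h_k \mathbf{A}_{\epsilon_k} \mathbf{W}_k \mathbf{e}_k + \mathbf{v} \tag{6}$$

where $\mathbf{A}_{\epsilon_k} = [\mathbf{a}_{-L_g}(\epsilon_k), \ldots, \mathbf{a}_{L_o+L_g-1}(\epsilon_k)]$ with $\mathbf{a}_i(\epsilon_k) = [g(-iT - \epsilon_k T), g(T_s - iT - \epsilon_k T), \ldots, g((L_oQ-1)T_s - iT - \epsilon_k T)]^T$, and $\mathbf{v}$ collects the noise samples. Stacking the per-relay quantities as $\mathbf{A}_{\epsilon} = [\mathbf{A}_{\epsilon_1}, \ldots, \mathbf{A}_{\epsilon_K}]$, $\mathbf{H} = \mathrm{diag}(\mathbf{h}) \otimes \mathbf{I}_R$, $\mathbf{F} = \mathrm{diag}(\mathbf{f}) \otimes \mathbf{I}_R$, $\mathbf{W} = \mathrm{diag}[\mathbf{W}_1, \ldots, \mathbf{W}_K]$ and $\boldsymbol{\Omega} = [\boldsymbol{\Omega}_1^T, \ldots, \boldsymbol{\Omega}_K^T]^T$, the general model in (6) can be rewritten compactly as

$$\mathbf{y} = \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{W}\mathbf{F}\boldsymbol{\Omega}\mathbf{s} + \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{W}\mathbf{p} + \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{W}\mathbf{e} + \mathbf{v} \tag{7}$$

where $\mathbf{p} = [\mathbf{p}_1^H, \ldots, \mathbf{p}_K^H]^H$ and $\mathbf{e} = [\mathbf{e}_1^H, \ldots, \mathbf{e}_K^H]^H$.

In the next section, estimation algorithms are presented to estimate the channel parameters ($\mathbf{f} = [f_1, \ldots, f_K]^T$, $\mathbf{h} = [h_1, \ldots, h_K]^T$) and the timing parameters ($\boldsymbol{\epsilon} = [\epsilon_1, \ldots, \epsilon_K]^T$). Then in Section IV, the resynchronization algorithms at the relays and the destination are discussed.

III. Joint Timing and Channel Estimation in the Training Period

During the training period, there is no timing and channel information. Hence, without loss of generality, the precoder is chosen as $\mathbf{W} = \mathbf{I}_{K(L_o+2L_g)}$, since in general $\mathbf{W}$ depends on the timing and channel information obtained from estimation in this period. Thus, the system model (7) in the training period is simplified as

$$\mathbf{y}_t = \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{F}\boldsymbol{\Omega}\mathbf{s}_t + \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{p} + \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{e} + \mathbf{v} \tag{8}$$

where $\mathbf{y}_t$ is the received signal at D during the training period and $\mathbf{s}_t$ is the training sequence.

In order to derive the joint timing and channel estimator, we start with an equivalent model to (8) using the diagonal structure of $\mathbf{H}$ and $\mathbf{F}$:

$$\mathbf{y}_t = \sum_{k=1}^{K} \xi_k \mathbf{A}_{\epsilon_k}\boldsymbol{\Omega}_k \mathbf{s}_t + \sum_{k=1}^{K} h_k \mathbf{A}_{\epsilon_k}\mathbf{p}_k + \mathbf{z} \tag{9}$$

where $\xi_k = f_k h_k$ and $\mathbf{z} = \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{e} + \mathbf{v}$ is the equivalent compound noise at D with covariance

$$\mathbf{R}_z = E\{\mathbf{z}\mathbf{z}^H\} = \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{R}_e\mathbf{H}^H\mathbf{A}_{\epsilon}^H + \mathbf{R}_v. \tag{10}$$

In (10), $\mathbf{R}_v$ is the covariance of $\mathbf{v}$ and $\mathbf{R}_e = \mathrm{diag}[\mathbf{R}_{e_1}, \ldots, \mathbf{R}_{e_K}]$ with $\mathbf{R}_{e_k}$ defined in (4). The estimation of the separate channels $\mathbf{f}$ and $\mathbf{h}$ in each hop thus becomes the estimation of the composite channel $\boldsymbol{\xi} = \mathbf{f} \odot \mathbf{h}$ and the second hop channel $\mathbf{h}$.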
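To make the sampled model concrete, the sketch below builds the timing matrix $\mathbf{A}_{\epsilon_k}$ column by column from a pulse $g(t)$ and forms the noise-free received vector. It is only an illustration: the truncated sinc pulse is a stand-in assumption (the simulations in Section V use a root-raised cosine pulse), and the function names are hypothetical.

```python
import numpy as np

def pulse(t_over_T, Lg):
    """Stand-in pulse g(t): a sinc truncated to |t| <= Lg*T (an assumption for illustration)."""
    t = np.asarray(t_over_T, dtype=float)
    return np.where(np.abs(t) <= Lg, np.sinc(t), 0.0)

def timing_matrix(eps_k, Lo, Lg, Q):
    """A_{eps_k}: entry (n, i) is g(n*Ts - i*T - eps_k*T), n = 0..Lo*Q-1, i = -Lg..Lo+Lg-1."""
    n = np.arange(Lo * Q)[:, None] / Q        # sample instants in units of T
    i = np.arange(-Lg, Lo + Lg)[None, :]      # symbol indices of x_k
    return pulse(n - i - eps_k, Lg)

def received_samples(x_list, h, eps, Lo, Lg, Q):
    """Noise-free y = sum_k h_k A_{eps_k} x_k, i.e. the sampled version of (5)."""
    return sum(h_k * (timing_matrix(e_k, Lo, Lg, Q) @ x_k)
               for x_k, h_k, e_k in zip(x_list, h, eps))

# Small example: two relays with different normalized offsets.
Lo, Lg, Q = 4, 2, 2
A = timing_matrix(0.0, Lo, Lg, Q)
y = received_samples([np.ones(Lo + 2 * Lg)] * 2, [1.0, 0.5j], [0.1, 0.4], Lo, Lg, Q)
```

With zero offset, the symbol-spaced rows of `A` pick out a single symbol each, which is the zero-ISI situation; a nonzero offset spreads each symbol over neighbouring samples, which is the ISI that the resynchronization in Section IV is designed to remove.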

Remark: For estimation purposes, $\mathbf{s}_t$, $\boldsymbol{\Omega}_k$ and $\mathbf{p}_k$ should be chosen to minimize the estimation errors (e.g., estimation MSE [19], Cramér-Rao bound [35]). The design of $\boldsymbol{\Omega}_k$ and training $\mathbf{s}_t$ has been discussed in [17], [19]. Since the focus of this paper is to design the timing resynchronization algorithms, here without loss of generality, $\mathbf{p}_k$ is chosen as a white sequence whereas $\mathbf{s}_t$ and $\boldsymbol{\Omega}_k$ are chosen according to [19] and [17] respectively.

A. Least Squares Estimation

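As a straightforward approach, we can estimate the parameters $\boldsymbol{\epsilon}$, $\boldsymbol{\xi}$ and $\mathbf{h}$ by minimizing the norm of the error $\mathbf{z}$ in (9). Define $\mathbf{M}_{\epsilon} \triangleq [\mathbf{A}_{\epsilon_1}\boldsymbol{\Omega}_1\mathbf{s}_t, \ldots, \mathbf{A}_{\epsilon_K}\boldsymbol{\Omega}_K\mathbf{s}_t]$, $\mathbf{N}_{\epsilon} \triangleq [\mathbf{A}_{\epsilon_1}\mathbf{p}_1, \ldots, \mathbf{A}_{\epsilon_K}\mathbf{p}_K]$ and $\mathbf{Q}_{\epsilon} \triangleq [\mathbf{M}_{\epsilon}, \mathbf{N}_{\epsilon}]$, so that the signal part of (9) is $\mathbf{M}_{\epsilon}\boldsymbol{\xi} + \mathbf{N}_{\epsilon}\mathbf{h} = \mathbf{Q}_{\epsilon}[\boldsymbol{\xi}^T, \mathbf{h}^T]^T$. The LS problem is then

$$(\hat{\boldsymbol{\epsilon}}_{LS}, \hat{\boldsymbol{\xi}}_{LS}, \hat{\mathbf{h}}_{LS}) = \arg\min_{\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h}} \left\|\mathbf{y}_t - \mathbf{M}_{\epsilon}\boldsymbol{\xi} - \mathbf{N}_{\epsilon}\mathbf{h}\right\|^2. \tag{11}$$

When $\boldsymbol{\epsilon}$ is fixed, it can be readily shown that the least squares (LS) estimates of $\boldsymbol{\xi}$ and $\mathbf{h}$ are

$$[\hat{\boldsymbol{\xi}}^T, \hat{\mathbf{h}}^T]^T = (\mathbf{Q}_{\epsilon}^H\mathbf{Q}_{\epsilon})^{-1}\mathbf{Q}_{\epsilon}^H\mathbf{y}_t. \tag{12}$$

After substituting (12) into (11) and ignoring some scaling constants and irrelevant terms, a cost function that depends only on $\boldsymbol{\epsilon}$ is obtained as

$$\Lambda_{LS}(\boldsymbol{\epsilon}) = -\mathbf{y}_t^H\mathbf{Q}_{\epsilon}(\mathbf{Q}_{\epsilon}^H\mathbf{Q}_{\epsilon})^{-1}\mathbf{Q}_{\epsilon}^H\mathbf{y}_t. \tag{13}$$

Then the timing offsets $\boldsymbol{\epsilon}$ are estimated as

$$\hat{\boldsymbol{\epsilon}}_{LS} = \arg\min_{\boldsymbol{\epsilon}} \Lambda_{LS}(\boldsymbol{\epsilon}). \tag{14}$$

The above minimization (14) is a multidimensional problem, which imposes high computational complexity at D. To cope with this issue, alternating projection [34] can be used to reduce the $K$-dimensional minimization to a series of one-dimensional searches. Finally, the channel estimates are obtained by putting $\hat{\boldsymbol{\epsilon}}_{LS}$ back into (12).

B. Iterative Maximum-Likelihood Estimation

The LS solution derived above is simple and straightforward, but it does not consider the effect of the channel-dependent noise $\mathbf{z}$. For higher accuracy, we here derive an ML estimator. From (9), the likelihood function of the timing $\boldsymbol{\epsilon}$ and the channels $\boldsymbol{\xi}$ and $\mathbf{h}$ is obtained as

$$p(\mathbf{y}_t \mid \boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h}) = \frac{1}{\pi^{L_oQ}\det\mathbf{R}_z}\exp\left\{-\left\|\mathbf{y}_t - \mathbf{M}_{\epsilon}\boldsymbol{\xi} - \mathbf{N}_{\epsilon}\mathbf{h}\right\|^2_{\mathbf{R}_z^{-1}}\right\} \tag{15}$$

where $\mathbf{R}_z$ was defined in (10) and is related to $\boldsymbol{\epsilon}$ and $\mathbf{h}$. Therefore, the exact ML estimator is highly complex. In the sequel, we propose an iterative ML solution to refine the estimates from the LS solution.

Maximizing (15) is equivalent to maximizing the log-likelihood function

$$\Lambda_{ML}(\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h}) = C - \log\det\mathbf{R}_z(\boldsymbol{\epsilon}, \mathbf{h}) - \left\|\mathbf{y}_t - \mathbf{M}_{\epsilon}\boldsymbol{\xi} - \mathbf{N}_{\epsilon}\mathbf{h}\right\|^2_{\mathbf{R}_z^{-1}(\boldsymbol{\epsilon}, \mathbf{h})} \tag{16}$$

where $C$ is an irrelevant constant and the dependency of $\mathbf{R}_z$ on timing $\boldsymbol{\epsilon}$ and channel $\mathbf{h}$ is explicitly stated as $\mathbf{R}_z(\boldsymbol{\epsilon}, \mathbf{h})$. Then the iterative algorithm is initialized as $(\hat{\boldsymbol{\epsilon}}^{(0)}_{ML}, \hat{\boldsymbol{\xi}}^{(0)}_{ML}, \hat{\mathbf{h}}^{(0)}_{ML}) = (\hat{\boldsymbol{\epsilon}}_{LS}, \hat{\boldsymbol{\xi}}_{LS}, \hat{\mathbf{h}}_{LS})$, and the parameter estimates $(\hat{\boldsymbol{\epsilon}}^{(i+1)}_{ML}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \hat{\mathbf{h}}^{(i+1)}_{ML})$ in the $(i+1)$th iteration are updated in turn, starting from the composite channel $\boldsymbol{\xi}$, as follows.

• Updating $\hat{\boldsymbol{\xi}}^{(i+1)}_{ML}$: The composite channel is updated as $\arg\max_{\boldsymbol{\xi}} \Lambda_{ML}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \boldsymbol{\xi}, \hat{\mathbf{h}}^{(i)}_{ML})$. Since $\mathbf{R}_z(\boldsymbol{\epsilon}, \mathbf{h})$ is not a function of $\boldsymbol{\xi}$, the above problem is actually a weighted LS problem [35]

$$\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} = \arg\min_{\boldsymbol{\xi}} \left\|\mathbf{y}_t - \mathbf{M}_{\hat{\epsilon}^{(i)}}\boldsymbol{\xi} - \mathbf{N}_{\hat{\epsilon}^{(i)}}\hat{\mathbf{h}}^{(i)}_{ML}\right\|^2_{\mathbf{R}_z^{-1}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \hat{\mathbf{h}}^{(i)}_{ML})}$$

and the solution can be computed analytically as

$$\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} = \left(\mathbf{M}_{\hat{\epsilon}^{(i)}}^H\mathbf{R}_z^{-1}\mathbf{M}_{\hat{\epsilon}^{(i)}}\right)^{-1}\mathbf{M}_{\hat{\epsilon}^{(i)}}^H\mathbf{R}_z^{-1}\left(\mathbf{y}_t - \mathbf{N}_{\hat{\epsilon}^{(i)}}\hat{\mathbf{h}}^{(i)}_{ML}\right) \tag{17}$$

with $\mathbf{R}_z = \mathbf{R}_z(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \hat{\mathbf{h}}^{(i)}_{ML})$.

• Updating $\hat{\mathbf{h}}^{(i+1)}_{ML}$: The second hop channel is updated as $\arg\max_{\mathbf{h}} \Lambda_{ML}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \mathbf{h})$ and evaluated as

$$\hat{\mathbf{h}}^{(i+1)}_{ML} = \arg\min_{\mathbf{h}} \left\{\log\det\mathbf{R}_z(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \mathbf{h}) + \left\|\mathbf{y}_t - \mathbf{M}_{\hat{\epsilon}^{(i)}}\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} - \mathbf{N}_{\hat{\epsilon}^{(i)}}\mathbf{h}\right\|^2_{\mathbf{R}_z^{-1}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \mathbf{h})}\right\}. \tag{18}$$

The above expression is highly nonlinear but can be solved via gradient descent and related algorithms (e.g., conjugate gradient and quasi-Newton methods [20]), which require the partial derivative $\partial\Lambda_{ML}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \mathbf{h})/\partial\mathbf{h}^*$. The partial derivatives are calculated elementwise as shown in (41) in Appendix A.

• Updating $\hat{\boldsymbol{\epsilon}}^{(i+1)}_{ML}$: Finally, the timing offsets $\boldsymbol{\epsilon}$ are updated as $\arg\max_{\boldsymbol{\epsilon}} \Lambda_{ML}(\boldsymbol{\epsilon}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \hat{\mathbf{h}}^{(i+1)}_{ML})$, which is equivalent to

$$\hat{\boldsymbol{\epsilon}}^{(i+1)}_{ML} = \arg\min_{\boldsymbol{\epsilon}} \left\{\log\det\mathbf{R}_z(\boldsymbol{\epsilon}, \hat{\mathbf{h}}^{(i+1)}_{ML}) + \left\|\mathbf{y}_t - \mathbf{M}_{\epsilon}\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} - \mathbf{N}_{\epsilon}\hat{\mathbf{h}}^{(i+1)}_{ML}\right\|^2_{\mathbf{R}_z^{-1}(\boldsymbol{\epsilon}, \hat{\mathbf{h}}^{(i+1)}_{ML})}\right\} \tag{19}$$

where an exhaustive search is performed using alternating projection.

With the results above, the joint timing and channel estimation algorithm is now summarized in Algorithm 1.

Algorithm 1: Joint Timing and Channel Estimation Algorithm

(1) Perform coarse estimation using the Least Squares solution

• Obtain $\hat{\boldsymbol{\epsilon}}_{LS} = \arg\min \Lambda_{LS}$ in (13);

• Obtain $(\hat{\boldsymbol{\xi}}_{LS}, \hat{\mathbf{h}}_{LS})$ by putting $\hat{\boldsymbol{\epsilon}}_{LS}$ into (12).

(2) If higher accuracy is desired, perform fine estimation using the iterative Maximum Likelihood solution

• Initialize $(\hat{\boldsymbol{\epsilon}}^{(0)}_{ML}, \hat{\boldsymbol{\xi}}^{(0)}_{ML}, \hat{\mathbf{h}}^{(0)}_{ML}) = (\hat{\boldsymbol{\epsilon}}_{LS}, \hat{\boldsymbol{\xi}}_{LS}, \hat{\mathbf{h}}_{LS})$;

• Repeat for $i = 0, 1, \ldots$

$\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} = \arg\max_{\boldsymbol{\xi}} \Lambda_{ML}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \boldsymbol{\xi}, \hat{\mathbf{h}}^{(i)}_{ML})$ in (17);

$\hat{\mathbf{h}}^{(i+1)}_{ML} = \arg\max_{\mathbf{h}} \Lambda_{ML}(\hat{\boldsymbol{\epsilon}}^{(i)}_{ML}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \mathbf{h})$ in (18);

$\hat{\boldsymbol{\epsilon}}^{(i+1)}_{ML} = \arg\max_{\boldsymbol{\epsilon}} \Lambda_{ML}(\boldsymbol{\epsilon}, \hat{\boldsymbol{\xi}}^{(i+1)}_{ML}, \hat{\mathbf{h}}^{(i+1)}_{ML})$ in (19);

until $\|\hat{\boldsymbol{\epsilon}}^{(i+1)}_{ML} - \hat{\boldsymbol{\epsilon}}^{(i)}_{ML}\|$, $\|\hat{\boldsymbol{\xi}}^{(i+1)}_{ML} - \hat{\boldsymbol{\xi}}^{(i)}_{ML}\|$ and $\|\hat{\mathbf{h}}^{(i+1)}_{ML} - \hat{\mathbf{h}}^{(i)}_{ML}\|$ are smaller than the corresponding thresholds.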
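The coarse LS step of Algorithm 1 can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a truncated sinc stands in for the pulse $g(t)$, the timing offsets are searched on a fixed grid, the vectors `c[k]` play the role of $\boldsymbol{\Omega}_k\mathbf{s}_t$, and all function names are hypothetical. The cost is the projection form of the LS criterion with the channels eliminated, and the coordinate-wise loop is a simple form of alternating projection.

```python
import numpy as np

def timing_matrix(eps_k, Lo, Lg, Qs):
    # Stand-in pulse: sinc truncated to |t| <= Lg*T (assumption for illustration).
    n = np.arange(Lo * Qs)[:, None] / Qs
    i = np.arange(-Lg, Lo + Lg)[None, :]
    t = n - i - eps_k
    return np.where(np.abs(t) <= Lg, np.sinc(t), 0.0)

def build_Q(eps, c, p, Lo, Lg, Qs):
    """Columns A_k c_k (composite channel part) followed by A_k p_k (second-hop part)."""
    A = [timing_matrix(e, Lo, Lg, Qs) for e in eps]
    cols = [A_k @ c_k for A_k, c_k in zip(A, c)] + [A_k @ p_k for A_k, p_k in zip(A, p)]
    return np.stack(cols, axis=1)

def ls_cost(eps, y, c, p, Lo, Lg, Qs):
    """Return (-y^H P y, theta) where P projects onto the columns of Q(eps)."""
    Qm = build_Q(eps, c, p, Lo, Lg, Qs)
    theta, *_ = np.linalg.lstsq(Qm, y, rcond=None)   # LS channel estimates for fixed eps
    return -np.real(np.vdot(y, Qm @ theta)), theta

def ls_estimate(y, c, p, Lo, Lg, Qs, grid, sweeps=3):
    """Alternating one-dimensional grid searches over each relay's timing offset."""
    K = len(c)
    eps = np.full(K, grid[len(grid) // 2])
    for _ in range(sweeps):
        for k in range(K):
            costs = []
            for g in grid:
                trial = eps.copy()
                trial[k] = g
                costs.append(ls_cost(trial, y, c, p, Lo, Lg, Qs)[0])
            eps[k] = grid[int(np.argmin(costs))]
    theta = ls_cost(eps, y, c, p, Lo, Lg, Qs)[1]
    return eps, theta[:K], theta[K:]

# Noise-free toy example with two relays.
rng = np.random.default_rng(1)
Lo, Lg, Qs, K = 12, 2, 2, 2
R = Lo + 2 * Lg
c = [np.exp(1j * rng.uniform(-np.pi, np.pi, R)) for _ in range(K)]
p = [np.exp(1j * rng.uniform(-np.pi, np.pi, R)) for _ in range(K)]
eps_true = np.array([0.2, 0.6])
theta_true = rng.standard_normal(2 * K) + 1j * rng.standard_normal(2 * K)
y = build_Q(eps_true, c, p, Lo, Lg, Qs) @ theta_true
grid = np.linspace(0.0, 0.9, 10)
eps_hat, xi_hat, h_hat = ls_estimate(y, c, p, Lo, Lg, Qs, grid, sweeps=2)
```

Each one-dimensional search can only lower the cost, so the procedure converges, although (as with any alternating scheme) only to a point that need not be the global minimizer on the grid. The ML refinement would start from `eps_hat`, `xi_hat` and `h_hat`.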

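C. Timing and Channel Estimation Uncertainty Analysis

Notice that all the parameters obtained during the training period suffer from random estimation errors. That is, $\boldsymbol{\epsilon} = \hat{\boldsymbol{\epsilon}} + \boldsymbol{\delta}_{\epsilon}$, $\boldsymbol{\xi} = \hat{\boldsymbol{\xi}} + \boldsymbol{\delta}_{\xi}$ and $\mathbf{h} = \hat{\mathbf{h}} + \boldsymbol{\delta}_{h}$, where $\boldsymbol{\delta}_{\epsilon} = [\delta_{\epsilon_1}, \ldots, \delta_{\epsilon_K}]^T$, $\boldsymbol{\delta}_{\xi} = [\delta_{\xi_1}, \ldots, \delta_{\xi_K}]^T$ and $\boldsymbol{\delta}_{h} = [\delta_{h_1}, \ldots, \delta_{h_K}]^T$ are the estimation errors of the corresponding parameters. If the statistical information of the timing and channel uncertainties is known to the receiver, then it is possible to design timing resynchronization algorithms that are robust to the estimation errors. Here we propose to use the CRB as a measure to provide insights into the uncertainties of the parameters, because the ML estimator asymptotically approaches the error performance predicted by the CRB [35].

Since the parameters of interest $(\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h})$ contain both real and complex-valued elements, we define $\boldsymbol{\theta} \triangleq [\boldsymbol{\epsilon}^T, \Re\{\boldsymbol{\xi}\}^T, \Im\{\boldsymbol{\xi}\}^T, \Re\{\mathbf{h}\}^T, \Im\{\mathbf{h}\}^T]^T$ and, based on the probability density function in (15), the $(i, j)$th entry of the Fisher information matrix (FIM) $\mathbf{J}$ is calculated as [35]

$$[\mathbf{J}]_{i,j} = 2\Re\left\{\frac{\partial\boldsymbol{\mu}^H}{\partial\theta_i}\mathbf{R}_z^{-1}\frac{\partial\boldsymbol{\mu}}{\partial\theta_j}\right\} + \mathrm{tr}\left\{\mathbf{R}_z^{-1}\frac{\partial\mathbf{R}_z}{\partial\theta_i}\mathbf{R}_z^{-1}\frac{\partial\mathbf{R}_z}{\partial\theta_j}\right\} \tag{20}$$

where $\boldsymbol{\mu} = \sum_{k}\xi_k\mathbf{A}_{\epsilon_k}\boldsymbol{\Omega}_k\mathbf{s}_t + \sum_{k}h_k\mathbf{A}_{\epsilon_k}\mathbf{p}_k$ is the mean of $\mathbf{y}_t$ in (9) and $\theta_i$ is the $i$th element in vector $\boldsymbol{\theta}$.

For the calculation of the FIM $\mathbf{J}$, the corresponding components are computed as

$$\frac{\partial\boldsymbol{\mu}}{\partial\epsilon_i} = \xi_i\mathbf{D}_{\epsilon_i}\boldsymbol{\Omega}_i\mathbf{s}_t + h_i\mathbf{D}_{\epsilon_i}\mathbf{p}_i, \quad \frac{\partial\boldsymbol{\mu}}{\partial\Re\{\xi_i\}} = \mathbf{A}_{\epsilon_i}\boldsymbol{\Omega}_i\mathbf{s}_t, \quad \frac{\partial\boldsymbol{\mu}}{\partial\Im\{\xi_i\}} = j\mathbf{A}_{\epsilon_i}\boldsymbol{\Omega}_i\mathbf{s}_t, \quad \frac{\partial\boldsymbol{\mu}}{\partial\Re\{h_i\}} = \mathbf{A}_{\epsilon_i}\mathbf{p}_i, \quad \frac{\partial\boldsymbol{\mu}}{\partial\Im\{h_i\}} = j\mathbf{A}_{\epsilon_i}\mathbf{p}_i,$$

$$\frac{\partial\mathbf{R}_z}{\partial\epsilon_i} = |h_i|^2\left(\mathbf{D}_{\epsilon_i}\mathbf{R}_{e_i}\mathbf{A}_{\epsilon_i}^H + \mathbf{A}_{\epsilon_i}\mathbf{R}_{e_i}\mathbf{D}_{\epsilon_i}^H\right), \quad \frac{\partial\mathbf{R}_z}{\partial\Re\{h_i\}} = 2\Re\{h_i\}\mathbf{A}_{\epsilon_i}\mathbf{R}_{e_i}\mathbf{A}_{\epsilon_i}^H, \quad \frac{\partial\mathbf{R}_z}{\partial\Im\{h_i\}} = 2\Im\{h_i\}\mathbf{A}_{\epsilon_i}\mathbf{R}_{e_i}\mathbf{A}_{\epsilon_i}^H \tag{21}$$

where $j$ is the imaginary unit $j = \sqrt{-1}$ and $\mathbf{D}_{\epsilon_i} = \partial\mathbf{A}_{\epsilon_i}/\partial\epsilon_i$; the derivatives of $\mathbf{R}_z$ with respect to $\Re\{\xi_i\}$ and $\Im\{\xi_i\}$ are zero. Substituting the above results back to (20), we can obtain the CRB by inverting the FIM $\mathbf{J}$ (i.e., $\mathrm{CRB}(\boldsymbol{\theta}) = \mathbf{J}^{-1}$). Note that $\boldsymbol{\theta}$ is real-valued. Defining the transformation matrix

$$\boldsymbol{\Pi} \triangleq \begin{bmatrix}\mathbf{I}_K & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{I}_K & j\mathbf{I}_K & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{I}_K & j\mathbf{I}_K\end{bmatrix},$$

then the CRB matrix of the original set of complex-valued parameters $(\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h})$ can be evaluated as [35]

$$\mathrm{CRB}(\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h}) = \boldsymbol{\Pi}\mathbf{J}^{-1}\boldsymbol{\Pi}^H \tag{22}$$

and can be written in a block matrix form as

$$\mathrm{CRB}(\boldsymbol{\epsilon}, \boldsymbol{\xi}, \mathbf{h}) = \begin{bmatrix}\mathbf{C}_{\epsilon,\epsilon} & \mathbf{C}_{\epsilon,\xi} & \mathbf{C}_{\epsilon,h}\\ \mathbf{C}_{\xi,\epsilon} & \mathbf{C}_{\xi,\xi} & \mathbf{C}_{\xi,h}\\ \mathbf{C}_{h,\epsilon} & \mathbf{C}_{h,\xi} & \mathbf{C}_{h,h}\end{bmatrix} \tag{23}$$

where $\mathbf{C}_{\epsilon,\epsilon}$, $\mathbf{C}_{\xi,\xi}$ and $\mathbf{C}_{h,h}$ are the $K \times K$ CRB matrices for $\boldsymbol{\epsilon}$, $\boldsymbol{\xi}$ and $\mathbf{h}$ respectively. According to [35] and [36], given the estimates $(\hat{\boldsymbol{\epsilon}}, \hat{\boldsymbol{\xi}}, \hat{\mathbf{h}})$, the true values of the timing and channel parameters asymptotically follow a multivariate Gaussian distribution as

$$\boldsymbol{\epsilon} \sim \mathcal{N}(\hat{\boldsymbol{\epsilon}}, \mathbf{C}_{\epsilon,\epsilon}), \quad \boldsymbol{\xi} \sim \mathcal{CN}(\hat{\boldsymbol{\xi}}, \mathbf{C}_{\xi,\xi}), \quad \mathbf{h} \sim \mathcal{CN}(\hat{\mathbf{h}}, \mathbf{C}_{h,h}). \tag{24}$$

IV. Joint Relay and Destination Timing Resynchronization Algorithms

In this section, the joint design of timing resynchronization algorithms at both $R_k$ and D is addressed under timing and channel uncertainties. Recall the general model in (7). During the data transmission period, the superimposed training $\mathbf{p}$ in (7) is set to zero and all the power at the relays is used to transmit data from S. Thus, the model becomes

$$\mathbf{y}_d = \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{W}\mathbf{F}\boldsymbol{\Omega}\mathbf{s}_d + \mathbf{A}_{\epsilon}\mathbf{H}\mathbf{W}\mathbf{e} + \mathbf{v} \tag{25}$$

where $\mathbf{y}_d$ is the received signal at D during the data transmission period. Using the diagonal structure of $\mathbf{W}$, $\mathbf{H}$, and $\mathbf{F}$, we can rearrange the model in the data transmission period as

$$\mathbf{y}_d = \mathbf{A}_{\epsilon}\mathbf{W}\mathbf{S}\boldsymbol{\Omega}\mathbf{s}_d + \mathbf{A}_{\epsilon}\mathbf{W}\mathbf{H}\mathbf{e} + \mathbf{v} \tag{26}$$

where $\mathbf{S} = \mathbf{H}\mathbf{F}$ is the composite channel matrix.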
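As a numerical companion to the uncertainty analysis, the sketch below evaluates a Fisher information matrix for a complex Gaussian observation from lists of derivatives of the mean and of the covariance, using the general expression found in standard estimation-theory texts, and then maps the real-parameter CRB to a complex parameter with a transformation of the form $\boldsymbol{\Pi}\mathbf{J}^{-1}\boldsymbol{\Pi}^H$. The toy model (one complex gain on a known vector, white noise) and the function name are assumptions made for illustration; the derivative lists for the relay model would have to be supplied from the expressions in this section.

```python
import numpy as np

def fisher_information(dmu, dR, Rz):
    """J[i, j] = 2 Re{dmu_i^H Rz^{-1} dmu_j} + tr{Rz^{-1} dR_i Rz^{-1} dR_j}."""
    Rinv = np.linalg.inv(Rz)
    P = len(dmu)
    J = np.zeros((P, P))
    for i in range(P):
        for j in range(P):
            mean_part = 2.0 * np.real(np.vdot(dmu[i], Rinv @ dmu[j]))
            cov_part = np.real(np.trace(Rinv @ dR[i] @ Rinv @ dR[j]))
            J[i, j] = mean_part + cov_part
    return J

# Toy model: y = a*u + noise, a complex, noise covariance sigma2 * I.
u = np.array([1.0, 2.0, 2.0]) + 0j
sigma2 = 0.5
zero = np.zeros((3, 3))
# Real parameters are [Re{a}, Im{a}]; the covariance does not depend on them.
J = fisher_information([u, 1j * u], [zero, zero], sigma2 * np.eye(3))

# Map the real-valued CRB to the complex parameter a = Re{a} + j Im{a}.
Pi = np.array([[1.0, 1j]])
crb_a = (Pi @ np.linalg.inv(J) @ Pi.conj().T).real[0, 0]
```

For this toy model the FIM is diagonal and the complex-parameter bound reduces to the noise variance divided by the energy of `u`, which gives a quick sanity check before plugging in the more involved relay derivatives.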

For asynchronous AF systems, the signals from the relays $R_k$ are no longer perfectly aligned with each other at the destination D and hence there is ISI from adjacent symbols within the data block. Also, the inaccurate channel knowledge obtained from estimation will result in further performance degradation. Ultimately, the diversity gain achieved by distributed space-time coding [17]-[19] or optimal beamforming [24]-[27] will be lost due to the asynchronism and channel uncertainties. In the following, we jointly design robust timing resynchronization strategies for both the relays and the destination in distributed AF systems.

Traditionally, for beamforming-based AF systems, the design of relay processing is focused on choosing the optimal precoding matrix $\mathbf{W}_k$ to optimize certain performance criteria, such as minimum transmit power [23], maximum system capacity [24], [25], minimum recovered data MSE [29], [30] and maximum received SNR [26]. On the other hand, for distributed space-time coding-based AF systems, the design of relay processing is to normalize the transmitted power at each relay [17]-[19]. Unfortunately, the above mentioned works are based on the assumption of perfect synchronization, and $\mathbf{W}_k$ is a scaled diagonal matrix $\mathbf{W}_k = w_k\mathbf{I}_{L_o+2L_g}$. However, a diagonal scaling operation at the relays may not be optimal under asynchronism due to the ISI. Therefore, in the following, $\mathbf{W}_k$ is considered as a general matrix and designed jointly with the timing equalizer $\mathbf{G}$ at the destination D.

A. Problem Formulation

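Here, we propose to jointly design the timing resynchronization strategies at the relays ($\mathbf{W}$) and the destination ($\mathbf{G}$) to minimize the recovered data MSE under a power constraint

$$\min_{\mathbf{G}, \mathbf{W}} \mathrm{MSE}(\mathbf{G}, \mathbf{W}) \triangleq E\left\{\left\|\mathbf{G}\mathbf{y}_d - \mathbf{T}(\boldsymbol{\eta})\mathbf{s}_d\right\|^2\right\} \quad \text{s.t.} \quad \mathcal{E}(\mathbf{W}) \leq P_D \tag{27}$$

where the expectation in (27) is taken with respect to the statistics of $\mathbf{h}$, $\mathbf{s}_d$, $\mathbf{e}$ and $\mathbf{v}$. The matrix $\mathbf{T}(\boldsymbol{\eta})$ is an $L_o \times (L_o + 2L_g)$ circulant matrix with the first row $\boldsymbol{\eta}$ being

$$\boldsymbol{\eta} = [R_{gg}(-L_g), \ldots, R_{gg}(L_g), \mathbf{0}_{1 \times (L_o - 1)}]$$

where $[R_{gg}(-L_g), \ldots, R_{gg}(L_g)]$ stands for the ideal zero-ISI sampled waveform after matched filtering, with $R_{gg}(\epsilon)$ being the autocorrelation function of $g(t)$ at $t = \epsilon T$. The matrix $\mathbf{T}(\boldsymbol{\eta})$ can be interpreted as a windowing operation selecting the length-$L_o$ block of data of interest for detection.

Besides, $\mathcal{E}(\mathbf{W})$ is the average signal power received at the destination D from the K relays

$$\mathcal{E}(\mathbf{W}) = E\left\{\left\|\mathbf{A}_{\epsilon}\mathbf{W}\mathbf{S}\boldsymbol{\Omega}\mathbf{s}_d + \mathbf{A}_{\epsilon}\mathbf{W}\mathbf{H}\mathbf{e}\right\|^2\right\}$$

where the expectation is taken with respect to $\boldsymbol{\epsilon}$, $\mathbf{h}$, $\mathbf{s}_d$ and $\mathbf{e}$. On the other hand, $P_D$ is used to specify a threshold at the destination D to limit the power from the K relays [28], [37], [38]. This constraint is especially important for cooperative communication systems (e.g., "ad hoc" wireless networks, sensor networks, etc.) where relay nodes cooperate to transmit a single data stream, which may result in a much higher transmit power than previously allowed.

Using the fact that the timing and channel uncertainties follow an asymptotic Gaussian distribution (24), the average MSE of the recovered data and the power constraint are evaluated in Appendix B to be

$$\mathrm{MSE}(\mathbf{G}, \mathbf{W}) = \mathrm{tr}\left\{\mathbf{G}\left(\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H + \mathbf{R}_v\right)\mathbf{G}^H\right\} + \mathrm{tr}\left\{\mathbf{T}(\boldsymbol{\eta})\mathbf{R}_s\mathbf{T}^H(\boldsymbol{\eta})\right\} - \mathrm{tr}\left\{\mathbf{T}(\boldsymbol{\eta})\mathbf{R}_s\boldsymbol{\Omega}^H\hat{\mathbf{S}}^H\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H\mathbf{G}^H\right\} - \mathrm{tr}\left\{\mathbf{G}\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\hat{\mathbf{S}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\boldsymbol{\eta})\right\} \tag{31}$$

$$\mathcal{E}(\mathbf{W}) = \mathrm{tr}\left\{\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H\right\} \tag{32}$$

where $\mathbf{R} = \hat{\mathbf{S}}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\hat{\mathbf{S}}^H + \mathbf{R}_{\delta_\xi} \odot (\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H) + \hat{\mathbf{H}}\mathbf{R}_e\hat{\mathbf{H}}^H + \mathbf{R}_{\delta_h} \odot \mathbf{R}_e$ and furthermore, $\mathbf{R}_{\delta_\xi} = \mathbf{C}_{\xi,\xi} \otimes \mathbf{1}_{L_o+2L_g}$, $\mathbf{R}_{\delta_h} = \mathbf{C}_{h,h} \otimes \mathbf{1}_{L_o+2L_g}$, with $\mathbf{1}_n$ denoting the $n \times n$ all-ones matrix, and $\mathbf{R}_s$ is the covariance of the data sequence.

B. Joint Design of Timing Equalizer G and Precoder W

In order to derive the optimal pair $(\mathbf{G}, \mathbf{W})$, we first differentiate (31) with respect to $\mathbf{G}^*$ and set the derivative to zero. It follows that the optimal $\mathbf{G}$ in terms of $\mathbf{W}$ is

$$\mathbf{G} = \mathbf{T}(\boldsymbol{\eta})\mathbf{R}_s\boldsymbol{\Omega}^H\hat{\mathbf{S}}^H\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H\left(\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H + \mathbf{R}_v\right)^{-1}. \tag{33}$$

After substituting (33) into (31), it is proved in Appendix C that the optimization problem (27) can be reformulated as

$$\min_{\mathbf{W}} \mathrm{tr}\left\{\boldsymbol{\Phi}\left(\mathbf{R}^{H/2}\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H\mathbf{R}_v^{-1}\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}^{1/2} + \mathbf{I}\right)^{-1}\right\} \quad \text{s.t.} \quad \mathrm{tr}\left\{\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\epsilon}}^H\right\} \leq P_D \tag{34}$$

where $\boldsymbol{\Phi} \triangleq \mathbf{R}^{-1/2}\hat{\mathbf{S}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\boldsymbol{\eta})\mathbf{T}(\boldsymbol{\eta})\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\mathbf{S}}^H\mathbf{R}^{-H/2}$.

Since $\mathbf{R}_v$ and $\boldsymbol{\Phi}$ are both Hermitian, by eigendecomposition we can write $\mathbf{R}_v = \mathbf{U}_v\boldsymbol{\Lambda}_v\mathbf{U}_v^H$ and $\boldsymbol{\Phi} = \mathbf{U}_{\Phi}\boldsymbol{\Lambda}_{\Phi}\mathbf{U}_{\Phi}^H$, where $\mathbf{U}_v$ and $\mathbf{U}_{\Phi}$ are the unitary matrices containing the eigenvectors that correspond to the eigenvalues in $\boldsymbol{\Lambda}_v$ and $\boldsymbol{\Lambda}_{\Phi}$. After transforming the variable $\mathbf{W}$ by $\tilde{\mathbf{W}} = \mathbf{U}_v^H\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}^{1/2}\mathbf{U}_{\Phi}$, the optimization problem is further simplified as

$$\min_{\tilde{\mathbf{W}}} \mathrm{tr}\left\{\boldsymbol{\Lambda}_{\Phi}\left(\tilde{\mathbf{W}}^H\boldsymbol{\Lambda}_v^{-1}\tilde{\mathbf{W}} + \mathbf{I}\right)^{-1}\right\} \quad \text{s.t.} \quad \mathrm{tr}\left\{\tilde{\mathbf{W}}\tilde{\mathbf{W}}^H\right\} \leq P_D. \tag{35}$$

It is known that the optimal $\tilde{\mathbf{W}}$ for the above problem has the following structure [39]:

$$\tilde{\mathbf{W}}^o = \begin{bmatrix}\mathrm{diag}(\beta_1, \ldots, \beta_N) & \mathbf{0}\\ \mathbf{0} & \mathbf{0}\end{bmatrix} \tag{36}$$

where $N = \min\{\mathrm{rank}(\boldsymbol{\Lambda}_v), K(L_o + 2L_g)\}$. Denote $\lambda_{v,i}$ and $\lambda_{\Phi,i}$ as the $i$th diagonal elements of the matrices $\boldsymbol{\Lambda}_v$ and $\boldsymbol{\Lambda}_{\Phi}$; then the above problem becomes

$$\min_{\beta_1, \ldots, \beta_N} \sum_{i=1}^{N}\frac{\lambda_{\Phi,i}\lambda_{v,i}}{\beta_i^2 + \lambda_{v,i}} \quad \text{s.t.} \quad \sum_{i=1}^{N}\beta_i^2 \leq P_D.$$

Applying the method of Lagrange multipliers to the above optimization problem, the optimal solution is readily obtained as

$$\beta_i = \left[\left(\sqrt{\frac{\lambda_{\Phi,i}\lambda_{v,i}}{\nu}} - \lambda_{v,i}\right)^{+}\right]^{1/2} \tag{37}$$

where $(x)^+ = x$ if $x > 0$ and $(x)^+ = 0$ if $x \leq 0$. The constant $\nu$ is chosen to satisfy the constraint $\sum_i \beta_i^2 \leq P_D$.

With the elements $\beta_i$ calculated in (37), the optimal $\tilde{\mathbf{W}}^o$ can be obtained as in (36). Then, according to the transformation $\tilde{\mathbf{W}} = \mathbf{U}_v^H\mathbf{A}_{\hat{\epsilon}}\mathbf{W}\mathbf{R}^{1/2}\mathbf{U}_{\Phi}$, the optimal choice $\mathbf{W}^o$ should satisfy the following equation:

$$\mathbf{A}_{\hat{\epsilon}}\mathbf{W}^o = \mathbf{U}_v\tilde{\mathbf{W}}^o\mathbf{U}_{\Phi}^H\mathbf{R}^{-1/2} \triangleq \mathbf{T}_o. \tag{38}$$

Since $\mathbf{A}_{\hat{\epsilon}} = [\mathbf{A}_{\hat{\epsilon}_1}, \ldots, \mathbf{A}_{\hat{\epsilon}_K}]$ and $\mathbf{W}^o = \mathrm{diag}[\mathbf{W}_1^o, \ldots, \mathbf{W}_K^o]$, the above equation can also be written as

$$[\mathbf{A}_{\hat{\epsilon}_1}\mathbf{W}_1^o, \ldots, \mathbf{A}_{\hat{\epsilon}_K}\mathbf{W}_K^o] = \mathbf{T}_o \tag{39}$$

and the corresponding block $\mathbf{W}_k^o$ can be calculated as

$$\mathbf{W}_k^o = \left(\mathbf{A}_{\hat{\epsilon}_k}^H\mathbf{A}_{\hat{\epsilon}_k}\right)^{-1}\mathbf{A}_{\hat{\epsilon}_k}^H\mathbf{T}_o[(k-1)R + 1 : kR] \tag{40}$$

with $R = L_o + 2L_g$, where $\mathbf{T}_o[(k-1)R + 1 : kR]$ denotes the submatrix containing the $[(k-1)R + 1]$th to the $(kR)$th columns of $\mathbf{T}_o$.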
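The power-loading step has a water-filling form, and the Lagrange multiplier $\nu$ has to be found numerically. The following is a minimal sketch, assuming the scalar problem of minimizing $\sum_i \lambda_{\Phi,i}\lambda_{v,i}/(\beta_i^2 + \lambda_{v,i})$ subject to $\sum_i \beta_i^2 \leq P_D$ with strictly positive $\lambda_{v,i}$; the bisection routine, its bracketing values and the function name are illustrative choices, not part of the original derivation.

```python
import numpy as np

def waterfill_beta(lam_phi, lam_v, P_D, iters=200):
    """Return beta with beta_i^2 = (sqrt(lam_phi_i*lam_v_i/nu) - lam_v_i)^+ and sum(beta^2) <= P_D."""
    lam_phi = np.asarray(lam_phi, dtype=float)
    lam_v = np.asarray(lam_v, dtype=float)

    def power(nu):
        # Per-mode power allocation for a given multiplier nu.
        return np.maximum(np.sqrt(lam_phi * lam_v / nu) - lam_v, 0.0)

    # Total power decreases with nu; at nu = max(lam_phi/lam_v) every mode is switched off.
    lo, hi = 1e-12, np.max(lam_phi / lam_v)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)            # bisection on a logarithmic scale
        if power(mid).sum() > P_D:
            lo = mid                      # too much power: increase nu
        else:
            hi = mid                      # feasible: try a smaller nu
    return np.sqrt(power(hi))

def objective(beta, lam_phi, lam_v):
    """Value of sum_i lam_phi_i * lam_v_i / (beta_i^2 + lam_v_i)."""
    return float(np.sum(np.asarray(lam_phi) * np.asarray(lam_v) / (beta ** 2 + np.asarray(lam_v))))

lam_phi = [4.0, 1.0, 0.25]
lam_v = [1.0, 1.0, 1.0]
beta = waterfill_beta(lam_phi, lam_v, P_D=3.0)
```

In this small example the weakest mode receives no power: solving the stationarity condition by hand with two active modes gives $\sqrt{\nu} = 0.6$ and $\beta^2 = [7/3, 2/3, 0]$, which the bisection reproduces. The resulting $\beta_i$ would then populate the diagonal structure of the transformed precoder.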

V. Numerical Results and Discussions

In this section, the performance of the proposed algorithms for estimation and timing resynchronization in different AF systems is demonstrated by Monte Carlo simulations, where each point is obtained by averaging over $10^4$ runs. In all simulations, QPSK modulation is used and the pulse shaping filter $g(t)$ is assumed to be a root-raised cosine waveform with effective tail length $L_g = 4$, roll-off factor 0.22 and normalized energy.

The length of the sequence is $L_o = 64$ and the oversampling ratio is set to $Q = 2$. The channel coefficients $f_k$ and $h_k$ are modeled as independent identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance. Finally, the noise covariance is taken as $\mathbf{R}_{n_k} = \sigma_{n_k}^2\mathbf{I}_L$ at $R_k$ and $\mathbf{R}_v = \sigma_v^2\mathbf{I}_{L_oQ}$ at D. For simplicity, the noise powers at the relays and the destination are set to be the same, $\sigma_v^2 = \sigma_{n_1}^2 = \cdots = \sigma_{n_K}^2 = \sigma^2$, and the SNR is defined as $\mathrm{SNR} = E_s/\sigma^2$, where $E_s$ is the average transmitted signal power $E\{|s(i)|^2\} = E_s$.

Fig. 2. Estimation performance of coarse LS and iterative ML estimators for timing offsets $\boldsymbol{\epsilon}$ for both K = 2 and K = 4.
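The signal and noise conventions above translate directly into a few helper routines for a Monte Carlo loop. The sketch below is illustrative only: it assumes Gray-style QPSK with constant symbol energy $E_s$ and the definition $\mathrm{SNR} = E_s/\sigma^2$ stated above, and the function names are hypothetical.

```python
import numpy as np

def qpsk_symbols(n, Es, rng):
    """Draw n QPSK symbols with |s(i)|^2 = Es."""
    bits = rng.integers(0, 2, size=(n, 2))
    return np.sqrt(Es / 2) * ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1))

def complex_noise(n, sigma2, rng):
    """Circular complex Gaussian noise with zero mean and variance sigma2 per sample."""
    return np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

def noise_variance(Es, snr_db):
    """Noise variance sigma^2 such that SNR = Es / sigma^2 equals snr_db in decibels."""
    return Es / 10 ** (snr_db / 10)

rng = np.random.default_rng(0)
s = qpsk_symbols(1000, 2.0, rng)
v = complex_noise(200000, noise_variance(1.0, 10.0), rng)
```

Using the same `sigma2` for the relay noise and the destination noise reproduces the equal-noise-power setting used in the simulations.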

A. Estimation Performance

In the training period, the training sequence $\mathbf{s}_t$ from the source and the encoding matrix $\boldsymbol{\Omega}_k$ are chosen according to [19] as $\mathbf{s}_t = \sqrt{E_s}\,\mathbf{1}_{(L_o+2L_g) \times 1}$ and $\boldsymbol{\Omega}_k$ as specified in [17]. As introduced later, the same $\boldsymbol{\Omega}_k$ is also used for distributed space-time coding-based AF systems during the data transmission period. On the other hand, the superimposed training sequences $\mathbf{p}_k$ at the relays $R_k$ are all generated as $[\exp(-j\phi_{-L_g}), \ldots, \exp(-j\phi_{L_o+L_g-1})]$, where $\phi_i$ is uniformly distributed in $[-\pi, \pi]$.

Here, the performance of the proposed LS estimator and iterative ML estimator is presented. For the iterative ML estimator, the convergence thresholds for the three parameter sets are set to a common value proportional to $K$ (the number of relays), and in our simulations the algorithm reaches stable convergent results after approximately three iterations. In Fig. 2, the timing estimation MSE at D and the corresponding CRB are plotted as a function of SNR (K = 2 and K = 4). It can be seen that for both cases, the CRB and the MSE for the LS and iterative ML estimators coincide in the high SNR region, indicating that the LS solution provides accuracy very close to that of the ML estimator and the CRB. At low SNR, the MSE of the LS estimate is slightly higher than that obtained from the ML estimator because the LS estimate does not consider the effects of the timing-channel-dependent noise, which are significant in the low SNR region. It is also observed that for both cases at low SNR, the LS and ML MSEs fall below the CRB due to the fading environment. This phenomenon has been discussed in detail in [22] and hence is not repeated here due to space limitations.

Fig. 3. Estimation performance of coarse LS and iterative ML estimators for the composite channel $\boldsymbol{\xi}$ for both K = 2 and K = 4.

In Fig. 3, the MSE of the composite channel estimate and the corresponding CRB are plotted. It is obvious from Fig. 3 that the performance of both the LS and ML channel estimators touches the CRB when the SNR is high. When the SNR is low, the MSE performance of the LS solution starts to deviate from the CRB while the ML solution remains close to the CRB. Similar performance can be observed for the estimation of the second hop channel $\mathbf{h}$ and thus it is not presented here.

Fig. 4. SER performance (versus SNR in dB) for asynchronous beamforming-based AF systems with perfect information using QPSK modulation. $|\Delta_k| < 0.1$ and $|\Delta_k| < 0.3$ for K = 2 and K = 4.

Fig. 5. SER performance (versus SNR in dB) for asynchronous beamforming-based AF systems with uncertainties using QPSK modulation. $|\Delta_k| < 0.1$ for K = 2 and K = 4.

B. Timing Resynchronization Performance

The performance of the proposed joint relay and destination timing resynchronization algorithms is illustrated in this section. In order to better illustrate the effects of timing offsets, we represent the offsets as $\epsilon_k = \epsilon_0 + \Delta_k$, where $\epsilon_0$ is the common offset with respect to a certain time frame at D and $\Delta_k$ is the residual offset. Furthermore, we separately consider the results for distributed beamforming-based and distributed space-time coding-based AF systems.

1) Distributed Beamforming-Based AF Systems: In the beamforming-based AF system, $\boldsymbol{\Omega}_k = \mathbf{I}_R$. The proposed design is first examined in Fig. 4 for K = 2 and K = 4 relay systems when perfect timing and channel information are available under $|\Delta_k| < 0.1$ and $|\Delta_k| < 0.3$. When the timing asynchronism is relatively mild ($|\Delta_k| < 0.1$), the performance of the proposed joint design $(\mathbf{G}, \mathbf{W})$ overlaps with the performance of the perfect case. Even under severe timing misalignment ($|\Delta_k| < 0.3$), the performance of the proposed design still remains at a satisfactory level and the degradation brought by the increase of delay difference is acceptable.

Meanwhile, in Fig. 5, the performance of the joint relay and destination timing resynchronization algorithms $(\mathbf{G}, \mathbf{W})$ is further examined in K = 2 and K = 4 relay systems under $|\Delta_k| < 0.1$ and with timing and channel uncertainties. In particular, the proposed design is compared with a nonrobust design where the parameter estimates are directly used without considering their estimation errors. The proposed design is effective in dealing with system uncertainties and outperforms the nonrobust design by almost an order of magnitude. As can be seen, under mild asynchronism $|\Delta_k| < 0.1$, the proposed joint timing resynchronization algorithm provides performance very close to the perfect case, in which there is no asynchronism and no system uncertainty. This shows that the proposed algorithm can effectively mitigate the ISI in beamforming-based AF systems in terms of SER performance, especially in salvaging the diversity gain that may be lost due to asynchronism [9].

Fig. 6. SER performance (versus SNR in dB) for asynchronous coding-based AF systems with perfect information using QPSK modulation. $|\Delta_k| < 0.1$ and $|\Delta_k| < 0.3$ for K = 2 and K = 4.

2) Distributed Coding-Based AF Systems: For the distributed coding-based AF system, during the data transmission period, the coding matrices $\boldsymbol{\Omega}_k$ are generated randomly according to the isotropic distribution on the space of $(L_0+2L_g) \times (L_0+2L_g)$ unitary matrices, as in [17] and [18]. The performance of the proposed robust joint relay and destination timing resynchronization algorithm is examined for coding-based AF systems in Figs. 6 and 7, with perfect information and with uncertainties, respectively. The behavior is similar to that observed for beamforming-based systems, and asynchronous coding-based AF systems generally preserve their diversity order under the proposed joint relay and destination resynchronization algorithms.

Fig. 7. SER performance for asynchronous coding-based AF systems with uncertainties using QPSK modulation, $|\Delta_k| < 0.1$ for K = 2 and K = 4.
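As an illustration of how random coding matrices of this kind could be drawn, the following Python sketch samples unitary matrices from the isotropic (Haar) distribution by QR-decomposing a complex Gaussian matrix and then correcting the phases of the diagonal of the triangular factor. This is a common numerical recipe and not necessarily the procedure used in [17], [18], or in the simulations reported here. The function name `random_unitary` and the values chosen for `L0`, `Lg`, and `K` are assumptions made only for the example.

```python
import numpy as np

def random_unitary(n, rng):
    """Draw an n x n unitary matrix from the isotropic (Haar) distribution."""
    # Complex Gaussian matrix with i.i.d. CN(0, 1) entries.
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    q, r = np.linalg.qr(z)
    # Rescale each column of Q by the phase of the matching diagonal entry of R.
    # Without this step Q is unitary but not exactly Haar distributed.
    d = np.diag(r)
    return q * (d / np.abs(d))

rng = np.random.default_rng(0)
L0, Lg, K = 8, 2, 2              # hypothetical block length, guard length, relay count
n = L0 + 2 * Lg
omegas = [random_unitary(n, rng) for _ in range(K)]

# Largest deviation of Omega_k^H Omega_k from the identity over all relays.
unitarity_error = max(np.linalg.norm(om.conj().T @ om - np.eye(n)) for om in omegas)
print(unitarity_error)
```

One matrix is drawn independently per relay, so each relay applies its own unitary transformation to the block of length L0 + 2Lg.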

VI. Conclusion

In this paper, the problems of joint timing offset and channel estimation, and of the joint design of relay and destination timing resynchronization algorithms, were considered for amplify-and-forward cooperative relay systems. The LS estimator, the iterative ML estimator, and the CRB were derived for the timing and channel parameters. Then, aiming at minimizing the MSE of the recovered data at the destination while taking into account the uncertainties in the estimated timing and channel parameters, the robust joint timing resynchronization algorithm at the relays and the destination was derived. The proposed framework is a general methodology that can be applied to asynchronous AF systems with beamforming or distributed space-time coding. Furthermore, this framework integrates the estimation process into the design problem that deals with asynchronism and system uncertainties. Simulation results support the presented analysis and verify the efficiency of the estimation and timing compensation algorithms.

Appendix A Calculation of Partial Derivative

The partial derivative with respect to $\mathbf{h}^*$ can be obtained elementwise. With the expression of the log-likelihood function in (16), the partial derivative with respect to $h_k^*$ consists of two terms: one arising from $\log\det\mathbf{R}_z$ and one arising from the quadratic form in the residual $\mathbf{y}_t - \mathbf{M} - \mathbf{N}\mathbf{h}$ weighted by $\mathbf{R}_z^{-1}$. Each individual component is computed as follows.

• Calculation of the first term: With complex variable differentiation [35], we have

$$\frac{\partial \log\det\mathbf{R}_z}{\partial h_k^*} = \mathrm{tr}\left\{\mathbf{R}_z^{-1}\,\frac{\partial \mathbf{R}_z}{\partial h_k^*}\right\}.$$

• Calculation of the second term: Similarly, the second term can be obtained using the chain rule, which separates the derivatives of the two residual factors from the derivative of $\mathbf{R}_z^{-1}$. By complex variable differentiation, the latter is

$$\frac{\partial \mathbf{R}_z^{-1}}{\partial h_k^*} = -\mathbf{R}_z^{-1}\,\frac{\partial \mathbf{R}_z}{\partial h_k^*}\,\mathbf{R}_z^{-1}.$$

The remaining components are the derivatives of the residual $\mathbf{y}_t - \mathbf{M} - \mathbf{N}\mathbf{h}$ and of its conjugate transpose with respect to $h_k^*$.
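The two matrix-calculus identities used in this appendix, for the derivative of a log-determinant and for the derivative of a matrix inverse, can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it differentiates with respect to a real scalar `theta` instead of the complex conjugate variable used in the text, and the covariance model `cov(theta) = r0 + theta * r1` is hypothetical rather than the covariance matrix of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
r0 = a @ a.conj().T + n * np.eye(n)   # Hermitian positive definite base covariance
r1 = 0.1 * (b + b.conj().T)           # Hermitian direction, plays the role of dR/dtheta

def cov(theta):
    """Hypothetical covariance model R(theta) = R0 + theta * R1."""
    return r0 + theta * r1

def logdet(theta):
    _, value = np.linalg.slogdet(cov(theta))
    return value

theta, step = 0.3, 1e-5
r_inv = np.linalg.inv(cov(theta))

# First identity: d/dtheta log det R = tr{R^{-1} dR/dtheta}
fd_logdet = (logdet(theta + step) - logdet(theta - step)) / (2 * step)
an_logdet = np.trace(r_inv @ r1).real

# Second identity: d/dtheta R^{-1} = -R^{-1} (dR/dtheta) R^{-1}
fd_inv = (np.linalg.inv(cov(theta + step)) - np.linalg.inv(cov(theta - step))) / (2 * step)
an_inv = -r_inv @ r1 @ r_inv

print(abs(fd_logdet - an_logdet), np.linalg.norm(fd_inv - an_inv))
```

Both printed differences should be at the level of finite-difference error, which indicates that the analytic expressions agree with central differences for this example.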

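Appendix B Proof of MSE(G, W) in (31) and $\mathcal{L}(\mathbf{W})$ in (32)

A. MSE(G, W) Expression

Substituting (26) into (27), we have

$$\mathrm{MSE}(\mathbf{G},\mathbf{W}) = \mathbb{E}_{\boldsymbol{\epsilon},\boldsymbol{\xi},\mathbf{h},\mathbf{s}_d,\mathbf{e},\mathbf{v}}\left\{\left\|\mathbf{G}\left(\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\boldsymbol{\Xi}\boldsymbol{\Omega}\mathbf{s}_d + \mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\mathbf{H}\mathbf{e} + \mathbf{v}\right) - \mathbf{T}(\eta)\mathbf{s}_d\right\|^2\right\}.$$

Using the fact that $\mathbf{s}_d$, $\mathbf{e}$ and $\mathbf{v}$ are uncorrelated, we have

$$\begin{aligned}
\mathrm{MSE}(\mathbf{G},\mathbf{W}) ={}& \mathrm{tr}\left\{\mathbf{G}\,\mathbb{E}_{\boldsymbol{\epsilon},\boldsymbol{\xi},\mathbf{h}}\left\{\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\left(\boldsymbol{\Xi}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\boldsymbol{\Xi}^H + \mathbf{H}\mathbf{R}_e\mathbf{H}^H\right)\mathbf{W}^H\mathbf{A}_{\boldsymbol{\epsilon}}^H\right\}\mathbf{G}^H\right\} + \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} + \mathrm{tr}\left\{\mathbf{G}\mathbf{R}_v\mathbf{G}^H\right\} \\
&- \mathrm{tr}\left\{\mathbf{G}\,\mathbb{E}_{\boldsymbol{\epsilon},\boldsymbol{\xi}}\left\{\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\boldsymbol{\Xi}\right\}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\,\mathbb{E}_{\boldsymbol{\epsilon},\boldsymbol{\xi}}\left\{\boldsymbol{\Xi}^H\mathbf{W}^H\mathbf{A}_{\boldsymbol{\epsilon}}^H\right\}\mathbf{G}^H\right\}.
\end{aligned}$$

Since the timing offsets $\boldsymbol{\epsilon}$ are entangled through $\mathbf{A}_{\boldsymbol{\epsilon}}$ in a highly nonlinear manner, the expectations over the channel uncertainties are tackled first, and the timing uncertainties are then tackled using a Taylor series expansion.

• Channel Uncertainties: With the channel uncertainties $\boldsymbol{\xi} = \hat{\boldsymbol{\xi}} + \boldsymbol{\delta}_{\xi}$ and $\mathbf{h} = \hat{\mathbf{h}} + \boldsymbol{\delta}_{h}$, we can write

$$\boldsymbol{\Xi} = \hat{\boldsymbol{\Xi}} + \boldsymbol{\Xi}_{\delta}, \qquad \mathbf{H} = \hat{\mathbf{H}} + \mathbf{H}_{\delta},$$

where $\hat{\boldsymbol{\Xi}} = \mathrm{diag}\{\hat{\boldsymbol{\xi}}\} \otimes \mathbf{I}_{L_0+2L_g}$, $\boldsymbol{\Xi}_{\delta} = \mathrm{diag}\{\boldsymbol{\delta}_{\xi}\} \otimes \mathbf{I}_{L_0+2L_g}$, $\hat{\mathbf{H}} = \mathrm{diag}\{\hat{\mathbf{h}}\} \otimes \mathbf{I}_{L_0+2L_g}$, and $\mathbf{H}_{\delta} = \mathrm{diag}\{\boldsymbol{\delta}_{h}\} \otimes \mathbf{I}_{L_0+2L_g}$. Using the asymptotic Gaussian distribution in (24), the expectations over $\boldsymbol{\Xi}$ and $\mathbf{H}$ can be evaluated as

$$\mathbb{E}_{\boldsymbol{\Xi}}\left\{\boldsymbol{\Xi}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\boldsymbol{\Xi}^H\right\} = \hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H + \mathbf{R}_{\xi} \odot \left(\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\right),$$

$$\mathbb{E}_{\mathbf{H}}\left\{\mathbf{H}\mathbf{R}_e\mathbf{H}^H\right\} = \hat{\mathbf{H}}\mathbf{R}_e\hat{\mathbf{H}}^H + \mathbf{R}_{h} \odot \mathbf{R}_e,$$

where $\odot$ denotes the elementwise (Hadamard) product, $\mathbf{R}_{\xi} = \mathbf{C}_{\xi} \otimes \mathbf{1}_{L_0+2L_g}$ and $\mathbf{R}_{h} = \mathbf{C}_{h} \otimes \mathbf{1}_{L_0+2L_g}$, with $\mathbf{1}_{L_0+2L_g}$ the $(L_0+2L_g) \times (L_0+2L_g)$ all-ones matrix and $\mathbf{C}_{\xi}$ and $\mathbf{C}_{h}$ being the CRB matrices for $\boldsymbol{\xi}$ and $\mathbf{h}$ defined in (23). Note that the estimates are used when evaluating the CRB matrices, because the CRB depends on the true values of the parameters, which are unknown in practice.

• Timing Uncertainty: On the other hand, due to the nonlinearity present in the system model, we expand the matrix $\mathbf{A}_{\epsilon_k}$ using a Taylor series around the estimate $\hat{\epsilon}_k$ as follows:

$$\mathbf{A}_{\epsilon_k} = \mathbf{A}_{\hat{\epsilon}_k} + \mathbf{D}_{\hat{\epsilon}_k}\,\delta_{\epsilon_k} + \mathcal{O}\left(\delta_{\epsilon_k}^2\right),$$

where $\mathbf{D}_{\hat{\epsilon}_k} = \partial\mathbf{A}_{\epsilon_k}/\partial\epsilon_k$ evaluated at $\epsilon_k = \hat{\epsilon}_k$, and the symbol $\mathcal{O}(\delta_{\epsilon_k}^2)$ represents the matrix terms of order $\delta_{\epsilon_k}^2$ and higher. With the above Taylor expansion and the distribution in (24), for an arbitrary square matrix $\mathbf{Z}$ containing square submatrices $\mathbf{Z}_{i,j}$ ($i, j = 1, \ldots, K$), we have

$$\mathbb{E}_{\boldsymbol{\epsilon}}\left\{\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{Z}\mathbf{A}_{\boldsymbol{\epsilon}}^H\right\} = \underbrace{\sum_{i=1}^{K}\sum_{j=1}^{K}\mathbb{E}\left\{\delta_{\epsilon_i}\delta_{\epsilon_j}\right\}\mathbf{D}_{\hat{\epsilon}_i}\mathbf{Z}_{i,j}\mathbf{D}_{\hat{\epsilon}_j}^H + \cdots}_{\text{high order terms, neglected}} + \sum_{i=1}^{K}\sum_{j=1}^{K}\mathbf{A}_{\hat{\epsilon}_i}\mathbf{Z}_{i,j}\mathbf{A}_{\hat{\epsilon}_j}^H \approx \mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{Z}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H,$$

and similarly $\mathbb{E}_{\boldsymbol{\epsilon}}\{\mathbf{A}_{\boldsymbol{\epsilon}}\} \approx \mathbf{A}_{\hat{\boldsymbol{\epsilon}}}$. Substituting the above results back to (43), we arrive at

$$\begin{aligned}
\mathrm{MSE}(\mathbf{G},\mathbf{W}) ={}& \mathrm{tr}\left\{\mathbf{G}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H\mathbf{G}^H\right\} + \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} + \mathrm{tr}\left\{\mathbf{G}\mathbf{R}_v\mathbf{G}^H\right\} \\
&- \mathrm{tr}\left\{\mathbf{G}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H\mathbf{G}^H\right\},
\end{aligned}$$

where $\mathbf{R} = \hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H + \mathbf{R}_{\xi} \odot \left(\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\right) + \hat{\mathbf{H}}\mathbf{R}_e\hat{\mathbf{H}}^H + \mathbf{R}_{h} \odot \mathbf{R}_e$.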
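The way channel uncertainty enters the averaged covariance can be illustrated numerically. The sketch below assumes a block-diagonal channel matrix of the form diag{xi} (Kronecker) I, with a zero-mean complex Gaussian estimation error of covariance `c_xi`, and compares a Monte Carlo average against a closed form made of the estimate term plus a Hadamard-weighted uncertainty term. The sizes `K` and `n`, the matrix `z`, and the covariance `c_xi` are stand-ins invented for this example and are not the quantities of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
K, n = 3, 4                                   # hypothetical relay count and block size
xi_hat = rng.standard_normal(K) + 1j * rng.standard_normal(K)
g = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
c_xi = g @ g.conj().T / K                     # stand-in error covariance (e.g. a CRB matrix)
z = rng.standard_normal((K * n, K * n)) + 1j * rng.standard_normal((K * n, K * n))

def lift(v):
    """Block-diagonal matrix diag{v} (Kronecker) I_n."""
    return np.kron(np.diag(v), np.eye(n))

# Closed form: estimate term plus (C kron all-ones) Hadamard Z.
x_hat = lift(xi_hat)
closed_form = x_hat @ z @ x_hat.conj().T + np.kron(c_xi, np.ones((n, n))) * z

# Monte Carlo average over xi = xi_hat + delta, with delta ~ CN(0, c_xi).
chol = np.linalg.cholesky(c_xi)
trials = 20000
acc = np.zeros_like(z)
for _ in range(trials):
    w = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2.0)
    x = lift(xi_hat + chol @ w)
    acc += x @ z @ x.conj().T

rel_err = np.linalg.norm(acc / trials - closed_form) / np.linalg.norm(closed_form)
print(rel_err)
```

The relative error shrinks roughly as the inverse square root of the number of trials, which is consistent with the closed form being the exact expectation under these assumptions.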

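B. $\mathcal{L}(\mathbf{W})$ Expression

Similar to the procedures shown above, we can readily write the expression of $\mathcal{L}(\mathbf{W})$ as

$$\begin{aligned}
\mathcal{L}(\mathbf{W}) &\triangleq \mathbb{E}_{\boldsymbol{\epsilon},\boldsymbol{\xi},\mathbf{h},\mathbf{s}_d,\mathbf{e}}\left\{\left\|\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\boldsymbol{\Xi}\boldsymbol{\Omega}\mathbf{s}_d + \mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\mathbf{H}\mathbf{e}\right\|^2\right\} \\
&= \mathrm{tr}\left\{\mathbb{E}_{\boldsymbol{\epsilon}}\left\{\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\,\mathbb{E}_{\boldsymbol{\Xi}}\left\{\boldsymbol{\Xi}\boldsymbol{\Omega}\mathbf{R}_s\boldsymbol{\Omega}^H\boldsymbol{\Xi}^H\right\}\mathbf{W}^H\mathbf{A}_{\boldsymbol{\epsilon}}^H\right\}\right\} + \mathrm{tr}\left\{\mathbb{E}_{\boldsymbol{\epsilon}}\left\{\mathbf{A}_{\boldsymbol{\epsilon}}\mathbf{W}\,\mathbb{E}_{\mathbf{H}}\left\{\mathbf{H}\,\mathbb{E}_{\mathbf{e}}\left\{\mathbf{e}\mathbf{e}^H\right\}\mathbf{H}^H\right\}\mathbf{W}^H\mathbf{A}_{\boldsymbol{\epsilon}}^H\right\}\right\} \\
&\approx \mathrm{tr}\left\{\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H\right\}.
\end{aligned}$$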

Appendix C Proof of (34)

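Substituting (33) into (31), the MSE expression becomes

$$\mathrm{MSE}(\mathbf{W}) = \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H \left(\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H + \mathbf{R}_v\right)^{-1}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\}.$$

Using the transformation of variable $\tilde{\mathbf{W}} = \mathbf{R}_v^{-1/2}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}^{1/2}$, we have

$$\mathrm{MSE}(\tilde{\mathbf{W}}) = \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{R}^{-H/2}\,\tilde{\mathbf{W}}^H\left(\tilde{\mathbf{W}}\tilde{\mathbf{W}}^H + \mathbf{I}\right)^{-1}\tilde{\mathbf{W}}\,\mathbf{R}^{-1/2}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\}.$$

Furthermore, using the matrix inversion lemma $\mathbf{A}^H(\mathbf{A}\mathbf{A}^H + \mathbf{I})^{-1}\mathbf{A} = \mathbf{I} - (\mathbf{A}^H\mathbf{A} + \mathbf{I})^{-1}$, the above expression can be simplified as

$$\begin{aligned}
\mathrm{MSE}(\tilde{\mathbf{W}}) &= \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{R}^{-H/2}\left[\mathbf{I} - \left(\tilde{\mathbf{W}}^H\tilde{\mathbf{W}} + \mathbf{I}\right)^{-1}\right]\mathbf{R}^{-1/2}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\} \\
&= \mathrm{MSE}_0 + \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{R}^{-H/2}\left(\tilde{\mathbf{W}}^H\tilde{\mathbf{W}} + \mathbf{I}\right)^{-1}\mathbf{R}^{-1/2}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\},
\end{aligned}$$

where

$$\mathrm{MSE}_0 = \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s\mathbf{T}^H(\eta)\right\} - \mathrm{tr}\left\{\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{R}^{-1}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\right\}$$

is a constant that is independent of $\tilde{\mathbf{W}}$.

Denoting $\boldsymbol{\Phi} = \mathbf{R}^{-1/2}\hat{\boldsymbol{\Xi}}\boldsymbol{\Omega}\mathbf{R}_s\mathbf{T}^H(\eta)\mathbf{T}(\eta)\mathbf{R}_s^H\boldsymbol{\Omega}^H\hat{\boldsymbol{\Xi}}^H\mathbf{R}^{-H/2}$ and substituting $\tilde{\mathbf{W}} = \mathbf{R}_v^{-1/2}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}^{1/2}$ back to (49), the cost function (31) is now equivalent to

$$\mathrm{MSE}(\mathbf{W}) = \mathrm{tr}\left\{\boldsymbol{\Phi}\left(\mathbf{R}^{H/2}\mathbf{W}^H\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}^H\mathbf{R}_v^{-1}\mathbf{A}_{\hat{\boldsymbol{\epsilon}}}\mathbf{W}\mathbf{R}^{1/2} + \mathbf{I}\right)^{-1}\right\},$$

where the constant $\mathrm{MSE}_0$ is dropped.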
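The chain of simplifications in this appendix can be sanity-checked numerically. The sketch below is an illustration with random stand-in matrices, not the system matrices of the paper: `a_w` plays the role of the product of the timing matrix and the relay matrix, `c` plays the role of the product mapping the data to the relay input, and `t0` is an arbitrary constant standing for the first trace term. It verifies the matrix inversion lemma and checks that the direct MSE expression equals the constant plus the compact trace form.

```python
import numpy as np

rng = np.random.default_rng(3)

def crandn(*shape):
    """Complex Gaussian array with the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def hpd(n):
    """Random Hermitian positive definite matrix."""
    a = crandn(n, n)
    return a @ a.conj().T + np.eye(n)

def sqrtm_hpd(m):
    """Hermitian square root of a Hermitian positive definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

n, m, p = 6, 5, 3              # hypothetical dimensions
r, r_v = hpd(n), hpd(m)        # stand-ins for R and R_v
a_w = crandn(m, n)             # stand-in for A W
c = crandn(n, p)               # stand-in for Xi Omega R_s T^H
t0 = 10.0                      # stand-in constant; its value does not affect the identity

r_half = sqrtm_hpd(r)
r_half_inv = np.linalg.inv(r_half)
rv_half_inv = np.linalg.inv(sqrtm_hpd(r_v))

# Direct form: MSE after substituting the optimal destination filter.
inner = np.linalg.inv(a_w @ r @ a_w.conj().T + r_v)
direct = t0 - np.trace(c.conj().T @ a_w.conj().T @ inner @ a_w @ c).real

# Compact form: MSE_0 + tr{Phi (W~^H W~ + I)^{-1}} with W~ = R_v^{-1/2} A W R^{1/2}.
w_t = rv_half_inv @ a_w @ r_half
phi = r_half_inv @ c @ c.conj().T @ r_half_inv.conj().T
mse0 = t0 - np.trace(c.conj().T @ np.linalg.inv(r) @ c).real
compact = mse0 + np.trace(phi @ np.linalg.inv(w_t.conj().T @ w_t + np.eye(n))).real

# Matrix inversion lemma: A^H (A A^H + I)^{-1} A = I - (A^H A + I)^{-1}.
lhs = w_t.conj().T @ np.linalg.inv(w_t @ w_t.conj().T + np.eye(m)) @ w_t
rhs = np.eye(n) - np.linalg.inv(w_t.conj().T @ w_t + np.eye(n))
lemma_gap = np.linalg.norm(lhs - rhs)

print(abs(direct - compact), lemma_gap)
```

Because the constant term does not depend on the relay matrix, minimizing the compact trace alone yields the same minimizer as minimizing the full expression.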

References

[1] A. Sendonaris, E. Erkip, and B. Aazhang, "User cooperation diversity, Part I, II," IEEE Trans. Commun., vol. 51, pp. 1927-1948, Nov. 2003.

[2] J. N. Laneman and G. W. Wornell, "Distributed space-time-coded protocols for exploiting cooperative diversity in wireless networks," IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2415-2425, Oct. 2003.

[3] S. Cui, A. J. Goldsmith, and A. Bahai, "Energy-efficiency of MIMO and cooperative MIMO techniques in sensor networks," IEEE J. Sel. Areas Commun., vol. 22, pp. 1089-1098, Aug. 2004.

[4] P. A. Anghel and M. Kaveh, "Exact symbol error probability of a cooperative network in a Rayleigh-fading environment," IEEE Trans. Wireless Commun., vol. 3, pp. 1416-1421, Sep. 2004.

[5] A. Ribeiro, X. Cai, and G. B. Giannakis, "Symbol error probabilities for general cooperative links," IEEE Trans. Wireless Commun., vol. 4, pp. 1264-1273, May 2005.

[6] J. Mietzner and P. A. Hoeher, "Distributed space-time codes for cooperative wireless networks in the presence of different propagation delays and path losses," in Proc. 3rd IEEE Sensor Array Multichannel Signal Processing Workshop, Barcelona, Spain, Jul. 2004, pp. 264-268.

[7] R. C. Palat, A. Annamalai, and J. H. Reed, "Upper bound on bit error rate for time synchronization errors in bandlimited distributed MIMO networks," in Proc. IEEE WCNC, Apr. 2006, vol. 4, pp. 2058-2063.

[8] Y. Mei, Y. Hua, A. Swami, and B. Daneshrad, "Combating synchronization errors in cooperative relays," in Proc. IEEE ICASSP, Mar. 2005, vol. 3, pp. 369-372.

[9] S. Jagannathan, H. Aghajan, and A. Goldsmith, "The effect of time synchronization errors on the performance of cooperative MISO systems," in Proc. IEEE GLOBECOM, Nov. 2004, pp. 102-107.

[10] R. A. Iltis and R. Cagley, "Channel estimation and carrier offset control for cooperative MIMO sensor networks," in Proc. IEEE 39th Asilomar Conf. Signals, Systems, Computing, Pacific Grove, CA, Nov. 2005, vol. 1, pp. 210-214.

[11] B. Azimi-Sadjadi and A. Mercado, "Diversity gain for cooperating nodes in multi-hop wireless networks," in Proc. IEEE VTC-Fall, Sep. 2004, vol. 2, pp. 1483-1487.

[12] R. Djapic, A.-J. Van der Veen, and L. Tong, "Synchronization and packet separation in wireless ad hoc networks by known modulus algorithm," IEEE J. Sel. Areas Commun., vol. 23, no. 1, pp. 51-64, Jan. 2005.

[13] P. Stoica and E. Lindskog, "Space-time block coding for channels with intersymbol interference," in Proc. IEEE 35th Asilomar Conf. Signals, Systems, Computing, Pacific Grove, CA, Nov. 2001, vol. 1, pp. 252-256.

[14] X. Li, "Space-time coded multi-transmission among distributed transmitters without perfect synchronization," IEEE Signal Process. Lett., vol. 11, no. 12, pp. 948-951, Jan. 2005.

[15] Y. Shang and X.-G. Xia, "Space-time trellis codes with asynchronous full diversity up to fractional symbol delays," IEEE Trans. Wireless Commun., vol. 7, no. 7, pp. 2473-2479, Jul. 2008.

[16] Z. Li and X.-G. Xia, "A simple Alamouti space-time transmission scheme for asynchronous cooperative systems," IEEE Signal Process. Lett., vol. 14, no. 11, pp. 804-807, 2008.

[17] Y. Jing and B. Hassibi, "Distributed space-time coding in wireless relay networks," IEEE Trans. Wireless Commun., vol. 5, no. 12, pp. 3524-3536, Dec. 2006.

[18] Y. Jing and H. Jafarkhani, "Using orthogonal and quasi-orthogonal designs in wireless relay networks," IEEE Trans. Inf. Theory, vol. 53, no. 11, pp. 4106-4118, Nov. 2007.

[19] F. Gao, T. Cui, and A. Nallanathan, "On channel estimation and optimal training design for amplify and forward relay networks," IEEE Wireless Commun., vol. 7, no. 5, pp. 1907-1916, May 2008.

[20] T. Cui and C. Tellambura, "Semiblind channel estimation and data detection for OFDM systems with optimal pilot design," IEEE Trans. Commun., vol. 55, no. 5, pp. 1053-1062, May 2007.

[21] T. Wang, Y. Yao, and G. B. Giannakis, "Non-coherent distributed space-time processing for multiuser cooperative transmission," in Proc. IEEE GLOBECOM, Nov. 2005, pp. 3738-3742.

[22] X. Li, Y.-C. Wu, and E. Serpedin, "Timing synchronization in decode-and-forward cooperative communication systems," IEEE Trans. Signal Process., vol. 57, no. 4, pp. 1444-1455, Apr. 2009.

[23] V. Havary-Nassab, S. Shahbazpanahi, A. Grami, and Z.-Q. Luo, "Distributed beamforming for relay networks based on second-order statistics of the channel state information," IEEE Trans. Signal Process., vol. 56, no. 9, pp. 4306-4316, Sep. 2008.

[24] N. Khajehnouri and A. H. Sayed, "Distributed MMSE relay strategies for wireless sensor networks," IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3336-3348, Jul. 2007.

[25] X. Tang and Y. Hua, "Optimal design of non-regenerative MIMO wireless relays," IEEE Trans. Wireless Commun., vol. 6, no. 4, pp. 1398-1407, Apr. 2007.

[26] Y. Jing and H. Jafarkhani, "Network beamforming using relays with perfect channel information," IEEE Trans. Inf. Theory, vol. 55, no. 6, pp. 2499-2517, Jun. 2009.

[27] W. Guan and H. W. Luo, "Joint MMSE transceiver design in non-regenerative MIMO relay systems," IEEE Commun. Lett., vol. 12, no. 7, pp. 517-519, Jul. 2008.

[28] A. S. Behbahani, R. Merched, and A. M. Eltawil, "Optimizations of a MIMO relay network," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 5062-5073, Oct. 2008.

[29] H. Sampath, P. Stoica, and A. Paulraj, "Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion," IEEE Trans. Commun., vol. 49, no. 12, pp. 2198-2206, Dec. 2001.

[30] M. Joham, W. Utschick, and J. A. Nossek, "Linear transmit processing in MIMO communication systems," IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2700-2712, Aug. 2005.

[31] M. Oerder and H. Meyr, "Digital filter and square timing recovery," IEEE Trans. Commun., vol. 36, pp. 605-612, May 1988.

[32] J. Riba, J. Sala, and G. Vazquez, "Conditional maximum likelihood timing recovery: Estimators and bounds," IEEE Trans. Signal Process., vol. 49, no. 4, pp. 835-850, Apr. 2001.

[33] Y.-C. Wu and E. Serpedin, "Design and analysis of feedforward symbol timing estimators based on the conditional maximum likelihood principle," IEEE Trans. Signal Process., vol. 53, no. 5, pp. 1908-1918, May 2005.

[34] M.-O. Pun, M. Morelli, and C.-C. J. Kuo, "Maximum-likelihood synchronization and channel estimation for OFDMA uplink transmissions," IEEE Trans. Commun., vol. 54, no. 4, pp. 726-736, Apr. 2006.

[35] S. M. Kay, Fundamentals of Statistical Signal Processing. Upper Saddle River, NJ: Prentice-Hall, 1998.

[36] E. G. Larsson, Y. Selen, and P. Stoica, "Adaptive equalization for frequency-selective channels of unknown length," IEEE Trans. Veh. Technol., vol. 54, pp. 568-579, Mar. 2005.

[37] M. Gastpar, "On capacity under receive and spatial spectrum-sharing constraints," IEEE Trans. Inf. Theory, vol. 53, no. 2, pp. 471-487, Feb. 2007.

[38] M. Gastpar, "On capacity under received-signal constraints," in Proc. Allerton Conf. Communications, Control, Computing, Monticello, IL, Oct. 2004, pp. 1322-1331.

[39] D. P. Palomar, J. M. Cioffi, and M. A. Lagunas, "Joint Tx-Rx beamforming design for multicarrier MIMO channels: A unified framework for convex optimization," IEEE Trans. Signal Process., vol. 51, no. 9, pp. 2381-2399, Sep. 2003.

Xiao Li received the B.Eng. degree (summa cum laude) from Sun Yat-sen (Zhongshan) University in 2007 and the M.Phil. degree from the Department of Electrical and Electronic Engineering at the University of Hong Kong in 2009. He is currently working towards the Ph.D. degree at the University of California, Davis.

His current research interests include statistical signal processing, optimization, communications and information theory, with applications in wireless networks and communication systems.

Chengwen Xing received the B.Eng. degree from Xidian University, Xi'an, China, in 2005. He is currently working towards the Ph.D. degree in the Department of Electrical and Electronic Engineering, the University of Hong Kong, Hong Kong.

His current research interests include statistical signal processing, convex optimization, multivariate statistics, combinatorial optimization, and cooperative communication systems.

Yik-Chung Wu received the B.Eng. (E.E.E.) degree and the M.Phil. degree from the University of Hong Kong (HKU) in 1998 and 2001, respectively, and the Ph.D. degree from the Texas A&M University in 2005. During his study at Texas A&M University, he was fully supported by the prestigious Croucher Foundation scholarship.

After receiving the Master's degree, he was a Research Assistant in the University of Hong Kong. From August 2005 to August 2006, he was a Member of Technical Staff with the Thomson Corporate Research, Princeton, NJ. Since September 2006, he has been an Assistant Professor with the University of Hong Kong. His research interests are in the general area of signal processing and communication systems, and in particular receiver algorithm design, synchronization techniques, channel estimation, and equalization.

Dr. Wu was a TPC member for the IEEE Vehicular Technology Conference Fall 2005, the IEEE GLOBECOM 2006 and 2008, and the International Conference on Communications (ICC) 2007 and 2008. He is currently an Associate Editor for the IEEE Communications Letters.

S. C. Chan (S'87-M'92) received the B.Sc. (Eng.) and Ph.D. degrees from the University of Hong Kong in 1986 and 1992, respectively.

He joined the City Polytechnic of Hong Kong in 1990 as an Assistant Lecturer and later as a University Lecturer. Since 1994, he has been with the Department of Electrical and Electronic Engineering, University of Hong Kong, where he is currently a Professor. He held visiting positions in Microsoft Corporation, Redmond, CA, Microsoft Research Asia, the University of Texas at Arlington, and the Nanyang Technological University. His main research area is signal processing and applications. His previous research covers fast transform algorithms, filter design and realization, multirate and array signal processing, adaptive and communications signal processing, data compression, and image-based rendering.

Dr. Chan has served on a number of professional and administrative committees. He is currently a member of the Digital Signal Processing Technical Committee of the IEEE Circuits and Systems Society, an Associate Editor for the IEEE Transactions on Circuits and Systems I and the Journal of Signal Processing Systems, and was the Chairman of the IEEE HK Chapter of Signal Processing. He was the Special Session Chairman of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2003 and is an organizing committee member of the IEEE 2010 ICIP.