# **RESEARCH PAPER**

# Computationally efficient real-time digital predistortion architectures for envelope tracking power amplifiers

PERE L. GILABERT AND GABRIEL MONTORO

This paper presents and discusses two real-time digital predistortion (DPD) architectures suitable for envelope tracking (ET) power amplifiers (PAs), aimed at a computationally efficient implementation in a field programmable gate array (FPGA) device. In ET systems, a shaping function makes it possible to modulate the supply voltage according to different criteria. One possibility is to use slower versions of the original RF signal's envelope in order to relax the slew-rate (SR) and bandwidth (BW) requirements of the envelope amplifier (EA) or drain modulator. The nonlinear distortion that arises when performing ET with a supply voltage signal that follows either the original or the slow envelope is presented, as well as the DPD function capable of compensating for these unwanted effects. Finally, two different approaches for efficiently implementing the DPD functions, one polynomial-based and one look-up table-based, are discussed.

Keywords: Power amplifiers and linearizers, Modeling, Simulation and characterizations of devices and circuits

Received 1 October 2012; Revised 18 January 2013; first published online 5 March 2013

### I. INTRODUCTION

Alternatives to the classical Cartesian transmitter that uses linear power amplifiers (PAs) with constant supply are being investigated to overcome its poor power efficiency with high peak-to-average power ratio (PAPR) signals. The Doherty architecture, for example, has been adopted for base stations, where several manufacturers (e.g. Freescale, NXP) are offering PAs with an average efficiency of up to 50% and even more [1]. However, other promising structures such as envelope elimination and restoration (EE&R) [2, 3], envelope tracking (ET), or polar transmitters with delta-sigma modulation [4] are still being considered as candidates to surpass the Doherty PA efficiency. From the implementation point of view, ET is a very attractive technique because it can be applied in conventional transmitters based on linear RF amplification topologies by simply substituting a dynamic supply for the classical static one.

One of the main constraints on the maximum efficiency that ET transmitters can achieve concerns the envelope modulator, or envelope amplifier (EA), since the overall efficiency of an ET architecture is the product of the PA and the EA power efficiencies. The envelope bandwidth (BW) is several times (theoretically, infinitely many times) the BW of the baseband complex modulated signal, which is critical when considering current wideband signals with high PAPR. There are already some companies, such as Nujira (www.nujira.com), MaXentric (www.maxentric.com), or Quantance (www.quantance.com), that offer ET solutions with average efficiencies above 60% for WCDMA and LTE signals.

Department of Signal Theory and Communications, Universitat Politècnica de Catalunya-BarcelonaTech, c/ Esteve Terradas 7, 08860 Castelldefels, Barcelona, Spain. **Corresponding author**: Pere L. Gilabert. Email: plgilabert@tsc.upc.edu

One of the main challenges of the EA is supplying the power required by the transistor at the same speed as the signal's envelope. In dual-band applications, for example, this becomes even more challenging since the combined envelope can present BWs of more than $5 \times$ the carrier separation. Therefore, in order to relax the EA requirements, some solutions have been proposed to reduce the BW and slew-rate (SR) of the original signal's envelope [5-8]. Unfortunately, using a slower version of the envelope to supply the PA drain not only degrades the overall efficiency but also gives rise to nonlinear distortion in the amplified signal. Despite the efficiency and linearity degradation, supplying the PA with a slower envelope can still be of interest in applications where BW must be traded off against efficiency because of the EA limitations. To compensate for the nonlinear distortion that arises when using the SR-limited version of the original envelope, a slow envelope-dependent digital predistorter (SED-DPD) is necessary [5, 9, 10].

The remainder of this paper is organized as follows. The BW versus efficiency trade-off in EAs is discussed in Section II. The design of the DPD required to compensate for the nonlinear distortion that arises when supplying the PA with a slower version of the signal's envelope is presented in Section III. Some field programmable gate array (FPGA)-oriented implementation architectures for real-time DPD are discussed in Section IV. Finally, conclusions are given in Section V.

### II. DYNAMIC SUPPLY OF THE PA WITH SLOW VERSIONS OF THE SIGNAL'S ENVELOPE

In an ET system (see Fig. 1), the supply voltage is dynamically adjusted to track the RF envelope at high instantaneous power. The supply voltage can be shaped according to different criteria. By means of a so-called shaping function, the shape of the supply voltage (which must still follow the instantaneous RF envelope in some way) can be adapted to achieve one of the following objectives: optimum efficiency, isogain [11-13], or SR- and BW-reduced shaping [14].

Focusing on this latter objective, two different approaches based on SR and BW reduction of the RF signal's envelope have shown that these strategies are suitable for adapting the envelope characteristics to the EA requirements or limitations, at the expense of efficiency degradation. On the one hand, the method proposed in [5, 6] limits the BW of the envelope iteratively, which may be an issue in real-time applications. On the other hand, the method proposed in [8] is a real-time algorithm whose resulting signal is limited in SR but not in BW, which makes its amplification challenging if only a switched-mode EA is considered, or requires a wide bandwidth if only a linear EA is considered. Therefore, in [14], the SR reduction algorithm proposed in [8] was modified to also restrict the BW of the resulting slow envelope. Moreover, owing to its simplicity, this algorithm is suitable for implementation in a digital signal processor. Fig. 2 shows the original RF signal's envelope, an SR-reduced version of the original envelope (SR reduced envelope, SRRE) and a BW-reduced version of the original envelope (BW reduced envelope, BWRE) in both the time and frequency domains. The parameter N (defined in [8]) is related to the maximum allowed increment in the signal's slope. For example, N = 100 corresponds to an SR reduction of 96% and a BW reduction of 64% with respect to the original signal's envelope. The results shown in Fig. 2 were extracted from the implementation of this algorithm on a Virtex-4 FPGA whose clock speed was set to 60 MHz.

Fig. 1. General block diagram of an ET PA with DPD.

Fig. 2. Waveforms and spectra of the envelope and its SR and BW limited versions [14].

Fig. 3. AM–AM characteristics of the PA when considering only three margins of  $E_s$  (left) and taking into account all possible values of  $E_s$  (right).
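To make the idea of a slew-rate-limited envelope concrete, the following sketch generates a slow envelope that never falls below the original one and whose sample-to-sample change is bounded. It is a generic, offline two-pass limiter written for illustration only: it is not the real-time algorithm of [8] nor its BW-restricted variant in [14], and the function name and the `max_step` parameter (maximum change per sample) are assumptions that are not tied to the parameter N used in the text.

```python
import numpy as np

def slew_limited_envelope(env, max_step):
    """Return s with s[n] >= env[n] and |s[n] - s[n-1]| <= max_step.

    Illustrative offline limiter (hypothetical helper, not the method of [8]).
    """
    s = np.asarray(env, dtype=float).copy()
    # Forward pass: after a peak, the envelope may only fall by max_step per sample.
    for n in range(1, len(s)):
        s[n] = max(s[n], s[n - 1] - max_step)
    # Backward pass: before a peak, start rising early enough to reach it.
    for n in range(len(s) - 2, -1, -1):
        s[n] = max(s[n], s[n + 1] - max_step)
    return s

# Example: envelope of a random complex baseband signal (synthetic data).
rng = np.random.default_rng(0)
u = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
envelope = np.abs(u)
slow = slew_limited_envelope(envelope, max_step=0.05)
```

Keeping the slow envelope above the original one avoids under-supplying the PA at signal peaks, at the cost of a higher average supply voltage, which is in line with the efficiency degradation discussed in this section.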

As reported in [15], the efficiency decays approximately linearly with the BW reduction, while it presents a logarithmic behavior with the SR reduction. As a consequence, when considering applications with high-BW signals (e.g. dual-band transmissions), it is possible to find a trade-off solution that meets both the SR and BW requirements of the EA while still keeping a reasonably good drain efficiency figure.

Unfortunately, using the SR- and BW-limited envelope (or simply *slow envelope*,  $E_s$ ) to supply the power transistor's drain generates a particular nonlinear distortion. Fig. 3 shows the AM–AM characteristics considering different margins of  $E_s$  values. As observed in Fig. 3, the ET PA shows a varying nonlinear gain because the slow envelope used to supply the PA and the RF input signal are not univocally related. For a given input, a range of different outputs is possible, because the output depends on the specific value of the dynamic power supply. The ET PA therefore presents a SED nonlinear behavior.

### III. DESIGN OF A REAL-TIME DPD FOR ET

The type of low-pass equivalent black-box behavioral model required to characterize the nonlinear distortion that arises when applying ET depends on the strategy (or shaping function) followed to supply the PA. On the one hand, if the PA drain voltage follows the same shape as the RF signal's envelope (despite being bounded at low-voltage levels), typical behavioral models such as the memory polynomial (MP) [7] can be used for DPD purposes. On the other hand, if the slow envelope is used to supply the PA, then the DPD has to include the information of the slow envelope in order to be capable of compensating for this type of nonlinear distortion.

For the case of using the original envelope, we can consider the implementation of a DPD based on the simple MP model. Following the notation in Fig. 1, the input–output relationship of the MP DPD is defined as

$$x[n] = \sum_{i=0}^{N} u[n - \tau_i] f_i(|u[n - \tau_i]|), \qquad (1)$$

where nonlinear functions  $f_i(\cdot)$  can be described by polynomials of order *P* 

$$f_{i}(|u[n-\tau_{i}]|) = \sum_{p=0}^{P} \gamma_{pi}|u[n-\tau_{i}]|^{p} = \gamma_{0i} + \gamma_{1i}|u[n-\tau_{i}]| + \dots + \gamma_{Pi}|u[n-\tau_{i}]|^{P}. \qquad (2)$$
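As a minimal sketch of how (1) and (2) could be evaluated offline, the NumPy function below computes the MP DPD output for a block of samples. The function name, the coefficient layout `gamma[p, i]` and the zero-padding of samples before the start of the block are assumptions made for illustration; the source does not prescribe a software implementation.

```python
import numpy as np

def mp_dpd(u, gamma, taus):
    """Memory polynomial DPD, x[n] = sum_i u[n - tau_i] * f_i(|u[n - tau_i]|).

    u     : complex input samples
    gamma : complex array of shape (P + 1, N + 1), gamma[p, i] as in (2)
    taus  : N + 1 non-negative integer delays tau_i
    """
    u = np.asarray(u, dtype=complex)
    x = np.zeros_like(u)
    for i, tau in enumerate(taus):
        # Delayed input; samples before the start of the block are assumed zero.
        ud = np.concatenate([np.zeros(tau, dtype=complex), u[:len(u) - tau]])
        # np.polyval expects the highest-order coefficient first.
        f_i = np.polyval(gamma[::-1, i], np.abs(ud))
        x += ud * f_i
    return x
```

With `gamma[0, 0] = 1` and all other coefficients set to zero, the function returns the input unchanged, which is a convenient sanity check before loading identified coefficients.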

As previously explained, when considering the slow envelope to supply the PA, the nonlinear distortion that appears cannot be compensated by simply using dynamic behavioral models such as the MP [10]. Therefore, in [9] a dynamic SED behavioral model is proposed to compensate for this type of nonlinear distortion. The input–output relationship of the SED-DPD is defined as

$$x[n] = \sum_{j=0}^{M} \sum_{q=0}^{Q} \sum_{i=0}^{N} \sum_{p=0}^{P} \gamma_{piqj} (E_s[n-\tau_j])^q u[n-\tau_i] |u[n-\tau_i]|^p, \qquad (3)$$

where  $E_s[n]$  is the SR-limited version of the original envelope,  $u[n]$  is the input signal, and  $\tau_j$  and  $\tau_i$  (with  $\tau_0 = 0$ ) are the most significant tap delays of the slow envelope and input signal, respectively, contributing to the characterization of memory effects.
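The four nested sums in (3) translate directly into code. The sketch below is a straightforward, unoptimized evaluation intended only to make the index structure explicit; the function names, the coefficient layout `gamma[p, i, q, j]` and the zero-padding of delayed samples are assumptions introduced here.

```python
import numpy as np

def delay(sig, tau):
    """Delay sig by tau samples, padding the start with zeros (assumed)."""
    sig = np.asarray(sig)
    return np.concatenate([np.zeros(tau, dtype=sig.dtype), sig[:len(sig) - tau]])

def sed_dpd_poly(u, es, gamma, taus_u, taus_e):
    """Polynomial SED-DPD following the structure of (3).

    gamma  : complex array of shape (P + 1, N + 1, Q + 1, M + 1)
    taus_u : N + 1 delays tau_i applied to the input signal u
    taus_e : M + 1 delays tau_j applied to the slow envelope es
    """
    u = np.asarray(u, dtype=complex)
    es = np.asarray(es, dtype=float)
    n_p, n_i, n_q, n_j = gamma.shape
    x = np.zeros_like(u)
    for j in range(n_j):
        ej = delay(es, taus_e[j])
        for q in range(n_q):
            for i in range(n_i):
                ui = delay(u, taus_u[i])
                mag = np.abs(ui)
                for p in range(n_p):
                    x += gamma[p, i, q, j] * ej**q * ui * mag**p
    return x
```

Setting all coefficients with q > 0 or j > 0 to zero removes the dependence on the slow envelope, and the expression collapses to the MP form of (1).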

Figure 4 shows linearized and unlinearized AM-AM characteristics of an ET PA when supplying the PA with the original envelope (MP DPD used) and with a slower version of the original envelope (SED-DPD used). The linearity performance of the SED-DPD in terms of out-of-band distortion compensation can be observed in Fig. 5. These particular results were measured on an instrumentation-based test-bed, schematically depicted in Fig. 1 and described in [10]. The device under test (DUT) is a Cree Inc. Evaluation Board CGH40006P-TB (GaN transistor) at 2 GHz operating at a mean output power of 28 dBm. For the sake of simplicity, a linear IC LT1210 was used as the envelope driver. The PAPR of the signals at baseband ranges from around 8 up to 11 dB, depending on the type of signal used (single-carrier M-QAM or OFDM). In the case of the SED-DPD, we used the following configuration: P = 9, Q = 2, M = 3 and N = 1 (alternatively, N = 0).



Fig. 4. Linearized and unlinearized AM-AM characteristics of an ET PA considering: (a) the original envelope (left), (b) a slow envelope (right).



Fig. 5. Unlinearized and linearized (dynamic SED-DPD) output power spectra of a single-carrier 16-QAM (left) and OFDM 16-QAM (right) signals, respectively.



Fig. 6. Block diagram of the MP DPD (left) and the SED-DPD (right).

### IV. FPGA IMPLEMENTATION ARCHITECTURES

The FPGA implementation of an MP DPD follows the structure presented in Fig. 6. Each branch represents one nonlinear function expressed by means of a polynomial expansion. To allow an accurate and efficient FPGA implementation of the MP DPD, it is important to minimize both the number of arithmetic operations (counting additions and multiplications) and the accumulated error inside the FPGA. Both issues can be addressed using Horner's rule, thereby limiting the number of consecutive complex multiplications to a maximum of two. Moreover, as presented in [16], in order to avoid a large variation in the magnitude of the polynomial coefficients (which requires a large number of bits to preserve the precision of the computation), it is possible to take the ratios of adjacent coefficients. As a consequence, with a reformulation of (2) according to Horner's rule, nonlinear functions  $f_i(\cdot)$  can be described as

$$f_{i}(|u[n-\tau_{i}]|) = \gamma_{0i} \left(1 + \frac{\gamma_{1i}}{\gamma_{0i}}|u[n-\tau_{i}]| \left(1 + \cdots + \frac{\gamma_{(P-1)i}}{\gamma_{(P-2)i}}|u[n-\tau_{i}]| \left(1 + \frac{\gamma_{Pi}}{\gamma_{(P-1)i}}|u[n-\tau_{i}]|\right)\right) \cdots\right) \qquad (4)$$

Therefore, taking into account the polynomial expression in (2), where  $\gamma_{pi} \in \mathbb{C}$ , it takes  $p + 1$  real multiplications for each monomial  $\gamma_{pi} |u[n - \tau_i]|^p$  and 2*P* real additions (*P* complex additions). Since  $\sum_{p=1}^{P}(p+1) = P(P+3)/2$ , this results in  $P(P+7)/2$  arithmetic operations for a polynomial of degree *P*. With the formulation in (4), in contrast, computation starts with the innermost parentheses using the coefficients of the highest degree monomials and works outward, each time multiplying the previous result by  $|u[n - \tau_i]|$  and adding the coefficient of the monomial of the next lower degree. It now takes 4*P* arithmetic operations for a polynomial of degree *P*, so for high polynomial orders Horner's algorithm is much more computationally efficient. Figure 7 shows the structure of the nonlinear branches of the MP DPD in Fig. 6. Alternatively, instead of using polynomials to describe nonlinear functions  $f_i(\cdot)$ , it would have been possible to use basic predistortion cells (BPCs) [17]. A BPC is composed of a RAM block acting as a look-up table (LUT), an address calculator and complex multipliers.
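The equivalence between the direct form (2) and the nested form (4), together with the operation counts quoted in the text, can be checked with a short floating-point sketch. All function names are illustrative assumptions; the ratio-based variant assumes that the coefficients  $\gamma_{0i}, \dots, \gamma_{(P-1)i}$  are nonzero, and the two counting helpers simply reproduce the formulas stated in the text rather than an independent hardware cost model.

```python
def poly_naive(coeffs, r):
    """Direct evaluation of sum_p coeffs[p] * r**p, as in (2)."""
    total = 0
    for p, c in enumerate(coeffs):
        total += c * r**p
    return total

def poly_horner(coeffs, r):
    """Horner's rule: start from the highest-degree coefficient and work outward."""
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = acc * r + c
    return acc

def poly_horner_ratios(coeffs, r):
    """Nested form of (4), using ratios of adjacent coefficients.

    Assumes coeffs[0] .. coeffs[P - 1] are nonzero.
    """
    acc = 1
    for p in range(len(coeffs) - 1, 0, -1):
        acc = 1 + (coeffs[p] / coeffs[p - 1]) * r * acc
    return coeffs[0] * acc

def ops_naive(P):
    """Operation count stated in the text for the direct form: P(P + 7)/2."""
    return P * (P + 7) // 2

def ops_horner(P):
    """Operation count stated in the text for Horner's rule: 4P."""
    return 4 * P

coeffs = [1 + 0.5j, -0.2 + 0.1j, 0.05 - 0.3j, 0.01j]  # made-up example values
r = 0.7                                                # stands for |u[n - tau_i]|
print(poly_naive(coeffs, r), poly_horner(coeffs, r), poly_horner_ratios(coeffs, r))
print(ops_naive(9), ops_horner(9))  # 72 and 36 for the order P = 9 used in Section III
```

For P = 1 the two counts coincide (4 operations each); the advantage of Horner's rule grows with the polynomial order because the direct count is quadratic in P.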

In order to implement the dynamic SED-DPD in an FPGA device, the polynomial model in (3) is expressed as a combination of several BPCs [9]:

$$\begin{aligned}
x[n] ={}& u[n] \times \underbrace{\sum_{p=0}^{P} \gamma_{p000} \times |u[n]|^{p}}_{G_{LUT}^{000}(\cdot)} + \cdots + (E_{s}[n])^{Q} \times u[n] \times \underbrace{\sum_{p=0}^{P} \gamma_{p0Q0} \times |u[n]|^{p}}_{G_{LUT}^{0Q0}(\cdot)} + \cdots \\
&+ u[n] \times \underbrace{\sum_{p=0}^{P} \gamma_{p00M} \times |u[n]|^{p}}_{G_{LUT}^{00M}(\cdot)} + \cdots + (E_{s}[n - \tau_{M}])^{Q} \times u[n] \times \underbrace{\sum_{p=0}^{P} \gamma_{p0QM} \times |u[n]|^{p}}_{G_{LUT}^{0QM}(\cdot)} + \cdots \\
&+ u[n - \tau_{N}] \times \underbrace{\sum_{p=0}^{P} \gamma_{pN00} \times |u[n - \tau_{N}]|^{p}}_{G_{LUT}^{N00}(\cdot)} + \cdots + (E_{s}[n])^{Q} \times u[n - \tau_{N}] \times \underbrace{\sum_{p=0}^{P} \gamma_{pNQ0} \times |u[n - \tau_{N}]|^{p}}_{G_{LUT}^{NQ0}(\cdot)} + \cdots \\
&+ u[n - \tau_{N}] \times \underbrace{\sum_{p=0}^{P} \gamma_{pN0M} \times |u[n - \tau_{N}]|^{p}}_{G_{LUT}^{N0M}(\cdot)} + \cdots + (E_{s}[n - \tau_{M}])^{Q} \times u[n - \tau_{N}] \times \underbrace{\sum_{p=0}^{P} \gamma_{pNQM} \times |u[n - \tau_{N}]|^{p}}_{G_{LUT}^{NQM}(\cdot)},
\end{aligned} \qquad (5)$$

Fig. 7. Structure of one of the branches of the MP DPD (see Fig. 6) using Horner's rule.

which yields the following expression of the SED-DPD:

$$x[n] = \sum_{j=0}^{M} \sum_{q=0}^{Q} \sum_{i=0}^{N} \left( E_s[n-\tau_j] \right)^q u[n-\tau_i] \times G_{LUT}^{iqj}(|u[n-\tau_i]|) \qquad (6)$$

with  $G_{LUT}^{iqj}$  being complex LUT gains.
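A floating-point sketch of (6) helps to show how the inner polynomial sums are replaced by table look-ups. The functions below fill each LUT by sampling the corresponding polynomial gain on a uniform grid of  $|u|$  values and then address the tables with the nearest grid point. The uniform grid, the nearest-entry address calculator, the `depth` and `max_mag` parameters and all function names are assumptions made for illustration; the source does not specify the LUT depth or the addressing scheme.

```python
import numpy as np

def delay(sig, tau):
    """Delay sig by tau samples, padding the start with zeros (assumed)."""
    sig = np.asarray(sig)
    return np.concatenate([np.zeros(tau, dtype=sig.dtype), sig[:len(sig) - tau]])

def build_luts(gamma, depth, max_mag):
    """Sample each polynomial gain on a uniform |u| grid.

    gamma : complex array of shape (P + 1, N + 1, Q + 1, M + 1)
    Returns luts[i, q, j, k] = sum_p gamma[p, i, q, j] * grid[k]**p.
    """
    grid = np.linspace(0.0, max_mag, depth)
    powers = grid[None, :] ** np.arange(gamma.shape[0])[:, None]  # (P + 1, depth)
    return np.einsum('piqj,pk->iqjk', gamma, powers)

def sed_dpd_lut(u, es, luts, taus_u, taus_e, max_mag):
    """LUT-based SED-DPD following the structure of (6)."""
    u = np.asarray(u, dtype=complex)
    es = np.asarray(es, dtype=float)
    n_i, n_q, n_j, depth = luts.shape
    x = np.zeros_like(u)
    for i in range(n_i):
        ui = delay(u, taus_u[i])
        # Address calculator: nearest LUT entry for |u[n - tau_i]|.
        addr = np.rint(np.abs(ui) / max_mag * (depth - 1)).astype(int)
        addr = np.clip(addr, 0, depth - 1)
        for j in range(n_j):
            ej = delay(es, taus_e[j])
            for q in range(n_q):
                # One basic predistortion cell: complex gain read from the LUT.
                x += ej**q * ui * luts[i, q, j, addr]
    return x
```

When  $|u|$  falls exactly on a grid point, this sketch reproduces the polynomial form (3); between grid points the nearest-entry look-up introduces a quantization error that shrinks as the LUT depth increases.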

Figure 6 shows the general block diagram of the SED-DPD architecture, where nonlinear functions  $f_{iqj}(\cdot)$  can be expressed as a combination of BPCs. The number of BPCs forming this SED-DPD is  $(Q + 1)(N + 1)(M + 1)$ . This structure requires fewer arithmetic operations than using polynomials; however, it consumes more memory resources.
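As a worked example of this trade-off, the arithmetic below plugs the SED-DPD configuration reported in Section III ( $P = 9$ ,  $Q = 2$ ,  $M = 3$ ,  $N = 1$ ) into the BPC count. The LUT depth  $L$  is a symbol introduced here for illustration only, since the text does not state the depth that was used.

$$\begin{aligned}
\#\,\text{BPCs} &= (Q+1)(N+1)(M+1) = 3 \times 2 \times 4 = 24, \\
\#\,\text{polynomial coefficients in (3)} &= (P+1)(Q+1)(N+1)(M+1) = 10 \times 24 = 240, \\
\#\,\text{stored complex LUT gains} &= 24\,L .
\end{aligned}$$

Under these assumptions, the LUT form stores more complex values than the polynomial form whenever  $L > P + 1 = 10$ , which is consistent with the memory cost noted above.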

Figure 8 shows the basic structure of a BPC, in which a dual-port RAM, with two independent sets of ports for simultaneous reading and writing, is used to allow the complex LUT gains to be updated continuously without interrupting normal data transmission. Thanks to this LUT-based architecture, it is possible to perform continuous adaptation of the DPD function by means of the least mean squares (LMS) algorithm [17].
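To illustrate what a per-entry LMS update of a complex LUT gain can look like, the sketch below adapts a single table so that  $y[n] \approx x[n]\,G[k]$ , where  $k$  is the address derived from  $|x[n]|$ . This is a generic textbook LMS recursion written under assumed conventions (nearest-entry addressing, step size `mu`, and an identification setup with known input/output pairs); it is not the adaptation scheme of [17], whose details are not given in the text.

```python
import numpy as np

def lut_address(mag, depth, max_mag):
    """Nearest LUT entry for a given magnitude (assumed addressing scheme)."""
    return min(int(np.rint(mag / max_mag * (depth - 1))), depth - 1)

def lms_lut_update(lut, x, y, mu, max_mag):
    """Sample-by-sample LMS update of a complex gain LUT, in place.

    For each pair (x[n], y[n]), only the addressed entry is modified:
        err = y[n] - x[n] * lut[k]
        lut[k] += mu * err * conj(x[n])
    """
    depth = len(lut)
    for xn, yn in zip(x, y):
        k = lut_address(abs(xn), depth, max_mag)
        err = yn - xn * lut[k]
        lut[k] += mu * err * np.conj(xn)
    return lut
```

Because each sample touches a single LUT entry, an update of this kind maps naturally onto the write port of the dual-port RAM while the read port keeps serving the predistorter. In this sketch, stability requires  $0 < \mu\,|x[n]|^2 < 2$ .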

### V. CONCLUSION

In this paper, we have presented and discussed two computationally efficient design strategies for implementing real-time DPD in an FPGA device when considering ET PAs. As discussed throughout the paper, when slow versions of the original envelope are used to perform ET, the nonlinear distortion that appears has to be compensated using DPD architectures that depend not only on the input data and its memory, but also on the drain voltage signal (slow envelope) and its memory. Two efficient architectures to allow real-time FPGA implementation of the DPD function have been presented. One solution is based on polynomials and the other is based on LUTs. The trade-off between these two configurations is the number of arithmetic operations versus the memory resource requirements. In any case, the linearization performance of both architectures has been validated in several papers [9, 16]. Finally, another key issue toward a computationally efficient FPGA implementation is the design of the identification/adaptation process. One possibility is the use of LMS-based solutions as in [17], where the coefficients (or complex LUT gains) are continuously updated. Alternatively, if more complex least-squares-type algorithms are considered, the coefficient update procedure can be relocated to embedded software running on a MicroBlaze soft processor core as in [18].

Fig. 8. Basic architecture of a BPC forming the SED-DPD (see Fig. 6).

#### ACKNOWLEDGEMENT

This work was supported by the Spanish Government (MINECO) under project TEC2011-29126-C03-02.

#### REFERENCES

- [1] Kim, B.; Kim, I.; Moon, J.: Advanced Doherty architecture. IEEE Microw. Mag., 11 (2010), 72–86.
- [2] Raab, F.; Sigmon, B.; Myers, R.; Jackson, R.: L-band transmitter using Kahn EER technique. IEEE Trans. Microw. Theory Tech., 46 (1998), 2220–2225.
- [3] Wang, F. et al.: An improved power-added efficiency 19 dBm hybrid envelope elimination and restoration power amplifier for 802.11g WLAN applications. IEEE Trans. Microw. Theory Tech., 54 (2006), 4086–4099.
- [4] Taromaru, M.; Ando, N.; Kodera, T.; Yano, K.: An EER transmitter architecture with burst-width envelope modulation based on triangle wave comparison PWM, in Proc. IEEE Int. Symp. Personal, Indoor and Mobile Radio Communications (PIMRC'07), Athens, Greece, September 2007, 1–5.
- [5] Jeong, J.; Kimball, D.F.; Kwak, M.; Hsia, C.; Draxler, P.; Asbeck, P.M.: Wideband envelope tracking power amplifiers with reduced bandwidth power supply waveform and adaptive digital predistortion techniques. IEEE Trans. Microw. Theory Tech., 57 (2009), 3307–3314.
- [6] Mustafa, A.K.; Bassoo, V.; Faulkner, M.: Reducing drive signal bandwidths of EER microwave power amplifiers, in IEEE MTT Int. Microwave Symp. (IMS 2009), Boston, USA.
- [7] Kim, J.; Konstantinou, K.: Digital predistortion of wideband signals based on power amplifier model with memory. Electron. Lett., 37 (23) (2001), 1417–1418.
- [8] Montoro, G.; Gilabert, P.L.; Bertran, E.; Berenguer, J.: A method for real-time generation of slew-rate limited envelopes in envelope tracking transmitters, in IEEE Int. Microwave Series on RF Front-ends for Software Defined and Cognitive Radio Solutions, Aveiro, Portugal, February 2010, 1–4.
- [9] Gilabert, P.L.; Montoro, G.: Look-up table implementation of a slow envelope dependent digital predistorter for envelope tracking power amplifiers. IEEE Microw. Wirel. Compon. Lett., 22 (2) (2012), 97–99.
- [10] Montoro, G.; Gilabert, P.L.; Berenguer, J.; Bertran, E.: Digital predistortion of envelope tracking amplifiers driven by slew-rate limited envelopes, in IEEE Int. Microwave Symp. (IMS'2011), Baltimore, USA, June 2011.

- [11] Wimpenny, G.: Envelope Tracking PA Characterisation. White Paper. Open ET Alliance (http://www.open-et.org). November 2011.
- [12] Hanington, G.; Chen, P.-F.; Asbeck, P.M.; Larson, L.E.: High-efficiency power amplifier using dynamic power-supply voltage for CDMA applications. IEEE Trans. Microw. Theory Tech., 47 (1999), 1471–1476.
- [13] Hoversten, J.; Schafer, S.; Roberg, M.; Norris, M.; Maksimovic, D.; Popovic, Z.: Codesign of PA, supply, and signal processing for linear supply-modulated RF transmitters. IEEE Trans. Microw. Theory Tech., 60 (2012), 2010–2020.
- [14] Vizarreta, P.; Montoro, G.; Gilabert, P.L.: Hybrid envelope amplifier for envelope tracking power amplifier transmitters, in European Microwave Conf. (EuMC'12), Amsterdam, Holland, November 2012, 1–4.
- [15] Gilabert, P.L.; Montoro, G.; Vizarreta, P.: Slew-rate and efficiency trade-off in slow envelope tracking power amplifiers, in German Microwave Conf. (GeMiC'12), Ilmenau, Germany, March 2012, 1–4.
- [16] Mrabet, N.; Mohammad, I.; Mkadem, F.; Rebai, C.; Boumaiza, S.: Optimized hardware for polynomial digital predistortion system implementation, in IEEE Topical Conf. on Power Amplifiers for Wireless and Radio Applications (PAWR), Santa Clara, USA, January 2012, 81–84.
- [17] Gilabert, P.L.; Montoro, G.; Bertran, E.: FPGA implementation of a real-time NARMA-based digital adaptive predistorter. IEEE Trans. Circuits Syst. II, 57 (2011), 402–406.
- [18] Julius, S.; Dinh, A.: Evaluation of a digital predistortion on FPGA for power amplifier linearization, in IEEE Canadian Conf. on Electrical and Computer Eng. (CCECE), Montreal, Canada, May 2011, 660–664.



**Pere L. Gilabert** received the degree in Telecommunication Engineering from UPC in 2002, and he developed his Master Thesis at the University of Rome "La Sapienza" with an exchange grant. He joined the department of TSC in 2003 and received his Ph.D., awarded with the Extraordinary Doctoral Prize, from the UPC in 2008. He is an associate professor at UPC where his research activity is in the field of linearization techniques and highly efficient transmitter architectures.



**Gabriel Montoro** received the M.S. degree in Telecommunication Engineering in 1990 and his Ph.D. degree in 1996, both from UPC. He joined the department of TSC in 1991, where he is currently an associate professor. His first research works were done in the area of adaptive control, and now his main research interest is in the use of signal processing strategies for efficiency improvement in communications systems.