**4. Traditional precoding**

In this section, we introduce traditional precoding and discuss its working mechanism and design principles. We first present the preliminaries that underpin the discussion. We then introduce linear block-level precoding schemes with closed-form solutions, namely maximum ratio transmission (MRT), zero-forcing (ZF), and regularized zero-forcing (RZF). Finally, we discuss traditional non-linear symbol-level precoding, including Tomlinson-Harashima precoding (THP) and vector perturbation (VP).

#### **4.1 Preliminaries on precoding**

First, we will introduce the preliminaries of the precoding process in the downlink MIMO system, as the basis of further discussion.

Without loss of generality, we mainly consider a downlink MU-MISO system, where $K$ single-antenna users are simultaneously served by a common base station with $N\_t$ transmit antennas. Considering that users are generally separated spatially, the BS needs to employ CSI-based signal processing before transmission so that the destructive effects of channel fading and inter-user interference are suppressed as much as possible. This is the initial motivation for precoding. Mathematically, the precoding process can be expressed as

$$\mathbf{x} = \sum\_{k=1}^{K} \mathbf{w}\_k s\_k = \mathbf{W} \mathbf{s}, \tag{8}$$

where $\mathbf{w}\_k \in \mathbb{C}^{N\_t \times 1}$ denotes the $k$-th user's precoding vector and $s\_k$ is the $k$-th user's data symbol, which is drawn from a specific modulation constellation. With the precoding matrix $\mathbf{W} = [\mathbf{w}\_1, \mathbf{w}\_2, \dots, \mathbf{w}\_K] \in \mathbb{C}^{N\_t \times K}$ and the data symbol vector $\mathbf{s} = [s\_1, s\_2, \dots, s\_K]^{\mathrm{T}} \in \mathbb{C}^{K \times 1}$, the received signal for the $k$-th user can be expressed as

$$y\_k = \mathbf{h}\_k^T \mathbf{x} + n\_k = \mathbf{h}\_k^T \mathbf{W} \mathbf{s} + n\_k,\tag{9}$$

where $y\_k$ is the received signal for the $k$-th user, $\mathbf{h}\_k \in \mathbb{C}^{N\_t \times 1}$ is the complex channel vector between the BS and the $k$-th user, and $n\_k \sim \mathcal{CN}(0, \sigma^2)$ is the additive Gaussian noise with zero mean and variance $\sigma^2$. Stacking the received signals of all $K$ users, the transmission process can be given as

$$\mathbf{y} = \mathbf{H}\mathbf{W}\mathbf{s} + \mathbf{n},\tag{10}$$

where $\mathbf{y} \in \mathbb{C}^{K \times 1}$ denotes the received signal vector, $\mathbf{H} = [\mathbf{h}\_1, \mathbf{h}\_2, \dots, \mathbf{h}\_K]^{\mathrm{T}} \in \mathbb{C}^{K \times N\_t}$ denotes the channel matrix, and $\mathbf{n} \in \mathbb{C}^{K \times 1}$ denotes the additive noise vector.
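As a minimal numerical sketch of the signal model in (8) to (10), the snippet below draws a random channel, a placeholder precoder, and QPSK symbols, and checks that the per-user and stacked forms agree. The sizes, the Rayleigh-fading channel, the noise power, and all variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
K, Nt = 4, 8        # number of users and transmit antennas (assumed sizes)
sigma2 = 0.1        # noise power (assumed value)

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(K, Nt)   # row k of H is h_k^T (assumed Rayleigh fading)
W = crandn(Nt, K)   # placeholder precoder; column k is w_k
s = (rng.choice([-1, 1], K) + 1j * rng.choice([-1, 1], K)) / np.sqrt(2)  # QPSK
n = np.sqrt(sigma2) * crandn(K)

# Eq. (8): sum of per-user contributions equals the matrix form W s
x_sum = sum(W[:, k] * s[k] for k in range(K))
x = W @ s

# Eq. (9) evaluated user by user, and Eq. (10) in stacked form
y_per_user = np.array([H[k] @ x + n[k] for k in range(K)])
y = H @ W @ s + n
```

Note that `H[k] @ x` computes the plain (non-conjugated) inner product, matching the transpose in (9).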

In traditional communication systems, the presence of interference can significantly degrade the quality of the received signal. This is particularly true in multi-user systems, where the signals intended for different users are superimposed over the spatial channel and interfere with each other, leading to reduced signal quality at each receiver.

The insight of precoding is to design the precoding matrix **W** such that the received signal **y** can approach the data symbol vector **s** as much as possible. In the following subsections, we will introduce linear closed-form block-level precoding, which is a classical type of precoding.

#### **4.2 Linear closed-form precoding**

The classical linear block-level precoding schemes have been widely used in practical engineering systems since they can ensure satisfactory communication performance with low computational complexity. In this subsection, we will mainly discuss the specific linear closed-form precoding, including MRT, ZF, and RZF, to show the principle of precoding design and the physical mechanism of the precoding effect.

Specifically, the precoding matrix of **maximum ratio transmission (MRT)** can be given as [4]

$$\mathbf{W}\_{\text{MRT}} = \frac{1}{f\_{\text{MRT}}} \cdot \mathbf{H}^{\text{H}} = \sqrt{\frac{P\_0}{\text{tr}\{\mathbf{H}\mathbf{H}^{\text{H}}\}}} \mathbf{H}^{\text{H}},\tag{11}$$

where $f\_{\text{MRT}} = \sqrt{\text{tr}\{\mathbf{H}\mathbf{H}^{\text{H}}\} / P\_0}$ denotes the normalization factor that enforces the transmit power constraint, and $P\_0$ denotes the total transmit power. Because MRT maximizes the signal gain at the intended user without suppressing inter-user interference, its performance is promising in noise-limited scenarios (low-SNR regimes or large-scale MIMO scenarios), while it is limited in interference-limited scenarios.
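The MRT precoder in (11) can be sketched in a few lines. The function name `mrt_precoder`, the random channel, and the sizes are assumptions made for illustration; the power check further assumes unit-energy, uncorrelated data symbols, so that the average transmit power equals the squared Frobenius norm of the precoding matrix.

```python
import numpy as np

def mrt_precoder(H, P0):
    """Eq. (11): W = H^H / f_MRT with f_MRT = sqrt(tr(H H^H) / P0)."""
    f_mrt = np.sqrt(np.trace(H @ H.conj().T).real / P0)
    return H.conj().T / f_mrt

rng = np.random.default_rng(1)
K, Nt, P0 = 4, 8, 1.0   # assumed sizes and power budget
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

W = mrt_precoder(H, P0)
total_power = np.linalg.norm(W, "fro") ** 2   # tr(W W^H)
effective = H @ W                              # K x K effective channel

# Diagonal entries carry the desired signal gain ||h_k||^2 / f_MRT;
# off-diagonal entries are the residual inter-user interference.
signal_gain = np.diag(effective).real
interference = effective - np.diag(np.diag(effective))
```

The off-diagonal entries of `effective` are generally non-zero, which is the inter-user interference that MRT leaves untreated.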

**Zero-Forcing (ZF)** precoding is another classical precoding method that has been extensively used in practical applications [8]. By employing the Moore-Penrose pseudo-inverse of the channel matrix $\mathbf{H}$ as the precoding matrix, ZF precoding makes the users' effective channels mutually orthogonal, so that inter-user interference is eliminated. The ZF precoding matrix can be expressed as

$$\mathbf{W}\_{\text{ZF}} = \frac{1}{f\_{\text{ZF}}} \cdot \mathbf{H}^{\text{H}} \left(\mathbf{H} \mathbf{H}^{\text{H}}\right)^{-1} = \sqrt{\frac{P\_{0}}{\text{tr}\left\{\left(\mathbf{H} \mathbf{H}^{\text{H}}\right)^{-1}\right\}}} \mathbf{H}^{\text{H}} \left(\mathbf{H} \mathbf{H}^{\text{H}}\right)^{-1}, \quad N\_{t} \ge K,\tag{12}$$

where $f\_{\text{ZF}} = \sqrt{\text{tr}\{(\mathbf{H}\mathbf{H}^{\text{H}})^{-1}\} / P\_0}$ denotes the normalization factor for ZF precoding. ZF precoding achieves improved performance over MRT in the high-SNR regime because it creates orthogonal effective channels among all users and thereby fully eliminates inter-user interference. Owing to its low computational complexity, ZF precoding has been widely used in practical engineering systems. However, its noise amplification effect limits its performance, especially in the low-SNR regime, a limitation that RZF precoding mitigates.
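The following sketch evaluates (12) and checks that the effective channel becomes a scaled identity matrix. The function name `zf_precoder` and the test channel are illustrative assumptions; the explicit inverse requires $N\_t \ge K$ and a full-row-rank channel.

```python
import numpy as np

def zf_precoder(H, P0):
    """Eq. (12): W = H^H (H H^H)^{-1} / f_ZF; needs Nt >= K and full row rank."""
    gram_inv = np.linalg.inv(H @ H.conj().T)
    f_zf = np.sqrt(np.trace(gram_inv).real / P0)
    return H.conj().T @ gram_inv / f_zf, f_zf

rng = np.random.default_rng(2)
K, Nt, P0 = 4, 8, 1.0   # assumed sizes and power budget
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

W, f_zf = zf_precoder(H, P0)
effective = H @ W                              # equals I / f_ZF up to rounding
total_power = np.linalg.norm(W, "fro") ** 2    # tr(W W^H)
```

Every user sees the same gain $1/f\_{\text{ZF}}$, so an ill-conditioned channel (large $f\_{\text{ZF}}$) lowers the received signal level relative to the noise, which is the noise amplification effect discussed above.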

By introducing a regularization factor to handle the noise amplification effect, **regularized zero-forcing (RZF)** precoding can further improve the performance of ZF precoding [9]. The RZF precoding matrix can be given by

$$\begin{aligned} \mathbf{W}\_{\text{RZF}} &= \frac{1}{f\_{\text{RZF}}} \cdot \mathbf{H}^H \left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1} \\ &= \sqrt{\frac{P\_0}{\text{tr}\left\{\left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1}\mathbf{H}\mathbf{H}^H \left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1}\right\}}} \mathbf{H}^H \left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1}, \end{aligned} \tag{13}$$

where $f\_{\text{RZF}} = \sqrt{\text{tr}\left\{\left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1}\mathbf{H}\mathbf{H}^H \left(\mathbf{H}\mathbf{H}^H + \alpha \cdot \mathbf{I}\right)^{-1}\right\} / P\_0}$ denotes the normalization factor for RZF precoding, and $\alpha$ denotes the regularization factor, whose optimal value is $\alpha^{\ast} = K\sigma^2$.
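A short sketch of (13) is given below, together with a check of its two limiting cases: a very small $\alpha$ recovers the interference-free ZF behavior, and a very large $\alpha$ recovers the MRT precoder of (11). The function name `rzf_precoder`, the channel, and the numerical values are assumptions for illustration only.

```python
import numpy as np

def rzf_precoder(H, P0, alpha):
    """Eq. (13): W = H^H (H H^H + alpha I)^{-1} / f_RZF."""
    K = H.shape[0]
    gram = H @ H.conj().T
    A = np.linalg.inv(gram + alpha * np.eye(K))
    f_rzf = np.sqrt(np.trace(A @ gram @ A).real / P0)
    return H.conj().T @ A / f_rzf

rng = np.random.default_rng(3)
K, Nt, P0, sigma2 = 4, 8, 1.0, 0.1   # assumed sizes, power budget, noise power
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

W_rzf = rzf_precoder(H, P0, alpha=K * sigma2)   # alpha* = K sigma^2 from the text
W_small = rzf_precoder(H, P0, alpha=1e-12)      # approaches ZF
W_large = rzf_precoder(H, P0, alpha=1e9)        # approaches MRT

# Reference MRT precoder from Eq. (11), used only for the comparison
W_mrt = H.conj().T / np.sqrt(np.trace(H @ H.conj().T).real / P0)

HW_small = H @ W_small
leak_small = HW_small - np.diag(np.diag(HW_small))   # residual interference
```

In this view, the regularization factor moves the precoder between interference suppression (ZF) and signal-gain maximization (MRT).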

#### **4.3 Non-linear symbol-level precoding**

Compared with linear precoding, non-linear precoding can achieve better communication performance at the cost of higher computational complexity. Based on both the CSI and the data symbols, non-linear precoding manipulates the transmitted signal at the symbol level, so the transmitted signal is no longer a linearly weighted combination of the data symbols. In this subsection, we will introduce classical non-linear precoding schemes to show their working mechanism.

**Dirty Paper Coding (DPC)** is able to reduce the destructive effect of inter-user interference and achieve the channel capacity in MIMO systems [10]. However, in addition to assuming that perfect CSI and the interference information are available at the transmitter, the capacity-achieving DPC requires infinite-length codewords and a high-complexity search algorithm, which limits its application in practical systems.

Considering the high complexity of DPC, **Tomlinson-Harashima Precoding (THP)** has been proposed as an alternative near-capacity scheme whose computational complexity is acceptable in practice. The basic idea of THP is to pre-distort the symbols before they are transmitted over the communication channel [11]. This pre-distortion is achieved by adding a feedback loop to the transmitter, which modifies each symbol based on the previously encoded symbols. The feedback loop effectively pre-cancels the distortion introduced by the communication channel, leading to a more reliable signal at the receiver. **Figure 2** shows the architecture of the THP precoding system.

Specifically, THP first decomposes the channel matrix into

$$\mathbf{H} = \mathbf{L} \mathbf{F}^{\text{H}},\tag{14}$$

with a lower-triangular matrix **L** and a unitary matrix **F**. Based on that, the transmitted signal vector **x** for THP can be further expressed as

$$\mathbf{x}\_{\rm THP} = \mathbf{F}\tilde{\mathbf{x}}\_{\rm THP},\tag{15}$$

#### **Figure 2.**

*The geometrical representation of THP.*

where $\tilde{\mathbf{x}}\_{\rm THP}$ can be obtained element by element as

$$[\tilde{\mathbf{x}}\_{\text{THP}}]\_k = \text{mod}\_{\tau} \left\{ s\_k - \sum\_{l=1}^{k-1} [\mathbf{B}]\_{k,l} \, [\tilde{\mathbf{x}}\_{\text{THP}}]\_l \right\}, \forall k \in \{1, 2, \dots, K\}. \tag{16}$$

Here, $\text{mod}\_{\tau}\{x\}$ denotes a complex modulo function, given by

$$\text{mod}\_{\tau}\{x\} = \left(\Re(x) - \tau \cdot \left\lfloor \frac{\Re(x) + \tau/2}{\tau} \right\rfloor\right) + j \left(\Im(x) - \tau \cdot \left\lfloor \frac{\Im(x) + \tau/2}{\tau} \right\rfloor\right),\tag{17}$$

where $\tau$ denotes the modulo basis and $\lfloor \cdot \rfloor$ denotes the floor function. Based on the analysis above, the effective THP channel can be expressed as

$$\mathbf{B} = \mathbf{G} \mathbf{H} \mathbf{F},\tag{18}$$

where **G** is a diagonal matrix that contains the complex scaling gain of each user, given by the inverse of the corresponding diagonal entry of **L**, i.e.,

$$g\_k = [\mathbf{G}]\_{k,k} = \frac{1}{[\mathbf{L}]\_{k,k}}.\tag{19}$$

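Since $\mathbf{H}\mathbf{F} = \mathbf{L}$ follows from (14), the effective channel $\mathbf{B} = \mathbf{G}\mathbf{L}$ is lower triangular with a unit diagonal, which is why the successive subtraction in (16) only involves the previously encoded entries $l < k$. At the receiver side, the scaling compensation operation and the modulo operation are also required prior to demodulation.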
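The THP steps in (14) to (19) can be sketched as follows, under several stated assumptions: the decomposition (14) is obtained from a reduced QR factorization of $\mathbf{H}^{\text{H}}$, so **F** has orthonormal columns (it is unitary when $N\_t = K$); the symbols are unnormalized QPSK points with real and imaginary parts in $\{\pm 1\}$ and $\tau = 4$; no transmit power normalization is applied; and the channel is noise-free so that exact recovery can be checked. The names `mod_tau` and `thp_precode` are hypothetical.

```python
import numpy as np

def mod_tau(x, tau):
    """Eq. (17): complex modulo mapping real and imaginary parts to [-tau/2, tau/2)."""
    re = x.real - tau * np.floor((x.real + tau / 2) / tau)
    im = x.imag - tau * np.floor((x.imag + tau / 2) / tau)
    return re + 1j * im

def thp_precode(H, s, tau):
    # Eq. (14) via QR of H^H: H^H = Q R  =>  H = R^H Q^H = L F^H
    Q, R = np.linalg.qr(H.conj().T)       # reduced QR; Q is Nt x K
    L, F = R.conj().T, Q                  # L is lower triangular
    G = np.diag(1.0 / np.diag(L))         # Eq. (19)
    B = G @ H @ F                         # Eq. (18)
    K = len(s)
    x_tilde = np.zeros(K, dtype=complex)
    for k in range(K):                    # Eq. (16): successive pre-cancellation
        interference = B[k, :k] @ x_tilde[:k]
        x_tilde[k] = mod_tau(s[k] - interference, tau)
    return F @ x_tilde, G, B, x_tilde     # Eq. (15)

rng = np.random.default_rng(4)
K, Nt, tau = 4, 6, 4.0                    # assumed sizes and modulo basis
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)
s = rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)   # QPSK

x, G, B, x_tilde = thp_precode(H, s, tau)
y = H @ x                                 # noise-free reception (assumption)
s_hat = mod_tau(np.diag(G) * y, tau)      # scaling compensation, then modulo
```

In the noise-free case `s_hat` coincides with `s`, because the scaled received signal differs from the data symbols only by integer multiples of $\tau$, which the receiver-side modulo removes.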

Considering that the performance of ZF precoding is mainly limited by its noise amplification effect, **Vector Perturbation (VP)** precoding has been proposed as an improvement [12]. Building on ZF precoding, VP precoding adds a perturbation vector to the symbol vector, so that the resulting transmitted signal aligns better with the main eigenvector direction of the channel inverse matrix. This reduces the required normalization factor and therefore the noise amplification effect of ZF. Compared to ZF, VP can thus achieve significant performance gains. To be more specific, the VP precoding process can be expressed as

$$\mathbf{x}\_{\rm VP} = \frac{1}{f\_{\rm VP}} \cdot \mathbf{H}^{\rm H} \left(\mathbf{H} \mathbf{H}^{\rm H}\right)^{-1} (\mathbf{s} + \tau \cdot \mathbf{l}),\tag{20}$$

where $\tau = 2\lvert c \rvert\_{\max} + \Delta$ denotes the modulo basis corresponding to the modulation level, $\lvert c \rvert\_{\max}$ denotes the modulus of the constellation point with the maximum amplitude, and $\Delta$ is the minimum distance among the constellation points. $\mathbf{l} \in \mathbb{C}\mathbb{Z}^{K \times 1}$ denotes the complex integer perturbation vector, given as

$$\mathbf{l} = \underset{\mathbf{l} \in \mathbb{C}\mathbb{Z}^{K \times 1}}{\arg\min} \left\| \mathbf{H}^{\mathrm{H}} \left( \mathbf{H} \mathbf{H}^{\mathrm{H}} \right)^{-1} (\mathbf{s} + \tau \cdot \mathbf{l}) \right\|\_{2}^{2},\tag{21}$$

which can be obtained by the sphere decoder. With the normalization factor $f\_{\rm VP}$ chosen such that $\mathbf{x}\_{\rm VP}$ satisfies the transmit power constraint, the received signal of the $k$-th user can be expressed as

$$y\_k = \mathbf{h}\_k^T \mathbf{x}\_{\rm VP} + n\_k = \frac{1}{f\_{\rm VP}} (s\_k + \tau l\_k) + n\_k,\tag{22}$$

where $l\_k$ denotes the $k$-th element of the perturbation vector $\mathbf{l}$. To eliminate the perturbation component $\tau l\_k$ at the receiver side, the receiver applies the modulo operation after the power compensation, as shown below:

$$\begin{split} r\_k &= \text{mod}\_{\tau} \{ f\_{\text{VP}} y\_k \} \\ &= \text{mod}\_{\tau} \{ s\_k + \tau l\_k + f\_{\text{VP}} n\_k \} \\ &= s\_k + f\_{\text{VP}} \hat{n}\_k, \end{split} \tag{23}$$

where $\hat{n}\_k$ denotes the effective noise of the $k$-th user.
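To illustrate (20) to (23), the sketch below replaces the sphere decoder with an exhaustive search over a small set of Gaussian integers, which is only practical for very small $K$ and is not guaranteed to find the global minimizer of (21). Further assumptions: unnormalized QPSK with $\tau = 4$, a noise-free channel, and a normalization factor taken as $f\_{\rm VP} = \lVert \mathbf{H}^{\rm H}(\mathbf{H}\mathbf{H}^{\rm H})^{-1}(\mathbf{s} + \tau\mathbf{l}) \rVert\_2 / \sqrt{P\_0}$, which is one way to meet the power constraint per symbol vector. The names `mod_tau` and `vp_precode` are hypothetical.

```python
import itertools
import numpy as np

def mod_tau(x, tau):
    """Complex modulo of Eq. (17), applied to real and imaginary parts."""
    re = x.real - tau * np.floor((x.real + tau / 2) / tau)
    im = x.imag - tau * np.floor((x.imag + tau / 2) / tau)
    return re + 1j * im

def vp_precode(H, s, tau, P0, radius=1):
    K = H.shape[0]
    W_inv = H.conj().T @ np.linalg.inv(H @ H.conj().T)   # unnormalized ZF
    ints = range(-radius, radius + 1)
    candidates = [a + 1j * b for a in ints for b in ints]
    best_l, best_norm = None, np.inf
    # Exhaustive stand-in for the sphere decoder in Eq. (21) (assumption)
    for cand in itertools.product(candidates, repeat=K):
        l = np.array(cand)
        norm = np.linalg.norm(W_inv @ (s + tau * l)) ** 2
        if norm < best_norm:
            best_l, best_norm = l, norm
    f_vp = np.sqrt(best_norm / P0)        # assumed per-vector normalization
    x = W_inv @ (s + tau * best_l) / f_vp # Eq. (20)
    return x, f_vp, best_l

rng = np.random.default_rng(5)
K, Nt, tau, P0 = 3, 4, 4.0, 1.0           # assumed sizes and parameters
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)
s = rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)   # QPSK

x_vp, f_vp, l_opt = vp_precode(H, s, tau, P0)
y = H @ x_vp                              # Eq. (22) without noise
r = mod_tau(f_vp * y, tau)                # Eq. (23)
f_zf_like = np.linalg.norm(np.linalg.pinv(H) @ s) / np.sqrt(P0)    # l = 0 case
```

Because the zero perturbation is among the candidates, `f_vp` never exceeds the factor obtained without perturbation (`f_zf_like`), which is the mechanism by which VP reduces noise amplification.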
