**Abstract**

Under high-mobility scenarios, the traditional coherent demodulation schemes (CDS) have limited performance, because reference signals cannot effectively track the channel variations with an affordable overhead. As an alternative solution, noncoherent demodulation schemes (NCDS) based on differential modulation have been proposed. Even in the absence of reference signals, they are capable of outperforming the CDS with a reduced complexity. The literature on NCDS laid the theoretical foundations for simplified channel and signal models, often single-carrier and spatially uncorrelated flat-fading channels. This chapter explains the most recent results assuming orthogonal frequency division multiplexing (OFDM) signaling and realistic channel models.

**Keywords:** channel estimation, differential modulation, non-coherent, high-mobility, OFDM

### **1. Introduction**

Massive multiple-input multiple-output (MIMO) [1] is a key technology for the advancement of wireless communications, especially in the evolution from the current fifth generation (5G) [2] to the forthcoming sixth generation (6G) [3–6] of mobile communication systems. Typically, the base station (BS) is equipped with a very large number of radiating elements, while the user equipment (UE) is only equipped with one single antenna or very few. Under this scenario, the BS can either simultaneously spatially multiplex several data streams to many UEs or enhance the quality of some links by exploiting spatial diversity. In order to fully exploit the benefits of MIMO technology, accurate channel state information (CSI) between the BS and the UEs is a must; otherwise, the performance is significantly degraded [7, 8].

Coherent demodulation scheme (CDS) is the technique typically chosen for exploiting massive MIMO systems. The CSI is acquired by transmitting reference signals or pilot symbols per antenna, which is known as pilot symbol-assisted modulation (PSAM) [9]. At the receiver, the CSI is typically estimated using the least-squares (LS) criterion [10]. Finally, the pre/post-equalization matrices are computed in order to compensate for the effects of the channel, typically using the zero-forcing (ZF) or minimum mean squared error (MMSE) criteria [11]. However, the transmission of reference signals produces an excessive overhead in the system, since these pilot symbols occupy physical resources in the data frame. In order to alleviate this issue, time division duplexing (TDD) is the preferable choice since channel reciprocity can be assumed, and hence, the CSI is only estimated in the uplink (UL) and reused in the downlink (DL) [12].

Nevertheless, acquiring accurate CSI without sacrificing system performance is difficult, and this approach cannot be adopted in the new challenging scenarios considered in 6G, such as high-mobility communications and low-powered networks [8, 13]. On the one hand, CDS requires that the channel impulse response remains coherent over many symbol periods; otherwise, a huge amount of reference signals must be transmitted to constantly track the fast channel variations, which is the typical case in autonomous vehicles, drone communications and satellite links. On the other hand, CDS requires links with a medium/high signal-to-noise ratio (SNR) in order to provide accurate enough CSI; otherwise, the computed equalization matrices are not correct and degrade the performance of the system. To improve the quality of the CSI, the channel estimates must be obtained in several independent physical resources for the same UE and averaged to reduce the noise and interference effects. Last but not least, in scenarios with many spatially multiplexed UEs, additional reference signals are needed to avoid the pilot contamination produced among the UEs [14]. This results in an even larger training overhead, which is also detrimental to the data efficiency.

Non-coherent demodulation scheme (NCDS) is an appealing alternative to be combined with massive MIMO since it can demodulate the transmitted information without knowledge of the CSI, with the same asymptotic performance as coherent schemes [8]. Thus, the huge amount of reference signals required in CDS is entirely avoided and the complexity of the transceivers is also reduced. Many works in the literature showed that NCDS detection can provide an acceptable performance in very fast time-varying scenarios [8, 13, 15–21], while the coherent scheme fails. Additionally, NCDS is flexible and can be integrated with orthogonal frequency division multiplexing (OFDM) [22]. Compared to the CDS, its superior performance under stringent conditions makes it a good candidate for future communication systems in high-speed scenarios.

Some works have targeted the UL scenario [17, 20], in which one single-antenna UE transmits the differential symbols, while the BS exploits the spatial diversity produced by a large number of antennas. An NCDS scheme based on differential *M*-ary phase shift keying (DMPSK) constellations was exploited in [17], allowing differential detection while leveraging the advantages of an increased number of receive antennas. Later, [20] combined the NCDS with the OFDM multi-carrier waveform in order to combat the frequency-selective channel. The differential symbols are mapped onto the two-dimensional (time and frequency) resource grid. In [19], the NCDS is combined with precoding based on beamforming, assuming that a beam-management procedure is executed beforehand. Recently, a combination of CDS and NCDS was also explored in [13] in order to take advantage of both techniques. To achieve this, a blind channel estimation is proposed that uses reconstructed non-coherent data, which can later be used to perform UL filtering of coherent data, resulting in a hybrid demodulation scheme (HDS). Additionally, Lopez-Morales and Garcia-Armada [15] proposed using multi-user precoding for the DL combined with DMPSK to avoid the use of pilot symbols.

*Massive MIMO without CSI: When Non-Coherent Communication Meets Many Antennas DOI: http://dx.doi.org/10.5772/intechopen.112053*

An overview of NCDS combined with massive MIMO-OFDM under different scenarios is provided in this chapter. Section 2 explains the UL of non-coherent massive MIMO based on DMPSK and blind channel estimation. Section 3 presents the two possibilities to perform the DL in non-coherent massive MIMO based on DMPSK. Section 4 details the multi-user approach for the UL of non-coherent massive MIMO based on constellation multiplexing. Section 5 compares the CDS, NCDS and HDS schemes in different scenarios. Finally, Section 6 concludes the chapter and gives insights into future research lines.

### **2. Non-coherent massive MIMO in UL**

Two wireless transceivers are considered in this scenario. One is a BS equipped with $V$ antennas, while the other is a UE equipped with a single antenna. The chosen waveform is the well-known OFDM, composed of $K$ subcarriers with a subcarrier spacing of $\Delta f$ Hz and a cyclic prefix (CP), whose length is measured in samples ($L_{CP}$), to mitigate the multi-path effects of the channel. A set of $N$ contiguous OFDM symbols is assumed to be transmitted in a burst. Note that multiple UEs can be multiplexed in either the time or the frequency dimension thanks to the two-dimensional resource grid provided by OFDM. Additionally, the UEs can also be multiplexed in the constellation domain, whose details are given in Section 4.

#### **2.1 Fundamentals of differential encoding and decoding in OFDM**

Typically, NCDS based on differential modulation is performed using the time domain scheme. This scheme is represented in **Figure 1a**, where the red arrows indicate the direction in which differential modulation and demodulation are performed. In this case, it occurs between resources that belong to the same frequency and contiguous symbols in the time domain. The differential encoding can be described as

$$\tilde{x}_{k,n} = \begin{cases} \tilde{r}_{k,n}, & n = 1 \\ \tilde{x}_{k,n-1}\,\tilde{s}_{k,n-1}, & 2 \le n \le N \end{cases}, \quad 1 \le k \le K, \tag{1}$$

where $\tilde{r}_{k,1}$ is the reference symbol transmitted by the UE at the $k$th subcarrier of the first OFDM symbol, while $\tilde{s}_{k,n}$ and $\tilde{x}_{k,n}$ are complex data and differential symbols, respectively, at the $k$th subcarrier and $n$th OFDM symbol transmitted by the UE. The data symbol $\tilde{s}_{k,n}$ needs to meet the condition

$$\tilde{s}_{k,n} \in \mathfrak{M}, \quad \mathbb{E}\left\{ \left| \tilde{s}_{k,n} \right|^2 \right\} = 1, \quad 1 \le k \le K, \quad 1 \le n \le N - 1, \tag{2}$$

where $\mathfrak{M}$ denotes the set of symbols of a PSK constellation, since differential encoding can only convey information in the phase component, and its average energy is normalized to one. One drawback of implementing the mapping scheme in the time domain is the increased latency and memory consumption. This is because the differential decoding operates on two contiguous OFDM symbols (as shown in **Figure 1a**), so two complete OFDM symbols must be received in order to obtain $\tilde{s}_{k,n}$. Furthermore, this implementation cannot be used when there is a high Doppler spread because two consecutive OFDM symbols may not experience similar channel responses.

#### **Figure 1.**

*Differential modulation mapping schemes in an OFDM resource grid when $K = 12$, $N = 14$ and $\mathcal{I}_N = \{1, 8\}$. The yellow and blue boxes denote the reference symbols required by the differential modulation and phase difference estimation, respectively.*
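As an illustration, the time-domain encoding in (1) can be sketched in a few lines of NumPy. This is a minimal sketch rather than a reference implementation: the function name `diff_encode_time`, the QPSK alphabet and the all-ones reference symbols are assumptions made for the example.

```python
import numpy as np

def diff_encode_time(s, r):
    """Time-domain differential encoding following (1).

    s: (K, N-1) array of unit-modulus PSK data symbols.
    r: (K,) array of reference symbols for the first OFDM symbol.
    Returns x: (K, N) array of differential symbols.
    """
    K, n_data = s.shape
    x = np.empty((K, n_data + 1), dtype=complex)
    x[:, 0] = r                         # n = 1: reference symbol
    for n in range(1, n_data + 1):      # 2 <= n <= N
        x[:, n] = x[:, n - 1] * s[:, n - 1]
    return x

# Illustrative QPSK payload (the constellation choice is an assumption).
rng = np.random.default_rng(0)
K, N = 12, 14
qpsk = np.exp(1j * (np.pi / 2) * np.arange(4))
s = rng.choice(qpsk, size=(K, N - 1))
x = diff_encode_time(s, np.ones(K, dtype=complex))

# Over an ideal channel, conj(x[n-1]) * x[n] returns the data symbol,
# which is why two complete OFDM symbols are needed per decision.
s_hat = np.conj(x[:, :-1]) * x[:, 1:]
```

The last line makes the latency argument concrete: each decision needs the current and the previous OFDM symbol of the same subcarrier.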

Alternatively, the frequency domain scheme can be also used to implement the differential modulation technique, by exploiting the frequency dimension (as shown in **Figure 1b**). In this scheme, the differential symbols are mapped into contiguous frequency resources of the same OFDM symbol, according to [20] as

$$\tilde{x}_{k,n} = \begin{cases} \tilde{r}_{k,n}, & k = 1 \\ \tilde{x}_{k-1,n}\,\tilde{p}_{k,n}, & k = 2,\ n \in \mathcal{I}_N \\ \tilde{x}_{k-1,n}\,\tilde{s}_{k-1,n}, & \text{otherwise} \end{cases}, \quad 1 \le n \le N, \tag{3}$$

where $\tilde{r}_{1,n}$ and $\tilde{p}_{2,n}$ are two reference symbols with different purposes. The set $\mathcal{I}_N$ contains the indexes of the OFDM symbols carrying $\tilde{p}_{2,n}$. The first reference symbol is necessary for differential demodulation, as previously explained. The second is required to estimate the phase difference between two contiguous subcarriers resulting from frequency-domain mapping, as detailed in [20]. This scheme has the advantage of reduced latency and robustness against high Doppler spreads. It is reasonable to assume that contiguous subcarriers have similar channel responses, since the number of subcarriers is much larger than the number of channel taps. However, these benefits come at the expense of an additional phase estimation and compensation procedure. Although this additional phase component is negligible for non-frequency-selective channels, it must be compensated for strongly frequency-selective channels. When diversity is employed, only one additional reference pilot ($\tilde{p}_{2,n}$) is needed for all OFDM symbols within the coherence time, resulting in minimal overhead impact.
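The frequency-domain mapping in (3) can be sketched in the same way. The function `diff_encode_freq`, its argument layout and the 0-based pilot indices are assumptions for illustration; only the recursion itself comes from (3).

```python
import numpy as np

def diff_encode_freq(s, r, p, pilot_idx):
    """Frequency-domain differential encoding following (3).

    s: (K-1, N) unit-modulus data symbols; row k-1 (0-based) is the
       increment between subcarriers k and k+1 (1-based).
    r: (N,) reference symbols placed on the first subcarrier.
    p: unit-modulus phase-reference symbol.
    pilot_idx: 0-based OFDM symbol indices whose second subcarrier
       carries p instead of data (the set I_N).
    """
    inc = s.astype(complex).copy()
    inc[0, list(pilot_idx)] = p           # k = 2, n in I_N: pilot replaces data
    stacked = np.vstack([r[None, :], inc])
    return np.cumprod(stacked, axis=0)    # x[k] = x[k-1] * inc[k-1]

rng = np.random.default_rng(1)
K, N = 12, 14
qpsk = np.exp(1j * (np.pi / 2) * np.arange(4))
s = rng.choice(qpsk, size=(K - 1, N))
# I_N = {1, 8} in 1-based indexing corresponds to [0, 7] here.
x = diff_encode_freq(s, np.ones(N, dtype=complex), p=1.0 + 0j, pilot_idx=[0, 7])

# Each OFDM symbol can be decoded on its own from contiguous subcarriers.
inc_hat = np.conj(x[:-1]) * x[1:]
```

Because every column of `x` is encoded independently, a receiver can start decoding as soon as one OFDM symbol has arrived, which is the latency advantage described above.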

In [20], both time and frequency domain schemes are presented. However, if the number of allocated resources is small (low $K$ and/or $N$), both schemes may result in significant overhead. For instance, in massive machine-type communication (mMTC) scenarios, devices send short packets of only a few bytes. Adopting either of the two presented schemes implies sending a significant number of reference symbols relative to the payload. To address this issue, we propose a new mapping scheme called the mixed domain scheme (see **Figure 1c**). In this scheme, we first differentially encode the data symbols as

$$\tilde{x}_{j} = \begin{cases} \tilde{r}_{j}, & j = 1 \\ \tilde{x}_{j-1}\,\tilde{p}_{j}, & j = 2 \\ \tilde{x}_{j-1}\,\tilde{s}_{j-1}, & 3 \le j \le KN \end{cases}, \tag{4}$$

where $j$ denotes the resource index. Then, the differential symbols $\tilde{x}_j$ are allocated to the two-dimensional resource grid as

$$\tilde{x}_{k,n} = \tilde{x}_j \,|\, (k, n) = f(j), \quad 1 \le j \le KN, \tag{5}$$

where $f(\cdot)$ is the resource mapping policy function. **Figure 1c** shows a recommended example of a mapping policy function, where the dramatic reduction of reference signals can be observed. This policy mainly follows the frequency domain scheme, except for the edge subcarriers of the block, which follow a time domain scheme. This proposal not only significantly reduces the number of reference symbols, but also retains all the advantages of the frequency domain scheme. Moreover, in the case of time-varying channels, only those complex symbols placed at the edge subcarriers may suffer from an additional degradation.
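One mapping policy that matches this description is a serpentine path through the grid: it runs along the subcarriers of one OFDM symbol, steps to the next OFDM symbol at the edge subcarrier, and then runs back. The sketch below is a hypothetical choice of $f(\cdot)$ consistent with the text; the function names are illustrative and the exact policy of **Figure 1c** may differ.

```python
import numpy as np

def serpentine_policy(K, N):
    """Return the list of (k, n) pairs, 1-based, with f(j) at position j-1."""
    path = []
    for n in range(1, N + 1):
        # Odd OFDM symbols are traversed upwards, even ones downwards,
        # so the hop between symbols happens at an edge subcarrier.
        ks = range(1, K + 1) if n % 2 == 1 else range(K, 0, -1)
        path.extend((k, n) for k in ks)
    return path

def map_to_grid(x_seq, K, N):
    """Place the sequence of (4) on the K x N grid according to (5)."""
    grid = np.empty((K, N), dtype=complex)
    for j, (k, n) in enumerate(serpentine_policy(K, N)):
        grid[k - 1, n - 1] = x_seq[j]
    return grid

path = serpentine_policy(K=12, N=14)
# Every pair of consecutive resources is adjacent in frequency or in time.
hops = [abs(k1 - k0) + abs(n1 - n0)
        for (k0, n0), (k1, n1) in zip(path, path[1:])]
grid = map_to_grid(np.arange(12 * 14), 12, 14)
```

With this policy a single reference symbol and a single phase-reference pilot serve the whole block, and only the hops at the edge subcarriers are differential in time.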

To maintain conciseness and simplify notation, we adopt the frequency domain scheme for the remainder of this chapter. However, note that the techniques presented in the following sections can be applied to both time and mixed domain schemes without any modification.

Once the differential symbols are obtained by using (3), the OFDM symbol can be obtained by performing an inverse discrete Fourier transform (IDFT) as

$$x_{m,n} = \frac{1}{\sqrt{K}} \sum_{k=1}^{K} \exp\left(j\frac{2\pi}{K}(k-1)(m-1)\right)\tilde{x}_{k,n}, \quad 1 \le m \le K, \quad 1 \le n \le N. \tag{6}$$

Then, a CP of length $L_{CP}$ samples is prepended to each OFDM symbol in order to absorb the multi-path effect.
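The modulation step can be sketched with NumPy, whose `ifft` with `norm="ortho"` applies the same $1/\sqrt{K}$ scaling as (6). The helper names are illustrative, and the receiver-side function is included only to check the round trip over an ideal channel.

```python
import numpy as np

def ofdm_modulate(x_freq, l_cp):
    """x_freq: (K, N) frequency-domain symbols; l_cp >= 1 samples of CP.

    Returns a (K + l_cp, N) array of time-domain samples.
    """
    x_time = np.fft.ifft(x_freq, axis=0, norm="ortho")   # unitary IDFT, as in (6)
    return np.vstack([x_time[-l_cp:], x_time])           # CP = copy of the last l_cp samples

def ofdm_demodulate(y_time, l_cp):
    """Discard the CP and apply the unitary DFT."""
    return np.fft.fft(y_time[l_cp:], axis=0, norm="ortho")

rng = np.random.default_rng(5)
X = np.exp(1j * 2 * np.pi * rng.random((8, 3)))   # arbitrary unit-modulus symbols
tx = ofdm_modulate(X, l_cp=2)
X_back = ofdm_demodulate(tx, l_cp=2)
```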

At the receiver, the CP is discarded from the received signal, and hence, the linear convolution between the multi-tap channel and transmitted data symbols is converted to a circular one. Hence, the received signal at the *v*-th antenna at the BS is given by

$$y_{m,n,v} = \sum_{\tau=1}^{L_{\rm eff}} h_{\tau,n,v}\, x_{\mathrm{mod}(m-\tau,K),n} + w_{m,n,v}, \quad 1 \le m \le K, \quad 1 \le n \le N, \quad 1 \le v \le V, \tag{7}$$

where $w_{m,n,v}$ is the additive white Gaussian noise (AWGN) at the $m$th sample of the $n$th OFDM symbol, distributed as $\mathcal{CN}(0, \sigma_w^2)$. Following [7], the channel coefficients suffer from time variability, and an autoregressive model approximates the temporal evolution of the correlated fading channel coefficients as

$$h_{\tau,n',v} = \alpha_d h_{\tau,n,v} + w'_{\tau,n',v}, \quad \alpha_d = J_0 \left( 2\pi d f_D \left( \frac{K + L_{CP}}{K \Delta f} \right) \right) < 1, \tag{8}$$

where $n'$ refers to a time instant in the future with respect to $n$ ($d = |n' - n|$ is the time difference in OFDM symbols), $\alpha_d$ is the temporal correlation parameter, $J_0(\cdot)$ denotes the zeroth-order Bessel function of the first kind and $f_D$ represents the maximum Doppler spread experienced by the transmitted signal, in Hertz. Similar to CDS, NCDS requires that the channel impulse response be quasi-static during at least one OFDM symbol; otherwise, inter-symbol and inter-carrier interference (ISI and ICI, respectively) will appear. Consequently, the length of the OFDM symbols should be reduced ($K\downarrow$) as the Doppler effect increases ($f_D\uparrow$).
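To get a feeling for the temporal correlation parameter in (8), the following sketch evaluates it numerically. The Bessel function is computed from its integral representation to keep the example self-contained; the numerology ($K = 1024$, $L_{CP} = 72$, $\Delta f = 15$ kHz) and the Doppler value are illustrative assumptions, not values taken from this chapter.

```python
import numpy as np

def bessel_j0(x, n_points=2001):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin t) dt."""
    t = np.linspace(0.0, np.pi, n_points)
    f = np.cos(x * np.sin(t))
    dt = t[1] - t[0]
    # Composite trapezoidal rule.
    return (np.sum(f) - 0.5 * (f[0] + f[-1])) * dt / np.pi

def temporal_correlation(d, f_d, K, l_cp, delta_f):
    """alpha_d of (8), with OFDM symbol duration (K + L_CP) / (K * delta_f)."""
    t_sym = (K + l_cp) / (K * delta_f)
    return bessel_j0(2.0 * np.pi * d * f_d * t_sym)

# Assumed numerology and Doppler spread, for illustration only.
alpha_1 = temporal_correlation(d=1, f_d=1000.0, K=1024, l_cp=72, delta_f=15e3)
alpha_2 = temporal_correlation(d=2, f_d=1000.0, K=1024, l_cp=72, delta_f=15e3)
```

For small arguments the correlation decays monotonically with the symbol distance $d$, so `alpha_2` is smaller than `alpha_1`; this is the effect that penalizes the time-domain scheme under high Doppler.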

Then a discrete Fourier transform is performed to obtain the received symbols in the frequency domain as

$$\tilde{y}_{k,n,v} = \frac{1}{\sqrt{K}} \sum_{m=1}^{K} \exp\left(-j\frac{2\pi}{K}(k-1)(m-1)\right) y_{m,n,v}, \tag{9}$$

where $1 \le k \le K$, $1 \le n \le N$, $1 \le v \le V$, and the received signal in the frequency domain can be modeled as

$$\tilde{y}_{k,n,v} = \tilde{h}_{k,n,v}\, \tilde{x}_{k,n} + \tilde{w}_{k,n,v}, \quad 1 \le k \le K, \quad 1 \le n \le N, \quad 1 \le v \le V, \tag{10}$$

where $\tilde{h}_{k,n,v}$ and $\tilde{w}_{k,n,v}$ are the channel frequency response and the noise in the frequency domain, respectively, at the $k$th subcarrier and $n$th OFDM symbol of the $v$th antenna.

Later, a differential demodulation is performed in the frequency domain to undo (3) as

$$\tilde{z}_{k,n} = \frac{1}{V} \sum_{v=1}^{V} \tilde{y}_{k-1,n,v}^{*}\, \tilde{y}_{k,n,v} = \sum_{i=1}^{4} T_{k,n,v,i}, \quad 2 \le k \le K, \quad 1 \le n \le N, \tag{11}$$

$$T_{k,n,v,1} = \frac{1}{V} \sum_{v=1}^{V} \tilde{w}_{k-1,n,v}^{*}\, \tilde{w}_{k,n,v}, \quad T_{k,n,v,2} = \frac{1}{V} \sum_{v=1}^{V} \tilde{h}_{k-1,n,v}^{*}\, \tilde{x}_{k-1,n}^{*}\, \tilde{w}_{k,n,v}, \tag{12}$$

$$T_{k,n,v,3} = \frac{1}{V} \sum_{v=1}^{V} \tilde{w}_{k-1,n,v}^{*}\, \tilde{h}_{k,n,v}\, \tilde{x}_{k,n}, \quad T_{k,n,v,4} = \frac{1}{V} \sum_{v=1}^{V} \tilde{h}_{k-1,n,v}^{*}\, \tilde{h}_{k,n,v}\, \tilde{x}_{k-1,n}^{*}\, \tilde{x}_{k,n}, \tag{13}$$

where $\tilde{z}_{k,n}$ is the decision variable and $T_{k,n,v,i}$, $1 \le i \le 4$, denotes each of the four terms produced by differential demodulation. Note that the first three terms correspond to noise and interference terms, while the last one is the desired data term.


Making use of the law of large numbers, when the number of antennas tends to infinity ($V \to \infty$), the four terms can be simplified as

$$\mathbb{E}\left\{\left|T_{k,n,v,1}\right|^2\right\}=\mathbb{E}\left\{\left|T_{k,n,v,2}\right|^2\right\}=\mathbb{E}\left\{\left|T_{k,n,v,3}\right|^2\right\}=0, \tag{14}$$

$$\mathbb{E}\left\{T_{k,n,v,4}\right\}=\rho_f\exp\left(j\theta_f\right)\times\begin{cases}\tilde{p}_{k,n}, & k=2,\ n\in\mathcal{I}_N\\ \tilde{s}_{k-1,n}, & 3\le k\le K\end{cases}, \tag{15}$$

where $2 \le k \le K$ and $1 \le n \le N$. The first three terms, which correspond to the interference and noise terms, vanish since the channel frequency response, noise and data symbols are mutually independent random variables, while the fourth term remains. Note that the pilot and data symbols in the fourth term are scaled by the correlation between the two contiguous channel frequency responses at subcarriers $k-1$ and $k$, whose modulus and phase are given by $\rho_f$ and $\theta_f$, respectively. This scaling factor produces a common phase rotation of the received symbols $\tilde{z}_{k,n}$, which can be easily estimated and equalized by transmitting a pilot symbol ($\tilde{p}_{k,n}$) before performing the symbol decision.
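The behaviour described by (11)–(15) can be checked with a short simulation of one OFDM symbol. This is a sketch under stated assumptions: the values of $K$, $V$, the number of taps, the SNR, the uniform power delay profile and the QPSK alphabet are all illustrative choices and are not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(2)
K, V, L, snr_db = 64, 256, 4, 10.0        # illustrative values (assumptions)
qpsk = np.exp(1j * (np.pi / 2) * np.arange(4))

# Differential encoding along frequency: reference, pilot p, then data, cf. (3).
p = 1.0 + 0j
s = rng.choice(qpsk, size=K - 2)
x = np.cumprod(np.concatenate(([1.0 + 0j, p], s)))

# Per-antenna channel: L i.i.d. taps, uniform power profile, unit total power.
taps = (rng.standard_normal((L, V)) + 1j * rng.standard_normal((L, V))) / np.sqrt(2 * L)
h = np.fft.fft(taps, n=K, axis=0)                      # (K, V) frequency response
sigma2 = 10 ** (-snr_db / 10)
w = np.sqrt(sigma2 / 2) * (rng.standard_normal((K, V)) + 1j * rng.standard_normal((K, V)))
y = h * x[:, None] + w                                 # frequency-domain model (10)

# Differential demodulation (11): average conj(y[k-1]) * y[k] over the antennas.
z = np.mean(np.conj(y[:-1]) * y[1:], axis=1)           # length K - 1

# z[0] carries the pilot: estimate the common rotation and remove it.
rot = z[0] * np.conj(p)
z_eq = z[1:] * np.conj(rot) / np.abs(rot)

# Minimum-distance decision on the QPSK alphabet.
s_hat = qpsk[np.argmin(np.abs(z_eq[:, None] - qpsk[None, :]), axis=1)]
ser = np.mean(s_hat != s)
```

No channel estimate is used anywhere in the receiver: the only side information is the pilot on the second subcarrier, which absorbs the rotation $\theta_f$. Reducing `V` in this sketch makes the residual noise and self-interference terms visible in `z_eq`.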

If the number of antennas ($V$) is not large enough, the three terms in (14) are not zero, and the received signal is polluted by noise and self-interference. The performance in terms of signal-to-interference-plus-noise ratio (SINR) for the multi-user case is given in Section 4, which generalizes the single-user case.

The performance given by (11)–(13) assumed an ideal case, where hardware impairments are not considered. However, it is well-known that OFDM combined with the traditional CDS is very sensitive to phase noise (PN) [23, 24]. The effect of this PN is due to the instabilities of the local oscillators, which are typically modeled according to a classical Wiener random walk process. Its negative effect not only will degrade the received symbols, but it will also add a common phase error. According to 5G [6], the phase-tracking reference signal (PT-RS) is proposed to be added in order to estimate and equalize this phase error, and hence, the overhead of the system is further increased. On the other hand, according to [25], when NCDS is combined with OFDM it does not require any additional PN estimation and equalization since it is inherently robust to these effects thanks to the use of the differential modulation, and no additional reference signal is required.
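The robustness to a common phase error can be illustrated with a toy example. The sketch below models only a phase rotation that is identical on all subcarriers of one OFDM symbol (the inter-carrier interference caused by PN is not modeled), and it assumes a flat channel and an arbitrary phase value for simplicity.

```python
import numpy as np

rng = np.random.default_rng(3)
K = 16
x = np.exp(1j * 2 * np.pi * rng.random(K))     # unit-modulus differential symbols
h = np.ones(K, dtype=complex)                  # flat channel, for illustration only
phi = 0.8                                      # common phase error in radians (assumed)

y_clean = h * x
y_cpe = np.exp(1j * phi) * y_clean             # same rotation on every subcarrier

# Frequency-domain differential demodulation of both signals.
z_clean = np.conj(y_clean[:-1]) * y_clean[1:]
z_cpe = np.conj(y_cpe[:-1]) * y_cpe[1:]        # exp(-j*phi) * exp(+j*phi) cancels
```

The rotated and unrotated signals lead to identical decision variables, so no phase-tracking reference signal is needed for this component of the impairment.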

#### **2.2 Blind channel estimation based on differential detection**

As explained in the previous subsection, non-coherent massive MIMO is capable of obtaining the transmitted data in the UL without CSI and post-equalization. However, an interesting question arises: could we estimate an accurate enough CSI given the non-coherently detected symbols? In the end, these non-coherently detected data symbols can be seen as a new type of reference signal, which can be utilized in CDS for channel estimation and equalization without raising the overhead, since the non-coherent data symbols convey data information.

Assuming that accurate CSI can be successfully obtained by using the NCDS, these estimates can be exploited in two ways. On the one hand, the estimates can be used to compute the precoding matrices for the DL in TDD mode [21], and hence, the overhead generated by transmitting reference signals in the UL is avoided, as will be shown in Section 3.2. On the other hand, CDS and NCDS can be merged in the UL to produce an HDS, where the traditional pilot symbols transmitted in CDS are replaced by non-coherent data symbols. The latter can be jointly used for data transmission, channel estimation and the computation of post-coding matrices. Consequently, the efficiency of the UL transmission is increased [13] (**Figure 2**).

The steps for the blind channel estimation based on NCDS can be summarized as follows:

1. Firstly, the symbol decision is performed over $\tilde{z}_{k,n}$ as

$$\hat{\tilde{s}}_{k,n} = \ddot{s}_{\hat{j}} \,\Big|\, \hat{j} = \arg\min_{j} \left\{ \left| \tilde{z}_{k,n} - \ddot{s}_{j} \right| \right\}, \quad \ddot{s}_{j} \in \mathfrak{M}, \quad 1 \le j \le |\mathfrak{M}|, \tag{16}$$

where $2 \le k \le K$, $1 \le n \le N$, $\hat{\tilde{s}}_{k,n}$ are the decided symbols at the $k$th subcarrier of the $n$th OFDM symbol, and $\ddot{s}_j$ corresponds to the $j$th symbol of the constellation $\mathfrak{M}$, whose number of elements is given by $|\mathfrak{M}|$.

2. Then, the differential symbols $\hat{\tilde{x}}_{k,n}$ are reconstructed by differentially encoding the decided symbols $\hat{\tilde{s}}_{k,n}$ according to (3).

3. Finally, the channel is estimated from the reconstructed differential symbols as

$$\hat{\tilde{h}}_{k,n,v} = \hat{\tilde{x}}_{k,n}^{-1}\, \tilde{y}_{k,n,v}. \tag{17}$$
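The procedure can be sketched for one OFDM symbol as follows: decide the symbols as in (16), rebuild the differential symbols by re-encoding the decisions, and divide the received signal by them as in (17). The function name, its arguments and the re-encoding via a cumulative product are assumptions made for illustration; the decision variables are assumed to be already compensated for the common phase rotation.

```python
import numpy as np

def blind_channel_estimate(z_eq, y, constellation, r=1.0 + 0j, p=1.0 + 0j):
    """Blind channel estimation for one OFDM symbol.

    z_eq: (K-2,) phase-compensated decision variables of the data positions.
    y: (K, V) received frequency-domain symbols.
    r, p: known reference symbol and phase-reference pilot.
    """
    # Minimum-distance symbol decision, as in (16).
    idx = np.argmin(np.abs(z_eq[:, None] - constellation[None, :]), axis=1)
    s_hat = constellation[idx]
    # Re-encode the decisions to reconstruct the differential symbols, cf. (3).
    x_hat = np.cumprod(np.concatenate(([r, p], s_hat)))
    # Least-squares channel estimate using the reconstructed symbols, as in (17).
    h_hat = y / x_hat[:, None]
    return s_hat, x_hat, h_hat

# Noiseless sanity check with error-free decisions (illustrative values).
rng = np.random.default_rng(4)
K, V = 8, 4
qpsk = np.exp(1j * (np.pi / 2) * np.arange(4))
s = rng.choice(qpsk, size=K - 2)
x = np.cumprod(np.concatenate(([1.0 + 0j, 1.0 + 0j], s)))
h = rng.standard_normal((K, V)) + 1j * rng.standard_normal((K, V))
s_hat, x_hat, h_hat = blind_channel_estimate(s, h * x[:, None], qpsk)
```

Because the reconstruction is a cumulative product, a single wrong decision rotates all subsequent reconstructed symbols of that OFDM symbol, which is the origin of the additional error term discussed next.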

#### **Figure 2.**

*Example of a unit block for a proposed HDS scheme.*


Note that an additional error term in the channel estimation, with respect to the classical PSAM [9], is produced by a possible mismatch between the transmitted differential symbols $\tilde{x}_{k,n}$ and the reconstructed differential symbols $\hat{\tilde{x}}_{k,n}$, whose error was characterized in [13, 21]. The channel estimated at the $k$th subcarrier will be used at another subcarrier index $k'$, such that $k \neq k'$. Hence, the channel estimation error is composed of two independent components ([13], Eq. (24)) as shown below

$$\sigma_d^2 = \mathbb{E}\left\{ \left| \hat{\tilde{h}}_{k,n,v} - \tilde{h}_{k,n,v} \right|^2 \right\} = \sigma_{x,d}^2 + \sigma_b^2 = 2 \left( 1 - \alpha_d \delta_{k,n} \right) + \sigma_w^2, \tag{18}$$

where $\sigma_{x,d}^2$ is the channel estimation error that comes from performing estimation and compensation at different time instants with a possible mismatch between the transmitted and reconstructed differential symbols. The term $\delta_{k,n}$ is computed as

$$\delta_{k,n} = \mathbb{E}\left\{ \cos\left(\angle(\tilde{x}_{k,n}) - \angle\left(\hat{\tilde{x}}_{k,n}\right)\right) \right\} \approx \frac{1 - P_{k,n} - (1 - P_{k,n})^N}{(N-1)P_{k,n}}, \tag{19}$$

where $P_{k,n}$ is the error probability for the UL of each user. The interested reader is referred to [15] for the details of the derivations.

The MSE of channel estimation, as given in (18), shows that the channel estimation error is highest when either $\alpha_d$ or $\delta_{k,n}$ is zero, while both need to be 1 to prevent any increase in the channel estimation error compared to PSAM. Various MSE curves are displayed for different values of $\alpha_d$ and SNR (**Figure 3**).
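The two limiting cases can be verified numerically with a small helper that evaluates (18) and (19). The function names are illustrative, and the prefactor 2 in the mismatch term is an assumption that corresponds to unit-power channel coefficients.

```python
def delta_term(p_err, n):
    """Approximation (19); it tends to 1 as the error probability tends to 0."""
    if p_err == 0.0:
        return 1.0   # limit of (19), avoids the 0/0 form
    return (1.0 - p_err - (1.0 - p_err) ** n) / ((n - 1) * p_err)

def channel_mse(alpha_d, delta, sigma2_w):
    """Channel estimation MSE in the form of (18).

    The factor 2 assumes unit-power channel coefficients (assumption).
    """
    return 2.0 * (1.0 - alpha_d * delta) + sigma2_w

# alpha_d = delta = 1 leaves only the noise term, as in PSAM.
mse_best = channel_mse(1.0, 1.0, 0.1)
# A fully decorrelated channel (alpha_d = 0) gives the largest error.
mse_worst = channel_mse(0.0, 1.0, 0.1)
```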

To ensure that the channel is properly estimated in a certain time-frequency resource, some error-detecting code (such as a cyclic redundancy code) can be added to a data stream of non-coherent data. With this, and performing the channel estimation with reconstructed data that we are sure is correct, the channel estimation error will be the same as that of the PSAM.

#### **Figure 3.**

*MSE of channel estimation for $M_{UL} = 16$ and $R = 100$. The continuous line shows the result obtained from the Monte Carlo simulation, while the dashed line represents the theoretical upper bound. The blue line corresponds to the PSAM method without considering channel time variability, which represents the best-case scenario.*

### **3. Non-coherent massive MIMO in DL**

In this scenario, one multi-antenna BS simultaneously serves *U* UEs in the DL. It is assumed that the parameters of the OFDM system are the same as described for the UL.

#### **3.1 Non-coherent massive MIMO in FDD mode**

In FDD, the multi-path channel coefficients between the UL and DL are fully uncorrelated, and the channel reciprocity property cannot be assumed as in TDD. Consequently, the massive number of antennas at the BS used for transmission can only be exploited in spatial diversity mode since the channel estimates of the *V* antennas per user in the DL are not available. However, the exploitation of the diversity from the transmitter without knowledge of the channel is still a challenge, due to the fact that techniques based on block codes [26] failed to exploit a large number of antennas at the transmitter, since their complexity is proportional to the number of antennas. Even though the mapping schemes proposed for the UL are still valid, a few more ingredients are needed to make NCDS suitable for the DL, detailed in the following subsections.

#### *3.1.1 Precoding based on beamforming or codebook selection*

The NCDS can be combined with the precoding technique based on beamforming or codebook selection at the expense of using some (reduced) channel knowledge. At the BS, it is assumed that either the angular position or the best codebook index of the UE of interest is available, which is obtained through a beam-management procedure. Given this additional information, the data is sent over a non-coherently processed link. For the sake of conciseness, beamforming is the chosen technique for the rest of the document. Note that the detailed procedures for the beamforming in the following sections can be easily adapted for the codebook selection scheme.

The combination of NCDS with a practical beamforming technique based on knowing the angular position of the UE is proposed in [19]. The beam-management procedure defined in 5G [6] is suggested to be performed as a first step. This procedure is responsible for accurately determining the angle of the spatial clusters of the propagation channel contributing to the signal of each UE, by transmitting some reference signals. These reference signals are the synchronization signals (SS) and channel state information-reference signals (CSI-RS). The former is used when a UE would like to enter the system for the first time, while the latter is exploited for updating the angular position of an existing UE in the system. Note that, this beam-management procedure must be executed, at least, once per channel coherence time in order to constantly update the estimated angular positions of the current and new UEs.

At the transmitter, the BS transmits the data stream to all the UEs by using beamforming as

$$\tilde{x}_{k,n,v} = \sum_{u=1}^{U} \tilde{b}_{k,n,v,u}\, \tilde{x}_{k,n,u}, \quad 1 \le v \le V, \quad 1 \le k \le K, \quad 1 \le n \le N, \tag{20}$$

where $\tilde{x}_{k,n,v}$ and $\tilde{b}_{k,n,v,u}$ are the precoded data symbol and the precoding coefficient, respectively, for the $v$th antenna of the BS and the $u$th UE, placed at the $k$th subcarrier and $n$th OFDM symbol. This precoding coefficient is obtained according to either the estimated angular position or the best codebook index for each UE, and thus, it is in charge of focusing the energy in the obtained specific direction. In this way, the energy received by the UE is enhanced since its path loss is compensated. Similarly, precoding can be used in the UL for the BS to receive the signal from this spatial direction.
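As an illustration of (20) for a single resource element, the sketch below builds the precoding coefficients from the angular positions of the UEs. It assumes a uniform linear array with half-wavelength spacing and a conjugate steering-vector precoder scaled by $1/\sqrt{V}$; neither the array geometry nor the function names come from the chapter.

```python
import numpy as np

def steering_vector(theta_rad, V):
    """Array response of a half-wavelength-spaced linear array (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(V) * np.sin(theta_rad))

def beamform(x_users, thetas, V):
    """Precoding (20) for one resource: x_v = sum_u b_{v,u} * x_u."""
    # Column u holds the coefficients b_{v,u} that point towards UE u.
    B = np.stack([np.conj(steering_vector(t, V)) for t in thetas], axis=1) / np.sqrt(V)
    return B @ x_users, B

V = 64
thetas = [0.3, -0.7]                        # angular positions in radians (illustrative)
x_users = np.array([1.0 + 0j, 0.0 + 1j])    # one differential symbol per UE
x_tx, B = beamform(x_users, thetas, V)

# Gain seen along each direction for the beam intended for the first UE.
gain_wanted = abs(steering_vector(thetas[0], V) @ B[:, 0])
gain_leak = abs(steering_vector(thetas[1], V) @ B[:, 0])
```

Only the angles are needed at the BS, which is the reduced channel knowledge provided by the beam-management procedure; the data carried by each beam is still detected non-coherently.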

#### *3.1.2 Diversity in the frequency domain*

In order to enhance SINR gain for a good performance of NCDS [17], averaging in dimensions other than space, such as time or frequency, is proposed in [19]. Since the number of antennas at the UE is usually limited, this additional source of diversity may be particularly necessary to multiplex several UEs or enable critical services. The use of the frequency dimension is explained in [20], where each OFDM symbol can be processed independently, providing the advantage of easy extension to averaging in time (processing multiple consecutive OFDM symbols) or space (increasing the number of receive antennas of the UE when feasible).

To exploit frequency diversity, the same differential complex symbol is transmitted in multiple frequency resources. After performing the differential encoding for the *u*th UE, the *Q* differential symbols are replicated at the transmitter as

$$\tilde{x}_{k,n,u} = \tilde{x}_{q,n,u} \,|\, q = \mathrm{mod}(k-1, Q) + 1, \quad K = Q \times F, \quad 1 \le k \le K, \quad 1 \le n \le N, \tag{21}$$

where *F* is the frequency repetition/averaging factor.

The non-coherent detection at the receiver exploits the frequency diversity, where the received data in the subcarriers that carry the same transmitted data are averaged as

$$\tilde{z}_{q,n,u} = \frac{1}{F} \sum_{k=0}^{F-1} \tilde{y}_{q+kQ-1,n,u}^{*}\, \tilde{y}_{q+kQ,n,u}, \quad 1 \le q \le Q, \quad 1 \le n \le N, \quad 1 \le u \le U. \tag{22}$$

With this scheme there is a trade-off between overhead and robustness. According to [19], even though the frequency diversity adds overhead, it still outperforms the CDS in terms of throughput for some particular scenarios with high mobility.
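The replication in (21) and the averaging in (22) can be sketched for one UE and one OFDM symbol as follows. As a simplification, the sketch only averages the positions $q = 2, \ldots, Q$, which are the ones whose left neighbour belongs to the same replica; the function names and the parameter values are illustrative assumptions.

```python
import numpy as np

def replicate(x_q, F):
    """(21): subcarrier k carries x_q with q = mod(k-1, Q) + 1, so K = Q * F."""
    return np.tile(x_q, F)

def averaged_demod(y, Q, F):
    """(22) for q = 2..Q: average conj(y[k-1]) * y[k] over the F replicas."""
    yb = y.reshape(F, Q)                                       # row f holds replica f
    return np.mean(np.conj(yb[:, :-1]) * yb[:, 1:], axis=0)    # length Q - 1

rng = np.random.default_rng(6)
Q, F = 6, 4
qpsk = np.exp(1j * (np.pi / 2) * np.arange(4))
s = rng.choice(qpsk, size=Q - 1)
x_q = np.cumprod(np.concatenate(([1.0 + 0j], s)))   # differential symbols of one block
x = replicate(x_q, F)

# Noiseless flat channel with an arbitrary phase, for a sanity check only.
y = np.exp(1j * 0.3) * x
z = averaged_demod(y, Q, F)
```

The repetition factor `F` plays the role that the number of antennas plays in the UL: it sets how many independent products are averaged, at the cost of dividing the data rate by `F`.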

#### **3.2 Non-coherent massive MIMO in TDD mode**

As explained in Section 2.2, the channel can be blindly estimated using the reconstructed data in the UL of a non-coherent massive MIMO scheme. Once the channel is available, it can be used for precoding in the DL transmission to spatially separate the users. To avoid demodulation pilots, and thus any pilot signal in the TDD time slot, it is preferable to use DMPSK in the DL signals as well. Demodulation pilots are needed in the DL of any coherent scheme to compensate for inefficiencies in the precoder, which can be caused by an erroneous channel estimation, by the use of a simple and less powerful precoder (such as the MRT), and by the fact that the transmit power is limited by the RF circuitry, which may make some precoders unrealizable. By using DMPSK in the DL, the transmitted signals are much more robust against errors in amplitude and phase than QAM constellations.

To improve clarity and conciseness, we use matrix notation throughout this document. Boldface uppercase letters represent matrices, boldface lowercase letters represent vectors and normal letters represent scalar quantities. Specifically, $[\mathbf{A}]\_{m,n}$ refers to the element in the *m*th row and *n*th column of matrix **A**, and $[\mathbf{a}]\_{n}$ represents the *n*th element of vector **a**.

In the DL, the symbols of all users are stacked in $\tilde{\mathbf{x}}\_{k,n}$ of size $(U \times 1)$ for each time instant *n* and subcarrier *k* and are precoded before transmission using the precoding matrix $\tilde{\mathbf{B}}\_{k,n} = (\tilde{\mathbf{H}}\_{k,n})^{H} = [\tilde{\mathbf{b}}\_{k,n,1}, \cdots, \tilde{\mathbf{b}}\_{k,n,U}]$ for maximum ratio transmission (MRT). The channel for each user is defined as $\tilde{\mathbf{h}}\_{k,n,u} = [\tilde{h}\_{k,n,1,u}, \cdots, \tilde{h}\_{k,n,V,u}]^{T}$. The DL channel is composed as $\tilde{\mathbf{H}}\_{k,n} = [\tilde{\mathbf{h}}\_{k,n,1}, \cdots, \tilde{\mathbf{h}}\_{k,n,U}]^{T}$, where the DL channels of all users are stacked. Thus, in the DL the received signal is

$$
\tilde{\mathbf{y}}\_{k,n} = \tilde{\mathbf{H}}\_{k,n} \tilde{\mathbf{B}}\_{k,n} \tilde{\mathbf{x}}\_{k,n} + \tilde{\nu}\_{k,n}, \tag{23}
$$

where the noise vector $\tilde{\nu}\_{k,n}$ is a $(U \times 1)$ vector in which each element represents the noise at the receiver of user *u* and is distributed as $\tilde{\nu}\_{k,n,u} \sim \mathcal{CN}(0, \sigma\_u^2)$. When MRT is applied in the DL at the BS, the matrix product in (23) can be separated into the contribution of the desired user and that of the rest of the users. Therefore, we can rewrite (23) as follows

$$
\tilde{y}\_{k,n,u} = \tilde{\mathbf{h}}\_{k,n,u}^{T} \tilde{\mathbf{b}}\_{k,n,u} [\tilde{\mathbf{x}}\_{k,n}]\_{u} + \sum\_{u' \neq u} \tilde{\mathbf{h}}\_{k,n,u}^{T} \tilde{\mathbf{b}}\_{k,n,u'} [\tilde{\mathbf{x}}\_{k,n}]\_{u'} + \tilde{\nu}\_{k,n,u}.\tag{24}
$$

To analyze the effect of imperfect channel estimation on the DL transmission of the proposed scheme in the next Section, we assume the following definition [27]: $\hat{\mathbf{H}}\_{k,n} = \sqrt{1 - e\_d^2}\, \tilde{\mathbf{H}}\_{k,n} + \tilde{\mathbf{H}}\_{k,e}$, where $\tilde{\mathbf{H}}\_{k,e} \sim \mathcal{CN}(\mathbf{0}, e\_d^2 \mathbf{I})$ is an error component that is uncorrelated with $\tilde{\mathbf{H}}\_{k,n}$. By performing some straightforward manipulations, which can be found in [15], the distribution of $\tilde{y}\_{k,n,u}$ (for $\tilde{x}\_{k,n,u} = 1$, without loss of generality<sup>1</sup>) is

$$\Re\left\{\tilde{y}\_{k,n,u}\right\} \sim R\sqrt{1 - e\_d^2} + \mathcal{N}\left(0, \frac{R\left(U - e\_d^2 + 1\right) + \sigma\_u^2}{2}\right) = \mu\_\Re + \mathcal{N}\left(0, \sigma\_\Re^2\right), \tag{25}$$

$$\Im\left\{\tilde{y}\_{k,n,u}\right\} \sim \mathcal{N}\left(0, \frac{R\left(U + e\_d^2 - 1\right) + \sigma\_u^2}{2}\right) = \mathcal{N}\left(0, \sigma\_\Im^2\right). \tag{26}$$

The differential decoding performed at each user, $\tilde{z}\_{k,n,u} = \tilde{y}\_{k,n-1,u}^{\*}\, \tilde{y}\_{k,n,u}$, yields the product of two complex normally distributed variables, whose distribution must be characterized in order to find that of the received symbol. Applying again some straightforward manipulations, which can be found in [15], we have

$$\Re\left\{\tilde{z}\_{k,n,u}\right\} \sim \mathcal{N}\left(\mu\_\Re^2,\; 2\mu\_\Re^2 \sigma\_\Re^2 + \sigma\_\Re^4 + \sigma\_\Im^4\right), \quad \Im\left\{\tilde{z}\_{k,n,u}\right\} \sim \mathcal{N}\left(0,\; 2\sigma\_\Im^2 \left(\mu\_\Re^2 + \sigma\_\Re^2\right)\right), \tag{27}$$

so the SER for the DL of user *u* is computed using ([13], Appendix A).
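The moments in (27) follow from (25) and (26) when the two consecutive received samples are treated as independent draws. The sketch below checks this numerically under that independence assumption, with hypothetical values for the mean and the two variances. The helper names `moments_27` and `draw_y` are introduced here for illustration.

```python
import math
import random


def moments_27(mu, s2_re, s2_im):
    """Mean and variances of the differential product predicted by (27)."""
    mean_re = mu**2
    var_re = 2 * mu**2 * s2_re + s2_re**2 + s2_im**2
    var_im = 2 * s2_im * (mu**2 + s2_re)
    return mean_re, var_re, var_im


def draw_y(mu, s2_re, s2_im, rng):
    """One received sample following the Gaussian model of (25) and (26)."""
    return complex(mu + rng.gauss(0, math.sqrt(s2_re)), rng.gauss(0, math.sqrt(s2_im)))


rng = random.Random(7)
mu, s2_re, s2_im, trials = 4.0, 1.0, 0.5, 20000   # hypothetical values
z = [
    draw_y(mu, s2_re, s2_im, rng).conjugate() * draw_y(mu, s2_re, s2_im, rng)
    for _ in range(trials)
]

sim_mean_re = sum(v.real for v in z) / trials
sim_var_re = sum((v.real - sim_mean_re) ** 2 for v in z) / trials
sim_var_im = sum(v.imag ** 2 for v in z) / trials
pred_mean_re, pred_var_re, pred_var_im = moments_27(mu, s2_re, s2_im)

print(f"mean of real part : simulated {sim_mean_re:.2f}, predicted {pred_mean_re:.2f}")
print(f"var of real part  : simulated {sim_var_re:.2f}, predicted {pred_var_re:.2f}")
print(f"var of imag part  : simulated {sim_var_im:.2f}, predicted {pred_var_im:.2f}")
```

Only the first two moments are compared here. The product of two Gaussian variables is not exactly Gaussian, so the normal distributions in (27) are best read as an approximation that matches these moments.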

<sup>1</sup> The error is computed for $[\tilde{\mathbf{x}}\_{k,n}]\_{u} = 1$ for simplicity but is the same for the rest of the symbols.

*Massive MIMO without CSI: When Non-Coherent Communication Meets Many Antennas DOI: http://dx.doi.org/10.5772/intechopen.112053*

### **4. Multi-user non-coherent massive MIMO based on DMPSK**

In the previous sections, only a single UE was mapped onto each time/frequency resource of the OFDM grid for the non-coherent massive MIMO system based on DMPSK. This Section presents the case of multiple UEs, whose access strategy is based on mapping the different UEs in the constellation domain. Each UE transmits its individual constellation, and these superimpose at the receiver, resulting in a joint-constellation. Since no CSI is available, a joint decision must be made. Ensuring a bijective relation between the individual constellations and the joint-constellation is therefore essential, which leads to a crucial constellation design problem for improving multi-user performance. The system model for multiple UEs is briefly introduced first; it shows that the distribution of the joint-constellation depends on the individual constellations, which prevents classical design strategies from being used. Then, two design approaches based on artificial intelligence are described, followed by a proposal of some multi-user constellations.

#### **4.1 System model**

The constellation design for the simultaneous transmission of multiple UEs can be applied in the UL or the DL. For the sake of simplicity and without loss of generality, the UL is considered. The UEs transmit to the BS concurrently using the non-coherent scheme described in Section 2.1. During the *n*th OFDM symbol, the bits transmitted by the *u*th UE are arranged in a vector $\mathbf{b}\_{n,u}$ of dimension $(N\_{b,u} \times 1)$, where $N\_{b,u}$ denotes the number of bits for user *u*. The vector $\mathbf{b}\_{n,u}$ is then transformed into a complex symbol $\tilde{s}\_{k,n,u}$, given by

$$\tilde{s}\_{k,n,u} = g\_{B}(\varpi\_{u}, \mathbf{b}\_{n,u}) \in \mathfrak{M}\_{u}, \quad 1 \le k \le K - 1, \quad 1 \le n \le N, \quad 1 \le u \le U, \tag{28}$$

$$\mathfrak{M}\_{u} = \{c\_{u,1}, \dots, c\_{u,M\_{u}}\}, \quad M\_{u} = |\mathfrak{M}\_{u}| = 2^{N\_{b,u}}, \quad c\_{u,i} \in \mathbb{C}, \quad |c\_{u,i}| = 1, \quad c\_{u,i} \neq c\_{u,i'} \;\; \forall i \neq i', \tag{29}$$

where $g\_{B}(\cdot)$ is the bit mapping function, $\mathfrak{M}\_{u}$ denotes the individual constellation set for the *u*th UE (constrained to constant modulus to facilitate the use of differential modulation) and $\varpi\_{u}$, of size $(M\_{u} \times 1)$, denotes the bit mapping policy for the *u*th UE, which satisfies $[\varpi\_{u}]\_{i} \in \{1, \ldots, M\_{u}\}$, $1 \le i \le M\_{u}$, and $[\varpi\_{u}]\_{i} \neq [\varpi\_{u}]\_{i'}$, $\forall i \neq i'$. We define $\Pi = [\varpi\_{1}^{T} \cdots \varpi\_{U}^{T}]^{T}$, a vector of size $(\sum\_{u=1}^{U} M\_{u} \times 1)$ that contains the bit mapping policies of all UEs. The complex symbols of each UE are differentially encoded and mapped onto the OFDM symbol as described in (3) and transmitted over the wireless channel using an OFDM system.

At the BS, the received signal at *k*th subcarrier in the *n*th OFDM symbol can be described as

$$\tilde{\mathbf{y}}\_{k,n} = \mathbf{H}\_{k,n} \boldsymbol{\beta} \tilde{\mathbf{x}}\_{k,n} + \tilde{\mathbf{w}}\_{k,n}, \quad \boldsymbol{\beta} = \text{diag}\left( \left[ \sqrt{\beta\_1}, \dots, \sqrt{\beta\_U} \right] \right), \tag{30}$$

where $\tilde{\mathbf{x}}\_{k,n} = [\tilde{x}\_{k,n,1}, \cdots, \tilde{x}\_{k,n,U}]^{T}$, $\tilde{\mathbf{w}}\_{k,n} = [\tilde{w}\_{k,n,1}, \cdots, \tilde{w}\_{k,n,R}]^{T}$ collects the noise at the $R$ BS antennas, and $\beta\_u$ represents the ratio of the received average power of the *u*th UE, with $1 \le \beta\_u \le \beta\_{\max}$. This ratio is directly proportional to the combination of the large-scale channel effects and the power control employed by each user. The design of constellations takes into account the impact of varying $\beta\_u$ values on the performance of each user. To prevent significant performance differences between users, a maximum value $\beta\_{\max}$ is considered.

Again, the phase difference of two consecutive symbols received at each antenna is non-coherently detected as

$$\begin{split} z\_{k,n} &= \frac{\left(\tilde{\mathbf{y}}\_{k,n-1}\right)^{H} \tilde{\mathbf{y}}\_{k,n}}{R} = \frac{1}{R} (\tilde{\mathbf{x}}\_{k,n-1})^{H} \boldsymbol{\beta} \left(\mathbf{H}\_{k,n-1}\right)^{H} \mathbf{H}\_{k,n} \boldsymbol{\beta} \tilde{\mathbf{x}}\_{k,n} \\ &+ \frac{1}{R} (\tilde{\mathbf{x}}\_{k,n-1})^{H} \boldsymbol{\beta} \left(\mathbf{H}\_{k,n-1}\right)^{H} \tilde{\mathbf{w}}\_{k,n} + \frac{1}{R} (\tilde{\mathbf{w}}\_{k,n-1})^{H} \mathbf{H}\_{k,n} \boldsymbol{\beta} \tilde{\mathbf{x}}\_{k,n} + \frac{1}{R} (\tilde{\mathbf{w}}\_{k,n-1})^{H} \tilde{\mathbf{w}}\_{k,n}, \end{split} \tag{31}$$

which is a generalization of (11) to multiple UEs mapped in the constellation domain. For a very large number of antennas, using the asymptotic property of massive SIMO given by the Law of Large Numbers and assuming that $\mathbf{H}\_{k,n-1} \approx \mathbf{H}\_{k,n}$, we know that $\frac{1}{R} (\mathbf{H}\_{k,n-1})^{H} \mathbf{H}\_{k,n} \xrightarrow{R \to \infty} \mathbf{I}\_{U}$, and thus

$$z\_{k,n} \stackrel{R \to \infty}{\to} \varsigma\_{k,n} = \sum\_{u=1}^{U} \beta\_u \tilde{s}\_{k,n,u} \in \mathfrak{M}, \quad M = |\mathfrak{M}| = \prod\_u M\_u,\tag{32}$$

where the joint-symbol $\varsigma\_{k,n}$ is the result of superimposing the symbols sent by the users and $\mathfrak{M}$ represents the joint-constellation set. **Figure 4** illustrates the joint-constellation set formed by two specific individual constellations, which are designed using the proposed methods. We define $\mathbf{b}\_{i,u}$ as a $(N\_{b,u} \times 1)$ vector containing the bits for the *u*th UE and the *i*th joint-symbol according to the mapping Π. Furthermore, we define $\mathbf{b}\_{i} = [\mathbf{b}\_{i,1}^{T}; \cdots; \mathbf{b}\_{i,U}^{T}]^{T}$ as a $(\sum\_{u=1}^{U} N\_{b,u} \times 1)$ vector containing all the $\mathbf{b}\_{i,u}$ vectors for the *i*th joint-symbol of all UEs. The terms of (31) are independent, and their distribution is shown in [18]. Therefore, the conditional PDF of $z\_{k,n}$ given the transmitted symbols of each UE can be analytically obtained as a convolution of the PDFs of the individual terms. Assuming equiprobable joint-constellation elements, the decision on $\varsigma\_{k,n}$ upon receiving $z\_{k,n}$ can be made using (32) and maximum likelihood detection as

$$\hat{\varsigma}\_{k,n,ML} = \arg\max\_{\varsigma\_{k,n}} \left\{ f\left(z\_{k,n}|\varsigma\_{k,n}\right) \right\} \in \mathfrak{M}.\tag{33}$$
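The chain from (30) to (33) can be illustrated with a small simulation. The sketch below makes several assumptions for illustration: i.i.d. $\mathcal{CN}(0, 1)$ channel coefficients that stay constant over the frame, two users with QPSK individual constellations, hypothetical values for `R`, `beta` and `sigma2_w`, and a nearest-joint-symbol rule in place of the maximum likelihood detector of (33), which would require the conditional PDF derived in [18].

```python
import cmath
import itertools
import math
import random

rng = random.Random(3)


def cn(var):
    """One circularly symmetric complex Gaussian sample with variance var."""
    s = math.sqrt(var / 2)
    return complex(rng.gauss(0, s), rng.gauss(0, s))


R, U, N = 1024, 2, 20         # BS antennas, users, OFDM symbols (hypothetical)
beta = [1.0, 2.0]             # received power ratios beta_u of (30)
sigma2_w = 0.1                # noise variance per antenna
qpsk = [cmath.exp(1j * (math.pi / 4 + i * math.pi / 2)) for i in range(4)]
consts = [qpsk, qpsk]         # individual constellations of the two users

# Joint constellation of (32): one point per combination of individual symbols.
joint = {
    idx: sum(beta[u] * consts[u][i] for u, i in enumerate(idx))
    for idx in itertools.product(range(4), repeat=U)
}

H = [[cn(1.0) for _ in range(U)] for _ in range(R)]   # channel, constant over the frame


def receive(x):
    """Received vector of (30) for the stacked differentially encoded symbols x."""
    return [
        sum(H[r][u] * math.sqrt(beta[u]) * x[u] for u in range(U)) + cn(sigma2_w)
        for r in range(R)
    ]


x_prev = [1 + 0j] * U
y_prev = receive(x_prev)
sent, detected = [], []
for n in range(N):
    idx = tuple(rng.randrange(4) for _ in range(U))
    x = [x_prev[u] * consts[u][idx[u]] for u in range(U)]        # differential encoding
    y = receive(x)
    z = sum(a.conjugate() * b for a, b in zip(y_prev, y)) / R    # decision variable (31)
    detected.append(min(joint, key=lambda i: abs(z - joint[i])))  # nearest joint symbol
    sent.append(idx)
    x_prev, y_prev = x, y

print("symbol errors:", sum(s != d for s, d in zip(sent, detected)))
```

With these values the joint-constellation is a regular 16-point grid, so each detected joint-symbol maps back to exactly one pair of individual symbols, which is the bijective relation required for the joint decision.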

#### **Figure 4.**

*Block diagram that illustrates the NC scheme in the UL for the specific scenario of* $U = 2$*, where* $\beta\_1 = \beta\_2 = 1$*. The diagram also shows two distinct cases of individual constellations, namely* $\mathfrak{M}\_1$ *and* $\mathfrak{M}\_2$*. These individual constellations are designed using the proposed methods to generate a QAM joint-constellation denoted as* $\mathfrak{M}$*.*

Based on the previous analysis, in order to minimize interference among the different elements of the joint-constellation and reduce the symbol error rate (SER) or bit error rate (BER), it is necessary to place them strategically. However, this results in a significant increase in the complexity of the constellation design, as the probability density function (PDF) varies for each joint-symbol depending on the individual constellations. Additionally, even if an optimal joint-constellation is identified, the individual constant modulus constellations must be capable of generating that joint-constellation while also fulfilling individual requirements, which may not be feasible.

One of the most relevant parameters for achieving high performance in terms of SER/BER is the minimum distance between the elements of the joint-constellation, which should be as large as possible. For comparison purposes, it is normalized as $\hat{d}\_{\min} = d\_{\min} / \sqrt{\sum\_{u=1}^{U} \beta\_u^2}$. The value of this distance for the typically used constellations [16, 17] is 0.39 for Type A, 0.6325 for Type B, 0.4142 for equal error protection (EEP) and 0.6325 for the Monte Carlo Optimization (MCO). Type A exhibits an exponential reduction in distance as the number of users and/or constellation sizes increase. Type B, on the other hand, is limited to DQPSK and requires specific average receive powers. The normalized minimum distance (NMD) is crucial to performance, as demonstrated in [17], and a larger NMD results in better performance. However, as the number of users *U* and/or the constellation sizes $M\_u$ increase, the NMD of the joint-constellation decreases, leading to a decrease in performance. Regular M-QAM joint-constellations maximize the NMD, which can be calculated as $((M - 1)/6)^{-1/2}$. Therefore, the minimum distance of any joint-constellation must satisfy $0 < \hat{d}\_{\min} \le ((M - 1)/6)^{-1/2}$, with *M* calculated using (32). Moreover, the distribution of the received symbols around the theoretical values in the joint-constellation depends on the individual constellations chosen by each UE. If the phases of the individual constellation elements that make up the joint-constellation element are similar, the interference power projected on its direction is larger, and vice versa. The interference shapes of the joint-constellation elements depend on the individual constellations, and minimizing the effect of interference by altering the joint-constellation shape requires the use of different individual constellations, resulting in a recursive problem in the design process. Additionally, EEP suffers from distance reduction in the inner circle, which is inherent to the constellation definition structure and can even result in a distance of 0 in certain configurations. Consequently, the constellation design problem is mathematically intractable and cannot be solved using classical constellation design techniques.
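The NMD and its M-QAM upper bound can be computed directly from the individual constellations. The sketch below uses two illustrative two-user configurations chosen here: DQPSK users with a 2:1 amplitude ratio, and equal-power DQPSK users rotated by π/4 with respect to each other. They reproduce the values 0.6325 and 0.4142 quoted above, but they are not claimed to be the exact Type B and EEP designs of [16, 17]. The function names are introduced for illustration.

```python
import cmath
import itertools
import math


def psk(M, offset=0.0):
    """Unit-modulus M-PSK constellation with a phase offset in radians."""
    return [cmath.exp(1j * (offset + 2 * math.pi * m / M)) for m in range(M)]


def joint_constellation(constellations, beta):
    """All superpositions sum_u beta_u * s_u, following (32)."""
    return [
        sum(b * c for b, c in zip(beta, combo))
        for combo in itertools.product(*constellations)
    ]


def nmd(constellations, beta):
    """Minimum distance of the joint constellation normalized by sqrt(sum beta_u^2)."""
    points = joint_constellation(constellations, beta)
    d_min = min(abs(p - q) for p, q in itertools.combinations(points, 2))
    return d_min / math.sqrt(sum(b * b for b in beta))


def qam_bound(M):
    """Upper bound ((M - 1) / 6)^(-1/2) attained by a regular M-QAM joint constellation."""
    return ((M - 1) / 6) ** -0.5


# Two DQPSK users with a 2:1 amplitude ratio: the joint constellation is a regular 16-QAM.
power_split = nmd([psk(4, math.pi / 4), psk(4, math.pi / 4)], [1.0, 2.0])
# Two equal-power DQPSK users rotated by pi/4 with respect to each other.
rotated = nmd([psk(4, math.pi / 4), psk(4)], [1.0, 1.0])

print(f"power-split NMD: {power_split:.4f}")
print(f"rotated NMD    : {rotated:.4f}")
print(f"16-QAM bound   : {qam_bound(16):.4f}")
```

The power-split case meets the bound because its 16 joint points form a regular grid, whereas in the rotated case the eight inner points form a small octagon that limits the minimum distance.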

#### **4.2 Multi-user constellation design approaches for NC massive MIMO**

Since the constellation design for the multi-user NC massive MIMO scenario implies solving a non-tractable optimization problem, two main approaches have been exploited in the literature, namely the "guess and try" approach and artificial intelligence techniques specially designed for solving non-convex optimization problems. In the case of multi-user constellations, [16, 17] proposed a small set of suboptimal constellations for the NC scheme based on DMPSK, namely Type A, Type B and EEP. Type A was designed to separate users over sub-quadrants, Type B separates the elements through power control of the users, and EEP places the constellation elements of each user with a certain phase shift relative to the others. These constellations are suboptimal since they do not maximize the probabilistic minimum distance in the joint-constellation and do not address the bit mapping policy, which is also critical to minimize the BER. Recently, [18] defined an optimization problem to find the individual constellations and the bit mapping policies that give a proper joint-constellation in terms of BER performance. This is the first constellation design proposal for NC massive MIMO multi-user constellations that is based on evolutionary computation algorithms (a subfield of artificial intelligence techniques) to solve a mathematically intractable problem.

The optimization problem of finding the best individual constellations that result in an optimal joint-constellation and bit mapping policy is mathematically intractable, and thus we utilize evolutionary computation algorithms [28] to solve it. We propose using the MCO, where no assumptions on the joint-constellation shape are made and the bit mapping policy is co-designed together with the joint-constellation shape. MCO defines a single optimization problem capable of providing the individual constellations and the bit mapping policy of all UEs at once. It is based on the Monte Carlo method, which numerically evaluates the BER performance of the candidates at each iteration. The MCO optimization problem is expressed as

$$\begin{aligned} \min\_{\bar{\mathbf{c}}\_{u}, \boldsymbol{\beta}} \quad & \alpha\_{1} \sum\_{u=1}^{U} [\boldsymbol{\varepsilon}]\_{u} + \alpha\_{2} \sum\_{u=1}^{U} \beta\_{u}, \quad \text{where} \quad \boldsymbol{\varepsilon} = g\_{M} (\sigma\_{w}^{2}, R, \Pi, \boldsymbol{\beta}, \bar{\mathbf{c}}, N\_{s}, N\_{r})\\ \text{s.t.} \quad & \left| [\bar{\mathbf{c}}\_{u}]\_{i\_{u}} \right|^{2} = 1, \quad 0 \le \angle \left( [\bar{\mathbf{c}}\_{u}]\_{i\_{u}} \right) < 2\pi, \quad u = 1, \cdots, U; \ i\_{u} = 1, \cdots, M\_{u};\\ & 1 \le \beta\_{u} \le \beta\_{\max}, \quad \bar{\mathbf{c}} = [\bar{\mathbf{c}}\_{1}, \cdots, \bar{\mathbf{c}}\_{U}]^{T}, \quad \alpha\_{1} + \alpha\_{2} = 1, \quad \varpi\_{u} \in \mathfrak{B}\_{u}, \end{aligned} \tag{34}$$

where $\boldsymbol{\varepsilon}$ is a vector of size $(U \times 1)$ that contains the BER of each UE and $g\_{M}(\cdot)$ denotes a function that obtains this BER for a particular set of system parameters. Among these parameters, Π is the bit mapping policy for the individual constellations, and $N\_r$ and $N\_s$ are the number of iterations and the number of symbols of the Monte Carlo simulation, respectively. This optimization problem is non-convex and NP-hard, so we again propose solving it with numerical methods based on evolutionary computation (EC) [28]. **Figure 5** provides a block diagram of the implementation of MCO, where $N\_G$ is the number of generations and $N\_P$ is the population size of the EC algorithm. The interested reader is referred to [18] for further details on the MCO.
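The structure of an evolutionary search over constellation phases can be sketched with a toy example. This is not the MCO of [18]: as a simplifying assumption, the fitness below is the NMD of the joint-constellation instead of the Monte Carlo BER function $g\_{M}(\cdot)$, the power ratios are fixed, and the bit mapping policy is not optimized. The selection and mutation rules, the step size and all names are illustrative choices.

```python
import cmath
import itertools
import math
import random


def nmd(phases, sizes, beta):
    """Normalized minimum distance of the joint constellation built from per-user phases."""
    consts, start = [], 0
    for M_u in sizes:
        consts.append([cmath.exp(1j * p) for p in phases[start:start + M_u]])
        start += M_u
    points = [
        sum(b * c for b, c in zip(beta, combo)) for combo in itertools.product(*consts)
    ]
    d_min = min(abs(p - q) for p, q in itertools.combinations(points, 2))
    return d_min / math.sqrt(sum(b * b for b in beta))


def evolve(sizes, beta, N_G=200, N_P=20, step=0.3, seed=0):
    """Elitist evolutionary search: keep the best half, mutate copies of it."""
    rng = random.Random(seed)
    dim = sum(sizes)
    pop = [[rng.uniform(0, 2 * math.pi) for _ in range(dim)] for _ in range(N_P)]
    history = []
    for _ in range(N_G):
        pop.sort(key=lambda p: nmd(p, sizes, beta), reverse=True)
        history.append(nmd(pop[0], sizes, beta))          # best fitness so far
        parents = pop[:N_P // 2]
        children = [
            # Gaussian mutation, wrapped so that every phase stays in [0, 2*pi).
            [(g + rng.gauss(0, step)) % (2 * math.pi) for g in rng.choice(parents)]
            for _ in range(N_P - len(parents))
        ]
        pop = parents + children
    best = max(pop, key=lambda p: nmd(p, sizes, beta))
    return best, nmd(best, sizes, beta), history


# Two users with binary constellations and equal received power (hypothetical scenario).
best_phases, best_nmd, history = evolve(sizes=[2, 2], beta=[1.0, 1.0])
print(f"initial best NMD: {history[0]:.4f}, final best NMD: {best_nmd:.4f}")
```

Because the best half of each generation is retained, the best fitness never decreases from one generation to the next. Replacing `nmd` with a Monte Carlo BER estimate would bring the sketch closer to the formulation in (34), at the price of a noisy fitness evaluation.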

**Figure 5.** *Block diagram of the MCO.*

#### **4.3 Proposed multi-user constellations**

We provide a set of optimized constellations in ([18], Table II). While each constellation has been determined for a certain *R* and *ρ*, it can be used for any values in a realistic range. To read the table, for each scenario there are *U* vectors of the form $\Phi = [\Phi\_{1}^{u}\, \Phi\_{2}^{u} \cdots \Phi\_{M\_u}^{u}]$, where $\Phi\_{m\_u}^{u}$ is the phase in radians of constellation element $m\_u$ of user *u* ($1 \le m\_u \le M\_u$, $1 \le u \le U$, where $M\_u$ is the constellation size of user *u*). Constellation element $m\_u$ of user *u* is obtained as $s\_{m\_u}^{u} = \exp(j \Phi\_{m\_u}^{u})$. The bit mapping of element $m\_u$ is obtained by converting $m\_u - 1$ from decimal to binary.
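The reading rule for the table can be sketched in a few lines. The phase vector below is a hypothetical placeholder and is not taken from ([18], Table II). The function name `constellation_from_phases` is introduced here for illustration.

```python
import cmath


def constellation_from_phases(phases):
    """Return (symbol, bit label) pairs for one user from its phase vector in radians."""
    M_u = len(phases)
    n_bits = max(1, (M_u - 1).bit_length())       # number of bits per symbol
    table = []
    for m_u, phi in enumerate(phases, start=1):
        symbol = cmath.exp(1j * phi)               # s = exp(j * Phi)
        bits = format(m_u - 1, "0{}b".format(n_bits))  # binary expansion of m_u - 1
        table.append((symbol, bits))
    return table


# Hypothetical phase vector for one user with M_u = 4 (placeholder values).
phases_user = [0.0, cmath.pi / 2, cmath.pi, 3 * cmath.pi / 2]
table = constellation_from_phases(phases_user)
for symbol, bits in table:
    print(f"bits {bits} -> symbol {symbol:.3f}")
```

Every symbol produced this way has unit modulus, as required by the constant modulus constraint in (29).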
