**4.1 End-to-end multilingual text-to-speech synthesis system PLATTOS**

The text-to-speech (TTS) system PLATTOS, shown in **Figure 4**, is the first microservice in the PERSIST system. It generates speech from text for the ECAs that communicate with patients. PLATTOS follows the ideas presented in [55, 56] and generates speech in several languages in real time, with near-human quality. It combines two neural network models: a feature prediction network and WaveGlow, a flow-based neural vocoder.

**Figure 4.** *TTS system PLATTOS.*
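
The two-stage structure described above can be illustrated with a minimal sketch. It is not the PLATTOS implementation: both stages are toy stand-ins, all function names are hypothetical, and the intermediate features are assumed to be mel-spectrogram frames, which is the usual conditioning input for WaveGlow. The values of `N_MELS` and `HOP_LENGTH` are likewise assumptions, not settings taken from the system.

```python
from typing import Callable, List

MelFrames = List[List[float]]  # one list of mel-band values per frame
Waveform = List[float]         # audio samples

N_MELS = 80       # assumed number of mel bands (not stated in the text)
HOP_LENGTH = 256  # assumed audio samples per frame (not stated in the text)


def toy_feature_predictor(text: str) -> MelFrames:
    """Hypothetical stand-in for the feature prediction network.

    Emits one constant-valued dummy frame per input character.
    """
    return [[(ord(ch) % 32) / 32.0] * N_MELS for ch in text]


def toy_vocoder(mel: MelFrames) -> Waveform:
    """Hypothetical stand-in for the neural vocoder.

    Expands every frame into HOP_LENGTH samples at the frame's mean level.
    """
    samples: Waveform = []
    for frame in mel:
        level = sum(frame) / len(frame)
        samples.extend([level] * HOP_LENGTH)
    return samples


def synthesize(
    text: str,
    predictor: Callable[[str], MelFrames],
    vocoder: Callable[[MelFrames], Waveform],
) -> Waveform:
    """Chain the two stages: text -> acoustic features -> waveform."""
    mel = predictor(text)
    return vocoder(mel)


audio = synthesize("Hello", toy_feature_predictor, toy_vocoder)
print(len(audio))  # number of generated samples
```

Passing the two stages to `synthesize` as callables keeps them decoupled, so either stand-in could be replaced by a trained model without changing the surrounding pipeline.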
