**5. Individualized movement choreography**

Different users have different health statuses and clinical requirements. VIGOR employs a generative deep neural network architecture to create innovative and individualized Tai-Chi movements [26] that benefit users in the most effective way [100–102]. The most *challenging issue* in deep-learning-enabled choreography is how to balance the training reliability and the creativity of the neural network. In this work, we propose the following techniques: (1) a visible neural network, which incorporates biomechanics into the network, is employed to formulate the generative movement; (2) only mechanical properties such as joint/muscle forces and moments are used to measure the generative movement; (3) a second-order optimizer is used to speed up the training of the neural network.
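To make technique (3) concrete, the following is a minimal sketch (not the VIGOR implementation) of why a second-order optimizer accelerates training: Newton's method uses curvature (the Hessian) in addition to the gradient, so on a quadratic loss it reaches the minimizer in a single step where first-order gradient descent needs many. All function names here are illustrative.

```python
# Toy quadratic loss L(w) = (w - 3)^2, minimized at w* = 3.

def grad(w):
    # First derivative: dL/dw = 2(w - 3)
    return 2.0 * (w - 3.0)

def hessian(w):
    # Second derivative: d2L/dw2 = 2 (constant for a quadratic)
    return 2.0

def newton_step(w):
    # Newton update: w <- w - H^{-1} * g
    return w - grad(w) / hessian(w)

w = 10.0
w = newton_step(w)   # lands exactly on the minimizer w* = 3 in one step
```

For a full neural network the exact Hessian is intractable, so practical second-order methods (e.g., quasi-Newton schemes such as L-BFGS) approximate it; the one-step behavior above is the idealized limit.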

#### **5.1 Tai-Chi choreography based on LSTM-RNN**

In this work, the Long Short-Term Memory type of RNN (denoted as LSTM) [103, 104] is employed to design individualized Tai-Chi choreography [26]. The Human3.6M dataset (high-quality 3D joint positions and rotations at 50 FPS) and our in-house dataset (acquired by Microsoft Kinect V2, including joints' XYZ positions and quaternions, 24–30 FPS) are used as training data. The Tai-Chi movement is created clip by clip (or subsequence by subsequence) according to users' health conditions and their clinical rehabilitation requirements [20].
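Preparing such clip-wise training data amounts to slicing each motion-capture sequence into fixed-length, overlapping subsequences. A hypothetical helper (the function name and parameters are ours, not from the paper) might look like:

```python
# Partition a motion sequence of per-frame poses into fixed-length,
# overlapping subsequences ("clips") for LSTM training.

def partition_clips(frames, clip_len, stride):
    """Return all clips of length clip_len, advancing by stride frames."""
    return [frames[i:i + clip_len]
            for i in range(0, len(frames) - clip_len + 1, stride)]

frames = list(range(10))          # stand-in for 10 pose frames
clips = partition_clips(frames, clip_len=4, stride=2)
# clips -> [[0,1,2,3], [2,3,4,5], [4,5,6,7], [6,7,8,9]]
```

In practice each "frame" would be a vector of joint positions/quaternions rather than an integer, but the windowing logic is the same.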

**Figure 13** shows the framework of the LSTM-based Tai-Chi choreography design. A Tai-Chi movement (or sequence) is partitioned into multiple subsequences (i.e., clips). A *seed subsequence*, which can be generated randomly, is fed into the trained model. The output token is regarded as the succeeding subsequence and is fed back into the model to produce the following subsequence; as a result, a creative Tai-Chi sequence is built clip by clip. Four-thread visible and hierarchical AutoEncoders [106] are used to reduce the problem dimensionality. The resulting individualized Tai-Chi choreography [100–102] is integrated into the VR or AR environment [88] from which users can learn. An online video [105] shows a sample Tai-Chi choreography. Compared to other deep-learning-enabled choreography projects [107], the proposed method may have faster training speed and be more problem-oriented because (1) the geometric configuration of the human anatomy is kept by employing joint-coordinate systems such as Euler angles [36, 41], and (2) human biomechanics are preserved by introducing kinetic features [41, 108].

**Figure 13.**

*The pipeline of LSTM-based motion choreography (online video of LSTM-based Tai-Chi choreography [88, 105]).*
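The seed-and-feedback generation loop described above can be sketched as follows. The trained LSTM is replaced here by a stub, and all names are illustrative; only the autoregressive control flow reflects the text.

```python
def stub_lstm(clip):
    # Placeholder for the trained LSTM: shifts each value by one,
    # standing in for "predict the succeeding subsequence".
    return [v + 1 for v in clip]

def generate_sequence(seed_clip, model, n_clips):
    """Build a sequence clip by clip: each output is fed back as input."""
    clips, current = [], seed_clip
    for _ in range(n_clips):
        current = model(current)   # output token becomes the next input
        clips.append(current)
    return clips

seq = generate_sequence([0, 0, 0], stub_lstm, n_clips=3)
# seq -> [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
```

Note that this feedback structure is also why accumulated error grows with sequence length: any error in one generated clip is carried into every subsequent prediction, which motivates the GAN-based alternative in Section 5.2.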

#### **5.2 Movement choreography based on visible GAN**

LSTM-based choreography suffers from a relatively large accumulated error and lacks a global picture of the Tai-Chi choreography. As an effective deep generative model, Generative Adversarial Networks (GANs) learn to model distributions of high-dimensional data (images, text, audio, etc.) with or without supervision, and have been gaining considerable attention in many fields [109–111]. In VIGOR, GANs may be considered to generate novel Tai-Chi movements by simulating a given distribution.

As illustrated in this work, a conventional GAN such as DCGAN [46] suffers from frequent mode collapse during the training stage, particularly on the generator side. The discriminator often improves too quickly for the generator to catch up, which is why the learning rates must be regulated or multiple epochs performed on one of the two networks. To balance the training of the generator and discriminator for decent output, this work investigates the following strategies: (1) application of the *Wasserstein distance* to formulate the loss function [46, 112]; (2) application of a *visible neural network* that incorporates biomechanics theory (inverse dynamics and the transient dynamics simulation of the human body [60, 68]) in the formulation of the generator and discriminator. The neural network is personalized using the boundary and initial conditions of human dynamics.
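To give intuition for strategy (1): for two equal-size one-dimensional samples, the Wasserstein-1 distance reduces to the mean absolute difference of the sorted samples. A WGAN approximates this distance in high dimensions via a constrained critic network; the toy below shows only the exact 1-D case, with names of our choosing.

```python
def wasserstein_1d(xs, ys):
    """Exact W1 distance between two equal-size 1-D empirical samples."""
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

real = [0.0, 1.0, 2.0]
fake = [1.0, 2.0, 3.0]
w = wasserstein_1d(real, fake)   # 1.0: each sorted sample is shifted by 1
```

Unlike the Jensen-Shannon divergence implicit in the standard GAN loss, this distance stays finite and smooth even when the two distributions do not overlap, which is what mitigates the mode-collapse and vanishing-gradient problems noted above.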

**Figure 14** shows the pipeline of the GAN-enabled human movement choreography system. A generator *G* generates kinematic data from a latent vector, and a discriminator *D* estimates the probability that a sample came from the training data rather than from *G*. Fed with a latent vector, which is randomly generated in the beginning and derived from the transient dynamics simulation of the human body thereafter, the generator produces a series of personalized and creative Tai-Chi kinetic subsequences to fool the discriminator. The discriminator is trained to discriminate between "real" Tai-Chi kinetic subsequences (from the training set) and "fake" subsequences generated by the generator. Because the generator is fed with deterministic simulated data, an equilibrium of the "adversarial game" between the generator and discriminator can be reached much more easily.
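The adversarial game can be made concrete with the standard GAN discriminator objective, maximize log D(real) + log(1 - D(fake)), shown below with stub probabilities (the work's actual discriminator objective additionally carries biomechanics and aesthetics terms). Names and values here are illustrative only.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Negative log-likelihood the discriminator minimizes.

    d_real: D's probability that a real clip is real.
    d_fake: D's probability that a generated clip is real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

# A confident discriminator scores real high and fake low: low loss.
good = discriminator_loss(d_real=0.9, d_fake=0.1)

# At the equilibrium of the adversarial game, D cannot tell real from
# fake and outputs 0.5 everywhere: loss = 2 * ln 2 (about 1.386).
equilibrium = discriminator_loss(d_real=0.5, d_fake=0.5)
```

The gap between these two values is what drives training: the discriminator pushes its loss down toward `good`, while the generator pushes it back up toward `equilibrium`.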

In this work, a musculoskeletal-biomechanics-guided loss function is used to formulate the objective of the discriminator:

$$\mathcal{L}(\boldsymbol{\theta}) = \mathcal{L}(\boldsymbol{f}(\mathbf{X}, \boldsymbol{\theta}), \mathbf{Y}) + \varrho\,\mathcal{R}(\boldsymbol{\theta}) + \gamma\,\mathcal{L}_{\text{biomechanics}}(\boldsymbol{f}(\mathbf{X}, \boldsymbol{\theta})) + \eta\,\mathcal{L}_{\text{aesthetics}}(\boldsymbol{f}(\mathbf{X}, \boldsymbol{\theta})) \tag{5}$$
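The weighting structure of this objective, a data-fit term plus a ϱ-weighted regularizer R(θ) and γ- and η-weighted biomechanics/aesthetics penalties on the network output f(X, θ), can be sketched as below. All four term functions are placeholders of our own; only the composition follows the equation.

```python
def total_loss(pred, target, theta, rho, gamma, eta,
               data_loss, reg, l_biomech, l_aesthetic):
    """Composite objective: data fit + rho*R(theta)
    + gamma*L_biomechanics(pred) + eta*L_aesthetics(pred)."""
    return (data_loss(pred, target)
            + rho * reg(theta)
            + gamma * l_biomech(pred)
            + eta * l_aesthetic(pred))

# Toy terms: squared error, an L2 regularizer, and zero penalties.
loss = total_loss(
    pred=2.0, target=3.0, theta=[0.5], rho=0.1, gamma=1.0, eta=0.5,
    data_loss=lambda p, t: (p - t) ** 2,
    reg=lambda th: sum(w * w for w in th),
    l_biomech=lambda p: 0.0,
    l_aesthetic=lambda p: 0.0,
)
# loss = (2 - 3)^2 + 0.1 * 0.25 = 1.025
```

In the actual system the biomechanics term would penalize physically implausible joint/muscle forces and moments computed from the generated motion, rather than returning a constant.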
