**Abstract**

This chapter presents a game-theoretic training model that enables a deep learning solution for rapid discovery of satellite behaviors from collected sensor data. The solution has two parts. Part 1 is a pursuit-evasion (PE) game model that provides a data augmentation method, and Part 2 uses convolutional neural networks (CNNs) for satellite behavior classification. The sensor data are propagated with the various maneuver strategies from the proposed space game models. Under the PE game-theoretic framework, various satellite behaviors are simulated to generate labeled synthetic datasets for training models that detect space object behaviors. To evaluate the performance of the proposed PE model, a CNN model is designed and implemented for satellite behavior classification using Python 3 and TensorFlow. The simulation results show that the trained machine learning model classifies satellite behaviors efficiently, with an accuracy of up to 99.8%.

**Keywords:** space situational awareness, satellite characterization, sensor models, simulated training data, training performance, general-sum games, CNN

#### **1. Introduction**

As space has become heavily utilized, society has grown increasingly dependent on space-based capabilities in industrial, civil, and commercial applications. This dependence brings an essential vulnerability, in particular the lack of continuous situational awareness of the space environment needed to ensure freedom of movement. Because information from space is crucial for key decision-making, such as urban, agricultural, and responsive planning, space is regarded as a significant frontier. In addition to real-time and hidden-information constraints, the density of space objects significantly increases the complexity of space situational awareness (SSA). Understanding the position of space objects from low-level information fusion can support the high-level information fusion SSA missions of sensor, user, and task refinement [1]. To implement SSA accurately, the evaluation of resident space objects (RSOs) can be coordinated through user-defined operation pictures (UDOP) [2].

Space control and SSA are required for space prevalence, and both depend on fast and precise discovery of space object behaviors. The main task of this chapter is to develop a theoretical approach for fast discovery of variations in satellite behaviors, for which machine learning methods with novel neural networks are proposed. However, constructing such tools faces numerous challenges: (i) partially observable movements, (ii) resident space objects (RSOs), (iii) uncertainty modeling and propagation, (iv) real-time response, and (v) computationally intractable algorithms.

Space access analysis and mission trade-off considerations are imperative for the success of space-borne operations. Space object tracking algorithms can be evaluated on data collected to track satellites, debris, and natural phenomena (such as comets, asteroids, and solar flares). Tracking is related to sensor management, which points sensors at observation points to assess the situation and to become aware of threats. SSA improvements consist of models (such as orbital mechanics), measurements, computational software (such as tracking), and application-based system coordination (such as situations). For instance, *game-theoretic* methods for SSA can be used for pursuit-evasion analysis [3].

This chapter develops game theoretic training enabled deep learning (GTEL) methodologies for fast discovery of satellite behaviors. GTEL is an adaptive feedback, adversarial game-theoretic method that acquires sensor data on the relationship between resident space objects (RSOs) of interest and ground and space sensing assets (GSAs). The situation is therefore modeled as a game rather than as a control problem. Game reasoning combines data-level fusion, uncertainty modeling/propagation, and RSO detection/tracking to predict future RSO-GSA relationships. The adversarial engine also supports an optional space pattern dictionary/semantic rules for adaptive transitions in the Markov game. If no existing pattern dictionary is available, GTEL constructs an initial pattern and modifies it during game inference. The output of GTEL inference consists of two control components: (i) measurement processing and (ii) RSO localization. The two components establish a *game equilibrium*, one side of which is sensing asset management and the other the estimation of RSO behaviors.

The chapter is organized as follows. Section 2 introduces the overall system design. Section 3 presents the Markov game theory with satellite maneuvering. Section 4 details our machine learning model for space behavior detection. Section 5 presents the numerical results and analysis. Finally, Section 6 concludes the chapter.

#### **2. Overall system architecture**

The proposed GTEL methodology is shown in **Figure 1**. The core of the method for detecting *unknown* patterns of space objects is the Markov Game Engine. Because the patterns are unknown, no training data are available in advance. The Markov Game Engine therefore uses zero-shot learning [4] and transfer learning to classify the unavailable target-domain data without supervision, by training on available data from the source domain (i.e., simulated data). The knowledge adaptation, or domain transfer, is performed through manifold learning, which shares intermediate semantic embeddings (such as attributes) between labeled and unlabeled data. On the one hand, manifold learning reduces the dimension of the sensing data (such as azimuth angle, elevation angle, range, and range rate). On the other hand, it mitigates the difficulties of object tracking and detection, enabling fast detection of space object behaviors. Moreover, to handle the *uncertainties* in this task, the ℝ<sup>5</sup> × coordinate system is used with filtering technology on *optimal transport (OT)*. The essential components of our methodology are as follows:

*Game Theoretic Training Enabled Deep Learning Solutions for Rapid Discovery of Satellite… DOI: http://dx.doi.org/10.5772/intechopen.92636*


**Figure 1.** *GTEL system architecture.*

The simulated satellite positions are used to generate sensor measurements, which are then used to track several space objects and estimate their positions. The orbit estimates are subsequently used for collision alerts and maneuver detection of space objects. The satellite maneuvers are interpreted as platform commands that carry out courses of action and space object movements.
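To make the step from simulated positions to sensor measurements concrete, the following minimal sketch converts a relative position and velocity into the four sensing quantities named in this section (azimuth, elevation, range, and range rate). The local east-north-up frame, the units, and the function names `position_to_measurement` and `range_rate` are assumptions made for illustration; the chapter does not specify its measurement equations.

```python
import math


def position_to_measurement(rel_pos):
    """Convert a relative position (east, north, up) into
    (azimuth_deg, elevation_deg, range).

    Assumption: the target position is expressed in a local
    east-north-up frame centered on the sensor.
    """
    east, north, up = rel_pos
    rng = math.sqrt(east ** 2 + north ** 2 + up ** 2)
    if rng == 0.0:
        raise ValueError("target coincides with the sensor")
    # Azimuth is measured clockwise from north, wrapped to [0, 360).
    azimuth = math.degrees(math.atan2(east, north)) % 360.0
    # Clamp the ratio to guard against floating-point rounding.
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, up / rng))))
    return azimuth, elevation, rng


def range_rate(rel_pos, rel_vel):
    """Range rate: projection of the relative velocity on the line of sight."""
    rng = math.sqrt(sum(p * p for p in rel_pos))
    if rng == 0.0:
        raise ValueError("target coincides with the sensor")
    return sum(p * v for p, v in zip(rel_pos, rel_vel)) / rng


if __name__ == "__main__":
    az, el, rng = position_to_measurement((3.0, 4.0, 0.0))
    print(az, el, rng, range_rate((3.0, 4.0, 0.0), (3.0, 4.0, 0.0)))
```

In a simulation loop, measurement noise would typically be added to these values before they are passed to the tracker.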

**Figure 1** shows an adaptive feedback approach enabled by game theory. It uses sensors to obtain information about the relationships between the ground/space sensing assets and the RSOs of interest [11]. Because both the RSOs and the GSAs determine these relationships, the problem is a dynamic game rather than a control problem. Data-level fusion, game reasoning, RSO detection/tracking, and uncertainty modeling/propagation are combined to predict future RSO-SA relationships. The game engine also supports an optional space behavior dictionary/semantic rules for adaptive transition matrices in the Markov game. In addition, the game solution is an equilibrium determined by both the space sensing asset management and the (estimated) RSO behaviors.

#### **3. Markov game in space situational awareness**

Lloyd Shapley introduced the concept of the stochastic game [6], a dynamic game with probabilistic transitions played by one or more players. The game proceeds in stages. At the start of each stage, the game is in some state. Each player then selects an action, and each player receives a reward that depends on the current state and the chosen actions. The game then moves to a new random state whose distribution is determined by the previous state and the actions chosen by the players. The procedure is repeated in the new state. After a finite or infinite number of stages, the total reward of each player is taken as the discounted sum of the stage rewards or as the average of the stage rewards, and the players' total rewards can then be compared. Stochastic games thus generalize both Markov decision processes (MDPs) and repeated games. Our space situational awareness (SSA) approach uses this game tool for intent prediction [12].

The Markov game engine extracts the following information from each event: (*i*) a finite set of players, *N*; (*ii*) a finite or infinite set of states, *S*; (*iii*) for each player *i* in *N*, a finite set of available actions, *D*<sup>*i*</sup> (the overall action space is *D* = ×<sub>*i*∈*N*</sub> *D*<sup>*i*</sup>); (*iv*) a transition rule *q*: *S* × *D* → ∆(*S*), where ∆(*S*) is the space of all probability distributions over *S*; and (*v*) a reward function *r*: *S* × *D* → ℝ<sup>*N*</sup>.
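The five elements above can be written down directly as a small program. The sketch below is a toy two-player, two-state game in which every state name, action name, probability, and reward is an assumption chosen for illustration and not a value from the GTEL engine. It plays the game stage by stage, as described for stochastic games, and returns each player's discounted total reward.

```python
import random

# Hypothetical two-player game: every state, action, probability, and reward
# below is an illustrative assumption, not a value from the chapter.
PLAYERS = ("observer", "target")                      # N
STATES = ("tracked", "lost")                          # S
ACTIONS = {"observer": ("sensor_off", "sensor_on"),   # D^i for each player i
           "target": ("coast", "maneuver")}


def transition(state, joint_action):
    """Transition rule q: S x D -> Delta(S), returned as {state: probability}."""
    sensor, move = joint_action
    p_tracked = 0.9 if sensor == "sensor_on" else 0.5
    if move == "maneuver":
        p_tracked -= 0.3
    if state == "lost":
        p_tracked -= 0.1
    return {"tracked": p_tracked, "lost": 1.0 - p_tracked}


def reward(state, joint_action):
    """Reward function r: S x D -> R^N, one entry per player."""
    sensor, move = joint_action
    r_observer = (1.0 if state == "tracked" else 0.0) \
        - (0.2 if sensor == "sensor_on" else 0.0)
    r_target = (1.0 if state == "lost" else 0.0) \
        - (0.2 if move == "maneuver" else 0.0)
    return (r_observer, r_target)


def discounted_return(stage_rewards, gamma):
    """Discounted sum of a sequence of stage rewards."""
    return sum(r * gamma ** t for t, r in enumerate(stage_rewards))


def play(policies, start_state, stages, gamma, seed=0):
    """Play a finite number of stages; return each player's discounted return."""
    rng = random.Random(seed)
    state = start_state
    history = []
    for _ in range(stages):
        # Each player picks an action based on the current state.
        joint_action = tuple(policies[p](state) for p in PLAYERS)
        history.append(reward(state, joint_action))
        # The next state is drawn from the distribution q(state, joint_action).
        probs = transition(state, joint_action)
        state = "tracked" if rng.random() < probs["tracked"] else "lost"
    return tuple(discounted_return([h[i] for h in history], gamma)
                 for i in range(len(PLAYERS)))


if __name__ == "__main__":
    fixed = {"observer": lambda s: "sensor_on", "target": lambda s: "coast"}
    print(play(fixed, "tracked", stages=20, gamma=0.9))
```

The policies here are fixed functions of the state. Solving for an equilibrium, as the game engine does, would require searching over such policies and is outside the scope of this sketch.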

**Figure 2** gives a visual description of the states of a simple game with only two players, each of whom has only two actions. The arrows in the graph indicate the possible transitions from one state to another. Red states indicate that only player 1 changes approach, blue states indicate that only player 2 changes approach, and the green state indicates that both players change approach.

The GTEL solution uses a two-player Markov game to investigate sensor management for tracking space objects. Whether deliberately or unintentionally, some space objects may confuse observers (sensors) when they perform orbital maneuvers. In general, space object tracking can be treated as an optimal control problem (one-sided optimization) or as a game problem (two-sided optimization). In the optimal control setting, the position and velocity of the space objects are estimated (filtered) as states from the sensor measurements. However, this method ignores the possibility that space objects may intelligently and purposely alter their orbits, which can make it difficult for the observer satellite to track them. The Markov game method addresses this difficulty. In this approach, on the one hand, the observed satellite uses the tracking and sensing model to degrade the tracking estimates by confusing the observer. On the other hand, the observer seeks to minimize the tracking uncertainty, which is measured by the entropy of tracking.
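As a rough illustration of how tracking entropy can quantify uncertainty, the sketch below computes the differential entropy of a Gaussian state estimate and shows that a measurement update lowers it. Treating the tracking entropy as Gaussian differential entropy with a diagonal covariance is an assumption made here for simplicity; the chapter does not give its exact entropy definition, and the function names are hypothetical.

```python
import math


def gaussian_entropy(variances):
    """Differential entropy (in nats) of a Gaussian with diagonal covariance.

    Assumption: the state errors are independent, so the covariance is
    diagonal and the entropy is the sum of the per-component entropies.
    """
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    return 0.5 * sum(math.log(2.0 * math.pi * math.e * v) for v in variances)


def measurement_update(prior_var, meas_var):
    """Posterior variance of a scalar Gaussian state after one direct measurement."""
    return prior_var * meas_var / (prior_var + meas_var)


if __name__ == "__main__":
    prior = 4.0                                   # variance before measuring
    posterior = measurement_update(prior, 1.0)    # variance after measuring
    print(gaussian_entropy([prior]), gaussian_entropy([posterior]))
```

Under this reading, the observer prefers actions that shrink the covariance, while a maneuvering target tends to inflate it.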

**Figure 2.** *A diagram of states in a Markov game.*

**Figure 3.** *SBO with a bistatic solar angle.*

**Figure 4.** *LEO and GEO based on sensor management and maneuver strategies with game theory.*

**Figure 5.** *The performance of tracking using theoretical Markov game strategies.*

In this chapter, the information uncertainty of the GTEL pursuit-evasion (PE) game method [13] is exercised in a scenario with two satellites: a Geostationary Earth Orbit (GEO) satellite (the observed satellite) and a space-based Low Earth Orbit (LEO) satellite (the observer satellite). **Figure 3** illustrates a space-based optical (SBO) sensor measurement model. The bistatic solar angle, denoted θ, is the angle between the line from the object to the SBO and the line from the object to the sun. The smaller the angle, the better the lighting conditions. Observation therefore becomes difficult when the angle is large, because of saturation of lighting. **Figure 4** shows the lighting scenario. In the graph, the blue line is the LEO orbit, the green line is the GEO orbit with maneuvers, the pink lines mark where the SBO sensor is used to track the GEO satellite in order to lower the uncertainty, and the red line shows the direction of sunlight from the sun to the Earth.
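The bistatic solar angle θ described above follows from the geometry of the three positions. The sketch below computes it as the angle between the object-to-SBO vector and the object-to-sun vector. The function name and the use of plain Cartesian coordinates in a common frame are assumptions for illustration.

```python
import math


def bistatic_solar_angle(obj, sbo, sun):
    """Angle in degrees between the object-to-SBO and object-to-sun lines.

    All three arguments are 3D positions in the same Cartesian frame
    (the frame and units are assumptions; only the geometry matters here).
    """
    to_sbo = [s - o for s, o in zip(sbo, obj)]
    to_sun = [s - o for s, o in zip(sun, obj)]
    norm_sbo = math.sqrt(sum(c * c for c in to_sbo))
    norm_sun = math.sqrt(sum(c * c for c in to_sun))
    if norm_sbo == 0.0 or norm_sun == 0.0:
        raise ValueError("object must not coincide with the SBO or the sun")
    cos_theta = sum(a * b for a, b in zip(to_sbo, to_sun)) / (norm_sbo * norm_sun)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Sensor and sun on the same side of the object: small angle.
    print(bistatic_solar_angle((0, 0, 0), (1, 0, 0), (10, 1, 0)))
```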

The research scenario is shown in **Figure 4**, where the red line indicates the direction of sunlight, the green line is the GEO orbit with maneuvers, the blue line is the LEO orbit, and the pink lines indicate when the SBO sensor resource is used to track the GEO satellite (that is, when sensor data are used to lower the uncertainty).

**Figure 5** shows the tracking results based on intermittent measurements for both the cubature Kalman filter (CKF) and the extended Kalman filter (EKF) trackers. As the period without measurements caused by Earth blockage grows, the tracking errors increase. The tracking errors also increase with the satellite's maneuver actions. In contrast, sensor measurements decrease the information entropy.
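The growth of tracking error during measurement gaps can be reproduced with a very small filter. The sketch below runs the variance recursion of a scalar linear Kalman filter, used here as a simplified stand-in for the EKF and CKF trackers of the chapter. The process noise `q`, measurement noise `r`, and the outage pattern are illustrative assumptions.

```python
def propagate_variance(p0, q, r, measured):
    """Variance history of a scalar Kalman filter with intermittent measurements.

    p0:       initial error variance
    q:        process-noise variance added at every prediction step
    r:        measurement-noise variance
    measured: sequence of booleans, True where a measurement is available
    """
    p = p0
    history = []
    for has_measurement in measured:
        p = p + q                      # prediction: uncertainty grows
        if has_measurement:
            p = p * r / (p + r)        # update: uncertainty shrinks
        history.append(p)
    return history


if __name__ == "__main__":
    # Five steps with measurements, five blocked steps, five with measurements.
    pattern = [True] * 5 + [False] * 5 + [True] * 5
    for step, var in enumerate(propagate_variance(1.0, 0.1, 0.5, pattern)):
        print(step, round(var, 4))
```

During the blocked steps the variance rises by `q` at every step, and it drops again as soon as measurements resume, which mirrors the qualitative behavior described for the intermittent-measurement case.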

**Figure 6.** *The maneuver controls dependent on the PE game solution for the satellite direction.*

**Figure 7.** *Sensor's game theoretic on–off controls and associated information gains.*

The top graph of **Figure 6** displays the PE game control results, with α and β as the angles of the maneuver thrust. The bottom graph of **Figure 6** shows a zoomed-in view of the optimal game controls, which indicates that the observer satellite can keep the tracking uncertainty within a desired level while saving sensor resources. **Figure 7** shows the on–off sensor controls and the associated information gains. The results show that the sensor spends resources on taking measurements when the potential information gain is larger (for the observer's on–off control, 0 means off and 1 means on).
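The on–off behavior can be imitated with a simple threshold rule: take a measurement only when its potential information gain is large enough. The sketch below is such a rule for a scalar Gaussian state, with the information gain taken as the entropy reduction produced by one measurement. The threshold policy, the noise values, and the function names are assumptions for illustration and are not the game-theoretic solution computed in the chapter.

```python
import math


def information_gain(prior_var, meas_var):
    """Entropy reduction (in nats) from one measurement of a scalar Gaussian state."""
    return 0.5 * math.log(1.0 + prior_var / meas_var)


def on_off_schedule(p0, q, r, threshold, steps):
    """Return (controls, gains): 1 where the sensor is switched on, else 0.

    Hypothetical policy: measure only if the potential gain reaches `threshold`.
    """
    p = p0
    controls, gains = [], []
    for _ in range(steps):
        p += q                                   # uncertainty grows each step
        gain = information_gain(p, r)            # gain if we measured now
        use_sensor = 1 if gain >= threshold else 0
        if use_sensor:
            p = p * r / (p + r)                  # measurement shrinks variance
        controls.append(use_sensor)
        gains.append(gain)
    return controls, gains


if __name__ == "__main__":
    controls, gains = on_off_schedule(p0=0.1, q=0.1, r=0.5, threshold=0.3, steps=12)
    print(controls)
```

With these settings the sensor stays off while the uncertainty is small and switches on once the potential gain crosses the threshold, so the variance stays bounded while some sensor resources are saved.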
