**3. Opening the "black-box" of human capital productivity**

New science provides a theoretical framework for creating an ODT that models an organization's human capital productivity. First, one must understand how employees produce economic value. The theory of Quality of Working Life (QWL) determines the share of effective working time out of the time spent at work. According to the human capital production function, the staff's effective working time multiplied by the coefficient K produces customer value, which is measured by revenue. The coefficient K describes the business branch, tangible investments and business logic. Improving QWL requires HR development, which increases auxiliary working time and thus reduces the time available for work [12] (**Figure 2**).

The human capital production function can be written with revenue as the production volume, as in Equation (1) [13]:

$$R = K \ast L \ast \mathit{TWh} \ast (1 - \mathit{Ax}) \ast \mathit{QWL} \tag{1}$$


*The Digital Twin of an Organization by Utilizing Reinforcing Deep Learning*


*DOI: http://dx.doi.org/10.5772/intechopen.96168*


and

$$\text{Profit} = R - \text{Variable costs} - \text{Staff costs} - \text{Other costs} \tag{2}$$

where

R = Revenue [\$].

K = Coefficient relating effective working time to revenue, the HR business ratio [\$/h].

L = Labor capacity in full-time equivalents [pcs].

TWh = Theoretical yearly working time [h].

QWL = Quality of working life, indicating the utilization of human capital as an intangible asset (0–100%).

Ax = Auxiliary working time as a share of the total theoretical working time (vacation, absence, family leave, orientation, training, HR practices, and HRD) [%].

(1 – Ax) = (100% – Ax) = Time available for actual work (time spent at work).

(1 – Ax) \* QWL = Effective working time as a share of the theoretical working time.

Note that the other working time, that is, time spent at work that is not effective, includes so-called internal error factors such as waiting, searching, correcting and unnecessary work. These are symptoms of different kinds of development needs that the team has noticed, whether hidden or conceptual.
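Equations (1) and (2) can be evaluated directly once the inputs are known. The following is a minimal Python sketch, not the authors' implementation. The class name `TeamYear`, the function names and all numeric values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamYear:
    """Inputs of Eq. (1). Field names are hypothetical, chosen for this sketch."""
    k: float          # K: revenue per effective working hour [$/h]
    labor_fte: float  # L: labor capacity in full-time equivalents
    twh: float        # TWh: theoretical yearly working time per FTE [h]
    ax: float         # Ax: auxiliary share of theoretical working time (0..1)
    qwl: float        # QWL: quality of working life (0..1)


def effective_hours(team: TeamYear) -> float:
    """L * TWh * (1 - Ax) * QWL, the effective working time in hours."""
    return team.labor_fte * team.twh * (1.0 - team.ax) * team.qwl


def revenue(team: TeamYear) -> float:
    """Eq. (1): R = K * L * TWh * (1 - Ax) * QWL."""
    return team.k * effective_hours(team)


def profit(team: TeamYear, variable_costs: float,
           staff_costs: float, other_costs: float) -> float:
    """Eq. (2): Profit = R - variable costs - staff costs - other costs."""
    return revenue(team) - variable_costs - staff_costs - other_costs


# Made-up example values, not taken from the chapter.
team = TeamYear(k=100.0, labor_fte=10.0, twh=2000.0, ax=0.20, qwl=0.60)
print(round(revenue(team)))                                  # 960000
print(round(profit(team, 200_000.0, 600_000.0, 100_000.0)))  # 60000
```

With these assumed inputs, 20,000 theoretical hours shrink to 16,000 hours at work and 9,600 effective hours, which is why Ax and QWL both act as levers on revenue.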


*Deep Learning Applications*

The Markov property means that the future depends only on the present state and not on the history that led to it. Supervised regression on past data alone therefore cannot be used to create an ODT. The Markov property is one backbone for creating an ODT and for utilizing reinforcement learning, where the behavior of the agents determines the future.

State transitions follow a Markov chain in which all necessary information is carried from the past into the present state. Therefore, the probability of transition from the current state to the next state depends only on the current data and the actions of the players. In the digital twin, this current data must be sufficient to determine the reality that the twin represents. The data in the twin can be measured and verified against reality, which creates a feedback loop from the real world. This verification of the model against reality is also necessary for learning, so that the ODT can learn to refine its transition functions to match the real world (**Figure 1**).
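As a minimal sketch of these two ideas, the Python below samples a next state from the current state and action only, and re-estimates transition probabilities from observed transitions. The state names, action names and probabilities are hypothetical. Simple frequency counting is an assumption that stands in for whatever refinement the ODT actually uses.

```python
import random
from collections import Counter, defaultdict

# Hypothetical transition model: P[state][action] -> {next_state: probability}.
P = {
    "low_qwl": {
        "develop": {"low_qwl": 0.4, "high_qwl": 0.6},
        "wait": {"low_qwl": 0.9, "high_qwl": 0.1},
    },
    "high_qwl": {
        "develop": {"low_qwl": 0.1, "high_qwl": 0.9},
        "wait": {"low_qwl": 0.3, "high_qwl": 0.7},
    },
}


def step(state, action, rng):
    """Sample the next state; only the current state and action are used."""
    outcomes = P[state][action]
    return rng.choices(list(outcomes), weights=list(outcomes.values()))[0]


def estimate_transitions(observations):
    """Estimate P(next_state | state, action) from (s, a, s_next) triples."""
    counts = defaultdict(Counter)
    for state, action, next_state in observations:
        counts[(state, action)][next_state] += 1
    return {
        key: {s: n / sum(c.values()) for s, n in c.items()}
        for key, c in counts.items()
    }


rng = random.Random(0)
state = "low_qwl"
log = []
for action in ["develop", "wait", "develop", "wait"]:
    next_state = step(state, action, rng)
    log.append((state, action, next_state))  # stands in for real-world data
    state = next_state
print(estimate_transitions(log))
```

In a real twin the `log` would come from measured data, so the estimated probabilities could be compared with the model's own `P` to close the feedback loop.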


**Figure 2.** *Illustration of profit team human capital production function.*

When a manager develops the team efficiently, effective working time can increase. In addition, if absence and staff turnover are high, development may reduce them and thus increase the time available for work. In this way, effective team development increases effective working time and has a positive effect on profit. In the short term, however, development increases auxiliary working time and reduces the time available for work, which reduces both revenue and profit. The development of human capital therefore has the character of an investment: it requires some sacrifice now in order to gain delayed rewards. When the agents' actions involve this kind of investment, a Q-learning function can be used.
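The trade-off can be made concrete with Equation (1). As a rough sketch, hold K, L and TWh fixed and ignore the direct cost of the development itself. Suppose development raises the auxiliary share from Ax to Ax′ and eventually raises QWL to QWL′. The figures Ax = 20%, Ax′ = 22% and QWL = 60% are illustrative assumptions, not values from the study.

$$\frac{R'}{R} = \frac{(1 - \mathit{Ax}') \ast \mathit{QWL}'}{(1 - \mathit{Ax}) \ast \mathit{QWL}} > 1 \iff \mathit{QWL}' > \mathit{QWL} \ast \frac{1 - \mathit{Ax}}{1 - \mathit{Ax}'} = 0.60 \ast \frac{0.80}{0.78} \approx 0.615$$

Under these assumptions, revenue falls by 2.5% (0.78 / 0.80 = 0.975) for as long as QWL is unchanged, and it exceeds its starting level only once QWL rises above about 61.5%.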

Q-learning is a mathematical method for analyzing behavioral learning in a simulation model that weighs short- and long-term rewards. A Nash equilibrium is reached when Q-learning settles at a level where the model environment is stable and no player can improve their pay-off by changing strategy unilaterally [14]. In this case, equilibrium is achieved with a behavior in which both QWL and profit settle at a certain level. Both QWL and profit are rewards for the agents of the management game, and in the short term they may conflict because improving QWL reduces profit. This article shows that a leadership game has several equilibrium states.
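How Q-learning weighs short- against long-term rewards can be sketched with a toy example. The code below is plain single-agent tabular Q-learning, not the Nash Q-learning used in this chapter. The two states, two actions, rewards and parameters are all hypothetical.

```python
import random

STATES = ("low_qwl", "high_qwl")
ACTIONS = ("operate", "develop")


def env_step(state, action):
    """Toy deterministic model: returns (next_state, monthly reward)."""
    if action == "develop":
        # Development costs profit this month but lifts or keeps QWL high.
        return "high_qwl", (0.0 if state == "low_qwl" else 2.0)
    # Operating keeps the state; a high-QWL team earns more.
    return state, (1.0 if state == "low_qwl" else 3.0)


def q_learning(gamma, episodes=3000, months=12, alpha=0.2, seed=0):
    """Learn Q(state, action) with discount factor gamma."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "low_qwl"  # every simulated year starts with low QWL
        for _ in range(months):
            action = rng.choice(ACTIONS)  # random exploration (off-policy)
            next_state, reward = env_step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


def greedy(q, state):
    """Action with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


far_sighted = q_learning(gamma=0.9)  # values delayed rewards
myopic = q_learning(gamma=0.0)       # values only this month's reward
print(greedy(far_sighted, "low_qwl"))  # develop
print(greedy(myopic, "low_qwl"))       # operate
```

The discount factor `gamma` carries the investment idea: with `gamma=0.9` the agent accepts a lower reward now in order to reach the better-paying state, while with `gamma=0.0` it never develops the team.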

In most traditional well-being and commitment surveys, scores are averages of factors that are not individually relevant to the whole. The result is, for example, an engagement index that does not necessarily tell what to improve, how to improve it, or what impact the improvement would likely have. Traditional well-being surveys based on average scores oversimplify the measurement of human performance. From the ODT perspective, it is essential that staff performance is determined realistically, because it affects the reward and transition functions of the agents' behavior. Therefore, it is essential to describe the theory of QWL.

It seems evident that human performance is a rather complex phenomenon consisting of several motivation-theoretical aspects that cannot be captured by simplified statistical analysis of staff surveys. Therefore, we have used the motivation theories of Alderfer [15], Antonovsky [16], Kano [17] and Herzberg [18] to create an advanced theory of human performance that draws on the contributions of these main scientists and yields a practical QWL index for performance analytics. The QWL index includes three self-esteem categories, each of which has a unique effect on performance.

**Figure 3.** *The theory of QWL.*

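The self-esteem categories are physical and emotional safety (PE), collaboration and identity (CI), and objectives and creativity (OC).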


The chosen categories and their effects on performance form the theory of the QWL index. It is also important to note that, in addition to being a production parameter, the QWL index has a logical connection to customer satisfaction (see [17]) (**Figure 3**).

Finally, the QWL index combines all three self-esteem factors according to the following equation:

$$\mathit{QWL} = PE(x_1) \ast \left[ \frac{CI(x_2) + OC(x_3)}{2} \right] \tag{3}$$

where

QWL is the quality of working life index (0 … 1).

PE(x1) is the value of the function of physical and emotional safety.

CI(x2) is the value of the function of collaboration and identity.

OC(x3) is the value of the function of objectives and creativity.

The functions of the self-esteem categories are adjusted so that the final QWL result is always between 0 and 100% [12].

At a Nash equilibrium, the optimal outcome of the game is one in which no agent wants to deviate from the chosen policy, because it appears to be aligned with the opponents' policy. Workplace problems tend to reduce workers' self-esteem, thus decreasing QWL as a production parameter. Management practices tend to improve QWL, but each action reduces short-term profit. The manager's strategy hypothesis guides the actions taken at different state events. When the consequences of these actions update the state after each Markov sequence, the player can update the management strategy, which in turn controls the next actions. Bayesian probability relates to the player's subjective behavior and relies on the idea that rational thinking will probably lead to an optimal result as new information becomes available [19].

The manager should learn the optimal leadership strategy without knowing the exact reward function or state transition function. This approach is called stochastic model-free reinforcement learning and can be defined with the Nash Q-learning approach. The leader has a prior belief about the state of nature of the profit unit's business situation and about the expected future reward. The uniqueness of the game comes from its predictive features, which allow reinforcement learning to be used to learn the Nash equilibrium between staff QWL and organization profit.

The management game is a signaling game, since workers give essential signals about possible workplace problems that may threaten their self-esteem (QWL) and therefore team performance. The workers' preferred strategy is to signal these problems to their leader. In a simplified digital learning game for team leaders, the workers' strategy may be stationary, meaning that the workers' behavior can be chosen in advance when the scenario of events is known.

The team leader, as an agent of the management game, is responsible for the team's profit performance, which is the outcome of producing customer value measured by revenue. The agent registers the workers' signals and forms a prior belief for the strategy. The agent also monitors scorecards of the business outcomes, monthly and cumulative profit, and forms a prior-belief policy on how to act on these measures. The agent is rewarded with the monthly profit and, at the end of the year, with the cumulative profit. After each state transition, the agent receives profit signals and QWL signals, the latter arising from the workers' response to the change in their QWL. State-change signals and reward results may change the agent's preferred strategy for the next sequence (Markov sequence [19]) (**Figure 4**).

**Figure 4.** *Leader's prior belief is biased and this strategy leads to delayed punishment.*
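Equation (3) can be illustrated numerically. The sketch below assumes that the three category functions PE(x1), CI(x2) and OC(x3) have already been evaluated and scaled to 0 … 1. Their exact form is not given here, so the function takes their values as inputs. The function name `qwl_index` and the example values are hypothetical.

```python
def qwl_index(pe: float, ci: float, oc: float) -> float:
    """Eq. (3): QWL = PE * (CI + OC) / 2, with all inputs scaled to 0..1."""
    for name, value in (("pe", pe), ("ci", ci), ("oc", oc)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0..1, got {value}")
    return pe * (ci + oc) / 2.0


# Made-up category values, not survey results from the chapter.
pe, ci, oc = 0.8, 0.7, 0.5
print(round(qwl_index(pe, ci, oc), 3))   # 0.48
print(round((pe + ci + oc) / 3.0, 3))    # 0.667, a plain average for comparison
print(qwl_index(0.0, 1.0, 1.0))          # 0.0
```

Because PE multiplies the other two categories, a team with no physical and emotional safety gets a QWL of zero however well it scores elsewhere. A plain average of the three scores would hide this.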
