## 2. VPIN software and formal flash crash definition

In this section, we first recall the VPIN model. Second, we propose a definition of flash crashes used to compute precision and recall rates. Finally, we present the data used in our tests.

An Assessment of the Prediction Quality of VPIN DOI: http://dx.doi.org/10.5772/intechopen.86532

#### 2.1.4 VPIN formula

The VPIN formula is computed on $n$ successive buckets, where $n$ is the VPIN support. A buffer is defined as $n$ successive buckets. Here is the VPIN formula, approximating (1) upon bucket number $j$ ($j \geq n$):

$$\mathrm{VPIN}_j = \frac{\sum_{i=j-n+1}^{j} \left| V^b_{\mathrm{bucket},i} - V^s_{\mathrm{bucket},i} \right|}{n V_{\mathrm{bucket}}} \tag{4}$$

For a given bucket $i$:

• $V^s_{\mathrm{bucket},i} = \sum_{j \in \mathrm{bucket}_i} V^s_j$

• $V^b_{\mathrm{bucket},i} = \sum_{j \in \mathrm{bucket}_i} V^b_j$

In order to distribute all VPIN values between 0 and 1, in practice, VPIN is normalized through a normal law. We thus consider $\mathrm{VPIN}_{\mathrm{normalized}}$ in the following.

#### 2.1.5 VPIN event

A VPIN event is declared when the following occurs:

$$\mathrm{VPIN}_{\mathrm{normalized}} \geq \theta_{\mathrm{VPIN}} \tag{5}$$

where $\theta_{\mathrm{VPIN}}$ is a given decision threshold. In practice [5], $\theta_{\mathrm{VPIN}} = 0.99$.

### 2.2 Defining flash crashes with MIR

#### 2.2.1 Formal definition

Let $(p_t)_t$ be a time series (e.g., of prices). Here is the definition of MIR:

$$\mathrm{MIR}_{t,\eta} = \max_{i \neq j,\; i,j \in [t;\,t+\eta]} \frac{|p_i - p_j|}{p_i} \tag{6}$$

A flash crash will depend on two things here:

• The amplitude of the crash, which means extreme MIR values (e.g., 10%)

• The shortness of the fall, which means the shortness of the time window $\eta$ over which $\mathrm{MIR}_{t,\eta}$ is computed (e.g., 10 minutes). More precisely, noting $(i^*, j^*) = \operatorname{argmax}_{i \neq j,\; i,j \in [t;\,t+\eta]} \frac{|p_i - p_j|}{p_i}$, the fall has length $|j^* - i^*|$.
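The MIR definition (6) and the fall length $|j^* - i^*|$ can be illustrated with a brute-force search. The following is only a minimal sketch under stated assumptions: the function name `mir` is hypothetical, time is represented by integer sample indices (so `eta` is a number of samples rather than minutes), and prices are assumed strictly positive.

```python
from typing import Sequence, Tuple


def mir(prices: Sequence[float], t: int, eta: int) -> Tuple[float, int, int]:
    """Brute-force MIR over the index window [t, t + eta], as in Eq. (6).

    Returns (MIR value, i*, j*); the fall length is abs(j* - i*).
    Assumption: indices are sample positions and prices are positive.
    """
    hi = min(t + eta, len(prices) - 1)  # clip the window to the data
    if hi - t < 1:
        raise ValueError("window must contain at least two points")
    best = (-1.0, t, t)
    for i in range(t, hi + 1):
        for j in range(t, hi + 1):
            if i != j:
                value = abs(prices[i] - prices[j]) / prices[i]
                if value > best[0]:  # first maximizer is kept on ties
                    best = (value, i, j)
    return best


value, i_star, j_star = mir([100.0, 102.0, 90.0, 95.0, 101.0], t=0, eta=4)
fall_length = abs(j_star - i_star)
```

A candidate flash crash could then be flagged by comparing `value` with an amplitude threshold such as the 10% mentioned above, while `fall_length` measures how short the move was. The quadratic search is kept for readability only.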

### 2.1 VPIN software

Easley et al. [12] designed a model of the high-frequency financial market based on informed and uninformed traders. This model makes it possible to compute a probability of informed trading (PIN). Easley et al. [1] use these results and define an easy way to compute PIN from trade data alone. We briefly describe the VPIN model used in previous literature. The theoretical study of the model is treated in a separate research study.

#### 2.1.1 Bars

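Following Easley et al. [1], a bar is a fixed volume of trades that are successive in time. With such a definition, one can associate the following quantities with each bar:

• A nominal price, computed according to a given technique (mean price, median price, closing price, opening price, etc.)

• A nominal time (first trade time, last trade time)

• Local maximum and minimum values of trades

In practice, the last few trades that do not fill up a bar are carried over to the next bar.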
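The construction of bars can be sketched as follows. This is an illustrative example, not the software used in the tests: the `Bar` class and `build_bars` function are hypothetical names, each trade is assumed to carry one unit of volume (so that a fixed volume is a fixed number of trades), the nominal price is taken to be the closing price, and the nominal time is the last trade time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bar:
    price: float  # nominal price (assumption: closing price)
    time: float   # nominal time (assumption: last trade time)
    high: float   # local maximum of trade prices
    low: float    # local minimum of trade prices


def build_bars(trades: List[Tuple[float, float]], trades_per_bar: int):
    """Group successive (time, price) trades into bars of a fixed size.

    Returns the completed bars and the leftover trades that do not fill
    a bar; the leftover is meant to be prepended to the next batch.
    """
    if trades_per_bar <= 0:
        raise ValueError("trades_per_bar must be positive")
    bars = []
    n_full = len(trades) // trades_per_bar
    for k in range(n_full):
        chunk = trades[k * trades_per_bar:(k + 1) * trades_per_bar]
        prices = [p for _, p in chunk]
        bars.append(Bar(price=prices[-1], time=chunk[-1][0],
                        high=max(prices), low=min(prices)))
    leftover = trades[n_full * trades_per_bar:]
    return bars, leftover


trades = [(0.0, 10.0), (1.0, 11.0), (2.0, 9.0), (3.0, 10.5), (4.0, 10.2)]
bars, leftover = build_bars(trades, trades_per_bar=2)
```

Here the fifth trade does not fill a bar, so it is returned in `leftover` rather than discarded.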

#### 2.1.2 Bulk volume classification

The computation of VPIN requires determining the direction of trades, i.e., classifying each trade as a buy or a sell. The method used here is bulk volume classification (BVC) [1, 5]. Let $V_b$ denote the volume of a bar, $j$ the index of a bar ($j > 0$), and $P_j$ its price (closing, opening, median, or mean). Then the number of buys $V_j^b$ within bar $j$ is determined according to this formula:

$$V_j^b = V_b \, Z\left(\frac{P_j - P_{j-1}}{\sigma}\right) \tag{2}$$

where $Z$ is the cumulative distribution function of a given law (usually a Student or normal distribution) and $\sigma$ is the standard deviation of the numerator $P_j - P_{j-1}$ over a number of successive bars. In our tests, $\sigma$ is computed once over all successive values of the data set, and the Student law has parameter one (one degree of freedom). Within bar $j$, the number of sells $V_j^s$ is then

$$V_j^s = V_b \left(1 - Z\left(\frac{P_j - P_{j-1}}{\sigma}\right)\right) \tag{3}$$
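Equations (2) and (3) can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the implementation used in the tests: the function names are hypothetical, the Student law with one degree of freedom is evaluated through its closed form (it coincides with the Cauchy law), and $\sigma$ is taken as the population standard deviation of all price changes.

```python
import math
from statistics import pstdev
from typing import Callable, List, Sequence, Tuple


def cdf_student_1(x: float) -> float:
    """CDF of the Student law with one degree of freedom (Cauchy law)."""
    return 0.5 + math.atan(x) / math.pi


def cdf_normal(x: float) -> float:
    """CDF of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bulk_volume_classification(
    prices: Sequence[float],
    bar_volume: float,
    cdf: Callable[[float], float] = cdf_student_1,
) -> Tuple[List[float], List[float]]:
    """Return (buys, sells) for bars 1..len(prices)-1, as in Eqs. (2)-(3)."""
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    # Assumption: sigma is the population standard deviation of all changes.
    sigma = pstdev(diffs)
    if sigma == 0.0:
        raise ValueError("price changes have zero standard deviation")
    buys = [bar_volume * cdf(d / sigma) for d in diffs]
    sells = [bar_volume - b for b in buys]
    return buys, sells


buys, sells = bulk_volume_classification([100.0, 101.0, 100.0, 100.0], 100.0)
```

A bar whose price does not move is split evenly between buys and sells, and passing `cdf=cdf_normal` switches to the normal law.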


Advanced Analytics and Artificial Intelligence Applications

#### 2.1.3 Buckets



A bucket is defined to be a fixed number of successive trades. Here, to simplify, since bars are also defined as a fixed number of trades, a bucket is taken to be $m$ successive bars. Let $V_{\mathrm{bucket}}$ denote the fixed volume of a bucket. We naturally have $V_{\mathrm{bucket}} = m V_b$.
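The aggregation of bars into buckets, and the VPIN value of Eq. (4) computed over a buffer of $n$ buckets (the sum of absolute buy/sell imbalances divided by $n V_{\mathrm{bucket}}$), can be sketched as follows. The function names are hypothetical, and an incomplete trailing bucket is simply ignored here, which is an assumption of this sketch.

```python
from typing import List, Sequence


def bucket_sums(per_bar: Sequence[float], m: int) -> List[float]:
    """Sum per-bar volumes over successive buckets of m bars each."""
    n_buckets = len(per_bar) // m  # an incomplete trailing bucket is ignored
    return [sum(per_bar[i * m:(i + 1) * m]) for i in range(n_buckets)]


def vpin_series(buys: Sequence[float], sells: Sequence[float],
                bar_volume: float, m: int, n: int) -> List[float]:
    """VPIN_j for each bucket number j >= n (1-based), as in Eq. (4)."""
    vb = bucket_sums(buys, m)
    vs = bucket_sums(sells, m)
    v_bucket = m * bar_volume  # V_bucket = m * V_b
    imbalance = [abs(b - s) for b, s in zip(vb, vs)]
    # Buckets j-n+1..j (1-based) are the slice [j-n:j] in 0-based indexing.
    return [sum(imbalance[j - n:j]) / (n * v_bucket)
            for j in range(n, len(imbalance) + 1)]


buys = [80.0, 80.0, 20.0, 20.0, 50.0, 50.0]
sells = [20.0, 20.0, 80.0, 80.0, 50.0, 50.0]
vpin = vpin_series(buys, sells, bar_volume=100.0, m=2, n=2)
```

With these toy volumes the bucket imbalances are 120, 120, and 0, so the two VPIN values are 0.6 and 0.3.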
