
#### Table 12.

Best parameters maximizing precision+recall rate for different futures and first bar price structure allowing higher bounds for θMIR.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | Bar price |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9404 | 0.9402 | 1.8806 | 0.015 | 30 | 2500 | Gaussian | First |
| EC | 0.9127 | 0.9681 | 1.8808 | 0.006 | 30 | 2500 | Gaussian | First |
| CL | 0.9233 | 0.9728 | 1.8961 | 0.016 | 30 | 2500 | Student | First |
| NQ | 0.8291 | 0.9833 | 1.8124 | 0.01 | 30 | 2500 | Student | First |
| YM | 0.9517 | 0.9673 | 1.9190 | 0.015 | 50 | 2500 | Student | First |

#### Table 13.

Best parameters maximizing precision+recall rate for different futures and median bar price structure allowing higher bounds for θMIR.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | Bar price |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9499 | 0.9498 | 1.8997 | 0.015 | 30 | 2500 | Student | Median |
| EC | 0.9037 | 0.9717 | 1.8754 | 0.006 | 30 | 2500 | Student | Median |
| CL | 0.9265 | 0.9718 | 1.8983 | 0.016 | 30 | 2500 | Student | Median |
| NQ | 0.9243 | 0.9017 | 1.8260 | 0.015 | 30 | 2500 | Gaussian | Median |
| YM | 0.9829 | 0.9427 | 1.9256 | 0.015 | 30 | 2500 | Gaussian | Median |

#### Table 14.

Best parameters maximizing precision+recall rate for different futures and mean bar price structure allowing higher bounds for θMIR.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | Bar price |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9526 | 0.9454 | 1.8979 | 0.015 | 30 | 2500 | Student | Mean |
| EC | 0.9058 | 0.9691 | 1.8749 | 0.006 | 30 | 2500 | Student | Mean |
| CL | 0.9302 | 0.9670 | 1.8972 | 0.016 | 30 | 2500 | Gaussian | Mean |
| NQ | 0.9407 | 0.8796 | 1.8203 | 0.02 | 60 | 2500 | Gaussian | Mean |
| YM | 0.9446 | 0.9779 | 1.9225 | 0.015 | 60 | 2500 | Student | Mean |

#### Table 15.

Best parameters maximizing precision+recall rate for different futures and last bar price structure allowing lower bounds for θMIR and higher bounds for θVPIN.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | θVPIN |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9421 | 0.9541 | 1.8962 | 0.015 | 30 | 2500 | Student | 0.99 |
| EC | 0.9080 | 0.9644 | 1.8724 | 0.006 | 30 | 2500 | Gaussian | 0.99 |
| CL | 0.9297 | 0.9806 | 1.9103 | 0.016 | 30 | 2500 | Student | 0.99 |
| NQ | 0.9076 | 0.9217 | 1.8293 | 0.02 | 50 | 2500 | Student | 0.999 |
| YM | 0.9460 | 0.9696 | 1.9156 | 0.015 | 50 | 2500 | Gaussian | 0.99 |


An Assessment of the Prediction Quality of VPIN DOI: http://dx.doi.org/10.5772/intechopen.86532

Advanced Analytics and Artificial Intelligence Applications

#### Table 10.

Best parameters maximizing precision+recall rate for different futures and mean bar price structure allowing higher bounds for θVPIN.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | θVPIN |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9737 | 0.3786 | 1.3523 | 0.062 | 60 | 1600 | Gaussian | 0.99999 |
| EC | 0.9058 | 0.9691 | 1.8749 | 0.006 | 30 | 2500 | Student | 0.99 |
| CL | 0.9789 | 0.8653 | 1.8442 | 0.022 | 40 | 2500 | Student | 0.99 |
| NQ | 1 | 0.0036 | 1.0036 | 0.08 | 30 | 400 | Gaussian | 0.99 |
| YM | 1 | 0.1921 | 1.1921 | 0.055 | 30 | 2500 | Student | 0.99 |

#### Table 11.

Best parameters maximizing precision+recall rate for different futures and last bar price structure allowing higher bounds for θMIR.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) | Classifier | Bar price |
|---|---|---|---|---|---|---|---|---|
| ES | 0.9421 | 0.9541 | 1.8962 | 0.015 | 30 | 2500 | Student | Last |
| EC | 0.9080 | 0.9644 | 1.8724 | 0.006 | 30 | 2500 | Gaussian | Last |
| CL | 0.9297 | 0.9806 | 1.9103 | 0.016 | 30 | 2500 | Student | Last |
| NQ | 0.9179 | 0.9019 | 1.8198 | 0.015 | 30 | 2500 | Gaussian | Last |
| YM | 0.9460 | 0.9696 | 1.9156 | 0.015 | 50 | 2500 | Gaussian | Last |

• Compared to the "naive" algorithm, VPIN results are indeed better in the ES case. In the YM case we still find comparable results.

• On average, mean and median bar price structures have the best precision+recall rate.

To verify whether or not we can get at least better results than a naive algorithm in data sets with a real flash crash, we study in the following first the results allowing lower bounds on θMIR while θVPIN = 0.99 and second the results allowing lower bounds on θMIR and higher constraints on θVPIN. The intuition is that, in the NQ case, the "flash crash" amplitude constraints are far too high to yield a good precision rate, because too few events are then detected by the MIR algorithm.

3.2.5 Deep search allowing lower bounds for θMIR

We remark the following in Tables 11–14:

• Results have changed for every instrument except the ES one, which has kept the same local maximum as in the first deep search.

• Precision is far higher than before, while recall is still high. Therefore, overall precision+recall rates are "high."

• Optimal θMIR is around 0.015 for ES, CL, NQ, and YM financial instruments, whereas for EC the previous local maximum around 0.006 remains higher.

• On average, median bar price structure has the best precision+recall rate.
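The criterion used throughout these tables, the sum of precision and recall maximized over a parameter grid, can be sketched as follows. This is a minimal illustration under stated assumptions: the event counts and the three parameter triples (θMIR, n, ω) are hypothetical placeholders rather than values from the study, and the names `Outcome`, `score`, and `best_parameters` are introduced here for illustration only. Only the scoring rule itself follows the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Detection counts for one parameter set (hypothetical values)."""
    true_pos: int
    false_pos: int
    false_neg: int


def precision(o: Outcome) -> float:
    # Share of raised alarms that were correct.
    detected = o.true_pos + o.false_pos
    return o.true_pos / detected if detected else 0.0


def recall(o: Outcome) -> float:
    # Share of actual events that were caught.
    actual = o.true_pos + o.false_neg
    return o.true_pos / actual if actual else 0.0


def score(o: Outcome) -> float:
    # The precision+recall rate reported in the tables (maximum is 2).
    return precision(o) + recall(o)


def best_parameters(grid: dict[tuple, Outcome]) -> tuple:
    # Return the parameter triple with the highest precision+recall rate.
    return max(grid, key=lambda params: score(grid[params]))


# Hypothetical grid keyed by (theta_MIR, n, omega); counts are made up.
grid = {
    (0.015, 30, 2500): Outcome(true_pos=90, false_pos=5, false_neg=6),
    (0.020, 50, 2500): Outcome(true_pos=80, false_pos=2, false_neg=16),
    (0.062, 60, 1600): Outcome(true_pos=95, false_pos=150, false_neg=1),
}

best = best_parameters(grid)
print(best, round(score(grid[best]), 4))
```

In this made-up grid, the third parameter set has the highest recall but a low precision, so its combined rate falls well below the others. The same pattern appears in the ES, NQ, and YM rows of Table 10.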


In the following, we first compare these results with the case where we also allow higher bounds on θVPIN, to see whether there is a difference. Second, we benchmark both sets of results against those of a "naive" classifier.
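As a small illustration of the first comparison, the precision+recall rates reported for the last bar price structure in Table 11 and in Table 15 can be placed side by side. The numbers below are copied from those two tables. The helper name `changed_instruments` and the tolerance are assumptions made for this sketch.

```python
# Precision+recall rates for the last bar price structure,
# copied from Table 11 and Table 15.
table_11 = {"ES": 1.8962, "EC": 1.8724, "CL": 1.9103, "NQ": 1.8198, "YM": 1.9156}
table_15 = {"ES": 1.8962, "EC": 1.8724, "CL": 1.9103, "NQ": 1.8293, "YM": 1.9156}


def changed_instruments(before, after, tol=1e-4):
    """Return {future: after - before} for futures whose rate moved by more than tol."""
    return {
        future: round(after[future] - before[future], 4)
        for future in before
        if abs(after[future] - before[future]) > tol
    }


diff = changed_instruments(table_11, table_15)
print(diff)
```

With these values only NQ changes, by about 0.0095, which matches NQ being the only row of Table 15 whose θVPIN differs from 0.99.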

3.2.6 Deep search allowing lower bounds for θMIR and higher bounds for θVPIN

We remark in Tables 15–18 that, compared to the previous deep search:




• On average, median bar price structure has the best precision+recall rate.

#### Table 19.

Best parameters maximizing precision+recall rate for different futures for the naive classifier allowing lower bounds for θMIR.

| Futures | Recall | Precision | Precision+recall | θMIR | n | ω (buckets) |
|---|---|---|---|---|---|---|
| ES | 1 | 0.7483 | 1.7483 | 0.01 | 40 | 2500 |
| EC | 1 | 0.9999 | 1.9999 | 0.001 | 60 | 2500 |
| CL | 1 | 0.9995 | 1.9995 | 0.01 | 40 | 2500 |
| NQ | 1 | 0.8465 | 1.8465 | 0.01 | 30 | 2500 |
| YM | 1 | 0.6892 | 1.6892 | 0.01 | 40 | 2500 |

• Unsurprisingly, the naive classifier reaches its best local results at the lowest MIR bound of the deep search.

• It has better results than VPIN in the EC and CL cases, where the flash crash is not really effective.

We may partially conclude that:

• VPIN has an interesting predictive behavior on flash events of magnitude far lower (around 1.5%) than what would be considered a crash for specific, relatively liquid financial instruments such as NQ, YM, or ES.

• However, VPIN has poor results, comparable to those of a "naive" classifier (precision+recall rate below 1.2), on flash crash events for these financial instruments.

• For other instruments such as CL or EC, VPIN behaves worse than a "naive" classifier for these flash events. On flash events of higher amplitude (at least 1.5%), VPIN behaves better than a "naive" classifier for the CL instrument.

4. VPIN sensitivity to the starting point of a data set

In this section, we first present the problem of VPIN's sensitivity to the starting point of the bucketing process. Second, we present different calibrations to test its sensitivity. Third, we summarize our results.

4.1 The problem

Among the criticisms VPIN has received, one is important to assess precisely. Bondarenko and Andersen [7] pointed out in their work that VPIN is sensitive to the starting point of the bucketing process. More precisely, if one removes the first buckets of the data set, the results change. This observation is correct. We would like to know to what extent this effect can or cannot be mitigated. One idea is to test the different price bar structures, since a bar structure influences trade imbalance and thus the appearance of VPIN events.

4.1.1 Methodology

There are at least two interesting ways of analyzing the sensitivity to the starting point of a data set:
