Statistical / AI Models

The main transformation applied to Klines is the generation of Training Data [1], which is what the AI models are fed with.

Given a WINDOW - the number of periods to be considered - each Training Data point consists of the percentage variation between open and close (variation) and between high and low (variation_range) of the corresponding period and of the WINDOW previous ones.
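
As a rough illustration of the transformation described above, the following sketch builds Training Data rows from a list of Klines. The Kline field names, the exact formulas for variation and variation_range, and the ordering of the columns are assumptions made for the example, not the project's actual implementation.

```python
from typing import NamedTuple


class Kline(NamedTuple):
    """One candlestick; the field names are illustrative."""
    open: float
    high: float
    low: float
    close: float


def variation(k: Kline) -> float:
    # Percentage change from open to close (assumed formula).
    return (k.close - k.open) / k.open * 100


def variation_range(k: Kline) -> float:
    # Percentage spread between low and high (assumed formula).
    return (k.high - k.low) / k.low * 100


def training_data(klines: list[Kline], window: int) -> list[list[float]]:
    """Build one row per period that has `window` predecessors.

    Each row holds (variation, variation_range) for the period itself,
    followed by the same pair for each of the `window` previous periods,
    most recent first.
    """
    features = [(variation(k), variation_range(k)) for k in klines]
    rows = []
    for t in range(window, len(features)):
        row = []
        for lag in range(window + 1):
            row.extend(features[t - lag])
        rows.append(row)
    return rows


klines = [
    Kline(open=100.0, high=102.0, low=100.0, close=101.0),
    Kline(open=101.0, high=101.0, low=100.0, close=100.0),
    Kline(open=100.0, high=110.0, low=100.0, close=110.0),
]
rows = training_data(klines, window=1)
```

With three Klines and a WINDOW of 1, this yields two rows of four values each. In a regression setting, the first column of a row (the variation of the current period) would typically act as the target and the lagged columns as features.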

Prediction Model

A Symbol has a Prediction Model - a Statistical / AI model - associated with it.

The main goal of the Prediction Model is to provide a Model Score: a technical indicator of how “well” the model can predict or explain the already observed data from previous observations.

This score should be interpreted as an indicator of the “predictable behaviour” of the Symbol in the recent past, with “predictable” meant in the sense of “consistent”. Higher model scores mean the Symbol has followed a consistent behaviour: the variation of each period can be “reasonably well” explained by the variation in the previous periods. Lower scores mean “erratic” behaviour: the variation of a certain period does not adequately predict the following one.

It may also be associated with “smoothness”: symbols with higher model scores tend to be “smoother” in their evolution.

Comparing the K-lines chart of the Symbol with the highest Model Score to the one with the lowest and one in the middle may help to picture what this indicator refers to.
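
The intuition can also be pictured numerically. The sketch below uses a deliberately simplified stand-in for the Model Score: the R^2 of a one-lag linear fit on synthetic series. The scoring function and the data are assumptions for illustration only; the actual score depends on the configured Prediction Model.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def toy_model_score(variations):
    """R^2 of a one-lag linear fit: how much of each variation is
    explained by the previous one. For simple linear regression,
    R^2 equals the squared correlation."""
    previous, current = variations[:-1], variations[1:]
    return pearson(previous, current) ** 2


# A smooth, consistent series versus an erratic one (both synthetic).
smooth = [math.sin(t / 10) for t in range(200)]
rng = random.Random(0)
erratic = [rng.uniform(-1, 1) for _ in range(200)]

smooth_score = toy_model_score(smooth)
erratic_score = toy_model_score(erratic)
```

The smooth series scores close to 1 because each value is almost fully determined by the previous one, while the erratic series scores close to 0.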

The Model Score tends to decrease as more history becomes available for training the model. This is likely because markets behave differently under different circumstances, and a longer history takes more external shocks into account. The amount of history used for training can be controlled with the CLEANING_WINDOW setting, while the --periods argument of the warm_and_ready command may be used to provide that history at start: each period of the command corresponds to one thousand TIME_INTERVALs and, from recent experience, one period is usually good for the current market situation.

Given the abundance of Symbols and the cost of tracking them, a Symbol with a higher Model Score should be preferred to one with a lower score: if the conditions of the market do not change (e.g. through an external shock such as big bad news or a big troubled whale in need of cashing out), it is more likely that the market behaviour will stay the same in the short term, and hence a trading decision can be made with more confidence (“A Seguro se lo llevaron preso” [4], roughly “even Mr. Sure got taken to jail”).

Because of this, Symbols are sorted by the model score, both internally and in the user interface, and Symbols scoring below the MODEL_SCORE_THRESHOLD setting are not tracked.
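
A minimal sketch of this sorting and filtering, with hypothetical symbol names, scores and threshold value (none of them taken from the project):

```python
MODEL_SCORE_THRESHOLD = 0.3  # illustrative value, not a recommended default

# Hypothetical symbols and scores, for illustration only.
model_scores = {"AAABTC": 0.62, "BBBBTC": 0.12, "CCCBTC": 0.45}

# Keep the symbols at or above the threshold, highest score first.
tracked = sorted(
    (
        (symbol, score)
        for symbol, score in model_scores.items()
        if score >= MODEL_SCORE_THRESHOLD
    ),
    key=lambda item: item[1],
    reverse=True,
)
```

Here BBBBTC falls under the threshold and is dropped, and the remaining symbols are ordered from the most to the least consistent.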

Besides the model score, the Prediction Model provides a prediction for the next time interval (enabled with the PREDICTION_ENABLED setting) and for STP (Short Term Prediction).

Note that no confidence intervals are provided for the predictions. They are shown to give a measure of intention (up or down) and of its intensity, e.g. an AC of 10% should indicate that the price is highly likely to go up in the short term.

Implementation details

The Training Data transformation allows the use of Regression Models instead of other time series models like SARIMAs.

The Prediction Model should follow the scikit-learn API and be integrated through django-ai. Once this is done, subclass it from PredictionModel and the corresponding django-ai class [2], and set it in PREDICTION_MODEL.
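
For reference, "following the scikit-learn API" essentially means exposing fit, predict and score methods, plus get_params and set_params. The toy regressor below is a hypothetical, dependency-free sketch of that method surface only. In practice one would more likely inherit from scikit-learn's BaseEstimator and RegressorMixin, and the django-ai integration and the PredictionModel subclassing are not shown here.

```python
class MeanOfLagsRegressor:
    """Toy regressor exposing the methods that scikit-learn
    regressors share. Hypothetical, for illustration only."""

    def __init__(self, scale=1.0):
        # Constructor arguments are stored unchanged, as scikit-learn expects.
        self.scale = scale

    def get_params(self, deep=True):
        return {"scale": self.scale}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # Nothing is learned from y here; only the input width is recorded.
        self.n_features_in_ = len(X[0])
        return self

    def predict(self, X):
        # Predict the (scaled) mean of the lagged values in each row.
        return [self.scale * sum(row) / len(row) for row in X]

    def score(self, X, y):
        # Coefficient of determination (R^2), as scikit-learn regressors do.
        predictions = self.predict(X)
        mean_y = sum(y) / len(y)
        ss_res = sum((t - p) ** 2 for t, p in zip(y, predictions))
        ss_tot = sum((t - mean_y) ** 2 for t in y)
        return 1 - ss_res / ss_tot


X = [[1.0, 3.0], [2.0, 4.0], [0.0, 2.0]]
y = [2.0, 3.0, 1.0]
model = MeanOfLagsRegressor().fit(X, y)
predictions = model.predict(X)
score = model.score(X, y)
```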

For the available Regression Models and more information on the default, see here.

Outliers Model

“Outlier” is a technical term for “atypical”: an observation that is “different” from the “majority” of its class.

If a Symbol has varied between 0.1 and 3 percent in the past intervals and a variation of 10 percent is then seen, that variation is considered to be “atypical”, or an outlier.

Outliers may provide trading opportunities, e.g. a “positive wave” may be starting.

The Outliers Detection Model is used to detect whether the last 3 time intervals are outliers - considered separately, O1 is the last one, O2 the last two, and O3 the last three. Detected outliers are indicated as blue pills in the upper right corner of the Symbol snippet in the user interface.

The functionality can be enabled with the OUTLIERS_ENABLED setting, the proportion of observations to be considered as outliers can be controlled with the OUTLIERS_THRESHOLD setting, and the model can be swapped with the OUTLIERS_MODEL setting.
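
To make the role of the threshold concrete, here is a minimal sketch that swaps the actual Outliers Detection Model for a simple quantile rule: a variation is flagged when its magnitude exceeds a cutoff chosen so that roughly the given proportion of past observations lies above it. The function names and the data are hypothetical, and the sketch assumes O1, O2 and O3 refer to the most recent interval, the one before it, and the one before that.

```python
def outlier_cutoff(history, threshold):
    """Magnitude above which a variation counts as an outlier, chosen so
    that about `threshold` of the values in `history` lie above it."""
    magnitudes = sorted(abs(v) for v in history)
    n_outliers = round(len(magnitudes) * threshold)
    index = max(len(magnitudes) - n_outliers - 1, 0)
    return magnitudes[index]


def last_three_flags(history, recent, threshold):
    """Flag each of the last three variations in `recent` (oldest first)."""
    cutoff = outlier_cutoff(history, threshold)
    o3, o2, o1 = (abs(v) > cutoff for v in recent[-3:])
    return {"O1": o1, "O2": o2, "O3": o3}


# Past variations between 0.1 and 3 percent, then a 10 percent move.
history = [0.1, 0.5, 1.2, 3.0, 2.1, 0.8, 1.5, 0.3, 2.7, 1.9]
recent = [1.0, 0.4, 10.0]
flags = last_three_flags(history, recent, threshold=0.1)
```

With these values only the latest interval, the 10 percent variation, is flagged. A larger threshold lowers the cutoff and flags more observations.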

Implementation details

The Outlier Detection Model should follow the scikit-learn API and be integrated through django-ai. Once this is done, subclass it from OutlierDetectionModel and the corresponding django-ai class [3], and set it in OUTLIERS_MODEL.

For more comprehensive information on the subject within this context, see this guide.

References