Financial market prediction system with Evolino neural network and Delphi method.
Maknickiene, Nijole ; Maknickas, Algirdas
1. Introduction
Artificial intelligence methods have become very important in
making financial market predictions. The following elements are of major
importance: the selection of the input data, the selection of the
forecasting tool, and the correct use of the output data. As investors
are searching for profitable growth, they require the development of a
stable and reliable forecasting model.
Kimoto et al. (1990) proposed a stock market prediction system with
modular neural networks; Wang and Leu (1996) used ARIMA-based neural
networks. The application of neural networks to stock market prediction
was presented by Kulkarni (1996). The accuracy of the prediction
depended on neural networks and the input selection.
Stock prices have also been forecasted using evolutionary systems.
Kim and Han (2000) proposed a new hybrid of a genetic algorithm with
artificial neural networks. The genetic algorithm not only searches for
the optimal or near-optimal solutions of the connection weights in the
learning algorithm, but also looks for the optimal or near-optimal
thresholds of the feature discretization. Hussan et al. (2007) proposed
and implemented a fusion model by combining the hidden Markov model,
artificial neural networks (NN), and genetic algorithms to forecast the
behaviour of financial markets. The weighted average of the predictions
was used to forecast stock prices and increase the accuracy of the
model. Choudhry and Garg (2008) proposed a hybrid machine learning
system based on genetic algorithms and support vector machines for stock
market prediction, using the correlation between the stock prices of
different companies.
Prediction systems based on neuro-fuzzy sets have also been used to
predict financial markets. Ang and Quek (2006) proposed a model, which
synergizes the price difference forecast method with a forecast
bottleneck-free trading decision model. Chiang and Liu (2008) developed
a fuzzy rule based system, where the clustering technique and the
simplified and wavelet (Chang, Fan 2008) fuzzy rule based systems were
integrated for forecasting. The system proposed by Agrawal et al. (2010)
used an adaptive neuro-fuzzy inference system for making decisions based
on the values of some technical indicators. Among the various technical
indicators available, the system used the weighted moving averages,
divergence, and RSI (relative strength index). In the paper (Quek et al.
2011), a novel stock trading framework based on a neuro-fuzzy
associative memory architecture was proposed. The architecture
incorporated the approximate analogical reasoning schema to resolve the
problem of discontinuous responses and inefficient memory utilization
with uniform quantization in the associative memory structure.
Suppose it is known that p is an element of some set of
distributions P. Choose a fixed weight $\omega_q$ for each q in P
such that the $\omega_q$ add up to 1 (for simplicity, suppose P is
countable). Then construct the Bayesmix $M(x) = \sum_q \omega_q q(x)$,
and predict using M instead of the optimal but unknown p.
How wrong could this be? The work of Hutter provides general and
sharp loss bounds (Hutter 2001): let $L_M(n)$ and $L_p(n)$ be the total
expected unit losses of the M-predictor and the p-predictor,
respectively, for the first n events. Then $L_M(n) - L_p(n)$ is at most of the
order of $\sqrt{L_p(n)}$. That is, M is not much worse than p. And
in general, no other predictor can do better than that. In particular,
if p is deterministic, then the M-predictor will not make any more
errors. If P contains all recursively computable distributions, then M
becomes the celebrated enumerable universal prior. The aim of this paper
is to construct a model that could make predictions with a small enough
difference M(t)-p(t) for some fixed time t.
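As a minimal illustration of the Bayes mixture described above, the following sketch assumes a toy set P of three Bernoulli distributions over a binary event with equal fixed weights. The names `bayes_mix`, `bernoulli`, and `candidates` are illustrative assumptions and do not come from the cited work.

```python
def bayes_mix(weights, models, x):
    """Return M(x) = sum_q w_q * q(x) for fixed weights that sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    return sum(w * q(x) for w, q in zip(weights, models))


def bernoulli(theta):
    """Hypothetical candidate distribution q over a single binary event."""
    return lambda x: theta if x == 1 else 1.0 - theta


# Assumed toy set P: three Bernoulli models with equal fixed weights.
candidates = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]
weights = [1 / 3, 1 / 3, 1 / 3]

m_one = bayes_mix(weights, candidates, 1)   # mixture probability of x = 1
m_zero = bayes_mix(weights, candidates, 0)  # mixture probability of x = 0
```

Because each candidate is a proper distribution and the weights sum to 1, the mixture is itself a proper distribution over the two outcomes.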
Schmidhuber et al. (2005) introduced a general framework of
sequence learning algorithms, EVOlution of recurrent systems with LINear
outputs (Evolino). Evolino uses evolution to discover good recurrent
neural network hidden node weights, while using methods such as linear
regression or quadratic programming to compute the optimal linear
mappings from the hidden state to the output. In some cases, quadratic
programming is used to maximize the margin. Evolino-based Long
Short-Term Memory (LSTM) can solve tasks that Echo State nets cannot.
The block diagram of an LSTM recurrent neural network is shown in
Figure 1.
The Evolino recurrent neural network forms an LSTM network with N =
4n neurons, where N is the total number of neurons and n is the
number of memory cells. The genetic evolution algorithm is applied to
each quartet of memory cells separately. The cell has an internal state
S together with a forget gate (GF) that determines how much the state is
attenuated at each time step. The input gate (GI) controls access to the
cell by the external inputs that are summed into the unit, and the
output gate (GO) controls when and how much the cell fires. The dark
nodes represent the multiplication function and the linear regression
Moore-Penrose pseudoinverse method is used to compute the output (light
blue circle). A detailed description of the Evolino RNN algorithm can be
found in (Schmidhuber et al. 2005, 2006).
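The linear output mapping mentioned above can be sketched with NumPy's Moore-Penrose pseudoinverse. This is only an illustration of the least-squares output step, not the Evolino implementation: the hidden-state matrix is synthetic, and the function name `linear_output_weights` and the array shapes are assumptions made for the example.

```python
import numpy as np


def linear_output_weights(hidden_states, targets):
    """Least-squares weights W minimising ||hidden_states @ W - targets||.

    hidden_states: array of shape (T, H), one row of cell outputs per time step.
    targets:       array of shape (T, K), desired outputs per time step.
    """
    return np.linalg.pinv(hidden_states) @ targets


# Synthetic stand-in for recorded hidden states (assumption, not real data).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(50, 4))
true_w = np.array([[0.5], [-1.0], [2.0], [0.1]])
targets = hidden @ true_w

w = linear_output_weights(hidden, targets)  # recovers true_w up to rounding
```

In Evolino the hidden weights are evolved, so a step like this would be repeated for every candidate network to score it on the training sequence.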
Input selection is always important for adapting artificial
intelligence systems to forecasting. In the paper by Maknickas and
Maknickiene (2012), a statistical study of the orthogonality of
inputs was carried out. The tools of financial prediction were found by
searching for dependencies between the time series of various financial
indicators or for series that have been exploited.
[FIGURE 1 OMITTED]
Suppose
$\left| \sum_n f(n) g(n) \right| = \varepsilon$, (1)
where $\varepsilon$, the absolute value of the scalar product of the
vectors f and g, describes the degree of orthogonality (true
orthogonality cannot be reached for the time series of financial
markets). The prediction of one time series output was obtained from
two nearly orthogonal time series inputs.
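Equation (1) can be evaluated directly. The sketch below assumes the two series are used as given, since the text does not say whether they are centred or normalised first; the helper `most_orthogonal` and the toy series are hypothetical.

```python
def orthogonality(f, g):
    """Return epsilon = |sum_n f(n) * g(n)| for two equal-length series."""
    if len(f) != len(g):
        raise ValueError("series must have the same length")
    return abs(sum(a * b for a, b in zip(f, g)))


def most_orthogonal(target, candidates, keep):
    """Rank candidate series by epsilon against target; keep the lowest ones."""
    ranked = sorted(candidates, key=lambda name: orthogonality(target, candidates[name]))
    return ranked[:keep]


# Toy series (assumption): a smaller epsilon means closer to orthogonal.
target = [1.0, -1.0, 1.0, -1.0]
candidates = {
    "a": [1.0, 1.0, 1.0, 1.0],    # epsilon = 0
    "b": [1.0, -1.0, 1.0, -1.0],  # epsilon = 4
    "c": [2.0, 0.0, 0.0, 1.0],    # epsilon = 1
}
best = most_orthogonal(target, candidates, keep=2)
```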
The output of one part of an AI system can be the input for another
part. Some outputs can describe the distribution of predictions and can
be analysed. We chose the method applied to expert panel opinions.
The Delphi method was examined in detail by Dalkey (1969). Its use
of a panel's opinion has many different modifications. One of them,
the fuzzy Delphi, was used for sales forecasting by Chang and Wang
(2006) and integrated with artificial NN for stock market forecasting by
Kuo et al. (1996).
Trading in a foreign exchange market is not only related to
profitability but also to the investment risk. Therefore, in order to
diversify the risk, it is necessary to create an investment portfolio.
The portfolio optimization problem and optimal portfolio selection
methods are considered in (Rutkauskas 2005; Rutkauskas, Stasytyte 2008;
Rutkauskas et al. 2009a, 2009b).
The aim of our article is to present the Evolino recurrent neural
network prediction model, coordinated and tailored to trade profitably
in the currency market, taking investment risk into account. This
article focuses more on the model and its reliability and accuracy; it
also aims to show the wide scope of investment opportunities. A good
forecasting tool allows more focus on the research and justification
of economic processes.
2. Forecasting tools
2.1. Delphi method
The Delphi method is based on the assumption that group judgements
are more valid than individual judgements. Our observations on the
Evolino recurrent neural network prediction (Rutkauskas et al. 2010)
made it clear that some of the predictions are very accurate, while some
others are contradictory, unstable, and must be rejected. The Delphi
method makes it possible to achieve a certain consensus or clustering of
forecasts. The steps of the classical Delphi method are:
1) The experts in the group receive a questionnaire, express
their prognoses as numeric values, argue their assessments, and
complete the questionnaire.
2) The answers are arranged in ascending order and the median Me
and quartiles $Q_1$, $Q_3$ are calculated. After determining the
upper and lower quartiles, the range between them, from $Q_1$
through Me to $Q_3$, is considered the most desirable interval.
The compatibility of the predictions is calculated to determine whether
there is a consensus among the experts. The experts are then familiarised
with the results and the arguments, and the prognoses are made again.
3) The second step is then repeated. Theoretically, the Delphi
process can be continuously iterated until a consensus is reached. In
practice, the number of iterations is limited by the time available for
decision making.
In our method, the experts are not people and cannot argue their
assessments, but several Evolino recurrent neural networks can make
predictions and the compatibility of their predictions can be calculated.
2.2. Compatibility of neural net predictions
The expert group evaluation of the performance can be considered
sufficiently reliable only if the expert evaluation possesses good
compatibility of the responses. Therefore, it is necessary to assess the
compatibility of the expert assessments and calculate the interquartile
coefficient. The variation of the responses is taken to be the
difference between the third and first quartiles, $Q_3 - Q_1$. The
interquartile coefficient is the quotient of this variation and
the median:
$q = \frac{Q_3 - Q_1}{Me}$. (2)
The interquartile coefficient ranges from 0 to +1 and is close to
zero when the distribution has very little variation. In neural network
assessment, as in human-expert evaluations, one expert's estimates
may be better than that of the group as a whole.
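Equation (2) can be computed as in the following sketch. It assumes NumPy's default linear-interpolation quartiles, which is one of several common quartile conventions and is not specified in the text; the sample predictions are invented for illustration.

```python
import numpy as np


def interquartile_coefficient(predictions):
    """Return q = (Q3 - Q1) / Me for a collection of numeric predictions."""
    q1, me, q3 = np.percentile(predictions, [25, 50, 75])
    if me == 0:
        raise ValueError("median is zero; the coefficient is undefined")
    return float((q3 - q1) / me)


# Eight hypothetical exchange-rate predictions for the same future point.
predictions = [1.30, 1.31, 1.31, 1.32, 1.32, 1.32, 1.33, 1.34]
q = interquartile_coefficient(predictions)
```

A value of q close to zero indicates that the networks largely agree; identical predictions give exactly zero.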
2.3. Reliability of forecasting
Time series methods use historical data as their basis. Suppose
that, for t = 1, ..., N, y(t) is the actual value at time t, $\hat{y}(t)$ is
the forecast value at time t, and $\varepsilon(t)$ is the forecast
error at time t, where $\varepsilon(t) = y(t) - \hat{y}(t)$. To test
the reliability of the model, we calculate Pearson's
correlation coefficient r between y(t) and $\hat{y}(t)$:
$r = \frac{\sum_{t=1}^{N} (y(t) - \bar{y})(\hat{y}(t) - \bar{\hat{y}})}{\sqrt{\sum_{t=1}^{N} (y(t) - \bar{y})^2} \sqrt{\sum_{t=1}^{N} (\hat{y}(t) - \bar{\hat{y}})^2}}$, (3)
where $\bar{y}$ and $\bar{\hat{y}}$ are the mean values of y(t) and
$\hat{y}(t)$ over the N points.
This coefficient helps to verify the accuracy and reliability
of the model.
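A direct transcription of equation (3) into Python might look as follows. The function name `pearson_r` and the toy series are assumptions; the zero-variance check is an addition, because the coefficient is undefined for a constant series.

```python
import math


def pearson_r(actual, forecast):
    """Pearson correlation between actual values y(t) and forecasts."""
    n = len(actual)
    if n != len(forecast) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_y = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((y - mean_y) * (f - mean_f) for y, f in zip(actual, forecast))
    var_y = sum((y - mean_y) ** 2 for y in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    if var_y == 0 or var_f == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_y * var_f)


r_up = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])    # forecast moves with actual
r_down = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])  # forecast moves against actual
```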
2.4. Profitability and risk of financial forecasting
We assume that the individual investor, having a certain initial
capital, wants to invest in the instruments available for a particular
period T. Investing the initial capital of $V_0$ in a set of
instruments, the investor shapes an investment portfolio p. Suppose that
at the end of the selected period, the investor will realize the entire
investment portfolio, in other words, liquidate it into cash, the amount
of which will be denoted by $V_1$. The purchase and liquidation will
be at market prices, which depend on the interaction of the supply and
demand in the market. Naturally, the investor wants to maximize the
final value of profitability of the invested funds (Rutkauskas 2006)
during the period T, which is determined by the following formula:
$r = \frac{V_1 - V_0 + D}{V_0}$, (4)
where D is the current income of the portfolio for the period T.
There are many options for choosing the investment portfolio, but we
will explore only three: 1) conservative--even distribution of risk; 2)
moderate--the objective of the optimal profit with reasonable risk; 3)
aggressive--the objective of maximizing profits with low risk. The most
referenced risk/return measures used in finance are the standard
deviation and the Sharpe ratio:
$S(x) = \frac{r_x - r_f}{\sigma_x}$, (5)
where x is the investment, $r_x$ is the average rate of return
of x, $r_f$ is the best available rate of return of a risk-free
security, and $\sigma_x$ is the standard deviation of $r_x$.
The risk levels of the strategies should be proportional to their
Sharpe ratios. Strategies with zero predicted Sharpe ratios should be
ignored. Those with positive ratios should be 'held long', and
those with negative ratios 'held short'. If strategy X has a
positive Sharpe ratio that is twice as large as that of strategy Y,
twice as much risk should be taken with X as with Y. The overall
scale of all the positions should, in turn, be proportional to
the investor's risk tolerance (Sharpe 1994).
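Equations (4) and (5) can be sketched as two small functions. The population standard deviation is used here as an assumption, since the text does not state whether the population or sample form was applied, and the risk-free rate must be expressed for the same period as the returns. The function names and numbers are illustrative only.

```python
import statistics


def period_return(v0, v1, income=0.0):
    """Equation (4): r = (V1 - V0 + D) / V0."""
    if v0 == 0:
        raise ValueError("initial capital must be non-zero")
    return (v1 - v0 + income) / v0


def sharpe_ratio(returns, risk_free):
    """Equation (5): (mean return - risk-free rate) / standard deviation."""
    sigma = statistics.pstdev(returns)  # population form (assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; ratio is undefined")
    return (statistics.mean(returns) - risk_free) / sigma


r_example = period_return(1000.0, 1050.0, income=10.0)
s_example = sharpe_ratio([0.01, 0.03], risk_free=0.0)
```

With these invented inputs the period return is 6% and the Sharpe ratio is 2, which can be checked by hand from the two formulas.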
3. Basic architecture and simulation of the prediction algorithm
During the initial model testing stage, we paid close
attention to the prediction of various market parameters. We studied stock
prices, indexes, and various resource prices, and then chose a narrow area of
research, namely, exchange rate forecasting.
The accuracy of prediction was investigated with a Python program
using the following steps:
Data step. Historical data on financial markets were obtained from Meta
Trader-Alpari. For prediction, we chose the EUR/USD (Euro and American
Dollar), EUR/JPY (Euro and Japanese Yen), USD/JPY (American Dollar and
Japanese Yen), and EUR/CHF (Euro and Swiss Franc) exchange rates and their
historical data for the first input, and for the second input, two years
of historical data for XAUUSD (gold prices in American dollars), XAGUSD
(silver prices in American dollars), QM (oil prices in American
dollars), and QG (gas prices in American dollars). At the end of this
step, we had a base of historical data.
Input step. The Python script calculated the orthogonality between
the last 80-140 points of the exchange rate historical
data chosen for prediction and corresponding intervals from the two years
of historical data of XAUUSD, XAGUSD, QM, and QG. A value closer to zero
indicates higher orthogonality of the input base pairs. Eight pairs of
data intervals with the best orthogonality were used for the inputs to
the Evolino recurrent neural network.
Prediction step. Eight Evolino recurrent neural networks made
predictions for a selected point in the future. At the end of this step,
we had eight different predictions for one point of time in the future.
Consensus step. The resulting eight predictions were arranged in
ascending order, and then the median, quartiles, and compatibility
were calculated. If the compatibility was within the range [0; 0.024],
the prediction was accepted. If not, the prediction step was repeated,
sometimes with another 'teacher' if the orthogonality was similar. At
the end of this step, we had one most probable prediction for the chosen
exchange rate.
Investment portfolio step. Repeating steps 1-4 for the other
exchange rates yielded a set of exchange rate forecasts from which
an investment portfolio was built. The first portfolio was made from the
four exchange rates (EUR/USD, EUR/JPY, USD/JPY, EUR/CHF), and the
investment amount was divided equally at every investment step.
The second portfolio was made from the same four exchange rates but the
amount invested was divided according to the projected percentage gain. The third
choice of investment portfolio consisted of the exchange rate whose
projected growth rate was the highest. The basic architecture of the
prediction algorithm is shown in Figure 2.
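The consensus step described above can be sketched as a loop over a panel of predictors. In this illustration the predictors are hypothetical zero-argument callables standing in for trained Evolino networks, `max_rounds` is an assumed limit on repetitions, and returning the median as the consensus forecast is also an assumption; only the acceptance range [0; 0.024] comes from the text.

```python
import numpy as np


def consensus_forecast(predictors, threshold=0.024, max_rounds=5):
    """Query all predictors until they agree; return (median, q) or None."""
    for _ in range(max_rounds):
        predictions = np.sort([predict() for predict in predictors])
        q1, me, q3 = np.percentile(predictions, [25, 50, 75])
        if me != 0:
            q = (q3 - q1) / me  # interquartile coefficient, equation (2)
            if 0.0 <= q <= threshold:
                return float(me), float(q)
        # Otherwise the prediction step is repeated (networks re-queried).
    return None


# Deterministic stand-ins for eight networks (assumption, not real forecasts).
agreeing = [lambda v=v: v for v in (1.30, 1.31, 1.31, 1.32, 1.32, 1.32, 1.33, 1.34)]
scattered = [lambda v=v: v for v in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)]

accepted = consensus_forecast(agreeing)
rejected = consensus_forecast(scattered, max_rounds=2)
```

With real networks each repetition would involve retraining or re-sampling, so the predictions would differ from round to round, unlike these fixed stand-ins.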
With the aim of verifying the accuracy and reliability of the model
predictions, a statistical analysis of the model was carried out. The
validation of the prediction was measured by the correlation between the
predicted values and the real values in the future. Different exchange
rates in the first input and different historical data in the second
input guaranteed a random selection of data inputs for the model. Figure
3 shows the distribution of 200 Pearson's correlation coefficients.
All tests were made in the time period from 12/2011 to 06/2012. The
choice of the period was not specifically planned.
A correlation coefficient equal to 1 was obtained in 30%
of all tests, which indicates an excellent prediction. The correlation
coefficient fell within the range [0.6; 1] in 68% of all tests, which
indicates a very good prediction, and within the range [0; 1] in 77% of
all tests, which indicates that the model is a very good predictor of the
direction of change, which is also very important for investors. Negative
correlation coefficients were obtained in only 23% of the tests.
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
Prediction simulation
Having several different exchange rate forecasts allows an investor
to choose different investment portfolios and reduce the investment
risk, thus increasing its reliability. Three investment portfolios have
been tested:
Conservative. The first portfolio was made from four exchange rates
(EUR/USD, EUR/JPY, USD/JPY, EUR/CHF) with the investment amount divided
equally at every step of investing (3 days in our research). The
investor, having four predictions from the model, can choose one of
three operations: buying--if the exchange rate increases,
selling--if the exchange rate decreases, and keeping--if the prediction
is in some doubt, such as a very high variation. Every operation with
exchange rates incurs a cost equal to 0.02 of the operation.
Moderate. The second portfolio was made from the same four exchange
rates. The investor, having four predictions from the model, in order to
maximize profits, divided the initial investment amount according to the
projected percentage gain.
Aggressive. The third portfolio was made from the same exchange
rates but the entire amount was invested in only one exchange rate,
the one with the greatest predicted profit. The percentage profit of
two variants of this portfolio in the time period
12/2011-06/2012 is compared in Figure 4.
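The three allocation rules can be sketched as weight functions over the projected gains. This is an interpretation rather than the authors' code: "divided by the projected percentage gain" is read here as weights proportional to the absolute projected gain (since the investor may buy or sell), the projected values are invented, and transaction costs are ignored.

```python
def conservative(gains):
    """Equal share of the investment amount for every exchange rate."""
    n = len(gains)
    return {pair: 1.0 / n for pair in gains}


def moderate(gains):
    """Shares proportional to the absolute projected gain (assumption)."""
    total = sum(abs(g) for g in gains.values())
    if total == 0:
        return conservative(gains)
    return {pair: abs(g) / total for pair, g in gains.items()}


def aggressive(gains):
    """Entire amount in the exchange rate with the largest projected gain."""
    best = max(gains, key=lambda pair: abs(gains[pair]))
    return {pair: 1.0 if pair == best else 0.0 for pair in gains}


# Hypothetical projected relative changes for one investment step.
projected = {"EUR/USD": 0.010, "EUR/JPY": 0.020, "USD/JPY": -0.005, "EUR/CHF": 0.015}

w_cons = conservative(projected)
w_mod = moderate(projected)
w_aggr = aggressive(projected)
```

The sign of each projected change would decide between buying and selling; the weights only determine how much of the capital goes to each exchange rate.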
With credible model forecasts, building investment
portfolios by the projected gain increases the profitability of the
investment from 12-15% to 20-25% and 27-35% in 40 trading days, with
different degrees of risk. The standard deviation describes the
portfolio risk, and the Sharpe ratio indicates the expected differential
return per unit of risk associated with the differential return. The
risk-free rate was set at 3% per year. The ratios of the different
portfolios are compared in Table 1. The first column gives the names of
the pairs of different portfolios and of two single-currency prediction
tests without a portfolio for comparison. The second column gives the
standard deviation averages for the entire period. The third column gives
the Sharpe ratio averages for the entire period.
With three different levels of riskiness selected for the
investment portfolios, all portfolios had a good positive Sharpe ratio
and the aggressive portfolio had a very good Sharpe ratio (greater than
1). Using only one exchange rate (EUR/USD or USD/JPY) resulted in a
standard deviation of 0.61-0.62 and a Sharpe ratio of 0.73. Investments
based on the predictions of the recurrent neural network
'team' become more reliable and more profitable. The increased
reliability of the model provides a decision maker with broad investment
opportunities and freedom of choice.
[FIGURE 4 OMITTED]
4. Conclusions
The model developed, based on the Evolino recurrent neural network
and on expert methods, is simple to use and is a good tool for an
investor. The reliability of this model, measured by the correlation
coefficient, is high enough for profitable trading in the financial
market. The credibility of the model's forecast increases the
profitability of the investment.
The model allows an investor to make different investment
portfolios, based on the choice of different investment strategies with
different levels of risk.
This model has a great potential for various investment portfolios
and investment strategy choices. It can be easily adapted to trading of
other financial indicators or stocks.
Caption: Fig. 1. LSTM network
Caption: Fig. 2. Scheme of the model
Caption: Fig. 3. Statistical analysis of the model's accuracy
(200 tests)
Caption: Fig. 4. The percentage of profit growth in two different
variants
doi: 10.3846/16111699.2012.729532
References
Agrawal, S.; Jindal, M.; Pillai, G. 2010. Momentum analysis based
stock market prediction using adaptive neuro fuzzy inference system, in
Proceedings of the International MultiConference of Engineers and
Computer Scientists, vol. 1, March 17-19, 2010, Hong Kong. Newswood
Limited, 526-531.
Ang, K.; Quek, C. 2006. Stock trading using RSPOP: a novel rough
set-based neuro-fuzzy approach, IEEE Trans. Neural Netw. 17(5):
1301-1316. http://dx.doi.org/10.1109/TNN.2006.875996
Chang, P.-C.; Fan, C.-Y. 2008. A hybrid system integrating a
wavelet and tsk fuzzy rules for stock price forecasting, IEEE
Transactions on Systems, Man, and Cybernetics Part C: Applications and
Reviews 38(6): 802-815. http://dx.doi.org/10.1109/TSMCC.2008.2001694
Chang, P.-C.; Wang, Y.-W. 2006. Fuzzy delphi and back-propagation
model for sales forecasting in pcb industry, Expert Systems with
Applications 30(4): 715-726.
http://dx.doi.org/10.1016/j.eswa.2005.07.031
Chiang, P.; Liu, C. 2008. A tsk type fuzzy rule based system for
stock market prediction, Expert Systems with Applications 34(1):
135-144. http://dx.doi.org/10.1016/j.eswa.2006.08.020
Choudhry, R.; Garg, K. 2008. A hybrid machine learning system for
stock market forecasting, World Academy of Science, Engineering and
Technology 39: 315-318.
Dalkey, N. C. 1969. The delphi method, Tech. Rep. RM-5888-PR. RAND
Corporation.
Hussan, M.; Nath, B.; Kirley, M. 2007. Fusion model hmm, ann and
ga for stock market forecasting, Expert Systems with Applications 33:
171-180. http://dx.doi.org/10.1016/j.eswa.2006.04.007
Hutter, M. 2001. General loss bounds for universal sequence
prediction, in ICML 2001, Ed. by Morgan Kaufmann. June 28-July 1, 2001,
Williams College, Williamstown, MA, USA. ACE 210-217. ISBN
1-55860-778-1.
Kim, K.; Han, I. 2000. Genetic algorithms approach to feature
discretization in artificial neural networks for the prediction of stock
price index, Expert Systems with Applications 19(2): 125-132.
http://dx.doi.org/10.1016/S0957-4174(00)00027-0
Kimoto, T.; Asakawa, K.; Yoda, M.; Takeoka, M. 1990. Stock market
prediction system with modular neural networks, in International Joint
Conference on Neural Networks, vol. 1, June 17-21, 1990, San Diego, CA,
USA. IEEE Press, 1-6.
Kulkarni, A. S. 1996. Application of neural networks to stock
market prediction, Tech. rep. [online], [cited 12 June 2012]. Available
from Internet: www.machine-learning.martinsewell.com
Kuo, R.; Lee, L.; Lee, C. 1996. Integration of artificial neural
networks and fuzzy delphi for stock market forecasting, in IEEE
International Conference on Systems, Man, and Cybernetics, vol. 2. 14-17
Oct., 1996, Beijing, China. IEEE Press, 1073-1078.
http://dx.doi.org/10.1109/ICSMC.1996.571232
Maknickas, A.; Maknickiene, N. 2012. Influence of data
orthogonality to accuracy and stability of financial market predictions,
in IJCCI 2012: 4th International Joint Conference on Computational
Intelligence, Barcelona, Spain, 5-7 October, 2012. Setubal: INSTICC,
616-619.
Quek, C.; Guo, Z.; Maskell, D. L. 2011. A novel fuzzy associative
memory architecture for stock market prediction and trading,
International Journal of Fuzzy System Applications 1(1): 61-78.
http://dx.doi.org/10.4018/ijfsa.2011010105
Rutkauskas, A. 2005. Portfelio sprendimai valiutu kursu ir kapitalo
rinkose, Business: Theory and Practice 6(2): 107-116.
Rutkauskas, A. 2006. Adekvaciojo investavimo portfelio anatomija ir
sprendimai panaudojant imitacines technologijas, Ekonomika 75: 52-76.
Rutkauskas, A.; Maknickiene, N.; Maknickas, A. 2010. Approximation
of dji, nasdaq and gold time series with evolino neural networks, in The
6th International Scientific Conference Business and Management 2010,
May 13-14, 2010, Vilnius, Lithuania. Vilnius: Technika, 170-175.
Rutkauskas, A.; Stasytyte, V. 2008. Stratification of stock
profitabilities--the framework for investors' possibilities
research in the market, Intellectual Economics 1(3): 67-72.
Rutkauskas, A.; Stasytyte, V.; Stankeviciene, J. 2009a. Profit,
riskness and reliability-three-dimensional base for investment decisions
management, in Modeling and Analysis of Safety and Risk in Complex
Systems: Proceedings of the Ninth International Scientific School, July
7-11, 2009, Saint-Petersburg, Russia. MASR, 105-110.
Rutkauskas, A.; Stasytyte, V.; Borisova, J. 2009b. Adequate
portfolio as a conceptual model of investment profitability, risk and
reliability adjustment to investor's interests, Economics and
Management 14: 1170-1174.
Schmidhuber, J.; Gagliolo, M.; Wierstra, D.; Gomez, F. 2006.
Evolino for recurrent support vector machines, in European Symposium on
Artificial Neural Networks, April 26-28, 2006, Bruges, Belgium. arXiv
preprint cs/0512062, 593-598.
Schmidhuber, J.; Wierstra, D.; Gomez, F. F. 2005. Evolino hybrid
neuroevolution/optimal linear search for sequence learning, in
Proceedings of the 19th International Joint Conference on Artificial
Intelligence, July 30-August 5, 2005, Switzerland: Morgan Kaufmann
Publishers Inc, 466-477.
Sharpe, W. 1994. The sharpe ratio, The Journal of Portfolio
Management 21(1): 49-58. http://dx.doi.org/10.3905/jpm.1994.409501
Wang, J.; Leu, J. 1996. Stock market trend prediction using
ARIMA-based neural networks, in IEEE International Conference on Neural
Networks, vol. 44, June 3-6, 1996, Washington DC, USA. IEEE Service
Center, 2160-2165.
Nijole Maknickiene (1), Algirdas Maknickas (2)
Vilnius Gediminas Technical University, Sauletekio al. 11, LT-10223
Vilnius, Lithuania
E-mails: (1) nijole.maknickiene@vgtu.lt (corresponding author); (2)
algirdas.maknickas@vgtu.lt
Received 16 July 2012; accepted 10 September 2012
Nijole MAKNICKIENE. Assistant, Master in Physics at Vilnius
University, Quantum Electronics (Dipl.-Phys.) 1986. Recently assistant
at the Department of Financial Engineering of Vilnius Gediminas
Technical University. Author of 1 peer-reviewed research paper, and 4
contributions to international conferences. Research interests: capital
markets, researching chaotic processes by a neural network.
Algirdas MAKNICKAS. Dr, Graduated in Physics at Vilnius University
(Dipl.-Phys.) 1986, Dr techn. (Ph.D.) at VGTU in 2009. Recently
associate professor at the Department of Information Technologies of
Vilnius Gediminas Technical University and senior researcher at the
Institute of Mechanical Sciences of Vilnius Gediminas Technical
University. Author of 9 peer-reviewed research papers, and more than 15
contributions to international conferences. He has participated in
several EU-funded projects. His recent research interests include:
theoretical investigation of particulated solids, software engineering,
algorithm theory, chaos forecasting theory.
Table 1. Portfolio indicators

Portfolio              Standard deviation average   Sharpe ratio average
Conservative I         0.32                         0.78
Conservative II        0.53                         0.60
Moderate I             0.48                         0.90
Moderate II            0.63                         0.80
Aggressive I           0.71                         0.81
Aggressive II          0.72                         1.07
Without portfolio I    0.61                         0.73
Without portfolio II   0.62                         0.73