Comparison study on neural network and ordinary least squares model to stocks' prices forecasting.
Tjung, Luna Christie; Kwon, Ojoung; Tseng, K.C.; et al.
INTRODUCTION
Business Intelligence, such as data mining applications, has made some significant contribution to forecasting science. Burstein and Holsapple (2008) state that "business intelligence (BI) is a data-driven decision support system that combines data gathering, data storage, and knowledge management with analysis to provide input to the business decision process." According to Han and Kamber (2006), "data mining is extracting knowledge from large amounts of data".
Neural networks are one such data mining application and are useful for making complex predictions in many disciplines. There are many examples of successful data mining applications. DuMouchel (1999) used Bayesian data mining to work with large frequency tables with millions of cells for FDA Spontaneous application. Giudici (2001) used Bayesian data mining for benchmarking and credit scoring in highly dimensional complex datasets. Jeong et al. (2008) integrated data mining with a process designed using the robust Bayesian approach. "Mostafa (2010) reported many other applications such as air pollution forecasting (e.g. Videnova, Nedialkova, Dimitrova, & Popova, 2006), maritime traffic forecasting (Mostafa, 2004), airline passenger traffic forecasting (Nam & Yi, 1997), railway traffic forecasting (Zhuo, Li-Min, Yong, & Yan-hui, 2007), commodity prices (Kohzadi, Boyd, Kemlanshahi, & Kaastra, 1996), ozone level (Ruiz-Suarez, Mayora-Ibarra, Torres-Jimenez & Ruiz-Suarez, 1995), student grade point averages (Gorr, Nagin & Szczypula, 1994), forecasting macroeconomic data (Aminian, Suarez, Aminian & Walz, 2006), financial time series forecasting (Yu, Wang & Lai, 2009), advertising (Poh, Yao & Jasic, 1998), and market trends (Aiken & Bsat, 1999)."
Mostafa (2010) also surveys the literature on financial forecasting applications of NNs to predict indexes. Cao, Leggio, and Schniederjans (2005) used NNs to predict stock price movements for firms traded on the Shanghai stock exchange and found that NN models outperform the linear models. Kryzanowski, Galler, and Wright (1993) used NN models with historic accounting and macroeconomic data to identify stocks that will outperform the market. McGrath (2002) used market to book and price earnings ratios in a NN model to rank stocks based on the likelihood estimates. Ferson and Harvey (1993) and Kimoto, Asakawa, Yoda, and Takeoka (1990) used a series of macroeconomic variables to capture predictable variation in stock price returns. McNelis (1996) used the Chilean stock market to predict returns in the Brazilian markets. Yumlu, Gurgen, and Okay (2005) used various NN architectures to model the performance of Istanbul stock exchange over the period 1990-2002. Leigh, Hightower, and Modani (2005) used NN models and linear regression models to model the New York Stock Exchange Composite Index data for the period from 1981 to 1999. Their results were robust and informative as to the role of trading volume in the stock market.
From this literature survey we find that no previous studies have attempted to predict the changes in stock prices from various industries with comprehensive independent variables. In this study, we focus on the price changes of 37 stocks from eight industries by incorporating critical independent variables.
If the market is efficient, according to the efficient market hypothesis by Fama, then we cannot predict stocks' prices. Fama (1970, p. 383) defined efficient market hypothesis (EMH) as the idea where "a market in which prices provide accurate signals for resource allocation, that is, a market in which firms can make production-investment decisions, and investors can choose among the securities that represent ownership of firms' activities under the assumption that security prices at any time 'fully reflect' all available information." Accordingly, it would be difficult, if not impossible, to consistently predict and outperform the market because the information that one would use to make such predictions would already be reflected in the prices.
Although EMH has become the mainstream view in finance and has received theoretical and empirical support, technical analysis has been widely applied on Wall Street for over a century, and EMH has been vigorously challenged by behavioral finance since the 1980s. It is impossible to review all relevant and significant studies that point out the flaws of EMH. Here we summarize some of the prominent studies that are representative of recent developments.
EMH is based on three key fundamental assumptions. One of them is that market participants are perfectly rational and can value securities rationally all the time. Next, if there are irrational investors, their trading activities will cancel out with one another or will be arbitraged away by other rational investors (Shleifer, 2000). Finally, investors have well-defined subjective utility functions to be maximized. As Herbert Simon (1982 and 1997), a Nobel laureate, pointed out, when there is risk and uncertainty or incomplete information about an alternative or high degree of complexity such as investing, people tend to behave somewhat differently from rationality. This is what Simon called bounded rationality. Bounded rationality "is used to designate rational choice that takes into account of the cognitive limitations of the decision-maker, limitations of both knowledge and computational capacity. Bounded rationality is the central theme in the behavioral approach to economics, which is deeply concerned with the ways in which the actual decision-making process influences the decisions that are reached". (Simon, 1997, p. 291) In other words, the underlying assumptions of EMH can exist only in Plato's idealistic world. It may be an elegant theory, but it cannot exist in the complex, uncertain, and risky investing environments and financial markets.
In fact, Simon (1955, 1982), Kahneman, Slovic, and Tversky (1984), and Kahneman and Tversky (1979) brought the psychological aspects of decision-making into economics and finance and built the foundation for so-called behavioral economics and finance. However, behavioral finance caught the attention of the finance profession and investment community since the empirical studies of De Bondt and Thaler (1985, 1987) were published. Their 1985 paper was based on the monthly return data of the New York Stock Exchange common stocks from January 1926 to December 1982, while their 1987 follow-up study used the Annual Industrial COMPUSTAT tapes data from 1965 to 1984. They discovered that investors tended to overweigh recent information and underweigh base rate information and also found that portfolios of prior losers outperformed that of prior winners. If investors count on the representativeness heuristic, a tendency to make decisions or judge information that fits their preconceived categories or stereotypes of a situation, they become too optimistic about the recent winners and too pessimistic about the recent losers. In recent years, behavioral finance has become more widely accepted as a result of further works such as Shefrin (2000), Shiller (2000, 2002, and 2003), Hirshleifer (2000), and Thaler (2005).
People in general and investors in particular tend to predict future uncertain conditions by focusing on recent history but pay little attention to the possibility that the short recent history may have occurred by chance, as found by Kahneman and Tversky (1979), who showed that market participants systematically violate Bayes' rule and other maxims of probability theory when predicting risky and uncertain outcomes. People tend to weigh memorable, salient, and vivid evidence more heavily than the truly relevant information. Based on their prospect theory, they find that investors are reluctant to sell losing stocks. Similarly, by using a very large sample of individual investors at a large discount broker, Barber and Odean (2000) also found that overconfident investors tended to sell winners too soon and keep losers too long. Kahneman and Riepe (1998) found that investors pervasively and systematically deviated from the maxims of economic rationality by overweighing the recent information and underweighing the base line information. Furthermore, Odean (1998, 1999) finds that investors are inclined to overestimate their own abilities and be too optimistic about future conditions, focus too much on attention-getting information that is consistent with their existing beliefs, and emphasize their private information too heavily. He also found that people tend to demonstrate the highest overconfidence when dealing with complex and difficult matters, such as investing in stocks and predicting their future returns. He also found that overconfident traders are likely to trade too much and too frequently, lower their expected utility, generate greater market depth, and increase market volatility. Daniel, Hirshleifer, and Subrahmanyam (1998) argue that due to investors' self-attribution bias and representative heuristic their confidence grows significantly when public information agrees with investors' private information.
However, when public information disagrees with investors' private information, their confidence declines only slightly. As a result, investors tend to overreact to private information signals and underreact to public information signals. They also show that positive return autocorrelation results from continuous short-term overreaction followed by long-term correction. These findings of short-term continuation and long-term reversal are consistent with a study by Balvers, Wu, and Gilliland (2000), who used the national stock index data of 18 countries from 1969 to 1996. Finally, Easterwood and Nutt (1999) found that even professional analysts underreact to most negative information but overreact to most positive information.
Experts theorize investors' overconfidence in quite different ways. Daniel, Hirshleifer, and Subrahmanyam (1998) attribute overconfidence to investors' self-attribution of investment outcomes and the resulting short-term returns continuation and long-term reversal. They show that positive return autocorrelations are the result of continuing overreaction only to be followed by long-term negative return autocorrelations. In addition, biased self-attribution increases short-term positive autocorrelations, or momentum. Based on bounded rationality of market participants, Hong and Stein (1999) divided investors into news-watchers and momentum traders. They assumed that "each news-watcher observes some private information, but fails to extract other news-watchers information from prices. If information diffuses gradually across the population, prices underreact in the short run. The underreaction means that the momentum traders can profit by trend-chasing. However, if they can only implement simple (i.e., univariate) strategies, their attempts at arbitrage must inevitably lead to overreaction at long horizons." (Hong and Stein, p. 2143) In their model, the news-watchers made forecasts on the basis of private observations about future fundamentals without taking into account the current and past prices. The momentum traders, on the other hand, applied only simple or univariate functions of the past prices to make their forecasts. Both theories lead to similar conclusions of short-term momentum and long-term reversal of returns. If indeed these are the general patterns of market movement, investors can easily follow the patterns and profit from them. Many technical analysts have incorporated momentum and trading volumes to identify some patterns for profitable opportunities.
Recently, Gutierrez and Kelley (2008) used weekly data based on the midpoint of the final bid and ask quotes from Wednesday to Wednesday for the period of 1983-2003. They included all stocks listed on the NYSE and AMEX but excluded stocks of $5 or lower. They formed a portfolio of winners as those stocks with returns in the highest decile and a portfolio of losers as those stocks with returns in the lowest decile. The differential profits between the winners' portfolio and losers' portfolio demonstrated significant reversal or negative returns in the first two weeks. The profit differentials became positive and significant from week 4 to week 52. Their findings were consistent with the 3 to 12 months continuation of return momentum found by Jegadeesh and Titman (1993). They used calendar data rather than event data typically used in most other studies. In other words, their findings of very short-term one to two week reversal and up to one year long-term momentum or underreaction were quite different from the news or information-driven overreaction and underreaction found by Daniel, Hirshleifer, and Subrahmanyam (DHS) (1998) or Hong and Stein (HS) (1999). In the DHS study, investors overreacted to private information and underreacted to public information, while in the HS study news-watchers underreacted to public news and the trend chasers overreacted to price movements. They also found that week-1 stock return reversal is statistically greater for large stocks, for more institutional owned stocks, for greater volatility stocks, and for more analyst-covered stocks. The practical implication from this empirical study is that investors should buy the top winners and short the large losers in the previous week and realize the extra profits over the next 52 weeks.
Based on the Merged COMPUSTAT-CRSP database for the period of July 1963-June 2005, Menzly and Ozbas (2010) hypothesized that since investors specialize in certain market segmented industries, valuable and relevant information tends to diffuse gradually throughout the entire financial markets. First, they found that returns of individual stocks and industries demonstrate strong cross-predictability with some lagged returns in supplier and customer industries. Second, the results showed that the smaller the number of analysts or institutional ownership, the greater the cross-predictability. More analyst coverage or institutional ownership increased the speed of information diffusion and reduced the cross-predictability. Analysts and institutional investors may be considered as informed investors. Finally, they found that a strategy of buying stocks with high returns in supplier industries over the previous month and at the same time selling stocks with low returns in supplier industries over the previous month generates annual excess returns of 7.3%. If investors apply a similar strategy to buy and sell the stocks of the customer industries in the previous month, the annual excess returns will be 7.0%. When they incorporate the three factors identified by Fama and French (1994), both trading strategies are still consistently profitable with no significant effects by the three factors.
Based on the findings from these empirical studies on overreaction/underreaction, short-term (one week or one month) reversal, medium-term (three week or three months to 52 weeks or one year) continuation or momentum, and long-term (three to five years) reversal, and the most recent findings on cross-predictability, it appears that stock prices or returns may be predictable and excess returns may be realized with proper trading strategies. These findings in a sense support the traditional technical analysis and some arguments advanced by behavioral finance. We now turn to illustrate some examples to see how academicians tested the efficacy of technical analyses.
Brock, Lakonishok, and LeBaron (BLL) (1992) applied two simple trading rules: the moving averages and trading range break-out (TRB). The data they used were the daily Dow Jones Industrial Average (DJIA) from the first trading day in 1897 to the last trading day in 1986. In addition to the full sample they also tested the nonoverlapping subsamples: 1/1/1897-7/30/1914, 1/1/1915-12/31/1938, 1/1/1939-6/30/1962, and 7/1/1962-12/31/1986. The variable-length moving average (VMA) or fixed-length moving average (FMA) was used to initiate buy (sell) signals when the short moving average crosses the long moving average from below (above) with or without a band such as plus or minus one percent. They tried some popular short and long moving average combinations of 1-50, 1-150, 5-150, 1-200, and 2-200. Under the TRB trading rule, a buy signal occurs when the price penetrates the resistance level or the local maximum. Similarly, when the price penetrates the support level or local minimum, a sell signal will result. To be comparable to the moving average rules, the maximum or minimum prices were based on the past 50, 150, or 200 days. To avoid the possible dependencies across different trading rules, leptokurtosis, autocorrelation, conditional heteroskedasticity, and changing conditional means, they applied the bootstrap methodology to test and validate their empirical findings. Their empirical results strongly supported the technical trading strategies with returns from buy signals being 0.8 percent higher than the returns from sell signals over a 10-day period. The returns from buy (sell) signals were higher (lower) than the normal returns. They pointed out that the return difference between buys and sells cannot be explained by risk differential.
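The two rules described above can be illustrated with a minimal Python sketch. It is not BLL's implementation: the function names, the per-day labelling of signals (each day is classified by whether the short average sits above or below the long average, adjusted by the band), and the toy price series are all assumptions made for illustration.

```python
from typing import List, Optional


def moving_average(prices: List[float], window: int, t: int) -> Optional[float]:
    """Mean of the `window` prices ending at index t (None if history is too short)."""
    if t + 1 < window:
        return None
    return sum(prices[t - window + 1 : t + 1]) / window


def vma_signals(prices: List[float], short: int = 1, long: int = 50,
                band: float = 0.0) -> List[Optional[str]]:
    """Moving-average rule (illustrative): 'buy' when the short average is above
    the long average by more than the band, 'sell' when it is below by more
    than the band, otherwise no signal."""
    signals: List[Optional[str]] = []
    for t in range(len(prices)):
        s = moving_average(prices, short, t)
        l = moving_average(prices, long, t)
        if s is None or l is None:
            signals.append(None)
        elif s > l * (1 + band):
            signals.append("buy")
        elif s < l * (1 - band):
            signals.append("sell")
        else:
            signals.append(None)
    return signals


def trb_signals(prices: List[float], lookback: int = 50) -> List[Optional[str]]:
    """Trading range break-out (illustrative): 'buy' when today's price exceeds
    the maximum of the previous `lookback` days, 'sell' when it falls below
    the minimum of those days."""
    signals: List[Optional[str]] = []
    for t in range(len(prices)):
        if t < lookback:
            signals.append(None)
            continue
        window = prices[t - lookback : t]
        if prices[t] > max(window):
            signals.append("buy")
        elif prices[t] < min(window):
            signals.append("sell")
        else:
            signals.append(None)
    return signals


# Toy series (hypothetical data): a rise followed by a decline.
toy_prices = [1, 2, 3, 4, 5, 4, 3, 2, 1]
toy_vma = vma_signals(toy_prices, short=1, long=3)
toy_trb = trb_signals(toy_prices, lookback=3)
```

With the 1-50 combination from the text one would call `vma_signals(prices, short=1, long=50, band=0.01)` for a one percent band; the short windows above are used only so the toy series produces signals.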
In addition to moving averages, support and resistance levels, and channel breakouts, Sullivan, Timmermann, and White (1999) considered the on-balance volume (OBV) indicator. The OBV indicator "is calculated by keeping a running total of the indicator each day and adding the entire amount of daily volume when the closing price increases, and subtracting the daily volume when the closing price decreases." The OBV trading rules are similar to the moving average trading rules except that they use the OBV indicator rather than price. They applied the DJIA daily data from 1897 to 1996, extending the BLL study by 10 years. They found that the BLL results are robust to data-snooping and that technical trading rules are profitable. But they also found that BLL's profitable technical trading rules do not repeat in the out-of-sample experiment for the period of 1987-1996. In fact, the results were reversed.
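The running total in the quoted OBV definition can be sketched in a few lines of Python. The function name, the starting value of zero, the choice to leave the total unchanged when the close is flat, and the sample data are assumptions for illustration, not details taken from Sullivan, Timmermann, and White.

```python
from typing import List


def on_balance_volume(closes: List[float], volumes: List[float]) -> List[float]:
    """Running OBV total: add the day's volume when the close rises,
    subtract it when the close falls, leave it unchanged otherwise."""
    obv = [0.0]  # assumed starting value for the running total
    for t in range(1, len(closes)):
        if closes[t] > closes[t - 1]:
            obv.append(obv[-1] + volumes[t])
        elif closes[t] < closes[t - 1]:
            obv.append(obv[-1] - volumes[t])
        else:
            obv.append(obv[-1])
    return obv


# Hypothetical closing prices and daily volumes.
sample_obv = on_balance_volume([10, 11, 10.5, 10.5, 12], [100, 200, 150, 300, 250])
```

The resulting series can then be fed to a moving-average rule in place of the price series, which is how the text describes the OBV trading rules.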
Recently, Lo, Mamaysky, and Wang (LMW) (2000) provided precise definitions of head-and-shoulders patterns, which have three peaks with the middle peak higher than the other two, as well as broadening tops and bottoms, triangle tops and bottoms, and rectangle tops and bottoms. The last three definitions are characterized by five consecutive extrema. They also specify the double tops and double bottoms (LMW, pp. 1716-1718). Then they applied the nonparametric kernel regression method to identify the nonlinear patterns of stock price movements. Using CRSP data they applied the goodness-of-fit and Kolmogorov-Smirnov tests to the daily returns of individual NYSE/Amex and Nasdaq stocks from 1962 to 1996. They found that some patterns of technical analysis provided incremental information and had practical value. They contended that technical analysis may be improved by their automated algorithms to determine some optimal patterns. Finally, Blume, Easley, and O'Hara (BEO) (1994) developed a theoretical model that showed that "technical analysis is valuable because current market statistics may be sufficient to reveal some information, but not all. Because the underlying uncertainty in the economy is not resolved in one period, sequences of market statistics can provide information that is not impounded in a single market price." (BEO, p. 177) In their model, volume can provide some information different from that provided by price.
In brief, trading patterns of price and volume as used by technical analysts over a century indeed provide certain valuable and/or profitable information. It is widely known that most technical analysts apply both technical analyses and fundamental analyses to manage their portfolios and conduct their trading. Technical patterns and market momentum and reversal are helpful to market participants to develop some forecasting methods and models to predict stock prices. In this study, we tried to identify various factors to forecast prices of some randomly selected stocks and funds from different industries. Since ordinary least squares (OLS) has been widely applied, we used it in this study. In addition, a neural network (NN) was applied for comparison.
For the remainder of the paper, we introduced a business intelligence (BI) technique, specifically Neural Network, and discussed how NN can be used to predict stock prices. In the next section, we compared and contrasted Ordinary Least Squares (OLS) and Neural Network (NN) models with regard to their accuracy and ease of use in stock forecasting. Next, we explained the methodology for testing our hypothesis including details on our predictors and data normalization. Finally, we presented the results and discussed the implications of our findings.
TOOLS AND TECHNIQUES OF STOCK FORECASTING
The Ordinary Least Squares (OLS) model has many advantages. It is easy to use, to validate, and typically generates the best combination of predictors by using stepwise regression. However, OLS is a linear model that has a relatively high forecasting error when forecasting a nonlinear environment, which is common in the stock markets. Also, the OLS model can only predict one dependent variable at a time. On the other hand, a neural network model has high precision, is capable of prediction in nonlinear settings, and addresses problems with a great deal of complexity. Given these advantages, we expected the neural network to outperform OLS in predicting stock prices.
In addition to determining which factors best predict changes in stock prices, we also compared the two analytical strategies of OLS and artificial neural networks. Hammad, Ali, and Hall (2009) showed that the artificial neural network (ANN) technique provides fast convergence, high precision and strong forecasting ability of real stock prices. Traditional methods for stock price forecasting are based on statistical methods, intuition, or on experts' judgment. Traditional methods' performance depends on the stability of the prices; as more political, economical and psychological impact-factors get into the picture, the problem becomes nonlinear, and traditional methods need a more nonlinear method like ANN, fuzzy logic, or genetic algorithms.
Along the same lines as Hammad et al., (2009), West, Brockett, and Golden (1997) concluded that the neural network offers superior predictive capabilities over traditional statistical methods in predicting consumer choice in nonlinear and linear settings. Neural networks can capture nonlinear relationships associated with the use of non-compensatory decision rules. The study revealed that neural networks have great potential for improving model predictions in nonlinear decision contexts without sacrificing performance in linear decision contexts.
However, neural networks are not a panacea. For example, Yoon and Swales (1991) concluded that, despite the neural network's capability of addressing problems with a great deal of complexity, increasing the number of hidden units improved performance only up to a certain point; additional hidden units beyond that point impaired the model's performance.
Prior research and common wisdom have suggested several factors that might be used in OLS or a neural network model to predict stock prices. Grudnitski and Osburn (1993) used general economic conditions and traders' expectations about what will happen in the market for their futures. Kahn (2006) stated that the sentiment indicator is the summation of all market expectation that is driven by volatility index, put/call ratio, short interest, commercial activity, surveys, magazines, emotions, and many more. Tokic (2005) showed that political events like the war on terror, fiscal policy to lower taxes, and monetary policy to lower short-term interest, and the increase in the budget deficit are related to stock prices. Nofsinger and Sias (1999) showed that there is a strong positive relation between annual changes in institutional ownership and returns over the herding interval across different capitalizations.
Moshiri and Cameron (2000) compared the most commonly used type of artificial neural network (the back-propagation networks (BPN) model) with six traditional econometric models (three structural models and three time series models) in forecasting inflation. BPN models are static or feed-forward-only (input vectors are fed through to output vectors, with no feedback to input vectors again); they are hetero-associative (the output vector may contain variables different from the input vector) and their learning is supervised (an input vector and a target output vector both are defined and the networks tend to learn the relationship between them through a specified learning rule). The three structural models include (1) the reduced-form inflation equation that follows from a fairly standard aggregate demand-aggregate supply model with adaptive expectations, (2) the inflation equation from Ray Fair's econometric forecasting model, and (3) a monetary model for forecasting inflation. The three time series models are (1) an ARIMA (Autoregressive integrated moving average) model, the single-variable model derived from Box-Jenkins methodology, (2) a Vector Autoregression or VAR model, considered the joint behavior of several variables, and (3) a Bayesian Vector Autoregression or BVAR model, the combination of VAR model with prior information on the coefficients of the model and estimated using a mixed-estimation method. In one-period-ahead dynamic forecasting, the information contained in the econometric models was contained in the BPN, and the BPN contained further information; the BPN models were superior in all four comparisons. Over a three-period forecast horizon the BPN models were superior in two comparisons (VAR and Structural) and inferior in two (ARIMA and BVAR). Over a twelve-period forecast horizon, the BPN models were superior in two comparisons (VAR and Structural) and equally good in two (ARIMA and BVAR). 
Moshiri (2001) concluded that the BPN model has been able to outperform econometric models over longer forecast horizons.
There are many examples of the successful applications of data mining. DuMouchel (1999) used Bayesian data mining to work with large frequency tables with millions of cells for FDA Spontaneous application. Giudici (2001) used Bayesian data mining for benchmarking and credit scoring in highly dimensional complex datasets. Jeong et al. (2008) integrated data mining with a process designed using the robust Bayesian approach. Dutta, Jha, Laha, and Mohan (2006) applied artificial neural network (ANN) models to forecast Bombay Stock Exchange's SENSEX weekly closing values. They compared two ANNs they developed and trained using 250 weeks' data from January 1997 to December 2001. They used root-mean-square error and mean absolute error to evaluate forecasting performance of the two ANNs for the period of January 2002-December 2003. Ying, Kuo, and Seow (2005) applied the hierarchical Bayesian (HB) approach to stock prices of 28 companies contained in DJIA from the third quarter of 1984 to the first quarter of 1998. They found the HB method predicted better than the classical method. Finally, Tsai and Wang (2009) applied ANN and decision tree (DT) models and found that the combination of ANN and DT models resulted in more accurate prediction.
HYPOTHESIS
Because the neural network (NN) model can address problems with a great deal of complexity and improve its prediction in nonlinear settings, we expect that the neural network will outperform OLS in predicting stock prices.
H0: OLS model better predicts stock prices than NN model
H1: NN model better predicts stock prices than OLS model
METHODOLOGY
In order to forecast the changes in stock prices, we used the daily changes in stock prices of 37 stocks from September 1, 1998 to April 30, 2008. This 10-year time period was selected because we wanted to include the dot com bubble and the early part of the recent global financial crisis in our forecasting horizon instead of including expansion periods only. In our previous study, we used only seven financial stocks because they were relatively volatile and more sensitive to economic news. In this study, we expand the sample to 37 stocks in various industries, as shown in Table 2, to examine whether we can generalize our findings from our previous study. In that study, we found that NN provided superior performance with up to 96% forecasting accuracy compared to the OLS model with only 68% (Tjung, Kwon, Tseng, and Bradley-Geist, 2010).
Predictors (see Appendix for full details)
As shown in the Appendix, we used eight indicators such as macroeconomic leading indicators (global market indices), microeconomic indicators (competitors), political indicators (presidential election date and party), market indicators (USA index), institutional investors (BEN), and calendar anomalies as our independent variables to predict changes in daily financial stock prices. We also took into account the business cycle factors such as the "dot-com bubble" into our forecasting horizon with dummy variables. We gathered our data through the National Bureau of Economic Research (NBER), Yahoo Finance, Federal Reserve Bank, Market Vane (MV), NYSE, and FXStreet.
The macroeconomic indicators included the 18 major global stock indices. The microeconomic indicators included the competitors and companies from different industries. There are 213 of them. The daily market indicators included changes in price and volume of the S&P500, Dow Jones Industrial, Dow Jones Utility, and Dow Jones Transportation. The sentiment indicators included the Volatility Index (VIX) and CBOE OEX Implied Volatility (VXO).
The political indicators included major elections and the political party in control. We denoted a non-election date with 0. The calendar anomalies include a daily, weekly, monthly, and pre-holiday calendar. The daily calendar includes Monday, Tuesday, Wednesday, Thursday, and Friday. The weekly calendar includes week one to week five. The monthly calendar includes January to December. We denoted Tuesday, week 2, and September with dummy variable zero in the OLS model. Contrastingly, we included them in the NN model. The business cycle included the recession from technological crash and current bear market with dummy variable one.
Coding
In the NN model, we included all qualitative variables (political indicators, calendar anomalies, and business cycle) by using multiple neurons to minimize bias. Table 1 below shows an NN coding example for political indicators. Coding is necessary to avoid discriminating among the categories of a qualitative variable. Instead of a single column holding a 0/1 dummy for the Republican and Democratic parties, as in the OLS model, coding assigns two neurons (two columns), one per party, each holding 0 or 1. With just one column, as in the OLS model (1 for Republican and 0 for Democratic), Democrats would receive a smaller input value than Republicans, thus impairing the network performance.
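The difference between the single-column dummy and the two-neuron coding can be made concrete with a short Python sketch. The function names and the sample party sequence are hypothetical, and the sketch only illustrates the encoding idea; it does not reproduce how the NN software performs the coding.

```python
from typing import List, Sequence


def single_dummy(values: Sequence[str], one_category: str) -> List[int]:
    """OLS-style coding: one column, 1 for `one_category` and 0 otherwise."""
    return [1 if v == one_category else 0 for v in values]


def one_column_per_category(values: Sequence[str],
                            categories: Sequence[str]) -> List[List[int]]:
    """NN-style coding: one 0/1 column (input neuron) per category, so each
    category activates its own neuron instead of being represented by 0."""
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical sequence of the party in control on four dates.
parties = ["Republican", "Democratic", "Democratic", "Republican"]

ols_column = single_dummy(parties, "Republican")
nn_columns = one_column_per_category(parties, ["Republican", "Democratic"])
```

In `ols_column` a Democratic date contributes an input of 0, whereas in `nn_columns` it contributes a 1 on its own neuron, which is the asymmetry the coding is meant to remove.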
OLS Models
We used SPSS to perform stepwise regression to create a unique regression model for each company. First, we included all independent variables to predict dependent variables at a 95% confidence level. Second, we retained the statistically significant independent variables, those with a p-value below 5%, and used their coefficients to calculate the dependent variable value for 152 randomly selected datasets. Third, we calculated the forecasting error between the predicted and actual values. Then, we measured the mean and standard deviation of the % error = (actual value - predicted value)/actual value.
Y = [alpha] + [[beta].sub.1][X.sub.1] + [[beta].sub.2][X.sub.2] + ... + [[beta].sub.n][X.sub.n] where n = 1 to 267 (the initial model)
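The error measure used in the OLS procedure above can be sketched as follows. This is an illustration only: the function names are hypothetical, the sample numbers are made up, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not say which form SPSS reported.

```python
import statistics


def percent_errors(actual, predicted):
    """% error = (actual - predicted) / actual, expressed in percent."""
    return [100.0 * (a - p) / a for a, p in zip(actual, predicted)]


def summarize(errors):
    """Return (mean, sample standard deviation) of the % errors."""
    return statistics.mean(errors), statistics.stdev(errors)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    actual = [100.0, 200.0, 50.0]
    predicted = [90.0, 220.0, 49.0]
    errors = percent_errors(actual, predicted)
    print(errors, summarize(errors))
```

Note that the measure divides by the actual value, so it becomes very large when the actual value is close to zero.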
NN Model
A basic neural network model is shown in Figure 1. Input neurons (1 to n) are connected to an output neuron (j), and each connection has an assigned weight ($w_{j0}$ to $w_{jn}$). In this example, the output of j becomes 1 (activated) when the sum of the total stimulus ($S_j$) becomes greater than 0 (zero). The activation function in this example is a simple unit step function (0 or 1), but other functions (Gaussian, exponential, sigmoid, or hyperbolic functions) are used for complex networks.
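A minimal sketch of the single output neuron described above follows. It assumes that the weight w_j0 is a bias term paired with a constant input of 1, which is a common convention but is not stated in the text; the function name and the example weights are hypothetical.

```python
def neuron_output(weights, inputs):
    """Unit step neuron: returns 1 if the total stimulus S_j > 0, else 0.

    weights[0] is treated as the bias weight w_j0 (assumed to pair with a
    constant input of 1); weights[1:] pair with inputs 1..n.
    """
    if len(weights) != len(inputs) + 1:
        raise ValueError("expected one more weight than inputs")
    stimulus = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if stimulus > 0 else 0


if __name__ == "__main__":
    weights = [-0.5, 1.0, 1.0]  # illustrative values only
    for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
        print(pattern, neuron_output(weights, pattern))
```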
Backpropagation is one of the most popular learning algorithms in neural networks and is derived to minimize the error given by the following formula:
$E = \frac{1}{2} \sum_{p} \sum_{k} (t_{pk} - O_{pk})^{2}$
where: p = the pattern
k = the output unit
$t_{pk}$ = the target value of output unit k for pattern p
$O_{pk}$ = the actual output value of output layer unit k for pattern p.
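The error that backpropagation minimizes can be computed directly from the definition above. The sketch below is illustrative; the function name and the nested-list layout (one inner list of output units per pattern) are assumptions.

```python
def backprop_error(targets, outputs):
    """E = 0.5 * sum over patterns p and output units k of (t_pk - O_pk)^2.

    targets and outputs are lists of patterns; each pattern is a list with
    one value per output unit.
    """
    total = 0.0
    for target_row, output_row in zip(targets, outputs):
        for t, o in zip(target_row, output_row):
            total += (t - o) ** 2
    return 0.5 * total


if __name__ == "__main__":
    # Two patterns, one output unit each (as in a 272-41-1 network).
    print(backprop_error([[1.0], [0.0]], [[0.5], [0.5]]))
```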
[FIGURE 1 OMITTED]
Genetic Algorithm
The neural networks used a genetic algorithm, which has capabilities in pattern recognition, categorization, and association. Turban (1992) showed that a genetic algorithm enables neural networks to learn and adapt to changes through machine learning, automatically solving complex problems based on a set of repeated instructions. This is similar to the biological process of evolution. One primary characteristic of genetic algorithms that we found in the Alyuda NeuroIntelligence NN application is reproduction. Genetic algorithms enable the NN to produce improved solutions by selecting input variables with higher fitness ratings. Alyuda NeuroIntelligence enables us to retain the best network.
Fuzzy Logic
Fuzzy logic uses the process of normal human reasoning to cope with uncertainty or partial information. Turban (1992) showed that fuzzy logic can be advantageous because it provides flexibility for the unexpected, gives options to make an educated guess, frees the imagination by asking "what if...?", and allows for observation.
First Attempt using BrainMaker
First, we used BrainMaker software to create the NN model for four companies (C, GS, JPM, and MS), but BrainMaker had a major limitation of 20 variables, so it was not adequate for the number of variables in our model. Because of this limit, we fed BrainMaker only the independent variables selected by stepwise regression. However, training broke down and BrainMaker was unable to learn. Its failure to perform is shown in Figure 2.
[FIGURE 2 OMITTED]
Second Attempt using NeuroIntelligence
Alyuda NeuroIntelligence allowed us to handle more data and find the best network architecture. With this benefit, we increased the number of independent variables from 20 to 267 (see Table A.1 to A.8 in Appendix) and the number of dependent variables from 4 to 37 (see Table 2). The major difference in the independent variables is that they now cover all sub-industries in the market, which we added to increase our performance accuracy. We also took out the market vane indicators of the commodity market because of their lack of correlation with our dependent variables.
We used Alyuda NeuroIntelligence to create the second generation of NN models. We transformed the data by taking the changes, or first differences, of our independent variables, except for the dummy variables. We also normalized the data because the values ranged from negative to positive numbers. We followed a seven-step neural network design process to build the network. We used Alyuda NeuroIntelligence to perform data analysis, data preprocessing, network design, training, testing, and query. The dataset was examined with the data analysis function, which reported missing values, wrong type values, outliers, rejected records and columns, input columns, and the output column in the dataset.
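The first-difference transformation mentioned above can be sketched as below. The function name is hypothetical and the prices are made up; dummy variables would be left untransformed, as the text states.

```python
def first_difference(series):
    """Return the change from each observation to the next.

    The result has one element fewer than the input, because the first
    observation has no predecessor.
    """
    return [current - previous for previous, current in zip(series, series[1:])]


if __name__ == "__main__":
    prices = [10.0, 12.5, 11.0, 11.0]  # made-up daily prices
    print(first_difference(prices))
```

Differencing is what introduces negative values into the data: a price series is always positive, but its daily changes are not.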
We tried both the logistic and the hyperbolic tangent function when designing the network to see which had higher accuracy. The hyperbolic tangent is a sigmoid curve calculated using the formula $F(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$, with an output range of [-1..1]. The logistic function also has a sigmoid curve and is calculated by $F(x) = 1/(1 + e^{-x})$, with an output range of [0..1]. Empirically, it is often found that the hyperbolic tangent function performs better than the logistic function. However, we wanted to try both functions and select the better one to increase performance. From the table below, we found that the logistic function outperformed the hyperbolic tangent function 27 out of 37 times. There is no significant spread between the logistic and the hyperbolic tangent function, but trying both can increase performance accuracy.
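The two activation functions can be written out directly from their formulas. The sketch below is for illustration; the function names are our own. The logistic output lies between 0 and 1, while the hyperbolic tangent output lies between -1 and 1.

```python
import math


def logistic(x):
    """F(x) = 1 / (1 + e^-x); output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def hyperbolic_tangent(x):
    """F(x) = (e^x - e^-x) / (e^x + e^-x); output lies between -1 and 1."""
    ex, enx = math.exp(x), math.exp(-x)
    return (ex - enx) / (ex + enx)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, logistic(x), hyperbolic_tangent(x))
```

The two curves are rescaled versions of each other, since tanh(x) = 2 * logistic(2x) - 1, which is consistent with the small spread between them reported above.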
We used the Batch Back Propagation model with a stopping condition of 1001 iterations to find the best network during network training. We then retrained each network with a smaller number of iterations, stopping at the point where the best network had been found. For all networks, we used the same model architecture of 272-41-1 with a different number of iterations. The network architecture consisted of 272 input neurons (including 5 additional neurons from coding), 41 neurons in the hidden layer, and one output neuron. The number of iterations is important for escaping a local minimum and reaching the global minimum, which is the lowest error the network can attain in training.
"Back propagation algorithm is the most popular algorithm for training of multi-layer perceptrons and is often used by researchers and practitioners. The main drawbacks of back propagation are: slow convergence, need to tune up the learning rate and momentum parameters, and high probability of getting caught in local minima." (Alyuda NeuroIntelligence Manual, 2010)
[FIGURE 3 OMITTED]
Also, we used overtraining control techniques such as "Retrain" and "Restore" the best network and "Add 10% jitter to inputs," "Weights randomization method such as Gaussian distribution of network inputs," and "Retrain network two times of the lowest training error to train the network." By retaining and restoring the best network, we can prevent over-training, that is, memorizing data instead of generalizing and encoding data relationships, and thus reduce the network error. Over-training shows up in the training graph as validation errors that rise while training errors may still decline. Adding jitter, that is, 10% random noise added to each input variable during training, not only helps prevent overtraining but also allows the network to escape local minima during training (a major drawback of batch back propagation). By randomizing the weights, we avoid sigmoid saturation problems that cause slow training. We used a Gaussian distribution because it is characterized by a continuous, symmetrical, bell-shaped curve. A sample screen shot from NeuroIntelligence is shown in Figure 3. Figure 4 shows the error distribution during training: the error falls rapidly and reaches the best network at 1001 iterations.
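The idea of input jitter can be sketched as follows. This is not the NeuroIntelligence implementation, whose internals are not described here; the sketch assumes that "10% jitter" means multiplying each input by a random factor drawn uniformly between 0.9 and 1.1, and the function name is hypothetical.

```python
import random


def add_jitter(inputs, level=0.10, rng=None):
    """Return a noisy copy of one input row.

    Assumption: each value is scaled by a factor drawn uniformly from
    [1 - level, 1 + level]. Other definitions of jitter (for example,
    additive Gaussian noise) are equally plausible.
    """
    rng = rng or random.Random()
    return [x * (1.0 + rng.uniform(-level, level)) for x in inputs]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    row = [0.2, 1.5, 3.0]   # made-up input values
    for epoch in range(3):
        # A fresh noisy copy is drawn every time the row is presented.
        print(epoch, add_jitter(row, rng=rng))
```

Because the network never sees exactly the same input twice, it is harder for it to memorize individual training records.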
[FIGURE 4 OMITTED]
Data: Model Building Data Set and Performance Testing Data Set for OLS
We used 2431 data points for each company to build the OLS model by running stepwise regression. With stepwise regression, we can reduce our independent variables to only the statistically significant ones. By doing that, we reduced the number of independent variables to between 31 and 61. After building the OLS model, we tested it on 152 randomly selected data points by calculating the prediction error. Finally, we compared the forecasting accuracy of the OLS and NN methods by calculating the mean and standard deviation of the % error, that is, the error divided by the actual value of the stock price.
Data: Training Data Set, Validation Data Set, and Testing Data Set
Unlike the OLS model, the NN model used all independent variables. Three sets of data are used in the neural network model: a training set, a validation set, and a testing set. The training set is used to train the neural network and adjust network weights. The validation set is used to tune network parameters other than weights, to calculate generalization loss, and to retain the best network. The testing set is used to test how well the neural network performs on new data after the network is trained. We used the training and validation data to train the network and come up with a model. Finally, we used the testing data to measure the forecasting errors between the actual and predicted values. Out of 2431 data points for each company, 152 were randomly chosen as testing data. The remainder was divided equally between the training and validation data.
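The three-way split described above can be sketched as below. The function name and the random seed are hypothetical. Since 2431 - 152 = 2279 is an odd number, an exactly equal division is not possible; the sketch assumes the extra record goes to the training set.

```python
import random


def split_indices(n_records=2431, n_test=152, seed=0):
    """Randomly split record indices into training, validation, and testing sets."""
    rng = random.Random(seed)
    indices = list(range(n_records))
    rng.shuffle(indices)
    test = indices[:n_test]
    remaining = indices[n_test:]
    half = (len(remaining) + 1) // 2  # training gets the extra record (assumption)
    return remaining[:half], remaining[half:], test


if __name__ == "__main__":
    train, validation, test = split_indices()
    print(len(train), len(validation), len(test))
```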
Data Normalization
Our data contained both positive and negative numbers. We wanted all numbers to be positive to see how the neural network would learn, so we shifted the data upward, which is what we mean here by normalizing it. First, we searched for the lowest negative number in the data. Second, we took its absolute value, because adding the negative number itself would only produce larger negative numbers; for example, -6 + (-6) = -12. Third, to allow for rounding error, we added 0.1 to that absolute value. For example, to normalize the data of company A, whose lowest negative number was -6.7, we added [absolute value of -6.7] to 0.1 and obtained 6.8. We then added 6.8 to every number, so that the lowest number becomes 6.8 + (-6.7) = 0.1. To sum up, the formula we used to normalize the data is: normalized value = [absolute value of lowest negative number] + 0.1 + original value, applied to every number in our data set. After we normalized the data, both the mean and the standard deviation of the % error were lower for both the NN model and the OLS model.
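The shift described above can be sketched in a few lines. The function name is hypothetical; the behavior when a series contains no negative numbers (only the 0.1 margin is added) is an assumption, because the text does not cover that case.

```python
def shift_positive(values, margin=0.1):
    """Add |lowest negative number| + margin to every value.

    If there is no negative number, only the margin is added (assumption).
    """
    lowest_negative = min(min(values), 0.0)
    offset = abs(lowest_negative) + margin
    return [offset + v for v in values]


if __name__ == "__main__":
    company_a = [-6.7, -1.2, 0.0, 3.4]  # made-up first differences
    print(shift_positive(company_a))
```

One plausible reason the shift lowers the % error so sharply is arithmetic: the % error divides by the actual value, and first differences are often close to zero, whereas shifted values are at least 0.1.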
Analysis and Results
We measured our success by comparing the accuracy of the NN and OLS models in terms of the mean and standard deviation of the % forecasting error. After analyzing the results, our mean for NN was low (2.47% to 19.68%) but our standard deviation was too high (218.73% to 584.26%). The same happened to the OLS model, with a mean of 7.29% to 167.43% and a standard deviation of 160.33% to 962.01%. Then, we realized that our % error had both positive and negative numbers because we were using the difference for all our variables except dummy variables. So, we took the absolute value of the error percentage of all variables. Even after we took the absolute value of the error percentage, our mean (127% to 206%) and standard deviation (174% to 532.8%) for the NN model were still too high. For the OLS model, we got a mean of 104% to 381% and a standard deviation of 127% to 849%.
Table 3 shows the means and standard deviations of NN models and OLS models after we normalized the data as mentioned in the Data Normalization section. We have both lower means (2.1% to 12.31%) and standard deviations (2.11% to 14.92%) for NN model. We have similar results for the OLS model with the means (1.93% to 24.8%) and standard deviations (2.15% to 12.3%).
Finally, we conducted a paired t-test to compare the performance of the two models, using the difference (% NN error) - (% OLS error). Based on the paired t-test results shown in Table 5, 19 out of 27 pairs show that NN is the better predictor. Table 6 shows the results of an aggregated paired t-test. The significance is 0.000, so we reject the null hypothesis (H0) that OLS better predicts stock prices. The negative sign of the t-statistic shows that NN has lower errors than the OLS model.
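The paired t-statistic on the error differences can be computed by hand, as in the sketch below. It is an illustration with made-up numbers, not a reproduction of Tables 5 and 6; the function name is hypothetical, and the p-value lookup (with n - 1 degrees of freedom) is left to a statistics package.

```python
import math
import statistics


def paired_t_statistic(nn_errors, ols_errors):
    """t = mean(d) / (stdev(d) / sqrt(n)), where d = NN error - OLS error."""
    if len(nn_errors) != len(ols_errors):
        raise ValueError("paired samples must have the same length")
    diffs = [nn - ols for nn, ols in zip(nn_errors, ols_errors)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error


if __name__ == "__main__":
    nn = [1.0, 2.0, 3.0]    # made-up % errors
    ols = [2.0, 4.0, 3.5]   # made-up % errors
    print(paired_t_statistic(nn, ols))
```

A negative statistic means that the NN errors are, on average, smaller than the paired OLS errors.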
CONCLUSION AND FUTURE DIRECTION
The stock market is made up of market participants with various risk and return characteristics and different perceptions and expectations about stocks and the economy, who also differ in how they interpret and react to the news. Each investor reacts to the market differently at a given point in time, focuses on different pieces of relevant information, and reaches different conclusions. It is unclear how important the various pieces of information and economic data are to stock prices and how long their impact lasts.
We observed that NN models perform more consistently than OLS models. For example, the mean and standard deviation of errors for HCP Inc. are 2.33% and 2.11% for the NN model and 24.80% and 4% for the OLS model. Because OLS is a linear model, it shows higher variability, or inconsistent performance, when applied to a dataset whose values are concentrated on either the positive or the negative side. When the OLS model averages over a dataset that is only positive or only negative, its predictions eventually understate or overstate the actual numbers. In the HCP case, the OLS model prediction overstated 152 out of 152 actual values by 14% to 30%. On the other hand, the OLS model prediction understated 150 out of 152 actual values by 1.1% to 20.51% for Host Hotels and Resorts Inc. [HST]. As a result, the mean and standard deviation of errors for HST are 3.41% and 3.53% for the NN model and 9.96% and 3.47% for the OLS model.
The NN model had 272 surviving variables while OLS had 26 to 55 surviving variables. We found that the OLS model is easy to use and validate. It also works fast. However, it is a linear model and yields a relatively higher error when forecasting the non-linear environment of the stock market. Also, it traces only one dependent variable at a time.
In contrast, the NN model is complex and requires more effort, since the network must be trained repeatedly to find the best model. Several critical factors shape the best model, such as the network architecture (number of layers and neurons) and design (logistic/hyperbolic tangent/linear), training algorithms, and stop-training conditions (number of iterations). Although we can choose a low MSE, this does not guarantee the best model, because the network might be over-trained, causing memorization rather than learning. Alyuda NeuroIntelligence used different sets of data each time we ran the network to avoid memorization. The software can only reveal the best network architecture. Since that was an exhaustive and blind search, we cannot be certain whether the model is the best or not when it comes to training the network. With these uncertainties, it was hard to measure the performance of the neural network. It will require more time to train and learn how to use the neural network.
Our results showed that the NN model did a better job than the OLS model. Furthermore, our paper contributes to financial forecasting by showing how one industry affects others. We also learned that data normalization can make a sizeable difference to the results.
One of our research limitations was that we were only comparing two methods while there might be other possible models that may be considered and tested. Furthermore, researchers might include more techniques to find the best model for financial forecasting purpose especially for a learning algorithm that can handle market shocks, financial crisis, and business cycles. Finally, there are many other learning algorithms in the NN to be explored.
APPENDIX
Table A.1. Macroeconomic indicators
AEX: Amsterdam Price
CAC: Paris Price
DAX: German Index
FTSE: FTSE Index Price
SMI: Swiss Market Index
STI: Straits Times Index Singapore
IPC: Mexico Index
HSI: Hang Seng Index
BSE: Bombay Stock Exchange Sensex
BVSP: Bovespa-Brazilian Index
ATX: Vienna Stock Exchange
MERV: Merval Buenos Aires Index
KLSE: FTSE Bursa Malaysia Klci
TSEC: Taiwan Weighted Index
KOSPI: Kospi Composite Index
N225: Nikkei 225
JKSE: Jakarta Stock Exchange Index
TA: Tel Aviv 100
(World Indexes) Source: YahooFinance
Table A.2. Microeconomic indicators
No. Industry: Company [Ticker]
BASIC MATERIALS
1 Agricultural Chemicals: POTASH CP SASKATCHEWAN [POT]
2 Aluminum: ALCOA INC [AA]
3 Chemicals - Major Diversified: DOW CHEMICAL [DOW]
4 Copper: FREEPORT MCMORAN [FCX]
5 Gold: BARRICK GOLD [ABX]
6 Independent Oil & Gas: OCCIDENTAL PETROLEUM [OXY]
7 Industrial Metals & Minerals: BHP BILLITON [BHP]
8 Major Integrated Oil & Gas: EXXON MOBIL [XOM]
9 Nonmetallic Mineral Mining: HARRY WINSTON DIAMOND [HWD]
10 Oil & Gas Drilling & Exploration: TRANSOCEAN [RIG]
11 Oil & Gas Equipment & Services: SCHLUMBERGER [SLB]
12 Oil & Gas Pipelines: KINDER MORGAN ENERGY PARTNERS [KMP]
13 Oil & Gas Refining & Marketing: IMPERIAL OIL [IMO]
14 Silver: COEUR D'ALENE MINES CORP [CDE]
15 Specialty Chemicals: LUBRIZOL CORP [LZ]
16 Steel & Iron: RIO TINTO PLC [RTP]
17 Synthetics: PRAXAIR INC [PX]
CONGLOMERATES
18 Conglomerates: GENERAL ELECTRIC [GE]
CONSUMER GOODS
19 Appliances: WHIRLPOOL CORP [WHR]
20 Auto Manufacturers - Major: HONDA MOTOR CO. LTD [HMC]
21 Auto Parts: JOHNSON CONTROLS INC [JCI]
22 Beverages - Brewers: FOMENTO ECONOMICO MEXICANO [FMX]
23 Beverages - Soft Drinks: THE COCA-COLA CO. [KO]
24 Beverages - Wineries & Distillers: DIAGEO PLC [DEO]
25 Business Equipment: XEROX CORP. [XRX]
26 Cigarettes: BRITISH AMERICAN TOBACCO PLC [BTI]
27 Cleaning Products: ECOLAB INC [ECL]
28 Confectioners: CADBURY PLC [CBY]
29 Dairy Products: LIFEWAY FOODS INC [LWAY]
30 Electronic Equipment: SONY CORPORATION [SNE]
31 Farm Products: ARCHER-DANIELS-MIDLAND [ADM]
32 Food - Major Diversified: HJ HEINZ CO. [HNZ]
33 Home Furnishings & Fixtures: FORTUNE BRANDS INC [FO]
34 Housewares & Accessories: NEWELL RUBBERMAID INC [NWL]
35 Meat Products: HORMEL FOODS CORP. [HRL]
36 Office Supplies: ENNIS INC. [EBF]
37 Packaging & Containers: OWENS-ILLINOIS [OI]
38 Paper & Paper Products: INTERNATIONAL PAPER CO. [IP]
39 Personal Products: PROCTER & GAMBLE CO. [PG]
40 Photographic Equipment & Supplies: EASTMAN KODAK [EK]
41 Processed & Packaged Goods: PEPSICO INC. [PEP]
42 Recreational Goods, Other: FOSSIL INC. [FOSL]
43 Recreational Vehicles: HARLEY-DAVIDSON INC. [HOG]
44 Rubber & Plastics: GOODYEAR TIRE & RUBBER CO. [GT]
45 Sporting Goods: CALLAWAY GOLF CO. [ELY]
46 Textile - Apparel Clothing: VF CORP. [VFC]
47 Textile - Apparel Footwear & Accessories: NIKE INC. [NKE]
48 Tobacco Products, Other: UNIVERSAL CORP. [UVV]
49 Toys & Games: MATTEL INC. [MAT]
50 Trucks & Other Vehicles: PACCAR INC. [PCAR]
FINANCIAL
51 Accident & Health Insurance: AFLAC INC. [AFL]
52 Asset Management: T. ROWE PRICE GROUP INC. [TROW]
53 Closed-End Fund - Debt: ALLIANCE BERNSTEIN INCOME FUND INC. [ACG]
54 Closed-End Fund - Equity: DNP SELECT INCOME FUND INC. [DNP]
55 Closed-End Fund - Foreign: ABERDEEN ASIA-PACIFIC INCOME FUND INC. [FAX]
56 Credit Services: AMERICAN EXPRESS CO. [AXP]
57 Diversified Investments: MORGAN STANLEY [MS]
58 Foreign Money Center Banks: WESTPAC BANKING CORP [WBK]
59 Foreign Regional Banks: BANCOLOMBIA S.A. [CIB]
60 Insurance Brokers: MARSH & MCLENNAN [MMC]
61 Investment Brokerage - National: CHARLES SCHWAB CORP. [SCHW]
62 Investment Brokerage - Regional: JEFFERIES GROUP INC. [JEF]
63 Life Insurance: AXA [AXA]
64 Money Center Banks: JPMORGAN CHASE & CO. [JPM]
65 Mortgage Investment: ANNALY CAPITAL MANAGEMENT [NLY]
66 Property & Casualty Insurance: BERKSHIRE HATHAWAY [BRK-A]
67 Property Management: ICAHN ENTERPRISES, L.P. [IEP]
68 REIT - Diversified: PLUM CREEK TIMBER CO. INC. [PCL]
69 REIT - Healthcare Facilities: HCP INC. [HCP]
70 REIT - Hotel/Motel: HOST HOTELS & RESORTS INC. [HST]
71 REIT - Industrial: PUBLIC STORAGE [PSA]
72 REIT - Office: BOSTON PROPERTIES INC. [BXP]
73 REIT - Residential: EQUITY RESIDENTIAL [EQR]
74 REIT - Retail: SIMON PROPERTY GROUP INC. [SPG]
75 Real Estate Development: THE ST. JOE COMPANY [JOE]
76 Regional - Mid-Atlantic Banks: BB & T CORP. [BBT]
77 Regional - Midwest Banks: US BANCORP [USB]
78 Regional - Northeast Banks: STATE STREET CORP. [STT]
79 Regional - Pacific Banks: BANK OF HAWAII CORP. [BOH]
80 Regional - Southeast Banks: REGIONS FINANCIAL CORP. [RF]
81 Regional - Southwest Banks: COMMERCE BANCSHARES INC. [CBSH]
82 Savings & Loans: PEOPLE'S UNITED FINANCIAL INC. [PBCT]
83 Surety & Title Insurance: FIRST AMERICAN CORP. [FAF]
HEALTHCARE
84 Biotechnology: AMGEN INC. [AMGN]
85 Diagnostic Substances: IDEXX LABORATORIES INC. [IDXX]
86 Drug Delivery: ELAN CORP. [ELN]
87 Drug Manufacturers - Major: JOHNSON & JOHNSON [JNJ]
88 Drug Manufacturers - Other: TEVA PHARMACEUTICAL INDUSTRIES LTD [TEVA]
89 Drug Related Products: PERRIGO CO. [PRGO]
90 Drugs - Generic: MYLAN INC. [MYL]
91 Health Care Plans: UNITEDHEALTH GROUP INC. [UNH]
92 Home Health Care: LINCARE HOLDINGS INC. [LNCR]
93 Hospitals: TENET HEALTHCARE CORP. [THC]
94 Long-Term Care Facilities: EMERITUS CORP. [ESC]
95 Medical Appliances & Equipment: MEDTRONIC INC. [MDT]
96 Medical Instruments & Supplies: BAXTER INTERNATIONAL INC. [BAX]
97 Medical Laboratories & Research: QUEST DIAGNOSTICS INC. [DGX]
98 Medical Practitioners: TRANSCEND SERVICES INC. [TRCR]
99 Specialized Health Services: DAVITA INC. [DVA]
INDUSTRIAL GOODS
100 Aerospace/Defense - Major Diversified: BOEING CO. [BA]
101 Aerospace/Defense Products & Services: HONEYWELL INTERNATIONAL INC. [HON]
102 Cement: CRH PLC [CRH]
103 Diversified Machinery: ILLINOIS TOOL WORKS INC. [ITW]
104 Farm & Construction Machinery: CATERPILLAR INC. [CT]
105 General Building Materials: VULCAN MATERIALS CO. [VMC]
106 General Contractors: EMCOR GROUP INC. [EME]
107 Heavy Construction: MCDERMOTT INTERNATIONAL INC. [MDR]
108 Industrial Electrical Equipment: EATON CORPORATION [ETN]
109 Industrial Equipment & Components: EMERSON ELECTRIC CO. [EMR]
110 Lumber, Wood Production: WEYERHAEUSER CO. [WY]
111 Machine Tools & Accessories: STANLEY WORKS [SWK]
112 Manufactured Housing: SKYLINE CORP [SKY]
113 Metal Fabrication: PRECISION CASTPARTS CORP. [PCP]
114 Pollution & Treatment Controls: DONALDSON COMPANY INC. [DCI]
115 Residential Construction: NVR INC. [NVR]
116 Small Tools & Accessories: THE BLACK & DECKER CORP. [BDK]
117 Textile Industrial: MOHAWK INDUSTRIES INC. [MHK]
118 Waste Management: WASTE MANAGEMENT INC. [WM]
SERVICES
119 Advertising Agencies: OMNICOM GROUP INC. [OMC]
120 Air Delivery & Freight Services: FEDEX CORP. [FDX]
121 Air Services, Other: BRISTOW GROUP INC. [BRS]
122 Apparel Stores: GAP INC. [GPS]
123 Auto Dealerships: CARMAX INC. [KMX]
124 Auto Parts Stores: AUTOZONE INC. [AZO]
125 Auto Parts Wholesale: GENUINE PARTS CO. [GPC]
126 Basic Materials Wholesale: AM CASTLE & CO. [CAS]
127 Broadcasting - Radio: SIRIUS XM RADIO INC. [SIRI]
128 Broadcasting - TV: ROGERS COMMUNICATIONS INC. [RCI]
129 Business Services: IRON MOUNTAIN INC. [IRM]
130 CATV Systems: COMCAST CORP. [CMCSA]
131 Catalog & Mail Order Houses: AMAZON.COM INC. [AMZN]
132 Computers Wholesale: INGRAM MICRO INC. [IM]
133 Consumer Services: MONRO MUFFLER BRAKE INC. [MNRO]
134 Department Stores: THE TJX COMPANIES INC. [TJX]
135 Discount, Variety Stores: WAL-MART STORES INC. [WMT]
136 Drug Stores: CVS CAREMARK CORP. [CVS]
137 Drugs Wholesale: MCKESSON CORP. [MCK]
138 Education & Training Services: DEVRY INC. [DV]
139 Electronics Stores: BEST BUY CO. INC. [BBY]
140 Electronics Wholesale: AVNET INC. [AVT]
141 Entertainment - Diversified: WALT DISNEY CO. [DIS]
142 Food Wholesale: SYSCO CORP. [SYY]
143 Gaming Activities: BALLY TECHNOLOGIES INC. [BYI]
144 General Entertainment: CARNIVAL CORP. [CCL]
145 Grocery Stores: KROGER CO. [KR]
146 Home Furnishing Stores: WILLIAMS-SONOMA INC. [WSM]
147 Home Improvement Stores: THE HOME DEPOT INC. [HD]
148 Industrial Equipment Wholesale: W.W. GRAINGER INC. [GWW]
149 Jewelry Stores: TIFFANY & CO. [TIF]
150 Lodging: STARWOOD HOTELS & RESORTS WORLDWIDE INC. [HOT]
151 Major Airlines: AMR CORP. [AMR]
152 Management Services: EXPRESS SCRIPTS INC. [ESRX]
153 Marketing Services: VALASSIS COMMUNICATIONS INC. [VCI]
154 Medical Equipment Wholesale: HENRY SCHEIN INC. [HSIC]
155 Movie Production, Theaters: MARVEL ENTERTAINMENT INC. [MVL]
156 Music & Video Stores: BLOCKBUSTER INC. [BBI]
157 Personal Services: H&R BLOCK INC. [HRB]
158 Publishing - Books: THE MCGRAW-HILL CO. INC. [MHP]
159 Publishing - Newspapers: WASHINGTON POST CO. [WPO]
160 Publishing - Periodicals: MEREDITH CORP. [MDP]
161 Railroads: BURLINGTON NORTHERN SANTA FE CORP. [BNI]
162 Regional Airlines: SOUTHWEST AIRLINES CO. [LUV]
163 Rental & Leasing Services: RYDER SYSTEM INC. [R]
164 Research Services: PAREXEL INTL CORP. [PRXL]
165 Resorts & Casinos: MGM MIRAGE [MGM]
166 Restaurants: MCDONALD'S CORP. [MCD]
167 Security & Protection Services: GEO GROUP INC. [GEO]
168 Shipping: TIDEWATER INC. [TDW]
169 Specialty Eateries: STARBUCKS CORP. [SBUX]
170 Specialty Retail, Other: STAPLES INC. [SPLS]
171 Sporting Activities: SPEEDWAY MOTORSPORTS INC. [TRK]
172 Sporting Goods Stores: HIBBETT SPORTS INC. [HIBB]
173 Staffing & Outsourcing Services: PAYCHEX INC. [PAYX]
174 Technical Services: JACOBS ENGINEERING GROUP INC. [JEC]
175 Trucking: JB HUNT TRANSPORT SERVICES INC. [JBHT]
176 Wholesale, Other: VINA CONCHA Y TORO S.A. [VCO]
TECHNOLOGY
177 Application Software: MICROSOFT CORP. [MSFT]
178 Business Software & Services: AUTOMATIC DATA PROCESSING INC. [ADP]
179 Communication Equipment: NOKIA CORP. [NOK]
180 Computer Based Systems: ADAPTEC INC. [ADPT]
181 Computer Peripherals: LEXMARK INTERNATIONAL INC. [LXK]
182 Data Storage Devices: EMC CORP. [EMC]
183 Diversified Communication Services: TELECOM ARGENTINA S A [TEO]
184 Diversified Computer Systems: INTERNATIONAL BUSINESS MACHINES CORP. [IBM]
185 Diversified Electronics: KYOCERA CORP. [KYO]
186 Healthcare Information Services: CERNER CORP. [CERN]
187 Information & Delivery Services: DUN & BRADSTREET CORP. [DNB]
188 Information Technology Services: COMPUTER SCIENCES CORPORATION [CSC]
189 Internet Information Providers: YAHOO! INC. [YHOO]
190 Internet Service Providers: EASYLINK SERVICES INTERNATIONAL CORP. [ESIC]
191 Internet Software & Services: CGI GROUP INC. [GIB]
192 Long Distance Carriers: TELEFONOS DE MEXICO, S.A.B. DE C.V. [TMX]
193 Multimedia & Graphics Software: ACTIVISION BLIZZARD INC. [ATVI]
194 Networking & Communication Devices: CISCO SYSTEMS INC. [CSCO]
195 Personal Computers: APPLE INC. [AAPL]
196 Printed Circuit Boards: FLEXTRONICS INTERNATIONAL LTD. [FLEX]
197 Processing Systems & Products: POLYCOM INC. [PLCM]
198 Scientific & Technical Instruments: THERMO FISHER SCIENTIFIC INC. [TMO]
199 Security Software & Services: SYMANTEC CORP. [SYMC]
200 Semiconductor - Broad Line: INTEL CORP. [INTC]
201 Semiconductor - Integrated Circuits: QUALCOMM INC. [QCOM]
202 Semiconductor - Specialized: XILINX INC. [XLNX]
203 Semiconductor Equipment & Materials: APPLIED MATERIALS INC. [AMAT]
204 Semiconductor - Memory Chips: MICRON TECHNOLOGY INC. [MU]
205 Technical & System Software: AUTODESK INC. [ADSK]
206 Telecom Services - Domestic: AT&T INC. [T]
207 Telecom Services - Foreign: NIPPON TELEGRAPH & TELEPHONE CORP. [NTT]
208 Wireless Communications: CHINA MOBILE LIMITED [CHL]
UTILITIES
209 Diversified Utilities: EXELON CORP. [EXC]
210 Electric Utilities: SOUTHERN COMPANY [SO]
211 Foreign Utilities: ENERSIS S.A. [ENI]
212 Gas Utilities: TRANSCANADA CORP. [TRP]
213 Water Utilities: AQUA AMERICA INC. [WTR]
Source: YahooFinance
Table A.3. Market Indicators
S&P: S&P 500's price changes
DJI: Dow Jones Industrial's price changes
DJT: Dow Jones Transportation's price changes
DJU: Dow Jones Utility's price changes
Source: YahooFinance
Table A.4. Market Sentiment Indicators
VIX: CBOE Volatility Index changes
VXO: CBOE OEX Volatility Index
Source: YahooFinance
Table A.5. Institutional Investor
BEN: FRANKLIN RESOURCES INC.
Table A.6. Politics Indicators
Election: Presidential Election day
White House Party: Party (Republican or Democratic)
Source: Wikipedia
Table A.7. Business Cycles
Tech crash: Technological Crash, 3/24/2000-10/9/2002
Current bear: Current bear, 10/9/2007-11/16/2009
Table A.8. Calendar Anomalies
Mon: Monday
Tue: Tuesday
Wed: Wednesday
Thurs: Thursday
Fri: Friday
W1: First week
W2: Second week
W3: Third week
W4: Fourth week
W5: Fifth week
Jan: January
Feb: February
Mar: March
Apr: April
May: May
Jun: June
Jul: July
Aug: August
Sep: September
Oct: October
Nov: November
Dec: December
pHol: Pre holiday
REFERENCES
Aiken, M., & Bsat, M. (1999). "Forecasting Market Trends with Neural Networks," Information Systems Management, 16, pp. 42-49.
Allen, F. & Karjalainen, R. (1999). "Using Genetic Algorithm to Find Technical Trading Rules," Journal of Financial Economics, 51, pp 245-271.
Alyuda NeuroIntelligence. (2010). Alyuda NeuroIntelligence Manual. (http://www.alyuda.com/neural-networks software.htm).
Aminian, F., Suarez, E., Aminian, M., & Walz, D. (2006). "Forecasting Economic Data with Neural Networks," Computational Economics, 28, pp. 71-88.
Balvers, R., Wu, R., & Gilliand, E. (2000). "Mean Reversion Across National Stock Markets and Parametric Contrarian Investment Strategies," Journal of Finance, 55, pp. 745-772.
Barber, B. & Odean, T. (2000). "Trading is Hazardous to your Wealth: The Common Stock Investment Performance of Individual Investors," Journal of Finance, 55, pp. 773-806.
BrainMaker. (2010). California Scientific. (http://www.calsci.com/BrainMaker.html)
Blume, L., Easley, D., & O'Hara, M. (1994). "Market Statistics and Technical Analysis: The Role of Volume," Journal of Finance, 49, pp. 153-181.
Brock, W., Lakonishok, J. & LeBaron, B. (1992). "Simple Technical Trading Rules and the Stochastic Properties of Stock Returns," Journal of Finance, 47, pp. 1731-1764.
Burstein, F., & Holsapple, C. (2008). Handbook on Decision Support Systems 2. Springer Berlin Heidelberg, pp. 175-193.
Cao, Q., Leggio, K., & Schniederjans, M. (2005). "A Comparison between Fama and French's Model and Artificial Neural Networks in Predicting the Chinese Stock Market," Computers and Operations Research, 32, pp. 2499-2512.
Chordia, T. & Swaminathan, B. (2000). "Trading Volume and Cross-Autocorrelations in Stock Returns," Journal of Finance, 55, pp. 913-935.
Dayhoff, J. (1990). Neural Network Architectures: An Introduction. New York: Van Nostrand Reinhold.
Daniel, K. Hirshleifer, D. & Subrahmanyam, A. (1998). "Investor Psychology and Security Market Under- and Overreactions," Journal of Finance, 53, pp. 1839-1885.
De Bondt, W. & Thaler, R. (1985). "Does the Stock Market Overreact?" Journal of Finance, 40, pp.793-805.
De Bondt, W. & Thaler, R. (1987). "Further Evidence on Investor Overreaction and Stock Market Seasonality," Journal of Finance, 42, pp. 557-581.
DuMouchel, W. (1999). "Bayesian Data Mining in Large Frequency Tables--with an Application to the FDA Spontaneous," American Statistician, 53, p. 177.
Dutta, G., Jha, P., Laha, A., & Mohan, N. (2006). "Artificial Neural Network Models for Forecasting Stock Price Index in the Bombay Stock Exchange," Journal of Emerging Market Finance, 5, pp. 283-295.
Easterwood, J. & Nutt, S. (1999). "Inefficiency in Analysts' Earnings Forecasts: Systematic Misreaction or Systematic Optimism?" Journal of Finance, 54, pp. 1777-1797.
Fama, E. (1970). "Efficient Capital Markets: A Review of Theory and Empirical Work," Journal of Finance, 25, pp. 383-417.
Fama, E. & French, K. (1993). "Common Risk Factors in Returns on Stocks and Bonds," Journal of Financial Economics, 33, pp. 3-56.
Ferson, W., & Harvey, C. (1993). "The Risk and Predictability of International Equity Returns," Review of Financial Studies, 6, pp. 527-566.
Giudici, P. (2001). "Bayesian Data Mining, with Application to Benchmarking and Credit Scoring," Applied Stochastic Models in Business & Industry, 17, pp. 69-81.
Gorr, W., Nagin, D., & Szczypula, J. (1994). "Comparative Study of Artificial Neural Network and Statistical Models for Predicting Student Point Averages," International Journal of Forecasting, 10, pp. 17-34.
Grudnitski, G., & Osburn, L. (1993). "Forecasting S&P and Gold Futures Prices: An Application of Neural Network," Journal of Futures Markets, 13, pp. 631-643.
Gutierrez, R. & Kelley, E. (2008). "The Long-Lasting Momentum in Weekly Returns," Journal of Finance, 63, pp. 415-447.
Hammad, A., Ali, S., & Hall, E. (2009). "Forecasting the Jordanian Stock Price using Artificial Neural Network," (http://www.min.uc.edu/robotics/papers/paper2007/ Final%20ANNIE%2007%20Souma%20Alhaj%20Ali% 206p.pdf)
Han, J., & Kamber, M. (2006). Data Mining: Concepts and Techniques. 2nd Edition. Morgan Kaufmann, page 5.
Hess, A., & Frost, P. (1982). "Tests for Price Effects of New Issues of Seasoned Securities," Journal of Finance, 36, pp. 11-25.
Hirshleifer, D. (2001). "Investor Psychology and Asset Pricing," Journal of Finance, 56, pp. 1533-1597.
Hong, H. & Stein, J. (1999). "A Unified Theory of Underreaction, Momentum Trading, and Overreaction in Asset Markets," Journal of Finance, 54, pp. 2143-2184.
Jegadeesh, N. & Titman, S. (1993). "Returns to Buying Winners and Selling Losers: Implications for Market Efficiency," Journal of Finance, 48, pp. 65-92.
Jeong, H., Song, S., Shin, S., & Cho, B. (2008). "Integrating Data Mining to a Process Design Using the Robust Bayesian Approach," International Journal of Reliability, Quality & Safety Engineering. 15, pp. 441-464.
Kahn, M. (2006). Technical Analysis Plain and Simple: Charting the Markets in Your Language. Financial Times, Prentice Hall Books.
Kahneman, D. & Riepe, M. (1998). "Aspects of Investor Psychology," Journal of Portfolio Management, 24, pp. 52-65.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under Uncertainty: Heuristics and Biases, Cambridge: Cambridge University Press.
Kahneman, D. & Tversky, A. (1979). "Prospect Theory: An Analysis of Decision under Risk," Econometrica, 47, pp. 263-291.
Kimoto, T., Asakawa, K., Yoda, M., & Takeoka, M. (1990). "Stock Market Prediction System with Modular Neural Networks," Proceedings of the IEEE International Conference on Neural Networks, pp. 1-16.
Kohzadi, N., Boyd, M., Kemlanshahi, B., & Kaastra, I. (1996). "A Comparison of Artificial Neural Network and Time Series Models for Forecasting Commodity Prices," Neurocomputing, 10, pp. 169-181.
Kryzanowski, L., Galler, M., & Wright, D. (1993). "Using Artificial Neural Networks to Pick Stocks," Financial Analysts Journal, 49, pp. 21-27.
Leigh, W., Hightower, R., & Modani, N. (2005). "Forecasting the New York Stock Exchange Composite Index with Past Price and Interest Rate on Condition of Volume Spike," Expert Systems with Applications, 28, pp. 1-8.
Lo, A., Mamaysky, H., & Wang, J. (2000). "Foundations of Technical Analysis: Computational Algorithms, Statistical Inference, and Empirical Implementation," Journal of Finance, 55, pp. 1705-1765.
McGrath, C. (2002). "Terminator Portfolio," Kiplinger's Personal Finance, 56, pp. 56-57.
McNelis, P. (1996). "A Neural Network Analysis of Brazilian Stock Prices: Tequila Effects vs. Pisco Sour Effects," Journal of Emerging Markets, 1, pp. 29-44.
Menzly, L. & Ozbas, O. (2010). "Market Segmentation and Cross-predictability of Returns," Journal of Finance, 65, pp. 1555-1580.
Moshiri, S., & Cameron, N. (2000). "Neural Network versus Econometric Models in Forecasting Inflation," Journal of Forecasting, 19, pp. 201-217.
Mostafa, M. (2004). "Forecasting the Suez Canal Traffic: A Neural Network Analysis," Maritime Policy and Management, 31, pp. 139-156.
Mostafa, M. (2010). "Forecasting Stock Exchange Movements using Neural Networks: Empirical Evidence from Kuwait," Expert Systems with Applications, 37, pp. 6302-6309.
Nam, K., & Yi, J. (1997). "Predicting Airline Passenger Volume," Journal of Business Forecasting Methods and Systems, 16, pp. 14-17.
Neely, C., Weller, P., & Dittmar, R. (1997). "Is Technical Analysis in the Foreign Exchange Market Profitable? A Genetic Programming Approach," Journal of Financial and Quantitative Analysis, pp. 405-426.
Neftci, S. (1991). "Naive Trading Rules in Financial Market and Wiener-Kolmogorov Prediction Theory: A Study of Technical Analysis," Journal of Business, 64, pp. 549-571.
Nofsinger, J., & Sias, R. (1999). "Herding and Feedback Trading by Institutional and Individual Investors," Journal of Finance, 54, pp. 2263-2295.
Odean, T. (1998). "Volume, Volatility, Price, and Profits When All Traders Are Above Average," Journal of Finance, 53, pp. 1887-1934.
Odean, T. (1999). "Do Investors Trade too Much?" American Economic Review, 89, pp. 1279-1298.
Poh, H., Yao, J., & Jasic, T. (1998). "Neural Networks for the Analysis and Forecasting of Advertising Impact," International Journal of Intelligent Systems in Accounting, Finance and Management, 7, pp. 253-268.
Pruitt, S. & White, R. (1988). "The CRISMA Trading System: Who Says Technical Analysis Can't Beat the Market?" Journal of Portfolio Management, 14, pp. 55-58.
Ruiz-Suarez, J., Mayora-Ibarra, O., Torres-Jimenez, J., & Ruiz-Suarez, L. (1995). "Short-term Ozone Forecasting by Artificial Neural Network," Advances in Engineering Software, 23, pp. 143-149.
Shefrin, H. (2000). Beyond Greed and Fear: Understanding Behavioral Finance and the Psychology of Investing, Boston: Harvard Business School.
Shleifer, A. (2000). Inefficient Markets: An Introduction to Behavioral Finance, New York: Oxford University Press.
Shiller, R. (2000). Irrational Exuberance, Princeton: Princeton University Press.
Shiller, R. (2002). "Bubbles, Human Judgment, and Expert Opinion," Financial Analysts Journal, 58, pp. 18-26.
Shiller, R. (2003). "From Efficient Market Theory to Behavioral Finance," Journal of Economic Perspectives, 17, pp. 83-104.
Simon, H. (1955). "A Behavioral Model of Rational Choice," Quarterly Journal of Economics, 69, pp. 99-118.
Simon, H. (1982). Models of Bounded Rationality, Vol. 2, Behavioral Economics and Business Organization, Cambridge: The MIT Press.
Simon, H. (1997). Models of Bounded Rationality, Vol. 3, Empirically Grounded Economic Reason, Cambridge: The MIT Press.
Sullivan, R., Timmermann, A. & White, H. (1999). "Data-Snooping, Technical Trading Rule Performance, and the Bootstrap," Journal of Finance, 54, pp. 1647-1691.
Thaler, R. (ed.) (2005). Advances in Behavioral Finance, Vol. 2, New York: Russell Sage Foundation, Princeton University Press.
Tjung, L., Kwon, O., Tseng, K., & Bradley-Geist, J. (2010). "Forecasting Financial Stocks using Data Mining," Global Economy and Finance Journal, 3(2), pp. 13-26.
Tokic, D. (2005). "Explaining US Stock Market Returns from 1980 to 2005," Journal of Asset Management, 6, pp. 418-432.
Tsai, C.-F. & Wang, S.-P. (2009). "Stock Price Forecasting by Hybrid Machine Learning Techniques," Proceedings of International Multi Conference of Engineers and Computer Scientists, 1.
Tseng, K. (2004). "Panorama of NASDAQ Stock Bubbles and Aftermath," American Business Review, 22, pp. 61-71.
Tseng, K.C. (2006). "Behavioral Finance, Bounded Rationality, Neuro-Finance, and Traditional Finance," Investment Management and Financial Innovations, 3, pp. 7-18.
Turban, E. (1992). Expert Systems and Applied Artificial Intelligence. New York: Macmillan Publishing Company.
Tveter, D. (2000). "The Backpropagation Algorithm," retrieved on March 10, 2010 from http://www.dontveter.com
Videnova, I., Nedialkova, D., Dimitrova, M., & Popova, S. (2006). "Neural Networks for Air Pollution Forecasting," Applied Artificial Intelligence, 20, pp. 493-506.
West, P., Brockett, P., & Golden, L. (1997). "A Comparative Analysis of Neural Networks and Statistical Methods for Predicting Consumer Choice," Marketing Science, 16, pp. 370-391.
Ying, J., Kuo, L., & Seow, G. (2005). "Forecasting Stock Prices Using a Hierarchical Bayesian Approach," Journal of Forecasting, 24, pp. 39-59.
Yoon, Y., & Swales, G. (1991). "Predicting Stock Price Performance: A Neural Network Approach," Proceedings of the IEEE 24th Annual International Conference of Systems Sciences, pp. 156-162.
Yu, L., Wang, S., & Lai, K. (2009). "A Neural-Network-based Nonlinear Metamodeling Approach to Financial Time Series Forecasting," Applied Soft Computing, 9, pp. 563-574.
Yumlu, S., Gurgen, F., & Okay, N. (2005). "A Comparison of Global, Recurrent and Smoothed-Piecewise Neural Models for Istanbul Stock Exchange (ISE) Prediction," Pattern Recognition Letters, 26, pp. 2093-2103.
Zhuo, W., Li-Min, J., Yong, Q., & Yan-hui, W. (2007). "Railway Passenger Traffic Volume Prediction based on Neural Network," Applied Artificial Intelligence, 21, pp. 1-10.
Luna Christie Tjung, California State University at Fresno
Ojoung Kwon, California State University at Fresno
K.C. Tseng, California State University at Fresno

Table 1: Sample Coding For Political Party Preference

| | None | Democratic | Republican | Both |
|---|---|---|---|---|
| Neuron 1 (Republican) | 0 | 0 | 1 | 1 |
| Neuron 2 (Democratic) | 0 | 1 | 0 | 1 |

Table 2: Performance Comparison On Logistic And Hyperbolic Tangent Method

| No. | Company | Iterations | Logistic: mean of error (standard deviation) | Hyperbolic Tangent: mean of error (standard deviation) |
|---|---|---|---|---|
| | BASIC MATERIALS INDUSTRY | | | |
| 1 | BHP BILLITON [BHP] | 1001 | 3.86% (4%) | 4.14% (4.09%) |
| 2 | EXXON MOBIL [XOM] | 781 | 2.85% (2.49%) | 4.04% (3.37%) |
| 3 | TRANSOCEAN [RIG] | 1001 | 4.57% (5.05%) | 5.03% (5.78%) |
| 4 | SCHLUMBERGER [SLB] | 1001 | 6.64% (8.21%) | 5.54% (6.44%) |
| 5 | IMPERIAL OIL [IMO] | 901 | 4% (5.12%) | 3.92% (5.10%) |
| 6 | BARRICK GOLD CORP. [ABX] | 1001 | 4.31% (3.55%) | 5.17% (3.74%) |
| | CONGLOMERATES | | | |
| 7 | GENERAL ELECTRIC CO. [GE] | 1001 | 9.50% (9.69%) | 7.44% (7.18%) |
| 8 | PACCAR INC. [PCAR] | 1001 | 3.73% (3.8%) | 3.96% (4.41%) |
| 9 | JOHNSON CONTROLS INC. [JCI] | | 6.14% (6.87%) | 6.34% (6.80%) |
| 10 | PROCTER & GAMBLE CO. [PG] | 1001 | 5.90% (5.21%) | 5.91% (5.38%) |
| | FINANCIAL INDUSTRY | | | |
| 11 | T. ROWE PRICE GROUP INC. [TROW] | 746 | 3.29% (4.48%) | 3.67% (4.39%) |
| 12 | JPMORGAN CHASE & CO. [JPM] | 1001 | 3.79% (3.48%) | 3.99% (3.91%) |
| 13 | MORGAN STANLEY [MS] | 1001 | 4.86% (5.37%) | 5.53% (5.69%) |
| 14 | CHARLES SCHWAB [SCHW] | 1001 | 3% (4.27%) | 3.56% (4.84%) |
| 15 | PLUM CREEK TIMBER CO. INC. [PCL] | 1001 | 2.10% (2.28%) | 2.43% (2.25%) |
| 16 | HCP INC. [HCP] | 356 | 2.33% (2.11%) | 2.42% (2.18%) |
| 17 | HOST HOTELS & RESORTS INC. [HST] | 604 | 3.41% (3.53%) | 3.54% (3.56%) |
| 18 | PUBLIC STORAGE [PSA] | 1001 | 2.78% (3.13%) | 3.15% (3.57%) |
| 19 | BOSTON PROPERTIES INC. [BXP] | 1001 | 2.25% (2.61%) | 2.54% (3.04%) |
| 20 | SIMON PROPERTY GROUP INC. [SPG] | 776 | 3.19% (3.05%) | 3.42% (3.29%) |
| | HEALTHCARE | | | |
| 21 | JOHNSON & JOHNSON [JNJ] | 591 | 6.26% (5.92%) | 7.92% (6.12%) |
| 22 | AMGEN INC. [AMGN] | 626 | 7.58% (6.78%) | 9.30% (8.04%) |
| | INDUSTRIAL GOODS | | | |
| 23 | CATERPILLAR INC. [CT] | 691 | 6.28% (7.05%) | 5.48% (6.6%) |
| 24 | VULCAN MATERIALS CO. [VMC] | 1001 | 4.47% (4.60%) | 4.56% (5.04%) |
| 25 | EMCOR GROUP INC. [EME] | 1001 | 6.06% (6.41%) | 6.27% (6.94%) |
| 26 | MCDERMOTT INTERNATIONAL INC. [MDR] | 1001 | 2.84% (3.84%) | 2.81% (3.90%) |
| 27 | EMERSON ELECTRIC CO. [EMR] | 1001 | 11.30% (10.94%) | 12.16% (12.38%) |
| 28 | PRECISION CASTPARTS CORP. [PCP] | 1001 | 2.67% (2.28%) | 2.71% (2.38%) |
| | SERVICES | | | |
| 29 | FEDEX CORPORATION [FDX] | 1001 | 12.93% (13.01%) | 12.31% (12.28%) |
| 30 | GENUINE PARTS CO. [GPC] | 1001 | 5.86% (5.05%) | 5.78% (5.21%) |
| 31 | THE HOME DEPOT INC. [HD] | 1001 | 10.54% (14.92%) | 10.84% (15.84%) |
| 32 | W.W. GRAINGER INC. [GWW] | 1001 | 7.19% (6.70%) | 7.35% (7.28%) |
| 33 | TIFFANY & CO. [TIF] | 1001 | 7.51% (7.26%) | 6.34% (6.18%) |
| 34 | BURLINGTON NORTHERN SANTA FE CORP. [BNI] | 1001 | 9.43% (9.63%) | 7.50% (8.19%) |
| 35 | JB HUNT TRANSPORT SERVICES INC. [JBHT] | 1001 | 6.53% (6.78%) | 6.92% (6.56%) |
| | UTILITIES | | | |
| 36 | SOUTHERN COMPANY [SO] | 1001 | 4.60% (3.56%) | 4.55% (3.46%) |
| 37 | TRANSCANADA CORP. [TRP] | 1001 | 3.15% (2.93%) | 3.17% (2.96%) |

Table 3: Mean And Standard Deviation Of The % Forecasting Error

| No. | Industry | Company | NN Mean (Stdev) | OLS Mean (Stdev) | Average Mean (Stdev) |
|---|---|---|---|---|---|
| | BASIC MATERIALS INDUSTRY | | | | |
| 1 | Industrial Metals & Minerals | BHP BILLITON [BHP] | 3.86% (4%) | 3.93% (3.83%) | 3.89% (3.92%) |
| 2 | Major Integrated Oil & Gas | EXXON MOBIL [XOM] | 2.85% (2.49%) | 9.52% (3.92%) | 6.18% (3.20%) |
| 3 | Oil & Gas Drilling & Exploration | TRANSOCEAN [RIG] | 4.57% (5.05%) | 7.98% (3.82%) | 6.27% (4.44%) |
| 4 | Oil & Gas Equipment & Services | SCHLUMBERGER [SLB] | 5.54% (6.44%) | 5.79% (4.03%) | 5.66% (5.24%) |
| 5 | Oil & Gas Refining & Marketing | IMPERIAL OIL [IMO] | 3.92% (5.1%) | 4.19% (5.06%) | 4.05% (5.08%) |
| 6 | Gold | BARRICK GOLD CORP. [ABX] | 4.31% (3.55%) | 4.21% (3.05%) | 4.26% (3.30%) |
| | CONGLOMERATES | | | | |
| 7 | Conglomerates | GENERAL ELECTRIC CO. [GE] | 7.44% (7.18%) | 6.62% (5.98%) | 7.03% (6.58%) |
| | CONSUMER GOODS | | | | |
| 8 | Trucks & Other Vehicles | PACCAR INC. [PCAR] | 3.73% (3.8%) | 9.28% (4.9%) | 6.50% (4.35%) |
| 9 | Auto Parts | JOHNSON CONTROLS INC. [JCI] | 6.14% (6.87%) | 6% (6.28%) | 6.07% (6.58%) |
| 10 | Personal Products | PROCTER & GAMBLE CO. [PG] | 5.9% (5.21%) | 9.01% (6.74%) | 7.45% (5.97%) |
| | FINANCIAL INDUSTRY | | | | |
| 11 | Asset Management | T. ROWE PRICE GROUP INC. [TROW] | 3.29% (4.48%) | 5.81% (3.62%) | 4.55% (4%) |
| 12 | Money Center Banks | JPMORGAN CHASE & CO. [JPM] | 3.79% (3.48%) | 7% (7%) | 5.4% (5.24%) |
| 13 | Diversified Investments | MORGAN STANLEY [MS] | 4.86% (5.37%) | 9% (7%) | 6.93% (6.19%) |
| 14 | Investment Brokerage-National | CHARLES SCHWAB [SCHW] | 3% (4.27%) | 5% (4%) | 4% (4.13%) |
| 15 | REIT--Diversified | PLUM CREEK TIMBER CO. INC. [PCL] | 2.10% (2.28%) | 1.93% (2.15%) | 2.01% (2.21%) |
| 16 | REIT--Healthcare Facilities | HCP INC. [HCP] | 2.33% (2.11%) | 24.80% (4%) | 13.56% (3.05%) |
| 17 | REIT--Hotel/Motel | HOST HOTELS & RESORTS INC. [HST] | 3.41% (3.53%) | 9.96% (3.47%) | 6.68% (3.50%) |
| 18 | REIT--Industrial | PUBLIC STORAGE [PSA] | 2.78% (3.13%) | 6.96% (3.25%) | 4.87% (3.19%) |
| 19 | REIT--Office | BOSTON PROPERTIES INC. [BXP] | 2.25% (2.61%) | 6.31% (2.22%) | 4.28% (2.42%) |
| 20 | REIT--Retail | SIMON PROPERTY GROUP INC. [SPG] | 3.19% (3.05%) | 7.02% (5.33%) | 5.11% (4.19%) |
| | HEALTHCARE | | | | |
| 21 | Drug Manufacturers-Major | JOHNSON & JOHNSON [JNJ] | 6.26% (5.92%) | 9.36% (6.51%) | 7.81% (6.21%) |
| 22 | Biotechnology | AMGEN INC. [AMGN] | 7.58% (6.78%) | 6.78% (6.34%) | 7.18% (6.56%) |
| | INDUSTRIAL GOODS | | | | |
| 23 | Farm & Construction Machinery | CATERPILLAR INC. [CT] | 5.48% (6.61%) | 6.41% (6.68%) | 5.95% (6.64%) |
| 24 | General Building Materials | VULCAN MATERIALS CO. [VMC] | 4.47% (4.60%) | 5.23% (4.84%) | 4.85% (4.72%) |
| 25 | General Contractors | EMCOR GROUP INC. [EME] | 6.06% (6.41%) | 8.45% (8.30%) | 7.25% (7.35%) |
| 26 | Heavy Construction | MCDERMOTT INTERNATIONAL INC. [MDR] | 2.81% (3.9%) | 4.59% (4.61%) | 3.70% (4.26%) |
| 27 | Industrial Equipment & Component | EMERSON ELECTRIC CO. [EMR] | 11.30% (10.94%) | 8.26% (8.64%) | 9.78% (9.79%) |
| 28 | Metal Fabrication | PRECISION CASTPARTS CORP. [PCP] | 2.67% (2.28%) | 3.93% (2.74%) | 3.30% (2.51%) |
| | SERVICES | | | | |
| 29 | Air Delivery & Freight Services | FEDEX CORPORATION [FDX] | 12.31% (12.28%) | 16.18% (9.43%) | 14.24% (10.86%) |
| 30 | Auto Parts Wholesale | GENUINE PARTS CO. [GPC] | 5.78% (5.21%) | 7.30% (6.10%) | 6.54% (5.65%) |
| 31 | Home Improvement Stores | THE HOME DEPOT INC. [HD] | 10.54% (14.92%) | 7.84% (12.30%) | 9.19% (13.61%) |
| 32 | Industrial Equipment Wholesale | W.W. GRAINGER INC. [GWW] | 7.19% (6.70%) | 7.89% (6.54%) | 7.54% (6.62%) |
| 33 | Jewelry Stores | TIFFANY & CO. [TIF] | 6.34% (6.18%) | 6.52% (5.96%) | 6.43% (6.07%) |
| 34 | Railroads | BURLINGTON NORTHERN SANTA FE CORP. [BNI] | 7.50% (8.19%) | 6.42% (5.30%) | 6.96% (6.74%) |
| 35 | Trucking | JB HUNT TRANSPORT SERVICES INC. [JBHT] | 6.53% (6.78%) | 6.81% (5.91%) | 6.67% (6.34%) |
| | UTILITIES | | | | |
| 36 | Electric Utilities | SOUTHERN COMPANY [SO] | 4.55% (3.46%) | 5.19% (3.88%) | 4.86% (3.67%) |
| 37 | Gas Utilities | TRANSCANADA CORP. [TRP] | 3.15% (2.93%) | 10.66% (4.44%) | 6.90% (3.69%) |

Table 5: Paired T-Test Results

| Pair | Variables | Paired Differences: Mean | Paired Differences: Std. Deviation | Paired Differences: Std. Error Mean | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|
| Pair 1 | BHPNN--BHPOLS | -6.34259E-4 | 3.42404730E-2 | 2.777269095E-3 | -.228 | 151 | .820 |
| Pair 2 | XOMNN--XOMOLS | -6.67665E-2 | 4.21730911E-2 | 3.420689389E-3 | -19.518 | 151 | .000 |
| Pair 3 | RIGNN--RIGOLS | -3.41046E-2 | 5.74627710E-2 | 4.660846189E-3 | -7.317 | 151 | .000 |
| Pair 4 | SLBNN--SLBOLS | -2.50065E-3 | 5.95664222E-2 | 4.831447483E-3 | -.518 | 151 | .606 |
| Pair 5 | IMONN--IMOOLS | -2.72088E-3 | 3.42164416E-2 | 2.775314989E-3 | -.980 | 151 | .328 |
| Pair 6 | ABXNN--ABXOLS | 9.56043E-4 | 2.98818579E-2 | 2.423738473E-3 | .394 | 151 | .694 |
| Pair 7 | GENN--GEOLS | 8.20936E-3 | 4.67522987E-2 | 3.792112164E-3 | 2.165 | 151 | .032 |
| Pair 8 | PCARNN--PCAROLS | -5.54600E-2 | 5.08675418E-2 | 4.125902466E-3 | -13.442 | 151 | .000 |
| Pair 9 | JCINN--JCIOLS | 1.41944E-3 | 4.26025544E-2 | 3.455523462E-3 | .411 | 151 | .682 |
| Pair 10 | PGNN--PGOLS | -3.11020E-2 | 6.69933325E-2 | 5.433876806E-3 | -5.724 | 151 | .000 |
| Pair 11 | TROWNN--TROWOLS | -2.51645E-2 | 4.53672736E-2 | 3.679771800E-3 | -6.839 | 151 | .000 |
| Pair 12 | JPMNN--JPMOLS | -3.55717E-2 | 6.70021588E-2 | 5.434592715E-3 | -6.545 | 151 | .000 |
| Pair 13 | MSNN--MSOLS | -3.97635E-2 | 8.04816222E-2 | 6.527921567E-3 | -6.091 | 151 | .000 |
| Pair 14 | SCHWNN--SCHWOLS | -1.75567E-2 | 4.50929150E-2 | 3.657518384E-3 | -4.800 | 151 | .000 |
| Pair 15 | PCLNN--PCLOLS | 7.140248E0 | 3.13463572E1 | 2.5425253142E0 | 2.808 | 151 | .006 |
| Pair 16 | HCPNN--HCPOLS | -2.24665E-1 | 4.34806491E-2 | 3.526746345E-3 | -63.703 | 151 | .000 |
| Pair 17 | HSTNN--HSTOLS | -6.54900E-2 | 5.64878909E-2 | 4.581772975E-3 | -14.294 | 151 | .000 |
| Pair 18 | PSANN--PSAOLS | -4.18075E-2 | 3.48384639E-2 | 2.825772567E-3 | -14.795 | 151 | .000 |
| Pair 19 | BXPNN--BXPOLS | -4.05686E-2 | 3.06940199E-2 | 2.489613771E-3 | -16.295 | 151 | .000 |
| Pair 20 | SPGNN--SPGOLS | -3.82876E-2 | 5.24715328E-2 | 4.256003314E-3 | -8.996 | 151 | .000 |
| Pair 21 | JNJNN--JNJOLS | -3.10540E-2 | 6.83283382E-2 | 5.542160068E-3 | -5.603 | 151 | .000 |
| Pair 22 | AMGNNN--AMGNOLS | 8.01284E-3 | 4.93690931E-2 | 4.004362293E-3 | 2.001 | 151 | .047 |
| Pair 23 | CTNN--CTOLS | -9.33403E-3 | 4.63714383E-2 | 3.761220316E-3 | -2.482 | 151 | .014 |
| Pair 24 | VMCNN--VMCOLS | -7.67446E-3 | 3.66897808E-2 | 2.975934195E-3 | -2.579 | 151 | .011 |
| Pair 25 | EMENN--EMEOLS | -2.38353E-2 | 5.79509746E-2 | 4.700444729E-3 | -5.071 | 151 | .000 |
| Pair 26 | MDRNN--MDROLS | -1.77961E-2 | 3.27104313E-2 | 2.653166327E-3 | -6.708 | 151 | .000 |
| Pair 27 | EMRNN--EMROLS | 3.04359E-2 | 7.95976401E-2 | 6.456221149E-3 | 4.714 | 151 | .000 |
| Pair 28 | PCPNN--PCPOLS | -1.26224E-2 | 3.13628082E-2 | 2.543859660E-3 | -4.962 | 151 | .000 |
| Pair 29 | FDXNN--FDXOLS | -3.86884E-2 | 1.27169704E-1 | 1.031482510E-2 | -3.751 | 151 | .000 |
| Pair 30 | GPCNN--GPCOLS | -1.52066E-2 | 4.97985906E-2 | 4.039199035E-3 | -3.765 | 151 | .000 |
| Pair 31 | HDNN--HDOLS | 2.70019E-2 | 8.27355383E-2 | 6.710738305E-3 | 4.024 | 151 | .000 |
| Pair 32 | GWWNN--GWWOLS | -7.02212E-3 | 5.36681089E-2 | 4.353058445E-3 | -1.613 | 151 | .109 |
| Pair 33 | TIFNN--TIFOLS | -1.82811E-3 | 4.31806947E-2 | 3.502416833E-3 | -.522 | 151 | .602 |
| Pair 34 | BNINN--BNIOLS | 1.08634E-2 | 7.21605544E-2 | 5.852993843E-3 | 1.856 | 151 | .065 |
| Pair 35 | JBHTNN--JBHTOLS | -2.81620E-3 | 5.55325362E-2 | 4.504283478E-3 | -.625 | 151 | .533 |
| Pair 36 | SONN--SOOLS | -6.40091E-3 | 3.60359912E-2 | 2.922904856E-3 | -2.190 | 151 | .030 |
| Pair 37 | TRPNN--TRPOLS | -7.51490E-2 | 3.87235412E-2 | 3.140893949E-3 | -23.926 | 151 | .000 |

Table 6: Aggregated Paired T-Test Results

| Pair | Variables | Paired Differences: Mean | Paired Differences: Std. Deviation | Paired Differences: Std. Error Mean | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|---|
| Pair 1 | NN--OLS | -0.0238 | 0.068720 | 0.0009161 | -0.02565 | -0.02206 | -26.044 | 5625 | .000 |
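The columns of Table 6 are linked by the standard paired t-test arithmetic: the standard error is the standard deviation of the paired differences divided by the square root of the number of pairs, and t is the mean difference divided by that standard error. The following is a minimal sketch of that calculation, not the authors' procedure; the function name `paired_t_summary` is hypothetical, and the number of pairs is assumed to be df + 1 = 5626. Because the table reports the mean rounded to four decimals, the recomputed t comes out close to, but not exactly, the published -26.044.

```python
import math


def paired_t_summary(mean_diff, sd_diff, n):
    """Return (standard error, t statistic, degrees of freedom)
    for a paired t-test, given summary statistics of the differences."""
    se = sd_diff / math.sqrt(n)   # standard error of the mean difference
    t = mean_diff / se            # t statistic
    return se, t, n - 1


# Rounded summary values from Table 6; n = df + 1 is an assumption.
se, t, df = paired_t_summary(-0.0238, 0.068720, 5626)
print(f"SE = {se:.7f}, t = {t:.3f}, df = {df}")
```

The same function can be applied to any row of Table 5 with n = 152 (df = 151) to check that the reported standard errors and t values agree with the means and standard deviations.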