PREPROCESSING DATA FOR NEURAL NETWORKS

By: Lou Mendelsohn

Today’s global markets demand new analytical tools for survival and profit as prevailing methods of analysis lose their luster. Here, STOCKS & COMMODITIES contributor Lou Mendelsohn explains how an emerging method of analysis — synergistic market analysis — can be applied to neural networks for financial forecasting and discusses how to select and combine various types of market information and transform the data into a format appropriate for neural network training.

With the rise of artificial intelligence technology and the increasingly interrelated markets of the 1990s offering unprecedented trading opportunities, technical analysis simply based on single-market historical testing is no longer enough. To meet the trading challenge in today’s global markets, technical analysis must be redefined. I propose a multidimensional method of analysis known as synergistic market analysis, which utilizes artificial intelligence technologies, including neural networks, to synthesize technical, fundamental and intermarket data. Synergistic analysis can quantify and discern underlying relationships and patterns between related markets, capturing information that reflects global market dynamics, which can markedly improve trading performance.

Previously, I discussed the selection of neural network paradigms and architectures for synergistic trading. This time, I will explore several issues related to input data selection and preprocessing. Because the available technical, intermarket and fundamental market data is extensive and the range of useful preprocessing methods is equally broad, here are some ways to handle input data effectively and efficiently in developing neural networks.

INPUT DATA SELECTION
Data selection can be a demanding and intricate task. After all, a neural network is only as good as the input data used to train it. If important data inputs are missing, then the effect on the neural network’s performance can be significant. Developing a workable neural network application can be considerably more difficult without a solid understanding of the problem domain. When selecting input data, the implications of following a market theory should be kept in mind. Existing market inefficiencies can be noted quantitatively by making use of artificial intelligence tools.

Individual perspective on the markets also influences the choice of input data. Technical analysis suggests the use of only single-market price data as inputs, while conversely, fundamental analysis concentrates solely on data inputs that reflect supply/demand and economic factors. In today’s global environment, neither approach alone is sufficient for financial forecasting. Instead, synergistic market analysis combines both approaches with intermarket analysis within a quantitative framework using neural networks. This overcomes the limitations of interpreting intermarket relationships through simple visual analysis of price charts and carries conceptualization of intermarket analysis to its logical conclusion.

Here, then, is an example of a neural network that predicts the next day’s high and low for the Treasury bond market. This way, we will be able to see how synergistic market analysis can be implemented in a neural network. First, technical price data on T-bonds should be input into the network, allowing it to learn the general price patterns and characteristics of the target market. In addition, fundamental data that can have an effect on the market (for example, the federal funds rate, the Gross Domestic Product, money supply, inflation rates and the consumer price index) can all be input into the network.

Because the neural network does not subscribe to a particular form of analysis, it will attempt to use all of the input information available to model the market. Thus, using fundamental data in addition to technical data can improve the overall performance of the network. Finally, incorporating intermarket input data on related markets such as the US Dollar Index, Standard & Poor’s 500 index and the German Bund allows the network to utilize this information to find intermarket relationships and patterns that affect the target market. The selection of fundamental and intermarket data is based on domain knowledge coupled with the use of various statistical analysis tools to determine the correlation between this data and target market price data.
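As a minimal sketch of the statistical screening mentioned above, the following ranks candidate inputs by the absolute value of their Pearson correlation with the target market's price series. The function names, the series names and the numbers are illustrative assumptions, not real market data, and correlation is only one of the statistical tools that could be used for this purpose.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rank_candidates(target, candidates):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up illustrative numbers (hypothetical, not market data).
tbond_close = [100.0, 101.0, 102.0, 101.5, 103.0, 104.0]
candidates = {
    "dollar_index": [90.0, 89.5, 89.0, 89.2, 88.5, 88.0],
    "unrelated": [5.0, 7.0, 5.0, 7.0, 5.0, 7.0],
}
ranking = rank_candidates(tbond_close, candidates)
```

A strongly negative correlation ranks as high as a strongly positive one here, since either suggests that the series carries information about the target market.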

PREPROCESSING INPUT DATA
Once the most appropriate raw input data has been selected, it must be preprocessed; otherwise, the neural network will not produce accurate forecasts. The decisions made in this phase of development are critical to the performance of a network.

Transformation and normalization are two widely used preprocessing methods. Transformation involves manipulating raw data inputs to create a single input to a net, while normalization is a transformation performed on a single data input to distribute the data evenly and scale it into an acceptable range for the network. Knowledge of the domain is important in choosing preprocessing methods to highlight underlying features in the data, which can increase the network’s ability to learn the association between inputs and outputs.

Some simple preprocessing methods include computing differences between or taking ratios of inputs. This reduces the number of inputs to the network and helps it learn more easily. In financial forecasting, transformations that involve the use of standard technical indicators should also be considered. Moving averages, for example, which are utilized to help smooth price data, can be useful as a transform.
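The difference and ratio transforms can be sketched as follows. The helper names and the sample prices are assumptions made for illustration; the point is that two raw series (for instance, the high and the low) collapse into one derived input (the daily range).

```python
def difference(a, b):
    """Element-wise difference a - b of two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return [x - y for x, y in zip(a, b)]


def ratio(a, b):
    """Element-wise ratio a / b of two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    if any(y == 0 for y in b):
        raise ValueError("denominator series contains zero")
    return [x / y for x, y in zip(a, b)]


# Hypothetical prices, chosen only to illustrate the transforms.
highs = [101.5, 102.0, 103.25]
lows = [100.5, 100.0, 102.25]
opens = [100.0, 100.0, 103.0]
closes = [101.0, 100.0, 103.0]

daily_range = difference(highs, lows)      # two inputs become one
close_to_open = ratio(closes, opens)       # relative move within the day
```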

When creating a neural net to predict tomorrow’s close, a five-day simple moving average of the close can be used as an input to the net. This benefits the net in two ways. First, it has been given useful information at a reasonable level of detail; and second, by smoothing the data, the noise entering the network has been reduced. This is important because noise can obscure the underlying relationships within input data from the network, as it must concentrate on interpreting the noise component. The only disadvantage is that worthwhile information might be lost in an effort to reduce the noise, but this tradeoff always exists when attempting to smooth noisy data.
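A five-day simple moving average of the close can be computed as in the sketch below. The function name and the sample closes are assumptions; the first average becomes available only once a full window of data exists, so the output is shorter than the input.

```python
def simple_moving_average(values, window=5):
    """Simple moving average; one output per full window of input values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical closing prices.
closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
sma5 = simple_moving_average(closes, window=5)
```

Each value in `sma5` lines up with the last day of its window, so `sma5[0]` is the average available at the close of the fifth day.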

While not all technical indicators have a smoothing effect, this does not mean that they cannot be utilized as data transforms. Possible candidates are other common indicators such as the relative strength index (RSI), the average directional movement indicator (ADX) and stochastics.
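As one example of a non-smoothing indicator used as a transform, here is a sketch of the raw stochastic %K: the position of the close within the highest high and lowest low of the last n bars, expressed from 0 to 100. The 14-bar default is a common convention rather than anything prescribed here, and returning 50 when the range is zero is an assumption made so that the function always yields a value.

```python
def stochastic_k(highs, lows, closes, window=14):
    """Raw stochastic %K: 100 * (close - lowest low) / (highest high - lowest low)."""
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("series must have equal length")
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(window - 1, len(closes)):
        highest = max(highs[i - window + 1 : i + 1])
        lowest = min(lows[i - window + 1 : i + 1])
        if highest == lowest:
            result.append(50.0)  # assumption: midpoint when the range is flat
        else:
            result.append(100.0 * (closes[i] - lowest) / (highest - lowest))
    return result


# Hypothetical bars, with a short window so the arithmetic is easy to follow.
highs = [10.0, 12.0, 11.0, 13.0]
lows = [8.0, 9.0, 9.0, 10.0]
closes = [9.0, 11.0, 10.0, 12.0]
k_values = stochastic_k(highs, lows, closes, window=3)
```

Because the result is bounded between 0 and 100, it is straightforward to scale into the range expected by the input neurons.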

Data normalization is the final preprocessing step. In normalizing data, the goal is to ensure that the statistical distribution of values for each net input and output is roughly uniform. In addition, the values should be scaled to match the range of the input neurons. This means that along with any other transformations performed on network inputs, each input should be normalized as well.

DATA NORMALIZATION
Here are three methods of data normalization, the first of which is a simple linear scaling of data. At the very least, data must be scaled into the range used by the input neurons in the neural network. This is typically the range of -1 to 1 or zero to 1. Many commercially available generic neural network development programs such as NeuralWorks, BrainMaker and DynaMind automatically scale each input. This function can also be performed in a spreadsheet or custom-written program. Of course, a linear scaling requires that the minimum and maximum values associated with the facts for a single data input be found. Let’s call these values Dmin and Dmax, respectively. The input range required for the network must also be determined. Let’s assume that the input range is from Imin to Imax. The formula for transforming each data value D to an input value I is:

I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)

Dmin and Dmax must be computed on an input-by-input basis. This method of normalization will scale input data into the appropriate range but will not increase its uniformity.
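The linear scaling formula above translates directly into code. In this sketch the function and parameter names are assumptions; Dmin and Dmax default to the minimum and maximum of the data supplied, but they can also be passed in explicitly.

```python
def linear_scale(values, i_min=0.0, i_max=1.0, d_min=None, d_max=None):
    """Apply I = Imin + (Imax - Imin) * (D - Dmin) / (Dmax - Dmin) to each value."""
    if d_min is None:
        d_min = min(values)
    if d_max is None:
        d_max = max(values)
    if d_max == d_min:
        raise ValueError("Dmax must differ from Dmin")
    return [i_min + (i_max - i_min) * (d - d_min) / (d_max - d_min) for d in values]


# Hypothetical raw values for a single input.
raw = [10.0, 20.0, 30.0]
zero_to_one = linear_scale(raw)                           # range 0 to 1
minus_one_to_one = linear_scale(raw, i_min=-1.0, i_max=1.0)  # range -1 to 1
```

Passing the `d_min` and `d_max` found on the training data when scaling later test data keeps the two sets on the same footing, although test values beyond those limits will then fall outside the input range.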

The second normalization method utilizes a statistical measure of central tendency and variance to help remove outliers and spread out the distribution of the data, which tends to increase uniformity. This is a relatively simple method of normalization, in which the mean and standard deviation for the input data associated with each input are determined. Dmin is then set to the mean minus some number of standard deviations. So, if the mean is 50, the standard deviation is three and two standard deviations are chosen, then the Dmin value would be 44 (50-2*3). Dmax is conversely set to the mean plus the same number of standard deviations, or 56 (50+2*3) in this example. All data values less than Dmin are set to Dmin and all data values greater than Dmax are set to Dmax. A linear scaling is then performed as described above. By clipping off the ends of the distribution this way, outliers are removed, causing data to be more uniformly distributed.

The third normalization method minimizes the standard deviation of the heights of the columns in the initial frequency distribution histogram.
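The second method (clipping at the mean plus or minus a chosen number of standard deviations, then scaling linearly) can be sketched as follows. The function names are assumptions, and the population standard deviation is used here only because the text does not say which form is intended.

```python
from statistics import mean, pstdev


def clip_bounds(values, num_std=2.0):
    """Return (Dmin, Dmax) = mean -/+ num_std standard deviations."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    return center - num_std * spread, center + num_std * spread


def clip_and_scale(values, num_std=2.0, i_min=0.0, i_max=1.0):
    """Clip values to [Dmin, Dmax], then scale linearly into [i_min, i_max]."""
    d_min, d_max = clip_bounds(values, num_std)
    if d_max == d_min:
        raise ValueError("data has zero standard deviation")
    clipped = [min(max(d, d_min), d_max) for d in values]
    return [i_min + (i_max - i_min) * (d - d_min) / (d_max - d_min) for d in clipped]


# Mean 50 and standard deviation 3 give limits of 44 and 56 at two deviations.
low, high = clip_bounds([47.0, 53.0], num_std=2.0)

# Hypothetical data with one outlier, which ends up pinned at the top of the range.
scaled = clip_and_scale([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0])
```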

Figure 1 depicts an example of a distribution as a histogram in which the data is unevenly distributed. To show the effects of the various approaches, we have used all three methods of normalization to prepare the data as input to a neural net in the range of zero to 1. Figure 2 shows that a simple linear scaling of the data has no effect on the shape of the frequency distribution itself, while Figure 3 shows the same original distribution normalized by the second method, in which two standard deviations were used to set the limits for the outliers so that the distribution spreads out and becomes more uniform. Figure 4 shows that after performing the third normalization on the data, the resulting distribution is the most uniformly distributed. There are other methods for data normalization. Some methods are more appropriate than others, depending on the nature and characteristics of the data to be normalized.
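The procedure behind the third method is not spelled out here, so the sketch below is not that method. It is a different, well-known technique with a similar aim: a rank-based mapping that replaces each value with its rank, scaled into the input range, which yields a nearly flat histogram. The function name is an assumption, and tied values share their average rank.

```python
def rank_normalize(values, i_min=0.0, i_max=1.0):
    """Map each value to its (average) rank, scaled linearly into [i_min, i_max]."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    return [i_min + (i_max - i_min) * r / (n - 1) for r in ranks]


# Hypothetical, heavily skewed data.
flattened = rank_normalize([1.0, 2.0, 3.0, 100.0, 1000.0])
```

The tradeoff is that the distances between the original values are discarded, so this mapping cannot be reversed without keeping the original data alongside it.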

When the network is run on a new test fact, the output produced must be denormalized. If the normalization is entirely reversible with little or no loss in accuracy, then there is no problem. However, if the original normalization involved clipping outlier values, then output values equal to the clipping boundaries should be suspect concerning their actual value. For example, assume that during training all output values greater than 50 were clipped. Then, during testing, if the net produces an output of 50, this indicates only that the net’s output is 50 or greater. If that information is acceptable for the application, then the normalization method would be sufficiently reversible.
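Denormalizing a linearly scaled output means inverting the scaling formula, and an output that lands on a clipping boundary can be flagged as a bound rather than an exact value. The sketch below assumes linear scaling with known Dmin and Dmax; the function names, the status labels and the tolerance are illustrative assumptions.

```python
def denormalize(i_value, d_min, d_max, i_min=0.0, i_max=1.0):
    """Invert I = Imin + (Imax - Imin) * (D - Dmin) / (Dmax - Dmin)."""
    return d_min + (d_max - d_min) * (i_value - i_min) / (i_max - i_min)


def denormalize_with_flag(i_value, d_min, d_max, i_min=0.0, i_max=1.0, tol=1e-9):
    """Denormalize and report whether the result sits on a clipping boundary."""
    value = denormalize(i_value, d_min, d_max, i_min, i_max)
    if value >= d_max - tol:
        return value, "at_or_above_upper_clip"   # true value may be larger
    if value <= d_min + tol:
        return value, "at_or_below_lower_clip"   # true value may be smaller
    return value, "within_range"


# Hypothetical limits: outputs above 50 were clipped during training.
inside = denormalize_with_flag(0.5, d_min=10.0, d_max=50.0)
at_limit = denormalize_with_flag(1.0, d_min=10.0, d_max=50.0)
```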

Transformation and normalization can greatly improve a network’s performance. Basically, these preprocessing methods are used to encode the highest-level knowledge that is known about a given problem.

SUMMARY
In developing a neural network for price prediction, direction prediction or buy and sell signal generation, the choice of raw data inputs and preprocessing methods is critical to the network’s performance. The following raw input data is necessary for neural networks to capture the market synergy in today’s global markets:

Single-market technical price, volume and open interest data
Intermarket data from highly correlated markets
Appropriate fundamental data affecting the target market.

Here are some suggestions for transforming the input data prior to training a neural network:

1. Preprocess internal data from the target market. This gives the network a basic understanding of the target market. Transforms should include:

a) Changes over time, such as changes in the opens, highs, lows, closes, volume and open interest.

b) A method to reduce the noise in the data. To do so, use simple or exponential moving averages or other appropriate forms of smoothing. More advanced noise reduction techniques such as a fast Fourier transform (FFT) can be attempted.

c) Directional indicators.

d) Overbought and oversold indicators.

Transforms that classify the state that the market is in should be explored: For example, whether the market is in a bull, bear or sideways state. By using indicators that help identify these conditions, the neural network can interpret similar data in different ways when they occur during different market states.

2. Preprocess the intermarket data associated with the target market. One way to do this is to calculate spreads between the target market and the various intermarkets. This will make the relationship between the markets more apparent to the neural network.

3. Preprocess associated fundamental data. Find, or attempt to find, data that is updated in the appropriate time frame for the predictions. When predicting the high for tomorrow, for example, attempt to utilize data that is available daily or at least weekly. For weekly predictions, weekly or monthly data would be more appropriate. Remember, daily data can be transformed to weekly data through averaging or taking maximum or minimum values.

4. Normalize the data. Here are some rules of thumb when performing data normalization:

a. Normalize all inputs and outputs.

b. Don’t restrict yourself to the same type of normalization for all inputs/outputs.

c. Use the same normalization type for testing data as well as for training data for each input and output.

d. Make sure that normalization of output data is sufficiently reversible.
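Two of the suggestions above, intermarket spreads (item 2) and converting daily data to weekly data (item 3), can be sketched as follows. The function names are assumptions, and so is the fixed five-day week: holidays are ignored and an incomplete trailing week is dropped, which a real application would need to handle with actual calendar dates.

```python
def spread(target, related):
    """Element-wise spread: target market value minus related market value."""
    if len(target) != len(related):
        raise ValueError("series must have equal length")
    return [t - r for t, r in zip(target, related)]


def to_weekly(daily, how="mean", days_per_week=5):
    """Collapse daily values into weekly ones by averaging or taking max/min."""
    reducers = {
        "mean": lambda chunk: sum(chunk) / len(chunk),
        "max": max,
        "min": min,
    }
    if how not in reducers:
        raise ValueError("how must be 'mean', 'max' or 'min'")
    reduce_chunk = reducers[how]
    last_start = len(daily) - days_per_week
    return [
        reduce_chunk(daily[i : i + days_per_week])
        for i in range(0, last_start + 1, days_per_week)
    ]


# Hypothetical series, two full trading weeks long.
target_close = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
related_close = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

daily_spread = spread(target_close, related_close)
weekly_mean = to_weekly(target_close, how="mean")
weekly_high = to_weekly(target_close, how="max")
weekly_low = to_weekly(target_close, how="min")
```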

Once the network architecture has been selected and the inputs chosen and preprocessed, the neural network is ready to be trained. First, however, several questions must be answered: How should facts be presented to the net? What learning rates should be used? What should the initial weights be? These questions, as well as others, will be addressed next time.

Lou Mendelsohn, 813 973-0496, is president of Market Technologies, Wesley Chapel, FL, a research, development and consulting firm involved in the application of artificial intelligence to synergistic market analysis.

REFERENCES
Lo, Andrew W., and A.C. MacKinlay [1988]. “Stock market prices do not follow random walks: Evidence from a simple specification test,” The Review of Financial Studies, Vol. 1, No. 1.
Mendelsohn, Lou [1993]. “Neural network development for financial forecasting,” STOCKS & COMMODITIES, September.
_____ [1991]. “The basics of developing a neural trading system,” Technical Analysis of STOCKS & COMMODITIES, Volume 9: June.
Murphy, John J. [1991]. Intermarket Technical Analysis, John Wiley & Sons.
Peters, Edgar E. [1991]. Chaos and Order in the Capital Markets, John Wiley & Sons.
Trippi, Robert R., and Efraim Turban [1992]. Neural Networks in Finance and Investing, Probus Publishing.
Vaga, T. [1991]. “The Coherent Market Hypothesis,” Financial Analysts Journal, December/January.

Reprinted from Technical Analysis of Stocks & Commodities magazine. (C) 1993 Technical Analysis, Inc., 4757 California Avenue S.W., Seattle, WA 98116-4499, (800) 832-4642.