The Basics of Developing an Artificial Intelligence Trading System
As traders begin to experiment with artificial intelligence systems and apply them to financial forecasting, there are pitfalls to avoid in designing and training these systems if the new technology is to be used effectively and profitably. If you want to apply artificial intelligence to predict the next day’s Standard & Poor’s 500 stock index (S&P 500) or Dow Jones Industrial Average (DJIA), for instance, you would need to specify five factors: the output that you want to forecast, the input data requirements, the type of neural system to apply to the problem, its size and its structure.
First, an artificial intelligence trading system can produce four kinds of output: classifications, patterns, real numbers (such as tomorrow’s S&P 500 or DJIA close) and optimizations.
Second, you need to select the input data that will be used by the neural system. The data should be related to the output that you want to forecast. Unlike conventional technical trading systems, neural systems work well when you combine technical and fundamental data. Remember that apparently irrelevant data could conceivably allow the network to make distinctions that are not readily apparent, so don’t be afraid to include such data as an input. During training, the neural system will sift through all the input data to determine relevance and may turn up something that could surprise you.
Next, you need to consider how to “massage,” or manipulate, the input data before training. This step is extremely important, since neural systems train much better on relative numbers such as oscillators, momentum and ratios, which provide the relevant relationships explicitly rather than forcing the system to discover them. The more pertinent the data that you provide the network, the better it will train. Therefore, it is important to select the training data carefully and preprocess it before training.
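As a minimal sketch of this kind of preprocessing, the Python below turns a raw closing-price series into two relative inputs: an n-day momentum (percentage change) and the ratio of each close to its simple moving average. The function names, window lengths and sample prices are illustrative assumptions, not part of the system described here.

```python
def momentum(closes, n):
    """Percentage change over n days; None for the first n days (no history yet)."""
    return [None] * n + [
        (closes[i] - closes[i - n]) / closes[i - n]
        for i in range(n, len(closes))
    ]


def ratio_to_moving_average(closes, n):
    """Close divided by its n-day simple moving average; None until n closes exist."""
    out = []
    for i in range(len(closes)):
        if i + 1 < n:
            out.append(None)
        else:
            window = closes[i + 1 - n : i + 1]
            out.append(closes[i] / (sum(window) / n))
    return out


# Hypothetical closing prices, for illustration only.
closes = [100.0, 102.0, 101.0, 104.0, 106.0]
mom = momentum(closes, 2)
rel = ratio_to_moving_average(closes, 3)
```

Both series hover around a fixed level (zero for momentum, one for the ratio) regardless of how high the index itself trades, which is the sense in which they state the relationship explicitly.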
NEURAL LAYER ON LAYER
Neural networks consist of layers of neurons that are connected to each other. Typically, there are three layers: an input layer, a hidden layer and an output layer (Figure 1). One type of neural system architecture that I have used for financial forecasting is known as a feed forward network with supervised learning (Figure 2). This type of system has two or more layers, with neurons in one layer receiving information only from the previous layer and sending outputs only to the next layer. Neurons in a given layer do not interconnect. Each neuron in a layer is connected to every neuron of the succeeding layer, with mathematical weights (or connection strengths) assigned to their connections. This is known as a “fully connected” network configuration.
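The flow of information through a fully connected feed forward network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sigmoid activation, the extra bias weight on each neuron, the random starting weights and the choice of six hidden neurons are not specified in the text above.

```python
import math
import random


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1); an assumed activation."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """One weight per incoming connection plus one bias weight per neuron."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layers, inputs):
    """Pass activations layer by layer; each neuron sees only the previous layer."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron[:-1], activations)) + neuron[-1])
            for neuron in layer
        ]
    return activations


rng = random.Random(0)
# 12 inputs -> 6 hidden neurons -> 1 output (the hidden size is an arbitrary choice).
network = [make_layer(12, 6, rng), make_layer(6, 1, rng)]
output = forward(network, [0.1] * 12)
```

Every hidden neuron holds one weight for each of the 12 inputs, and the output neuron holds one weight for each hidden neuron, which is what “fully connected” means here.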
THE INPUT LAYER
The input layer presents data to the network. The number of neurons in the input layer is determined by the number of data categories. Each category of input data requires one input neuron, and it is here that the size and structure of the neural system must be determined. For instance, in an S&P 500 or DJIA prediction system, if your input data include each day’s closing price for the Deutschemark, S&P 500, Japanese yen, Treasury bills, Eurodollars, Swiss franc, U.S. dollar index, Treasury bonds, DJIA and gold, as well as the discount and Fed funds rates (a total of 12 categories of data), your network would have 12 neurons in the input layer. Massaging the data with moving averages, ratios and so on to eliminate data noise will affect the number of input neurons. Coupled with each day’s input data would be the next day’s S&P 500 or DJIA closing price. Each of these input/output pairs of data, or training patterns, is called a “fact.”
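The pairing of each day’s inputs with the next day’s close can be sketched as follows. The function name `build_facts` and the sample figures are hypothetical, and only two input categories are shown to keep the example short.

```python
def build_facts(daily_inputs, target_closes):
    """Pair day t's input row with day t+1's close; the last day has no target yet."""
    if len(daily_inputs) != len(target_closes):
        raise ValueError("inputs and closes must cover the same days")
    return [
        (daily_inputs[t], target_closes[t + 1])
        for t in range(len(daily_inputs) - 1)
    ]


# Hypothetical data: three days, two input categories per day.
daily_inputs = [[0.61, 350.2], [0.62, 351.0], [0.60, 349.5]]
sp500_closes = [330.0, 332.5, 331.0]
facts = build_facts(daily_inputs, sp500_closes)
```

Three days of data yield two facts, because the most recent day’s inputs have no next-day close to be paired with.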
THE HIDDEN LAYER
The hidden layer is composed of neurons that are connected to neurons in the input and output layers but do not connect directly with the outside world. The hidden layer is where the system recodes the input data into a form that captures the hidden correlations, allowing the system to generalize from previously learned facts to new inputs.
Experimentation often determines the number of hidden layers and the appropriate number of neurons in them. Too few neurons impair the network and prevent it from correctly mapping inputs to outputs, while too many neurons impede generalization by allowing the network to “memorize” the patterns presented to it without extracting any of the salient features (similar to curve-fitting or overoptimization). Then, when presented with new patterns, the network cannot process them properly because it has not discovered the underlying relationships.
THE OUTPUT LAYER
Each neuron in the output layer receives its inputs from each neuron in the hidden layer. Your desired output determines how many output neurons the system needs. Each output category requires one output neuron. Thus, if we want to predict the next day’s open, high, low and close for the S&P 500 or the DJIA, the system would, in fact, need four neurons in the output layer.
With supervised learning, you would provide the artificial intelligence with “facts” that represent input training patterns (today’s prices, discount rate and Fed funds rate) that you expect the system to encounter subsequently during trading, and an output training pattern (next day’s prices) that you want it to forecast. In this manner, during training the system forecasts as its output the next day’s S&P 500 or DJIA level, and the difference between that forecast and the actual level is then used to adjust each neuron’s connection weights, so that during subsequent training iterations, the system will be more likely to forecast the correct output.
For the system to learn during training, there must be a way to alter the connection weights in terms of how much and in which direction they will be changed. This algorithm, or paradigm, is known as the “learning law.” While numerous learning laws can be applied to neural networks, perhaps the most widely used is the generalized delta rule or back propagation method.
During each iteration of training, the inputs presented to the network generate “a forward flow of activation” from the input to the output layer. Then, whenever the output forecast by the system (next day’s S&P 500 or DJIA) is incorrect when compared with its corresponding value in the training pattern, information will flow backward from the output layer to the input layer, adjusting the weights on the inputs along the way. On the next training iteration, when the system is presented with the same input data, it will be more likely to forecast the correct output.
The learning law for a given network defines precisely how to modify these connection weights between neurons to minimize output errors during subsequent training iterations. If no error occurs, then no learning is needed for that fact. Eventually, when the system has completed learning on all of the facts, it reaches a stable state and is ready for further testing.
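One training iteration under the generalized delta rule can be sketched as below, for a network with one hidden layer and a single output. This is a minimal illustration, not the system described in this article: the sigmoid activation, the bias weights, the learning rate of 0.5, the network size and the sample fact are all assumptions, and the target is assumed to have been scaled into the range 0 to 1.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(w_hidden, w_out, inputs, target, rate):
    """One forward pass and one backward weight update for a single fact.

    w_hidden: one weight list per hidden neuron (last entry is a bias weight).
    w_out: weights from each hidden neuron to the output (last is a bias weight).
    Returns the squared error measured before the update.
    """
    # Forward flow of activation: input -> hidden -> output.
    x = list(inputs) + [1.0]
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    h = hidden + [1.0]
    output = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))

    # Backward flow: the output error is scaled by the sigmoid slope.
    error = target - output
    delta_out = error * output * (1.0 - output)
    # Each hidden neuron's share of the error, using the pre-update weights.
    delta_hidden = [
        delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
        for j in range(len(hidden))
    ]

    # Adjust every connection in proportion to its delta and its input.
    for j in range(len(w_out)):
        w_out[j] += rate * delta_out * h[j]
    for j, ws in enumerate(w_hidden):
        for i in range(len(ws)):
            ws[i] += rate * delta_hidden[j] * x[i]
    return error * error


rng = random.Random(1)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(3)]
# Presenting one hypothetical fact repeatedly shrinks the error on that fact.
errors = [train_step(w_hidden, w_out, [0.2, 0.7], 0.9, 0.5) for _ in range(200)]
```

When the forecast matches the target, `error` is zero, both deltas are zero and no weight changes, which corresponds to the remark that no learning is needed for a fact that produces no error.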
DISCERNING INTERNAL MAPPING
Through learning, the network creates an internal mapping of the input data that discerns the underlying causal relationships that exist within the data, allowing the system to be predictive on new input data. This methodology is more accurate at predicting future prices and trading signals than are conventional technical trading systems that rely on an expert to write trading rules.
The time necessary to train can be considerable, depending on your computer’s speed, the number of layers and the number of neurons. You can perform “walk-forward” testing by creating a testing file made up of facts not included in the training facts.
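A simple way to build such a testing file is to hold back the most recent facts, as sketched below. Keeping the facts in chronological order, holding out 20%, and measuring mean absolute error are assumptions made for illustration; `naive_predict` is a hypothetical stand-in for a trained network.

```python
def split_facts(facts, test_fraction):
    """Keep the earliest facts for training and hold out the most recent for testing."""
    n_test = max(1, int(len(facts) * test_fraction))
    return facts[:-n_test], facts[-n_test:]


def mean_abs_error(predict, facts):
    """Average absolute forecast error over facts the system never trained on."""
    return sum(abs(predict(inputs) - target) for inputs, target in facts) / len(facts)


# Hypothetical facts: (inputs, next-day close), in chronological order.
facts = [([float(day)], 100.0 + day) for day in range(10)]
train_facts, test_facts = split_facts(facts, 0.2)


def naive_predict(inputs):
    # Stand-in for a trained network: always forecasts the same level.
    return 105.0


test_error = mean_abs_error(naive_predict, test_facts)
```

Because none of the testing facts were seen during training, the error measured on them indicates how well the system generalizes rather than how well it has memorized.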
It may be necessary to refine system parameters, including modifying its architecture, learning law, input data or means of preprocessing. You may even need to redefine the expected output. Unlike training, during testing the system will not adjust the connection strengths to compensate for errors. The most common training problems occur when the system cannot train adequately or takes too long to train. If your system refuses to train on certain facts, there may be contradictory or ambiguous pairs of facts. If this is the case, you should reexamine your data inputs. You may also need to massage your input data more effectively.
To actually run the trained network in real time to get price forecasts, you would perform daily updates, giving the system each day’s inputs just as you did during training, except that no adjustments are made to the neurons’ interconnections. A trading fact file contains daily data that you provide to a trained system to get the daily projected output you’re looking for. Trading fact files look like training fact files but contain only inputs, without the output training patterns.
Artificial intelligence trading systems are poised to change the nature of analysis performed on the financial markets. With the means to develop dynamic trading systems that can adapt themselves to changing market conditions, without the necessity of relying upon preconceived trading rules, this new “sixth-generation” trading technology has the potential to affect the way computerized traders apply technical and fundamental analyses to global markets of the 1990s.
Louis Mendelsohn, president of Market Technologies, designs and tests artificial intelligence trading systems for the financial industry.
Reprinted from Technical Analysis of Stocks & Commodities magazine. (C) 1991 Technical Analysis, Inc., 4757 California Avenue S.W., Seattle, WA 98116-4499, (800) 832-4642.