Neural networks are very useful for solving problems where you know which
inputs produce some output, but you have no idea how the two are related.
Some simple examples are the weather, speech recognition, and vision.  It is
very hard to program a routine which will recognize speech when you don't
even know what makes the word "network" sound like the word "network".
Neural networks are good classifiers and are able to learn how an input
generates a certain output.  To train a neural network, you first show it
many examples of inputs for which you already know the correct output.
After a while, the network will generate the correct output for each of
those inputs.  If the network has learned correctly, the next time you show
it an input it has never seen before, it should give the correct output.
Now that some of you are totally confused, let's do an example.

Problem: Let us say that we are trying to predict the weather for the next
         day.  We believe that tomorrow's weather depends upon today's
         temperature, sky conditions (sunny, cloudy, rainy), wind,
         barometric pressure, and humidity.  Assuming that this assumption
         is correct (it is not, but let's keep the example simple), all we
         have to do is give the inputs to the network and tell it what has
         happened in the past.

         We have recorded 100 days of readings (inputs) and the next day's
         weather (output).  Now we show the network the inputs and tell it
         what the outputs must look like based on our data.  After training
         the network for a while, all 100 input examples will generate the
         correct output (a prediction of tomorrow's weather).  If the
         network has sufficient information, then when we give it today's
         weather and ask it what tomorrow should be like, it should predict
         correctly.

Now the question is, how do we implement a neural network?  The network
design is a feed-forward, fully-connected network with two hidden layers.
The size of each layer (input, hidden1, hidden2, and output) is set at
run time and can be as large as your computer's memory allows.

The INPUTS:  The inputs must be between -1.0 and +1.0.
The OUTPUTS: Generally each output will be either 1 or 0.  For the weather
             problem, a 1 at output 1 may indicate sunny while a 0 indicates
             rain.  Likewise, output 2 could indicate windy or not windy.
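
For the weather problem, one day's readings and the next day's weather might
be packed into the two arrays of doubles that the network expects.  The
helper below is purely illustrative (the function, the variable names, and
the scaling factors are not part of the library); any scheme that keeps
every input within -1.0 to +1.0 will do.

    // one training pair for the network
    double input [5];
    double desired_output [2];

    void encode_example (double temperature, double sky, double wind_speed,
                         double pressure, double humidity, int sunny, int windy)
    {
      input [0] = temperature / 40.0;          // scaled to roughly +-1.0
      input [1] = sky;                         // -1.0 rainy, 0.0 cloudy, 1.0 sunny
      input [2] = wind_speed / 50.0;           // wind speed, scaled
      input [3] = (pressure - 1000.0) / 50.0;  // barometric pressure, centered and scaled
      input [4] = humidity / 100.0;            // humidity as a fraction

      desired_output [0] = sunny;              // output 1: 1 = sunny, 0 = rain
      desired_output [1] = windy;              // output 2: 1 = windy, 0 = not windy
    }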

To train the network:
 STEP 1: You first construct the Neural_network with the size and the
         learning parameters that you want.  You may also read the size and
         previously trained weights in from a file.

 STEP 2: You call calc_forward () with a known input, which calculates the
         actual output.

 STEP 3: You call back_propagation () with the desired output, which
         compares the actual output with the desired output and then
         calculates how to change the connections to reduce the difference.

 STEP 4: Go back to step 2 with another known input until all known inputs
         have been shown once.

 STEP 5: Call update_weights (), which actually changes all the inter-
         connections the way back_propagation () calculated they should be.

 STEP 6: Go back to step 2 and show all the known inputs again until the
         actual output is close (usually within 0.1) to the desired output.

 STEP 7: Save the weights (connections) to a file.

See the programs xor_dbd.cc and xor_bp.cc for an example of this basic
procedure.
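
In code, the whole procedure comes out to a loop like the sketch below.  It
is only a sketch: the layer sizes, the data arrays, the weight file name,
and the use of calc_forward ()'s returned error as the convergence test are
placeholders rather than anything the library dictates (the num_wrong, skip,
and done reference parameters are described with the class declaration
further down).

    #include "Neural_network.h"        // header name assumed

    const int NUM_EXAMPLES = 100;
    double inputs  [NUM_EXAMPLES][5];   // 100 days of readings
    double outputs [NUM_EXAMPLES][2];   // the next day's weather for each

    void train ()
    {
      // STEP 1: construct the network (5 inputs, 10/10 hidden, 2 outputs)
      Neural_network net (5, 10, 10, 2);

      double max_error = 1.0;
      while (max_error > 0.1)           // STEP 6: until every example is close
        {
          max_error = 0.0;
          for (int ex = 0; ex < NUM_EXAMPLES; ex++)
            {
              int num_wrong = 0, skip = 0, printed = 0, done = 0;

              // STEP 2: calculate the actual output for a known input
              double error = net.calc_forward (inputs [ex], outputs [ex],
                                               num_wrong, skip, 0, printed);

              // STEP 3: work out how the weights should change
              net.back_propagation (inputs [ex], outputs [ex], done);

              if (error > max_error)    // STEP 4: on to the next example
                max_error = error;
            }

          net.update_weights ();        // STEP 5: apply this epoch's changes
        }

      // STEP 7: save the trained weights
      char weight_file [] = "weather.wts";
      net.save_weights (weight_file);
    }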


//****************************************************************************
//
//  Neural_network class:
//
//   This class performs all the necessary functions needed to train
//   a Neural Network.  The network has an input layer, two hidden
//   layers, and an output layer.  The size of each layer is specified
//   at run time, so there is no restriction on size except memory.
//   This is a feed-forward network with full connections from one
//   layer to the next.
//
//   The network can perform straight back-propagation with no
//   modifications (Rumelhart, Hinton, and Williams, 1985), which
//   will find a solution, but not very quickly.  The network can also
//   perform back-propagation with the delta-bar-delta rule developed
//   by Robert A. Jacobs, University of Massachusetts
//   (Neural Networks, Vol. 1, pp. 295-307, 1988).  The basic idea of this
//   rule is that every weight has its own learning rate and each
//   learning rate should be continuously changed according to the
//   following rules -
//     - If the weight changes in the same direction as the previous update,
//       then the learning rate for that weight should increase by a constant.
//     - If the weight changes in the opposite direction to the previous
//       update, then the learning rate for that weight should decrease
//       exponentially.
//
//   learning rate = e(t) for each individual weight
//   The exact formula for the change in learning rate (DELTA e(t)) is
//
//
//                  |   K           if DELTA_BAR(t-1)*DELTA(t) > 0
//   DELTA e(t) =   |  -PHI*e(t)    if DELTA_BAR(t-1)*DELTA(t) < 0
//                  |   0           otherwise
//
//   where DELTA(t) = dJ(t) / dw(t)  --->  partial derivative
//
//   and DELTA_BAR(t) = (1 - THETA)*DELTA(t) + THETA*DELTA_BAR(t-1).
//
//   For full details of the algorithm, read the article in
//   Neural Networks.
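//
//   As a rough sketch, the rule for a single weight comes out to something
//   like the lines below (this is only an illustration, not the actual
//   member code; e, delta_bar, dJ_dw, K, PHI, and THETA stand for the
//   per-weight learning rate, the running average of the gradient, the
//   current gradient, and the three learning parameters):
//
//     double delta = dJ_dw;                     // DELTA(t)
//     if (delta_bar * delta > 0.0)
//       e += K;                                 // same sign: grow by a constant
//     else if (delta_bar * delta < 0.0)
//       e -= PHI * e;                           // opposite sign: decay exponentially
//     delta_bar = (1.0 - THETA)*delta + THETA*delta_bar;   // DELTA_BAR(t)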
//
//
//   To perform straight back-propagation, just construct a Neural_network
//   with no learning parameters specified (they default to straight
//   back-propagation) or set them to
//        K = 0, PHI = 0, THETA = 1.0
//
//   However, using the delta-bar-delta rule should increase your rate of
//   convergence by a factor of 10 to 100 generally.  The parameters for
//   the delta-bar-delta rule I use are
//        K = 0.025, PHI = 0.2, THETA = 0.8
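//
//   For example (the layer sizes below are only placeholders), the two
//   set-ups might be constructed as
//
//     Neural_network bp_net  (5, 10, 10, 2);   // straight back-propagation
//     Neural_network dbd_net (5, 10, 10, 2,    // delta-bar-delta
//                             0.1, 0.0, 0.1, 0.8, 0.2, 0.025);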
//
//   One more heuristic method has been employed in this Neural net class -
//   the skip heuristic.  This is something I thought of and I am sure
//   other people have also.  If the output activation is within
//   skip_epsilon of its desired value for each output, then the
//   calc_forward routine returns the skip_flag = 1.  This lets you avoid
//   wasting time trying to push already very close examples to the exact
//   value.  If the skip_flag comes back '1', then don't bother calculating
//   forward or back-propagating the example for the next X epochs.  You
//   must write the routine to skip the example yourself, but the
//   Neural_network will tell you when to skip it.  This heuristic also has
//   the advantage of reducing memorization and increasing generalization.
//   Typical values I use for this heuristic -
//        skip_epsilon   = 0.01 - 0.05
//        number skipped = 2 - 10.
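//
//   A sketch of how the skipping might be arranged in your own training
//   loop (the times_to_skip counter array and the value 5 are placeholders,
//   not part of the class):
//
//     if (times_to_skip [ex] > 0)           // still being skipped
//       { times_to_skip [ex]--; continue; }
//     error = net.calc_forward (inputs [ex], outputs [ex],
//                               num_wrong, skip_flag, 0, printed);
//     if (skip_flag)
//       times_to_skip [ex] = 5;             // very close: skip it for 5 epochs
//     else
//       net.back_propagation (inputs [ex], outputs [ex], done);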
//
//   Experiment with all the values to see which work best for your
//   application.
//
//
//   Comments and suggestions are welcome and can be emailed to me
//        anstey@sun.soe.clarkson.edu
//
//****************************************************************************


//***********************************************************************
//  Constructors :
//    Full size specifications and learning parameters.
//    Learning parameters are provided defaults which are set to
//    just use the BP algorithm with no modifications.
//
//    Read constructor which reads in the size and all the weights from
//    a file.  The network is resized to match the size specified
//    by the file.  Learning parameters must be specified
//    separately.
//***********************************************************************

  Neural_network (int number_inputs = 1, int number_hidden1 = 1,
                  int number_hidden2 = 1,
                  int number_outputs = 1, double t_epsilon = 0.1,
                  double t_skip_epsilon = 0.0, double t_learning_rate = 0.1,
                  double t_theta = 1.0, double t_phi = 0.0, double t_K = 0.0,
                  double range = 3.0);
  Neural_network (char *filename, int& file_error, double t_epsilon = 0.1,
                  double t_skip_epsilon = 0.0, double t_learning_rate = 0.1,
                  double t_theta = 1.0, double t_phi = 0.0, double t_K = 0.0);
  ~Neural_network ();


//**************************************************************************
//  Weight parameter routines:
//    save_weights : This routine saves the weights of the network
//        to the file <filename>.
//
//    read_weights : This routine reads the weight values from the file
//        <filename>.  The network is automatically resized to the
//        size specified by the file.
//
//    Activation routines return the node activation after a calc_forward
//        has been performed.
//
//    get_weight routines return the weight between node1 and node2.
//
//**************************************************************************

  int save_weights (char *filename);
  int read_weights (char *filename);

  double get_hidden1_activation (int node);
  double get_hidden2_activation (int node);
  double get_output_activation (int node);

  double get_input_weight (int input_node, int hidden1_node);
  double get_hidden1_weight (int hidden1_node, int hidden2_node);
  double get_hidden2_weight (int hidden2_node, int output_node);


//*******************************************************************
//  Size parameters of network.
//    The size of the network may be changed at any time.  The weights
//    will be copied from the old size to the new size.  If the new
//    size is larger, then the extra weights will be randomly set
//    between +-range.  The matrices used to hold learning updates
//    and activations will be re-initialized (cleared).
//*******************************************************************

  int get_number_of_inputs ();
  int get_number_of_hidden1 ();
  int get_number_of_hidden2 ();
  int get_number_of_outputs ();
  void set_size_parameters (int number_inputs, int number_hidden1,
                            int number_hidden2, int number_outputs,
                            double range = 3.0);


//*******************************************************************
//  Learning parameters functions.  These parameters may be changed
//    on the fly.  The learning rate and K may have to be reduced as
//    more and more training is done to prevent oscillations.
//*******************************************************************
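//
//  For example, one might taper the learning rate every so often as
//  training drags on (a hypothetical schedule, not something the class
//  does for you):
//
//    if (net.get_iterations () % 1000 == 0)
//      net.set_learning_rate (net.get_learning_rate () * 0.9);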

  void set_epsilon (double eps);
  void set_skip_epsilon (double eps);
  void set_learning_rate (double l_rate);
  void set_theta (double t_theta);
  void set_phi (double t_phi);
  void set_K (double t_K);

  double get_epsilon ();
  double get_skip_epsilon ();
  double get_learning_rate ();
  double get_theta ();
  double get_phi ();
  double get_K ();
  long get_iterations ();


//**************************************************************************
//  The main neural network routines:
//
//    The network input is an array of doubles which has a size of
//    number_inputs.
//    The network desired output is an array of doubles which has a size
//    of number_outputs.
//
//    back_propagation : Calculates how each weight should be changed.
//        Assumes that calc_forward has been called just prior to
//        this routine to calculate all of the node activations.
//
//    calc_forward : Calculates the output for a given input.  Finds
//        all node activations, which are needed for back_propagation
//        to calculate the weight adjustments.  Returns abs (error).
//        The parameter skip is for use with the skip_epsilon
//        parameter.  If the output is within skip_epsilon of the
//        desired output, then it is so close that it should be
//        skipped from being calculated the next X times.
//        Careful use of this parameter can significantly increase
//        the rate of convergence and also help prevent over-learning.
//
//    calc_forward_test : Calculates the output for a given input.  This
//        routine is used for testing rather than training.  It returns
//        whether the test was CORRECT, GOOD or WRONG, which is
//        determined by the parameters correct_epsilon and
//        good_epsilon.  CORRECT > GOOD > WRONG.
//
//    update_weights : Actually adjusts all the weights according to
//        the calculations of back_propagation.  This routine should
//        be called at the end of every training epoch.  The weights
//        can be updated by the straight BP algorithm, or by the
//        delta-bar-delta algorithm developed by Robert A. Jacobs,
//        which generally increases the rate of convergence by at
//        least a factor of 10.  The parameters THETA, PHI, and K
//        determine which algorithm is used.  The default settings
//        for these parameters cause update_weights to use the straight
//        BP algorithm.
//
//    kick_weights : This routine changes all weights by a random amount
//        within +-range.  It is useful in case the network gets
//        'stuck' and is having trouble converging to a solution.  I
//        use it when the number wrong has not changed for the last 200
//        epochs.  Getting the range right will take some trial and
//        error, as it depends on the application and the weights'
//        actual values.
//
//**************************************************************************

  void back_propagation (double input [], double desired_output [],
                         int& done);

  double calc_forward (double input [], double desired_output [],
                       int& num_wrong, int& skip, int print_it,
                       int& actual_printed);

  int calc_forward_test (double input [], double desired_output [],
                         int print_it, double correct_eps, double good_eps);

  void update_weights ();

  void kick_weights (double range);

};
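
Once a network has been trained and its weights saved, testing it on new
data only needs the read constructor and calc_forward_test ().  Below is a
rough sketch; the weight file name, the test arrays, and the two epsilons
are placeholders, and CORRECT / GOOD / WRONG are whatever constants the
header defines for them.

    void test_network ()
    {
      char weight_file [] = "weather.wts";
      int  file_error;
      Neural_network net (weight_file, file_error);
      if (file_error)
        return;                        // could not read the weight file

      double test_input [5];           // today's readings, scaled as before
      double test_output [2];          // what actually happened the next day

      // ... fill in test_input and test_output, then:
      int result = net.calc_forward_test (test_input, test_output,
                                          1, 0.05, 0.2);
      // result is CORRECT, GOOD, or WRONG depending on how close each
      // output came to its desired value.
    }

If training ever stalls (the number wrong stops changing for a long
stretch), a call such as net.kick_weights (0.5) shakes every weight by a
random amount; the 0.5 is only a guess and will take some trial and error.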

KNOWN BUGS:
There are no known bugs in my code (which does not mean there aren't any),
but there is one bug with the Aztec C 5.0a m8 library.  The fscanf routine
does not correctly read floating point numbers from a text file, so you
must use a different math library to compile and link.

This code has been compiled with g++, gcc, and Aztec C 5.0a.  The C version
has not been fully tested, but it seems to work just fine.  I also do not
plan to continue revising the C version, just the C++ version.  However, I
will fix any bugs in either version.

You can email me at
    anstey@sun.soe.clarkson.edu

Hope you like the code!