- Path: sparky!uunet!know!cass.ma02.bull.com!think.com!rpi!batcomputer!ghost.dsi.unimi.it!univ-lyon1.fr!chx400!dxcern!dxlaa.cern.ch!block
- From: block@dxlaa.cern.ch (Frank Block)
- Newsgroups: comp.ai.neural-nets
- Subject: Re: How to train a lifeless network (of "silicon atoms")?
- Message-ID: <1992Nov22.182325.24185@dxcern.cern.ch>
- Date: 22 Nov 92 18:23:25 GMT
- References: <1992Nov21.002654.13198@news.columbia.edu>
- Sender: news@dxcern.cern.ch (USENET News System)
- Reply-To: block@dxlaa.cern.ch (Frank Block)
- Organization: CERN, European Laboratory for Particle Physics, Geneva
- Lines: 48
-
-
- In article <1992Nov21.002654.13198@news.columbia.edu>, rs69@cunixb.cc.columbia.edu (Rong Shen) writes:
-
- |> Please allow me to ask you this childish question:
- |>
- |> Suppose you have a neural network and you want to train it to
- |> perform a task; for the moment, let's say the task is to recognize
- |> handwriting. Now suppose the network has recognized the word "hello,"
- |> and the weight in the synapse between neurodes (neurons) X and Y is k.
- |> If you proceed to train the network to recognize the word "goodbye"
- |> (by back propagation, or whatever algorithms), and since all the
- |> neurodes are connected in some way (through some interneurons, maybe),
- |> the synaptic weight between X and Y is likely to change from k to some
- |> other number; similarly, the weights in other synapses will change.
- |> Therefore, it is extremely likely that one training session will erase
- |> the efforts of previous sessions.
- |>
- |> My question is, What engineering tricks shall we use to
- |> overcome this apparent difficulty?
- |>
- |> Thanks.
- |>
- |> --
- |> rs69@cunixb.cc.columbia.edu
- |>
- --
-
- What you normally do during training is to present (taking your example) the
- words 'hello' and 'goodbye' alternately. You should not train the net first
- on one word only and then, once it has learned to recognize that, on the other.
- Training is a statistical process which in the end (let's hope) converges
- to a good set of weights: a compromise that recognizes all the patterns in an
- optimal way.
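- 
- To make this concrete, here is a minimal sketch in Python/NumPy (my own toy
- patterns and a single linear layer, not anything from the original posts):
- both words are presented in every epoch, in alternation, so later updates
- cannot simply wipe out what was learned for the other word.
- 
-     import numpy as np
- 
-     # Toy stand-ins for 'hello' and 'goodbye': an input vector and a
-     # one-hot target class for each word (purely illustrative).
-     rng = np.random.default_rng(0)
-     patterns = {
-         "hello":   (np.array([1.0, 0.0, 1.0, 0.0]), np.array([1.0, 0.0])),
-         "goodbye": (np.array([0.0, 1.0, 0.0, 1.0]), np.array([0.0, 1.0])),
-     }
- 
-     W = rng.normal(scale=0.1, size=(2, 4))   # weights of one linear layer
-     lr = 0.1
- 
-     for epoch in range(200):
-         # Present *both* words in every epoch, instead of training to
-         # completion on one and only then on the other.
-         for word, (x, t) in patterns.items():
-             y = W @ x                        # network output
-             err = y - t
-             W -= lr * np.outer(err, x)       # delta-rule / gradient step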
- The engineering trick is mostly the so-called 'gradient descent' (as used in
- backprop). It always moves your current weight vector in a direction which
- decreases the network's error measure.
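- 
- As a sketch of that step on its own (an assumed toy quadratic error, not a
- full backprop implementation): at each iteration the weight vector is moved
- against the gradient of the error, so the error decreases for a small enough
- step size.
- 
-     import numpy as np
- 
-     target = np.array([1.0, -2.0])        # toy error minimum
- 
-     def error(w):                         # E(w), the error measure
-         return np.sum((w - target) ** 2)
- 
-     def grad(w):                          # dE/dw
-         return 2.0 * (w - target)
- 
-     w = np.array([5.0, 5.0])              # current weight vector
-     eta = 0.1                             # learning rate
-     for step in range(50):
-         w = w - eta * grad(w)             # move against the gradient
- 
-     print(error(w))                       # close to zero after descent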
-
- Hope this helps a bit
-
- Frank
-
- ===============================================================================
- Frank Block
- Div. PPE
- European Laboratory for Particle Physics - CERN
- CH-1211 Geneve 23 BLOCKF@cernvm.cern.ch
- Switzerland BLOCKF@vxcern.cern.ch
- ===============================================================================
-