- Path: sparky!uunet!know!cass.ma02.bull.com!think.com!rpi!batcomputer!ghost.dsi.unimi.it!univ-lyon1.fr!chx400!dxcern!dxlaa.cern.ch!block
- From: block@dxlaa.cern.ch (Frank Block)
- Newsgroups: comp.ai.neural-nets
- Subject: Re: How to train a lifeless network (of "silicon atoms")?
- Message-ID: <1992Nov22.182325.24185@dxcern.cern.ch>
- Date: 22 Nov 92 18:23:25 GMT
- References: <1992Nov21.002654.13198@news.columbia.edu>
- Sender: news@dxcern.cern.ch (USENET News System)
- Reply-To: block@dxlaa.cern.ch (Frank Block)
- Organization: CERN, European Laboratory for Particle Physics, Geneva
- Lines: 48
-
-
- In article <1992Nov21.002654.13198@news.columbia.edu>, rs69@cunixb.cc.columbia.edu (Rong Shen) writes:
-
- |> Please allow me to ask you this childish question:
- |>
- |> Suppose you have a neural network and you want to train it to
- |> perform a task; for the moment, let's say the task is to recognize
- |> handwriting. Now suppose the network has recognized the word "hello,"
- |> and the weight in the synapse between neurodes (neurons) X and Y is k.
- |> If you proceed to train the network to recognize the word "goodbye"
- |> (by back propagation, or whatever algorithms), and since all the
- |> neurodes are connected in some way (through some interneurons, maybe),
- |> the synaptic weight between X and Y is likely to change from k to some
- |> other number; similarly, the weights in other synapses will change.
- |> Therefore, it is extremely likely that one training session will erase
- |> the efforts of previous sessions.
- |>
- |> My question is, What engineering tricks shall we use to
- |> overcome this apparent difficulty?
- |>
- |> Thanks.
- |>
- |> --
- |> rs69@cunixb.cc.columbia.edu
- |>
- --
-
- What you normally do during training is to present (taking your example) the
- words 'hello' and 'goodbye' alternately. You should not train the net first
- on one word only and then, once it has learned to recognize that, on the other.
- Training is a statistical process which in the end (let's hope) converges
- to a good set of weights: a compromise that recognizes all the patterns in an
- optimal way.
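- 
- To make this concrete, here is a minimal sketch in Python/NumPy (my own toy
- patterns and a single linear layer, not anything from the original posts):
- both words are presented in every epoch, in alternation, so later updates
- cannot simply wipe out what was learned for the other word.
- 
-     import numpy as np
- 
-     # Toy stand-ins for 'hello' and 'goodbye': an input vector and a
-     # one-hot target class for each word (purely illustrative).
-     rng = np.random.default_rng(0)
-     patterns = {
-         "hello":   (np.array([1.0, 0.0, 1.0, 0.0]), np.array([1.0, 0.0])),
-         "goodbye": (np.array([0.0, 1.0, 0.0, 1.0]), np.array([0.0, 1.0])),
-     }
- 
-     W = rng.normal(scale=0.1, size=(2, 4))   # weights of one linear layer
-     lr = 0.1
- 
-     for epoch in range(200):
-         # Present *both* words in every epoch, instead of training to
-         # completion on one and only then on the other.
-         for word, (x, t) in patterns.items():
-             y = W @ x                        # network output
-             err = y - t
-             W -= lr * np.outer(err, x)       # delta-rule / gradient step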
- The engineering trick is mostly the so-called 'gradient descent' (as used in
- backprop). It always moves your current weight vector in a direction which
- decreases the network's error measure.
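- 
- As a sketch of that step on its own (an assumed toy quadratic error, not a
- full backprop implementation): at each iteration the weight vector is moved
- against the gradient of the error, so the error decreases for a small enough
- step size.
- 
-     import numpy as np
- 
-     target = np.array([1.0, -2.0])        # toy error minimum
- 
-     def error(w):                         # E(w), the error measure
-         return np.sum((w - target) ** 2)
- 
-     def grad(w):                          # dE/dw
-         return 2.0 * (w - target)
- 
-     w = np.array([5.0, 5.0])              # current weight vector
-     eta = 0.1                             # learning rate
-     for step in range(50):
-         w = w - eta * grad(w)             # move against the gradient
- 
-     print(error(w))                       # close to zero after descent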
-
- Hope this helps a bit
-
- Frank
-
- ===============================================================================
- Frank Block
- Div. PPE
- European Laboratory for Particle Physics - CERN
- CH-1211 Geneve 23 BLOCKF@cernvm.cern.ch
- Switzerland BLOCKF@vxcern.cern.ch
- ===============================================================================
-