NetNews Usenet Archive 1992 #31

Text File | 1992-12-28 | 2.0 KB | 40 lines

Newsgroups: comp.ai.neural-nets
Path: sparky!uunet!cs.utexas.edu!sun-barr!ames!agate!spool.mu.edu!yale.edu!jvnc.net!news.edu.tw!news!Net.nthu.edu.tw!News.nthu.edu.tw!dr788307
From: dr788307@cs.nthu.edu.tw (dr78)
Subject: Neural Reinforcement Learning?
Message-ID: <1992Dec28.142832.16169%dr788307@cs.nthu.edu.tw>
Sender: news@News.nthu.edu.tw (Net News)
Organization: National Tsing Hua University (HsinChu)
Date: Mon, 28 Dec 1992 14:28:32 GMT
Lines: 29

Sutton's Adaptive Heuristic Critic (AHC) and Watkins's Q-learning are two
popular reinforcement learning mechanisms. Their original designs store
utilities in a look-up table. To predict (interpolate) the utilities of
unvisited states and to compress the utility storage, many papers have
tried neural implementations and reported progress.

However, this introduces new problems:
(1) Updating the utility of one state may cause a large, undesired change
    in the utilities of other states.
(2) The training set seems to be more or less self-conflicting, since many
    different utilities are assigned to the same state during the updating
    process.

If we just use on-line backpropagation to train an MLP (feeding in pairs of
a state and its new utility), problems (1) and (2) both occur, and (2)
aggravates (1). If we use batch backpropagation, table storage becomes
necessary to keep a single utility for each state (to avoid problem (2)).
Obviously, this kind of neural implementation is guided by symbolic
knowledge and defeats the original purpose of saving memory.

To date, the reported results seem to be limited to small applications, so
batch backpropagation seems to be the approach those researchers chose.
Am I right? But this seems to be a bad neural implementation, since the
training set (stored in a table) is guided by symbolic knowledge.

Any comments will be highly appreciated!!

-------------------------------------------------------------------------------
Hown-Wen Chen        e-mail: dr788307@cs.nthu.edu.tw
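
Problem (1) in the post can be illustrated with a small sketch. It is only an illustration under assumptions that are not in the post: five states encoded as a single scaled input, a hypothetical MLP with 8 tanh hidden units and a linear output, a learning rate of 0.05, and a fixed random seed. All function names (`make_mlp`, `predict`, `online_update`, `encode`) are made up for this example. It compares how far the utilities of the *other* states move when one state's utility is updated in a look-up table versus by one on-line backpropagation step.

```python
import math
import random


def make_mlp(n_hidden, rng):
    """Weights for a 1-input, n_hidden tanh units, 1 linear output MLP."""
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def predict(net, x):
    """Return (predicted utility, hidden activations) for input x."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    y = sum(v * h for v, h in zip(net["w2"], hidden)) + net["b2"]
    return y, hidden


def online_update(net, x, target, lr=0.05):
    """One on-line backpropagation step on a single (state, utility) pair."""
    y, hidden = predict(net, x)
    err = target - y
    for i, h in enumerate(hidden):
        # Error signal at hidden unit i, computed with w2 before it changes.
        grad_h = err * net["w2"][i] * (1.0 - h * h)
        net["w2"][i] += lr * err * h
        net["w1"][i] += lr * grad_h * x
        net["b1"][i] += lr * grad_h
    net["b2"] += lr * err


def encode(state):
    """Assumed encoding: map state index 0..4 to an input in [0, 1]."""
    return state / 4.0


states = [0, 1, 2, 3, 4]

# Look-up table: assigning a new utility to state 2 touches only that entry.
table = [0.0] * len(states)
table_before = list(table)
table[2] = 1.0
table_drift = [abs(a - b) for a, b in zip(table, table_before)]

# MLP: one on-line step toward the same target shifts shared weights,
# so the predicted utilities of the other states move as well.
net = make_mlp(8, random.Random(0))
before = [predict(net, encode(s))[0] for s in states]
online_update(net, encode(2), 1.0)
after = [predict(net, encode(s))[0] for s in states]
mlp_drift = [abs(a - b) for a, b in zip(after, before)]

print("table drift per state:", table_drift)
print("MLP drift per state:  ", mlp_drift)
```

In the table case only the entry for state 2 changes. In the MLP case every state's prediction moves, because all states share the same weights. Problem (2) would show up in the same setting if `online_update` were called repeatedly for one state with different targets, as happens while utilities are still being revised.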