NetNews Usenet Archive 1992 #31

Internet Message Format | 1992-12-22 | 2.6 KB

Path: sparky!uunet!pipex!bnr.co.uk!uknet!cam-eng!dsl!ttj10
From: ttj10@eng.cam.ac.uk (Tim Jervis)
Newsgroups: sci.engr.control
Subject: Technical report: real pole balancing
Message-ID: <TTJ10.92Dec22123522@dsl.eng.cam.ac.uk>
Date: 22 Dec 92 17:35:22 GMT
Sender: ttj10@eng.cam.ac.uk (T.T. Jervis)
Organization: Engineering Department, Cambridge University, England.
Lines: 61
Nntp-Posting-Host: dsl.eng.cam.ac.uk

The following technical report is available via the Cambridge
University ftp archive svr-ftp.eng.cam.ac.uk. Instructions for
retrieval from the archive follow the summary.

------------------------------------------------------------------------------

        Pole Balancing on a Real Rig using a
        Reinforcement Learning Controller

        Timothy Jervis and Frank Fallside

        Cambridge University Engineering Department
        Cambridge CB2 1PZ, England

Abstract

In 1983, Barto, Sutton and Anderson~\cite{Barto83} published details
of an adaptive controller which learnt to balance a simulated inverted
pendulum. This {\em reinforcement learning} controller balanced the
pendulum as a by-product of avoiding a cost signal delivered to the
controller when the pendulum fell over. This paper describes their
controller learning to balance a real inverted pendulum. As far as the
authors are aware, this is the first example of a reinforcement
learning controller being applied to a real inverted pendulum learning
in real time. The results show that the controller was able to improve
its performance as it learnt, and that the task is computationally
tractable. However, the implementation was not straightforward.
Although some of the controller's parameters were tuned automatically
by learning, some were not and had to be carefully set for successful
control. This limits the usefulness of this kind of learning
controller to small problems which are likely to be better controlled
by other means. Before a learning controller can tackle more difficult
problems, a more powerful learning scheme has to be found.

------------------------------------------------------------------------------

FTP INSTRUCTIONS

    unix> ftp svr-ftp.eng.cam.ac.uk
    Name: anonymous
    Password: (your_userid@your_site)
    ftp> cd reports
    ftp> binary
    ftp> get jervis_tr115.ps.Z
    ftp> quit
    unix> uncompress jervis_tr115.ps.Z

If "ftp svr-ftp.eng.cam.ac.uk" does not work, you might try
"ftp 129.169.24.20".

--
Cambridge University
Engineering Department,
Trumpington Street, Cambridge CB2 1PZ, England
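
The interactive session under FTP INSTRUCTIONS could also be scripted. The following is a minimal sketch, not part of the original post, using Python's standard `ftplib`. The function name `fetch_report` and its parameters are hypothetical; the host, directory and file name are the ones given in the post, and the server may no longer be reachable.

```python
from ftplib import FTP

# Values taken from the post; the post also gives 129.169.24.20 as a fallback address.
HOST = "svr-ftp.eng.cam.ac.uk"
REMOTE_DIR = "reports"
FILENAME = "jervis_tr115.ps.Z"


def fetch_report(ftp, local_path=FILENAME, email="your_userid@your_site"):
    """Replay the interactive session on an already-connected FTP object.

    `fetch_report` is a hypothetical helper written for illustration.
    """
    ftp.login("anonymous", email)   # Name: anonymous / Password: your e-mail address
    ftp.cwd(REMOTE_DIR)             # ftp> cd reports
    with open(local_path, "wb") as out:
        # retrbinary transfers in binary mode, matching "ftp> binary" then "get"
        ftp.retrbinary("RETR " + FILENAME, out.write)
    ftp.quit()                      # ftp> quit
    return local_path


# Usage (needs network access and a live server):
#     fetch_report(FTP(HOST))
```

The downloaded file is still in Unix `compress` format, so the final `uncompress jervis_tr115.ps.Z` step (or `gzip -d`, which also reads `.Z` files) has to be run separately; the Python standard library has no decoder for that format.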