NetNews Usenet Archive 1992 #31

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #31 / NN_1992_31.iso / spool / comp / lang / lisp / mcl / 1891 < prev next >

Wrap

Text File | 1992-12-24 | 4.9 KB | 100 lines

Newsgroups: comp.lang.lisp.mcl Path: sparky!uunet!spool.mu.edu!agate!boulder!cambridge.apple.com!news.cambridge.apple.com!news.media.mit.edu!enterpoop.mit.edu!usc!news.service.uci.edu!ucivax!pazzani From: pazzani@pan.ics.uci.edu (Michael Pazzani) Subject: Re: speed freak buys farm? Message-ID: <725254225.24113@news.Colorado.EDU> Lines: 81 Sender: news Nntp-Posting-Host: pan.ics.uci.edu Reply-To: pazzani@ics.uci.edu (Michael Pazzani) Organization: Univ. of Calif., Irvine, Info. & Computer Sci. Dept. References: <725148821.5417973@AppleLink.Apple.COM> Distribution: co Date: 25 Dec 92 02:36:07 GMT Approved: news X-Note1: mail msgid was <2B3A7397.27344@ics.uci.edu> X-Note2: message-id generated by recnews Lines: 81 Howard- Others have asked the question "Are you sure you running the right algorithm?" Although Koza has implemented his genetic programming approach efficiently, I'm not convinced that genetic programming is the best solution for your task. I've read most of his papers, and looked at his book, (and seen the videotape), and I agree that it's a very interesting research paradigm, and has a lot of potential. It overcomes a limitation of typical genetic algorithms in that it works on trees rather than bit-strings. The great thing about genetic programming, is its generality. it can solve a wide variety of problems, from learning classification rules, curve fitting, robot control, game playing etc. But I don't think you really care about this generality. You have a specific problem in mind: >The particular problem that I am battling with evaluates a 1000 deep stack >filter of window 7 across a test series of 600 data points There's not enough information here to tell exactly what your task is, but there may be a special purpose algorithm (function optimization?), that can be used instead of the general purpose one. For an area, I know more about (learning classification rules), there are algorithms (e.g., decision trees) that run thousands of times faster than genetic programming, and produce more accurate rules. I would imagine there might be techniques (e.g., gradient descent, simulated annealing, etc.) that might also be applied to your problem. In a previous message, you wrote: >Ultimately, you will end up with the 'fittest' (optimal) set of factors. Actually, you'll learn one of a large family of functions that performs perfectly on the fitness (or training) data. However, like all induction algorithms, there are no guarantees about how it will perform on unseen data drawn from the same population. A potential problem with genetic programming is that the function learned is typically quite complex. There are reasons to believe (the minimum description or Occam's razor), that simpler functions will be more accurate than complex ones on unseen data. For example, look at the spiral example in Koza's book. Although the learned function correctly distinguishes which of two spirals each point lies on, the resulting function (illustrated in gray in the figures) doesn't really look like two interleaved spirals. Koza rarely discusses how functions created by genetic programming work on unseen data. One excuse is that it would be difficult. To do this well would require running genetic programming a number of times on different subsets of the data, and measuring the predictive accuracy on unseen data. With repeated trials, you could find the average accuracy, and compare it to alternative approaches. When each trial takes a week on Quadra, I can understand why this isn't done often. By the way, it might be the case that a slight modification of genetic programming would generalize better on unseen data. The fitness function typically used in this approach computes something (mean squared error, accuracy, etc) on the a sample of the data. A fitness function that in addition penalized complex functions (e.g., by subtracting the number of cons cells in the s-exp of the program) might perform better. I'll let you know in a few months. Finally, if you are going to use genetic programming, you really aren't restricted to lisp. It is very simple to write an interpreter in any language for the subset of lisp that genetic programming program operates on. All you need is constants and function application. In fact, that's what the "fast-eval" function does. For example, you could write a C program that mutates and breeds s-expressions, write the "terminal" functions in C, and write an interpreter to "eval" them. Mike P.S. The above messages isn't meant as a critcism of Koza. He's clearly made a significant research contribution. However, until more experience is gained with genetic programming, and it is compared to alternative approaches in AI, (and statistics), I wouldn't recommend purchasing any hardware to run it on a specific application. In case anyone from NSF or DARPA is reading, I'd strongly recommend buying Koza a connection machine or a farm of Quadra so that he can continue research in this area.