home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!paladin.american.edu!gatech!rpi!usc!news.service.uci.edu!ucivax!pazzani
- From: pazzani@pan.ics.uci.edu (Michael Pazzani)
- Subject: Re: speed freak buys farm?
- Nntp-Posting-Host: pan.ics.uci.edu
- Message-ID: <2B3A7397.27344@ics.uci.edu>
- Newsgroups: comp.lang.lisp.mcl
- Reply-To: pazzani@ics.uci.edu (Michael Pazzani)
- Organization: Univ. of Calif., Irvine, Info. & Computer Sci. Dept.
- Lines: 81
- Date: 25 Dec 92 02:36:07 GMT
- References: <725148821.5417973@AppleLink.Apple.COM>
-
-
- Howard-
- Others have asked the question "Are you sure you running the right
- algorithm?" Although Koza has implemented his genetic programming
- approach efficiently, I'm not convinced that genetic programming
- is the best solution for your task.
-
- I've read most of his papers, and looked at his book, (and seen the
- videotape), and I agree that it's a very interesting research
- paradigm, and has a lot of potential. It overcomes a limitation of
- typical genetic algorithms in that it works on trees rather than
- bit-strings. The great thing about genetic programming, is its
- generality. it can solve a wide variety of problems, from learning
- classification rules, curve fitting, robot control, game playing etc.
- But I don't think you really care about this generality. You have a
- specific problem in mind:
- >The particular problem that I am battling with evaluates a 1000 deep stack
- >filter of window 7 across a test series of 600 data points
-
- There's not enough information here to tell exactly what your task
- is, but there may be a special purpose algorithm (function optimization?),
- that can be used instead of the general purpose one. For an area,
- I know more about (learning classification rules), there are algorithms
- (e.g., decision trees) that run thousands of times faster than genetic
- programming, and produce more accurate rules. I would imagine there
- might be techniques (e.g., gradient descent, simulated annealing, etc.)
- that might also be applied to your problem.
-
- In a previous message, you wrote:
- >Ultimately, you will end up with the 'fittest' (optimal) set of factors.
-
- Actually, you'll learn one of a large family of functions that performs
- perfectly on the fitness (or training) data. However, like all
- induction algorithms, there are no guarantees about how it will
- perform on unseen data drawn from the same population. A potential
- problem with genetic programming is that the function learned is
- typically quite complex. There are reasons to believe (the minimum
- description or Occam's razor), that simpler functions will be more
- accurate than complex ones on unseen data. For example, look at the
- spiral example in Koza's book. Although the learned function correctly
- distinguishes which of two spirals each point lies on, the resulting
- function (illustrated in gray in the figures) doesn't really look like
- two interleaved spirals.
-
- Koza rarely discusses how functions created by genetic programming
- work on unseen data. One excuse is that it would be difficult. To do
- this well would require running genetic programming a number of times
- on different subsets of the data, and measuring the predictive
- accuracy on unseen data. With repeated trials, you could find the
- average accuracy, and compare it to alternative approaches. When each
- trial takes a week on Quadra, I can understand why this isn't done
- often.
-
- By the way, it might be the case that a slight modification of genetic
- programming would generalize better on unseen data. The fitness
- function typically used in this approach computes something (mean
- squared error, accuracy, etc) on the a sample of the data. A fitness
- function that in addition penalized complex functions (e.g., by
- subtracting the number of cons cells in the s-exp of the program)
- might perform better. I'll let you know in a few months.
-
- Finally, if you are going to use genetic programming, you really
- aren't restricted to lisp. It is very simple to write an interpreter
- in any language for the subset of lisp that genetic programming
- program operates on. All you need is constants and function
- application. In fact, that's what the "fast-eval" function does. For
- example, you could write a C program that mutates and breeds
- s-expressions, write the "terminal" functions in C, and write an
- interpreter to "eval" them.
-
-
- Mike
-
- P.S. The above messages isn't meant as a critcism of Koza. He's
- clearly made a significant research contribution. However, until more
- experience is gained with genetic programming, and it is compared to
- alternative approaches in AI, (and statistics), I wouldn't recommend
- purchasing any hardware to run it on a specific application. In case
- anyone from NSF or DARPA is reading, I'd strongly recommend buying
- Koza a connection machine or a farm of Quadra so that he can continue
- research in this area.
-