NetNews Usenet Archive 1992 #27

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #27 / NN_1992_27.iso / spool / comp / ai / 4359 < prev next >

Wrap

Text File | 1992-11-20 | 5.2 KB | 127 lines

Newsgroups: comp.ai Path: sparky!uunet!haven.umd.edu!darwin.sura.net!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!apple!mumbo.apple.com!gallant.apple.com!kip2-11.apple.com!user From: PAULSON3@Applelink.apple.com (munch) Subject: Re: Games and genetic algorithms Sender: news@gallant.apple.com Message-ID: <PAULSON3-201192153909@kip2-11.apple.com> Date: Sat, 21 Nov 1992 00:51:52 GMT References: <1992Nov11.001553.12600@samba.oit.unc.edu> <1992Nov11.131739.19137@athena.mit.edu> <1992Nov16.175215.29411@versyss.com> <1992Nov18.210045.19530@Princeton.EDU> Organization: Apple Computer, Inc. Followup-To: comp.ai,comp.ai.genetic,rec.games.programmer Lines: 114 In article <1992Nov18.210045.19530@Princeton.EDU>, >acoggins@phoenix.Princeton.EDU (Adam Cody Coggins) wrote: > > In article <1992Nov16.175215.29411@versyss.com> keithd@versyss.UUCP (Keith Doyle) writes: > > >In article <1992Nov11.131739.19137@athena.mit.edu> acgoldis@athena.mit.edu (Andrew C Goldish) writes: [...] wrt automatically creating optimal strategies for game play via genetic algorithms: For several years I've had as a background challenge developing an algorithm which will create the optimal strategy for the simple two player poker game described in _Theory of Games and Economic Behavior_ by von Neumann and MorgenStern. In _Theory_, they describe a discrete form of the game and then evolve it into a continuous form. They provide an analytic solution to the continuous form. My version of the game deals from a 14 card deck. The cards have values from 1 .. 7. Ties are possible. The game is played by dealing each player a card. There is a single betting round where a player wagers either "low" number of chips (in my case, 5) or wagers a "high" number of chips (15). The players simultaneously declare if they are betting high or if they are betting low. If both players bet high or both bet low, the cards are revealed and the player with the higher card wins high/low number of chips from the other player. If the players tie, no chips are exchanged. If one player bets high and the other player bets low, there is no showdown. The player betting high receives the "low" number of chips from the player betting low. Strategy is represented by a 7 element vector. Each element is the probability of the player betting high if the player is dealt the corresponding card. My last attempts used genetic algorithms. Working off the fact that an optimal strategy in the game-theoretic sense is one which is no worse than any other strategy, I create two populations of players, evolve optimal players in one pool and collect survivors in the other. By collecting locally optimal survivors I am able to provide the evolving pool with a set of tougher opponents. Computation of the fitness function of an individual strategy versus any other strategy is done with two nested loops and two conditionals in the heart of the loops. No randomness is involved. An individual is better than an other if the fitness is greater than 0. Computation of the fitness of an individual in the evolving pool versus the survivor pool is done by computing the fitness of the individual against all strategies in the survivor pool. To satisfy the "no worse than any other strategy" constraint, I add a massive penalty (-100000) to the cumulative fitness of an individual whenever a fitness value is negative. (To avoid any potential problems with floating point roundoff and to allow otherwise good individuals to avoid penalties, I penalize if below -5 instead of 0). To summarize the technique: // Initialize to random strategies evolvers <- random strategies survivors <- random strategies // Evolve repeat repeat // Clear fitness of evolvers for e in evolvers fitness of e <- 0 // Compute fitness values for e in evolvers for s in survivors value <- fitness(e, s) if value is sufficiently negative value <- value - penalty fitness of e <- fitness of e + value // Apply genetic algorithm to evolvers evolvers <- ga(evolvers) until evolvers is opimized // Evolvers has been optimized against survivors. // Replace least fit survivor with optimized evolver for s in survivors fitness of s <- 0 // Compute fitness values for s1 in survivors for s2 in survivors value <- fitness(s1, s2) if value is sufficiently negative value <- value - penalty fitness of s1 <- fitness of s1 + value // Replace least fit s in survivors with best from evolvers replace(survivors, best of evolvers) until survivors are only optimal strategies My experiences have been disappointing. Viewing the results, we discover that strategies rapidly evolve to always bet high with the best hand. We then see that the strategies for individual cards are then optimized, but as time goes by, those earlier optimizations are discarded. We then wedge, with the pool of evolvers unable to produce a strategy no worse than any in the pool of survivors. This is not correct, there is an optimal strategy for this game. I hope that some of you might be able to use this experience and, perhaps, to show me the errors of my ways. munch paulson3@applelink.apple.com