home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.ai
- Path: sparky!uunet!haven.umd.edu!darwin.sura.net!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!usenet.ins.cwru.edu!agate!apple!mumbo.apple.com!gallant.apple.com!kip2-11.apple.com!user
- From: PAULSON3@Applelink.apple.com (munch)
- Subject: Re: Games and genetic algorithms
- Sender: news@gallant.apple.com
- Message-ID: <PAULSON3-201192153909@kip2-11.apple.com>
- Date: Sat, 21 Nov 1992 00:51:52 GMT
- References: <1992Nov11.001553.12600@samba.oit.unc.edu> <1992Nov11.131739.19137@athena.mit.edu> <1992Nov16.175215.29411@versyss.com> <1992Nov18.210045.19530@Princeton.EDU>
- Organization: Apple Computer, Inc.
- Followup-To: comp.ai,comp.ai.genetic,rec.games.programmer
- Lines: 114
-
- In article <1992Nov18.210045.19530@Princeton.EDU>,
- >acoggins@phoenix.Princeton.EDU (Adam Cody Coggins) wrote:
- > > In article <1992Nov16.175215.29411@versyss.com> keithd@versyss.UUCP (Keith Doyle) writes:
- > > >In article <1992Nov11.131739.19137@athena.mit.edu> acgoldis@athena.mit.edu (Andrew C Goldish) writes:
-
- [...]
-
-
- wrt automatically creating optimal strategies for game play via genetic
- algorithms:
-
- For several years I've had as a background challenge developing an
- algorithm which will create the optimal strategy for the simple two
- player poker game described in _Theory of Games and Economic Behavior_
- by von Neumann and MorgenStern. In _Theory_, they describe a discrete
- form of the game and then evolve it into a continuous form. They provide
- an analytic solution to the continuous form.
-
- My version of the game deals from a 14 card deck. The cards have values
- from 1 .. 7. Ties are possible. The game is played by dealing each player
- a card. There is a single betting round where a player wagers either "low"
- number of chips (in my case, 5) or wagers a "high" number of chips (15).
- The players simultaneously declare if they are betting high or if
- they are betting low. If both players bet high or both bet low, the cards
- are revealed and the player with the higher card wins high/low number of
- chips from the other player. If the players tie, no chips are exchanged.
- If one player bets high and the other player bets low, there is no
- showdown.
- The player betting high receives the "low" number of chips from the player
- betting low.
-
- Strategy is represented by a 7 element vector. Each element is the
- probability
- of the player betting high if the player is dealt the corresponding card.
-
- My last attempts used genetic algorithms. Working off the fact that an
- optimal strategy in the game-theoretic sense is one which is no worse than
- any other strategy, I create two populations of players, evolve optimal
- players in one pool and collect survivors in the other. By collecting
- locally optimal survivors I am able to provide the evolving pool with a
- set of tougher opponents.
-
- Computation of the fitness function of an individual strategy versus any
- other strategy is done with two nested loops and two conditionals in the
- heart of the loops. No randomness is involved. An individual is better
- than an other if the fitness is greater than 0. Computation of the fitness
- of an individual in the evolving pool versus the survivor pool is done by
- computing the fitness of the individual against all strategies in the
- survivor pool. To satisfy the "no worse than any other strategy"
- constraint,
- I add a massive penalty (-100000) to the cumulative fitness of an
- individual
- whenever a fitness value is negative. (To avoid any potential problems
- with
- floating point roundoff and to allow otherwise good individuals to avoid
- penalties, I penalize if below -5 instead of 0).
-
- To summarize the technique:
-
- // Initialize to random strategies
- evolvers <- random strategies
- survivors <- random strategies
-
- // Evolve
- repeat
- repeat
- // Clear fitness of evolvers
- for e in evolvers
- fitness of e <- 0
-
- // Compute fitness values
- for e in evolvers
- for s in survivors
- value <- fitness(e, s)
- if value is sufficiently negative
- value <- value - penalty
- fitness of e <- fitness of e + value
-
- // Apply genetic algorithm to evolvers
- evolvers <- ga(evolvers)
- until evolvers is opimized
-
- // Evolvers has been optimized against survivors.
- // Replace least fit survivor with optimized evolver
- for s in survivors
- fitness of s <- 0
-
- // Compute fitness values
- for s1 in survivors
- for s2 in survivors
- value <- fitness(s1, s2)
- if value is sufficiently negative
- value <- value - penalty
- fitness of s1 <- fitness of s1 + value
-
- // Replace least fit s in survivors with best from evolvers
- replace(survivors, best of evolvers)
- until survivors are only optimal strategies
-
-
- My experiences have been disappointing. Viewing the results, we discover
- that strategies rapidly evolve to always bet high with the best hand. We
- then see that the strategies for individual cards are then optimized, but
- as time goes by, those earlier optimizations are discarded. We then wedge,
- with the pool of evolvers unable to produce a strategy no worse than any in
- the pool of survivors. This is not correct, there is an optimal strategy
- for this game.
-
- I hope that some of you might be able to use this experience and, perhaps,
- to show me the errors of my ways.
-
-
- munch
- paulson3@applelink.apple.com
-