-
- 6) LISTING OF DISTRIBUTION FILES
-
- The distribution includes the following files:
-
- README.T1 & README.T2 - this file, on line documentation
-
- arg.c - the main module for the assembler/disassembler, written by Tom Uffner.
- This program converts ascii assembler files into binary files which can
- be executed by the Tierran virtual computer
-
- arg.prj - the Turbo C V 2.0 project file for compiling the
- assembler/disassembler
-
- arginst.h - a file containing a structure used by arg.c to map assembler
- mnemonics to executable opcodes.
-
- bookeep.c - source code for bookkeeping routines, which keep track of how many
- of what kind of creatures are in the soup, and other stuff like that.
-
- ccarg - a file for compiling the assembler/disassembler on unix systems.
- This file should be made executable (chmod +x ccarg).
-
- cctierra - a file for compiling Tierra on unix systems.
- This file should be made executable (chmod +x cctierra).
-
- configur.h - a file for configuring Tierra. You probably won't need to
- touch this unless you get into advanced stuff.
-
- debug.h - this file claims to provide some useful debugging stuff, I don't
- know, I didn't create it.
-
- declare.h - all global variables are declared in this file, except those
- whose values are set by soup_in. Those globals are declared in soup_in.h.
- declare.h is included by tierra.c which contains the main function.
-
- depend - a listing of interdependencies of the source code files
-
- extern.h - all global variables are declared as extern in this file, and this
- file is included by all *.c files except tierra.c which includes
- declare.h instead.
-
- extract.c - functions for extracting creatures from the soup and saving their
- genomes to disk.
-
- frontend.c - functions for handling input/output for Tierra. Hopefully this
- module will grow in the near future as we put a better interface on
- Tierra.
-
- genebank.c - functions for managing the genebank. This module has benefited
- from a lot of work by Tom Uffner.
-
- genio.c - functions for input/output of creatures. This stuff is also used
- by arg.c, the assembler/disassembler. This module has benefited from
- a lot of work by Tom Uffner.
-
- instruct.c - this module contains generalized executable functions. These
- generalized functions are mapped to specific functions by the parsing
- functions in the parse.c module.
-
- memalloc.c - functions for handling memory allocation in the soup, the stuff
- that ``cell membranes'' are made of.
-
- parse.c - the parsing functions interpret the executable code of the creatures,
- and map it onto the executable functions contained in the instruct.c
- module.
-
- portable.c - functions for portability between operating systems.
-
- portable.h - definitions for portability between operating systems and
- architectures.
-
- prototyp.h - all functions in Tierra are prototyped here.
-
- queues.c - queue management functions for the slicer and reaper queues.
-
- slicers.c - interchangeable slicer functions. This file contains some
- experiments in the allocation of cpu time to creatures. This is
- an interesting thing to play with.
-
- soup_in - the ascii file read by Tierra on startup, which contains all
- the global parameters that determine the environment, and a list of
- creatures to use in inoculating the soup at the start of a run.
-
- soup_in.h - this file defines the default values of all the soup_in variables,
- and defines the instruction set by mapping the assembler mnemonics to the
- opcodes, parser functions, and executables.
-
- tierra.c - this file contains the main function, and the central code
- driving the virtual computer.
-
- tierra.h - this file contains all the structure definitions. It is a good
- source of documentation for anyone trying to understand the code.
-
- tierra.prj - the Turbo C V. 2.0 project file for compiling Tierra.
-
- trand.c - random number generation routines from Numerical Recipes in C.
-
- tsetup.c - routines called when Tierra starts up and comes down. Tom Uffner
- has been putting some work into this module as well.
-
- geneban1: - a subdirectory containing the genomes of the creatures saved
- during a run.
-
- 0080aaa.tie - the ancestor, written by a human, mother of all other creatures.
-
- 0022abn.tie - the smallest non-parasitic self-replicating creature to evolve.
-
- 0045aaa.tie - the archetypal parasite
-
- 0072etq.tie - a phenomenal example of optimization through evolution,
- involving the unrolling of the copy loop.
-
- list - a list of genotypes in the genebank, which will be read by Tierra
- at startup. All genotypes listed in soup_in must also be listed in
- this file. This file will be written to when the system is saved,
- so to start a fresh run you must start with a fresh copy of the
- list file. For this purpose we provide the two files below, list80
- and list4580, which allow you to make a fresh start either with the
- genome 0080aaa alone, or with 0080aaa and 0045aaa together.
-
- list4580 - a fresh list file for starting runs with the genotypes 0080aaa and
- 0045aaa together. To use this file just copy it to a file named list.
-
- list80 - a fresh list file for starting runs with the genotype 0080aaa.
- To use this file just copy it to a file named list.
-
- tiedat: - a subdirectory where a complete record of births and deaths will
- be written.
-
- break.1 - a file containing a record of births and deaths.
-
- 7) SOUP_IN PARAMETERS
-
- A typical soup_in file looks like the following:
-
- /* begin soup_in file */
-
- tierra core: 6-10-91
-
- alive = 50 how many millions of instructions will we run
- BrkupSiz = 5120 size of output file in K, named break.1, break.2 ...
- CellsSize = 600 initial size of cells array of structures
- debug = 0 0 = off, 1 = on, printf statements for debugging
- DiskOut = 1 output data to disk (1 = on, 0 = off)
- DistFreq = .1 frequency of disturbance, factor of recovery time
- DistProp = .4 proportion of population affected by disturbance
- DivSameGen = 1 cells must produce offspring of same genotype, to stop evolution
- DivSameSiz = 0 cells must produce offspring of same size, to stop size change
- DropDead = 5 stop system if no reproduction in the last x million instructions
- GeneBnker = 1 turn genebanker on and off
- GenebankPath = geneban1/ path for genebanker output
- GenPerBkgMut = 12 mutation rate control by generations ("cosmic ray")
- GenPerFlaw = 16 flaw control by generations
- GenPerMovMut = 8 mutation rate control by generations (copy mutation)
- hangup = 1 0 = exit on error, 1 = hangup on error for debugging
- MaxFreeBlocks = 500 initial number of structures for memory allocation
- MaxMalMult = 3 multiple of cell size allowed for mal()
- MinCellSize = 8 minimum size for cells
- MinTemplSize = 3 minimum size for templates
- MovPropThrDiv = .7 minimum proportion of daughter cell filled by mov
- new_soup = 1 1 = this is a new soup, 0 = restarting an old run
- NumCells = 3 number of creatures and gaps used to inoculate new soup
- OutPath = tiedat/ path for data output
- PhotonPow = 1.5 power for photon match slice size
- PhotonWidth = 8 amount by which photons slide to find best fit
- PhotonWord = chlorophill word used to capture photon
- RamBankSiz = 20000 array size for genotypes in ram, use with genebanker
- SaveFreq = 10 frequency of saving core_out, soup_out and list
- SavThrMem = .015 threshold memory occupancy to save genotype
- SavThrPop = .015 threshold population proportion to save genotype
- SearchLimit = 5
- seed = 0 seed for random number generator, 0 uses time to set seed
- SizDepSlice = 0 set slice size by size of creature
- SlicePow = 1 set power for slice size, use when SizDepSlice = 1
- SliceSize = 25 slice size when SizDepSlice = 0
- SliceStyle = 2 choose style of determining slice size
- SlicFixFrac = 0 fixed fraction of slice size
- SlicRanFrac = 2 random fraction of slice size
- SoupSize = 60000 size of soup in instructions
- WatchExe = 0 mark executed instructions in genome in genebank
- WatchMov = 0 set mov bits in genome in genebank
- WatchTem = 0 set template bits in genome in genebank
-
- 0080aaa
- 0045aaa
- 0080aaa
-
- /* end soup_in file */
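
Each parameter line in the file above has the shape "name = value comment".
As a rough illustration only (this is not the code Tierra itself uses, and
parse_soup_line is a made-up name), such a line could be split with sscanf.
The sketch assumes spaces around the equals sign, as in the sample file, and
buffers of at least 64 bytes:

```c
#include <stdio.h>

/* Hypothetical helper: split one soup_in line of the form
 * "name = value trailing comment" into its name and value.
 * name and value must each point to at least 64 bytes.
 * Returns 1 for a parameter line, 0 for anything else
 * (blank lines, the title line, the genotype list). */
int parse_soup_line(const char *line, char *name, char *value)
{
    /* %63s reads one whitespace-delimited token; whatever follows
     * the value (the comment) is simply ignored. */
    return sscanf(line, "%63s = %63s", name, value) == 2;
}
```

The value comes back as text, so the caller would still have to convert it
(for example with atol or atof) according to the parameter's type.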
-
- The meaning of each of these parameters is explained below:
-
- alive = 50 how many millions of instructions will we run
-
- This tells the simulator how long to run, in millions of instructions.
-
- BrkupSiz = 5120 size of output file in K, named break.1, break.2 ...
-
- If this value is set to zero (0) the record of births and deaths will
- be written to a single file named tierra.run. However, if BrkupSiz has a
- non-zero value, birth and death records will be written to a series of files
- with the names break.1, break.2, etc. Each of these files will have the
- size specified, in K (1024 bytes). The value 5120 indicates that the
- break files will each be five megabytes in size. The output file(s) will
- be in the path specified by OutPath (see below). See also DiskOut.
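
The naming rule and the size arithmetic can be sketched as follows. The
function names are invented for this illustration and are not taken from the
Tierra source:

```c
#include <stdio.h>

/* Hypothetical helper: name of the n-th birth/death record file.
 * BrkupSiz == 0 means one single file, tierra.run; otherwise the
 * files are break.1, break.2, ... inside OutPath. */
void out_file_name(char *buf, size_t len, const char *out_path,
                   long brkup_siz, int n)
{
    if (brkup_siz == 0)
        snprintf(buf, len, "%stierra.run", out_path);
    else
        snprintf(buf, len, "%sbreak.%d", out_path, n);
}

/* BrkupSiz is given in K (1024 bytes). */
long brkup_bytes(long brkup_siz)
{
    return brkup_siz * 1024L;
}
```

With OutPath = tiedat/ and BrkupSiz = 5120, the first file would be
tiedat/break.1 and each file would hold 5120 * 1024 bytes, that is, five
megabytes.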
-
- CellsSize = 600 initial size of cells array of structures
-
- The initial size of the ``cells array'' which contains all the demographic
- data, as well as the CPU of each creature. Due to a bug in the Borland
- Turbo C farrealloc function, care must be taken to be sure that this array
- is initially large enough that it does not need to be reallocated. A good
- rule of thumb is to let CellsSize = SoupSize / 100. If a compiler other than
- Borland is used, don't worry, any initial value will do.
-
- debug = 0 0 = off, 1 = on, printf statements for debugging
-
- This is used during code development, to turn on and off print statements
- for debugging purposes.
-
- DiskOut = 1 output data to disk (1 = on, 0 = off)
-
- If this parameter is set to zero (0), no birth and death records will
- be saved. Any other value will cause birth and death records to be saved
- to a file whose name is discussed under BrkupSiz above, in the path discussed
- under OutPath below.
-
- DistFreq = .1 frequency of disturbance, factor of recovery time
-
- The frequency of disturbance, as a factor of recovery time. This and
- the next option control the pattern of disturbance. If you do not want the
- system to be disturbed, set DistFreq to a negative value. If DistFreq has
- a non-negative value, when the soup fills up the reaper will be invoked to
- kill cells until it has freed a proportion DistProp of the soup. The system
- will then keep track of the time it takes for the creatures to recover from
- the disturbance by filling the soup again. Let's call this recovery time:
- rtime. The next disturbance will occur: (rtime X DistFreq) after recovery
- is complete. Therefore, if DistFreq = 0, each disturbance will occur
- immediately after recovery is complete. If DistFreq = 1, the time between
- disturbances will be twice the recovery time, that is, the soup will remain
- full for a period equal to the recovery time, before another disturbance hits.
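
The timing rule can be written down in a few lines. This is only a sketch of
the arithmetic described above; the function name and the use of -1 to mean
"never" are assumptions made for the example:

```c
/* Hypothetical helper: time of the next disturbance.
 * recovered_at = time at which the soup filled up again
 * rtime        = time the recovery took
 * dist_freq    = the DistFreq parameter
 * Returns -1.0 when DistFreq is negative (no disturbance). */
double next_disturbance(double recovered_at, double rtime, double dist_freq)
{
    if (dist_freq < 0.0)
        return -1.0;
    return recovered_at + rtime * dist_freq;
}
```

With DistFreq = 0 the result equals recovered_at, so the disturbance follows
recovery immediately. With DistFreq = 1 the soup stays full for one more
recovery time before the next disturbance.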
-
- DistProp = .4 proportion of population affected by disturbance
-
- The proportion of the soup that is freed of cells by each disturbance.
- This occurs by invoking the reaper to kill cells until the total amount of
- free memory is greater than or equal to: (DistProp X SoupSize). Note that
- cells are not killed at random, they are killed off the top of the reaper
- queue.
-
- DivSameGen = 0 cells must produce offspring of same genotype, to stop evolution
-
- This causes attempts at cell division to abort if the offspring is of
- a genotype different from the parent. This can be used when the mutation rates
- are set to zero, to prevent sex from causing evolution.
-
- DivSameSiz = 0 cells must produce offspring of same size, to stop size change
-
- Like DivSameGen, but cell division aborts only if the offspring is of
- a different size than the parent. Changes in genotype are not prevented,
- only changes in size are prevented.
-
- DropDead = 5 stop system if no reproduction in the last x million instructions
-
- Sometimes the soup dies, such as when mutation rates are too high.
- This parameter watches the time elapsed since the last cell division, and
- brings the system down if it is greater than DropDead million instructions.
-
- GeneBnker = 1 turn genebanker on and off
-
- This parameter turns the genebanker on and off. The value zero turns
- the genebanker off, any other value turns it on. With the genebanker off,
- the record of births and deaths will contain the sizes of the creatures,
- but not their genotypes. Also no genomes will be saved in the genebank.
- When the genebanker is turned on, the record of births and deaths will
- contain a three letter unique name for each genotype, as well as the size
- of the creatures. Also, any genome whose frequency exceeds the thresholds
- SavThrMem and SavThrPop (see below) will be saved to the genebank, in
- the path indicated by GenebankPath (see below).
-
- GenebankPath = geneban1/ path for genebanker output
-
- This is a string variable which describes the path to the genebank
- where the genomes will be saved. The path name should be terminated by
- a forward slash.
-
- GenPerBkgMut = 12 mutation rate control by generations ("cosmic ray")
-
- Control of the background mutation rate ("cosmic ray"). The value 12
- indicates that in each generation, roughly one in twelve cells will be hit
- by a mutation. These mutations occur completely at random, and also affect
- free space where there are no cells. If the value of GenPerBkgMut were 0.5,
- it would mean that in each generation, each cell would be hit by roughly
- two mutations.
-
- GenPerFlaw = 16 flaw control by generations
-
- Control of the flaw rate. The value 16 means that in each generation,
- roughly one in sixteen individuals will experience a flaw. Flaws cause
- instructions to produce results that are in error by plus or minus one,
- in some sense. If the value of GenPerFlaw were 0.5, it would mean that in
- each generation, each cell would be hit by roughly two flaws.
-
- GenPerMovMut = 8 mutation rate control by generations (copy mutation)
-
- Control of the move mutation rate (copy mutation). The value 8
- indicates that in each generation, roughly one in eight cells will be hit
- by a mutation. These mutations only affect copies of instructions made
- during replication (by the double indirect mov instruction). When an
- instruction is affected by a mutation, one of its five bits is selected
- at random and flipped. If the value of GenPerMovMut were 0.5, it would
- mean that in each generation, each cell would be hit by roughly two mutations.
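
The three GenPer parameters all read the same way: the expected number of
hits per cell per generation is roughly one over the parameter value. The
sketch below shows that arithmetic, together with the bit flip described for
move mutations. Both function names are invented for this example, and the C
library rand() stands in for Tierra's own random number generator:

```c
#include <stdlib.h>

/* Rough expected number of hits per cell per generation:
 * 12 gives about one cell in twelve, 0.5 gives about two per cell. */
double hits_per_cell(double gen_per)
{
    return 1.0 / gen_per;
}

/* Flip one randomly chosen bit of a five-bit instruction.
 * rand() is an assumption; Tierra uses its own generator. */
unsigned char flip_one_bit(unsigned char opcode)
{
    unsigned bit = (unsigned)(rand() % 5);            /* bit 0..4 */
    return (unsigned char)((opcode ^ (1u << bit)) & 0x1fu);
}
```

Since exactly one of the five bits changes, a mutated instruction always
differs from the original and always remains one of the 32 opcodes.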
-
- hangup = 1 0 = exit on error, 1 = hangup on error for debugging
-
- If an error occurs which is serious enough to bring down the system,
- having hangup set to 1 will prevent the program from exiting. In this case,
- the program will hang in a simple loop so that it remains active for
- debugging purposes.
-
- MaxFreeBlocks = 500 initial number of structures for memory allocation
-
- There is an array of structures used for the virtual memory allocator.
- This parameter sets the initial size of the allocated array, at startup.
-
- MaxMalMult = 3 multiple of cell size allowed for mal()
-
- When a cell attempts to allocate a second block of memory (presumably
- to copy its genome into), this parameter is checked. If the amount of memory
- requested is greater than MaxMalMult times the size of the mother cell, the
- request will fail. This prevents mutants from requesting the entire soup,
- which would invoke the reaper to cause a massive kill off.
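
The check amounts to a single comparison. A minimal sketch, with a made-up
function name:

```c
/* Hypothetical check: may a mother of mother_size instructions
 * allocate a block of requested instructions? The request fails
 * only when it is greater than MaxMalMult times the mother size. */
int mal_allowed(long requested, long mother_size, long max_mal_mult)
{
    return requested <= max_mal_mult * mother_size;
}
```

With MaxMalMult = 3, the ancestor (80 instructions) could request up to 240
instructions.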
-
- MinCellSize = 8 minimum size for cells
-
- When a cell attempts to divide, this parameter is checked. If the
- daughter cell would be smaller than MinCellSize instructions, divide will
- fail. The reason this is needed is that with no lower limit, there is a
- tendency for some mutants to spawn large numbers of very small cells.
-
- MinTemplSize = 3 minimum size for templates
-
- When an instruction (like jump) attempts to use a template, this parameter
- is checked. If the actual template is smaller than MinTemplSize instructions,
- the instruction will fail. This is a matter of taste.
-
- MovPropThrDiv = .7 minimum proportion of daughter cell filled by mov
-
- When a cell attempts to divide, this parameter is checked. If the number
- of instructions the mother cell has moved into the daughter cell is less
- than MovPropThrDiv times the mother cell size, cell division will abort.
- A value of .7 means that the mother must fill the daughter at least 70%
- with instructions
- (though all these instructions could have been moved to the same spot in
- the daughter cell). The reason this parameter exists is that without it,
- mutants will attempt to spew out large numbers of empty cells.
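
Taken together, MinCellSize and MovPropThrDiv give two tests that a divide
must pass. The sketch below only restates the two rules as described here;
the function name and argument list are assumptions, not the actual Tierra
code:

```c
/* Hypothetical check run when a mother cell attempts to divide.
 * moved = number of instructions the mother has moved into the
 *         daughter so far.
 * Returns 1 if the divide may proceed, 0 if it must abort. */
int divide_allowed(long daughter_size, long mother_size, long moved,
                   long min_cell_size, double mov_prop_thr_div)
{
    if (daughter_size < min_cell_size)
        return 0;                       /* daughter too small */
    if ((double)moved < mov_prop_thr_div * (double)mother_size)
        return 0;                       /* daughter too empty */
    return 1;
}
```

For the ancestor with the sample settings, a daughter that received only 40
of 80 instructions would be refused, since 40 is below .7 * 80.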
-
- new_soup = 1 1 = this is a new soup, 0 = restarting an old run
-
- This value is checked on startup, to determine if this is a new soup,
- or if this is restarting an old run where it left off. When the system
- comes down, all soup_in parameters (and many other global variables) are
- saved in a file called soup_out. The value of new_soup is set to 0 in
- soup_out. In order to restart an old run, just use soup_out as the input
- file rather than soup_in. This is done by using soup_out as a command line
- parameter at startup: tierra soup_out
-
- NumCells = 5 number of creatures and gaps used to inoculate new soup
-
- This parameter is checked at startup, and the system will look for a
- list of NumCells creatures at the end of the soup_in file. The value 5
- indicates that the soup will initially be inoculated by five cells.
- However, NumCells also counts gaps that are placed between cells (without
- gaps, all cells are packed together at the bottom of the soup at startup).
- The gap control feature does not work at present, so don't use it. Notice
- that after the list of parameters in the soup_in file, there is a blank
- line, followed by a list of genotypes. The system will read the first
- NumCells genotypes from the list, and place them in the soup in the same
- order that they occur in the list.
-
- OutPath = tiedat/ path for data output
-
- The record of births and deaths will be written to files in a directory
- specified by OutPath. See BrkupSiz above for a discussion of the name of
- the file(s) containing the birth and death records.
-
- PhotonPow = 1.5 power for photon match slice size
-
- If SliceStyle (see below) is set to the value 1, then the allocation
- of CPU cycles to creatures is based on a photon - chlorophyll metaphor.
- Imagine that photons are raining down on the soup at random. The cell hit
- by the photon gets a time slice that is proportional to the goodness of fit
- between the pattern of instructions that are hit, and an arbitrary pattern
- (defined by PhotonWord, see below).
-
- The template of instructions defined by PhotonWord is laid over the
- sequence of instructions at the site hit by the photon. The number of
- instructions that match between the two is used to determine the slice
- size. However, the number of matching instructions is raised to the power
- PhotonPow, to calculate the slice size.
-
- PhotonWidth = 8 amount by which photons slide to find best fit
-
- When a photon hits the soup, it slides a distance PhotonWidth, counting
- the number of matching characters at each position, and the slice size will
- be equal to the number of characters in the best match (raised to the power
- PhotonPow, see above). If PhotonWidth equals 8, the center of the template
- will start 4 instructions to the left of the site hit by the photon, and
- slide to 4 instructions to the right of the site hit.
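
The sliding match can be sketched as below. This is an illustration of the
description above, not the SlicerPhoton() code. The function names are made
up, and the sketch assumes that the soup wraps around at its ends and that
the word is centered by integer division of its length:

```c
/* Count positions where word matches the soup when the first
 * instruction of word sits at soup[start] (wrapping assumed). */
static int count_matches(const unsigned char *soup, long soup_size,
                         const unsigned char *word, int word_len,
                         long start)
{
    int i, n = 0;
    for (i = 0; i < word_len; i++) {
        long pos = ((start + i) % soup_size + soup_size) % soup_size;
        if (soup[pos] == word[i])
            n++;
    }
    return n;
}

/* Slide the center of word from hit - width/2 to hit + width/2
 * and return the largest number of matching instructions. */
int best_photon_match(const unsigned char *soup, long soup_size,
                      const unsigned char *word, int word_len,
                      long hit, int photon_width)
{
    int best = 0;
    long off;
    for (off = -(photon_width / 2); off <= photon_width / 2; off++) {
        long start = hit + off - word_len / 2;
        int n = count_matches(soup, soup_size, word, word_len, start);
        if (n > best)
            best = n;
    }
    return best;
}
```

The slice size would then be this best match raised to the power PhotonPow,
for example pow((double)best, PhotonPow) with pow from math.h.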
-
- PhotonWord = chlorophill word used to capture photon
-
- This string determines the arbitrary pattern that absorbs the photon.
- It uses a base 32 numbering system: the digits 0-9 followed by the characters
- a-v. The characters w, x, y and z are not allowed (that is why chlorophyll
- is misspelled). The string may be any length up to 79 characters.
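
The base 32 digits can be decoded with a small function like the one below.
The name photon_digit is invented for this example, and the code assumes a
character set such as ASCII in which the letters a through v are contiguous:

```c
/* Map one PhotonWord character to its base 32 value:
 * '0'-'9' give 0-9 and 'a'-'v' give 10-31.
 * Returns -1 for any character that is not allowed. */
int photon_digit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'v')
        return c - 'a' + 10;
    return -1;
}
```

Every letter of "chlorophill" decodes to a value between 10 and 31, whereas
the y in the correct spelling would be rejected.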
-
- RamBankSiz = 20000 array size for genotypes in ram, use with genebanker
-
- Places an upper limit on the number of genomes that may be stored
- in the genebank maintained in RAM at any one time. This is a memory
- management feature provided for DOS systems. When the RAM genebank
- fills, genomes start swapping out to disk. The genomes that have not
- been checked for the longest time are swapped out first. At this time
- the RAM bank management scheme does not work. For this reason, you should
- be sure that this parameter is set high enough that the bank does not
- fill up during the run.
-
- SaveFreq = 10 frequency of saving core_out, soup_out and list
-
- Every SaveFreq million instructions, the complete state of the
- virtual machine is saved. This is a useful feature for long runs, so that
- the system can be restarted if it is interrupted for some reason.
-
- SavThrMem = .015 threshold memory occupancy to save genotype
-
- If a particular genotype fills SavThrMem of the total space available
- in the soup, it will be assigned a permanent unique name, and saved to disk.
- Note that an adjustment is made because only adult cells are counted, and
- embryos generally fill half the soup. Therefore adult cells of a particular
- genotype need only occupy SavThrMem * 0.5 of the space to be saved.
-
- SavThrPop = .015 threshold population proportion to save genotype
-
- If a particular genotype amounts to SavThrPop of the total population
- of (adult) cells in the soup, it will be assigned a permanent unique name,
- and saved to disk.
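
The two thresholds can be combined as in the sketch below. It assumes that
passing either threshold alone is enough for a genotype to be saved, which
is how the two descriptions above read; the function name is hypothetical:

```c
/* Hypothetical check: should this genotype be named and saved?
 * mem_frac = fraction of the soup occupied by its adult cells
 * pop_frac = fraction of the adult population it represents
 * The factor 0.5 is the adjustment for embryos described
 * under SavThrMem. */
int should_save(double mem_frac, double pop_frac,
                double sav_thr_mem, double sav_thr_pop)
{
    return mem_frac >= sav_thr_mem * 0.5 || pop_frac >= sav_thr_pop;
}
```

With both thresholds at .015, adults occupying 1% of the soup would qualify,
because 0.01 is above .015 * 0.5 = .0075.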
-
- SearchLimit = 5
-
- This parameter controls how far instructions may search to match
- templates. The value five means that search is limited to five times the
- average adult cell size. The actual distance is updated every million
- instructions.
-
- seed = 0 seed for random number generator, 0 uses time to set seed
-
- The seed for the random number generator. If you use the value zero,
- the system clock is used to set the seed. If you use any other value, it
- will be the seed. The starting seed (even when provided by the clock) will
- be written to standard output, and also saved in the soup_out file when the
- simulator comes down. By using the original seed and all the same initial
- parameter settings in soup_in, a run may be repeated exactly.
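
The seed rule is simple enough to show in full. The function name is made up
for this example, and time() from the C library is assumed as the clock:

```c
#include <time.h>

/* Hypothetical helper: zero means "take the seed from the clock",
 * any other value is used as the seed unchanged. */
unsigned long choose_seed(unsigned long seed)
{
    return seed != 0 ? seed : (unsigned long)time(NULL);
}
```

To repeat a run, take the seed that was reported for it and put that value
in soup_in in place of the zero.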
-
- SizDepSlice = 0 set slice size by size of creature
-
- This determines a major slicer option. If this parameter is set to
- zero, the slice size will either be a constant set by SliceSize (see below)
- or a uniform random variate, or a mix of the two. The mix is determined by
- the relative values of SlicFixFrac and SlicRanFrac (see below). The actual
- slice size will be:
-
- (SlicFixFrac * SliceSize) + (tlrand() % (I32s) ((SlicRanFrac * SliceSize) + 1))
-
- If SizDepSlice is set to a non-zero value, the slice size will be
- proportional to the size of the genome. In this case, the base slice size
- will be the genome size raised to the power SlicePow (see below). To clarify
- let slic_siz = genome_size ^ SlicePow, the actual slice size will be:
-
- (SlicFixFrac * slic_siz) + (tlrand() % (I32s) ((SlicRanFrac * slic_siz) + 1))
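
Both formulas have the same shape, so one function can illustrate them. In
this sketch the raw random number is passed in as rnd (Tierra gets it from
tlrand()), and base is either SliceSize or genome_size ^ SlicePow. The
function name is an assumption made for the example:

```c
/* Sketch of the slice size formula.
 * base     = SliceSize, or genome size raised to SlicePow
 * fix_frac = SlicFixFrac, ran_frac = SlicRanFrac
 * rnd      = a non-negative raw random number */
long slice_size(double base, double fix_frac, double ran_frac, long rnd)
{
    long span = (long)(ran_frac * base + 1.0);   /* always >= 1 */
    return (long)(fix_frac * base) + rnd % span;
}
```

With the sample values SliceSize = 25, SlicFixFrac = 0 and SlicRanFrac = 2,
the slice falls anywhere from 0 to 50 instructions. If the random numbers
are uniform, the average is 25, the same as SliceSize.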
-
- SlicePow = 1 set power for slice size, use when SizDepSlice = 1
-
- This parameter is only used when SizDepSlice = 1. In this case, the
- genome size is raised to the power SlicePow to determine the slice size
- (see algorithm under SizDepSlice above). If SlicePow = 1, the run will be
- size neutral, selection will not be biased toward either large or small
- creatures (the probability of an instruction being executed is not dependent
- on the size of the genome it is located in). If SlicePow > 1, selection will
- favor larger genomes. If SlicePow < 1, selection will favor small genomes.
-
- SliceSize = 25 slice size when SizDepSlice = 0
-
- This parameter determines the base slice size when SizDepSlice = 0.
- The actual slice size in this case depends on the values of SlicFixFrac
- and SlicRanFrac (see below). The way the slice size is actually calculated
- is explained under SizDepSlice above.
-
- SliceStyle = 2 choose style of determining slice size
-
- The slicer is a pointer to function, and the function actually used
- is determined by this parameter. At present there are three choices (0-2).
- The pointer to function is assigned in the tsetup.c module, and the slicer
- functions themselves are contained in the slicers.c module.
- 0 = SlicerQueue() - slice sizes without a random component
- 1 = SlicerPhoton() - slice size based on photon interception metaphor
- 2 = RanSlicerQueue() - slice size with a fixed and a random component
-
- SlicFixFrac = 0 fixed fraction of slice size
-
- When SliceStyle = 2, the slice size has a fixed component and a random
- component. This parameter determines the fixed component as a multiple
- of SliceSize, or genome_size ^ SlicePow.
-
- SlicRanFrac = 2 random fraction of slice size
-
- When SliceStyle = 2, the slice size has a fixed component and a random
- component. This parameter determines the random component as a multiple
- of SliceSize, or genome_size ^ SlicePow.
-
- SoupSize = 60000 size of soup in instructions
-
- This variable sets the size of the soup, measured in instructions.
-
- WatchExe = 0 mark executed instructions in genome in genebank
-
- If the genebank is on, setting this parameter to a non-zero value
- will turn on a watch of which instructions are being executed in each
- permanent genotype (this helps to distinguish junk code from code that is
- executed), and also, who is executing whose instructions. There
- is a bit field in struct g_list (bit definitions are defined in the tierra.h
- module) that keeps track of whether a creature executes its own instructions,
- those of another creature, if another creature executes this creature's
- instructions, etc:
-
- bit 2 EXs = executes own instructions (self)
- bit 3 EXd = executes daughter's instructions
- bit 4 EXo = executes other cell's instructions
- bit 5 EXf = executes instructions in free memory
- bit 6 EXh = own instructions are executed by other creature (host)
-
- WatchMov = 0 set mov bits in genome in genebank
-
- If the genebank is on, setting this parameter to a non-zero value
- will turn on a watch of who moves whose instructions and where. This
- information is recorded in the bit field in struct g_list:
-
- bit 17 MFs = moves instruction from self
- bit 18 MFd = moves instruction from daughter
- bit 19 MFo = moves instruction from other cell
- bit 20 MFf = moves instruction from free memory
- bit 21 MFh = own instructions are moved by other creature (host)
- bit 22 MTs = moves instruction to self
- bit 23 MTd = moves instruction to daughter
- bit 24 MTo = moves instruction to other cell
- bit 25 MTf = moves instruction to free memory
- bit 26 MTh = is written on by another creature (host)
- bit 27 MBs = executing other creature's code, moves inst from self
- bit 28 MBd = executing other creature's code, moves inst from daughter
- bit 29 MBo = executing other creature's code, moves inst from other cell
- bit 30 MBf = executing other creature's code, moves inst from free memory
- bit 31 MBh = other creature uses another cpu to move your instructions
-
- WatchTem = 0 set template bits in genome in genebank
-
- If the genebank is on, setting this parameter to a non-zero value
- will turn on a watch of whose templates are matched by whom. This
- information is recorded in the bit field in struct g_list:
-
- bit 7 TCs = matches template complement of self
- bit 8 TCd = matches template complement of daughter
- bit 9 TCo = matches template complement of other
- bit 10 TCf = matches template complement of free memory
- bit 11 TCh = own template complement is matched by other creature (host)
- bit 12 TPs = uses template pattern of self
- bit 13 TPd = uses template pattern of daughter
- bit 14 TPo = uses template pattern of other
- bit 15 TPf = uses template pattern of free memory
- bit 16 TPh = own template pattern is used by other creature (host)
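
Testing one of these watch bits is an ordinary mask operation. The macro
names below are invented for this sketch (the real definitions are in
tierra.h), and it assumes that "bit n" means the bit with value 1 << n,
counting from zero at the low end. Check tierra.h before relying on that
numbering:

```c
/* Hypothetical names; see tierra.h for the real definitions. */
#define WATCH_BIT(n)  (1UL << (n))
#define W_EXs  WATCH_BIT(2)    /* executes own instructions */
#define W_EXh  WATCH_BIT(6)    /* own instructions executed by other */
#define W_TCs  WATCH_BIT(7)    /* matches own template complement */
#define W_MFs  WATCH_BIT(17)   /* moves instruction from self */
#define W_MTd  WATCH_BIT(23)   /* moves instruction to daughter */

/* Returns 1 if the given flag is set in the bit field. */
int watch_has(unsigned long bits, unsigned long flag)
{
    return (bits & flag) != 0;
}
```

A creature that copies itself into its daughter would be expected to show
W_EXs, W_MFs and W_MTd, while a nonzero W_EXh would mark a genotype whose
code is being run by someone else.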
-
- 0080aaa
- 0045aaa
- 0080aaa
- 0045aaa
- 0080aaa
-
- This is the list of cells that will be loaded into the soup when
- the simulator starts up. This example indicates that five cells will
- be loaded at startup, the ancestor 0080aaa alternating with the parasite
- 0045aaa. These cells will be loaded in the bottom of the soup, with no
- space between them. Only NumCells genotypes from the list will actually
- be loaded, so the NumCells parameter should be modified when you change
- the number of genotypes that you wish to have loaded. Also, all genotypes
- to be loaded must also be listed in the file geneban1/list, and all of the
- genotypes must occur in the genebank.
-
- 8) THE ANCESTOR & WRITING A CREATURE
-
- 8.1) The Ancestor
-
- The ASCII assembler code file with comments, for the ancestor, is listed
- below. Explanatory material follows the listing.
-
- **** begin genome file (note blank line at head of file)
-
- format: 1 bits: 45750471 EXsh TCsh TPs MFsofh MTdf MB
- genotype: 0080aaa parent genotype: 0666god
- 1st_daughter: flags: 0 inst: 827 mov_daught: 80 breed_true: 1
- 2nd_daughter: flags: 0 inst: 809 mov_daught: 80 breed_true: 1
- InstExe.m: 0 InstExe.i: 0 origin: 662270168 Wed Dec 26 22:56:08 1990
- MaxPropPop: 0.8306 MaxPropInst: 0.4239
- ploidy: 1 track: 0
-
- track 0: prot
- xwr
- nop_1 ; 010 110 01 0 beginning marker
- nop_1 ; 010 110 01 1 beginning marker
- nop_1 ; 010 110 01 2 beginning marker
- nop_1 ; 010 110 01 3 beginning marker
- zero ; 010 110 04 4 put zero in cx
- or1 ; 010 110 02 5 put 1 in first bit of cx
- shl ; 010 110 03 6 shift left cx (cx = 2)
- shl ; 010 110 03 7 shift left cx (cx = 4)
- mov_cd ; 010 110 18 8 move cx to dx (dx = 4)
- adrb ; 010 110 1c 9 get (backward) address of beginning marker -> ax
- nop_0 ; 010 100 00 10 complement to beginning marker
- nop_0 ; 010 100 00 11 complement to beginning marker
- nop_0 ; 010 100 00 12 complement to beginning marker
- nop_0 ; 010 100 00 13 complement to beginning marker
- sub_ac ; 010 110 07 14 subtract cx from ax, result in ax
- mov_ab ; 010 110 19 15 move ax to bx, bx now contains start address of mother
- adrf ; 010 110 1d 16 get (forward) address of end marker -> ax
- nop_0 ; 010 100 00 17 complement to end marker
- nop_0 ; 010 100 00 18 complement to end marker
- nop_0 ; 010 100 00 19 complement to end marker
- nop_1 ; 010 100 01 20 complement to end marker
- inc_a ; 010 110 08 21 increment ax, to include dummy instruction at end
- sub_ab ; 010 110 06 22 subtract bx from ax to get size, result in cx
- nop_1 ; 010 110 01 23 reproduction loop marker
- nop_1 ; 010 110 01 24 reproduction loop marker
- nop_0 ; 010 110 00 25 reproduction loop marker
- nop_1 ; 010 110 01 26 reproduction loop marker
- mal ; 010 110 1e 27 allocate space (cx) for daughter, address to ax
- call ; 010 110 16 28 call template below (copy procedure)
- nop_0 ; 010 100 00 29 copy procedure complement
- nop_0 ; 010 100 00 30 copy procedure complement
- nop_1 ; 010 100 01 31 copy procedure complement
- nop_1 ; 010 100 01 32 copy procedure complement
- divide ; 010 110 1f 33 create independent daughter cell
- jmp ; 010 110 14 34 jump to template below (reproduction loop)
- nop_0 ; 010 100 00 35 reproduction loop complement
- nop_0 ; 010 100 00 36 reproduction loop complement
- nop_1 ; 010 100 01 37 reproduction loop complement
- nop_0 ; 010 100 00 38 reproduction loop complement
- if_cz ; 010 000 05 39 dummy instruction to separate templates
- nop_1 ; 010 110 01 40 copy procedure template
- nop_1 ; 010 110 01 41 copy procedure template
- nop_0 ; 010 110 00 42 copy procedure template
- nop_0 ; 010 110 00 43 copy procedure template
- push_ax ; 010 110 0c 44 push ax onto stack
- push_bx ; 010 110 0d 45 push bx onto stack
- push_cx ; 010 110 0e 46 push cx onto stack
- nop_1 ; 010 110 01 47 copy loop template
- nop_0 ; 010 110 00 48 copy loop template
- nop_1 ; 010 110 01 49 copy loop template
- nop_0 ; 010 110 00 50 copy loop template
- mov_iab ; 010 110 1a 51 move contents of [bx] to [ax] (copy one instruction)
- dec_c ; 010 110 0a 52 decrement cx (size)
- if_cz ; 010 110 05 53 if cx == 0 perform next instruction, otherwise skip it
- jmp ; 010 110 14 54 jump to template below (copy procedure exit)
- nop_0 ; 010 110 00 55 copy procedure exit complement
- nop_1 ; 010 110 01 56 copy procedure exit complement
- nop_0 ; 010 110 00 57 copy procedure exit complement
- nop_0 ; 010 110 00 58 copy procedure exit complement
- inc_a ; 010 110 08 59 increment ax (address in daughter to copy to)
- inc_b ; 010 110 09 60 increment bx (address in mother to copy from)
- jmp ; 010 110 14 61 bidirectional jump to template below (copy loop)
- nop_0 ; 010 100 00 62 copy loop complement
- nop_1 ; 010 100 01 63 copy loop complement
- nop_0 ; 010 100 00 64 copy loop complement
- nop_1 ; 010 100 01 65 copy loop complement
- if_cz ; 010 000 05 66 this is a dummy instruction to separate templates
- nop_1 ; 010 110 01 67 copy procedure exit template
- nop_0 ; 010 110 00 68 copy procedure exit template
- nop_1 ; 010 110 01 69 copy procedure exit template
- nop_1 ; 010 110 01 70 copy procedure exit template
- pop_cx ; 010 110 12 71 pop cx off stack (size)
- pop_bx ; 010 110 11 72 pop bx off stack (start address of mother)
- pop_ax ; 010 110 10 73 pop ax off stack (start address of daughter)
- ret ; 010 110 17 74 return from copy procedure
- nop_1 ; 010 100 01 75 end template
- nop_1 ; 010 100 01 76 end template
- nop_1 ; 010 100 01 77 end template
- nop_0 ; 010 100 00 78 end template
- if_cz ; 010 000 05 79 dummy instruction to separate creature
- **** end genome file
-
- Each genome file begins with some header information. Let me explain
- each item:
-
- format: 1 because we occasionally change the format of the genome files,
- this parameter is included for backwards compatibility. It is used by the
- assembler/disassembler to know how to read and write the files.
-
- bits: 45750471 this is the bit field associated with each genome in the
- genebank. If the genebanker is on and if any of the parameters: WatchExe,
- WatchMov, or WatchTem are set to a non-zero value, then bits in this field
- will be set to characterize the ecological characteristics of the genotype.
- The definitions of the bits in the field are given in the tierra.h module,
- and above in the description of the soup_in parameters. For more specific
- details, follow the Watch variables in the source modules to see exactly what
- they are doing.
-
- EXsh TCsh TPs MFsofh MTdf MB this is an ASCII summary of the meaning of
- the bits that are set in the bit field. The meanings of these abbreviations
- are given in the tierra.h file and above in the description of the soup_in
- parameters.
-
- genotype: 0080aaa This is the name of this genotype. The name has two
- parts. The first part is numeric and must be equal to the size of the cell
- of this creature (the size of its allocated block of memory). The cell size
- usually, but not always, corresponds to the size of the genome. The second
- part is a unique (and arbitrary) three letter code to distinguish this
- particular genotype from others of the same size.
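-
- To make the two-part name concrete, here is a minimal sketch in C of
- splitting a name such as 0080aaa into its size and its label. The function
- parse_genotype_name is made up for this illustration and is not part of
- the Tierra source. It assumes four digits followed by three lower case
- letters.
-
```c
#include <ctype.h>
#include <string.h>

/* Split a genotype name such as "0080aaa" into its numeric size and its
 * three-letter label. Returns 1 on success, 0 if the name is malformed.
 * parse_genotype_name is a hypothetical helper, not a Tierra function. */
int parse_genotype_name(const char *name, int *size, char label[4])
{
    int i;

    if (name == NULL || strlen(name) != 7)
        return 0;
    for (i = 0; i < 4; i++)              /* four numeric digits */
        if (!isdigit((unsigned char)name[i]))
            return 0;
    for (i = 4; i < 7; i++)              /* three lower case letters */
        if (!islower((unsigned char)name[i]))
            return 0;

    *size = (name[0] - '0') * 1000 + (name[1] - '0') * 100
          + (name[2] - '0') * 10 + (name[3] - '0');
    memcpy(label, name + 4, 3);
    label[3] = '\0';
    return 1;
}
```
-
- For example, parse_genotype_name("0080aaa", &size, label) sets size to 80
- and label to "aaa".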
-
- parent genotype: 0666god This is the name of the genotype of the
- immediate ancestor of this genotype. The immediate ancestor is the creature
- whose CPU gave rise to the first individual of this genotype. The original
- creature, 0080aaa, was created by god and the devil.
-
- 1st_daughter: This is a set of metabolic data about what transpired
- during the production of the first daughter by this genotype.
-
- flags: 0 This tells us how many errors (flags) were generated during
- the first reproduction. The generation of errors indicates invalid execution
- of instructions and causes the creature to move up the reaper queue, closer
- to death.
-
- inst: 827 This tells us how many instructions were executed during the
- first reproduction; this is an indication of metabolic costs and efficiency.
-
- mov_daught: 80 This tells us how many instructions were copied from the
- mother to the daughter during the first reproduction.
-
- breed_true: 1 This tells us whether the first daughter has the same
- genotype as the mother.
-
- 2nd_daughter: flags: 0 inst: 809 mov_daught: 80 breed_true: 1
- This is a set of metabolic data about what transpired during the production
- of the second daughter by this genotype. The fields are the same as those
- for the first daughter. The second daughter and those that follow generally
- have the same metabolic data, but they also generally differ from the first
- daughter, because the second time through, the parent often does not examine
- itself again, and it does not start the algorithm from the same place.
-
- InstExe.m: 0 At the time this genotype first appeared, the system had
- executed this many millions of instructions, plus the remainder indicated
- by the InstExe.i parameter.
-
- InstExe.i: 0 At the time this genotype first appeared, the system had
- executed this many instructions, plus however many millions indicated by
- the InstExe.m parameter.
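-
- Since the two parameters split one count into millions and a remainder,
- the total can be recovered as in the sketch below. The function name
- total_instructions is hypothetical and does not appear in the Tierra
- source.
-
```c
/* Combine the two header fields into a single instruction count:
 * inst_exe_m counts whole millions, inst_exe_i is the remainder.
 * total_instructions is a hypothetical helper, not part of Tierra. */
unsigned long long total_instructions(unsigned long inst_exe_m,
                                      unsigned long inst_exe_i)
{
    return (unsigned long long)inst_exe_m * 1000000ULL + inst_exe_i;
}
```
-
- For example, InstExe.m: 3 together with InstExe.i: 250000 stands for
- 3250000 instructions.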
-
- origin: 662270168 This is the system clock time, in seconds since
- 00:00:00 UTC on January 1, 1970, at the first origin of this genotype.
-
- Wed Dec 26 22:56:08 1990 This is the same time of origin, written as a
- local date and time.
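-
- The numeric origin value can be turned back into a readable date with the
- standard C library. The sketch below assumes the value counts seconds
- since January 1, 1970 UTC; format_origin_utc is a hypothetical helper, not
- a Tierra function.
-
```c
#include <stddef.h>
#include <time.h>

/* Format a genome file's origin value as calendar text in UTC.
 * Assumes the value is seconds since 1970-01-01 00:00:00 UTC.
 * Returns the number of characters written, or 0 on failure.
 * format_origin_utc is a hypothetical helper, not part of Tierra. */
size_t format_origin_utc(long origin, char *buf, size_t buflen)
{
    time_t t = (time_t)origin;
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return 0;
    return strftime(buf, buflen, "%a %b %d %H:%M:%S %Y", utc);
}
```
-
- For 662270168 this gives Thu Dec 27 03:56:08 1990 in UTC, which is five
- hours ahead of the local time shown in the header.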
-
- MaxPropPop: 0.8306 The maximum proportion of the population of adult
- cells in the soup attained by this genotype.
-
- MaxPropInst: 0.4239 The maximum proportion of space in the soup attained
- by adults of this genotype.
-
- ploidy: 1 The ploidy level of this genotype (i.e., this genotype
- is haploid).
-
- track: 0 Which copy of the genome will start executing at birth. This
- is only used when the ploidy level is greater than one (i.e., diploid).
-
- track 0: prot
- xwr
- nop_1 ; 010 110 01 0 beginning marker
-
- track 0: prot This tells us that the assembler code that follows is
- track 0. If the genotype has a ploidy of 2, a second assembler listing
- will follow, and it will be labeled track 1. The word prot refers to the
- protection bits: xwr, or x = execute, w = write, r = read.
-
- nop_1 ; 010 110 01 0 beginning marker
-
- This is the first line of the actual genome. The first word, nop_1 is
- the assembler mnemonic for one of the two no-operation instructions. The
- semicolon indicates the beginning of comments.
-
- The digits 010 tell us what protection this instruction will have at
- birth. Only the write bit is set, so this instruction will be write protected,
- but open to reading or execution at birth.
-
- The digits 110 are a record of which instructions were executed by this
- creature's own CPU (first digit), and the CPUs of other creatures (second
- digit); the third digit is not used at present. These bits are set when the
- WatchExe parameter is set. That the first two digits are set to one indicates
- that this instruction was executed both by its own CPU and by the CPU of
- another creature (perhaps a parasite, or a lost instruction pointer).
-
- The digits 01 are the actual hexadecimal op code of the instruction. It
- is this value that will actually be stored in the soup.
-
- The digit 0 just before the words ``beginning marker'' is the position
- of this instruction in the genome. This is the first instruction, so it is
- numbered zero.
-
- The words ``beginning marker'' are a comment describing the intended
- purpose of this instruction.
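-
- The fields of a genome line described above can be picked apart with
- sscanf. This is only a sketch of reading one fully annotated line; it is
- not how arg.c reads genome files, and the structure and function names
- are made up for the example.
-
```c
#include <stdio.h>

/* One parsed line of a genome listing. Field names are illustrative. */
struct genome_line {
    char mnemonic[16];   /* assembler mnemonic, e.g. "nop_1" */
    char prot[4];        /* protection at birth, in x-w-r order */
    char exec[4];        /* executed by own CPU, by other CPUs, unused */
    unsigned opcode;     /* value stored in the soup (hexadecimal in file) */
    int index;           /* position in the genome, counting from zero */
};

/* Parse a line such as "nop_1 ; 010 110 01 0 beginning marker".
 * Returns 1 if all five fields were read, 0 otherwise.
 * Any trailing comment text is ignored. */
int parse_genome_line(const char *line, struct genome_line *out)
{
    int n = sscanf(line, "%15s ; %3s %3s %x %d",
                   out->mnemonic, out->prot, out->exec,
                   &out->opcode, &out->index);
    return n == 5;
}

/* Return 1 if the write bit (middle digit of "xwr") is set. */
int write_protected(const struct genome_line *g)
{
    return g->prot[1] == '1';
}
```
-
- Note that this sketch rejects the short lines of a hand written genome
- (mnemonic, semicolon and protection only), because it insists on all
- five fields.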
-
- If you study the code of the ancestor, you may be perplexed by the
- reason for including the following instructions:
-
- zero ; 010 110 04 4 put zero in cx
- or1 ; 010 110 02 5 put 1 in first bit of cx
- shl ; 010 110 03 6 shift left cx (cx = 2)
- shl ; 010 110 03 7 shift left cx (cx = 4)
- mov_cd ; 010 110 18 8 move cx to dx (dx = 4)
-
- In the original version of the simulator, the size of the templates
- was determined by the value in the dx register. These five instructions
- loaded the dx register with the value 4, which is the size of the templates
- in this creature. Later, it was decided that this was a stupid way to
- determine template sizes. Now the parser just looks to see how many nops
- follow any instruction using them, and the number of consecutive nops
- determines the template size. Therefore, these five instructions don't do
- any useful work in the present model, but they have been left in place
- because the code still works.
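-
- Here is a minimal sketch of counting the nops that follow an instruction,
- along with a check that two templates are complements of each other (as
- in the ``complement'' comments of the ancestor listing). The opcodes 00
- and 01 are taken from that listing; the function names are hypothetical
- and this is not the code in parse.c.
-
```c
#include <stddef.h>

#define NOP_0 0x00  /* opcodes as shown in the ancestor listing */
#define NOP_1 0x01

static int is_nop(unsigned char op)
{
    return op == NOP_0 || op == NOP_1;
}

/* Count the consecutive nops starting at code[start]; start should be the
 * position just after an instruction that uses a template (jmp, call, adrf).
 * template_size is a hypothetical helper, not Tierra's parser. */
size_t template_size(const unsigned char *code, size_t len, size_t start)
{
    size_t n = 0;

    while (start + n < len && is_nop(code[start + n]))
        n++;
    return n;
}

/* Return 1 if the size nops at code[pos] are the complement of the size
 * nops at code[tmpl] (nop_0 pairs with nop_1), else 0. */
int is_complement(const unsigned char *code, size_t len,
                  size_t tmpl, size_t pos, size_t size)
{
    size_t i;

    if (tmpl + size > len || pos + size > len)
        return 0;
    for (i = 0; i < size; i++) {
        if (!is_nop(code[tmpl + i]) || !is_nop(code[pos + i]))
            return 0;
        if (code[tmpl + i] == code[pos + i])
            return 0;
    }
    return 1;
}
```
-
- This also shows why the ancestor uses a dummy if_cz between adjacent
- templates: without a non-nop instruction in between, the count would run
- on through both of them.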
-
- 8.2) Writing a Creature
-
- If you write your own creature, you must obey the following conventions:
-
- **** begin genome file (note blank line at top of file)
-
- format: 1 bits: 3
- genotype: 0080aaa parent genotype: 0666god
-
- track 0: prot
- xwr
- nop_1 ; 010
- nop_1 ; 010
- **** end genome file
-
- Yank the above lines into the file you are going to write, to use as
- a template. You must have the following:
-
- 1) a blank line at the top of the file.
- 2) a line declaring the format and bits, just use the line given.
- 3) a line stating the genome size and three letter name, and that of
- the parent genotype. The genome size must match the actual number
- of instructions in the genome. The three letter name is arbitrary,
- you can make up any name, but I advise using a low letter name like
- aaa because these names are used in a base 26 numbering system by
- the genebanker, and the genebanker must allocate an array as big
- as the largest of these numbers. You may make up the parent genotype
- size and name; it won't be used for anything, so its details don't
- matter, but it should have the format of four numeric digits followed
- by three letters.
- 4) a blank line
- 5) the line: track 0: prot, just use the line provided
- 6) the line: xwr, just use the line provided
- 7) the listing of assembler mnemonics, followed by a semicolon and a
- three digit code indicating the protection at birth. I recommend that
- you use the protection indicated. The listing of the 32 assembler
- mnemonics can be found at the end of the soup_in.h file. For a
- description of what they actually do, study the comments on the
- code of the ancestor listed above, and study the corresponding
- parser and execute functions in the two modules parse.c and
- instruct.c.
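-
- Item 3 above says the three letter names are used in a base 26 numbering
- system. The sketch below shows one way such a name maps to a number. It
- assumes a = 0 and that the leftmost letter is the most significant; the
- genebanker's actual ordering may differ, and label_to_index is a
- hypothetical name.
-
```c
/* Convert a three-letter label to a number, reading it as base 26 with
 * 'a' = 0 and the leftmost letter most significant (an assumed ordering).
 * Returns -1 if the label is not exactly three lower case letters.
 * label_to_index is a hypothetical helper, not the genebanker's code. */
int label_to_index(const char *label)
{
    int i, value = 0;

    for (i = 0; i < 3; i++) {
        if (label[i] < 'a' || label[i] > 'z')
            return -1;
        value = value * 26 + (label[i] - 'a');
    }
    return label[3] == '\0' ? value : -1;
}
```
-
- Under these assumptions aaa maps to 0 and zzz maps to 17575, which is why
- a low letter name keeps the genebanker's array small.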
-
- 9) IF YOU WANT TO MODIFY THE SOURCE CODE
-
- If you make some significant improvements to Tierra, we would welcome
- receiving the source code, so that we may integrate it into our version, and
- then make it available to others.
-
- All lines of source code should be 78 characters or fewer, or it will
- mess up the formatting of the code for distribution.
-
- The simulator has been designed so that it can be brought down, and then
- brought back up where it left off. This means that there can be no static
- local variables. Any variables that hang around must be global. They
- are declared and defined in soup_in.h if they are also soup_in parameters.
- Otherwise they are declared in declare.h, and all global variables are
- declared as externals in extern.h.
-
- The code for bringing the simulator up and down is in the tsetup.c
- module. The system is brought up by GetSoup(), which calls GetAVar()
- to read soup_in. All soup_in variables are read by the GetAVar() function.
- If a new simulation is being started, GetSoup() calls GetNewSoup(). If an
- old simulation is being restarted, GetSoup() calls GetOldSoup(). GetOldSoup()
- will read all global variables not contained in soup_in, and will also read
- in all arrays, such as the soup, the cells array, and the free_mem array.
- When the simulator goes down, and periodically during a run, all global
- variables are written to a file soup_out, and all global arrays such as
- soup, the cells array, the free_mem array, and the random number generator
- array, and some structures, are written to a binary file called core_out.
- Thus if you create any new global variables or arrays, be sure they are read
- by GetOldSoup(), and written by WriteSoup().
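-
- The save and restore cycle described above can be sketched in miniature
- as follows. The two globals and both functions are stand-ins invented for
- this example; they are not the real WriteSoup() or GetOldSoup(), whose
- file formats are not reproduced here.
-
```c
#include <stdio.h>

/* Stand-ins for simulator globals; the names are hypothetical. A static
 * local inside some function could not be reached by the two routines
 * below, which is why persistent state has to be global. */
long   demo_inst_count;      /* a scalar global */
double demo_rates[4];        /* a global array */

/* Write every global to the stream. Returns 1 on success, 0 on error. */
int demo_write_state(FILE *fp)
{
    return fwrite(&demo_inst_count, sizeof demo_inst_count, 1, fp) == 1
        && fwrite(demo_rates, sizeof demo_rates[0], 4, fp) == 4;
}

/* Read the globals back in the same order they were written. */
int demo_read_state(FILE *fp)
{
    return fread(&demo_inst_count, sizeof demo_inst_count, 1, fp) == 1
        && fread(demo_rates, sizeof demo_rates[0], 4, fp) == 4;
}
```
-
- A new global that is written but never read back (or the reverse) will
- silently shift every value that follows it, so the two routines should
- always be changed together.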
-
- There are several obvious projects that I would like to comment on:
-
- 9.1) Creating a Frontend
-
- All I/O to the console is routed through the frontend.c module, so that
- it can be handled by a variety of front ends now under development. The
- simplest of these just uses printf to write to standard out. The frontend.c
- module is just a sketch at the moment. If you are going to work on the
- frontend, please get back to us for an updated version of the frontend.c
- module. The module is guaranteed to have been completely rewritten by the
- end of October 1991.
-
- 9.2) Creating New Instruction Sets
-
- If you want to create a new instruction set, more power to you. The
- relevant modules to study are: instruct.c, parse.c, soup_in.h, arginst.h, and
- configur.h. You will also need to study the definitions of struct cpu,
- struct InstDef, struct ArgInstDef, and struct inst, all in the tierra.h module.
- Note that the cpu structure includes an array of registers. The idea is that
- you may change the size of this array to make just about any changes you might
- want to the CPU architecture. You should avoid actually having to alter the
- structure definition in the tierra.h file.
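-
- As a starting point for thinking about an instruction set as a table, here
- is a sketch of a mnemonic to opcode mapping. The opcodes are the ones
- shown in the ancestor listing above, so the table is incomplete. The
- structure demo_inst_def is invented for this example and is not the
- definition of struct InstDef or struct ArgInstDef in tierra.h.
-
```c
#include <stddef.h>
#include <string.h>

/* A mnemonic/opcode pair. This layout is illustrative only. */
struct demo_inst_def {
    const char *mnemonic;
    unsigned char opcode;
};

/* Only the instructions that appear in the ancestor listing. */
static const struct demo_inst_def demo_inst_set[] = {
    {"nop_0", 0x00},   {"nop_1", 0x01},   {"or1", 0x02},
    {"shl", 0x03},     {"zero", 0x04},    {"if_cz", 0x05},
    {"sub_ab", 0x06},  {"sub_ac", 0x07},  {"inc_a", 0x08},
    {"inc_b", 0x09},   {"dec_c", 0x0a},   {"push_ax", 0x0c},
    {"push_bx", 0x0d}, {"push_cx", 0x0e}, {"pop_ax", 0x10},
    {"pop_bx", 0x11},  {"pop_cx", 0x12},  {"jmp", 0x14},
    {"call", 0x16},    {"ret", 0x17},     {"mov_cd", 0x18},
    {"mov_ab", 0x19},  {"mov_iab", 0x1a}, {"adrf", 0x1d},
    {"mal", 0x1e},     {"divide", 0x1f}
};

/* Look up the opcode for a mnemonic; returns -1 if it is not in the table. */
int demo_opcode(const char *mnemonic)
{
    size_t i, n = sizeof demo_inst_set / sizeof demo_inst_set[0];

    for (i = 0; i < n; i++)
        if (strcmp(demo_inst_set[i].mnemonic, mnemonic) == 0)
            return demo_inst_set[i].opcode;
    return -1;
}
```
-
- With a table like this, adding or renumbering an instruction is a change
- to data rather than to the lookup code.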
-
- 9.3) Creating New Slicer Mechanisms
-
- If you want to experiment with artificial rather than natural selection,
- consider that selection is both a carrot and a stick. The carrot in this
- model is CPU time which is allocated by the slicers. The stick is the reaper.
- If you want to try to evolve algorithms that do useful work, your evaluation
- functions should be embedded into the slicer, and should allocate more CPU
- time to creatures who rank high.
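-
- One simple way to picture this is a slicer that adds a bonus to the time
- slice in proportion to an evaluation score. The sketch below is purely
- hypothetical: the function name, the score range of 0 to 1, and the
- linear rule are all assumptions, and none of it is taken from the Tierra
- slicer code.
-
```c
/* Give each creature a base time slice plus a bonus proportional to its
 * evaluation score. Scores outside [0, 1] are clamped. All names and the
 * linear rule are hypothetical. */
int demo_slice_size(int base_slice, int max_bonus, double score)
{
    if (score < 0.0)
        score = 0.0;
    if (score > 1.0)
        score = 1.0;
    return base_slice + (int)(max_bonus * score);
}
```
-
- Here the carrot is the bonus; the stick (the reaper) is left untouched.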
-
- 9.4) Creating a Sexual Model
-
- Sex emerges spontaneously in runs whenever parasites appear. However,
- this sex is primitive and disorganized. I believe that the easiest way to
- engineer organized sex is to work with diploid creatures. The infrastructure
- to allow multiple ploidy levels is already in place. Notice that the
- definition of Instruction, the type of which the soup is composed, is:
-
- typedef struct Inst Instruction[PLOIDY];
-
- This means that if PLOIDY is defined as two, there are two parallel
- tracks for genomes. The instruction pointer will run down the track
- specified by the ce->c.tr variable in the cpu structure. We have not
- implemented any other controls over the tracking of the instruction pointer
- in diploid or higher models. This is future work.
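-
- To show what the typedef means in practice, here is a small sketch of
- indexing a diploid soup by address and track. The one-field struct Inst
- below is a placeholder (the real definition is in tierra.h and may
- differ), and demo_fetch is a hypothetical function.
-
```c
#define PLOIDY 2

/* Placeholder contents; the real struct Inst in tierra.h may differ. */
struct Inst {
    unsigned char op;
};

typedef struct Inst Instruction[PLOIDY];

/* Fetch the opcode at soup address addr on the given track.
 * Each soup address holds PLOIDY instructions side by side. */
unsigned char demo_fetch(Instruction *soup, long addr, int track)
{
    return soup[addr][track].op;
}
```
-
- With PLOIDY defined as 2, soup[addr][0] and soup[addr][1] are the two
- parallel instructions at the same address, and the track argument plays
- the role of the track variable in the cpu structure.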
-
- 9.5) Creating a Multi-cellular Model
-
- Multi-cellularity was the hallmark of the Cambrian explosion of
- diversity, and thus is likely a biological feature worth including in Tierra.
- Also, it is likely that a multi-cellular model is the appropriate one for
- evolving large application programs on massively parallel machines. How
- can we implement multi-cellularity? What does it mean in the context of
- Tierran creatures?
-
- Consider that at the conceptual core, multi-cellularity means that the
- mother cell determines what portion of the genome its daughter cell will
- express. For many daughter cells, the mother cell narrows their options
- by preventing them from expressing (executing) large portions of their
- genome (code). In the organic world this is done by loading the daughter
- cell with regulatory proteins which determine which genes will be expressed.
-
- In the Tierran world, the same result can be achieved by allowing the
- mother cell to set the position of the instruction pointer in the daughter
- cell, and also the initial values of the CPU registers. These acts can
- place the daughter cell into a portion of its code from which it may never
- be able to reach certain other parts of its code. In this way the mother
- cell determines what parts of the code are executed by the daughter.
-
- To facilitate this process, the divide instruction has been broken into
- three steps: 1) Create and initialize a CPU for the daughter. 2) Start the
- daughter CPU running. 3) Become independent from the daughter by losing
- write privileges on the daughter space. Now, between steps 1 and 2, the
- mother can place values into the CPU registers and instruction pointer of
- the daughter. This will require an inter-CPU move instruction. The divide
- instruction takes an argument that determines which of the three steps is
- being performed.
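-
- The three steps can be pictured as a small state machine. The sketch
- below is an illustration only: the enum values, the structure, and the
- rule that steps must come in order are assumptions of mine, not the
- actual divide code in instruct.c.
-
```c
/* Hypothetical argument values selecting the step of divide. */
enum demo_divide_step { DIV_CREATE = 0, DIV_START = 1, DIV_SEPARATE = 2 };

struct demo_daughter {
    int has_cpu;        /* step 1 done: CPU created and initialized */
    int running;        /* step 2 done: daughter CPU is executing */
    int mother_writes;  /* mother still holds write privileges */
    long ip;            /* daughter instruction pointer */
};

/* Perform one step of divide. Returns 1 on success, 0 if the step is
 * requested out of order (an assumed rule). */
int demo_divide(struct demo_daughter *d, enum demo_divide_step step)
{
    switch (step) {
    case DIV_CREATE:
        if (d->has_cpu)
            return 0;
        d->has_cpu = 1;
        d->running = 0;
        d->mother_writes = 1;
        d->ip = 0;
        return 1;
    case DIV_START:
        if (!d->has_cpu || d->running)
            return 0;
        d->running = 1;
        return 1;
    case DIV_SEPARATE:
        if (!d->running || !d->mother_writes)
            return 0;
        d->mother_writes = 0;
        return 1;
    }
    return 0;
}

/* Stand-in for the inter-CPU move: the mother may set the daughter's
 * instruction pointer only between steps 1 and 2. */
int demo_set_daughter_ip(struct demo_daughter *d, long ip)
{
    if (!d->has_cpu || d->running)
        return 0;
    d->ip = ip;
    return 1;
}
```
-
- In this sketch the mother would call demo_divide with DIV_CREATE, set the
- daughter's instruction pointer, and then call it again with DIV_START and
- DIV_SEPARATE.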
-
- 10) KNOWN BUGS
-
- When Tierra runs, if the genebanker is on, a growing number of genomes
- will accumulate in RAM, causing memory usage to increase throughout a run.
- This will eventually lead to a memory allocation failure on DOS systems, or
- to thrashing on Unix systems due to the need to use virtual memory. The
- parameter RamBankSiz is designed to prevent the accumulation of too many
- genomes in the RAM bank, by swapping out the least used genomes when there
- are more than RamBankSiz genomes in the genebank. At present this memory
- management does not work. Even when this is fixed, memory demands will
- still grow during a run because the genebanker must keep track of genomes
- swapped out to disk.
-
- When compiled with a Borland C compiler, Tierra will use the farrealloc()
- function to realloc several arrays during a run. The farrealloc() function
- is supposed to be able to reallocate arrays larger than 64K. Unfortunately
- the function does not work for arrays larger than 64K in most versions of
- Borland's compilers. The most recent versions of Borland C++ have fixed this
- bug. If you have an older version of the compiler, you can usually avoid
- the problem by setting CellsSize = SoupSize / 100.
- This should prevent the need to reallocate the Cells array, which is what
- usually generates the problem. Just be sure that the initial value of
- CellsSize is large enough that it does not need to be increased.
-
- When the system is brought down, and then brought back up where it left
- off, it continues writing birth and death records to the tierra.run or
- break.X files. However, if the system comes down due to being killed or
- due to a hardware crash, when it is brought back up, it will resume execution
- from the state when the simulator was last saved (see SaveFreq variable
- in section 5 above). The problem is that the birth and death records will
- now be appended to the end of a file that contained all records up to the
- last buffered write before the crash. This means that the last part of the
- birth and death records will be incorrect. This bug will be fixed soon.
-
- Tom Ray
- University of Delaware
- School of Life & Health Sciences
- Newark, Delaware 19716
- ray@tierra.slhs.udel.edu
- ray@life.slhs.udel.edu
- ray@brahms.udel.edu
- 302-451-2281 (FAX)
- 302-451-2753
-