home *** CD-ROM | disk | FTP | other *** search
- POLYSIM - Fits experimental data to a large polynomial equation using
- a modified simplex approach.
-
- Version 1.1
-
- D.J.Murphy 1989
-
-
- Usage: polysim [min_index max_index] inputfile outputfile
-
- Description: POLYSIM is designed to be run either on a PC in a "fire & forget"
- manner, or on a Unix (or other) host capable of supporting
- background processes. This is because optimization can be a time
- consuming business, and so once invoked no other input is required
- providing the data has been entered into the input file correctly.
- It fits data to the equation:
-
- y = ax^5 + bx^4 + cx^3 + dx^2 + ex + f + g/x + h/(x^2)
-
- using a modified simplex method (see below) to optimize the
- coefficients a through h such that the mean deviation of Y(input)
- from Y(calculated) is minimized.
- Alternatively, any part of this equation can be fitted by entering
- minimum and maximum values for the X-index on the command line.
- Restrictions here are that the indices must either both be entered
- or not at all (not one alone), and must be entered minimum then
- maximum. The indices, I, must be integers and 5 >= I >= -2.
- Using this option it is possible to use an educated guess at the
- general form of the function your data should fit to restrict the
- amount of work POLYSIM has to do, and thus reduce the time it will
- take to run.
- Output is via a text file, and includes number of iterations used,
- values of all the coefficients, mean deviation and a list of
- input data, Y(calculated) and the differences.
-
- Index Range: If you do not include this, the default range of X^5 to X^(-2)
- will be used. If you want to restrict this, enter the minimum
- index value you want, then the maximum, as integers.
-
- Input file: The input file must contain the input data as delimited
- X Y pairs, preferably one pair per line:
- eg. 1 1
- 2 4
- 3 9
- 4 16
- | |
-
- Do not include anything else in this dataset. Try to avoid
- including x=0 values, as data pairs with x = 0 are ignored on
- reading the file (this to avoid divide-by-zero errors when
- evaluating Y(calculated)). Any valid filename is accepted.
-
-
- Output file: Any valid filename is accepted. Any existing file of that name
- will be overwritten, unless it is write-protected in which case
- POLYSIM will fail AFTER running the optimization.
-
- Example: C:\polysim 0 2 expt1.dat expt1.res
-
- Run POLYSIM on the data in C:\EXPT1.DAT, optimizing to the
- function y = ax^2 + bx + c, and write the results to the
- file C:\EXPT1.RES which can be listed to the screen, printer
- or included in other programs.
-
- Cautions: This is an optimization program - it is wise to plot the data
- visually before using it, as the fit POLYSIM reports may be better
- for some parts of the dataset than others. It is unwise to
- extrapolate too much beyond the range of the given data set.
- POLYSIM also takes a while to run - you are advised to try using
- a log(X) - log(Y) least squares method first to try and get a
- straight line, unless you either know that this won't give you
- a straight line or you are desperate for a curve.
- Also, not all data sets will give reasonable curves - again, try
- plotting first to see if this is likely to be so - if not the
- answer (if you get one) is likely to be meaningless.
-
- Portability: POLYSIM is written in C, and was developed on a PC using
- Borland's Turbo C v1.5, and compiled using floating point
- emulation so it will use an 80x87 math coprocessor if one is
- present but it does not require it (use of such is highly
- recommended, though - POLYSIM makes extensive use of
- double precision real numbers and as such will be very slow
- without a floating point processor).
- No odd facilities are used in the source code, so it should
- be portable onto a Unix system, or to any ANSI C compiler -
- it _should_ be - I'm making no promises.
-
- What is "modified simplex" optimization?
-
- Simplex is a formal procedure for optimizing a set of parameters in a case
- where you can measure responses for different combinations of parameter values
- but have no knowledge of the "response surface" which gives rise to the
- observed pattern - i.e. it is not possible to use calculus to find a maximum
- or minimum. A simplex is a geometrical figure which has one more vertex than
- the number of dimensions it occupies - a 2 dimensional simplex is a triangle,
- for example. POLYSIM uses up to an 8 dimensional simplex, one for each
- coefficient, so needs up to 9 equations. Consider the 2d one for illustration.
-
- We have some response, z, such that z = f(x,y), but we don't know what. In the
- basic simplex method, we test 3 points, B, N and W, and get the Best, Next best
- and Worst responses respectively. We then take a line between B and N and find
- the centre of it, C - then reflect W through that centre point to get a new
- point, A :
- The figure BNW is the old simplex, and BNA is the
- new one. If we now test the response at A, and compare
- B it with those at B and N, we can repeat the process.
- Doing this often enough will home the simplex in on
- the optimum response (think about it).
- W C A
-
-
- N
-
- However, the problem with this is 3 fold. Firstly, the simplex stays the same
- size, so there is a limit to how accurately we can home in on the optimum. Also
- because of this, large "distances" on the x,y surface are covered fairly slowly
- if we make the figure small enough to get a reasonable resolution.
- The third problem is that, other than looking for repetition of the position
- of the simplex, there is little opportunity for detecting when the optimum has
- been reached and thus when to stop the process.
-
- The answer to these problems comes with "modified" or "accelerated" simplex,
- which is as above with additions:
-
- B Now we reflect to find A, as before, but
- this time we check the response at A first
- and modify the simplex:
- W F C G A H If response at A is BETTER than B, we're
- obviously going in the right general
- direction, so double the W translation and
- N make the new simplex BNH instead.
- If response at A is between B and N, stay at A, but if it is between W and N,
- we must be careful not to overshoot, so the new simplex is BNG. If response at
- A is worse than W, the new simplex is BNF.
- In practice, this means that the simplex gets bigger to cover large "distances"
- toward the optimum, then shrinks itself around it to home in with a higher
- resolution. This gives a method of stopping the process as we can elect to
- terminate the simplex loop when it has got below a certain size (for example,
- 0.1% of the total Y-range - the sort of thing POLYSIM does).
- POLYSIM uses this this method, but with up to 8 dimensions, to optimize the
- combination of coefficients such that |(y(calculated) - y(given))| is minimized.
- The point at which it stops, and hence the resolution, is set within the
- program, but can be changed with recompilation.
-
- Acknowledgements:
-
- What I know about simplex optimization comes from writing a teaching program
- in 1988 for the University of Edinburgh Chemistry Department. POLYSIM is an
- application based upon experience gained in that project, which was carried
- out for Drs. Kenneth Lawley and Ian Gosney.
- The paper I was referred to was:
- S.N. Deming & S.L. Morgan, Analytical Chemistry vol.45, 278A (1973),
- and further references can be found at the end of this, though I have not
- looked at them.
-
-
- Disclaimer and such like:
-
- This was written by me for me. I'm not going to accept any hassle over it
- since I have not ripped anyone off and am distributing it because it has
- come to my attention that some people might find it useful.
- I've included source code as well as IBM PC executable in the archive
- supplied, for those who want to play with it (and make improvements which
- are possible but which escape me at present) or port it to other machines.
- I really don't care if anyone changes anything, but unless you're going to
- improve it there is not much point. If you do improve it, could you please
- mail them to me at one of the addresses below with a description of the
- modification. Anything implemented will be forwarded with acknowledgements.
- The same goes for constructive criticism. Flame will go to /dev/null.
-
- Contact:
- Anyone wishing to contact me about this program can do so at the following
- addresses:
-
- SnailMail: D.J. Murphy
- Room 213
- Chemistry Department
- Edinburgh University
- King's Buildings
- West Mains Road
- Edinburgh
- Scotland
- EH9 3JJ
-
- Email:
- JANET ARPA
- djm@uk.ac.edinburgh.etive djm%ed.etive@nsfnet-relay.ac.uk
- or or
- Murff@uk.ac.edinburgh Murff%ed.emas-a@nsfnet-relay.ac.uk