Source Code 1992 March

home *** CD-ROM | disk | FTP | other *** search

/ Source Code 1992 March / Source_Code_CD-ROM_Walnut_Creek_March_1992.iso / msdos / educaton / polysim.arc / POLYSIM.DOC < prev next >

Wrap

Text File | 1989-05-22 | 9.8 KB | 190 lines

POLYSIM - Fits experimental data to a large polynomial equation using a modified simplex approach. Version 1.1 D.J.Murphy 1989 Usage: polysim [min_index max_index] inputfile outputfile Description: POLYSIM is designed to be run either on a PC in a "fire & forget" manner, or on a Unix (or other) host capable of supporting background processes. This is because optimization can be a time consuming business, and so once invoked no other input is required providing the data has been entered into the input file correctly. It fits data to the equation: y = ax^5 + bx^4 + cx^3 + dx^2 + ex + f + g/x + h/(x^2) using a modified simplex method (see below) to optimize the coefficients a through h such that the mean deviation of Y(input) from Y(calculated) is minimized. Alternatively, any part of this equation can be fitted by entering minimum and maximum values for the X-index on the command line. Restrictions here are that the indices must either both be entered or not at all (not one alone), and must be entered minimum then maximum. The indices, I, must be integers and 5 >= I >= -2. Using this option it is possible to use an educated guess at the general form of the function your data should fit to restrict the amount of work POLYSIM has to do, and thus reduce the time it will take to run. Output is via a text file, and includes number of iterations used, values of all the coefficients, mean deviation and a list of input data, Y(calculated) and the differences. Index Range: If you do not include this, the default range of X^5 to X^(-2) will be used. If you want to restrict this, enter the minimum index value you want, then the maximum, as integers. Input file: The input file must contain the input data as delimited X Y pairs, preferably one pair per line: eg. 1 1 2 4 3 9 4 16 | | Do not include anything else in this dataset. Try to avoid including x=0 values, as data pairs with x = 0 are ignored on reading the file (this to avoid divide-by-zero errors when evaluating Y(calculated)). Any valid filename is accepted. Output file: Any valid filename is accepted. Any existing file of that name will be overwritten, unless it is write-protected in which case POLYSIM will fail AFTER running the optimization. Example: C:\polysim 0 2 expt1.dat expt1.res Run POLYSIM on the data in C:\EXPT1.DAT, optimizing to the function y = ax^2 + bx + c, and write the results to the file C:\EXPT1.RES which can be listed to the screen, printer or included in other programs. Cautions: This is an optimization program - it is wise to plot the data visually before using it, as the fit POLYSIM reports may be better for some parts of the dataset than others. It is unwise to extrapolate too much beyond the range of the given data set. POLYSIM also takes a while to run - you are advised to try using a log(X) - log(Y) least squares method first to try and get a straight line, unless you either know that this won't give you a straight line or you are desperate for a curve. Also, not all data sets will give reasonable curves - again, try plotting first to see if this is likely to be so - if not the answer (if you get one) is likely to be meaningless. Portability: POLYSIM is written in C, and was developed on a PC using Borland's Turbo C v1.5, and compiled using floating point emulation so it will use an 80x87 math coprocessor if one is present but it does not require it (use of such is highly recommended, though - POLYSIM makes extensive use of double precision real numbers and as such will be very slow without a floating point processor). No odd facilities are used in the source code, so it should be portable onto a Unix system, or to any ANSI C compiler - it _should_ be - I'm making no promises. What is "modified simplex" optimization? Simplex is a formal procedure for optimizing a set of parameters in a case where you can measure responses for different combinations of parameter values but have no knowledge of the "response surface" which gives rise to the observed pattern - i.e. it is not possible to use calculus to find a maximum or minimum. A simplex is a geometrical figure which has one more vertex than the number of dimensions it occupies - a 2 dimensional simplex is a triangle, for example. POLYSIM uses up to an 8 dimensional simplex, one for each coefficient, so needs up to 9 equations. Consider the 2d one for illustration. We have some response, z, such that z = f(x,y), but we don't know what. In the basic simplex method, we test 3 points, B, N and W, and get the Best, Next best and Worst responses respectively. We then take a line between B and N and find the centre of it, C - then reflect W through that centre point to get a new point, A : The figure BNW is the old simplex, and BNA is the new one. If we now test the response at A, and compare B it with those at B and N, we can repeat the process. Doing this often enough will home the simplex in on the optimum response (think about it). W C A N However, the problem with this is 3 fold. Firstly, the simplex stays the same size, so there is a limit to how accurately we can home in on the optimum. Also because of this, large "distances" on the x,y surface are covered fairly slowly if we make the figure small enough to get a reasonable resolution. The third problem is that, other than looking for repetition of the position of the simplex, there is little opportunity for detecting when the optimum has been reached and thus when to stop the process. The answer to these problems comes with "modified" or "accelerated" simplex, which is as above with additions: B Now we reflect to find A, as before, but this time we check the response at A first and modify the simplex: W F C G A H If response at A is BETTER than B, we're obviously going in the right general direction, so double the W translation and N make the new simplex BNH instead. If response at A is between B and N, stay at A, but if it is between W and N, we must be careful not to overshoot, so the new simplex is BNG. If response at A is worse than W, the new simplex is BNF. In practice, this means that the simplex gets bigger to cover large "distances" toward the optimum, then shrinks itself around it to home in with a higher resolution. This gives a method of stopping the process as we can elect to terminate the simplex loop when it has got below a certain size (for example, 0.1% of the total Y-range - the sort of thing POLYSIM does). POLYSIM uses this this method, but with up to 8 dimensions, to optimize the combination of coefficients such that |(y(calculated) - y(given))| is minimized. The point at which it stops, and hence the resolution, is set within the program, but can be changed with recompilation. Acknowledgements: What I know about simplex optimization comes from writing a teaching program in 1988 for the University of Edinburgh Chemistry Department. POLYSIM is an application based upon experience gained in that project, which was carried out for Drs. Kenneth Lawley and Ian Gosney. The paper I was referred to was: S.N. Deming & S.L. Morgan, Analytical Chemistry vol.45, 278A (1973), and further references can be found at the end of this, though I have not looked at them. Disclaimer and such like: This was written by me for me. I'm not going to accept any hassle over it since I have not ripped anyone off and am distributing it because it has come to my attention that some people might find it useful. I've included source code as well as IBM PC executable in the archive supplied, for those who want to play with it (and make improvements which are possible but which escape me at present) or port it to other machines. I really don't care if anyone changes anything, but unless you're going to improve it there is not much point. If you do improve it, could you please mail them to me at one of the addresses below with a description of the modification. Anything implemented will be forwarded with acknowledgements. The same goes for constructive criticism. Flame will go to /dev/null. Contact: Anyone wishing to contact me about this program can do so at the following addresses: SnailMail: D.J. Murphy Room 213 Chemistry Department Edinburgh University King's Buildings West Mains Road Edinburgh Scotland EH9 3JJ Email: JANET ARPA djm@uk.ac.edinburgh.etive djm%ed.etive@nsfnet-relay.ac.uk or or Murff@uk.ac.edinburgh Murff%ed.emas-a@nsfnet-relay.ac.uk