home *** CD-ROM | disk | FTP | other *** search
- LDHFITR.DOC VERS:- 01.00 DATE:- 09/26/86 TIME:- 10:02:47 PM
-
- description of computer reduction of data from kinetic
- measurements for the enzyme lactate dehydrogenase
- information on how to run program LDHFIT
-
-
-
-
-
-
- By J. A. Rupley, Tucson, Arizona
-
-
- NOTES ON DATA REDUCTION BY COMPUTER
- AND INFORMATION ON RUNNING THE PROGRAM LDHFIT
-
-
- INTRODUCTION:
-
- In order to obtain conclusions from quantitative measurements,
- there must be some form of data reduction. This can be as simple
- as a comparison by eye of two curves drawn through the data. If,
- however, the data set is large and complex, for example with more
- than one independent variable, and if the questions posed are
- detailed or involve a complicated nonlinear model, then visual or
- graphical methods are less satisfactory than a computer-based
- analysis. Procedures of the latter type are now widely used.
-
- This laboratory is a short introduction to data reduction by use
- of a computer. The intent is to show that a sophisticated
- computer program can be handled easily, that its use saves time
- and effort, that it can treat a more complicated model than can
- be treated graphically, and that it produces information such as
- estimates of uncertainties in the parameters that is difficult or
- impossible to obtain from graphical methods.
-
- The data to be analyzed are initial rate measurements made
- on the lactate dehydrogenase catalyzed reaction of pyruvate with
- NADH, in the presence and absence of lactate as inhibitor. The
- results of the computer fit are the following: (1) values of the
- kinetic constants V, KmA, KmB, KmAB, KmQ/KmPQ, and KBInhib. The
- first five constants are those that can be evaluated by the
- standard graphical methods of primary and secondary reciprocal
- plots. The constant KBInhib is the dissociation constant for the
- dead-end complex LDH-NADH-lactate, which is included in the
- mechanism fit by the computer program but cannot be included in
- the mechanism on which the graphical methods are based. (2)
- Estimates of the standard deviations of the kinetic constants.
- These are needed for an understanding of the reliability and
- significance of the values calculated for the kinetic constants.
- (3) A list of the coordinates of points suitable for construction
- of the lines of the reciprocal plots of the standard graphical
- methods.
-
-
-
-
-
-
-
-
-
-
-
-
-
- 1
-
-
-
-
-
-
-
-
-
- By J. A. Rupley, Tucson, Arizona
-
- THEORY:
-
- A. REMARKS ON FITTING OF A MODEL TO DATA
-
- In a typical data reduction, a particular model to be tested is
- fit to a set of data points under some criterion for best fit.
- The ith data point of a set of N data points consists of a single
- value for the dependent variable Yobserved(i) measured for
- corresponding single values for the one or more independent
- variables Xobserved(i). The commonly-used least squares criterion
- for quality of fit is the minimum value of the sum of the squares
- of the deviations between the observed values of Y and the values
- of Y calculated according to the model being tested.
-
- Working from the model to be fit to the data, one develops an
- equation relating, for each of the N data points, the dependent
- variable Y to the independent variables X and to a set of M
- variable parameters p:
-
- Ymodel(i) = F(Xobserved(i); p(j), j=1,M) eq. 1
-
- For example, if the model predicts a linear relationship between
- Y and a single independent variable X :
-
- Ymodel(i) = p(1) + p(2) * Xobserved(i) eq. 2
-
- The constants p(1) and p(2) of equation (2) are the Y axis
- intercept and the slope, respectively, and of course are the same
- for all data points (for all pairs of values Y(i) and X(i)).
-
- The fitting of a model to data consists of finding the values of
- the M variable parameters p that give the best agreement between
- the N pairs of values of Ymodel(i) and Yobserved(i). Best
- agreement can be defined as the minimum value of the least
- squares function y:
-
- N
- y = SUM (Yobserved(i) - Ymodel(i))^2 * W(i) eq. 3
- i=1
-
-
- The factor W(i) of equation (3) is the normalized reciprocal
- variance (the statistical weighting) of the ith data point, and
- it can be set at unity if the data points are all of equal
- estimated uncertainty.
-
- Combining equations (1) and (3), one sees that the least squares
- function y of equation (3) is a function of the full set of N
- data points and a set of M variable parameters:
-
- y = f(Yobserved(i), Xobserved(i), i=1,N; p(j), j=1,M) eq. 4
-
-
-
- 2
-
-
-
-
-
-
-
-
-
- The fitting problem therefore consists of finding the minimum
- value of the least squares function y, which for a given set of
- data depends only on the M variable parameters p (the data points
- Yobserved(i)---Xobserved(i) in equation (4) are constant in the
- fitting). There are several methods commonly used to find the
- minimum of y and thus evaluate the best fit values of the
- parameters p. The more useful of these can handle nonlinear model
- functions F (equation (1)) of arbitrary mathematical form. The
- rate law for lactate dehydrogenase is an example of a nonlinear
- model function.
-
- In the simplex method used here, one constructs an M dimensional
- polyhedron with M + 1 vertices (the simplex). Each dimension of
- the simplex corresponds to a variable parameter of equation (4).
- Each vertex of the simplex is a point in the M dimensional space,
- which is called "parameter space" or "factor space." The M
- coordinates of each vertex are values of the M parameters. Thus
- each vertex of the simplex has an associated value of the least
- squares function y. The starting simplex is constructed to be so
- large as to include within it the point corresponding to the
- minimum value of y. This minimum point has as its coordinates of
- the best fit values of the parameters.
-
- The minimization process shrinks the simplex about the minimum
- point, even though the coordinates of the minimum are not known
- beforehand, until the vertices of the simplex are so close
- together and so nearly equal that an exit test is satisfied. The
- exit test is set so that a desired level of accuracy is obtained.
- The values of the M parameters averaged over all the vertices, ie
- the parameter values for the centroid of the simplex, serve as
- reliable estimates of the best fit parameter values (those for
- the least squares function minimum), because the minimum point is
- known to be inside the shrunken simplex and thus near the
- centroid.
-
- We generally want to estimate the uncertainties in the parameter
- values obtained for a model fit to a particular set of data
- points. Standard deviations of the parameters are calculated by
- the program used here.
-
- There are likely to be large uncertainties in the parameters if
- there are few data points or if there are large deviations
- between Ymodel and Yobserved. As a rule, one should have 5 to 10
- times as many data points as parameters.
-
- The first try at estimating uncertainties of the parameters can
- fail. The calculation involves matrix inversion, the use of
- differences between nearly equal large numbers, and the
- approximation of a complex surface by a simple quadratic
- function. It may be necessary to change certain test values and
- then to repeat the calculation of the standard deviations. In
- particular, if a parameter is close to a bound, so that expansion
- of the simplex in that dimension is not possible, then that
- parameter should be fixed in the quadratic fit.
-
-
- 3
-
-
-
-
-
-
-
-
-
-
- All fitting methods can fail. We will not discuss problems with
- bounds, local minima, ill-behaved functions, poor quality data,
- physically unreasonable best fits, etc. References given in
- subsection C below discuss fitting methods in more detail.
-
- For additional discussion of fitting by use of the simplex
- method, see "Notes on the fitting program", and the article by
- Nelder and Mead (1965).
-
-
- B. THE FUNCTION FIT TO THE DATA
-
- In the computer program you will use, the function fit to the
- data, which you obtained in the kinetic experiment with lactate
- dehydrogenase, is for an ordered ternary-complex pathway with
- dead-end complexes (EAP and EQB). The following scheme describes
- the pathway:
-
- EAP
- | KPInhib = ea * p/eap
- |
- ----------EA-----------
- k1 * a | k-1 k-2 | k2 * b
- | |
- | |
- E EAB <---> EPQ eq. 5
- | |
- | |
- k4 | k-4 * q k-3 * p | k3
- ----------EQ-----------
- |
- | KBInhib = eq * b/eqb
- EQB
-
- This pathway is more complicated than the one, without dead- end
- complexes, on which are based the standard graphical methods of
- analysis of two-substrate--two-product reactions:
-
- -----------EA----------
- | |
- | |
- E EAB <---> EPQ eq. 6
- | |
- | |
- -----------EQ----------
-
- For the direction of reaction (pyruvate reduction by NADH) and
- the product inhibitor (lactate) studied in these measurements,
- the rate law for the pathway of equation (5) is:
-
-
-
-
-
-
- 4
-
-
-
-
-
-
-
-
-
- vo = V / [ 1 + KmQ/KmPQ * p + KmA/a eq. 7
-
- + KmB * (1 + KmQ/KmPQ * p) * (1 + 1/KPInhib * p)
-
- + KmAB/(a * b) * (1 + KmQ/KmPQ * p)
-
- + k3/(k3 + k4) * 1/KBInhib * b ]
-
- The presence in the pathway of equation (5) of the dead end
- complexes EAP and EQB leads to a significantly more complicated
- rate law than is found for the simpler pathway of equation (6);
- compare equation (7) with the following equation, for the "bare"
- compulsory order pathway without dead-end complexes (the pathway
- of equation (6)):
-
- vo = V / [ 1 + KmQ/KmPQ * p + KmA/a eq. 8
-
- + KmB * (1 + KmQ/KmPQ * p)
-
- + KmAB/(a * b) * (1 + KmQ/KmPQ * p) ]
-
- The computer program used to reduce the kinetic data fits
- equation (7) to measurements of the initial rate v(0) as a
- function of the concentrations of NADH, pyruvate, and lactate.
-
- The parameters that are evaluated in the fitting are listed in
- Table I and are those of equation (7).
-
- Several points should be noted regarding equations (7) and (8):
- (1) The last term of equation (7), containing the equilibrium
- constant KBInhib, probably can be neglected under initial rate
- conditions with q=0; in the fitting program this is recognized by
- setting the last term at a small value, 1E-10, which removes it
- from the equation. (2) With this change, equations (7) and (8)
- are identical except for the appearance in one term of equation
- (7) of a factor containing the equilibrium constant KPInhib,
- which is for dissociation of the dead-end complex EAP defined in
- the pathway of equation (5).
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 5
-
-
-
-
-
-
-
-
-
- By J.A. Rupley, Tucson, Arizona
-
- REFERENCES:
-
- A simplex method for function minimization.
- J.A. Nelder and R. Mead (1965. Computer J. 7, 308.
-
- Digital computer user's handbook.
- M. Klerer and G.A. Korn (1967). Mcgraw-Hill, New York.
-
- Data analysis in biochemistry and biophysics.
- M.E. Magar (1972). Academic, New York.
-
- The solution of the general least squares problem with special
- reference to high-speed computers.
- R.H. Moore and R.K. Ziegler (1960). Los Alamos Scientific
- Laboratory Report LA-2367.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 6
-
-
-
-
-
-
-
-
-
- By J.A. Rupley, Tucson, Arizona
-
- TABLE I:
-
- EQUATION (7)
- FITTING RATE LAW
- PARAMETERS PARAMETERS
-
- 1 V
-
- 2 KmA = KmNADH
-
- 3 KmB = KmPyruvate
-
- 4 KmAB = KmNADH-Pyruvate
-
- 5 KmQ/KmPQ = KmNAD/KmLactate-NAD
-
- 6 1/KPInhib = 1/KInhibLactate
-
- 7 k3/(k3 + k4) * 1/KInhibPyruvate
- parm(7) approx. equal to 0 at t=0
-
-
- INDEPENDENT VARIABLES:
-
- a = [NADH] b = [Pyruvate] p = [Lactate] q = [NAD]
- q = 0 at t=0
-
-
- DEPENDENT VARIABLE:
-
- vo = initial rate of conversion of pyruvate to lactate
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 7
-
-
-
-
-
-
-
-
-
- By J. A. Rupley, Tucson, Arizona
-
- PROCEDURES:
-
- A. SUMMARY OF STEPS IN THE DATA REDUCTION
-
- (1) Startup. Create a new file with your data, following the
- template of the DATA SHEET. Setup the title and the starting
- ranges of the parameters, from which the starting simplex is
- constructed. Be sure to save the setup on disk.
-
- (2) Minimization. The minimum of the least squares function is
- located by an iterative method, in which the simplex, a
- polyhedron in parameter space, is shrunk about the point
- representing the minimum. The starting simplex is large,
- sufficiently so to certainly include the minimum point. After
- shrinking, all the vertices of the simplex are so close together,
- and thus so close to the minimum point, that the average of the
- vertices, the centroid, is a good estimate of the minimum point.
- The minimization can take more than two hours. These results
- should be saved on a disk file, so if the fitting fails, it is
- possible to quickly return to this point.
-
- (3) Quadratic fit and standard deviations. The least squares
- function about the minimum is approximated by a quadratic
- surface, the shape of which gives estimates of the standard
- deviations of the parameters. This step takes about 10 minutes.
- There is considerable output, the last of which is a list of the
- standard deviations.
-
- The fitting program can be set to cycle between a set of
- minimization iterations and the quadratic approximation. This
- option can produce quicker convergence.
-
-
- B. FAILURE OF THE FITTING
-
- Fitting can lead to physically unreasonable values of the
- parameters, for example for a kinetics experiment a set of
- parameters for which V is unreasonably large or small, or a set
- with standard deviations such that one or more of the principal
- parameters (V, KmA, KmB, KmAB) have no significance.
-
- The best fit may be unacceptable. To judge whether a fit is
- acceptable, compare the root mean square error (RMS error) with
- the expected uncertainty in your independent variable, the meas-
- ured rates.
-
- The most likely causes of failure are:
-
- (1) A few data points (flyers) that are widely wrong. The
- results should always be examined for flyers, which we will
- somewhat arbitrarily define as measured values differing from
- calculated values by more than 3 standard deviations: ABS(Ymodel
-
-
- 8
-
-
-
-
-
-
-
-
-
- - Yobserved) > 3 x RMS ERROR. If there are flyers, delete them
- and repeat the calculation, or correct the values by
- recalculation of vo from your experimental data.
-
- (2) Trapping of the fit in a local minimum in parameter space
- not near the true minimum, or a broad and ill-defined minimum
- region. To test for trapping or a broad minimum, start the
- fitting with a different simplex. For example, set tighter
- starting ranges on one or more of the principal parameters (V,
- KmA, KmB, KmAB). Alternatively, set one principal parameter, e.g.
- V, at a reasonable value.
-
- Generally, to test for trapping, one repeats the fitting with a
- more expanded starting simplex, not with a tighter one as
- suggested above. However, assuming the tighter ranges include
- physically reasonable values of the parameters, the approach
- above, using a restricted simplex, has the advantage of testing
- whether a set of physically reasonable values can give an
- acceptable fit of the data. To judge whether a fit is acceptable,
- even though it may not be as good as a physically unreasonable
- best fit, compare the root mean square error (RMS error) with the
- expected uncertainty in your dependent variable, the measured
- rates.
-
- (3) Systematic error in the measurements or poor data. Look for
- systematic bias in the extraction of the slope. Look for errors
- in calculating the value of vo.
-
- One can plot the measured values with the calculated lines, ie
- construct the reciprocal plots. It should then be clear what the
- problem is. Correction, if it is possible, generally requires
- examination of the original measurements, ie the tracings of
- transmittance versus time. A point to be made is that the
- computer fitting, because it is unbiased, is more likely than the
- experimenter to detect systematic error or poor data.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 9
-
-
-
-
-
-
-
-
-
- By J. A. Rupley, Tucson, Arizona
-
-
- INSTRUCTIONS FOR FITTING KINETIC DATA FOR LACTATE DEHYDROGENASE
- WITH THE PROGRAM LDHFIT.COM (C-LANGUAGE):
-
-
- A. CREATE THE DATA FILE
-
- By use of a word processing program, modify the template file
- LDHDATA, incorporating your vo values and so constructing your
- own data file, eg TUE-C.DAT.
-
-
- B. RUN THE FITTING PROGRAM
-
- Run the program LDHFIT with your data file (eg TUE-C.DAT) to
- produce the results file (eg TUE-C.PRT)
-
- A>LDHFIT TUE-C.DAT TUE-C.PRT
-
- It takes about 50 minutes for 100 iterations. 200 to 400
- iterations are commonly required for convergence. The fitting
- will cycle between sessions of Nelder-Mead minimization (30
- iterations) and quadratic fitting.
-
- You should watch the fitting. Let the program run freely until
- the test value is < 10-4. Then: (1) See if a parameter needs to
- be eliminated. If the calculation of -pmin- fails at this stage
- in the fitting, probably the fitting will not converge. If one of
- the parameter values listed under -pmin- in the summary of the
- quadratic fit results is negative, then this parameter should be
- fixed at 1.e-10. This can be done by use of a program option (see
- below). (2) Examine the data listing for flyers: (Yobserved(i) -
- Ycalculated(i)) > 3 * rms error. If these are present, you should
- delete them from the data set, or set their weight to zero. This
- is done by reworking the data file, with a word processor, then
- rerunning LDHFIT. You will get faster convergence by setting the
- starting simplex in the data file at an intermediate point in the
- fitting.
-
- To exercise options, during the minimization type: CTRL-X
-
- After some output, you will be asked to select an option:
- display of the data and then quadratic fit (<CR>)
- fix of a parameter (CTRL-F)
- exit from the program to operating system (CTRL-C)
- return to the fitting (CTRL-X)
-
-
-
-
-
-
-
-
- 10
-
-
-
-
-
-
-
-
-
- B. AFTER THE FITTING CONVERGES
-
- When the simplex has shrunk to where it satisfies the exit test,
- the minimization pauses and expects input from the operator:
-
- respond with <RETURN> when asked if you want to display data;
-
- respond with <RETURN> when asked at the next pause if you
- want to carry out quadratic fit;
-
- at the next pause, record the number of the iteration, and
- exit by typing CTRL-C;
-
-
- D. CONDENSE THE RESULTS FILE
-
- The results file (eg TUE-C.PRT) will be too long to print.
- Select the information from the last data display and quadratic
- fitting by use of a word processor or a special search program.
-
-
- E. FAILURE OF THE FIT
-
- Repeat the fit after checking your data and possibly resetting
- the starting parameter values.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 11
-
-
-
-