Introduction To Fitting

Beginner's guide to fitting in general

fit is used to find a set of parameters for a parametric function such that the function fits your data optimally. The quantity to be minimized is the sum of squared differences between your input data points and the function values at the same places, usually called 'chi-squared' (i.e. the Greek letter chi, to the power of 2). (To be precise, the differences will be divided by the input data errors before being squared; see fit errors for details.)
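As an illustration of the quantity described above, here is a minimal Python sketch of the error-weighted sum of squares. It is not the code fit runs; the helper name chi_squared, the straight-line model and the data values are all made up for this example.

```python
def chi_squared(f, params, xs, ys, sigmas):
    """Sum of squared, error-weighted differences between data and model."""
    total = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        residual = (y - f(x, *params)) / sigma  # divide by the data error first
        total += residual ** 2                  # then square and accumulate
    return total


# Hypothetical model and made-up data, for illustration only.
def line(x, a, b):
    return a * x + b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]        # lies exactly on y = 2*x + 1
sigmas = [0.5, 0.5, 0.5, 0.5]    # assumed error of each data point

print(chi_squared(line, (2.0, 1.0), xs, ys, sigmas))  # 0.0: perfect parameters
print(chi_squared(line, (2.0, 0.0), xs, ys, sigmas))  # 16.0: offset wrong by 1
```

In the second call every point misses by 1, which is two error bars, so each of the four points contributes 4 to the total. Fitting means searching for the parameter values that make this number as small as possible.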

Now that you know why it's called 'least squares fitting', let's see why it's 'nonlinear'. That's because the function's dependence on the parameters (not the data!) may be nonlinear. Of course, this might not tell you much if you didn't know it already, so let me try to describe it. If the fitting problem were linear, the target function would have to be a sum of simple functions that contain no parameters themselves, each multiplied by one parameter. (For example, consider the function f(x) = c*sin(x), where we want to find the best value for the constant c. This is nonlinear in x, of course, but it is linear in c. Since the fitting procedure solves for c, it has a linear equation to solve. By contrast, in f(x) = sin(c*x) the parameter sits inside the sine, so the problem is nonlinear in c.) For such a linear case, the task of fitting can be performed by comparatively simple linear algebra in one direct step. But fit can do more for you: the parameters may be used in your function in any way you can imagine. To handle this more general case, however, it has to perform an iteration, i.e. it will repeat a sequence of steps until it finds the fit to have 'converged', or until you stop it.
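The difference between the two cases can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: all data errors are taken to be 1, the data are made up and noise-free, and the iteration is a plain Gauss-Newton update chosen for brevity. It is not meant to reproduce what fit does internally, and the function names are hypothetical.

```python
import math

# Made-up x values, for illustration only.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]


def fit_linear_c(xs, ys):
    """Linear case, f(x) = c*sin(x): d(chi^2)/dc = 0 gives c in one direct step."""
    num = sum(y * math.sin(x) for x, y in zip(xs, ys))
    den = sum(math.sin(x) ** 2 for x in xs)
    return num / den


def fit_nonlinear_b(xs, ys, b=1.0, tol=1e-12, max_iter=50):
    """Nonlinear case, f(x) = exp(-b*x): repeat update steps until converged."""
    for iteration in range(1, max_iter + 1):
        # Residuals and the derivative df/db, both at the current guess for b.
        r = [y - math.exp(-b * x) for x, y in zip(xs, ys)]
        J = [-x * math.exp(-b * x) for x in xs]
        # Gauss-Newton step: solve the problem linearized around the current b.
        step = sum(j * ri for j, ri in zip(J, r)) / sum(j * j for j in J)
        b += step
        if abs(step) < tol:      # 'converged': b has stopped changing
            return b, iteration
    return b, max_iter           # gave up without converging


ys_lin = [3.0 * math.sin(x) for x in xs]    # generated with c = 3
ys_exp = [math.exp(-0.7 * x) for x in xs]   # generated with b = 0.7

c = fit_linear_c(xs, ys_lin)
b, n_iter = fit_nonlinear_b(xs, ys_exp)
print("linear:    c =", c)
print("nonlinear: b =", b, "after", n_iter, "iterations")
```

The linear fit needs no starting value and no loop. The nonlinear one needs both: it starts from a guess (b = 1.0 here) and improves it step by step, which is why the choice of starting values can matter in the general case.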

Generally, the function to be fitted will come from some kind of theory (some prefer the term 'model' here) that makes a prediction about how the data should behave, and fit is then used to find the free parameters of the theory. This is a typical task in scientific work, where you have lots of data that depend in more or less complicated ways on the values you're interested in. The results will then usually be of the form 'the measured data can be described by the {foo} theory, for the following set of parameters', and then a set of values is given, together with the errors of your determination of these values.

This reasoning implies that fit is probably not your tool of choice if all you really want is a smooth line through your data points. If you want this, the smooth option to plot is what you've been looking for, not fit. See plot datafile smooth for details.