The two forms of the via argument to fit serve two largely distinct purposes. The via "file" form is best used for batch operation (possibly unattended), where you just supply the startup values in a file and perhaps later use update to copy the results back into another file (or the same one).
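A batch run along these lines might look like the following sketch. The data values, the model f(x), and the file names start.par and fit.par are all made up for illustration, and the start-value file is written by the script only to keep the example self-contained.

```gnuplot
# Hypothetical data, inlined as a datablock (needs gnuplot 5 or newer).
$data << EOD
0 1.1
1 2.9
2 5.2
3 6.8
4 9.1
EOD

# Write a hypothetical start-value file, one "name = value" per line.
# In real batch use this file would normally exist already.
set print "start.par"
print "a = 1.0"
print "b = 0.0"
set print

# Hypothetical model function.
f(x) = a*x + b

# Take the parameter names and startup values from the file.
fit f(x) $data using 1:2 via "start.par"

# Copy the fitted values into another parameter file.
# Assumption: your gnuplot still provides 'update'; newer releases mark it
# as deprecated, and 'save fit "fit.par"' is the usual substitute there.
update "start.par" "fit.par"
```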
The via var1, var2, ... form is best used interactively. Using the command history mechanism built into gnuplot, you can easily edit the list of parameters to be fitted or supply new startup values for the next try. This is particularly useful for hard problems, where a direct fit to all the parameters at once won't work, at least not without really good values to start with. To find such a set of good starting parameters, you can iterate several times, fitting only some of the parameters each time, until the values are close enough to the goal that the final fit (to all the parameters at once) will work.
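As a sketch of such a step-by-step session, the following fits subsets of the parameters first and all of them only at the end. The data, the model, and the starting values are invented for illustration; which parameters to hold fixed at each step is a judgement call for your own problem.

```gnuplot
# Hypothetical data, roughly an exponential decay on top of an offset.
$data << EOD
0 5.02
1 3.40
2 2.50
3 1.88
4 1.56
5 1.31
6 1.21
8 1.07
EOD

# Hypothetical model and rough starting values.
f(x) = a*exp(-x/tau) + c
a = 5; tau = 1; c = 1

# Step 1: fit amplitude and offset only; tau stays at its current value.
fit f(x) $data using 1:2 via a, c

# Step 2: fit the decay constant only, keeping the improved a and c.
fit f(x) $data using 1:2 via tau

# Step 3: final fit to all the parameters at once.
fit f(x) $data using 1:2 via a, tau, c
```

Each fit command starts from whatever values the variables currently hold, so recalling a line from the history and editing the via list is all it takes to try a different subset.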
A general word about starting values: fit may, and often will, get really badly lost in searching for the optimal parameter set if you start it way off any possible solution. The main reason for this is that nonlinear fitting is not guaranteed to converge to a global optimum. It can get stuck in a local optimum, and there's no way for the routine to find out about that. You'll have to use your own judgement in checking whether this has happened to you or not.
To partly avoid that problem, you should put all starting values at least roughly into the vicinity of the solution. At least the order of magnitude should be correct, if possible. The better your starting values are, the less error-prone the fit. A good way to find starting values is to plot the data and the fitting function together, then iterate: change the values and replot until the curve resembles the data reasonably well. The same plot is also useful for checking whether the fit got stuck in a non-global minimum.
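The plot-and-adjust loop could be sketched like this. The data and the model are hypothetical, and the second set of values simply stands for "a better guess made after looking at the first plot".

```gnuplot
# Hypothetical data.
$data << EOD
0 5.02
1 3.40
2 2.50
3 1.88
4 1.56
5 1.31
6 1.21
8 1.07
EOD

# Hypothetical model with a first, crude guess.
f(x) = a*exp(-x/tau) + c
a = 1; tau = 1; c = 0

# Draw data and model into one plot to judge the guess by eye.
plot $data using 1:2 with points title "data", \
     f(x) with lines title "model with current values"

# Adjust the values and redraw; repeat until the curve is near the data.
a = 4; tau = 2; c = 1
replot

# Only then hand the values over to fit.
fit f(x) $data using 1:2 via a, tau, c

# Redraw once more to check the result against the data.
replot
```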
Make sure that there is no mutual dependency among parameters of the function you are fitting. E.g., don't try to fit a*exp(x+b), because a*exp(x+b) = a*exp(b)*exp(x): a and b enter only through the product a*exp(b), so the data cannot determine them separately. Instead, fit either a*exp(x) or exp(x+b).
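A minimal sketch of the rewritten fit, using made-up data and the hypothetical name A for the single remaining parameter:

```gnuplot
# Hypothetical data, roughly 2*exp(x).
$data << EOD
0.0  2.1
0.5  3.3
1.0  5.4
1.5  9.0
2.0 14.7
EOD

# Over-parameterized form, shown only for comparison and not fitted:
#   g(x) = a*exp(x + b)
# Any pair (a, b) with the same a*exp(b) gives the same curve.

# Reduced form: one parameter A standing in for a*exp(b).
f(x) = A*exp(x)
A = 1
fit f(x) $data using 1:2 via A
```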
A technical issue: the parameters must not be too different in magnitude. The larger the quotient of the largest and the smallest absolute parameter values, the slower the fit will converge. If the quotient is close to or above the inverse of the machine floating point precision, it may take next to forever to converge, or refuse to converge at all. You'll have to adapt your function to avoid this, e.g. replace 'parameter' by '1e9*parameter' in the function definition, and divide the starting value by 1e9.
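For instance, with an amplitude of order 1e9 and a decay constant of order 1, the rescaling might be sketched as follows. The data, the model, and the parameter names N, n, and tau are assumptions made for this example.

```gnuplot
# Hypothetical data with y values of order 1e9.
$data << EOD
0 3.02e9
1 1.82e9
2 1.10e9
3 0.67e9
4 0.41e9
5 0.25e9
EOD

# Badly scaled form, shown only for comparison and not fitted:
#   g(x) = N*exp(-x/tau)   with N near 3e9 and tau near 2

# Rescaled form: n = N/1e9 is of order 1, like tau.
f(x) = 1e9*n*exp(-x/tau)

# Starting value for n is the guess for N divided by 1e9.
n = 3; tau = 1.5
fit f(x) $data using 1:2 via n, tau

# Recover the amplitude in its original units if needed.
print "N = ", 1e9*n
```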
If you can write your function as a linear combination of simple functions weighted by the parameters to be fitted, by all means do so. That helps a lot, because the problem is then not nonlinear any more. It should take only a really small number of iterations to converge on a linear case, maybe even only one.
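One common case, sketched here with invented data: a sine of known frequency with unknown amplitude and phase, A*sin(x+phi), is nonlinear in phi, but it equals a*sin(x) + b*cos(x) with a = A*cos(phi) and b = A*sin(phi), which is linear in a and b. The names a, b, A, and phi are chosen for this example only.

```gnuplot
# Hypothetical data, roughly 2*sin(x + 0.5), angles in radians.
$data << EOD
0  0.96
1  2.01
2  1.18
3 -0.71
4 -1.94
5 -1.42
6  0.44
EOD

# Linear-in-parameters form of A*sin(x + phi).
f(x) = a*sin(x) + b*cos(x)

# Starting values matter little for a linear problem.
a = 1; b = 1
fit f(x) $data using 1:2 via a, b

# Convert back to amplitude and phase afterwards.
print "A   = ", sqrt(a**2 + b**2)
print "phi = ", atan2(b, a)
```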
In prescriptions for analysing data from practical experimentation courses, you'll often find descriptions of how to first fit your data to some functions, maybe in a multi-step process accounting for several aspects of the underlying theory one by one, and then extract the data you really wanted from the fitting parameters of that function. With fit, this last step can often be eliminated by rewriting the model function to directly use the desired final parameters. Transforming data can also be avoided quite often, although sometimes at the cost of a harder fit problem. If you think this contradicts the previous paragraph about keeping the fit function as simple as possible, you're correct.
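As an illustration, suppose the quantity you actually want from a decay measurement is the half-life. Instead of fitting a decay rate and converting it afterwards, the model can be written in terms of the half-life itself. The data and the names N0 and thalf are hypothetical.

```gnuplot
# Hypothetical decay data.
$data << EOD
0 4.02
1 2.41
2 1.49
3 0.90
4 0.53
5 0.34
EOD

# Two-step route, shown only for comparison and not fitted:
#   g(x) = N0*exp(-rate*x), then thalf = log(2)/rate

# Direct route: the half-life is a fit parameter (log is the natural log).
f(x) = N0*exp(-log(2)*x/thalf)
N0 = 4; thalf = 1
fit f(x) $data using 1:2 via N0, thalf
```

A side benefit is that the error estimate reported by fit then refers to the half-life directly, with no need to propagate it through a conversion by hand.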
Finally, a nice quote from the manual of another fitting package (fudgit) that kind of summarizes all these issues: "Nonlinear fitting is an art!"