Yale University

PHYS 381/504: Least-mean-squares fitting

© 2005 Yale University, New Haven, Connecticut 06520

When you carry out an experimental measurement, you generally acquire a series of data points $x_i$. The value of each $x_i$ will usually deviate from its expected value $\xi_i$ because of measurement noise or other nefarious reasons. Often, it is assumed that $x_i$ may be approximated as a Gaussian random variable (GRV) with mean $\xi_i$ and some standard deviation $\sigma_i$, a.k.a. the measurement error. When counting the number of times that independent events occur within some range, we expect a Poisson distribution of counts. In this case, the error in $n$ counts is $\sqrt{n}$. In other examples that do not involve counting, the error will be different. Be aware that a Poisson distribution is not Gaussian. Nevertheless, for more than a few counts, a Poisson distribution is well approximated by a Gaussian.
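The quality of the Gaussian approximation can be checked numerically. What follows is a minimal Python sketch, not part of the original handout; the function names are hypothetical. It compares the Poisson probability of observing $k$ counts at mean $\mu$ with a Gaussian density of mean $\mu$ and standard deviation $\sqrt{\mu}$ evaluated at the same integer $k$ (treating each integer as a bin of unit width, which is an assumption of this comparison), and reports the largest discrepancy.

```python
import math

def poisson_pmf(k, mu):
    """Probability of observing exactly k counts when the mean count is mu."""
    # Work with logarithms so that large k and mu do not overflow.
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def max_abs_difference(mu):
    """Largest gap between the Poisson probabilities and the matching Gaussian."""
    sigma = math.sqrt(mu)  # Poisson error: square root of the mean count
    k_max = int(mu + 10.0 * sigma) + 1  # far enough into the upper tail
    return max(abs(poisson_pmf(k, mu) - gaussian_pdf(k, mu, sigma))
               for k in range(k_max + 1))

# The discrepancy shrinks as the mean number of counts grows.
for mu in (2, 10, 100):
    print(mu, max_abs_difference(mu))
```

For a mean of only a couple of counts the two distributions differ visibly, while for a mean of order 100 the difference is a small fraction of the peak probability.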

Least-mean-squares fitting generally seeks to minimize $\chi^2$ ("chi-squared"),

$$\chi^2 = \sum_{i=1}^{N} \frac{(x_i - \xi_i)^2}{\sigma_i^2}, \tag{1}$$

where $x_i$, $\sigma_i$, and $\xi_i$ are the experimental value, the error, and the theoretical value, respectively, of data point $i$, and $N$ is the total number of data points. Usually, $\xi_i$ is a function of the fitting parameters, which are varied to minimize $\chi^2$; the minimum corresponds to the best fit. Just as $x_i$ is a random variable, so is $\chi^2$. When $(x_i - \xi_i)/\sigma_i$ can be considered a GRV, $c = \chi^2$ can be shown to have the probability density

$$p_{\chi^2}(c) = \frac{1}{2^{M/2}\,\Gamma(M/2)}\, e^{-c/2}\, c^{M/2 - 1}, \tag{2}$$

where $M = N - N_p$ is the number of degrees of freedom and $N_p$ is the number of fitting parameters. Thus, we can answer the question: what is the probability $P(\chi^2 > \chi^2_o)$ that $\chi^2$ exceeds the observed value $\chi^2_o$? Specifically, we have

$$P(\chi^2 > \chi^2_o) = \int_{\chi^2_o}^{\infty} p_{\chi^2}(c)\, dc = \frac{\Gamma(M/2,\ \chi^2_o/2)}{\Gamma(M/2)}, \tag{3}$$

where $\Gamma(z)$ is the Euler gamma function and $\Gamma(z, a)$ is the (upper) incomplete gamma function, both of which Mathematica can calculate for you. The idea, of course, is that if the observed value $\chi^2_o$ is improbably large or improbably small, that is, if either $P(\chi^2 > \chi^2_o)$ or $P(\chi^2 < \chi^2_o) = 1 - P(\chi^2 > \chi^2_o)$ is small, then you probably should be worried, either about whether your theoretical model is in fact appropriate or about the quality of your measurements. This assumes that you have appropriate errors. The most common reason for funky values of $\chi^2$ is incorrect errors.
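As an illustration of the whole procedure, here is a minimal Python sketch that is not part of the original handout. It assumes the simplest possible model, a constant ($N_p = 1$), for which the value that minimizes $\chi^2$ is the error-weighted mean. The data values and all function names are made up for illustration. The ratio $\Gamma(M/2, \chi^2_o/2)/\Gamma(M/2)$ is evaluated with the standard power series for the lower incomplete gamma function, which is adequate for moderate values of $\chi^2_o$ but is not a substitute for a library routine.

```python
import math

def chi_squared(x, sigma, xi):
    """Sum of squared, error-normalized residuals (the chi-squared statistic)."""
    return sum(((xv - th) / s) ** 2 for xv, s, th in zip(x, sigma, xi))

def weighted_mean(x, sigma):
    """Constant model value that minimizes chi-squared (one fitting parameter)."""
    weights = [1.0 / s ** 2 for s in sigma]
    return sum(w * xv for w, xv in zip(weights, x)) / sum(weights)

def regularized_upper_gamma(a, x, tol=1e-14, max_terms=10000):
    """Gamma(a, x) / Gamma(a), computed as 1 minus the lower-function series.

    Loses relative accuracy when the result is very close to 0 (large x).
    """
    if x <= 0.0:
        return 1.0
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)  # x^n / ((a+1)(a+2)...(a+n))
        total += term
        if term < tol * total:
            break
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    return 1.0 - lower

def tail_probability(chi2_obs, m):
    """Probability that chi-squared with m degrees of freedom exceeds chi2_obs."""
    return regularized_upper_gamma(0.5 * m, 0.5 * chi2_obs)

# Hypothetical data: five measurements of one quantity, all with the same error.
x = [10.2, 9.7, 10.5, 9.9, 10.1]
sigma = [0.3] * len(x)

best = weighted_mean(x, sigma)            # best-fit constant
chi2_obs = chi_squared(x, sigma, [best] * len(x))
m = len(x) - 1                            # M = N - N_p with N_p = 1
p_exceed = tail_probability(chi2_obs, m)

print("best-fit value:", best)
print("chi-squared:", chi2_obs, "with M =", m)
print("P(chi2 > observed):", p_exceed)
print("P(chi2 < observed):", 1.0 - p_exceed)
```

If the assumed errors were too small by a factor of two, $\chi^2_o$ would be four times larger and the probability of exceeding it would drop sharply, which is the kind of signal the paragraph above warns about.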
