polyfit implementation

I'm having some troubles while fitting data with high order polynomials
(typically order 6). I looked at the polyfit function in
'matplotlib/mlab.py' and I must say I don't understand why the
least-square problem isn't solved using the matrix multiplication
instead of the * multiplication. Unless I misunderstood something,
shouldn't the following line:

Obviously I misunderstood something since quick trials gave me:

>>> array([[0,1],[1,0]]) * array([[2],[3]])
array([[0, 2],
        [3, 0]])

>>> matrixmultiply(array([[0,1],[1,0]]), array([[2],[3]]))
array([[3],
        [2]])

as the polyfit function works well...

Anyway it seems that my polyfit troubles come from computation precision issues. Indeed my polynomial may be of too high order than data actually need so that some coefficients are very low (eg. 1e-20). I made some experiments that show the results depend on the use of Numeric/numarray:

* Numeric
I must use polyfit(x,y,N) with Float64 for x because it crashes with Float32, but results are very inaccurate

* numarray
I can use either Float64 or Float32, but only Float32 give good results (may someone have an explanation...)

Compared to Matlab, polynomial fitting with Python is not as good in this case. For people who are interested in testing my data I provided an attached text file of x and y, the polynomial order is 6.

JM. Philippe

data.txt (22.5 KB)

···

jean-michel.philippe@...526... wrote:

Hello Jean-Michel,

Anyway it seems that my polyfit troubles come from computation precision
issues. Indeed my polynomial may be of too high order than data actually

need so that some coefficients are very low (eg. 1e-20). I made some
experiments that show the results depend on the use of Numeric/numarray:

* Numeric
I must use polyfit(x,y,N) with Float64 for x because it crashes with
Float32, but results are very inaccurate

* numarray
I can use either Float64 or Float32, but only Float32 give good results

(may someone have an explanation...)

Compared to Matlab, polynomial fitting with Python is not as good in
this case. For people who are interested in testing my data I provided

an attached text file of x and y, the polynomial order is 6.

That does not depend on Numeric, numarray or Matlab. Is your data. If you
try, in Matlab, to see how polyfit works (type polyfit), you will see with
a simple trial that your data are bad conditioned. As an example, if x,
y are the 2 rows of your data:

x = x(:);
y = y(:);

% Construct the Vardermonde Matrix
V(:,n+1) = ones(length(x),1);
for j = n:-1:1
   V(:,j) = x.*V(:,j+1);
end

[Q,R] = qr(V,0);
p = R\(Q'*y);
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 2.575966e-026.

In fact, try to take a look at the condition number of your matrix R:

condest(R)

ans =

     3.89823249817041e+025

That's too high. Neither Matlab, Python or whatever software will give you
a result on which you can rely. Maybe at a first glance Matlab seems to
be more powerful (and, in general, this is the case), but be aware that
you should not trust on results that are affected by so bad conditioning
number/numerical errors.
Try to reduce the number of points (some of them are too close), or try
a non-linear regression (as lsqnonlin), even if you should not need such
a tool in order to do the job.

HTH.

Andrea.