Hi,
I wouldn't call myself a PCA expert - so don't weight my answer too
heavily - but here is what I think is happening:
Looking at the code, the input data array is centered and scaled to
unit variance in each dimension. The attribute .a of the class is a
copy of the array that is actually sent to the SVD; note the
centering/scaling. I don't have a proof of this, but intuitively I
expect that the PCA axes associated with a 2-dimensional centered/scaled
array will always be at 45 degree angles (e.g., [1,1], [-1,1], etc., which
are normalized to [sqrt(1/2), sqrt(1/2)], etc.). I think one way to
describe this is that after centering/scaling there are no degrees of
freedom left if you only started with 2 dimensions. So I don't think
there is a bug, but it is maybe unclear what the PCA class is doing.
If you increase to more than 2 dimensions, you can see that Wt changes
from one random input to the next:
In [102]: pcaObj = PCA(np.random.randn(200,2))
In [103]: pcaObj.Wt
Out[103]:
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])
In [104]: pcaObj = PCA(np.random.randn(200,3))
In [105]: pcaObj.Wt
Out[105]:
array([[ 0.65456366, -0.24141116, -0.7164266 ],
       [ 0.39843462,  0.91551401,  0.05553329],
       [ 0.64249223, -0.32179924,  0.69544877]])
In [106]: pcaObj = PCA(np.random.randn(200,3))
In [107]: pcaObj.Wt
Out[107]:
array([[-0.29885902, -0.67436982,  0.67521007],
       [-0.95428685,  0.21449891, -0.20815098],
       [-0.00446109, -0.70655189, -0.70764718]])
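The 2-dimensional behaviour can also be checked without the PCA class. After centering and scaling, the 2x2 matrix a.T.dot(a) is proportional to [[1, r], [r, 1]], where r is the correlation between the two columns, and for any nonzero r the eigenvectors of that matrix are [1, 1] and [1, -1] (up to sign and normalization). Below is a minimal sketch using only NumPy; the helper name standardized_pca_axes is made up for illustration, and it is my assumption that this standardize-then-SVD recipe mirrors what the class does, not a copy of its code.

```python
import numpy as np

def standardized_pca_axes(data):
    """Hypothetical helper: center each column, scale it to unit
    variance, and return the right singular vectors as rows."""
    a = (data - data.mean(axis=0)) / data.std(axis=0)
    _, _, vt = np.linalg.svd(a, full_matrices=False)
    return vt

rng = np.random.RandomState(0)

# Two correlated columns with very different scales and offsets.
x = 4 * rng.randn(200, 1) + 2
y = 0.25 * x + rng.randn(200, 1)
Wt_2d = standardized_pca_axes(np.hstack((x, y)))
print(np.abs(Wt_2d))  # every entry is sqrt(1/2), about 0.7071

# With three columns the axes depend on the data.
Wt_3d = standardized_pca_axes(rng.randn(200, 3))
print(Wt_3d)
```

Changing the scale, offset, or noise level of x and y should leave the absolute values of the 2-D result unchanged, while the 3-D result changes with each new random draw.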
Hope that helps,
Aronne
On Tue, Jun 12, 2012 at 1:03 AM, Justin R <justinbrowe@...287...> wrote:
operating system Windows 7
matplotlib version : 1.1.0
obtained from sourceforge
The class seems to generate the same Wt matrix for every input.
Every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).
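import numpy as np
from matplotlib.mlab import PCA

# First data set: two correlated columns
dat1 = 4*np.random.randn(200,1) + 2
dat2 = dat1*.25 + 1*np.random.randn(200,1)
pcaObj1 = PCA(np.hstack((dat1,dat2)))
print(pcaObj1.Wt)

# Second data set: different scale and correlation
dat3 = 2*np.random.randn(200,1) + 2
dat4 = dat3*2 + 3*np.random.randn(200,1)
pcaObj2 = PCA(np.hstack((dat3,dat4)))  # was (dat1,dat2), which only repeated the first case
print(pcaObj2.Wt)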
The output Y seems to be correct, and the projection function works.
Only the Wt matrix seems to be messed up. Am I using this class
incorrectly, or could this be a bug?