Repeating docs

Hi,

In tweaking mlab.psd(), I'm noticing there is a lot of overlap between
the keyword args for psd() and csd(). In fact, csd() doesn't document
them itself, but just references psd(). Additionally, the csd() and
psd() Axes methods duplicate these docs, with a few additional
parameters. Would it be a good thing to restructure the duplicated docs
into its own string that can be incorporated when necessary? Or is
this kind of "monkey patching" of the docs something we're trying to
minimize?

Ryan

···

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

No, this is something we are doing more of lately (e.g., see the contour
docs), but psd, csd, and cohere predate this docstring manipulation.
So feel free to consolidate.

JDH

···

On Tue, Nov 11, 2008 at 10:35 AM, Ryan May <rmay31@...149...> wrote:

Hi,

In tweaking mlab.psd(), I'm noticing there is a lot of overlap between
the keyword args for psd() and csd(). In fact, csd() doesn't document
them itself, but just references psd(). Additionally, the csd() and
psd() Axes methods duplicate these docs, with a few additional
parameters. Would it be a good thing to restructure the duplicated docs
into its own string that can be incorporated when necessary? Or is
this kind of "monkey patching" of the docs something we're trying to
minimize?

John Hunter wrote:

No, this is something we are doing more of lately (e.g., see the contour
docs), but psd, csd, and cohere predate this docstring manipulation.
So feel free to consolidate.

I've done psd and csd so far. I might get to cohere (and spectrogram)
later. It got a little ugly doing the axes methods, since you can only
use a single dictionary for string replacement.
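To make the single-dictionary limitation concrete, here is a minimal sketch of sharing keyword docs through %-interpolation. Everything in it (`_psd_kwdocs`, `_axes_kwdocs`, `interpolate_docs`, and the doc text itself) is hypothetical and is not matplotlib's actual docstring machinery; the point is only that the `%` operator accepts one mapping, so separate dictionaries have to be merged before substitution.

```python
# Hypothetical shared doc fragments; the wording is illustrative only.
_psd_kwdocs = {
    "PSD": "NFFT : int\n    Number of points per FFT segment.\n"
           "Fs : float\n    Sampling frequency.",
}
_axes_kwdocs = {
    "AxesExtra": "Fc : float\n    Center frequency used to offset the x axis.",
}


def interpolate_docs(*mappings):
    """Decorator that fills %(name)s placeholders in a docstring.

    The % operator takes a single mapping, so all given dictionaries
    are merged into one before substitution.
    """
    merged = {}
    for mapping in mappings:
        merged.update(mapping)

    def decorator(func):
        # Docstrings are None when Python runs with -OO, so guard for that.
        if func.__doc__ is not None:
            func.__doc__ = func.__doc__ % merged
        return func

    return decorator


@interpolate_docs(_psd_kwdocs)
def mlab_psd(x, NFFT=256, Fs=2):
    """Compute the power spectral density of x.

    %(PSD)s
    """


@interpolate_docs(_psd_kwdocs, _axes_kwdocs)
def axes_psd(x, NFFT=256, Fs=2, Fc=0):
    """Plot the power spectral density of x.

    %(PSD)s
    %(AxesExtra)s
    """
```

One caveat with this approach: any literal percent sign in an interpolated docstring has to be written as `%%`.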

On a separate note, there is *A LOT* of code duplication between psd()
and csd() in mlab. It's bugged me while I've been doing these tweaks,
but the problem was that csd() would end up doing an extra FFT vs. the
same call to psd. I think I might finally have a solution:

1) Have psd(x) call csd(x,x)
2) Have csd() check if y is x, and if so, avoid doing the extra work.

Would this be an acceptable solution to reduce code duplication?
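Roughly, the idea looks like the sketch below. This is a toy, unnormalized version (no windowing, detrending, overlap, or scaling) and not the actual mlab code; the only thing it is meant to show is the `y is x` identity check that lets psd() go through csd() without paying for a second FFT per segment.

```python
import numpy as np


def csd(x, y, nfft=256):
    """Toy cross spectral density: mean of conj(X) * Y over segments."""
    # Check identity before any conversion, since np.asarray may copy.
    same = y is x
    x = np.asarray(x, dtype=float)
    y = x if same else np.asarray(y, dtype=float)

    nseg = len(x) // nfft
    if nseg < 1:
        raise ValueError("input is shorter than one FFT segment")

    acc = np.zeros(nfft // 2 + 1, dtype=complex)
    for i in range(nseg):
        seg = slice(i * nfft, (i + 1) * nfft)
        fx = np.fft.rfft(x[seg])
        # Reuse the transform of x when both inputs are the same object.
        fy = fx if same else np.fft.rfft(y[seg])
        acc += np.conj(fx) * fy
    return acc / nseg


def psd(x, nfft=256):
    """Toy power spectral density, implemented as csd(x, x)."""
    return csd(x, x, nfft).real
```

Passing a copy of x as y gives the same numbers but takes the slow path, since the check is on object identity rather than on equal contents.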

On a separate note, once I get done with these tweaks, are there any
objections to submitting something based on this to scipy?

Ryan

···

1) Have psd(x) call csd(x,x)
2) Have csd() check if y is x, and if so, avoid doing the extra work.

Would this be an acceptable solution to reduce code duplication?

Sure, that should work fine.

On a separate note, once I get done with these tweaks, are there any
objections to submitting something based on this to scipy?

No objections here -- if it were put into numpy though, we could
depend on it and avoid the duplication. I would campaign for numpy
first, e.g., np.fft.psd, etc.

JDH

···

On Tue, Nov 11, 2008 at 1:38 PM, Ryan May <rmay31@...149...> wrote:

John Hunter wrote:

1) Have psd(x) call csd(x,x)
2) Have csd() check if y is x, and if so, avoid doing the extra work.

Would this be an acceptable solution to reduce code duplication?

Sure, that should work fine.

Ok, I noticed that specgram() duplicated much of the same code, so I
factored it all out and made a _spectral_helper() function, which pretty
much implements a cross-spectrogram. csd() and specgram() use this, and
then psd still calls csd(). Now all of the spectral analysis stuff is
using the same computational code base.
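The layering described above might look something like the following sketch. It is written under my own assumptions (a Hanning window, no detrending or scaling, hypothetical `*_sketch` names) and is not the `_spectral_helper()` that was actually committed; it only shows how a single cross-spectrogram routine can back specgram, csd, and psd.

```python
import numpy as np


def spectral_helper_sketch(x, y, nfft=256, noverlap=0):
    """Return per-segment cross spectra, shape (nfft // 2 + 1, n_segments)."""
    same = y is x  # identity check, done before any array conversion
    x = np.asarray(x, dtype=float)
    y = x if same else np.asarray(y, dtype=float)
    if not 0 <= noverlap < nfft:
        raise ValueError("noverlap must satisfy 0 <= noverlap < nfft")
    if len(x) < nfft:
        raise ValueError("input is shorter than one FFT segment")

    window = np.hanning(nfft)
    step = nfft - noverlap
    columns = []
    for start in range(0, len(x) - nfft + 1, step):
        fx = np.fft.rfft(window * x[start:start + nfft])
        # Skip the second FFT when x and y are the same object.
        fy = fx if same else np.fft.rfft(window * y[start:start + nfft])
        columns.append(np.conj(fx) * fy)
    return np.array(columns).T


def specgram_sketch(x, nfft=256, noverlap=0):
    """Spectrogram: the cross-spectrogram of x with itself."""
    return spectral_helper_sketch(x, x, nfft, noverlap).real


def csd_sketch(x, y, nfft=256, noverlap=0):
    """Cross spectral density: cross-spectrogram averaged over segments."""
    return spectral_helper_sketch(x, y, nfft, noverlap).mean(axis=1)


def psd_sketch(x, nfft=256, noverlap=0):
    """Power spectral density, still implemented as csd(x, x)."""
    return csd_sketch(x, x, nfft, noverlap).real
```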

On a separate note, once I get done with these tweaks, are there any
objections to submitting something based on this to scipy?

No objections here -- if it were put into numpy though, we could
depend on it and avoid the duplication. I would campaign for numpy
first, e.g., np.fft.psd, etc.

I agree it'd be better for us if it went to numpy, but I've gotten the
sense that they're not really receptive to adding things like this now.
I'll give it a try, but I sense that scipy.signal would end up being a
more likely home. That wouldn't help us with duplication, but would
help the community at large. It's always bugged me that I can't just
grab a psd() function from my general computing packages. (In my
opinion, anything in mlab that doesn't involve plotting should really
exist in a more general package.)

Ryan

···

Ryan May wrote:

I agree it'd be better for us if it went to numpy, but I've gotten the
sense that they're not really receptive to adding things like this now.
I'll give it a try, but I sense that scipy.signal would end up being a
more likely home. That wouldn't help us with duplication, but would
help the community at large. It's always bugged me that I can't just
grab a psd() function from my general computing packages. (In my
opinion, anything in mlab that doesn't involve plotting should really
exist in a more general package.)

There's an interesting case to be made here for modules shared between
packages at the version-control and bug-tracking level (e.g. in a
DVCS-type system) but installed in separate namespaces and shipped
independently when it is time to make source and binary distributions of
each package. There'd be duplication at the install and
distribution level, but at least not at the source level. I'd guess the
Linux packagers would also find a way to reduce duplication at those
other levels, too, for their systems.

It seems to me that this would reduce a lot of the developer angst about
having multiple sources for the same things, while still making things
easy on the users.

However, I don't know of any VCS that would facilitate such a thing...

-Andrew
