Yahoo historical quotes

Thanks for the code. I added matplotlib's yahoo historical quote code
to my labeled array package, la. While doing so I noticed a couple of
possible bugs:

- The doc string for parse_yahoo_historical says that the volume is
adjusted. But I don't see a line of code that does the adjustment.
- quotes_historical_yahoo builds an error message from a variable
named "url", which doesn't exist in the scope of the function.
- "import time" and "from matplotlib.cbook import is_string_like" are
imported but never used.

I stripped down the code keeping only what I needed:

http://github.com/kwgoodman/la/blob/master/la/external/matplotlib.py

Here's a demo of how I use it:

    >>> from la.data.yahoo import quotes
    >>> lar = quotes(['aapl', 'msft'], (2010,10,1), (2010,10,5))
    >>> lar
    label_0
        aapl
        msft
    label_1
        open
        close
        high
        low
        volume
    label_2
        2010-10-01
        2010-10-04
        2010-10-05
    x
    array([[[ 2.86150000e+02, 2.81600000e+02, 2.82000000e+02],
            [ 2.82520000e+02, 2.78640000e+02, 2.88940000e+02],
            [ 2.86580000e+02, 2.82900000e+02, 2.89450000e+02],
            [ 2.81350000e+02, 2.77770000e+02, 2.81820000e+02],
            [ 1.60051000e+07, 1.55256000e+07, 1.78743000e+07]],

           [[ 2.47700000e+01, 2.39600000e+01, 2.40600000e+01],
            [ 2.43800000e+01, 2.39100000e+01, 2.43500000e+01],
            [ 2.48200000e+01, 2.39900000e+01, 2.44500000e+01],
            [ 2.43000000e+01, 2.37800000e+01, 2.39100000e+01],
            [ 6.26236000e+07, 9.80868000e+07, 7.80329000e+07]]])

    >>> close = lar.lix[:,['close']]
    >>> close
    label_0
        aapl
        msft
    label_1
        2010-10-01
        2010-10-04
        2010-10-05
    x
    array([[ 282.52, 278.64, 288.94],
           [ 24.38, 23.91, 24.35]])

    Calculate the log return from the close prices:

    >>> ret = close / close.lag(1, axis=-1)
    >>> ret = ret.log()
    >>> ret
    label_0
        aapl
        msft
    label_1
        2010-10-04
        2010-10-05
    x
    array([[-0.01382872, 0.03629843],
           [-0.01946634, 0.01823507]])
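
For readers without la installed, the same log-return calculation can be sketched with plain NumPy. This is only an illustration: the close prices are copied by hand from the output above, and the slicing stands in for what `lag` does in the la example.

```python
import numpy as np

# Close prices copied from the output above
# (rows: aapl, msft; columns: 2010-10-01, 2010-10-04, 2010-10-05).
close = np.array([[282.52, 278.64, 288.94],
                  [24.38, 23.91, 24.35]])

# Log return for day t is log(close[t] / close[t-1]). The first date has
# no previous close, so the result has one fewer column than the input.
log_ret = np.log(close[:, 1:] / close[:, :-1])
print(log_ret)
```

The result has one column per date from 2010-10-04 onward, matching the shape of the la output.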

> - The doc string for parse_yahoo_historical says that the volume is
> adjusted. But I don't see a line of code that does the adjustment.

We don't do the adjustment; Yahoo does. E.g., take a look at CROX
around their 6/15/07 split.

  Crocs, Inc. (CROX) Stock Historical Prices & Data - Yahoo Finance

IDC reports consolidated volume on 6/14/07 (pre-split) was 4,366,319
shares. Yahoo reports 8,726,000 -- which is close to 2x the raw
volume (I assume the difference is in how some of the shares
transacted on non-primary exchanges are counted). Likewise, on
6/13/07, IDC reports 7,852,268 and Yahoo reports 15,544,200, which is
close to 2x. So it appears they are backward split adjusting the
volume.
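
The "close to 2x" comparison can be checked with a few lines of arithmetic. This is just a sketch using the four volume figures quoted in this message; the dictionary names are made up for the example.

```python
# Volumes quoted in this message for CROX on the two days before the
# 6/15/07 split.
idc_volume = {"6/13/07": 7852268, "6/14/07": 4366319}
yahoo_volume = {"6/13/07": 15544200, "6/14/07": 8726000}

# Ratio of Yahoo's reported volume to the raw IDC volume for each day.
ratios = {day: yahoo_volume[day] / idc_volume[day] for day in idc_volume}
for day, ratio in ratios.items():
    print(day, round(ratio, 3))
```

Both ratios come out just under 2, which is what you would expect if Yahoo scales the pre-split volume up by the split factor.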

> quotes_historical_yahoo builds an error message from a variable
> named "url" which doesn't exist in the scope of the function

Fixed -- this was a legacy error message from when we used to pass in
URLs; we now pass in file handles.
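
To show why a stale name like this matters, here is a minimal sketch of the failure mode. The function `parse_quotes` and its error message are hypothetical and are not matplotlib's actual code; the point is only that the error path fails with a NameError instead of reporting the intended message.

```python
import io


def parse_quotes(fh):
    """Hypothetical parser that takes a file handle, not a URL."""
    lines = fh.readlines()
    if not lines:
        # Bug: 'url' is left over from an older signature and is not
        # defined anywhere in this function (or at module level).
        raise ValueError('No data found at %s' % url)
    return lines


try:
    parse_quotes(io.StringIO(''))
except NameError as err:
    # The intended ValueError is never raised; building its message fails.
    error_name = type(err).__name__
    print(error_name, err)
```

Because the undefined name is only touched on the error path, the bug stays hidden until a request actually fails.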

> "import time" and "from matplotlib.cbook import is_string_like" are not used

Fixed.

Thanks for the report -- fixed on the branch (8742) and will be merged
to the trunk when MD resuscitates svnmerge....

JDH

···

On Tue, Oct 12, 2010 at 10:55 AM, Keith Goodman <kwgoodman@...149...> wrote:

I knew it was worth it (to me) to report what I thought was a bug. Thanks.

Well, then perhaps the doc string needs tweaking?

    *adjusted*
      If True (default) replace open, close, high, low, and volume with
      their adjusted values.
      The adjustment is by a scale factor, S = adjusted_close/close.
      Adjusted volume is actual volume divided by S;
      Adjusted prices are actual prices multiplied by S. Hence,
      the product of price and volume is unchanged by the adjustment.

It does state that volume is adjusted by S (where S contains dividend
info). I take it that volume is actually the same whether adjusted is
True or False, which means that dollar volume (price times volume) is
changed by the adjustment, contrary to what the doc string says.
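
A small worked example of that point, as a sketch only. The quote values are made up for illustration; `S`, the price adjustment, and the volume adjustment follow the doc string text quoted above.

```python
# Hypothetical one-day quote; the numbers are made up for illustration.
close = 50.0
adjusted_close = 25.0
volume = 1000000.0

S = adjusted_close / close      # scale factor from the doc string

adj_price = close * S           # adjusted price, as the doc string describes
adj_volume = volume / S         # what the doc string says happens to volume

# If both adjustments were applied, dollar volume would be preserved...
dollar_both = adj_price * adj_volume
# ...but if volume is left as delivered, dollar volume changes by S.
dollar_price_only = adj_price * volume

print(close * volume, dollar_both, dollar_price_only)
```

So the "product of price and volume is unchanged" sentence only holds if the code really divides volume by S.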

···

On Tue, Oct 12, 2010 at 9:23 AM, John Hunter <jdh2358@...149...> wrote:

How does this look?

    Parse the historical data in file handle fh from yahoo finance.

    *adjusted*
      If True (default) replace open, close, high, low, and volume with
      their adjusted values.
      The adjustment is by a scale factor, S = adjusted_close/close.
      Adjusted prices are actual prices multiplied by S. Hence,

      Note that volume is already backward split adjusted by Yahoo, so
      if you want to compute dollars traded, multiply volume by the
      adjusted close, regardless of whether you choose
      adjusted = True|False

···

On Tue, Oct 12, 2010 at 11:47 AM, Keith Goodman <kwgoodman@...149...> wrote:

Here are some suggested tweaks:

   *adjusted*
     If True (default) replace open, close, high, and low prices with
     their adjusted values. The adjustment is by a scale factor,
     S = adjusted_close/close. Adjusted prices are actual prices
     multiplied by S.

     Volume is not adjusted as it is already backward split adjusted
     by Yahoo. If you want to compute dollars traded, multiply
     volume by the adjusted close, regardless of whether you choose
     adjusted = True|False.
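
As a sanity check on the wording, here is a minimal sketch of the behavior this doc string describes. The helper names `adjust_quote` and `dollars_traded` and the sample numbers are hypothetical; this is not the code in parse_yahoo_historical.

```python
def adjust_quote(open_, close, high, low, volume, adjusted_close):
    """Hypothetical helper: scale the four prices by S, leave volume alone."""
    S = adjusted_close / close
    return open_ * S, close * S, high * S, low * S, volume


def dollars_traded(volume, adjusted_close):
    """Dollars traded as suggested above: volume times adjusted close."""
    return volume * adjusted_close


# Made-up quote for illustration.
adjusted = adjust_quote(48.0, 50.0, 52.0, 47.0, 1000000.0, adjusted_close=25.0)
dollars = dollars_traded(1000000.0, 25.0)
print(adjusted, dollars)
```

Note that the adjusted close returned by the helper equals the `adjusted_close` passed in, so dollars traded comes out the same whichever set of prices you start from.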

···

On Tue, Oct 12, 2010 at 10:11 AM, John Hunter <jdh2358@...149...> wrote: