Trying to plot -- problem with date2num?

Hi there,

I’m trying to plot a file which keeps track of my inbox count over time. Every minute it creates an observation, recording a timestamp, and the inbox count, like this:

2009-03-25 08:33:48, 5
2009-03-25 08:34:48, 5
2009-03-25 08:35:48, 5

and so on. I have about a day’s worth of data so far. Here is code I’m trying to use to plot this:

import dateutil, pylab, csv, matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Used for axis formatting
days = mdates.DayLocator()  # every day
hours = mdates.HourLocator()  # every hour
daysFmt = mdates.DateFormatter('%D')

# Open data file
data = csv.reader(open('gmail-count.txt'), delimiter=',')

# Convert to vectors
time = []
inbox = []
for c_time, c_inbox in data:
    time.append(c_time)
    inbox.append(int(c_inbox))

# Plot the data
x_dates = pylab.date2num([dateutil.parser.parse(s) for s in time])
print(x_dates)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_dates, inbox, 'g^')

# Format the ticks
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(daysFmt)
ax.xaxis.set_minor_locator(hours)
fig.autofmt_xdate()

# Save the file
plt.savefig('testA')

And yet here is the result: http://screencast.com/t/gLPDFtwnJM4

I can’t figure out why the values are ‘grouping’ around particular values on the x-axis… I would expect it to look more like a function, with only one y-value for each x.

Am I using date2num wrongly, or can anyone please suggest where I might be going wrong? (In case anyone wants to see the data, I’ve attached it as well… just ignore the 3rd column)
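
One thing that may be worth ruling out is loss of precision in the x values. Nothing in this thread confirms it as the cause, so treat it as a hypothesis only. Date numbers for 2009 are around 733,000 days, and at that magnitude a single-precision float can only resolve steps of about 0.0625 day (90 minutes), so minute-spaced samples would collapse onto a few x positions. The sketch below uses only the standard library. `to_day_number` and `as_float32` are hypothetical helpers, and the "days since 0001-01-01" convention is an assumption about what date2num returns in this matplotlib version.

```python
import struct
from datetime import datetime

# The three sample rows quoted in the question.
SAMPLE = [
    "2009-03-25 08:33:48, 5",
    "2009-03-25 08:34:48, 5",
    "2009-03-25 08:35:48, 5",
]


def to_day_number(stamp):
    """Return days since 0001-01-01 plus the fraction of the day.

    Hypothetical helper: it imitates the assumed date2num convention
    without needing matplotlib.
    """
    dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return dt.toordinal() + seconds / 86400.0


def as_float32(value):
    """Round a Python float to single precision and back (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", value))[0]


stamps = [line.split(",")[0].strip() for line in SAMPLE]
days64 = [to_day_number(s) for s in stamps]
days32 = [as_float32(d) for d in days64]

# If the second count is smaller than the first, single-precision
# rounding alone is enough to merge neighbouring samples.
print("distinct x values in double precision:", len(set(days64)))
print("distinct x values in single precision:", len(set(days32)))
```

If the x values printed by the original script show only a handful of distinct numbers, the problem lies in the conversion step rather than in the plotting calls.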

Thanks!
Tyler

gmail-count.txt (31.3 KB)

Sorry to spam.. I was advised to re-send this as plain text. Thanks
for any help!

Hi there,

I'm trying to plot a file which keeps track of my inbox count over
time. Every minute it creates an observation, recording a timestamp,
and the inbox count, like this:

2009-03-25 08:33:48, 5
2009-03-25 08:34:48, 5
2009-03-25 08:35:48, 5
...
and so on. I have about a day's worth of data so far. Here is code
I'm trying to use to plot this:

import dateutil, pylab, csv, matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Used for axis formatting
days = mdates.DayLocator() # every day
hours = mdates.HourLocator() # every hour
daysFmt = mdates.DateFormatter('%D')

# Open data file
data = csv.reader(open('gmail-count.txt'), delimiter=',')

# Convert to vectors
time = []
inbox = []
for c_time, c_inbox in data:
    time.append(c_time)
    inbox.append(int(c_inbox))

# Plot the data
x_dates = pylab.date2num([dateutil.parser.parse(s) for s in time])
print(x_dates)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x_dates, inbox, 'g^')

# format the ticks
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(daysFmt)
ax.xaxis.set_minor_locator(hours)
fig.autofmt_xdate()

# Save the file
plt.savefig('testA')

And yet here is the result: http://screencast.com/t/gLPDFtwnJM4

I can't figure out why the values are 'grouping' around particular
values on the x-axis... I would expect it to look more like a
function, with only one y-value for each x.

Am I using date2num wrongly, or can anyone please suggest where I
might be going wrong? (In case anyone wants to see the data, I've
attached it as well.. just ignore the 3rd column)

Thanks!
Tyler

gmail-count.txt (31.3 KB)

I am not seeing the problem when I run your script with your datafile (there is no grouping in columns like you have; instead I get a distinct x for each date). I wonder if you have an old dateutil or an old matplotlib.
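
A quick way to compare installed versions is sketched below. `installed_version` is a hypothetical helper, and it assumes each package exposes a `__version__` attribute, which is common but not guaranteed for every release.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # package is not installed
    return getattr(module, "__version__", None)


# Packages involved in the script from the question.
for name in ("matplotlib", "dateutil", "numpy"):
    print(name, installed_version(name))
```

Posting this output alongside a bug report makes it easier for others to reproduce the behaviour.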

With recent matplotlib, you can do this with less code::

import matplotlib.cbook as cbook
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
r = mlab.csv2rec('gmail-count.txt', names='date,val1,val2',
                 converterd={'date' : cbook.todatetime('%Y-%m-%d %H:%M:%S')})
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(r.date, r.val1, 'g^')
fig.autofmt_xdate()
plt.show()
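
The `csv2rec` and `todatetime` helpers date from 2009-era matplotlib and may not be available in later releases. As a rough equivalent, here is a minimal sketch that parses the file with the standard library and passes datetime objects straight to `plot`. `read_counts` and `plot_counts` are hypothetical names, and the timestamp format is assumed from the sample rows in the question.

```python
import csv
from datetime import datetime


def read_counts(lines):
    """Parse 'timestamp, count[, extra]' rows into two parallel lists."""
    dates, counts = [], []
    for row in csv.reader(lines, delimiter=","):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        dates.append(datetime.strptime(row[0].strip(), "%Y-%m-%d %H:%M:%S"))
        counts.append(int(row[1]))  # any further columns are ignored
    return dates, counts


def plot_counts(dates, counts, out_path):
    """Plot counts against datetime objects and save the figure."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(dates, counts, "g^")
    fig.autofmt_xdate()
    fig.savefig(out_path)
    plt.close(fig)


# The three sample rows quoted in the question.
sample = [
    "2009-03-25 08:33:48, 5",
    "2009-03-25 08:34:48, 5",
    "2009-03-25 08:35:48, 5",
]
dates, counts = read_counts(sample)
print(dates[0], counts)

# With the real file (not run here):
# with open("gmail-count.txt") as fh:
#     plot_counts(*read_counts(fh), out_path="testA.png")
```

Keeping the parsing separate from the plotting makes it possible to inspect the parsed dates on their own before any figure is drawn.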

JDH

On Wed, Mar 25, 2009 at 3:25 PM, Tyler B <bosmeny@…985…> wrote:

And yet here is the result: http://screencast.com/t/gLPDFtwnJM4

I can’t figure out why the values are ‘grouping’ around particular values on the x-axis… I would expect it to look more like a function, with only one y-value for each x.

Hi JDH,

Thanks for looking into this -- it has been driving me crazy!
I tried running your much better code but ended up with the same
result: http://screencast.com/t/UMl6l0Y4

I checked and matplotlib is version 0.98.5.2, and your code doesn't
use dateutil, so I guess that's not it.

Any other ideas? I can't think of what else to try...

Thanks again,
Tyler

On Wed, Mar 25, 2009 at 5:38 PM, John Hunter <jdh2358@...287...> wrote:

On Wed, Mar 25, 2009 at 3:25 PM, Tyler B <bosmeny@...287...> wrote:

And yet here is the result: http://screencast.com/t/gLPDFtwnJM4

I can't figure out why the values are 'grouping' around particular values
on the x-axis... I would expect it to look more like a function, with only
one y-value for each x.

I am not seeing the problem when I run your script with your datafile (there
is no grouping in columns like you have; instead I get a distinct x for each
date). I wonder if you have an old dateutil or an old matplotlib.

With recent matplotlib, you can do this with less code::

import matplotlib.cbook as cbook
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
r = mlab.csv2rec('gmail-count.txt', names='date,val1,val2',
                 converterd={'date' : cbook.todatetime('%Y-%m-%d %H:%M:%S')})
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(r.date, r.val1, 'g^')
fig.autofmt_xdate()
plt.show()

JDH