I am trying to use matplotlib (for the first time) to graph the address space
usage of an application against time. The data is written to a log file by
trace statements throughout the source code of the application. The trace
statements contain the current address space usage as well as a timer value
with millisecond granularity.
My data on the y-axis (address space usage) is fairly uniform (values from 0
to 2000 MB), but my data on the x-axis (the time at which the trace statements
were executed) is highly clustered. For example, I have approximately 150
data points over a 5-minute run, but some of the data points are only 10 ms
apart.
I would like to annotate each point on the graph with the line number in the
log file so that the user can look up what was happening at that point. I have
succeeded, but the graph isn't readable because there is so much overlap in
the points.
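
To make the overlap problem concrete, here is a minimal sketch of one way the labels could be staggered vertically so that neighbouring line numbers do not land on top of each other. The helper names (stagger_offsets, annotate_points) and the parameters min_gap, levels and step are made up for illustration; only ax.annotate is an actual matplotlib call, and ax is assumed to be an existing Axes object.

```python
def stagger_offsets(xs, min_gap, levels=4, step=12):
    """Return a vertical label offset (in points) for each x value.

    xs is assumed to be sorted. A point closer than min_gap to the previous
    one is pushed to the next level (wrapping around after `levels`); an
    isolated point goes back to the base level.
    """
    offsets = []
    level = 0
    prev = None
    for x in xs:
        if prev is not None and x - prev < min_gap:
            level = (level + 1) % levels
        else:
            level = 0
        offsets.append(step * (level + 1))
        prev = x
    return offsets


def annotate_points(ax, xs, ys, line_numbers, min_gap):
    """Label each (x, y) with its log line number, staggering crowded labels."""
    offsets = stagger_offsets(xs, min_gap)
    for x, y, n, dy in zip(xs, ys, line_numbers, offsets):
        ax.annotate(str(n), xy=(x, y), xytext=(0, dy),
                    textcoords="offset points", ha="center", fontsize=7,
                    arrowprops={"arrowstyle": "-", "lw": 0.5})
```

This only helps up to a point: with many points inside a few milliseconds the labels still collide once the levels wrap around, so it does not replace a better layout of the x-axis.
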
Is there a standard way that people display data like this? I don't really
like the idea of equally spacing all of the points along the x-axis because
you lose the understanding of the timing. One idea I had was to have some
sort of vertical break in the graph at areas where there was a long gap
without a data point, but I have no idea whether it's possible to implement
something like that in matplotlib.
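
A rough sketch of the vertical-break idea, as far as I can tell it could be approximated with ordinary subplots: split the timestamps wherever the gap exceeds some threshold, then draw one panel per cluster side by side with a shared y-axis. The function names and the max_gap threshold are assumptions for illustration, times is assumed to be sorted and in milliseconds, and this is not a built-in matplotlib feature.

```python
def split_at_gaps(times, max_gap):
    """Group sorted timestamps into clusters separated by gaps > max_gap."""
    clusters = []
    for t in times:
        if clusters and t - clusters[-1][-1] <= max_gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def plot_with_breaks(times, usage, max_gap):
    """Draw one panel per cluster, side by side, sharing the y-axis."""
    # Imported here so split_at_gaps can be used without matplotlib.
    import matplotlib.pyplot as plt

    clusters = split_at_gaps(times, max_gap)
    # Panel width proportional to the cluster's time span, with a floor so
    # single-point clusters stay visible (assumes max_gap > 0).
    spans = [max(c[-1] - c[0], max_gap / 10) for c in clusters]
    fig, axes = plt.subplots(
        1, len(clusters), sharey=True, squeeze=False,
        gridspec_kw={"width_ratios": spans, "wspace": 0.05})

    start = 0
    for i, (ax, cluster) in enumerate(zip(axes[0], clusters)):
        end = start + len(cluster)
        ax.plot(times[start:end], usage[start:end], marker="o")
        start = end
        # Hide the spines between neighbouring panels to suggest a break.
        if i > 0:
            ax.spines["left"].set_visible(False)
            ax.tick_params(left=False)
        if i < len(clusters) - 1:
            ax.spines["right"].set_visible(False)

    axes[0][0].set_ylabel("Address space usage (MB)")
    fig.text(0.5, 0.02, "Time (ms)", ha="center")
    return fig
```

Within each panel the time scale stays linear, so the relative timing inside a cluster is preserved; only the long idle stretches between clusters are dropped. Calling plot_with_breaks(times, usage, max_gap=1000) would break the axis wherever more than one second passes without a trace statement.
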
The output format hasn't been strictly specified, so if you have any ideas of
how I can produce a useful graph, I would be happy to hear them.
Thanks,
Dave