Development advice needed

Dear all,

As some of you might have noticed, I have been asking a lot of questions recently, most of them naive. The reason is that I recently decided to develop a satellite data viewer with matplotlib, and I am new to both Python and matplotlib.

Here is a little background on this decision. I am a postdoc in the space physics field, where a lot of people look at and analyze satellite data for a living. The data are time series by nature. At present, a lot of people use packages based on IDL to navigate their data. I myself am one of them too. For now. :slight_smile: One big problem with IDL is that it is very expensive, because it doesn’t have a broad enough user base to drive its cost down. Another big problem is that the company developing IDL doesn’t seem to be moving in the right direction. For example, more than 99% of IDL users spend more than 99% of their time in the so-called direct graphics system, but IDL hasn’t upgraded this system since, I don’t know, maybe the early 90s. Compared to what matplotlib can offer, the on-screen graphics quality of IDL direct graphics is simply “ugly”, which is a big reason why I want to switch to matplotlib. There are also some other frequently used, nice features in matplotlib that IDL doesn’t have.

After reading the matplotlib documents and trying out several little examples for a few days, I now have a feeling that matplotlib at least has most of the infrastructure ready for my purposes. One thing that bothers me a little bit is that the plotting speed seems to be a little slow. But IDL had the same problem in the first place too. As computers became faster and faster, that problem just became less and less important. I expect the same thing will happen to matplotlib too.

Now let me turn to technical stuff. What I want is a time-series plotting system like the following. First, it manages all the figure windows it generates, including the positions and looks of the figure windows. A common use case for such a system would be that the user is also analyzing the data while watching the time-series data, and hence the user likely needs to plot some temporary results, such as a snapshot of a particle distribution function, but doesn’t want to screw up the time-series plot. Therefore, it would be nice if the plotting windows of the time-series data were managed separately. Second, it should come with a navigation toolbar that facilitates data navigation. The current navigation toolbar widget is nice and probably suits more than 50% of my needs, but it’s not sufficient. Third, the system should have minimal dependencies for the sake of portability and ease of installation. As for now, I don’t want any dependencies beyond numpy, scipy, and matplotlib. IPython would be a highly recommended tool, but the system should be just fine without it.
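The kind of data navigation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TimeNavigator`, `pan`, and `zoom` are hypothetical names, not part of matplotlib, and the panels are assumed to share one time axis. A real viewer would call these methods from toolbar buttons or key bindings.

```python
import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg


class TimeNavigator:
    """Hypothetical helper: pan and zoom the shared time axis of stacked panels."""

    def __init__(self, axes):
        # The panels are assumed to share their x axis, so moving one moves all.
        self.ax = axes[0]

    def pan(self, fraction):
        """Shift the visible window by a fraction of its width (negative = back)."""
        lo, hi = self.ax.get_xlim()
        shift = fraction * (hi - lo)
        self.ax.set_xlim(lo + shift, hi + shift)

    def zoom(self, factor):
        """Scale the window width by `factor` about its centre (< 1 zooms in)."""
        lo, hi = self.ax.get_xlim()
        centre = 0.5 * (lo + hi)
        half = 0.5 * (hi - lo) * factor
        self.ax.set_xlim(centre - half, centre + half)


# Build a two-panel time-series figure without pyplot.
fig = Figure(figsize=(8, 4))
FigureCanvasAgg(fig)  # non-interactive canvas; a GUI canvas would go here instead
axes = fig.subplots(2, 1, sharex=True)
t = np.linspace(0.0, 100.0, 1001)
axes[0].plot(t, np.sin(t))
axes[1].plot(t, np.cos(t))
axes[0].set_xlim(0.0, 10.0)

nav = TimeNavigator(axes)
nav.pan(0.5)   # move forward by half a window
nav.zoom(2.0)  # double the visible time span
```

Because the panels are created with `sharex=True`, changing the limits on the first panel carries over to the second, so the navigator only has to touch one axes object.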

After weighing all the options, I sense that I will probably be better off using the matplotlib library directly, rather than the convenient utilities provided by pyplot. However, I am having a hard time finding good instructions for using the matplotlib infrastructure. So, I would like to hear some references on that. I also would like to hear general advice about how to construct such a system so that its structure is consistent with matplotlib conventions. Other comments and advice are warmly welcome too. :slight_smile:

Thank you very much for reading this far.

Jianbao

Hi Jianbao,

First some context: at the company I work for, we've been using
matplotlib to do much of what you want to do for the past 4 years. We
have created our own application for plotting, interrogating, and
manipulating time-series data coming from both simulations and
measurements, although from a completely different domain (in our case
it's virtual manufacturing of composite materials). In the past two
years, we've also been using matplotlib to plot, in more-or-less
real time, data from a cloud of industrial sensors (temperature, pressure,
etc).

After reading the matplotlib documents and trying out several little
examples for a few days, I now have a feeling that matplotlib at least has
most of the infrastructure ready for my purposes. One thing that bothers me
a little bit is that the plotting speed seems to be a little slow. But IDL
had the same problem in the first place too. As computers became faster and
faster, that problem just became less and less important. I expect the same
thing will happen to matplotlib too.

This is true: matplotlib can be slow, particularly with large or numerous
data sets. The trick is to downsample (and use tiling if
you're going to be panning around a lot) what you're actually plotting
before handing it off to the plot. I think more recent versions of
matplotlib handle some of this for you, but we've found that it's
faster to do the downsampling ourselves.
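One way to do this kind of downsampling before plotting is sketched below. It is not the method used in the application described above, just a common approach that assumes evenly sized bins: keep the minimum and maximum of each bin so that narrow spikes remain visible. The function name `minmax_downsample` and the bin count are made up for the illustration.

```python
import numpy as np


def minmax_downsample(t, y, n_bins):
    """Reduce (t, y) to at most 2 * n_bins points, keeping each bin's min and max."""
    t = np.asarray(t)
    y = np.asarray(y)
    if len(y) <= 2 * n_bins:
        return t, y  # already small enough to plot directly
    usable = (len(y) // n_bins) * n_bins  # drop the ragged tail for simplicity
    yb = y[:usable].reshape(n_bins, -1)
    tb = t[:usable].reshape(n_bins, -1)
    i_min = yb.argmin(axis=1)
    i_max = yb.argmax(axis=1)
    # Order the two picks within each bin by time so the line never runs backwards.
    first = np.minimum(i_min, i_max)
    second = np.maximum(i_min, i_max)
    cols = np.column_stack([first, second]).ravel()
    rows = np.repeat(np.arange(n_bins), 2)
    return tb[rows, cols], yb[rows, cols]


t = np.linspace(0.0, 1.0, 1_000_000)
y = np.sin(2 * np.pi * 5 * t)
y[123_456] = 10.0  # a single-sample spike that plain striding could miss
t_small, y_small = minmax_downsample(t, y, n_bins=1000)
```

The reduced arrays can then be passed to `ax.plot(t_small, y_small)`. A reasonable choice for `n_bins` is about the width of the axes in pixels, since more points than that cannot be distinguished on screen.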

Now let me turn to technical stuff. What I want is a time-series plotting

[...]

sufficient. Third, the system should have minimal dependencies for the sake
of portability and ease of installation. As for now, I don't want any
dependencies beyond numpy, scipy, and matplotlib. IPython would be a highly
recommended tool, but the system should be just fine without it.

You're going to need more than that. At the very least you're going to
need a widget framework like wxPython, PyQt, PyGTK, or some such.
These will provide you with all the window management, widget
controls, and so on. Our preference is wxPython but YMMV.

After weighing all the options, I sense that I will probably be better off
using the matplotlib library directly, rather than the convenient utilities
provided by pyplot. However, I am having a hard time finding good
instructions for using the matplotlib infrastructure. So, I would like to
hear some references on that. I also would like to hear general advice about
how to construct such a system so that its structure is consistent with
matplotlib conventions. Other comments and advice are warmly welcome too.

Absolutely, you'll want to use the API rather than the utility
functions. The best reference for that is the online documentation at
matplotlib.org. In the past we've found the source code documentation
(or, say, that generated by doxygen) more helpful than the Sphinx
documentation, but frankly our matplotlib bits are pretty stable now
and we haven't had to use the documentation for a while (perhaps it's
better now).
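For readers unsure what "the API rather than the utility functions" looks like in practice, here is a minimal sketch that builds a figure from `Figure` and a canvas class directly, with no pyplot import. The Agg canvas is used only so the example runs without a display. In a GUI application the canvas class of the chosen toolkit backend would take its place. The labels and data are placeholders.

```python
import io

import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# Create the figure and attach a canvas explicitly instead of calling pyplot.figure().
fig = Figure(figsize=(6, 3), dpi=100)
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(1, 1, 1)

t = np.linspace(0.0, 10.0, 500)
(line,) = ax.plot(t, np.sin(t), label="signal")  # keep the Line2D for later updates
ax.set_xlabel("time (s)")
ax.set_ylabel("amplitude")
ax.legend(loc="upper right")

# Render to an in-memory PNG; no global pyplot state is involved.
buf = io.BytesIO()
fig.savefig(buf, format="png")
png_bytes = buf.getvalue()
```

Since nothing here goes through pyplot's global figure registry, the application keeps full control over which figures exist and where they are shown.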

Good luck! We've been very happy with our design choices, and get
nothing but positive feedback on how our plots look and feel.
matplotlib and the amazing active community around it have everything
to do with that.

Anthony.

Jianbao,

The one thing I would add to Anthony's response, which is a good summary of what I would say, is that you should look into the animation aspects of matplotlib, and the xdata and ydata attributes of lines, to speed up replotting when successive plots are mostly similar. I regret not having learned of these before I made an application, and I have yet to go back and implement these faster methods.
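A minimal sketch of the faster replotting idea: create the line once, then replace only its data on each update instead of clearing and replotting the axes. The Agg canvas and the phase loop are assumptions made so the example runs headless. In an interactive program the update would typically come from a timer or from `matplotlib.animation.FuncAnimation`.

```python
import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

fig = Figure()
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(1, 1, 1)

t = np.linspace(0.0, 2 * np.pi, 200)
(line,) = ax.plot(t, np.sin(t))  # create the Line2D artist once
ax.set_ylim(-1.1, 1.1)           # fix the limits so they need no recomputation

for phase in np.linspace(0.0, np.pi, 5):
    line.set_ydata(np.sin(t + phase))  # swap in new y values, keep the artist
    canvas.draw()                      # an interactive canvas would use draw_idle()

final_y = line.get_ydata()
```

The axes, ticks, and labels are set up only once, and the same `Line2D` object is reused for every frame.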

-Sterling

On Oct 3, 2012, at 9:49AM, Anthony Floyd wrote:

[...]


Dear Anthony,

Thank you so much for your advice. I embedded my response below.

Jianbao

Hi Jianbao,

First some context: at the company I work for, we’ve been using
matplotlib to do much of what you want to do for the past 4 years. We
have created our own application for plotting, interrogating, and
manipulating time-series data coming from both simulations and
measurements, although from a completely different domain (in our case
it’s virtual manufacturing of composite materials). In the past two
years, we’ve also been using matplotlib to plot, in more-or-less
real time, data from a cloud of industrial sensors (temperature, pressure,
etc).

Do you have any references, such as screen shots, gallery, examples, or whatever? I am very curious to see what people can do with matplotlib.

After reading the matplotlib documents and trying out several little
examples for a few days, I now have a feeling that matplotlib at least has
most of the infrastructure ready for my purposes. One thing that bothers me
a little bit is that the plotting speed seems to be a little slow. But IDL
had the same problem in the first place too. As computers became faster and
faster, that problem just became less and less important. I expect the same
thing will happen to matplotlib too.

This is true: matplotlib can be slow, particularly with large or numerous
data sets. The trick is to downsample (and use tiling if
you’re going to be panning around a lot) what you’re actually plotting
before handing it off to the plot. I think more recent versions of
matplotlib handle some of this for you, but we’ve found that it’s
faster to do the downsampling ourselves.

As a matter of fact, I considered writing intermediate routines to handle downsampling before feeding data into matplotlib. However, downsampling requires anti-alias filtering first, so I wasn’t sure it would actually improve the speed. But based on your experience, it is probably a good idea. :slight_smile:

Now let me turn to technical stuff. What I want is a time-series plotting

[…]

sufficient. Third, the system should have minimal dependencies for the sake
of portability and ease of installation. As for now, I don’t want any
dependencies beyond numpy, scipy, and matplotlib. IPython would be a highly
recommended tool, but the system should be just fine without it.

You’re going to need more than that. At the very least you’re going to
need a widget framework like wxPython, PyQt, PyGTK, or some such.
These will provide you with all the window management, widget
controls, and so on. Our preference is wxPython but YMMV.

One of my concerns about third-party widget frameworks is that they are sometimes difficult to install. In fact, I tried to install wxPython on my Mac (OS X 10.8) last night, but didn’t succeed. Another concern is that I don’t know how efficient or how easy it is to interact with a third-party widget framework from a Python interpreter. However, again, based on your reply, it doesn’t seem to be a big issue after all.

After weighing all the options, I sense that I will probably be better off
using the matplotlib library directly, rather than the convenient utilities
provided by pyplot. However, I am having a hard time finding good
instructions for using the matplotlib infrastructure. So, I would like to
hear some references on that. I also would like to hear general advice about
how to construct such a system so that its structure is consistent with
matplotlib conventions. Other comments and advice are warmly welcome too.

Absolutely, you’ll want to use the API rather than the utility
functions. The best reference for that is the online documentation at
matplotlib.org. In the past we’ve found the source code documentation
(or, say, that generated by doxygen) more helpful than the Sphinx
documentation, but frankly our matplotlib bits are pretty stable now
and we haven’t had to use the documentation for a while (perhaps it’s
better now).

Good luck! We’ve been very happy with our design choices, and get
nothing but positive feedback on how our plots look and feel.
matplotlib and the amazing active community around it have everything
to do with that.

I am very glad to hear that. :slight_smile:

On Wed, Oct 3, 2012 at 10:49 AM, Anthony Floyd <anthonyfloyd@…287…> wrote:

[...]

Hi Jianbao,

Do you have any references, such as screen shots, gallery, examples, or
whatever? I am very curious to see what people can do with matplotlib.

If you can find a Windows machine (or a Windows VM) and stomach a 60
MB download, visit
http://www.convergent.ca/products/raven/downloads.html and grab the
"RAVEN Viewer" and "Demo RAVEN Workspace". When starting the program
for the first time, don't worry about selecting a license, select
"Viewer Only". Open the demo file. All plotting, annotating, legends,
etc are handled by matplotlib. wxPython provides the rest of the GUI
elements. The entirety of the program except for the engineering
backend (which isn't exposed in the viewer anyway) is written in
Python.

If you can't get to a Windows box, then just visit
http://www.convergent.ca/raven to get a sense of the application.

Cheers,
Anthony.

Thank you so much, Anthony. After weighing the options, I decided to go for Tkinter. The major reason for this is portability. BTW, I checked out your website. Those screenshots are quite impressive. :slight_smile:

Jianbao

On Wed, Oct 3, 2012 at 3:27 PM, Anthony Floyd <anthonyfloyd@…287…> wrote:

[...]