A while ago there was a discussion [1] about how using the
get_sample_data function in building the documentation is a problem for
Debian packagers. Let me see if I understand the goals of
get_sample_data correctly:
* we want to enable users to run examples they find in the gallery
without downloading extra files;
* we don't want to package all the sample data with matplotlib, either
because it is too large, or because it changes more often than we
release new versions.
The current sample data takes about 2.5 megabytes uncompressed, so the
size doesn't look like a real problem, but of course it is desirable
that new examples are usable with old versions unless they need new
features.
The problem that the Debian packagers have with the current system is
(I suppose) that building the documentation requires network access and
is not guaranteed to be repeatable.
Here's what I suggest:
1. Package the sample data in a separate zip file that users can
download and expand in e.g. ~/.matplotlib/sample_data if they like.
This file could be released more often than matplotlib, if needed.
Debian can use this as one source file and package it as a separate
deb file.
2. Make get_sample_data look first in the place where the zip file could
have been expanded, and only if the required file is not found, try
to obtain it from the web. Add an option to disable the network
access. This differs from the current behavior, in which
get_sample_data always tries to check whether a newer version is
available, which apparently doesn't work reliably on computers
without a network connection.
3. To make this work, agree that sample data files are immutable: if a
new version is needed, it needs to have a new name (and thus the
examples using it need to be updated). The files have not been
changed a lot [2], so I don't think this is very much of a burden.
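
As a rough illustration of points 2 and 3, the lookup could be sketched
as below. This is not the existing get_sample_data code: the names
find_sample_data, SampleDataUnavailable, fetch and allow_network are
hypothetical, and fetch stands in for whatever download code would
actually be used.

```python
import os


class SampleDataUnavailable(IOError):
    """Raised when a file is missing locally and cannot be downloaded."""


def find_sample_data(fname, local_dir, fetch=None, allow_network=True):
    """Return a path to fname, preferring the locally expanded zip file.

    Hypothetical sketch: local_dir is where the sample data zip was
    expanded (e.g. ~/.matplotlib/sample_data), and fetch is a callable
    taking a file name and returning the file contents as bytes.
    """
    local_path = os.path.join(local_dir, fname)
    # Sample data files are assumed immutable (point 3), so a local copy
    # is never stale and no freshness check against the server is made.
    if os.path.exists(local_path):
        return local_path
    if not allow_network or fetch is None:
        raise SampleDataUnavailable(
            "%s not found in %s and downloading is disabled or unavailable"
            % (fname, local_dir))
    # Only reached when the file is missing locally and downloads are allowed.
    data = fetch(fname)
    parent = os.path.dirname(local_path)
    if not os.path.isdir(parent):
        os.makedirs(parent)
    with open(local_path, "wb") as f:
        f.write(data)
    return local_path
```

With allow_network=False a documentation build either finds every file
in local_dir or fails immediately, so the build never touches the
network.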
What do you think?
Jouni
[1] http://thread.gmane.org/gmane.comp.python.matplotlib.devel/8865
[2] Here is a summary of the changes to each file in sample_data:
=== ./aapl.csv ===
···
------------------------------------------------------------------------
r7379 | jdh2358 | 2009-08-05 18:57:31 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r6202 | jdh2358 | 2008-10-15 15:43:41 +0300 (Wed, 15 Oct 2008)
------------------------------------------------------------------------
r4975 | jdh2358 | 2008-02-16 22:58:37 +0200 (Sat, 16 Feb 2008)
------------------------------------------------------------------------
=== ./AAPL.dat ===
------------------------------------------------------------------------
r7388 | jdh2358 | 2009-08-05 20:16:50 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
=== ./aapl.npy ===
------------------------------------------------------------------------
r7377 | jdh2358 | 2009-08-05 18:52:29 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r6203 | jdh2358 | 2008-10-15 18:39:44 +0300 (Wed, 15 Oct 2008)
------------------------------------------------------------------------
=== ./axes_grid/bivariate_normal.npy ===
------------------------------------------------------------------------
r7436 | leejjoon | 2009-08-09 07:34:08 +0300 (Sun, 09 Aug 2009)
------------------------------------------------------------------------
=== ./ct.raw ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r177 | jdh2358 | 2004-03-13 01:00:12 +0200 (Sat, 13 Mar 2004)
------------------------------------------------------------------------
=== ./data_x_x2_x3.csv ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r7078 | efiring | 2009-05-03 03:09:06 +0300 (Sun, 03 May 2009)
------------------------------------------------------------------------
=== ./demodata.csv ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r5100 | jdh2358 | 2008-04-30 22:53:10 +0300 (Wed, 30 Apr 2008)
------------------------------------------------------------------------
=== ./eeg.dat ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r52 | jdh2358 | 2003-11-02 23:23:21 +0200 (Sun, 02 Nov 2003)
------------------------------------------------------------------------
=== ./embedding_in_wx3.xrc ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r397 | astraw | 2004-07-10 21:39:48 +0300 (Sat, 10 Jul 2004)
------------------------------------------------------------------------
=== ./goog.npy ===
------------------------------------------------------------------------
r7377 | jdh2358 | 2009-08-05 18:52:29 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r6203 | jdh2358 | 2008-10-15 18:39:44 +0300 (Wed, 15 Oct 2008)
------------------------------------------------------------------------
=== ./INTC.dat ===
------------------------------------------------------------------------
r7387 | jdh2358 | 2009-08-05 20:16:00 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
=== ./lena.jpg ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r2557 | astraw | 2006-07-12 02:32:31 +0300 (Wed, 12 Jul 2006)
------------------------------------------------------------------------
r2556 | astraw | 2006-07-12 02:28:46 +0300 (Wed, 12 Jul 2006)
------------------------------------------------------------------------
r603 | astraw | 2004-10-19 20:50:03 +0300 (Tue, 19 Oct 2004)
------------------------------------------------------------------------
=== ./lena.png ===
------------------------------------------------------------------------
r7364 | jdh2358 | 2009-08-05 17:36:27 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r7327 | jdh2358 | 2009-07-31 21:55:17 +0300 (Fri, 31 Jul 2009)
------------------------------------------------------------------------
=== ./logo2.png ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r5669 | jdh2358 | 2008-06-24 21:58:41 +0300 (Tue, 24 Jun 2008)
------------------------------------------------------------------------
=== ./membrane.dat ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r64 | jdh2358 | 2003-11-15 19:05:37 +0200 (Sat, 15 Nov 2003)
------------------------------------------------------------------------
=== ./Minduka_Present_Blue_Pack.png ===
------------------------------------------------------------------------
r7421 | leejjoon | 2009-08-08 04:40:31 +0300 (Sat, 08 Aug 2009)
------------------------------------------------------------------------
=== ./msft.csv ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r2144 | jdh2358 | 2006-03-14 03:28:43 +0200 (Tue, 14 Mar 2006)
------------------------------------------------------------------------
r86 | jdh2358 | 2003-11-21 19:50:00 +0200 (Fri, 21 Nov 2003)
------------------------------------------------------------------------
=== ./msft_nasdaq.npy ===
------------------------------------------------------------------------
r7377 | jdh2358 | 2009-08-05 18:52:29 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r6203 | jdh2358 | 2008-10-15 18:39:44 +0300 (Wed, 15 Oct 2008)
------------------------------------------------------------------------
=== ./s1045.ima ===
------------------------------------------------------------------------
r7382 | jdh2358 | 2009-08-05 19:21:23 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r48 | jdh2358 | 2003-11-02 21:43:30 +0200 (Sun, 02 Nov 2003)
------------------------------------------------------------------------
=== ./testdata.csv ===
------------------------------------------------------------------------
r7364 | jdh2358 | 2009-08-05 17:36:27 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r7361 | jdh2358 | 2009-08-05 14:39:37 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
r7360 | jdh2358 | 2009-08-05 14:34:43 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
=== ./testdir/subdir/testsub.csv ===
------------------------------------------------------------------------
r7368 | jdh2358 | 2009-08-05 17:54:01 +0300 (Wed, 05 Aug 2009)
------------------------------------------------------------------------
--
Jouni K. Seppänen
http://www.iki.fi/jks