nosetests: too slow, too much memory

Our standard test has gotten out of control. The most serious problem is that running a full test suite now fails on a linux VM with 4 GB--it's out of memory. Halfway through the set, it is already using more than 2 GB. That's ridiculous. Running nosetests separately on each test module keeps the maximum reported by top at 1.6 GB, and the maximum by report_memory at 0.5 GB; still quite a bit, but tolerable. (I don't know why there is this factor of 3 between top and report_memory.) This scheme of running test modules one at a time also speeds the suite up by a factor of 2; I don't understand why.

The script I used for the module-at-a-time test is attached. It is a modification of matplotlib.tests().

Are there any nosetest experts out there with ideas about how to streamline the standard test routine?

Eric

test_modules.py (1.96 KB)
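[The attachment is not reproduced here. As a rough sketch of what a module-at-a-time driver like this might look like -- the module list and runner below are illustrative assumptions, not the contents of test_modules.py -- each module can be run in a fresh subprocess so its memory is returned to the OS before the next module starts:]

```python
import subprocess
import sys

# Illustrative module list; the real script presumably derives this
# from matplotlib's own default test-module list.
TEST_MODULES = [
    "matplotlib.tests.test_axes",
    "matplotlib.tests.test_figure",
]

def nose_runner(module):
    """Run nosetests on a single module in a fresh child process, so
    its memory is released before the next module starts."""
    return subprocess.call([sys.executable, "-m", "nose", module])

def run_one_at_a_time(modules, run=nose_runner):
    """Run each test module separately; return the modules that failed."""
    return [mod for mod in modules if run(mod) != 0]
```

Running in subprocesses also sidesteps any accumulation of state (open figures, caches, heap fragmentation) across modules, which may account for part of the memory difference.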

This issue is probably worth mentioning on other mailing lists of
projects that use nosetests. I'm thinking of scikit-learn in
particular, which also uses nosetests heavily. The scipy-users list
might be a good place to exchange experience.

N


_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel

A theory…

If I remember correctly, nosetests was set up to execute in parallel using the default multiprocessing settings, which give one worker process per available CPU core. Perhaps that is the crux of the issue: with so many tests running simultaneously, the amount of memory used at the same time becomes too large. Or am I thinking of the doc build system?

Ben Root


Ben,

Top shows a single process. The VM is configured with 2 cores.

Eric


So, I just tried comparing memory usage for a plot displayed via show() versus saved via savefig() as a PNG. It would seem that saving to PNG uses more memory. Not sure why, though.

Ben
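[For anyone wanting to reproduce that comparison, one stdlib way to read peak memory is resource.getrusage; the matplotlib usage at the bottom is the intended experiment, left as comments since it assumes a working Agg build:]

```python
import resource
import sys

def peak_rss_mb():
    """Return this process's peak resident set size in MB.
    ru_maxrss is reported in kilobytes on Linux and bytes on macOS."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss /= 1024.0
    return rss / 1024.0

# Intended experiment (assumes matplotlib with the Agg backend):
# import matplotlib
# matplotlib.use("Agg")
# import matplotlib.pyplot as plt
# fig, ax = plt.subplots()
# ax.plot(range(100000))
# before = peak_rss_mb()
# fig.savefig("test.png")
# print("peak RSS grew by %.1f MB during savefig" % (peak_rss_mb() - before))
```

Note that peak RSS only ever grows, so measurements should be taken as before/after deltas around the operation of interest.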


Just noticed an oddity between the tests on Travis and the tests on my machine. The test log on Travis for a single run has over 10,000 lines, but for me it is only about 4,800 lines. At a glance, I can see that test_mlab is not executed for me, but it is for Travis. I am very suspicious of the test_mlab run on Travis because it seems to be running multiple times, but I can't be sure.

Michael, can I get the test log for one of the recent Travis runs?

Thanks,

Ben Root


False alarm. The Travis log includes (but hides) the install output, and test_mlab was running for me; I was just looking at the wrong line numbers. Still, I would be curious about any differences, but I can't seem to download the log file.

Ben
