imread() and PNGs

Hi everyone,
After getting severely fed up with Matlab in recent months I downloaded Python, Numpy and Matplotlib to try out as an alternative. So far I'm pleasantly impressed, even if building from source on Mac OS X is an experience ;) However, I have discovered a couple of problems with Matplotlib's imread() function and, shall we say, 'esoteric' PNG files. My research group uses a 12-bit CCD controlled through Labview to capture high dynamic range image stacks. Often there are ~30 images in a single data set. These get read into Matlab in one go for processing as a stack. I tried converting my code over to Python but, after digging through the _png.cpp source file, found the following, which are problems from my point of view:

1) All .png files in imread() are converted to 4-plane RGBA files regardless of original format. I would prefer greyscale images to return a single plane.
2) 16-bit PNGs are stripped to 8 bit, losing any extra precision.
3) The significant bits option in the PNG header was not being checked. Our camera software will automatically save the PNGs at the maximum bit-depth required to cover the dynamic range in the image, and can sum images before saving, so pixels can be anywhere from 6- to 16-bits (at least those are the values I have observed whilst using the camera).

I have attached the results of an svn diff after I made an attempt at correcting these issues. This is the first time I have contributed to an open source project, so I am not sure of the etiquette here. Also, I have only had Python and Matplotlib for a fortnight, so I am still unfamiliar with them, and I haven't programmed with libpng before, so I apologise in advance if there are any stupid mistakes in my code. I am aware that imread() is a pretty important function in Matplotlib and hence any changes I suggest would need comprehensive testing. In brief, I made the following changes:

1) Removed the libpng 16- to 8-bit strip command
2) Added in the libpng calls to cope with variable bit-depth and converting 16-bit pngs from big-endian to little-endian
3) Added a large if/else if structure at the end to return different PyArrays depending on the input data. RGBA images are 4 plane, RGB 3 plane and greyscale 1 plane. Numbers within these are still floats scaled between 0 and 1, except 16-bit images which are doubles (Are floats preferable to doubles?). The scaling factor is worked out from the significant bits struct.
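
As a rough illustration of change 3, the shape and scaling logic can be sketched in NumPy. This is only a sketch under stated assumptions, not the code in the attached patch: the helper name to_planes and its arguments are hypothetical, and the samples are assumed to have been decoded into a flat sequence already.

```python
import numpy as np

def to_planes(samples, height, width, channels, sig_bits):
    """Reshape flat PNG samples and scale them to floats in [0, 1].

    Hypothetical helper for illustration only; not matplotlib's API.
    """
    data = np.asarray(samples, dtype=np.float64)
    # Largest value representable in the number of significant bits.
    max_value = (1 << sig_bits) - 1
    if channels == 1:
        shape = (height, width)            # greyscale: a single 2-D plane
    else:
        shape = (height, width, channels)  # RGB: 3 planes, RGBA: 4 planes
    return data.reshape(shape) / max_value

# A 2x2 greyscale image with 12 significant bits.
grey = to_planes([0, 2047, 4095, 1024], 2, 2, channels=1, sig_bits=12)
# A 2x2 RGB image with 8 significant bits.
rgb = to_planes(range(12), 2, 2, channels=3, sig_bits=8)
```

Here the greyscale result is a 2-D array and the RGB result is a 3-D array, which is the behaviour requested in the list of problems above.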

There are still a couple of issues with this code: I have only tested it with the PNGs I have to hand, all of which display correctly with imshow(), and I have not made much attempt at supporting 1-, 2- and 4-bit PNGs. I'm personally not a big fan of large if/else ifs but in this case thought it was the clearest way to return the different types.

I would finally like to point out that no software I have used so far has been able to read the images produced by this camera completely correctly. PIL interprets the variable bit-depth images as binary (?!), and we had to write a wrapper round the Matlab imread() function using imfinfo(), as Matlab ignores the significant bits setting as well.

Oh, almost forgot, I'm compiling on Mac OS X 10.5, Python 2.6.1 (r261:67515) and the latest Numpy and Matplotlib SVN checkouts.

Kind regards,
Tobias Wood

png_patch.txt (5.55 KB)

Tobias,

Thank you very much for the patch and the careful explanation. I'm not the right person to review it, but I expect someone familiar with libpng use in mpl will do so soon. Mike D. would be a candidate, but I think he will be unavailable for several days. If you have not gotten any feedback within a week, *please* ping us with a reminder. If it comes to that, you could do it by forwarding your original message to matplotlib-devel.

Eric

Tobias,

I would like to apply your patch, but the test in
examples/tests/pngsuite fails. If you can submit a new patch where this
test passes, and, even better, if a small example 12-bit PNG of yours is
added to the test, I will apply it.

Apart from that, I would echo Eric's thanks for the patch and explanation.

-Andrew

Hi Andrew and Eric,
Thanks for the responses. I was unaware of the png test suite. I have attached a new diff that passes this test correctly. It originally failed because I was not handling greyscale images with an alpha channel, but it also brought to light several other issues with my code that I have fixed. I have changed the structure of the code significantly - the if/else struct has gone and a single loop returns the different image matrices. Although this is more concise, it no longer informs the user if it hits an unsupported image type.

Unfortunately I do not have a small test image available, all of ours are a minimum of 512x620. However the pngsuite page, http://libpng.org/pub/png/pngsuite.html, does have a set of suitable images labelled cs*n*.png. These have thrown up an interesting issue - to what maximum value should n-bit images be scaled when n is between 8 and 16? The png spec and test images suggest it should be (2^n - 1). This means that higher bit depths give higher precision over the same intensity range and the same maximum value. However for my particular camera and software this would be wrong, as the CCD has a fixed 12-bit dynamic range and the lower png bit depths are only used to save file space. Hence at the moment I have set my software to scale to (2^16 - 1) for 8 < n < 16, but it follows the png spec for n < 8, so there are two contradictory behaviours and I am unsure which is the best approach. Personally I would prefer matplotlib to return raw integer values, not floats scaled between 0 and 1 and then I can apply the scaling myself, but I am aware that this is not particularly user friendly for anyone else. imshow() seems to handle integer values fine and correctly scales for display, provided that no alpha channel is present.
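
To make the two conventions concrete, here is a minimal sketch that normalises the same raw values both ways. The function name normalise and the convention labels "spec" and "fixed16" are made up for this illustration; neither is part of matplotlib or of the patch.

```python
import numpy as np

def normalise(raw, sig_bits, convention="spec"):
    """Scale raw integer samples to floats using one of two conventions.

    "spec":    divide by 2**sig_bits - 1, as the PNG spec and test images suggest.
    "fixed16": always divide by 2**16 - 1, regardless of sig_bits.
    Both labels are hypothetical names chosen for this sketch.
    """
    raw = np.asarray(raw, dtype=np.float64)
    if convention == "spec":
        return raw / (2 ** sig_bits - 1)
    if convention == "fixed16":
        return raw / (2 ** 16 - 1)
    raise ValueError("unknown convention: %r" % (convention,))

# Samples from a file that declares 10 significant bits.
raw = np.array([0, 512, 1023])
spec_scaled = normalise(raw, 10, "spec")      # brightest sample maps to 1.0
fixed_scaled = normalise(raw, 10, "fixed16")  # brightest sample stays well below 1.0
```

Under the first convention every file fills the full 0 to 1 range; under the second, intensities stay comparable between files saved at different bit depths, which is what matters for a camera with a fixed dynamic range.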

Should I post another message to the developer list about this to see what people think? I'd very much like to discuss this with someone who has a lot more experience of pngs than me.

Thanks,
Tobias

png_patch2.txt (4.75 KB)

Tobias Wood
[...]

In the past I worked on a similar problem, saving and loading images from a 14-bit monochrome CCD camera to a PNG file. The PNG specification gives precise recommendations on how to handle such a case: to save an n-bit image whose bit depth is not directly supported by the PNG spec, the image needs to be scaled up to one of the supported bit depths, i.e. 1, 2, 4, 8 or 16. The original bit depth should be stored in the sBIT chunk so that the original data can be recovered. Explicitly, a 12-bit image needs to be scaled by a factor (2^16 - 1)/(2^12 - 1).
The approach I have chosen is this: I have png_save_image(filename, img, significant_bits), which behaves as specified in the spec, i.e. 14-bit images are scaled up to 16 bit. For loading an image I use img, metadata = png_load_image(filename), which returns the downscaled image as an integer array and a dict containing some metadata (including the original bit depth). I also noticed that neither libpng, PIL nor Matlab performs this downscaling. With this approach the loaded image data is identical to the original raw image. For further processing I typically normalize the image to the range 0 to 1.0, using the bit depth information. A float array is sufficiently precise, even for 16-bit images.
For your enhancements to imread, introducing a new keyword 'normalized' would allow switching between these two possibilities. As I understand it, I have essentially chosen the same approach as you, at least for bit depths > 8. I didn't understand what is different for bit depths <= 8.
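
The scale-up on save and scale-down on load described above can be sketched as follows. This is a minimal sketch assuming linear scaling with rounding; the names scale_up and scale_down are hypothetical and are not the png_save_image / png_load_image functions mentioned in the message, whose implementation is not shown in this thread.

```python
import numpy as np

def scale_up(img, significant_bits, container_bits=16):
    """Scale n-bit samples to the full range of the container bit depth."""
    src_max = (1 << significant_bits) - 1
    dst_max = (1 << container_bits) - 1
    wide = np.asarray(img, dtype=np.int64)  # wide type avoids overflow
    # Integer division with rounding to the nearest value.
    return ((wide * dst_max + src_max // 2) // src_max).astype(np.uint16)

def scale_down(img, significant_bits, container_bits=16):
    """Recover the original n-bit samples from the scaled container values."""
    src_max = (1 << significant_bits) - 1
    dst_max = (1 << container_bits) - 1
    wide = np.asarray(img, dtype=np.int64)
    return ((wide * src_max + dst_max // 2) // dst_max).astype(np.uint16)

# Round trip over every possible 12-bit value.
original = np.arange(4096, dtype=np.uint16)
stored = scale_up(original, 12)      # what would be written to a 16-bit PNG
restored = scale_down(stored, 12)    # what a loader would hand back
assert np.array_equal(restored, original)
```

Because the upscaling factor is larger than 1, the rounding error introduced on save is too small to change the result on load, so the round trip is lossless.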

Another remark on your first posting: I didn't experience any problem loading 16-bit PNG grayscale images with PIL.
I also noticed you used C++ constructs in your code. I think this is not recommended.

Gregor

Gregor Thalhammer wrote:

Tobias Wood
[...]

[...]

I also noticed you used C++ constructs in your code. I think this is not recommended.

Gregor,

Would you elaborate, please, to satisfy my curiosity? The original file is C++, so use of C++ constructs in the patch would seem natural.

Thank you.

Eric

Tobias,

I went ahead and applied your patch to the svn trunk and the 0.98.5
maintenance branch -- the aspect of having grayscale images come in as
2d arrays brings the functionality in line with the docstring to
imread(), so it qualifies as a bug fix. The rest is a nice feature
addition (the ability to read high dynamic range PNGs) that I think is
unlikely to break anything.

As for your questions, I think they can be addressed later. (I hope you
maintain your interest in this subject.) In particular, it would be good
to get Michael Droettboom's responses on this -- he's the resident PNG
expert. In terms of the 8 < n < 16 bit PNG issue, if they are to be
stored as integers, they will have to be stored in 16 bits, thus there
are two reasonable ways to do it -- left shifted and right shifted.
There are arguments for both. Thus, my opinion is that adding keyword
arguments to imread() that would modify the current behavior
appropriately would be the best solution. In other words, something like
return_as_integer=False, integer_shift='left' would be the defaults.
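
The difference between the two placements can be shown with a short sketch. The function pack_16bit is hypothetical, and integer_shift is only the keyword name proposed above; neither is taken from an existing imread() signature.

```python
import numpy as np

def pack_16bit(samples, sig_bits, integer_shift="left"):
    """Place n-bit samples (8 < n < 16) into 16-bit integers.

    "left":  samples occupy the high bits, so full scale is close to 65535.
    "right": samples keep their raw values, so full scale is 2**sig_bits - 1.
    Hypothetical helper; the keyword mirrors the proposal, not a real API.
    """
    samples = np.asarray(samples, dtype=np.uint16)
    if integer_shift == "left":
        return np.left_shift(samples, 16 - sig_bits).astype(np.uint16)
    if integer_shift == "right":
        return samples.copy()
    raise ValueError("integer_shift must be 'left' or 'right'")

raw12 = [0, 1, 4095]                           # 12-bit sample values
left = pack_16bit(raw12, 12, "left")           # 0, 16, 65520
right = pack_16bit(raw12, 12, "right")         # 0, 1, 4095
```

Left shifting keeps images of different bit depths at a similar brightness; right shifting preserves the raw counts from the sensor.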

Next time you submit patches, I do think it would be best to submit to
the mpl-dev email list.

Anyhow, thanks for the patch!

-Andrew

Eric Firing wrote:

[...]

You are right. I was assuming the original file is C. Sorry for this stupid comment.

Gregor