[rfc] a better interchange format for colormaps

Hi matplotters,

We've been working on improving our tool for building custom
viridis-like colormaps (e.g. adding support for diverging colormaps),
and one of the things that we got frustrated by is how there's no
compelling way to save and distribute the resulting colormaps so that
people can actually use them. So, I wrote up a little spec to
standardize a way of storing and distributing colormaps in JSON, with
the hope that we can convince everyone to implement this and stop
writing silly little conversion scripts all the time.

The v0.1 draft is here:
    https://github.com/njsmith/json-cm/blob/master/json-cm-spec.md

Any comments? There are always a lot of fiddly details to get right in
this kind of thing -- I made a bunch of guesses about what kind of
stuff is important and how to represent it, but it can only benefit
from review from different perspectives. I would equally love to get
nitpicky critiques and high-level queries.

Thanks!
-n

···

--
Nathaniel J. Smith -- https://vorpus.org

While I'm not a fan of Matlab, I think it would be nice to be able to
use colormaps defined in this way with it. I haven't worked in Matlab
for a few years, but at the time there were no built-in tools to read
JSON files. It would be unfortunate if this interchange format
depended on a large third-party library to be used with Matlab.

Cheers,
Daniele

···

On 4/20/16 5:22 PM, Nathaniel Smith wrote:

I wrote up a little spec to
standardize a way of storing and distributing colormaps in JSON, with
the hope that we can convince everyone to implement this and stop
writing silly little conversion scripts all the time.

Do you have any suggestions for how to format the data so that it
could be used more easily in Matlab while still being easily usable
from Javascript? I guess if I have to pick I'm more worried about
compatibility with D3 and Vega than with Matlab...

-n

···

On Wed, Apr 20, 2016 at 4:59 PM, Daniele Nicolodi <daniele at grinta.net> wrote:

--
Nathaniel J. Smith -- https://vorpus.org

Could you use the HDF format? It's a bit of overkill as a storage
container, but almost everything supports it.

···

On Wed, Apr 20, 2016 at 8:10 PM, Nathaniel Smith <njs at pobox.com> wrote:

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel


The only thing that can read/write HDF5 is libhdf5, which is a giant C
codebase. Unless I'm missing something, I think this would make the
format utterly inaccessible to browser-based Javascript visualizations
-- versus, it's at least *possible* to read JSON in Matlab :-).

-n

···

On Wed, Apr 20, 2016 at 6:25 PM, Isaac Gerg <isaac.gerg at gergltd.com> wrote:

Could you use the HDF format? It's a bit of overkill as a storage
container, but almost everything supports it.

--
Nathaniel J. Smith -- https://vorpus.org

Nathaniel,

I like the idea.

Minor and tentative suggestion: add a key to specify the number of
colors. Yes, it can be calculated from the length of the "colors"
string, but I have a hunch the convenience of having it immediately
available (even grep-able) would be worth the slight redundancy.

Although matplotlib's colormap support was originally designed by John
Hunter around his LinearSegmentedColormap scheme, that method for
defining and recording a colormap has always been baffling to most
people--and the end result is a simple lookup table anyway. Therefore I
favor moving to the direct specification of the evenly-spaced
fine-grained color list (suitable for matplotlib's ListedColormap) as in
your proposal here.
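
As a rough illustration of why the direct list is simple to work with, here is a minimal sketch that unpacks a lookup table into an evenly spaced list of RGB triples using only the standard library. The `"colors"` key and the packing of six hex digits per color are assumptions made for illustration based on this thread, not quoted from the spec, and `decode_colors` is a hypothetical helper.

```python
import json

def decode_colors(hex_string):
    """Split a packed hex string into (r, g, b) tuples of floats in [0, 1]."""
    if len(hex_string) % 6 != 0:
        raise ValueError("expected 6 hex digits per color")
    colors = []
    for i in range(0, len(hex_string), 6):
        chunk = hex_string[i:i + 6]
        # Each channel is two hex digits, i.e. an integer in 0..255.
        r, g, b = (int(chunk[j:j + 2], 16) / 255 for j in (0, 2, 4))
        colors.append((r, g, b))
    return colors

# Hypothetical three-color document; real files would hold many more entries.
doc = json.loads('{"colors": "000000808080ffffff"}')
lut = decode_colors(doc["colors"])
print(len(lut), lut[0], lut[-1])
```

A list like `lut` is the kind of input that matplotlib's ListedColormap accepts.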

Eric

···

On 2016/04/20 1:22 PM, Nathaniel Smith wrote:

I wrote up a little spec to
standardize a way of storing and distributing colormaps in JSON, with
the hope that we can convince everyone to implement this and stop
writing silly little conversion scripts all the time.

While I'm not a fan of Matlab, I think it would be nice to be able to
use colormaps defined in this way with it. I haven't worked in Matlab
for a few years, but at the time there were no built-in tools to read
JSON files. It would be unfortunate if this interchange format
depended on a large third-party library to be used with Matlab.

I haven't tried it (I don't use Matlab any more), but it looks like this
small third party library takes care of it:

https://github.com/fangq/jsonlab

Eric

···

On 2016/04/20 1:59 PM, Daniele Nicolodi wrote:

On 4/20/16 5:22 PM, Nathaniel Smith wrote:

Cheers,
Daniele

Nathaniel,

I like the idea.

Minor and tentative suggestion: add a key to specify the number of colors.
Yes, it can be calculated from the length of the "colors" string, but I have
a hunch the convenience of having it immediately available (even grep-able)
would be worth the slight redundancy.

Hmm, I'll have to think about that one. My gut reaction is to be wary
about introducing redundancy because it tends to create
source-of-truth problems -- if the two things disagree, then which is
right? -- and then those tend to create interoperability problems.
But, maybe I'm just missing something... I don't think I've ever
grepped for a colormap on the basis of how many control points it
contained, so I might not be the target audience :-). Can you give any
examples of when you'd use this?
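
To make the source-of-truth concern concrete, here is a minimal sketch of what a reader would have to do if a count were stored next to the data. The key names `"colors"` and `"n_colors"` and the packing of six hex digits per color are hypothetical, chosen only for illustration; `load_checked` is likewise a made-up helper.

```python
import json

def load_checked(text):
    """Parse a colormap document and reject a mismatched redundant count."""
    doc = json.loads(text)
    actual = len(doc["colors"]) // 6  # assumed: 6 hex digits per color
    declared = doc.get("n_colors")    # hypothetical optional key
    if declared is not None and declared != actual:
        # Neither value is obviously the right one, so refuse to guess.
        raise ValueError(f"n_colors={declared} but data holds {actual} colors")
    return doc, actual

doc, n = load_checked('{"colors": "ff000000ff00", "n_colors": 2}')
print(n)

try:
    load_checked('{"colors": "ff000000ff00", "n_colors": 3}')
except ValueError as exc:
    print("rejected:", exc)
```

Every reader needs some policy for the mismatch case (reject, trust the data, or trust the key), and different policies in different tools are where interoperability problems could come from.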

Although matplotlib's colormap support was originally designed by John
Hunter around his LinearSegmentedColormap scheme, that method for defining
and recording a colormap has always been baffling to most people--and the
end result is a simple lookup table anyway. Therefore I favor moving to the
direct specification of the evenly-spaced fine-grained color list (suitable
for matplotlib's ListedColormap) as in your proposal here.

This might improve the handling of discrete/qualitative colormaps too
-- right now AFAICT matplotlib just doesn't handle these sensibly at
all by default. The way the colorbrewer palettes have historically
been mangled in matplotlib is pretty bad :-/

    http://matplotlib.org/mpl_examples/color/colormaps_reference_04.png

Though in some sense I suppose one has lost as soon as one is trying
to use the Colormap interface's float -> color semantics for a
discrete map -- the seaborn "palette" concept is in some ways closer
to what we'd want for qualitative colormaps.

-n

···

On Wed, Apr 20, 2016 at 7:32 PM, Eric Firing <efiring at hawaii.edu> wrote:

On 2016/04/20 1:22 PM, Nathaniel Smith wrote:

--
Nathaniel J. Smith -- https://vorpus.org

Nathaniel,

I like the idea.

Minor and tentative suggestion: add a key to specify the number of colors.
Yes, it can be calculated from the length of the "colors" string, but I have
a hunch the convenience of having it immediately available (even grep-able)
would be worth the slight redundancy.

Hmm, I'll have to think about that one. My gut reaction is to be wary
about introducing redundancy because it tends to create
source-of-truth problems -- if the two things disagree, then which is
right? -- and then those tend to create interoperability problems.

Agreed, which is why I said "tentative".

But, maybe I'm just missing something... I don't think I've ever
grepped for a colormap on the basis of how many control points it
contained, so I might not be the target audience :-). Can you give any
examples of when you'd use this?

I was thinking of how one would quickly summarize the characteristics of
a file without having to parse it programmatically as JSON. The number
of colors in the list is just meta-data. But this is not a big deal
either way.

Although matplotlib's colormap support was originally designed by John
Hunter around his LinearSegmentedColormap scheme, that method for defining
and recording a colormap has always been baffling to most people--and the
end result is a simple lookup table anyway. Therefore I favor moving to the
direct specification of the evenly-spaced fine-grained color list (suitable
for matplotlib's ListedColormap) as in your proposal here.

This might improve the handling of discrete/qualitative colormaps too
-- right now AFAICT matplotlib just doesn't handle these sensibly at
all by default. The way the colorbrewer palettes have historically
been mangled in matplotlib is pretty bad :-/

Agreed. That slipped in without adequate review.

     http://matplotlib.org/mpl_examples/color/colormaps_reference_04.png

Though in some sense I suppose one has lost as soon as one is trying
to use the Colormap interface's float -> color semantics for a
discrete map -- the seaborn "palette" concept is in some ways closer
to what we'd want for qualitative colormaps.

Matplotlib colormaps can be indexed with an integer to go straight to an
element in the lookup table; it is not necessary to use a float in the
0-1 range.

It would be possible to modify Colormap to include a "qualitative" flag,
and require a direct integer index when that flag is set. Or a warning
could be issued upon use of a float.
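
A minimal sketch of that idea, written as a toy class rather than against matplotlib itself. `LookupTable` and its `qualitative` argument are hypothetical names for illustration and are not part of matplotlib's Colormap API; the float-to-index rule here is a simplification.

```python
import warnings

class LookupTable:
    """Toy stand-in for a colormap; not matplotlib's Colormap class."""

    def __init__(self, colors, qualitative=False):
        self.colors = list(colors)
        self.qualitative = qualitative  # hypothetical flag from the thread

    def __call__(self, x):
        if isinstance(x, int):
            # Integers index the table directly.
            return self.colors[x]
        if self.qualitative:
            warnings.warn("float lookup on a qualitative map", stacklevel=2)
        # Floats in [0, 1] are scaled onto the table, clamped at the ends.
        n = len(self.colors)
        return self.colors[min(max(int(x * n), 0), n - 1)]

pal = LookupTable(["red", "green", "blue"], qualitative=True)
print(pal(1))  # direct integer index

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(pal(0.99), len(caught))
```

Raising an error instead of warning would correspond to the stricter variant, where a direct integer index is required whenever the flag is set.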

Eric

···

On 2016/04/20 5:07 PM, Nathaniel Smith wrote:

On Wed, Apr 20, 2016 at 7:32 PM, Eric Firing <efiring at hawaii.edu> wrote:

On 2016/04/20 1:22 PM, Nathaniel Smith wrote:

-n

Though in some sense I suppose one has lost as soon as one is trying
to use the Colormap interface's float -> color semantics for a
discrete map

I agree -- I think a discrete set of colors is fundamentally different than
a (logically at least) continuous colormap -- we should simply keep them
separate.

-- the seaborn "palette" concept is in some ways closer
to what we'd want for qualitative colormaps.

yup.

-CHB

···

On Wed, Apr 20, 2016 at 8:07 PM, Nathaniel Smith <njs at pobox.com> wrote:

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

Great idea!

As it happens, we're messing with colormaps ourselves here, and indeed
need to have consistent ones across browser-JS and Python -- so great
timing!

I'm not so sure about the encoding of the colors -- I read the
justification (nicely written), but I still think file size is pretty much
a non-issue -- these are not going to be gigabytes no matter how you slice
it, and the compressed file sizes are definitely close enough in size to
make no difference. So I would much prefer:

[ [r, g, b], [r,g,b], ....]

However, maybe that's because I still can't wrap my head easily around hex
:-)
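
The size question can be checked rather than argued. The sketch below builds the same synthetic 256-entry ramp in both encodings and prints raw and zlib-compressed byte counts. The `"colors"` key, the packing of six hex digits per color, and the rounding to four decimals are all assumptions for illustration, and a grayscale ramp is not representative of real colormaps, so the numbers will differ for actual data.

```python
import json
import zlib

# Synthetic 256-entry grayscale ramp, used only to have something to measure.
ramp = [(i, i, i) for i in range(256)]

# Encoding 1: one packed hex string (6 hex digits per color, assumed layout).
as_hex = json.dumps({"colors": "".join("%02x%02x%02x" % c for c in ramp)})

# Encoding 2: nested lists of floats in [0, 1], rounded to 4 decimals.
as_lists = json.dumps({"colors": [[round(v / 255, 4) for v in c] for c in ramp]})

for label, text in (("hex", as_hex), ("lists", as_lists)):
    raw = text.encode("utf-8")
    print(label, len(raw), len(zlib.compress(raw, 9)))
```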

I think compatibility with D3 per se should be a minor influence, but if
Javascript itself "likes" to use hex colors and could more easily deal with
this, then I guess that might be a driver (or do I think that because I'm a
Pythonista who could more easily write the conversion code in Python than
in JS ...)

Also for discrete colormaps, might there be a use case for transparency?
i.e. RGBA ? In which case, the encoding should be able to handle either
three or four channels per color with no ambiguity. Would that be two hex
values per color for a 32-bit color (then you have to have a clean
endianness, yes?), or commas or something separating the colors?
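
One way to get there, sketched under stated assumptions: declare the channel count through a color-space field and keep two hex digits per channel in a fixed text order. The labels `sRGB` and `sRGBA` are placeholders and `decode` is a hypothetical helper, not the spec's API. Because the digits are read left to right as text in this sketch, no multi-byte integer is ever serialized, so byte order does not come into it.

```python
def decode(hex_string, colorspace):
    """Decode packed hex into tuples of 0..255 ints, 3 or 4 per color."""
    # Hypothetical mapping from a declared color space to channel count.
    channels = {"sRGB": 3, "sRGBA": 4}[colorspace]
    width = 2 * channels  # two hex digits per channel
    if len(hex_string) % width != 0:
        raise ValueError("string length does not match the declared color space")
    return [
        tuple(int(hex_string[i + j:i + j + 2], 16) for j in range(0, width, 2))
        for i in range(0, len(hex_string), width)
    ]

print(decode("ff000000ff00", "sRGB"))
print(decode("ff000080", "sRGBA"))
```

A 24-digit string would be valid as either four RGB or three RGBA colors, which is why the sketch relies on an explicit declaration rather than on the string length.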

As for specifying the number of colors somewhere else, I agree that that's
just a recipe for inconsistency with no gain -- is anyone going to have to
write a parser where they have to allocate the memory ahead of time???

-CHB

···

On Wed, Apr 20, 2016 at 4:22 PM, Nathaniel Smith <njs at pobox.com> wrote:


Columns of RGB points that I have been using for sharing colormaps have
been pretty easy and straightforward (easily extended to 4 columns for
alpha). Is the main problem the lack of metadata?
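
For comparison, a column file of that kind is easy to wrap with metadata. This is a minimal sketch, assuming whitespace-separated floats in [0, 1] with one color per line, and a JSON layout with a `"name"` key and a packed-hex `"colors"` key (both hypothetical here, as is the helper `columns_to_json`).

```python
import json

def columns_to_json(text, name):
    """Convert lines of 'r g b' floats in [0, 1] to a JSON document string."""
    hex_parts = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        r, g, b = (float(v) for v in line.split())
        # Scale to 0..255 and clamp so small overshoots stay in range.
        hex_parts.append("".join(
            "%02x" % min(max(round(c * 255), 0), 255) for c in (r, g, b)))
    return json.dumps({"name": name, "colors": "".join(hex_parts)})

sample = "0.0 0.0 0.0\n0.5 0.5 0.5\n1.0 1.0 1.0\n"
print(columns_to_json(sample, "gray3"))
```

The color data itself carries over unchanged apart from quantization to 8 bits per channel; what the wrapper adds is a place for the name and any other descriptive fields.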

Kristen

···

On Thu, Apr 21, 2016 at 2:24 PM, Chris Barker - NOAA Federal [via matplotlib] <ml-node+s1069221n46987h22 at n5.nabble.com> wrote:

--
Kristen M. Thyng
Assistant Research Scientist
Department of Oceanography
Texas A&M University
Eller O&M 607
http://kristenthyng.com


A few quick comments:

We have a GSoC student who will be working on categorical axes, which might
spill over into improving the discrete color maps.

I agree about the importance of supporting alpha in color maps, but that
could be dealt with by using 'sRGBA' or 'RGBA' as the color space?

JSON is, for better or worse, the lingua franca these days. For something
this size HDF5 is super overkill. It would also introduce a pretty heavy
dependency to mpl that I would be inclined to reject (libhdf5, h5py and
implicitly Cython).

I can see the advantage of being able to extract the number of points
without parsing the data, but would want to see a case where that parsing is a
significant bottleneck to trade against the possible ambiguity of
including the same information twice.

Tom

···

On Thu, Apr 21, 2016, 12:24 Chris Barker <chris.barker at noaa.gov> wrote:
