color problems in scatter plot

Hello,
I am using Matplotlib 1.0.0 in Python 2.6.
I am trying to plot time series data of unique IDs and color the points
based on location. Each data point has a unique ID value, a date value, and
a location value.
The unique IDs and date values are plotting fine but I am unable to control
the color and subsequently the legend.

Here is a sample of the data.
IDs = [47, 33, 47, 12, 50, 50, 27, 27, 16, 27]
locations = ['201', '207', '207', '205', '204', '201', '209', '209',
'207','207']
dates = [ 733315.83240741, 733315.83521991, 733315.83681713,
        733315.83788194, 733336.54554398, 733336.54731481,
        733337.99842593, 733337.99943287, 733338.00070602,
        733338.00252315]

This basic code works.

fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(dates,IDs,marker='d')
ax.xaxis_date()
fig.autofmt_xdate()
plt.grid(True)
plt.show()

I've been trying to figure out how to set color = locations with no success.
Any ideas out there?
Thanks,

Mike

--
View this message in context: http://old.nabble.com/color-problems-in-scatter-plot-tp32584727p32584727.html
Sent from the matplotlib - users mailing list archive at Nabble.com.

Mike, sorry to send this twice… I should have sent it to the list as well…

Mike,

If your locations were integers or floats rather than strings, you could just change the scatter call to the following:

ax.scatter(dates,IDs,c=locations,marker='d')

I don't know about a legend… I don't know if that is possible with a scatter plot (?). Because scatter plots get their colors based off of a color map, you could generate a color bar for your data. You may need to capture the collection object returned from the scatter plot function call, though. Here's your code with these modifications:

# Of course, you need to change your locations list to integers rather than strings.
locations = [int(loc) for loc in locations]

fig = plt.figure()
ax = fig.add_subplot(111)
sc = ax.scatter(dates,IDs,c=locations,marker='d')
ax.xaxis_date()
fig.autofmt_xdate()
plt.colorbar(sc)
plt.grid(True)
plt.show()
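
If the location labels are not guaranteed to be numeric, another way to get numbers for c= is to give each distinct label an integer code. This is only a minimal sketch in plain Python; the names labels, code_of and codes are illustrative and do not come from the thread:

```python
locations = ['201', '207', '207', '205', '204', '201', '209', '209',
             '207', '207']

# Sorted list of the distinct labels; a label's position is its code.
labels = sorted(set(locations))
code_of = dict((label, k) for k, label in enumerate(labels))

# One integer per data point, in the same order as the original list.
codes = [code_of[loc] for loc in locations]

print(labels)
print(codes)
```

The codes list could then be passed as c=codes in the scatter call, and a colorbar value k would refer to labels[k].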

If you really need a legend, then you could do a loop of plot commands for each set of unique locations. Using some fancy NumPy masking makes the process easier…

import numpy as np
import matplotlib.pyplot as plt

IDs = np.array([47, 33, 47, 12, 50, 50, 27, 27, 16, 27])
locations = np.array(['201', '207', '207', '205', '204', '201', '209', '209',
                      '207', '207'])
dates = np.array([733315.83240741, 733315.83521991, 733315.83681713,
                  733315.83788194, 733336.54554398, 733336.54731481,
                  733337.99842593, 733337.99943287, 733338.00070602,
                  733338.00252315])

fig = plt.figure()
ax = fig.add_subplot(111)
cs = ['r', 'b', 'g', 'k', 'c']
# One plot call per unique location, so each location gets its own legend entry.
# The boolean mask locations==i selects the points belonging to location i.
for n, i in enumerate(np.unique(locations)):
    ax.plot(dates[locations==i], IDs[locations==i], 'd', c=cs[n%len(cs)], label=i)
ax.xaxis_date()
fig.autofmt_xdate()
plt.legend(numpoints=1)
plt.grid(True)
plt.show()
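
With only five entries in cs, the modulo indexing in the loop starts repeating colors once there are more than five locations. One possible workaround, offered here as an assumption rather than anything proposed in the thread, is to combine colors with marker shapes so that each location gets a distinct format string. The markers list and the styles name are made up for illustration:

```python
from itertools import product

colors = ['r', 'b', 'g', 'k', 'c']
markers = ['d', 'o', 's', '^']

# Pair every marker with every color: 'rd', 'bd', ..., 'c^'.
# Each string is a matplotlib format string (color letter + marker symbol).
styles = [color + marker for marker, color in product(markers, colors)]

print(len(styles))
print(styles[:6])
```

In the loop, the 'd' and c= arguments could then be replaced by a single format string, for example ax.plot(dates[locations==i], IDs[locations==i], styles[n % len(styles)], label=i).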

Not sure if this is exactly what you wanted, but I hope it helps a little.

Ryan

In reply to Michael Castleton's message of Mon, Oct 3, 2011 at 2:49 PM.

Matplotlib-users mailing list

Matplotlib-users@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/matplotlib-users

Michael, if I were you, I would reorganize and group your data into
several separate scatter data sets, based on the location parameter.
Then, color each SET the color that you want. Here's a start, from the
data as you provided it:

points = [(a, b, c) for a, b, c in zip(locations, IDs, dates)]
for p in points:
    print p

('201', 47, 733315.83240741002)
('207', 33, 733315.83521991002)
('207', 47, 733315.83681712998)
('205', 12, 733315.83788193995)
('204', 50, 733336.54554397997)
('201', 50, 733336.54731480998)
('209', 27, 733337.99842593004)
('209', 27, 733337.99943286995)
('207', 16, 733338.00070602004)
('207', 27, 733338.00252314995)

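def make_dict(lst):
    # Map each location to a pair of lists: ([IDs], [dates]).
    d = {}
    for a, b, c in lst:
        try:
            d[a][0].append(b)
            d[a][1].append(c)
        except KeyError:
            # First time this location is seen: start its two lists.
            d[a] = ([b], [c])
    return d

collated = make_dict(points)
for k in collated:
    print k, collated[k]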

201 ([47, 50], [733315.83240741002, 733336.54731480998])
209 ([27, 27], [733337.99842593004, 733337.99943286995])
205 ([12], [733315.83788193995])
204 ([50], [733336.54554397997])
207 ([33, 47, 16, 27], [733315.83521991002, 733315.83681712998,
733338.00070602004, 733338.00252314995])

From collated, you could then plot five scattergrams, each of a different color, in the same axes object.
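
The same grouping can also be written with collections.defaultdict from the standard library. This is only a sketch of an alternative to make_dict; groups is an illustrative name, and the sample data is repeated so the snippet runs on its own:

```python
from collections import defaultdict

IDs = [47, 33, 47, 12, 50, 50, 27, 27, 16, 27]
locations = ['201', '207', '207', '205', '204', '201', '209', '209',
             '207', '207']
dates = [733315.83240741, 733315.83521991, 733315.83681713,
         733315.83788194, 733336.54554398, 733336.54731481,
         733337.99842593, 733337.99943287, 733338.00070602,
         733338.00252315]

# location -> ([IDs], [dates]); a missing key starts as a pair of empty lists.
groups = defaultdict(lambda: ([], []))
for loc, id_, date in zip(locations, IDs, dates):
    groups[loc][0].append(id_)
    groups[loc][1].append(date)

for loc in sorted(groups):
    print("%s: %d points" % (loc, len(groups[loc][0])))
```

Each group could then be drawn with its own call, for example ax.scatter(groups[loc][1], groups[loc][0], label=loc) with a color of your choice, followed by ax.legend().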

In reply to Michael Castleton's message of Mon, 2011-10-03 at 12:49 -0700.

Ryan,
I have tried setting c=locations (after converting to float) and gotten
inconsistent results. For a dataset with ~32,000 points it seems to work,
but a 2nd dataset of ~100,000 points colors everything the same even though
there are at least 10 locations.
Your second idea works nicely and I'm going to try applying it to my data.
The only real issue is that I don't know how many locations there will be
for each plot so I can't hard code the colors list. I think I can figure
that part out though.
Thanks!

Mike


Ryan,
I should clarify my color issue. Your code is smart enough to generate
however many colors are needed but I want to make sure the colors are all
unique.
Thanks again!

Mike


John,
I'll give this method a try also.
Thanks for the ideas!

Mike

In reply to John Ladasky-3.

Mike,

You may want to look into the matplotlib.cm and matplotlib.colors modules. I've had good success with matplotlib.colors.LinearSegmentedColormap and its 'from_list' method. The documentation is the best place for information on this topic. If you have a large number of locations, then the differences between neighboring colors will be pretty small, unless you use a colormap with lots of different colors. Below is your example using the 'from_list' method and the built-in colormap 'hsv' (you'll just have to flip around the comments). For the matplotlib.cm colormaps, be sure to pass in normalized values, i.e. floats between 0 and 1 (which is why the call to the colormap is slightly complex).

Maybe this is a bit more help.

Ryan

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as plc
import matplotlib.cm as mcm

IDs = np.array([47, 33, 47, 12, 50, 50, 27, 27, 16, 27])
locations = np.array(['201', '207', '207', '205', '204', '201', '209', '209',
                      '207', '207'])
dates = np.array([733315.83240741, 733315.83521991, 733315.83681713,
                  733315.83788194, 733336.54554398, 733336.54731481,
                  733337.99842593, 733337.99943287, 733338.00070602,
                  733338.00252315])

fig = plt.figure()
ax = fig.add_subplot(111)
locs_un = np.unique(locations)
# The variable assignment below can be removed if you use the mcm module.
cs = plc.LinearSegmentedColormap.from_list('Colormap name', ['r', 'g', 'b'],
                                           N=len(locs_un))
for n, i in enumerate(locs_un):
    # Reverse the comments here to use the mcm module 'hsv' colormap.
    ax.plot(dates[locations==i], IDs[locations==i], 'd', c=cs(n), label=i)
    #ax.plot(dates[locations==i], IDs[locations==i], 'd',
    #        c=mcm.hsv(float(n)/(len(locs_un)-1)), label=i)
ax.xaxis_date()
fig.autofmt_xdate()
plt.legend(numpoints=1)
plt.grid(True)
plt.show()
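
One detail to watch when sampling 'hsv' as in the commented-out call: that colormap starts and ends at red, so normalizing with len(locs_un)-1 gives the first and last locations nearly the same color, and it divides by zero when there is only one location. A possible alternative, sketched under the assumption that evenly spaced hues are distinct enough for the number of locations involved, is to build the colors with the standard library's colorsys module. The helper name distinct_colors is made up for illustration:

```python
import colorsys

def distinct_colors(n):
    """Return n RGB tuples whose hues are evenly spaced around the color wheel."""
    # Dividing by n (not n - 1) keeps the last hue from wrapping back to red.
    return [colorsys.hsv_to_rgb(k / float(n), 1.0, 1.0) for k in range(n)]

colors = distinct_colors(5)
for rgb in colors:
    print("(%.2f, %.2f, %.2f)" % rgb)
```

Each tuple could be passed as c=colors[n] in the ax.plot call, since matplotlib accepts RGB tuples of floats between 0 and 1 as colors. With many locations the neighboring hues still get close to one another, so this does not remove the limitation mentioned above.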

In reply to Michael Castleton's message of Tue, Oct 4, 2011 at 5:25 PM.

Hi Ryan
More very interesting information! I will give these methods a try!
Thanks once again,

Mike

In reply to rcnelson.