I am fairly new to programming and have a question regarding matplotlib. I
wrote a Python script that reads in data from the output file of another
program and then prints out the data from one column.
    f = open('myfile.txt', 'r')
    for line in f:
        line = line.strip()           # strip the end-of-line character
        if line:                      # skip blank lines
            columns = line.split()    # split the line into columns
            mass = columns[8]         # column that contains the mass values
            print(mass)
What I now need to do is have matplotlib take the values printed in 'mass'
and plot the sum of the values over the average of the values. I have read
the documentation on the matplotlib website, but it doesn't really address
how to get data from a script (or I just did not see it). If anyone can point
me to some documentation that explains how to do this, it would be really
appreciated.
Thanks in advance
As you show it, mass will be a string, so you'll need to convert it to
a float first, then add it to a list. You can then manipulate the
values in the list to compute your mean, or whatever, which matplotlib
can use as input to its plot() function or whichever type of plot
you're after. Alternatively, since the Python numpy module is made for
manipulating data like this, it can probably read your data in a
single function call and easily compute the things you want. However,
if you are really that new to programming, you may struggle, so I'd
suggest first going to scipy.org and reading up on numpy. When
you understand the basics of numpy, matplotlib's documentation should
make a lot more sense.
Gary
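A minimal sketch of the first approach Gary describes (convert each value to a float, collect the floats in a list, then compute the mean) might look like the following. The helper name read_column and the sample lines are made up for illustration; they stand in for the real contents of myfile.txt, which are not shown in the thread.

```python
def read_column(lines, index):
    """Return column `index` (0-based) of each line as a float, skipping short or blank lines."""
    values = []
    for line in lines:
        columns = line.split()
        if len(columns) > index:          # blank or too-short lines are skipped
            values.append(float(columns[index]))
    return values


# Hypothetical stand-in for the lines of myfile.txt: nine columns, mass in the last one
sample_lines = [
    "a b c d e f g h 1.5\n",
    "\n",
    "a b c d e f g h 2.5\n",
    "a b c d e f g h 5.0\n",
]

masses = read_column(sample_lines, 8)
mean_mass = sum(masses) / len(masses)
print(masses, mean_mass)
```

With a real file, the same helper could be called as read_column(open('myfile.txt'), 8), since a file object is also an iterable of lines.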
···
On Thu, Aug 25, 2011 at 6:48 AM, surfcast23 <surfcast23@...287...> wrote:
I am fairly new to programming and have a question regarding matplotlib. I
wrote a Python script that reads in data from the output file of another
program and then prints out the data from one column.
    f = open('myfile.txt', 'r')
    for line in f:
        line = line.strip()           # strip the end-of-line character
        if line:                      # skip blank lines
            columns = line.split()    # split the line into columns
            mass = columns[8]         # column that contains the mass values
            print(mass)
What I now need to do is have matplotlib take the values printed in 'mass'
and plot number versus mean mass. I have read the documentation on the
matplotlib website, but it doesn't really address how to get data from a
script (or I just did not see it). If anyone can point me to some
documentation that explains how to do this, it would be really appreciated.
Thanks in advance
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net matplotlib-users List Signup and Options
Thank you Gary. I will definitely read the numpy docs.
I wasn't quite able to follow exactly what you wanted to do, but maybe this
will help. I am going to generate some "data" that I think sounds a bit like
yours and write it to a file (clearly you already have this). Then I am going
to read it back in and plot it, e.g.
    import matplotlib.pyplot as plt
    import numpy as np

    # Generate some data a little like yours, I think?
    # Print it to a file, i.e. I am making your myfile.txt
    numrows = 100
    numcols = 8
    mass = np.random.normal(0, 1, (numrows, numcols))

    with open("myfile.txt", "w") as f:
        for i in range(numrows):
            for j in range(numcols):
                print(mass[i, j], end=" ", file=f)
            print(file=f)             # end the row

    # Read the file back in
    mass = np.loadtxt("myfile.txt")
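The message above stops once the file has been loaded. One way the "pick one column and plot it" step might continue is sketched below. It is not the original poster's code: the in-memory buffer is a stand-in for myfile.txt, column 7 is chosen arbitrarily, and the Agg backend and output file name are assumptions made so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for myfile.txt: 100 rows of 8 whitespace-separated numbers
rng = np.random.default_rng(0)
buffer = io.StringIO()
np.savetxt(buffer, rng.normal(0, 1, (100, 8)))
buffer.seek(0)

# usecols selects a single column (0-based), so no hand-written parsing is needed
mass = np.loadtxt(buffer, usecols=7)

fig, ax = plt.subplots()
ax.plot(np.arange(mass.size), mass)   # x and y have the same length
ax.set_xlabel("row")
ax.set_ylabel("mass")
fig.savefig("mass_column.png")        # hypothetical output name; plt.show() works interactively
```

Passing a file name instead of the buffer, as in np.loadtxt("myfile.txt", usecols=7), reads the same column straight from disk.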
Thanks for the reply. What I have is a script that reads the data from
a large file and then prints out the values listed in a particular column.
What I now need to do is have the information in that column plotted as the
number of rows vs. the mean value of all of the rows. What I have so far is
    import matplotlib.pyplot as plt

    masses = []
    f = open('myfile.txt', 'r')
    f.readline()                      # read and discard the first line
    for line in f:
        line = line.strip()           # strip the end-of-line character
        if line:                      # skip blank lines
            columns = line.split()    # split the line into columns
            mass = columns[8]         # column that contains the mass values
            mass = float(mass)
            masses.append(mass)
            print(mass)
    plt.plot()
    plt.show
I am thinking I can do something like
'y runs from 0 to n where n == len(masses)'
x = 'mass_avg = sum(masses)/len(masses)'
The problem is I don't know how to have matplotlib do it without giving me an
error about dimensions. I would also like to do this without having to
write to and read from another file. I also need it to be able to work on
files with differing numbers of rows.
Thanks
Well, the first bit about wanting a specific column and the last bit about
not wanting to print all the data out and read it back in: you get both from
the example I gave you. If you paste what I wrote for you line by line it
should become clearer; additionally, it saves you having to write your own
parsing code.
As far as your plotting goes, unless you actually post what you are entering
in the script (exactly as you have it), it is impossible to say. For
example
    plt.plot()
    plt.show
There is no way that is all you have? If it is, then of course it will
fail, as you are asking matplotlib to plot but are not providing it with
any data to plot!
Perhaps I am being particularly dense, but "What I now need to do is have the
information in that column plotted as the number of rows vs. the mean value
of all of the rows." means nothing to me. Sorry. What do you want on the X
and Y? Do you mean you want to plot your individual column (8, I think you
called it) against the mean of all the other rows? If so, I would expect you
to have a dimensions issue.
I apologize if my explanation was less than clear. What I have is data in
a column that runs from row 1 to row 1268. In each row there is a
number. For example
1
3
5
6
7
8
9
so I want the y axis to run from 1 to 7 (the number of rows) and the x
axis to be the average of the values, in this case 5.57. I am having problems
with setting up the y-axis as well as the dimension problem you addressed.
Is there a way I could have every value on the x axis the same? Say, for the
above example, have the x and y axes be
7
6
5
4
3
2
1
5.57 5.57 5.57 5.57 5.57 5.57 5.57
Which would be the number of rows vs the average value of the data in the
rows and then plot that?
Thanks again
Khary
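Taken literally, the request above can be sketched as follows. This is only an illustration of how the "dimension" error is avoided (x and y must have the same length), not a claim that the resulting plot is useful. The seven values are the ones from the example; the Agg backend and output file name are assumptions so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

values = np.array([1, 3, 5, 6, 7, 8, 9], dtype=float)

rows = np.arange(1, values.size + 1)                  # 1, 2, ..., 7
mean_column = np.full(values.shape, values.mean())    # the mean repeated once per row

fig, ax = plt.subplots()
ax.plot(mean_column, rows, "o-")                      # same length on both axes
ax.set_xlabel("mean value")
ax.set_ylabel("row number")
fig.savefig("rows_vs_mean.png")                       # hypothetical output name

print(round(values.mean(), 2))                        # 39 / 7, which rounds to 5.57
```

Because values.size is used rather than a hard-coded 7, the same lines work for a column of any length.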
I still don't quite get this. So you want the average for each column? And
you want to plot each of these averages? So a bar graph with 8 bars?
There is only one column, so I want a plot of y and x, with y taking
values running from 0 to n (7 in my example) and x as the average of the
values that are contained in the rows (5.57 in my example).
Perhaps someone else can help, as I feel I am being particularly dense.
    fig, ax = plt.subplots()          # axes to draw on (not defined in the original snippet)
    for i in range(numcols):
        ax.plot([np.mean(mass[:, 7]) for _ in range(numcols)],
                np.arange(numcols), label=i)
This gives you what I think you said, but I really don't think this is what
you mean, as it seems a strange thing to want to do.
Sorry I couldn't be of more help.
It seems to me that, as described, you want a plot in which all
the bars are the same height (or width if it is a sideways bar chart),
in this case, 5.57. That makes no sense.
What information is this plot intended to provide the viewer?
···
On Thu, Aug 25, 2011 at 10:01 PM, surfcast23 <surfcast23@...287...> wrote:
Hi,
there is only one column. so I want a plot of y and x. With y taking
values running from 0 to n or 7 in my example and x as the average of the
values that are contained in the rows in my example it was 5.57.
Sorry everyone, I totally missed something very important. What I need to do
first is bin the masses (which I don't know how to do).
Can you describe what you want to do? So you now want a histogram?
Sorry I am just responding. I have been busy getting ready for the
semester. What I need to do is first sort the values contained in the column
and assign them to bins. I then have to plot the number of bins by the mean
value in each bin.
I don't think he's describing a histogram, because he is not plotting
frequency of observations on the y axis, but data values (means of
each bin). I think what surfcast23 wants is just a bar graph.
So, surfcast23, I'd suggest you break it down into your two steps.
First, how will you average your values by bin? You can probably
figure that out by writing it out on paper in pseudo-code and then
just putting it in Python. Then you'll have a list of means, and you
will pass that to the bar function in matplotlib, something like:
    from pylab import *

    ax = subplot(111)
    x = arange(4)
    your_list_of_means = [4, 5, 7, 11]  # computed earlier
    bar(x, your_list_of_means)
    # bars are centered on x by default in matplotlib 2.0 and later
    xticks(x, ('Bin1', 'Bin2', 'Bin3', 'Bin4'))
    show()
Che
···
On Sat, Sep 3, 2011 at 7:32 PM, mdekauwe <mdekauwe@...287...> wrote:
So you do want a histogram then? I assume you have all of this sorted then,
the histogram function is very good.
You are correct on what I have to do. The problem is that I have a data set
with ~1250 values, so I can't do the sorting or find the mean by hand. I guess
what I need to do is write a script that will sort the values, bin them,
and keep track of the number of values in each bin, then find the mean value
in each bin. Then the script has to take the number of values in each bin and
plot that versus the mean of each bin. I apologize for the lack of clarity
in my earlier posts. It was unclear to me what exactly had to be done until
this weekend.
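The steps described above (bin the values, count how many fall in each bin, take the mean of each bin, then plot count against mean) could be sketched with NumPy roughly as follows. Nothing here comes from the thread itself: the function name bin_counts_and_means, the choice of equal-width bins, the decision to drop empty bins, and the seven sample values are all assumptions for illustration. No explicit sort is needed because np.histogram assigns each value to its bin directly.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np


def bin_counts_and_means(values, num_bins):
    """Return (counts, means) over equal-width bins, leaving out empty bins."""
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=num_bins)
    # Weighting each value by itself gives the sum of the values in each bin
    sums, _ = np.histogram(values, bins=edges, weights=values)
    occupied = counts > 0                 # avoid dividing by zero for empty bins
    return counts[occupied], sums[occupied] / counts[occupied]


masses = [1, 3, 5, 6, 7, 8, 9]            # stand-in for the ~1250 real values
counts, means = bin_counts_and_means(masses, 4)

fig, ax = plt.subplots()
ax.plot(means, counts, "o-")
ax.set_xlabel("mean mass in bin")
ax.set_ylabel("number of values in bin")
fig.savefig("count_vs_bin_mean.png")      # hypothetical output name
```

For the sample values and four bins, the bin edges are 1, 3, 5, 7, 9, so the counts are 1, 1, 2, 3 and the bin means are 1, 3, 5.5, 8.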
I think you really need to read up on the NumPy documentation. There are functions that will do this for you. NumPy can load/save data, sort them, bin them, find means and standard deviations, etc… You don’t need to re-invent the wheel.
Plus, you keep on talking about having a script for each part. While it is great that you like modularity, Python does support the use of functions, and I would encourage you to use them.
Cheers,
Ben Root
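As a small illustration of both points in this message (NumPy already provides the building blocks, and a function keeps them together in one script), here is a sketch. The function name summarize and the sample numbers are made up for the example.

```python
import numpy as np


def summarize(values):
    """Sort the values and report their mean and (population) standard deviation."""
    ordered = np.sort(np.asarray(values, dtype=float))
    return {"sorted": ordered, "mean": ordered.mean(), "std": ordered.std()}


summary = summarize([9, 1, 7, 3, 5])
print(summary["sorted"], summary["mean"], summary["std"])
```

The same function can be reused on each bin, or on the whole column, without copying the code into separate scripts.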
···
On Tue, Sep 6, 2011 at 12:01 PM, surfcast23 <surfcast23@…1972…> wrote:
Thanks for everyone responses and help
Che,
You are correct on what I have to do. The problem is that I have a data set
with ~1250 so I cant’ do the sorting or finding the mean by hand. I guess
what I need to to is to write a script that will sort the values, bin them,
and keep track of the number of values in each bin. Then find the mean value
in each bin. Then the scrip has to take the number of values in each bin and
plot that versus the mean of each bin. I apologies for the lack of clarity
in my earlier posts. It was unclear to me what exactly had to be done until