i have tab-separated text files that i would like to parse into arrays
in numpy/scipy. i simply want to be able to read in the data into an
array, and then use indexing to get some of the columns, or some of
the rows, etc. the key thing is that these columns might be strings or
might be numbers. typically, one column is a set of strings and the
others are floats. it's necessary for me to be able to specify whether
the file has a header or not, or what the delimiter is. also, i'd like
to be able to manipulate the array that i read in and then easily
serialize it to a file as csv, again controlling the delimiter and header.
from the documentation, it looks like 'csv2rec' (from
matplotlib.mlab) might be the best option, but i am having problems
with it. for example, i use:

    data = csv2rec(se_counts, skiprows=1, delimiter='\t')

however, then i cannot access the resulting array 'data' by columns,
it seems. the usual array notation data[:, 0] to access the first
column does not work -- how can i access the columns?
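
to make the problem concrete, here is a minimal sketch of what i think is going on, assuming csv2rec returns a numpy record (structured) array. the field names and values below are made up for illustration and are not from my actual file. such an array is one-dimensional (one record per row), so 2-d slicing fails and columns have to be picked by field name instead:

```python
import numpy as np

# hypothetical data: one string column and two float columns
data = np.array(
    [("abc", 1.5, 2.0), ("xyz", 3.0, 4.5)],
    dtype=[("gene", "U10"), ("count_a", "f8"), ("count_b", "f8")],
)

# a structured array is 1-d, so data[:, 0] raises IndexError
try:
    data[:, 0]
    two_d_indexing_works = True
except IndexError:
    two_d_indexing_works = False

genes = data["gene"]                       # one column, by field name
first_row = data[0]                        # one row (a single record)
counts = data[["count_a", "count_b"]]      # several columns at once

# viewing it as a recarray also allows attribute access
rec = data.view(np.recarray)
count_a = rec.count_a
```

if that assumption is right, data['gene'] (or data.gene on a recarray) would be the equivalent of the data[:, 0] i was trying.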
also, the first line of the file in this case was a header. ideally
i'd like to be able to specify that to csv2rec, so that it uses the
tab separated headers in the first line as the field names for the
columns -- is there a way to do this?
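
for comparison, this is roughly the behaviour i am after, sketched with numpy.genfromtxt rather than csv2rec. the helper name read_table, its has_header argument and the sample text are all hypothetical; io.StringIO just stands in for a real file path:

```python
import io
import numpy as np

def read_table(source, has_header=True, delimiter="\t"):
    """Read a delimited text table with mixed string/float columns."""
    return np.genfromtxt(
        source,
        delimiter=delimiter,
        names=True if has_header else None,  # take field names from line 1
        dtype=None,                          # guess a type per column
        encoding="utf-8",                    # read string columns as str
    )

# hypothetical file contents, with and without a header line
with_header = "gene\tcount_a\tcount_b\nabc\t1.5\t2.0\nxyz\t3.0\t4.5\n"
no_header = "abc\t1.5\t2.0\nxyz\t3.0\t4.5\n"

named = read_table(io.StringIO(with_header), has_header=True)
unnamed = read_table(io.StringIO(no_header), has_header=False)

named_fields = named.dtype.names      # taken from the header line
unnamed_fields = unnamed.dtype.names  # numpy's defaults: f0, f1, f2
```

with the header case the columns would be reachable as named['gene'], named['count_a'] and so on; without a header numpy falls back to its default field names.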
finally, how can i then easily serialize the results to a csv file?
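
one way i can imagine doing the writing step is with the standard csv module, sketched below. write_table and its arguments are hypothetical names, the data is made up, and this is only an illustration of the output i want, not necessarily the recommended approach:

```python
import csv
import io
import numpy as np

def write_table(data, dest, delimiter=",", header=True):
    """Write a structured array to an open text file as delimited rows."""
    writer = csv.writer(dest, delimiter=delimiter, lineterminator="\n")
    if header:
        writer.writerow(data.dtype.names)  # field names as the header line
    for row in data.tolist():              # each record as a plain tuple
        writer.writerow(row)

# hypothetical data: one string column and two float columns
data = np.array(
    [("abc", 1.5, 2.0), ("xyz", 3.0, 4.5)],
    dtype=[("gene", "U10"), ("count_a", "f8"), ("count_b", "f8")],
)

buf = io.StringIO()  # stands in for open("out.csv", "w", newline="")
write_table(data, buf, delimiter=",", header=True)
output = buf.getvalue()
```

passing delimiter='\t' or header=False to the same function would cover the other cases i mentioned.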
any help on this would be greatly appreciated. i am happy to use
options aside from 'csv2rec' -- it's just that this seemed closest to
what i wanted to do, but i might have missed something. thank you.