Can't load file as an array.

Petro · October 27, 2009, 6:37pm

Hi all.
I have a problem loading a file with the following format:
the first 1024 rows are tab-delimited and contain from 2 to 256 elements (the number of columns differs between files),
followed by 5 empty lines,
and finally about 20 lines of descriptive text.

I managed to load it this way, after commenting out the whole text part:
a = mlab.load('/home/petro/TEMP/proba.txt', delimiter='\t', comments='%')

I was hoping to use checkrows in csv2rec to avoid reading the text part of the file without having to comment it out.

a = mlab.csv2rec('/home/petro/TEMP/proba.txt',
    checkrows=1023, delimiter='\t', names=['a','b','c','e','f','k','l','h'])
invalid literal for float(): % Date and Time:

But it failed to do so.

a = mlab.csv2rec('probe.txt', checkrows=1023, delimiter='\t', comments='%', names=['a','b','c','e','f','k','l','h'])
This gives a rather strange format for my use and requires the names to be specified.
In addition, changing checkrows to 500 (or any other number) had no effect on the size of the record array I get.

Unfortunately, loadtxt from numpy does not have a checkrows option.

Any idea how to do it without commenting out the text?

Thanks.
Petro.

PGM · October 27, 2009, 6:50pm

With a recent SVN version of numpy, you can use the `skip_header` and `skip_footer` arguments of `np.genfromtxt` to skip lines at the beginning and the end of your file. In your case, you'd use `skip_header=0` and `skip_footer=5+20`.
P.
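
As a variation on the suggestion above, here is a minimal sketch that stops after a fixed number of rows instead of counting footer lines. It assumes a NumPy release where `np.genfromtxt` accepts `max_rows` (1.10 or later, which is newer than the version discussed in this thread). The helper name `load_leading_rows` and the tiny sample file are made up for illustration. If you use `skip_footer` instead, it is worth checking on your NumPy version whether blank lines count toward the total.

```python
import os
import tempfile

import numpy as np


def load_leading_rows(path, n_rows, delimiter="\t"):
    """Read only the first n_rows data rows of a delimited text file."""
    # max_rows stops parsing after n_rows rows, so the footer is never read.
    return np.genfromtxt(path, delimiter=delimiter, max_rows=n_rows)


# Build a small stand-in file: 4 numeric rows, 5 blank lines, 3 text lines.
sample = (
    "1\t2\t3\n4\t5\t6\n7\t8\t9\n10\t11\t12\n"
    + "\n" * 5
    + "Date and Time: (placeholder)\n"
    + "Operator: (placeholder)\n"
    + "Notes: (placeholder)\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

a = load_leading_rows(sample_path, n_rows=4)
os.remove(sample_path)
print(a.shape)
```

For the file described in the question, the call would use `n_rows=1024`; the column count is taken from the data itself, so it does not need to be known in advance.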


_Stan_West1 · October 28, 2009, 4:19pm

Although the following isn't specific to matplotlib, I submit it for the sake of others who may have similar questions about reading text data.

Because a file object may be iterated, one can use the itertools module. In particular, the islice iterator allows you to select the start and stop lines and the step. So, you can read the desired portion of the file into a list of rows, splitting each row into a list of text tokens, then use numpy.array to convert the list into a numeric array. For example,

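# Begin code

# Only needed on Python 2.5; a __future__ import must be the first statement.
from __future__ import with_statement

import itertools

import numpy as np

with open('filename.dat') as f:
    # Read only the first 1024 lines and split each one on tabs.
    a = np.array(
        [line.rstrip().split('\t') for line in itertools.islice(f, 1024)],
        dtype=float)  # builtin float; the np.float alias was removed from NumPy

# End code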

Just alter the islice arguments and dtype as necessary to suit your file.
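
To illustrate altering the islice arguments, here is a small sketch that also passes a start line, so leading header lines can be skipped as well as a footer. The helper name `read_block` and the in-memory stand-in for a file are hypothetical and chosen only for this example.

```python
import io
import itertools

import numpy as np


def read_block(lines, start, stop, delimiter="\t", dtype=float):
    """Convert lines start..stop-1 of an iterable of text lines to a 2-D array."""
    rows = [
        line.rstrip("\r\n").split(delimiter)
        for line in itertools.islice(lines, start, stop)
    ]
    # numpy.array converts the numeric strings to the requested dtype.
    return np.array(rows, dtype=dtype)


# Stand-in for an open file: one header line, three data rows, then a footer.
fake_file = io.StringIO("header text\n1\t2\n3\t4\n5\t6\n\nfooter text\n")
block = read_block(fake_file, 1, 4)
print(block)
```

With a real file, pass the object returned by `open(...)` in place of `fake_file`; for the layout in the original question the arguments would be `start=0` and `stop=1024`.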

Documentation:

· http://docs.python.org/library/stdtypes.html#file.next
· http://docs.python.org/library/itertools.html#itertools.islice
· http://docs.scipy.org/doc/numpy-1.3.x/reference/generated/numpy.array.html
· http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html