modifying TConfig

Hi,

I definitely don't like the fact that .__repr__() and repr() are used all
over TConfig, e.g. for storing to file.

First of all I would like to modify __repr__ for a TConfig class to give
a more synthetic view.

I propose to change the current ".__repr__()" method to ".tostring()" and
to implement a more readable ".__repr__()".
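
To make the proposed split concrete, here is a minimal sketch. It is not the actual TConfig code: the class name SketchConfig and the (value, description) layout of the options are assumptions made only for this illustration.

```python
class SketchConfig:
    """Hypothetical stand-in for a TConfig class; not the real API."""

    def __init__(self, **options):
        # Each option maps to a (value, description) pair.
        self._options = dict(options)

    def tostring(self):
        # Verbose form meant for writing to a config file:
        # one comment line per option, followed by the assignment.
        lines = []
        for name, (value, doc) in sorted(self._options.items()):
            lines.append("# " + doc)
            lines.append("%s = %r" % (name, value))
        return "\n".join(lines)

    def __repr__(self):
        # Compact form meant for the interactive prompt.
        body = ", ".join(
            "%s=%r" % (name, value)
            for name, (value, _doc) in sorted(self._options.items())
        )
        return "<SketchConfig %s>" % body


conf = SketchConfig(
    datafile=("data.txt", "path to the data file"),
    solver=("Direct", "'Direct' or 'Iterative'"),
)
file_text = conf.tostring()   # what would be saved to disk
prompt_text = repr(conf)      # what the prompt would show
```

The file-writing code would then call tostring() explicitly instead of relying on repr().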

Are the different people interested in these issues OK with that? If yes,
where is the authoritative repo for TConfig? Which SVN should I start
hacking?

Cheers,

Gaël

Hi,

I definitely don't like the fact that .__repr__() and repr() are used all
over TConfig, e.g. for storing to file.

First of all I would like to modify __repr__ for a TConfig class to give
a more synthetic view.

I propose to change the current ".__repr__()" method to ".tostring()" and
to implement a more readable ".__repr__()".

Are the different people interested in these issues OK with that? If yes,

I'm OK with it.

where is the authoritative repo for TConfig? Which SVN should I start
hacking?

It's probably easier for now to use ipython, since you and Darren
both have SVN write access to it. Unless Darren has a different
opinion...

Cheers,

f

···

On Dec 12, 2007 5:58 PM, Gael Varoquaux <gael.varoquaux@...427...> wrote:

I consider the authoritative TConfig to be the one in Ipython1.sandbox, but it
doesn't matter in practice because I keep the two in sync.

I do want to be able to save the verbose information to a file, since we want
to be able to generate a commented file that people can use to learn how to
make modifications. If that doesn't change, I don't have an objection to
making __repr__ terse.

Darren

···

On Wednesday 12 December 2007 7:58:10 pm Gael Varoquaux wrote:

Hi,

I definitely don't like the fact that .__repr__() and repr() are used all
over TConfig, e.g. for storing to file.

First of all I would like to modify __repr__ for a TConfig class to give
a more synthetic view.

I propose to change the current ".__repr__()" method to ".tostring()" and
to implement a more readable ".__repr__()".

Are the different people interested in these issues OK with that? If yes,
where is the authoritative repo for TConfig? Which SVN should I start
hacking?

On second thought though: __str__ is the one meant for 'human
consumption', while __repr__ is deliberately meant to be much more
machine-like. Basically the idea is that, whenever possible, one can
do

x == eval(repr(x))

That is true for many of the builtin data types of the language.
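
A quick way to check this round trip (a sketch written for a current Python 3 interpreter; the class name Plain is made up for the example):

```python
# Built-in values whose repr() is a valid Python expression.
samples = [42, 0.1, "data.txt", [1, 2, 3], {"solver": "Direct"}, (1, "a"), None]

# Each value is rebuilt from its own repr() and compared with the original.
round_trips = [eval(repr(x)) == x for x in samples]


class Plain:
    """A class that does not define __repr__."""


# The inherited default has the "<...some useful description...>" form
# and cannot be passed back to eval().
default_repr = repr(Plain())
try:
    eval(default_repr)
    default_is_evaluable = True
except SyntaxError:
    default_is_evaluable = False
```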

Quoting http://docs.python.org/ref/customization.html#l2h-179 :

__repr__( self)
    Called by the repr() built-in function and by string conversions
(reverse quotes) to compute the ``official'' string representation of
an object. If at all possible, this should look like a valid Python
expression that could be used to recreate an object with the same
value (given an appropriate environment). If this is not possible, a
string of the form "<...some useful description...>" should be
returned. The return value must be a string object. If a class defines
__repr__() but not __str__(), then __repr__() is also used when an
``informal'' string representation of instances of that class is
required.

    This is typically used for debugging, so it is important that the
representation is information-rich and unambiguous.

So Gael, would you be OK with a terser str() and leaving repr to honor
this python convention?
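
For what it's worth, the two can coexist on one class. The following is only a sketch under assumed names (SketchOptions and its two attributes are hypothetical, not TConfig's): __repr__ returns an expression that rebuilds the object, and __str__ gives the terse listing.

```python
class SketchOptions:
    """Hypothetical config object; names are for illustration only."""

    def __init__(self, datafile="data.txt", solver="Direct"):
        self.datafile = datafile
        self.solver = solver

    def __eq__(self, other):
        # Two objects are equal when all their settings match.
        return (
            isinstance(other, SketchOptions)
            and (self.datafile, self.solver) == (other.datafile, other.solver)
        )

    def __repr__(self):
        # A valid Python expression that recreates the object.
        return "SketchOptions(datafile=%r, solver=%r)" % (self.datafile, self.solver)

    def __str__(self):
        # Terse, human-oriented listing.
        return "datafile = %s\nsolver = %s" % (self.datafile, self.solver)


opts = SketchOptions(solver="Iterative")
rebuilt = eval(repr(opts))   # honors x == eval(repr(x))
terse = str(opts)            # what print would show
```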

Cheers,

f

···

On Dec 12, 2007 6:34 PM, Darren Dale <darren.dale@...143...> wrote:

On Wednesday 12 December 2007 7:58:10 pm Gael Varoquaux wrote:
> Hi,
>
> I definitely don't like the fact that .__repr__() and repr() are used all
> over TConfig, e.g. for storing to file.
>
> First of all I would like to modify __repr__ for a TConfig class to give
> a more synthetic view.
>
> I propose to change the current ".__repr__()" method to ".tostring()" and
> to implement a more readable ".__repr__()".
>
> Are the different people interested in these issues OK with that? If yes,
> where is the authoritative repo for TConfig? Which SVN should I start
> hacking?

I consider the authoritative TConfig to be the one in Ipython1.sandbox, but it
doesn't matter in practice because I keep the two in sync.

I do want to be able to save the verbose information to a file, since we want
to be able to generate a commented file that people can use to learn how to
make modifications. If that doesn't change, I don't have an objection to
making __repr__ terse.

I totally agree. However if a user types:
pylab.rcParams
in IPython, or the Python interpreter, she gets the repr, AFAIK. I would
like this display to be readable.

Gaël

···

On Wed, Dec 12, 2007 at 06:39:02PM -0700, Fernando Perez wrote:

On second thought though: __str__ is the one meant for 'human
consumption', while __repr__ is deliberately meant to be much more
machine-like. Basically the idea is that, whenever possible, one can
do

x == eval(repr(x))

That is true for many of the builtin data types of the language.

OK, this is what I currently have:

"""
In [1]: import simpleconf

In [2]: simpleconf.SimpleConfig()
Out[2]:
datafile = 'data.txt' # a value of type 'str' or a value of type 'unicode'
solver = 'Direct' # 'Direct' or 'Iterative'
Protocol.max_users = 1 # a value of type 'int'
Protocol.ptype = 'http' # 'http' or 'ftp' or 'ssh'
"""

I would like to make it as easy as possible for users to understand how
to modify configuration options. Comments are welcomed.

Gaël

···

On Thu, Dec 13, 2007 at 09:30:44AM +0100, Gael Varoquaux wrote:

On Wed, Dec 12, 2007 at 06:39:02PM -0700, Fernando Perez wrote:
> On second thought though: __str__ is the one meant for 'human
> consumption', while __repr__ is deliberately meant to be much more
> machine-like. Basically the idea is that, whenever possible, one can
> do

> x == eval(repr(x))

> That is true for many of the builtin data types of the language.

I totally agree. However if a user types:
pylab.rcParams
in IPython, or the Python interpreter, she gets the repr, AFAIK. I would
like this display to be readable.

It is possible to save the current settings to a file, so only those that
deviate from the default are written to the file. By putting the comments on
the same line as the data, you encourage users to comment their config files
accordingly, but comments appearing on the same line as the data will be
deleted if the current settings are saved. Also, what happens when the
comment is many lines long, like the comment for matplotlib's timezone
setting? I have a feeling there are formatting issues with this scheme.

I think I prefer the existing behavior, where the comment appears on a
separate line just before the data. Maybe I don't understand the point of
your modifications.
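
To illustrate the layout with the comment on its own line(s) before the data, here is a rough sketch. The function name render_option and the sample description are invented for the example; this is not the existing TConfig writer.

```python
import textwrap


def render_option(name, value, comment, width=60):
    # Wrap the comment onto as many "# " lines as needed,
    # then put the assignment on its own line below them.
    comment_lines = textwrap.wrap(comment, width=width - 2)
    lines = ["# " + line for line in comment_lines]
    lines.append("%s = %r" % (name, value))
    return "\n".join(lines)


# A deliberately long, made-up description to force several comment lines.
long_comment = "Hypothetical long description of the option. " * 4
text = render_option("timezone", "UTC", long_comment)
```

A long comment simply grows downward, and the assignment line stays short and easy to find.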

Darren

···

On Thursday 13 December 2007 04:24:21 am Gael Varoquaux wrote:

On Thu, Dec 13, 2007 at 09:30:44AM +0100, Gael Varoquaux wrote:
> On Wed, Dec 12, 2007 at 06:39:02PM -0700, Fernando Perez wrote:
> > On second thought though: __str__ is the one meant for 'human
> > consumption', while __repr__ is deliberately meant to be much more
> > machine-like. Basically the idea is that, whenever possible, one can
> > do
> >
> > x == eval(repr(x))
> >
> > That is true for many of the builtin data types of the language.
>
> I totally agree. However if a user types:
> pylab.rcParams
> in IPython, or the Python interpreter, she gets the repr, AFAIK. I would
> like this display to be readable.

OK, this is what I currently have:

"""
In [1]: import simpleconf

In [2]: simpleconf.SimpleConfig()
Out[2]:
datafile = 'data.txt' # a value of type 'str' or a value of type
'unicode' solver = 'Direct' # 'Direct' or 'Iterative'
Protocol.max_users = 1 # a value of type 'int'
Protocol.ptype = 'http' # 'http' or 'ftp' or 'ssh'
"""

I would like to make it as easy as possible for users to understand how
to modify configuration options. Comments are welcomed.

It is possible to save the current settings to a file, so only those that
deviate from the default are written to the file. By putting the comments on
the same line as the data, you encourage users to comment their config files
accordingly, but comments appearing on the same line as the data will be
deleted if the current settings are saved. Also, what happens when the
comment is many lines long, like the comment for matplotlib's timezone
setting? I have a feeling there are formatting issues with this scheme.

I think I prefer the existing behavior, where the comment appears on a
separate line just before the data. Maybe I don't understand the point of
your modifications.

If I type "options <Rtn>", I don't understand what kind of object I have.
It looks like a string to me, and it is not obvious that it actually is
an object that I can modify by accessing its attributes. Currently if I
don't know what the object is, it is not obvious what to do with it.

I don't like the current way it is printed (in the repository), because it
is too long, and looks too much like a string. I am not sure my option is
great because of your remark, and because it still looks a lot like a
string.

Second try, how do you like this:

In [1]: import simpleconf

In [2]: simpleconf.SimpleConfig()
Out[2]:
<SimpleConfig configuration object at 138452492>

-> datafile: 'data.txt' (a value of type 'str' or a value of type 'unicode')
-> solver: 'Direct' ('Direct' or 'Iterative')
-> Protocol.max_users: 1 (a value of type 'int')
-> Protocol.ptype: 'http' ('http' or 'ftp' or 'ssh')

This feels a bit more like a Python object. I am still not terribly happy
with the way the options are presented. It is not obvious they are simply
attributes.

Gaël

···

On Thu, Dec 13, 2007 at 09:31:03AM -0500, Darren Dale wrote:

That feels more like a C++ object to me. Plus, you have a formatting issue due
to a long comment. The solver setting is easily overlooked. This example
doesn't make the sections as clear as the current implementation, where
Protocol would be unindented, and all of the protocol settings would be
indented.
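
A rough sketch of the sectioned layout I mean, with the section name unindented and its settings indented below it. The helper render_sections and the exact spacing are assumptions for illustration, not the format the current implementation writes.

```python
from collections import OrderedDict


def render_sections(options, indent="    "):
    """Group dotted option names under their section name."""
    top_level = []
    sections = OrderedDict()
    for name, value in options:
        if "." in name:
            # "Protocol.ptype" -> section "Protocol", key "ptype"
            section, key = name.split(".", 1)
            sections.setdefault(section, []).append((key, value))
        else:
            top_level.append((name, value))
    lines = ["%s = %r" % (name, value) for name, value in top_level]
    for section, items in sections.items():
        lines.append(section)
        lines.extend("%s%s = %r" % (indent, key, value) for key, value in items)
    return "\n".join(lines)


options = [
    ("datafile", "data.txt"),
    ("solver", "Direct"),
    ("Protocol.max_users", 1),
    ("Protocol.ptype", "http"),
]
layout = render_sections(options)
```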

Darren

···

On Thursday 13 December 2007 10:05:22 am Gael Varoquaux wrote:

On Thu, Dec 13, 2007 at 09:31:03AM -0500, Darren Dale wrote:
> It is possible to save the current settings to a file, so only those that
> deviate from the default are written to the file. By putting the comments
> on the same line as the data, you encourage users to comment their config
> files accordingly, but comments appearing on the same line as the data
> will be deleted if the current settings are saved. Also, what happens
> when the comment is many lines long, like the comment for matplotlib's
> timezone setting? I have a feeling there are formatting issues with this
> scheme.
>
> I think I prefer the existing behavior, where the comment appears on a
> separate line just before the data. Maybe I don't understand the point of
> your modifications.

If I type "options <Rtn>", I don't understand what kind of object I have.
It looks like a string to me, and it is not obvious that it actually is
an object that I can modify by accessing its attributes. Currently if I
don't know what the object is, it is not obvious what to do with it.

I don't like the current way it is printed (in the repository), because it
is too long, and looks too much like a string. I am not sure my option is
great because of your remark, and because it still looks a lot like a
string.

Second try, how do you like this:

In [1]: import simpleconf

In [2]: simpleconf.SimpleConfig()
Out[2]:
<SimpleConfig configuration object at 138452492>

>-> datafile: 'data.txt' (a value of type 'str' or a value of type
> 'unicode') -> solver: 'Direct' ('Direct' or 'Iterative')
>-> Protocol.max_users: 1 (a value of type 'int')
>-> Protocol.ptype: 'http' ('http' or 'ftp' or 'ssh')

This feels a bit more like a Python object. I am still not terribly happy
with the way the options are presented. It is not obvious they are simply
attributes.

Gael Varoquaux wrote:

x == eval(repr(x))

That is true for many of the builtin data types of the language.

And that is really the whole point of having __repr__ in addition to __str__.

I totally agree. However if a user types:
pylab.rcParams
in IPython, or the Python interpreter, she gets the repr, AFAIK.

That's a python interpreter change a couple versions back. One of the reasons for it is:

>>> str(a)
'0.1'
>>> str(b)
'0.1'
>>> a == b
False

huh? why false???

>>> a
0.10000000000000002
>>> b
0.10000000000000001
>>>

Ah -- now I see. It was felt that this was a case where being as precise as possible with the default output was a good idea.

If you want the pretty version, use print:

>>> print a, b
0.1 0.1

I'm not up on the details of this specific issue, but in general, the idea that:

__repr__ is precise and complete
__str__ is pretty and readable

is a good one.
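
The session above comes from an older interpreter, where (as far as I know) str() of a float kept 12 significant digits and repr() kept 17. A sketch that reproduces the same effect on a current Python 3, using explicit format specs in place of the old str()/repr() behaviour:

```python
a = 0.10000000000000002   # the next representable float above 0.1
b = 0.1

# "Pretty" rendering: 12 significant digits hide the difference.
short_a = format(a, ".12g")
short_b = format(b, ".12g")

# "Precise" rendering: 17 significant digits distinguish any two doubles.
full_a = format(a, ".17g")
full_b = format(b, ".17g")

same_text = short_a == short_b    # the pretty forms agree...
same_value = a == b               # ...although the values differ
```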

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

+1

For a while I've toyed with the idea of adding an option to ipython so
the output prompts could use str() instead of repr(), so users who
*deliberately* want to switch, aware of the potential conflicts, do
so.
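
In a plain Python interpreter the same idea can be sketched with sys.displayhook. This is not the ipython option itself, just an illustration; str_displayhook and the Terse class are names made up for the example.

```python
import builtins
import io
import sys
from contextlib import redirect_stdout


def str_displayhook(value):
    # Follow the default hook's contract: ignore None and remember
    # the last displayed value in `_`, but print str() instead of repr().
    if value is None:
        return
    builtins._ = None
    print(str(value))
    builtins._ = value


class Terse:
    def __repr__(self):
        return "Terse(verbose=True, details='...')"

    def __str__(self):
        return "terse view"


# Install the hook, display one value, then restore the previous hook.
previous_hook = sys.displayhook
sys.displayhook = str_displayhook
buffer = io.StringIO()
try:
    with redirect_stdout(buffer):
        sys.displayhook(Terse())
finally:
    sys.displayhook = previous_hook
shown = buffer.getvalue()
```

Leaving the hook installed in an interactive session makes every bare expression print its str() form, with the caveats about ambiguity discussed in this thread.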

It's easy and I'm about to get on a plane, so I might code it in if I can.

Cheers,

f

···

On Dec 13, 2007 4:14 PM, Christopher Barker <Chris.Barker@...236...> wrote:

I'm not up on the details of this specific issue, but in general, the
idea that:

__repr__ is precise and complete
__str__ is pretty and readable

is a good one

Guys, I agree with all this. It's not about the theory, but about the
user experience. The user just types along, and doesn't read books and
manuals. At least the average user. And we want to make it as easy as
possible for her.

And actually, it feels nice when I can pick up something new and be
efficient quickly. And if on top of that I discover that I can keep
learning and improving for a long time and discover the hidden power of
what I am using, that's great and I am happy.

Cheers,

Gaël

···

On Thu, Dec 13, 2007 at 03:14:23PM -0800, Christopher Barker wrote:

I'm not up on the details of this specific issue, but in general, the
idea that:

__repr__ is precise and complete
__str__ is pretty and readable

is a good one.

Fernando Perez wrote:

For a while I've toyed with the idea of adding an option to ipython so
the output prompts could use str() instead of repr(), so users who
*deliberately* want to switch, aware of the potential conflicts, do
so.

+1

-Chris

Gael Varoquaux wrote:

Guys, I agree with all this. It's not about the theory, but about the
user experience. The user just types along, and doesn't read books and
manuals. At least the average user. And we want to make it as easy as
possible for her.

Yes, we all like that.

Which is why it was decided that __repr__ was the better default for display at the command line. See my example: there were too many questions along the lines of "python has a bug!" -- I'm guessing a very large fraction of those were about FP issues, which are poorly understood by most newbies.

I think it is clearly the best choice for things like a single floating point number, but for far more complex objects? who knows. As an example, look at the default behavior of numpy arrays:

>>> a = numpy.ones((3,3))
>>> a
array([[ 1., 1., 1.],
        [ 1., 1., 1.],
        [ 1., 1., 1.]])

Classic __repr__.

but:

>>> a = numpy.ones((1000,1000))
>>> a
array([[ 1., 1., 1., ..., 1., 1., 1.],
        [ 1., 1., 1., ..., 1., 1., 1.],
        ...,
        [ 1., 1., 1., ..., 1., 1., 1.],
        [ 1., 1., 1., ..., 1., 1., 1.]])

no longer follows the __repr__ rules. I think that's an excellent choice -- it's really never useful to spew something that large to the screen.
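
The standard library makes the same trade-off in reprlib, which can serve as a small stand-in here (a sketch using a plain list rather than a numpy array): the abbreviated form is readable but no longer rebuilds the original value.

```python
import reprlib

values = list(range(1000))

full = repr(values)            # complete, eval()-able, and very long
short = reprlib.repr(values)   # abbreviated after the first few items

# Only the full form honors x == eval(repr(x)).
full_round_trips = eval(full) == values
short_round_trips = eval(short) == values
```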

Given this discussion, what are you currently proposing?

-Chris
