Unicode to Tex symbols, Type1 names, and vice versa

John_Hunter1 · June 23, 2006, 1:21pm

> The reason why I used pickle - from the Python docs: =====

I have had bad experiences in the past with pickle files created with
one version that don't load with another. I don't know if that is a
common problem or if others have experienced it, but it has made me
wary of them for mpl, where we work across platforms and python
versions. Maybe this concern is unfounded. I still do not understand
what the downside is of simply creating a dictionary in a python
module as we do with latex_to_bakoma.

JDH

_Fernando_Perez1 · June 23, 2006, 1:43pm

The most common way pickle breaks is when you pickle an instance and
later modify the class it belongs to such that some attribute
disappears or is renamed. Since pickling works by 'fully qualified
name', meaning that it only saves the name of the class and the
instance data, but it doesn't actually save the original class, in
this scenario the pickle can't be unpickled since there are
attributes that the new class doesn't have anymore.
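A minimal sketch of that failure mode, assuming Python 3 and a made-up `Glyph` class living in a stand-in module called `glyph_demo` (both names are hypothetical, only there for illustration). It shows that the stream carries the class name rather than the class. It also shows that after an attribute rename the old instance data comes back unchanged, so the breakage tends to surface when code reaches for the new attribute. The lookup itself fails once the class name is gone:

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module that ships a class.
demo = types.ModuleType("glyph_demo")
sys.modules["glyph_demo"] = demo


class Glyph:
    def __init__(self, code):
        self.code = code


# Register the class under the stand-in module so pickle can look it up.
Glyph.__module__ = "glyph_demo"
Glyph.__qualname__ = "Glyph"
demo.Glyph = Glyph

payload = pickle.dumps(Glyph(0xD7), protocol=2)

# The stream names the module and class; the class body is not in it.
names_in_stream = b"glyph_demo" in payload and b"Glyph" in payload


# "Next release": same class name, but the attribute was renamed.
class RenamedAttrGlyph:
    def __init__(self, codepoint):
        self.codepoint = codepoint


demo.Glyph = RenamedAttrGlyph
restored = pickle.loads(payload)
# The old instance dict is restored as-is onto the new class.
old_attr_survives = hasattr(restored, "code")
new_attr_present = hasattr(restored, "codepoint")

# "Release after that": the class is gone, so the lookup by name fails.
del demo.Glyph
try:
    pickle.loads(payload)
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc
```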

If you are strictly pickling data in one of the builtin python types,
you are /probably/ OK, as I don't see python removing attributes from
dicts, and the builtin data types don't really have any special
instance attributes with much metadata that can change.

But it's still true that there's a window for problems with pickle
that simply isn't there with a pure auto-generated source module. And
the speed argument is, I think, moot: when you import something, python
marshals the source into binary bytecode using something which I think
is quite similar to cPickle, and probably just as fast (if not faster,
since marshal is simpler than pickle). I'm not 100% sure on the
details of bytecode marshalling, so please correct me if this part is
wrong.
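For what it's worth, a rough sketch of that import-time path, assuming CPython on Python 3: the source is compiled to a code object, and the cached .pyc file is a small header followed by that code object serialized with marshal. The file name `uni2tex_demo.py` and the two-entry dict are made up for illustration.

```python
import marshal

# Source text of a tiny, made-up data module.
source = "uni2tex = {0xd7: r'\\times', 0x222b: r'\\int'}\n"

# First import: the source is compiled to a code object.
code = compile(source, "uni2tex_demo.py", "exec")

# The cached .pyc stores this marshalled code object after a short header.
blob = marshal.dumps(code)

# Later imports skip compilation: unmarshal the code object and execute it.
namespace = {}
exec(marshal.loads(blob), namespace)
uni2tex = namespace["uni2tex"]
```

So a generated data module gets bytecode caching through the normal import machinery, with no separate data file to keep in sync.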

HTH,

f

_Edin_Salkovic · June 23, 2006, 9:28pm

Thanks John and Fernando,

You're right. I'll change the scripts to generate pure Python modules,
but I'll leave the "manual" module.

As for Unicode, I fully understand what you mean, John, and I'm planning
to try to get mathtext to work with the fonts I mentioned to you a
while ago:
http://canopus.iacp.dvo.ru/~panov/cm-unicode/

They have almost no pure math characters (like
integral etc.), but at least they'll be useful for testing the
module. They have some very exotic characters. The maintainer said
that, if I (or anybody) want to, I can send him patches for the math
symbols (not for this SoC :).

Edin

_Edin_Salkovic · June 24, 2006, 12:12am

Look what happened to my beautiful code

'''A script for seamlessly copying the data from the stix-tbl.ascii*
file to a set of python dicts. The dicts are then saved to the
corresponding .py files, for later retrieval.
Currently used table file:
http://www.ams.org/STIX/bnb/stix-tbl.ascii-2005-09-24
'''

tablefilename = 'stix-tbl.ascii-2005-09-24'
dictnames = ['uni2type1', 'type12uni', 'uni2tex', 'tex2uni']
dicts = {}
# initialize the dicts
for name in dictnames:
    dicts[name] = {}

for line in open(tablefilename):
    if line[:2]!=' 0': continue
    uninum = int(line[2:6].strip().lower(), 16)
    type1name = line[12:37].strip()
    texname = line[83:110].strip()
    if type1name:
        dicts['uni2type1'][uninum] = type1name
        dicts['type12uni'][type1name] = uninum
    if texname:
        dicts['uni2tex'][uninum] = texname
        dicts['tex2uni'][texname] = uninum

template = '''# Automatically generated file.
# Don't edit this file. Edit _mathtext_manual_data.py instead

%(name)s = {%(pairs)s
}
try:
    from _mathtext_manual_data import _%(name)s
    %(name)s.update(_%(name)s)
except (TypeError, SyntaxError): # Just these exceptions should be raised
    raise
except: # All other exceptions should be silent. Even ImportError
    pass
'''

# automatically generating .py module files, used by importers
for name in ('uni2type1', 'uni2tex'):
    pairs = ''
    for key, value in dicts[name].items():
        value = value.replace("'","\\'")
        value = value.replace('"','\\"')
        pair = "%(key)i : r'%(value)s',\n"%(locals())
        pairs += pair
    open(name + '.py','w').write(template%{'name':name, 'pairs':pairs})

for name in ('type12uni', 'tex2uni'):
    pairs = ''
    for key, value in dicts[name].items():
        key = key.replace("'","\\'")
        key = key.replace('"','\\"')
        pair = "r'%(key)s' : %(value)i,\n"%(locals())
        pairs += pair
    open(name + '.py','w').write(template%{'name':name, 'pairs':pairs})

# An example
from uni2tex import uni2tex
from uni2type1 import uni2type1

unichar = u'\u00d7'
uninum = ord(unichar)
print(uni2tex[uninum])
print(uni2type1[uninum])
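
One caveat about the hand-rolled quote escaping in the script above: inside a raw-string literal, a backslash placed before a quote stays in the value, so names containing quotes would come back with an extra backslash. A possible alternative, sketched here for Python 3 with made-up sample entries (the real dicts would come from the STIX table), is to let repr() produce the literals:

```python
# Made-up sample entries; the real script fills these from the STIX table.
tex2uni = {r"\times": 0xD7, r"\int": 0x222B, r"\'": 0x301}


def render_module(name, mapping):
    """Return Python source that defines `name` as a dict literal."""
    lines = ["# Automatically generated file.", "%s = {" % name]
    for key, value in sorted(mapping.items()):
        # repr() emits a valid literal, quoting and escaping as needed.
        lines.append("    %r: %r," % (key, value))
    lines.append("}")
    return "\n".join(lines) + "\n"


source = render_module("tex2uni", tex2uni)

# Round-trip check: executing the generated source rebuilds the same dict.
namespace = {}
exec(source, namespace)
roundtrip_ok = namespace["tex2uni"] == tex2uni
```

Writing `source` to `name + '.py'` would then replace the two pair-building loops. The `render_module` helper is only an illustration, not part of the script.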