log scaling fixes in backend_ps.py

Hello,

revisions 1.26 and 1.27 introduced some changes into the PS backend
which I do not quite understand. The log messages are "some log
optimizations" and "more log scaling fixes". What is the problem
they try to fix?

Most of the changes consist in replacements like

     def set_linewidth(self, linewidth):
         if linewidth != self.linewidth:
-            self._pswriter.write("%s setlinewidth\n"%_num_to_str(linewidth))
+            self._pswriter.write("%1.3f setlinewidth\n"%linewidth)
             self.linewidth = linewidth

with the result that a linewidth of 2 is now emitted as 2.000 instead
of 2. Was this done deliberately? I guess that it can result in a
significant increase in PostScript file size. It also broke
set_linejoin and set_linecap, which expect integer arguments :-(
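
To make the difference concrete, here is a minimal sketch comparing the two formatting styles. It is not the actual `_num_to_str` from backend_ps.py: `compact_num` is a hypothetical stand-in, and it assumes the helper's job is to drop trailing zeros and keep integers as integers.

```python
def compact_num(value, fmt="%1.3f"):
    """Hypothetical stand-in for a compact number formatter.

    Integers are written as-is; floats are formatted with `fmt`
    and then stripped of trailing zeros and a dangling dot.
    """
    if isinstance(value, int):
        return str(value)
    text = fmt % value
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


# Fixed-width formatting always emits three decimals.
print("%1.3f setlinewidth" % 2)
# Compact formatting keeps the short form.
print("%s setlinewidth" % compact_num(2))
# Operators that expect an integer operand get one.
print("%s setlinejoin" % compact_num(1))
```

With the fixed format, an integer join or cap style would be written as a real number such as 1.000, which is what trips up operators that expect an integer operand.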

I will fix set_linejoin and set_linecap separately at the moment, but
I am also not so happy with the other changes as mentioned above.

All the best,
Jochen

--
http://seehuhn.de/

Hello,

On Tue, Feb 08, 2005 at 11:01:05AM +0000, Jochen Voss wrote:

> I guess that it can result in a significant increase in PostScript
> file size.

Actually it is only a moderate increase. The total generated output
of backend_driver.py grows by 3.3% (from 33623818 to 34725498 bytes).
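
As a quick check of that figure, the relative growth can be recomputed from the two byte counts quoted above (the variable names are only for illustration):

```python
# Total PostScript output of backend_driver.py, in bytes, as quoted above.
size_before = 33623818
size_after = 34725498

growth_percent = 100.0 * (size_after - size_before) / size_before
print("%.1f%%" % growth_percent)
```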

Fixes for set_linecap and set_linejoin are in CVS.

I hope this helps,
Jochen
--

Jochen Voss wrote:

> I will fix set_linejoin and set_linecap separately at the moment, but
> I am also not so happy with the other changes as mentioned above.

OK, I updated to CVS, which fixed the PS problems, many thanks. I then reapplied my patch and fixed it a bit here and there. Here's the new version, against current CVS.

There is no significant win in backend_driver (a couple of percentage points at best), but no regression either. In a few places the code now avoids O(N^2) operations, the kind of thing that can blow up in your face unexpectedly with a large dataset of the right kind, yet go totally unnoticed in typical usage. I ran the whole backend_driver and looked at the PS output, which seems OK to me. Feel free to apply if you are OK with it.
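
The message does not say which operations the patch changes, so the following is only a generic illustration of the kind of hidden quadratic cost being described, not code from the patch. Both function names are made up for this sketch. Repeatedly concatenating onto a growing string may copy everything accumulated so far on each step, while collecting the pieces and joining once stays linear.

```python
def build_by_concatenation(pieces):
    """Potentially quadratic: each step may copy the whole string so far."""
    out = ""
    for piece in pieces:
        out = out + piece
    return out


def build_by_join(pieces):
    """Linear: the pieces are copied once into the final string."""
    return "".join(pieces)


pieces = ["%1.3f %1.3f l\n" % (i, i) for i in range(5)]
# Both approaches produce the same text; only the cost profile differs.
print(build_by_concatenation(pieces) == build_by_join(pieces))
```

On small inputs the two are indistinguishable, which is exactly why this sort of cost goes unnoticed until a large dataset shows up.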

The two routines where there might be really significant gains are

     def _draw_lines(self, gc, points):
         """
         Draw many lines. 'points' is a list of point coordinates.
         """
         # inline this for performance
         ps = ["%1.3f %1.3f m" % points[0]]
         ps.extend(["%1.3f %1.3f l" % point for point in points[1:] ])
         self._draw_ps("\n".join(ps), gc, None)

     def draw_lines(self, gc, x, y):
         """
         x and y are equal length arrays, draw lines connecting each
         point in x, y
         """
         if debugPS:
             self._pswriter.write("% lines\n")
         start = 0
         end = 1000
         points = zip(x,y)

         while 1:
             to_draw = points[start:end]
             if not to_draw:
                 break
             self._draw_lines(gc,to_draw)
             start = end
             end += 1000

Currently, these zip a pair of arrays and then format them into text manually. This means resorting to Python loops over lists of tuples, something bound to be terribly inefficient. I can imagine this being quite slow for plots with many thousands of lines.
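
For readers trying the chunking logic today: the code above relies on zip() returning a list, which is no longer the case in Python 3, where the result is a lazy iterator and cannot be sliced. A minimal sketch of the same 1000-point chunking, with the hypothetical helper name `iter_chunks`, might look like this:

```python
def iter_chunks(x, y, size=1000):
    """Yield successive lists of (x, y) tuples, each at most `size` long."""
    # Materialize the pairs first: zip() is lazy in Python 3.
    points = list(zip(x, y))
    for start in range(0, len(points), size):
        yield points[start:start + size]


chunks = list(iter_chunks(range(2500), range(2500)))
print([len(chunk) for chunk in chunks])
```

As in the quoted code, each chunk would be passed to _draw_lines separately and so starts with its own moveto. The sketch keeps that behaviour rather than trying to change it.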

However, I'm afraid to rewrite this in a low-level way, because of the numeric/numarray difference. The right approach for this would be to generate a string representation of the array via numeric/numarray, which can do it in C. And then, that can be modified to add the m/l end of line markers on the first/rest lines via a (fast) python string operation.

But since I don't know whether numeric/numarray provide fully consistent array2str functions (I only have numeric to test with), I'm a bit afraid of touching this part. It's also possible that John's backend architectural changes end up modifying this, so perhaps such changes are best thought about after John finishes his reorganization.
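
One way to sidestep the numeric/numarray question altogether is to keep the formatting in plain Python but do it in a single `%` operation per chunk instead of one per point. This is only a sketch under that assumption: `path_to_ps` is a hypothetical name, and whether it is actually faster than the list comprehension above has not been measured here.

```python
from itertools import chain


def path_to_ps(x, y):
    """Format a polyline as PostScript moveto/lineto text in one '%' call."""
    points = list(zip(x, y))
    if not points:
        return ""
    # One 'm' line for the first point, one 'l' line for each later point.
    template = "%1.3f %1.3f m" + "\n%1.3f %1.3f l" * (len(points) - 1)
    # Flatten [(x0, y0), (x1, y1), ...] into (x0, y0, x1, y1, ...).
    return template % tuple(chain.from_iterable(points))


print(path_to_ps([0, 1, 2], [0, 1, 4]))
```

This works on any sequence, so it behaves the same whichever array package supplies x and y.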

But I think the patch is a safe, small cleanup to apply now.

Cheers,

f

backend_ps.diff (7.71 KB)

Fernando Perez wrote:

> However, I'm afraid to rewrite this in a low-level way, because of the
> numeric/numarray difference. The right approach for this would be to
> generate a string representation of the array via numeric/numarray,
> which can do it in C. And then, that can be modified to add the m/l
> end of line markers on the first/rest lines via a (fast) python string
> operation.
>
> But since I don't know whether numeric/numarray provide fully consistent
> array2str functions (I only have numeric to test with), I'm a bit
> afraid of touching this part. It's also possible that John's backend
> architectural changes end up modifying this, so perhaps such changes
> are best thought about after John finishes his reorganization.

I'm only going from memory but my recollection is that the code to convert
arrays to strings is largely written in Python (we took it from Numeric).
Perhaps that has changed (for Numeric or numarray). My recollection is
that the Python code made use of many array manipulations so it should
be faster than element-by-element string conversions, but I also remember
thinking that this was an area that could potentially be speeded up by
conversion to C. In short, don't assume that the current array2str code
is in C (Todd may be able to correct me).

Perry

Quoting Perry Greenfield <perry@...31...>:

> I'm only going from memory but my recollection is that the code to convert
> arrays to strings is largely written in Python (we took it from Numeric).
> Perhaps that has changed (for Numeric or numarray). My recollection is
> that the Python code made use of many array manipulations so it should
> be faster than element-by-element string conversions, but I also remember
> thinking that this was an area that could potentially be speeded up by
> conversion to C. In short, don't assume that the current array2str code
> is in C (Todd may be able to correct me).

This is actually something which may be worth improving upon for future
numeric(3)/numarray versions. A fast, reliable, _and_ flexible string
formatter for arrays can be extremely useful in many contexts. Something where
you can control the format with precision, the line and item separators, and a
few other things, but with good speed, would be great. In current Numeric, you
can control item but not line (dimension) separators, for example (each line is
bracketed in []).

With such a formatter, this particular problem for PS formatting (and perhaps
others in mpl) could be delegated to the arrays themselves.
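
To illustrate the interface I have in mind (not its speed), here is a pure-Python sketch. `format_array` and its parameters are hypothetical and do not exist in Numeric or numarray; a real version would live in the array package and do the work in C.

```python
def format_array(rows, item_fmt="%1.3f", item_sep=" ", line_sep="\n"):
    """Format a 2-D sequence with caller-chosen item and line separators."""
    return line_sep.join(
        item_sep.join(item_fmt % item for item in row) for row in rows
    )


data = [[0, 0], [1, 1.5]]
# PostScript-style output: space-separated items, one row per line.
print(format_array(data))
# The same data with a different format and separators.
print(format_array(data, item_fmt="%g", item_sep=",", line_sep=";"))
```

The point is that the format, the item separator, and the line separator are all under the caller's control, with no brackets imposed on each row.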

Best,

f

Your impression is correct; array2str was and still is mostly Python.

Todd

On Thu, 2005-02-10 at 10:03, Fernando.Perez@...76... wrote:

Quoting Perry Greenfield <perry@...31...>:

> I'm only going from memory but my recollection is that the code to convert
> arrays to strings is largely written in Python (we took it from Numeric).
> Perhaps that has changed (for Numeric or numarray). My recollection is
> that the Python code made use of many array manipulations so it should
> be faster than element-by-element string conversions, but I also remember
> thinking that this was an area that could potentially be speeded up by
> conversion to C. In short, don't assume that the current array2str code
> is in C (Todd may be able to correct me).