'bug' in labelling axes for data with small variation?

John_Hunter · February 25, 2005, 4:02am

    > x=pylab.arange(0,1e-8,1e-9)+1.0
    > pylab.plot(x)
    > pylab.show()

    > All works fine when I subtract the mean of x but there seems
    > to be a problem with labelling axes for plotted data which
    > is not close to zero but shows only small variations.

I agree it's a bug. It's not immediately clear to me what the labels
should be though

  1.0000000002
  1.0000000004
  1.0000000006

and so on? That takes up a lot of room. Granted, correct but ugly
is better than incorrect but pretty, but I'm curious if there is a
better way to format these cases. Perhaps ideal would be an indicator
at the bottom or top of the y axis that read '1+' and then use 2e-9,
4e-9, etc as the actual tick labels. Do you agree this is ideal?

To achieve this, as you may know, you can pass in a custom tick
formatter. Eg

  import pylab

  def myformatter(x, pos):
      return '%1.0e'%(x-1)

  ax = pylab.subplot(111)
  x=pylab.arange(0,1e-8,1e-9)+1.0
  ax.plot(x)
  formatter = pylab.FuncFormatter(myformatter)
  ax.yaxis.set_major_formatter(formatter)
  ax.text(-.125, 1.025, '1+', transform=ax.transAxes)
  pylab.show()

See also examples/custom_ticker1.py.

This is not to say that I don't want to fix this bug; I just wanted to
make sure you are aware of this workaround.
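
As a rough illustration of the '1+' idea in plain Python, without matplotlib: subtract a fixed offset from each tick value and format only the remainder. The helper name `offset_tick_labels` and the hard-coded offset of 1.0 are assumptions made for this sketch; neither is part of matplotlib.

```python
def offset_tick_labels(ticks, offset):
    """Return (indicator, labels) for an axis labelled relative to `offset`.

    `indicator` is the text shown once per axis (e.g. '1+'); `labels`
    holds the remainder of each tick after subtracting the offset.
    """
    indicator = '%g+' % offset
    labels = ['%1.0e' % (t - offset) for t in ticks]
    return indicator, labels


# Hypothetical tick positions close to 1, two nano-units apart.
ticks = [1.0 + k * 2e-9 for k in range(1, 4)]
indicator, labels = offset_tick_labels(ticks, 1.0)
print(indicator, labels)
```

The subtraction loses nothing here because the remainders are still far above double-precision resolution near 1 (about 2e-16).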

JDH

Hans_Fangohr · February 25, 2005, 10:19am

Hi John,

> > x=pylab.arange(0,1e-8,1e-9)+1.0
> > pylab.plot(x)
> > pylab.show()
> >
> > All works fine when I subtract the mean of x but there seems
> > to be a problem with labelling axes for plotted data which
> > is not close to zero but shows only small variations.
>
> I agree it's a bug. It's not immediately clear to me what the labels
> should be though
>
> 1.0000000002
> 1.0000000004
> 1.0000000006
>
> and so on? That takes up a lot of room. Granted, correct but ugly
> is better than incorrect but pretty, but I'm curious if there is a
> better way to format these cases. Perhaps ideal would be an indicator
> at the bottom or top of the y axis that read '1+' and then use 2e-9,
> 4e-9, etc as the actual tick labels. Do you agree this is ideal?

That would be very cool [but not so easy to code, I suppose]. I just checked what Matlab does (seeing that they have been in the business for a while), and if I run the equivalent program (i.e. plotting data from 1+0e-9, 1+1e-9, 1+2e-9,...,1e-8), the plot just shows "1" at each tick.

So: yes, I agree with your suggested ideal solution, but as long as Matlab doesn't do this and there is a workaround (as you show below), this is probably not a high-priority bug.

> To achieve this, as you may know, you can pass in a custom tick
> formatter. Eg
>
> import pylab
>
> def myformatter(x, pos):
>     return '%1.0e'%(x-1)
>
> ax = pylab.subplot(111)
> x=pylab.arange(0,1e-8,1e-9)+1.0
> ax.plot(x)
> formatter = pylab.FuncFormatter(myformatter)
> ax.yaxis.set_major_formatter(formatter)
> ax.text(-.125, 1.025, '1+', transform=ax.transAxes)
> pylab.show()
>
> See also examples/custom_ticker1.py.
>
> This is not to say that I don't want to fix this bug; I just wanted to
> make sure you are aware of this workaround.

That's very useful information -- I wasn't aware of this possibility and am sure it will be useful to others searching the mailing list for answers.

Best wishes,

Hans

_Darren_Dale1 · February 25, 2005, 1:45pm

Having some email problems; my original reply wasn't sent. Hopefully this one
will be...

> > x=pylab.arange(0,1e-8,1e-9)+1.0
> > pylab.plot(x)
> > pylab.show()
> >
> > All works fine when I subtract the mean of x but there seems
> > to be a problem with labelling axes for plotted data which
> > is not close to zero but shows only small variations.
>
> I agree it's a bug. It's not immediately clear to me what the labels
> should be though
>
> 1.0000000002
> 1.0000000004
> 1.0000000006
>
> and so on? That takes up a lot of room. Granted, correct but ugly
> is better than incorrect but pretty, but I'm curious if there is a
> better way to format these cases. Perhaps ideal would be an indicator
> at the bottom or top of the y axis that read '1+' and then use 2e-9,
> 4e-9, etc as the actual tick labels. Do you agree this is ideal?

Matlab will not do an offset of this type; it goes to 4 decimal places and
gives up. I have been meaning to do some work with the ScalarFormatter for a
while now. I think we can accomplish the above using the numerix mean, std, and
maybe allclose functions. I will work on it this weekend, but will need some
guidance to add a new bbox to add the result to the plot. At the same time,
I'll work on an option to move the scientific notation to the same place, so
1e10,2e10 will become 1,2, with a x10^10 at the top of the axis.
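
To illustrate moving the scientific notation out of the individual labels, here is a rough sketch in plain Python. `split_common_exponent` is a hypothetical helper, not a matplotlib function, and it assumes that a single power of ten, taken from the largest tick, is acceptable for the whole axis.

```python
import math


def split_common_exponent(ticks):
    """Return (exponent, mantissas) with tick == mantissa * 10**exponent."""
    largest = max(abs(t) for t in ticks)
    if largest == 0:
        # All ticks are zero: there is no exponent to factor out.
        return 0, list(ticks)
    exponent = int(math.floor(math.log10(largest)))
    mantissas = [t / 10.0 ** exponent for t in ticks]
    return exponent, mantissas


exponent, mantissas = split_common_exponent([0.0, 1e10, 2e10, 3e10])
tick_labels = ['%g' % m for m in mantissas]
axis_note = 'x10^%d' % exponent
print(tick_labels, axis_note)
```

The tick labels become 0, 1, 2, 3 and the shared `x10^10` would be drawn once, at the top of the axis.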

It was also suggested to keep the significant decimal places in the labels. I
agree that 0,0.5,1.0,1.5 looks better than 0,0.5,1,1.5. If there are any
suggestions or comments, please let me know.
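
One possible way to get labels like 0,0.5,1.0,1.5 is to find the largest number of decimals any tick needs and apply it to all of them. The sketch below is plain Python; the function name and the cap of four decimals are assumptions for illustration only.

```python
def uniform_decimal_labels(ticks, max_decimals=4):
    """Format every tick with the same number of decimals.

    The count is the smallest that represents all ticks exactly when
    rounded to `max_decimals`; zero is printed as a bare '0'.
    """
    def decimals_needed(value):
        text = ('%.*f' % (max_decimals, value)).rstrip('0')
        return len(text.split('.')[1])

    decimals = max(decimals_needed(t) for t in ticks)
    return ['0' if t == 0 else '%.*f' % (decimals, t) for t in ticks]


labels = uniform_decimal_labels([0, 0.5, 1.0, 1.5])
print(labels)
```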

--

Darren

_Matt_Newville · February 25, 2005, 4:11pm

> I agree it's a bug. It's not immediately clear to me what the labels
> should be though
>
> 1.0000000002
> 1.0000000004
> 1.0000000006
>
> and so on? That takes up a lot of room. Granted, correct but ugly
> is better than incorrect but pretty, but I'm curious if there is a
> better way to format these cases. Perhaps ideal would be an indicator
> at the bottom or top of the y axis that read '1+' and then use 2e-9,
> 4e-9, etc as the actual tick labels. Do you agree this is ideal?

More likely, the plot should be of x-1, not x, with 1 subtracted
from the data before being sent to the plot. Would you use
seconds-since-1970 to make a plot versus Time with a range of 1
sec and data every millisecond? The data plotted should be the
"significant digits" after all.
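
A small sketch of that approach in plain Python: subtract a reference value from the data first, then label what is left. The timestamps are made up for illustration and `to_relative` is a hypothetical helper.

```python
def to_relative(values, reference=None):
    """Return (reference, values - reference); defaults to the first value."""
    if reference is None:
        reference = values[0]
    return reference, [v - reference for v in values]


# Hypothetical samples: seconds since 1970, one reading per millisecond.
timestamps = [1109300000.0 + k * 0.001 for k in range(5)]
t0, elapsed = to_relative(timestamps)

# Plot `elapsed` instead of `timestamps`; the labels are then short.
labels = ['%.0f ms' % (e * 1000) for e in elapsed]
print(labels)
```

Note that doubles near 1e9 resolve only about 2e-7, so very fine variations are better subtracted before they are ever added to a large offset.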

FWIW, a custom tick formatter I've been using is below. It's a
slight variation on the default, and won't solve the space
needed to display "1 + n*1.e-9", but it will do a reasonable job
of picking the number of significant digits to show based on the
data range for the Axis. It could be expanded....

--Matt

def myformatter(self, x=1.0, axis=None):
    """ custom tick formatter to use with FuncFormatter():
    x     value to be formatted
    axis  Axis instance to use for formatting
    """
    fmt = '%1.5g'
    if axis is None:
        return fmt % x

    # attempt to get axis span (range of values to format)
    delta = 0.2
    try:
        ticks = axis.get_major_locator()()
        delta = abs(ticks[1] - ticks[0])
    except Exception:
        try:
            delta = 0.1 * axis.get_view_interval().span()
        except Exception:
            pass

    # pick the number of decimals from the tick spacing
    if delta > 99999: fmt = '%1.6e'
    elif delta > 0.99: fmt = '%1.0f'
    elif delta > 0.099: fmt = '%1.1f'
    elif delta > 0.0099: fmt = '%1.2f'
    elif delta > 0.00099: fmt = '%1.3f'
    elif delta > 0.000099: fmt = '%1.4f'
    elif delta > 0.0000099: fmt = '%1.5f'

    s = fmt % x
    s = s.strip()  # strip() returns a new string; keep the result
    s = s.replace('+', '')
    # shorten exponents: 'e05' -> 'e5', 'e-05' -> 'e-5'
    while s.find('e0')>0: s = s.replace('e0','e')
    while s.find('-0')>0: s = s.replace('-0','-')

    return s

_Darren_Dale1 · February 26, 2005, 8:12pm

I worked on a new formatter yesterday and today. It includes the indicator
that John described above, right now in the last ticklabel at the top of the
axis. This custom formatter also includes scientific notation in the last
ticklabel only. The ultimate goal is to have scientific notation be formatted
as in the log plots, but I haven't gotten that far yet.

Using the offset makes a large ticklabel at the moment. You can pass
useOffset=False to ScalarFormatterScientific to turn this feature off (see end
of script below). Interested parties, please give this script a whirl and send
me your comments.

(John, I have now subclassed ScalarFormatter; I didn't realize I had altered
a method that other formatters were inheriting.)

Darren

from matplotlib import *
rc('font',size='smaller')
rc('tick',labelsize='smaller')
from matplotlib.ticker import ScalarFormatter, LinearLocator
import math
from matplotlib.numerix import absolute, average

from pylab import *

class ScalarFormatterScientific(ScalarFormatter):
    """
    Tick location is a plain old number. If viewInterval is set, the
    formatter will use %d, %1.#f or %1.ef as appropriate. If it is
    not set, the formatter will do str conversion
    """
    def __init__(self, useOffset=True):
        """
        useOffset allows plotting small data ranges with large offsets:
        for example: [1+1e-9,1+2e-9,1+3e-9]
        """
        self._useOffset = useOffset

    def set_locs(self, locs):
        self.locs = locs
        self._set_offset()
        self._set_orderOfMagnitude()
        self._set_format()

    def _set_offset(self):
        # offset of 20,001 is 20,000, for example
        if self._useOffset:
            ave_loc = average(self.locs)
            std_loc = std(self.locs)
            if ave_loc: # dont want to take log10(0)
                ave_oom = math.floor(math.log10(absolute(ave_loc)))
                if std_loc/math.fabs(ave_loc) < 1e-4: # four sig-figs
                    # add 1e-15 because of floating point precision, fixes conversion
                    self.offset = int(ave_loc/10**ave_oom+1e-15)*10**ave_oom
        else: self.offset = 0

    def _set_orderOfMagnitude(self):
        # if using an offset, oom applies after applying the offset
        locs = array(self.locs)-self.offset
        ave_loc_abs = average(absolute(locs))
        oom = math.floor(math.log10(ave_loc_abs))
        # need to special-case for range of 0-1e-5
        if oom <= 0 and std(locs) < 1e-4:#10**(2*oom):
            self.orderOfMagnitude = oom
        elif oom <=0 and oom >= -5:
            pass
        elif math.fabs(oom) >= 4:
            self.orderOfMagnitude = oom

    def _set_format(self):
        locs = (array(self.locs,'d')-self.offset) / \
                10**self.orderOfMagnitude+1e-15
        sigfigs = [len(str('%1.4f'% loc).split('.')[1].rstrip('0')) \
                   for loc in locs]
        sigfigs.sort()
        self.format = '%1.' + str(sigfigs[-1]) + 'f'

    def pprint_val(self, x, d):
        xp = (x-self.offset)/10**self.orderOfMagnitude
        if x==self.locs[-1] and (self.orderOfMagnitude or self.offset):
            offsetStr = ''
            sciNotStr = ''
            xp = self.format % xp
            if self.offset:
                p = '%1.e+'% self.offset
                offsetStr = self._formatSciNotation(p)
            if self.orderOfMagnitude:
                p = 'x%1.e'% 10**self.orderOfMagnitude
                sciNotStr = self._formatSciNotation(p)
            return ''.join((offsetStr,xp,sciNotStr))
        elif xp==0: return '%d'% xp
        else: return self.format % xp

    def _formatSciNotation(self,s):
        tup = s.split('e')
        mantissa = tup[0]
        sign = tup[1][0].replace('+', '')
        exponent = tup[1][1:].lstrip('0')
        return '%se%s%s' %(mantissa, sign, exponent)

figure(1,figsize=(6,6))
ax1 = axes([.2,.74,.75,.2])
ax1.plot(arange(11)*5e2)
ax1.yaxis.set_major_formatter(ScalarFormatterScientific())
ax1.xaxis.set_visible(False)
ax1.set_title('BIG NUMBERS',fontsize=14)

ax2 = axes([.2,.51,.75,.2])
ax2.plot(arange(11)*1e4)
ax2.yaxis.set_major_formatter(ScalarFormatterScientific())
ax2.text(1,6e4,'6e4')
ax2.xaxis.set_visible(False)

ax3 = axes([.2,.28,.75,.2])
ax3.plot(arange(11)*1e4+1e10)
ax3.yaxis.set_major_formatter(ScalarFormatterScientific())
ax3.text(1,6e4+1e10,'1e10+6e4')
ax3.xaxis.set_visible(False)

ax4 = axes([.2,.05,.75,.2])
ax4.plot(arange(11)*1e4+1e10)
ax4.yaxis.set_major_formatter(ScalarFormatterScientific(useOffset=False))
ax4.text(1,1e10+6e4,'same as above, no offset')

figure(2,figsize=(6,6))
ax1 = axes([.225,.74,.75,.2])
ax1.plot(arange(11)*5e-5)
ax1.yaxis.set_major_formatter(ScalarFormatterScientific())
ax1.xaxis.set_visible(False)
ax1.set_title('small numbers',fontsize=8)

ax2 = axes([.225,.51,.75,.2])
ax2.plot(arange(11)*1e-5)
ax2.yaxis.set_major_formatter(ScalarFormatterScientific())
ax2.text(1,6e-5,'6e-5')
ax2.xaxis.set_visible(False)

ax3 = axes([.225,.28,.75,.2])
ax3.plot(arange(11)*1e-10+1e-5)
ax3.yaxis.set_major_formatter(ScalarFormatterScientific())
ax3.text(1,1e-5+6e-10,'6e-10+1e-5')
ax3.xaxis.set_visible(False)

ax4 = axes([.225,.05,.75,.2])
ax4.plot(arange(11)*1e-10+1e-5)
ax4.yaxis.set_major_formatter(ScalarFormatterScientific(useOffset=False))
ax4.text(1,1e-5+6e-10,'same as above, no offset')
show()

_Darren_Dale1 · February 28, 2005, 1:40am

Oops, I just noticed a bug: the first script I posted won't run. This updated script
worked for me with a fresh 0.72.1 installation. Sorry about the error.

Darren

from matplotlib import *
rc('font',size='smaller')
rc('tick',labelsize='smaller')
from matplotlib.ticker import ScalarFormatter, LinearLocator
import math
from matplotlib.numerix import absolute, average

from pylab import *

class ScalarFormatterScientific(ScalarFormatter):
    """
    Tick location is a plain old number. If useOffset==True and the data range
    <1e-4* the data average, then an offset will be determined such that the
    tick labels are meaningful. Scientific notation is used for data < 1e-4 or
    data >= 1e4. Scientific notation is presented once for each axis, in the
    last ticklabel.
    """
    def __init__(self, useOffset=True):
        """
        useOffset allows plotting small data ranges with large offsets:
        for example: [1+1e-9,1+2e-9,1+3e-9]
        """
        self._useOffset = useOffset
        self.offset = 0
        self.orderOfMagnitude = 0
        self.format = None

    def set_locs(self, locs):
        self.locs = locs
        self._set_offset()
        self._set_orderOfMagnitude()
        self._set_format()

    def _set_offset(self):
        # offset of 20,001 is 20,000, for example
        # reset first, so an offset from an earlier set of ticks is not reused
        self.offset = 0
        if self._useOffset:
            ave_loc = average(self.locs)
            std_loc = std(self.locs)
            if ave_loc: # don't want to take log10(0)
                ave_oom = math.floor(math.log10(absolute(ave_loc)))
                if std_loc/math.fabs(ave_loc) < 1e-4: # four sig-figs
                    # add 1e-15 because of floating point precision, fixes conversion
                    self.offset = int(ave_loc/10**ave_oom+1e-15)*10**ave_oom

    def _set_orderOfMagnitude(self):
        # if scientific notation is to be used, find the appropriate exponent
        # if using an offset, find the OOM after applying the offset
        locs = array(self.locs)-self.offset
        ave_loc_abs = average(absolute(locs))
        oom = math.floor(math.log10(ave_loc_abs))
        # need to special-case for range of 0-1e-5
        if oom <= 0 and std(locs) < 1e-4:#10**(2*oom):
            self.orderOfMagnitude = oom
        elif oom <=0 and oom >= -5:
            pass
        elif math.fabs(oom) >= 4:
            self.orderOfMagnitude = oom

    def _set_format(self):
        # set the format string to format all the ticklabels
        locs = (array(self.locs,'d')-self.offset) / \
                10**self.orderOfMagnitude+1e-15
        sigfigs = [len(str('%1.4f'% loc).split('.')[1].rstrip('0')) \
                   for loc in locs]
        sigfigs.sort()
        self.format = '%1.' + str(sigfigs[-1]) + 'f'

    def pprint_val(self, x, d):
        # d is no longer necessary, x is the tick location.
        xp = (x-self.offset)/10**self.orderOfMagnitude
        if x==self.locs[-1] and (self.orderOfMagnitude or self.offset):
            offsetStr = ''
            sciNotStr = ''
            xp = self.format % xp
            if self.offset:
                p = '%1.e+'% self.offset
                offsetStr = self._formatSciNotation(p)
            if self.orderOfMagnitude:
                p = 'x%1.e'% 10**self.orderOfMagnitude
                sciNotStr = self._formatSciNotation(p)
            return ''.join((offsetStr,xp,sciNotStr[2:]))
        elif xp==0: return '%d'% xp
        else: return self.format % xp

    def _formatSciNotation(self,s):
        # transform 1e+004 into 1e4, for example
        tup = s.split('e')
        mantissa = tup[0]
        sign = tup[1][0].replace('+', '')
        exponent = tup[1][1:].lstrip('0')
        return '%se%s%s' %(mantissa, sign, exponent)

figure(1,figsize=(6,6))
ax1 = axes([.2,.74,.75,.2])
ax1.plot(arange(11)*5e2)
ax1.yaxis.set_major_formatter(ScalarFormatterScientific())
ax1.xaxis.set_visible(False)
ax1.set_title('BIG NUMBERS',fontsize=14)

ax2 = axes([.2,.51,.75,.2])
ax2.plot(arange(11)*1e4)
ax2.yaxis.set_major_formatter(ScalarFormatterScientific())
ax2.text(1,6e4,'y=1e4*x')
ax2.xaxis.set_visible(False)

ax3 = axes([.2,.28,.75,.2])
ax3.plot(arange(11)*1e4+1e10)
ax3.yaxis.set_major_formatter(ScalarFormatterScientific())
ax3.text(1,6e4+1e10,'y=1e4*x+1e10')
ax3.xaxis.set_visible(False)

ax4 = axes([.2,.05,.75,.2])
ax4.plot(arange(11)*1e4+1e10)
ax4.yaxis.set_major_formatter(ScalarFormatterScientific(useOffset=False))
ax4.text(1,1e10+6e4,'y=1e4*x+1e10, no offset')

figure(2,figsize=(6,6))
ax1 = axes([.225,.74,.75,.2])
ax1.plot(arange(11)*1e-4)
ax1.yaxis.set_major_formatter(ScalarFormatterScientific())
ax1.xaxis.set_visible(False)
ax1.set_title('small numbers',fontsize=8)

ax2 = axes([.225,.51,.75,.2])
ax2.plot(arange(11)*1e-5)
ax2.yaxis.set_major_formatter(ScalarFormatterScientific())
ax2.text(1,6e-5,'y=1e-5*x')
ax2.xaxis.set_visible(False)

ax3 = axes([.225,.28,.75,.2])
ax3.plot(arange(11)*1e-10+1e-5)
ax3.yaxis.set_major_formatter(ScalarFormatterScientific())
ax3.text(1,1e-5+6e-10,'y=1e-10*x+1e-5')
ax3.xaxis.set_visible(False)

ax4 = axes([.225,.05,.75,.2])
ax4.plot(arange(11)*1e-10+1e-5)
ax4.yaxis.set_major_formatter(ScalarFormatterScientific(useOffset=False))
ax4.text(1,1e-5+6e-10,'y=1e-10*x+1e-5, no offset')
show()
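
The offset rule used in `_set_offset` above can be tried on its own, without matplotlib. The sketch below is a plain-Python restatement, not the script's actual code: `choose_offset` is a hypothetical name, and it assumes `std` means the population standard deviation, which is what `statistics.pstdev` computes.

```python
import math
import statistics


def choose_offset(locs, rel_tol=1e-4):
    """Return an offset for tick locations `locs`, or 0 if none is useful.

    An offset is used only when the spread of the ticks is small
    compared with their mean (std / |mean| < rel_tol). The offset is
    the leading digit of the mean times its power of ten.
    """
    mean = statistics.fmean(locs)
    if mean == 0:
        return 0  # avoid log10(0)
    spread = statistics.pstdev(locs)
    if spread / abs(mean) >= rel_tol:
        return 0
    oom = math.floor(math.log10(abs(mean)))
    # 1e-15 guards against a mantissa that lands just below an integer
    return int(mean / 10 ** oom + 1e-15) * 10 ** oom


print(choose_offset([20001, 20002, 20003]))        # small spread, large mean
print(choose_offset([0, 1, 2, 3]))                 # spread comparable to mean
print(choose_offset([1 + 1e-9, 1 + 2e-9, 1 + 3e-9]))
```

The first call gives 20000, matching the "offset of 20,001 is 20,000" comment in the script; the second gives 0 because the ticks are not clustered around a large value.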