Is dimensionality.string supposed to return an ASCII string?

Asked by Tony S Yu

Hi quantities devs,

I'm just starting to play around with quantities, and I really like it.

I'm trying to print some quantities, but I'm getting encoding errors. I compiled quantities without the --no-unicode flag, which seems to cause problems in my text editor (TextMate). I tried to work around this by printing ``quantity.dimensionality.string``. That works for superscripts but fails for the degree symbol: for example, ``print 20 * pq.degC`` fails in my text editor (see the error message below).

Was ``dimensionality.string`` intended to make the dimensionality printable as ASCII? Or is something else causing my problems (I don't have much experience with character encodings)?

Thanks!
-Tony

Note that this error doesn't occur in the terminal.

System:
Mac OS X 10.5.6
python 2.5.1 (original Leopard install)
numpy 1.3.0
quantities 0.5b2

#------Error message-------
UnicodeDecodeError: ascii, °C, 0, 1, ordinal not in range(128)

function write in codecs.py at line 303
data, consumed = self.encode(object, self.errors)
Warning: It seems that you are trying to print a plain string containing unicode characters. In many contexts, setting the script encoding to UTF-8 and using plain strings with non-ASCII will work, but it is fragile. See also this ticket.

You can fix this by changing the string to a unicode string using the 'u' prefix (e.g. u"foobar").
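
A minimal sketch of the failure the message above describes, written for Python 3 rather than the Python 2.5 used in this report, so it only approximates the original situation. It assumes the plain string held the UTF-8 bytes of "°C"; decoding those bytes with the ASCII codec yields the same codec name, offsets, and reason that appear in the error line.

```python
# UTF-8 bytes of the degree-Celsius symbol, as a Python 2 plain (byte)
# string would hold them. Assumption: the source file was saved as UTF-8.
raw = "\u00b0C".encode("utf-8")

try:
    # Roughly what an ASCII-only output stream attempts with a byte string.
    raw.decode("ascii")
except UnicodeDecodeError as err:
    failure = (err.encoding, err.start, err.end, err.reason)
    print(failure)

# Decoding with the codec that matches the bytes succeeds.
text = raw.decode("utf-8")
print(text == "\u00b0C")
```

The first byte of the two-byte degree sign is already outside the ASCII range, which is why the failure is reported at position 0.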

Question information

Language: English
Status: Answered
For: python-quantities

Tony S Yu (tonysyu) said :
#1

I'm not sure if this is the correct forum to be asking questions, but the mailing list seems to be for developers only.

Looking over the code, it appears that ``dimensionality.string`` is not intended to return ASCII strings. I'm also a little confused about the difference between ``format_units`` and ``format_units_unicode`` in markup.py.

In any case, it'd be nice to control the use of unicode without rebuilding. It seems this wouldn't be too difficult to do.

For example, if all instances of
>>> from .config import USE_UNICODE
were replaced with
>>> from . import config
(and uses of ``USE_UNICODE`` were replaced with ``config.USE_UNICODE``), then you could do

>>> import quantities as pq
>>> pq.config.USE_UNICODE = False

I'd understand if you considered this bad form (is this considered monkeypatching?).
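
Why the rename matters can be sketched as follows. This is an illustration only: the `config` object below is a hypothetical stand-in built with `types.ModuleType`, not the real quantities/config.py, and the two helper functions are invented for the example. A `from ... import` statement copies the value into a new name at import time, while an attribute lookup on the module sees later changes.

```python
import types

# Hypothetical stand-in for quantities/config.py (not the real module).
config = types.ModuleType("config")
config.USE_UNICODE = True

# Equivalent of `from .config import USE_UNICODE`: binds a second name
# to the value the flag has at import time.
USE_UNICODE = config.USE_UNICODE


def str_with_copied_name():
    # Reads the module-level copy, which later assignments do not touch.
    return "unicode" if USE_UNICODE else "ascii"


def str_with_module_lookup():
    # Equivalent of `from . import config` plus `config.USE_UNICODE`:
    # the attribute is looked up again on every call.
    return "unicode" if config.USE_UNICODE else "ascii"


# What a user would do at runtime.
config.USE_UNICODE = False

print(str_with_copied_name())
print(str_with_module_lookup())
```

Only the second function reacts to the runtime change, which is the behaviour the proposed patch relies on.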

Patch attached below for completeness, but it's really just the renaming that I mentioned above.

Cheers,
-Tony

=== modified file 'quantities/dimensionality.py'
--- quantities/dimensionality.py 2009-02-22 22:58:09 +0000
+++ quantities/dimensionality.py 2009-05-11 23:13:57 +0000
@@ -7,7 +7,7 @@

 import numpy as np

-from .config import USE_UNICODE
+from . import config
 from .markup import format_units, format_units_unicode
 from .registry import unit_registry
 from .utilities import memoize
@@ -187,7 +187,7 @@
             % ', '.join(['%s: %s'% (u.name, e) for u, e in self.iteritems()])

     def __str__(self):
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             return self.unicode
         else:
             return self.string

=== modified file 'quantities/markup.py'
--- quantities/markup.py 2009-02-21 13:28:49 +0000
+++ quantities/markup.py 2009-05-11 23:13:56 +0000
@@ -6,7 +6,7 @@
 import operator
 import re

-from .config import USE_UNICODE
+from . import config

 superscripts = ['⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹']

@@ -37,7 +37,7 @@
     ]
     for key in keys:
         d = udict[key]
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             u = key.u_symbol
         else:
             u = key.symbol

=== modified file 'quantities/quantity.py'
--- quantities/quantity.py 2009-03-07 17:32:26 +0000
+++ quantities/quantity.py 2009-05-11 23:13:54 +0000
@@ -7,7 +7,7 @@

 import numpy as np

-from .config import USE_UNICODE
+from . import config
 from .dimensionality import Dimensionality, p_dict
 from .registry import unit_registry
 from .utilities import with_doc
@@ -266,7 +266,7 @@

     @with_doc(np.ndarray.__str__)
     def __str__(self):
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             dims = self.dimensionality.unicode
         else:
             dims = self.dimensionality.string

=== modified file 'quantities/uncertainquantity.py'
--- quantities/uncertainquantity.py 2009-04-01 20:05:00 +0000
+++ quantities/uncertainquantity.py 2009-05-11 23:14:14 +0000
@@ -5,7 +5,7 @@

 import numpy as np

-from .config import USE_UNICODE
+from . import config
 from .quantity import Quantity
 from .registry import unit_registry
 from .utilities import with_doc
@@ -180,7 +180,7 @@

     @with_doc(Quantity.__str__, use_header=False)
     def __str__(self):
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             dims = self.dimensionality.unicode
         else:
             dims = self.dimensionality.string
@@ -189,7 +189,7 @@
             dims,
             str(self.uncertainty)
         )
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             return s.replace('+/-', '±').replace(' sigma', 'σ')
         return s

=== modified file 'quantities/unitquantity.py'
--- quantities/unitquantity.py 2009-02-21 13:28:49 +0000
+++ quantities/unitquantity.py 2009-05-11 23:14:14 +0000
@@ -6,7 +6,7 @@

 import numpy

-from .config import USE_UNICODE
+from . import config
 from .dimensionality import Dimensionality
 from .markup import superscript
 from .quantity import Quantity, get_conversion_factor
@@ -144,7 +144,7 @@
             ref = ''
         symbol = self._symbol
         symbol = ', %s'%(repr(symbol)) if symbol else ''
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             u_symbol = self._u_symbol
             u_symbol = ', %s'%(repr(u_symbol)) if u_symbol else ''
         else:
@@ -156,7 +156,7 @@
     @with_doc(Quantity.__str__, use_header=False)
     def __str__(self):
         if self.u_symbol != self.name:
-            if USE_UNICODE:
+            if config.USE_UNICODE:
                 s = '1 %s (%s)'%(self.u_symbol, self.name)
             else:
                 s = '1 %s (%s)'%(self.symbol, self.name)
@@ -337,7 +337,7 @@

     @property
     def name(self):
-        if USE_UNICODE:
+        if config.USE_UNICODE:
             return '(%s)'%(superscript(self._name))
         else:
             return '(%s)'%self._name

Darren Dale (dsdale24) said :
#2

Hi Tony,

Thanks for the question and for the suggested changes. I think you are right, this should be configurable. I'll add support so it can be modified in a threadsafe manner, and perhaps with an rc setting as well. I'll have to get back to you within a week or two, though.

Darren
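
One way a thread-safe runtime switch could look is sketched below. This is not the implementation that shipped in quantities: the `Config` class, its `use_unicode` property, and the `format_degc` helper are hypothetical names chosen for illustration, and the `degC` / `°C` symbols are assumed from the example in the question.

```python
import threading


class Config:
    """Hypothetical settings holder guarded by a lock (not the quantities API)."""

    def __init__(self, use_unicode=True):
        self._lock = threading.RLock()
        self._use_unicode = bool(use_unicode)

    @property
    def use_unicode(self):
        # Take the lock so a read never interleaves with a write.
        with self._lock:
            return self._use_unicode

    @use_unicode.setter
    def use_unicode(self, value):
        with self._lock:
            self._use_unicode = bool(value)


config = Config()


def format_degc(value):
    # Choose the symbol at call time, so changes to the flag take effect
    # without rebuilding or re-importing anything.
    symbol = "\u00b0C" if config.use_unicode else "degC"
    return "%s %s" % (value, symbol)


print(format_degc(20))
config.use_unicode = False
print(format_degc(20))
```

Keeping the flag behind a property leaves room to add an rc-file default later without changing the call sites.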

Darren Dale (dsdale24) said :
#3

This feature is available as of quantities-0.5.0.
