UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

Asked by eduardobedoya

********* question
What is the difference between
App.getClipboard() and
Env.getClipboard()?
********** answer
No difference.
App is the recommended one; using Env is deprecated.
The feature is implemented in class App.

------------------------------------------------

I get this message when importing clipboard content (containing a list) that was obtained from an OCR capture (third-party app):
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

I have used Env.getClipboard().encode('utf-8'), and it allows the script to run even with those UTF-8 characters,
but in the Sikuli IDE Message tab those characters are shown as strange codes, e.g. (é = \xc3\xa9).
Since I need to do fuzzy string matching on the list's items: should I delete those characters with a regex before importing into Sikuli, or can Sikuli work with those unusual characters in strings?
Could you please tell me which ASCII characters are most commonly allowed in Sikuli (Python) lists, variables, and items, so I can delete all the rest using a regex? What other characters should I avoid besides UTF-8 ones?
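
A minimal sketch of the two clean-up options raised in the question, written for standard Python and not verified under SikuliX's bundled Jython. The helper names `strip_non_ascii` and `fold_to_ascii` and the sample items are hypothetical. It also suggests that `difflib` from the standard library can fuzzy-match strings that still contain accented characters, so deleting them may not be required.

```python
import difflib
import re
import unicodedata


def strip_non_ascii(text):
    # Drop every character outside the range \x00-\x7f.
    return re.sub(r"[^\x00-\x7f]", u"", text)


def fold_to_ascii(text):
    # Decompose accented letters (e.g. e-acute becomes "e" plus a combining
    # accent), then drop whatever still has no ASCII equivalent.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Hypothetical OCR result; \u00e9 is the character from the error message.
ocr_items = [u"caf\u00e9", u"paflio"]

stripped = [strip_non_ascii(item) for item in ocr_items]
folded = [fold_to_ascii(item) for item in ocr_items]

# Fuzzy matching on the original, unmodified text.
matches = difflib.get_close_matches(u"cafe", ocr_items)
```

Folding keeps the base letter, which tends to preserve more of the word for matching than deleting the character outright.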

Question information

Language: English
Status: Solved
For: SikuliX
Assignee: No assignee
Solved by: eduardobedoya
eduardobedoya (gonnabdh) said :
#1

So far I have tested the clipboard with characters in the range \x00-\x7f with no errors.
So from now on I only use these characters: `~!@#$%^&*()_+,<.>/?;:'"[{]}\|

RaiMan (raimund-hocke) said :
#2

There is no problem with UTF-8 characters when using sikulix.jar to run a script.

With the latest 1.1.0 running on Java 7+ and using the bundled Jython 2.7, this works:

text = "皇甫春峰"
print "text: ", text
clip = App.getClipboard()
uprint("clip utf-8:", clip) # a Sikulix Jython feature to print strings containing utf-8 characters
print "print normal: ", clip

... before running the script, I make sure (using ctrl/cmd-c) that the 4 Chinese characters are on the clipboard.

This is the output you get:
text: 皇甫春峰
clip utf-8: 皇甫春峰
print normal: [error] script [ Untitled ] stopped with error in line 5
[error] UnicodeEncodeError ( 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128) )
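
The failing plain print appears to come down to the text being encoded with the 'ascii' codec on its way out. The sketch below reproduces that step explicitly in standard Python (an assumption: it does not use SikuliX or `uprint`, and the variable names are only illustrative). It also shows where the \xc3\xa9 seen after `.encode('utf-8')` comes from.

```python
# -*- coding: utf-8 -*-
text = u"皇甫春峰"  # the same four characters as in the script above

# Encoding to ASCII fails for any character above \x7f.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    reason = str(err)  # "'ascii' codec can't encode characters ..."

# Encoding to UTF-8 always succeeds, but the result is a byte string,
# so a message window may show the raw bytes instead of the characters.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")  # decoding restores the text

# e-acute becomes the two bytes \xc3 \xa9 in UTF-8.
e_acute_bytes = u"\u00e9".encode("utf-8")
```

Keeping the clipboard content as text and encoding only at the point of output avoids both the error and the byte-code display.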

eduardobedoya (gonnabdh) said :
#3

Thanks RaiMan

eduardobedoya (gonnabdh) said :
#4

Thanks RaiMan.
I said so
because I was having problems when using lists whose items contain those characters.
In this case...
a = App.getClipboard();
aAsList = eval(a)
If the clipboard contains some of those characters, the script stops working and a Unicode error appears in the Message box.
E.g.
a =
'corazona', 'Pebleo00', 'cofriasd', 'paflio'
then the script works fine.
a =
'corazona«', 'Pebleo00»', 'cofriasd¾', 'paflio'
then the script doesn't work and a Unicode error appears in the Message box.

I guess it is a Python thing. I have already worked around it; perhaps this could be useful to you.
Thanks, man.
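
For turning clipboard text like the samples above into a list, one possible alternative to `eval` is `ast.literal_eval`, which only accepts literals. This is a sketch in standard Python, not the workaround the poster used, and its behaviour under Jython 2.7 is not verified here. The helper name `clipboard_to_list` is hypothetical, and the sample string stands in for the value returned by `App.getClipboard()`.

```python
import ast


def clipboard_to_list(clip_text):
    # Parse text such as "'a', 'b'" or "['a', 'b']" into a list.
    # literal_eval evaluates literals only, never arbitrary code.
    value = ast.literal_eval(clip_text.strip())
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# Stand-in for the clipboard content; \u00ab, \u00bb and \u00be are the
# three non-ASCII characters from the failing sample.
sample = u"'corazona\u00ab', 'Pebleo00\u00bb', 'cofriasd\u00be', 'paflio'"
items = clipboard_to_list(sample)
```

Because the OCR text is untrusted input, restricting parsing to literals also avoids running whatever happens to be on the clipboard.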