Title lost: this question was overwritten by a spammer

Asked by kumatta

The original question content was lost: it was changed by a spammer.

Question information

Language: English
Status: Solved
For: SikuliX
Assignee: No assignee
Solved by: kumatta
RaiMan (raimund-hocke) said :
#1

I can confirm this problem, but only on Windows (tested with version 1.1.0, which contains Jython 2.7b2, and with Java 7 and 8).

- using the codecs module leads to the weird behavior mentioned.
This seems to be a Jython problem that shows up when the same interpreter instance is used again (as is the case in the Sikuli IDE when rerunning a script in the same IDE session).

--- running the test script below on Windows:
- the popups show the expected output
- on the command line, in the unicode case, a ? is printed for each unicode character when using the Java println
- the simple Python print refuses to work with unicode characters and fails with a decoding error

this is the test script I used:

import sys
import codecs
import java.lang.System as JS
# uncomment the next two lines to wrap stdin/stdout with the UTF-8 codec
#sys.stdin = codecs.getreader('utf-8')(sys.stdin)
#sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
text = "字"
textu = unicd(text) # a wrapper for unicode(text, "utf-8")
# instead one can use: textu = u"字"
JS.out.println("plainJ: " + text)
JS.out.println("unicodeJ: " + textu)
popup("plain: " + text)
popup("unicode: " + textu)
print "plain:", text
print "unicode:"
print unicode(text,"utf-8")

--- getting this output on Mac
plainJ: 字
unicodeJ: 字
plain: 字
unicode:

with the popups showing the expected output.

--- conclusion:
This seems to be a Jython problem on Windows.
I currently have no idea how to solve it.
I will treat it as a bug.
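
One thing that might be worth experimenting with (an assumption, not something confirmed in this thread) is making sure the codecs wrapper is applied only once when the same interpreter instance is reused. A minimal sketch in plain Python: the helper name utf8_writer is made up for illustration, and an in-memory byte buffer stands in for sys.stdout so the sketch does not touch the real console.

```python
import codecs
import io


def utf8_writer(stream):
    """Return a UTF-8 writer for stream, wrapping it at most once.

    Hypothetical helper: a stream that is already a codecs writer is
    returned unchanged, so calling this again in the same interpreter
    session does not stack a second wrapper on top of the first.
    """
    if isinstance(stream, codecs.StreamWriter):
        return stream
    return codecs.getwriter("utf-8")(stream)


# In-memory stand-in for the console (assumption for this sketch).
buf = io.BytesIO()
first = utf8_writer(buf)     # first "run": wraps the raw stream
second = utf8_writer(first)  # second "run": reuses the existing wrapper
second.write(u"\u5b57")      # the character from the test script
```

Whether a guard like this changes the behavior in the Sikuli IDE on Windows is untested here.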

kumatta (kouji) said :
#2

Thank you so much, RaiMan.
I tried your script in Sikuli.

01: #!/usr/bin/env python
02: # -*- coding: utf-8 -*-
03: import sys
04: import codecs
05: import java.lang.System as JS
06: #sys.stdin = codecs.getreader('utf-8')(sys.stdin)
07: #sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
08: text = "字"
09: textu = unicd(text) # a wrapper for unicode(text, "utf-8") # instead one can use: text = u"字"
10: JS.out.println("plainJ: " + text)
11: JS.out.println("unicodeJ: " + textu)
12: popup("plain: " + text)
13: popup("unicode: " + textu)
14: print "plain:", text
15: print "unicode:"
16: print unicode(text,"utf-8")

--------first execution---------
plainJ: 字
unicodeJ: 字

plain: 字
unicode:
[error] script [ ttt2 ] stopped with error in line 16
[error] UnicodeEncodeError ( 'ascii' codec can't encode character u'\u5b57' in position 0: ordinal not in range(128) )

--------second execution----------
plainJ: 字
unicodeJ: 字

plain: 字
unicode:
[error] script [ ttt2 ] stopped with error in line 16
[error] UnicodeEncodeError ( 'ascii' codec can't encode character u'\u5b57' in position 0: ordinal not in range(128) )

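The 1st and the 2nd run gave the same result: everything was displayed correctly except the final print of the unicode string (line 16), which stopped with an error.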
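
For readers following along: a UnicodeEncodeError that names the 'ascii' codec, as in the output above, generally means that a unicode object was encoded with the ascii codec somewhere on the way to the console. The small sketch below reproduces that in plain Python, independent of Sikuli; the variable names are only illustrative.

```python
text = u"\u5b57"  # the character used in the test script, as a unicode object

# Encoding with the ascii codec fails for any character above U+007F.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly as UTF-8 yields the three bytes for this character.
utf8_bytes = text.encode("utf-8")
```

This only illustrates the error message itself; it does not explain why the behavior differs between Windows and Mac.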
========================
Then I enabled line 07, "sys.stdout = codecs.getwriter('utf-8')(sys.stdout)", by removing the comment mark.

--------first execution---------
plainJ: 字
unicodeJ: 字

plain: 字
unicode:

--------second execution----------
plainJ: 字
unicodeJ: 字

plain: 字
unicode:
字

--------third execution----------
plainJ: 字
unicodeJ: 字

plain: 字
unicode:
字

The behavior becomes strange whenever "sys.stdout = codecs.getwriter('utf-8')(sys.stdout)" is executed.
I have given up on using "print" for double-byte characters.
I am considering using Java's println instead.

Thanks.

--kumatta--
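
As an alternative to print for double-byte characters, one could try encoding the text explicitly and writing the bytes directly. This is only a sketch under assumptions: write_utf8_line is a made-up helper name, and an in-memory byte buffer stands in for the real output stream, so how it behaves on the Windows console under Sikuli is not verified here.

```python
import io


def write_utf8_line(stream, text):
    """Write text plus a newline to a byte stream as UTF-8.

    Hypothetical helper: unicode text is encoded explicitly, so no
    implicit ascii encoding is involved. Byte strings are assumed to
    be UTF-8 already and are passed through unchanged.
    """
    if isinstance(text, bytes):
        data = text
    else:
        data = text.encode("utf-8")
    stream.write(data + b"\n")


# In-memory stand-in for the output stream (assumption for this sketch).
out = io.BytesIO()
write_utf8_line(out, u"unicode: \u5b57")
```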

RaiMan (raimund-hocke) said :
#3

Thanks for the feedback.

I hope I can fix many of these unicode quirks in version 1.2 in the coming months.