Find a word in a string read from a file - encoding problem
Hi,
I have a string that I copied from Notepad, and I need to find a word in this string. I tried to use Python's .find(' '), but it does not work.
if (texto.
    popup("sim")
else:
    popup("não")
Can anyone help me, please?
Question information
- Language: English
- Status: Solved
- For: SikuliX
- Assignee: No assignee
#1
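str.find()
returns the position of the searched string in the given string, between 0 and len(str).
If it is not found, it returns -1.
So "if str.find():" does not work as an existence test: a match at position 0 counts as false, and -1 (not found) counts as true.
If you only want to know whether it exists: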
if -1 < texto.find('gol'): # True if found
or
if (texto.
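To illustrate the point about -1, here is a minimal sketch in plain Python. The sample strings are made up for the example and are not taken from the question. It shows why a bare "if" on find() misleads and how comparing against -1, or using the "in" operator, avoids the problem.

```python
# Hypothetical sample strings, used only for illustration.
texto = "gol de placa"
outro = "sem nada aqui"

# find() returns an index; a match at the very start is index 0, which is falsy.
pos_found = texto.find("gol")      # 0
pos_missing = outro.find("gol")    # -1, which is truthy

# Misleading: the truth value of the index is the opposite of what is wanted here.
naive_found = bool(pos_found)      # False although "gol" is present
naive_missing = bool(pos_missing)  # True although "gol" is absent

# Reliable: compare against -1, or use the `in` operator.
found = texto.find("gol") != -1
found_in = "gol" in texto
missing_in = "gol" in outro

print(found)       # True
print(found_in)    # True
print(missing_in)  # False
```

If only a yes/no answer is needed, the "in" form is usually the easier one to read.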
#2
Hi, thanks for the answer, but now I have another problem.
I get a string from Notepad, and I don't know why .find() still does not work.
Here is the code:
xReader = file("C:
for line in xReader:
line.
infoText.
if (-1 == infoText[
popup("sim")
type(Key.TAB)
else:
type(Key.RIGHT)
type(Key.TAB)
##########
#text from notepad
Fernando M.D.F Gandini
Física
05439120
Thanks
#3
OK, now I know the real problem.
I have to compare the string "Física" that I get from Notepad with the string "Física" that I typed into the Sikuli script.
But because of the encoding, Sikuli compares "Física" with "FÃ-sica".
How can I solve that?
Thanks
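For illustration, the garbled form above is what the UTF-8 bytes of "Física" look like when they are interpreted as a single-byte encoding. The sketch below reproduces this in plain Python. It assumes Latin-1 as the wrong encoding (the actual code page on the machine in question is not known) and uses unicode strings, which may differ from how the original script handled the data.

```python
# -*- coding: utf-8 -*-
text = u"Física"

# In UTF-8 the letter í is stored as two bytes: 0xC3 0xAD.
utf8_bytes = text.encode("utf-8")

# Wrong: treating every byte as one Latin-1 character splits í in two.
garbled = utf8_bytes.decode("latin-1")

# Right: decoding with the encoding the bytes were written in.
repaired = utf8_bytes.decode("utf-8")

print(len(utf8_bytes))   # 7 bytes for 6 characters
print(garbled == text)   # False: this is why the comparison fails
print(repaired == text)  # True
```

The second character of the garbled pair is U+00AD (soft hyphen), which is why it tends to show up as "-" or as nothing at all.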
#4
If I put the string that I get from the file into a popup, the encoding is wrong, but if I print it to the console it appears right.
Why does this happen?
Thanks
#5
I have a similar problem with German special characters. You could try setting the encoding by adding this line as the first line of your test:
# -*- coding: utf-8 -*-
But this did not help me in all cases. There seem to be some internal problems with the encoding, but I don't know how to work around them.
#6
@ j-the-k
# -*- coding: utf-8 -*-
is no longer needed with Sikuli X (done internally automagically) in scripts when running them with Sikuli's running features.
You are right, this is an encoding problem.
I will come back soon with an answer.
#7
--- file has utf-8 encoding already
This is the best case, since you have to do nothing else in your Sikuli scripts.
If you want to have choices and more features, use Notepad++ instead (http://
Coding: UTF-8 without BOM
(BOM would add additional 3 bytes at the beginning of the file!)
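As a small illustration of the BOM remark, the following sketch (plain Python, standard codecs module only, sample text chosen for the example) shows the three extra bytes and what they do to a comparison. The "utf-8-sig" codec, which drops a leading BOM while decoding, is mentioned here as one possible way to cope with such files. It is not something suggested in this thread.

```python
# -*- coding: utf-8 -*-
import codecs

content = u"Física"

# A file saved as "UTF-8 with BOM" starts with these three bytes.
bom_length = len(codecs.BOM_UTF8)
with_bom = codecs.BOM_UTF8 + content.encode("utf-8")

# Plain utf-8 decoding keeps the BOM as an invisible first character (U+FEFF).
plain = with_bom.decode("utf-8")

# utf-8-sig removes a leading BOM if one is present.
stripped = with_bom.decode("utf-8-sig")

print(bom_length)           # 3
print(plain == content)     # False: the hidden BOM character gets in the way
print(stripped == content)  # True
```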
--- file does not have utf-8 encoding
This is normally the case when saving files on Windows with plain Notepad (it uses an extended ASCII encoding called ANSI, e.g. Latin-1 for Western Europe, depending on the locale).
The following works if there are embedded utf-8 characters in a file that is read as a byte string (normal operation):
infoText = []
xReader = file("C:
for line in xReader.
infoText.
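As a complement, here is a self-contained sketch of one way to read a file that is not UTF-8, using the standard codecs.open() so that decoding happens while reading. This is a different technique from the snippet above, not a reconstruction of it. The cp1252 encoding is an assumption (the real "ANSI" code page depends on the locale), and the temporary sample file stands in for the real file, whose path is not known here.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Assumption: the file was saved with the Western European "ANSI" code page.
ASSUMED_ENCODING = "cp1252"

# Create a small stand-in file so the sketch can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
f = open(path, "wb")
f.write(u"Fernando Gandini\r\nFísica\r\n".encode(ASSUMED_ENCODING))
f.close()

# Read it back; codecs.open() decodes each line to a unicode string.
infoText = []
xReader = codecs.open(path, "r", ASSUMED_ENCODING)
for line in xReader:
    infoText.append(line.strip())   # strip() removes the line ending
xReader.close()

# Compare against a unicode literal, so both sides are decoded text.
print(infoText[1] == u"Física")
```

With both sides held as unicode strings, the comparison no longer depends on which byte encoding the file or the script happens to use.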
#8
Hi RaiMan, thanks a lot for the answers.
But now I have another strange problem:
read from UTF-8 file --------
Fernando Gandini
Física
-------
infoText = []
xReader = file("C:
for line in xReader:
line.
infoText.
popup(infoText[1])
if (infoText[1] == "Física"):
    popup("certo")
else:
    popup("errado")
print infoText[1]
exit()
-------
popup(infoText[1]) ------> appears with the wrong encoding
if (infoText[1] == "Física"): --------> returns true, so the file is read with the correct encoding
print infoText[1] ------------> appears correctly in the console
#9
popup() has a problem with utf-8 characters (known problem).
this should work:
popup(infoText[
as you might have found out already:
popup("Física")
does not work either.
#10
Hey RaiMan, thanks a lot, this works fine.
Thanks again.
Bye
#12
@RaiMan
Yesterday I created a script with the IDE on Linux, checked it into a Subversion repository, checked it out on a Windows machine and ran it. It contained the letters "äüöß" and produced a "wrong encoding" exception unless # -*- coding: utf-8 -*- was added to the file. I use rc3.
#13
@j-the-k
This is what is internally added to the beginning of every script that is run either from the IDE or from the command line, using either sikuli-ide.jar (as with the .bat files) or sikuli-script.jar via <java -jar sikuli-script.jar some.sikuli>:
"# coding=utf-8",
"from __future__ import with_statement",
"from sikuli import *",
If you use any other method to run your Sikuli Jython scripts (plain Jython, Eclipse, Netbeans or whatever), you have to take care of the file encoding yourself (utf-8 is recommended).
BTW: when adding a comment that might provoke an answer, please subscribe to the question.
#14
I ran the script with the sikuli-ide.sh but not directly. I use several Sikuli-modules, and the one with "üäöß" in it is one that is imported by the one I executed. Like this:
<file1.sikuli>
print "äüöß"
<file2.sikuli>
import file1
sikuli-ide.sh -r file2.sikuli
=> encoding problem if file1 does not contain # -*- coding: utf-8 -*-
So maybe the encoding line is not added to every Sikuli file, but only to the ones that are executed directly, and not to the ones that are imported?
#15
@ j-the-k
Good finding :-)
Yes, that is the difference. The imported scripts/modules are not manipulated, only the main script that is run by Sikuli. That is also the reason why you have to add "from sikuli import *" yourself to scripts that you want to import.
I will add a remark to the docs.
Thanks for evaluating this situation.
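Putting the two points above together, an imported module might start like the sketch below. The file name follows the file1 example from this thread. The try/except around the Sikuli import is an addition made only so that the sketch also runs in plain Python, where no sikuli package exists. Inside Sikuli the import line alone would be used.

```python
# -*- coding: utf-8 -*-
# file1.sikuli/file1.py: a module meant to be imported by another script.
# The coding line above has to be added by hand, because Sikuli only
# prepends its header to the main script, not to imported modules.

try:
    from sikuli import *   # also has to be added by hand in imported modules
except ImportError:
    pass                   # assumption: lets the sketch run outside Sikuli

def umlauts():
    # Non-ASCII text in the source; this is what needs the coding declaration.
    return u"äüöß"

print(len(umlauts()))   # 4
```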