Region text read not always returning correct text

Asked by Mark McGuinn

I have an auction app which reads the region on the screen containing information on the current bid position and the amount the vendor wants in order to sell the item immediately. The fields appear on the screen as in the following example.

(CR) 4,000 (CR) 20000

Because the size of the area containing the bid and ask amounts can vary, I define a region which contains all of the information, read its text, and then manipulate that text to extract the numbers as integers. I do this by looking for a special character in the text (the ")" character, ascii 169) and using it to split out each number, which I then convert into an integer. I have included a simple test piece of code below.

The problem is that after a variable number of reads, the find call returns -1, indicating that the special character was not found even though it was on the screen. If I rerun the code on the same image it works, with nothing changed in either the code or on the screen.

My question therefore is whether the region.text() call is problematic or if there is an issue with the way I am doing things. As usual any help would be much appreciated.

The code snippet:

import re

# Choose which screen region to read (0 = first layout, otherwise the second).
loc_code = 0
if loc_code == 0:
    ascii_reg = Region(460, 381, 366, 41)
else:
    ascii_reg = Region(561, 762, 258, 30)

# OCR the region.
ascii_text = ascii_reg.text()
ore_count = 0
ore = len(ascii_text)

# Debug dump: print every character returned by OCR and its code.
while ore_count < ore:
    print("Character is ...", ascii_text[ore_count])
    print("Character is ...", ord(ascii_text[ore_count]))
    ore_count += 1

# Locate the two marker characters (code 169).
result = ascii_text.find(chr(169))
print("The result is -->", result)
result2 = ascii_text.find(chr(169), result + 1)
print("The second result is -->", result2, result + 1)

# The bid lies between the two markers; keep only its digits.
bid_val = ascii_text[result:result2]
# print("The ask is -->",ask_val)
bid_val_i = re.sub("[^0-9]", "", bid_val)
bid_s = bid_val_i.replace(',', '')
int_bid_val = int(bid_s)

# The ask is everything after the second marker; keep only its digits.
ask_val = ascii_text[result2 + 1:ore]
ask_val_i = re.sub("[^0-9]", "", ask_val)
ask_s = ask_val_i.replace(',', '')
int_ask_val = int(ask_s)

print("The bid is ", bid_s)
print("The ask is ", ask_s)

Question information

Language: English
Status: Solved
For: SikuliX
Assignee: No assignee
Solved by: Mark McGuinn
Revision history for this message
Mark McGuinn (mmcguinn) said :
#1

I thought it might be useful to add the output, first for when the program runs correctly:

('Character is ...', '\xc2')
('Character is ...', 194)
('Character is ...', '\xa9')
('Character is ...', 169)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '1')
('Character is ...', 49)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ',')
('Character is ...', 44)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '\xc2')
('Character is ...', 194)
('Character is ...', '\xa9')
('Character is ...', 169)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '1')
('Character is ...', 49)
('Character is ...', '6')
('Character is ...', 54)
('Character is ...', ',')
('Character is ...', 44)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '|')
('Character is ...', 124)
('The result is -->', 1)
('The second result is -->', 11, 2)
('The bid is ', '10000')
('The ask is ', '16000')

and secondly when it doesn't:

('Character is ...', '\xc2')
('Character is ...', 194)
('Character is ...', '\xa9')
('Character is ...', 169)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '1')
('Character is ...', 49)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ',')
('Character is ...', 44)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '@')
('Character is ...', 64)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '1')
('Character is ...', 49)
('Character is ...', '6')
('Character is ...', 54)
('Character is ...', ',')
('Character is ...', 44)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', '0')
('Character is ...', 48)
('Character is ...', ' ')
('Character is ...', 32)
('Character is ...', '|')
('Character is ...', 124)
('The result is -->', 1)
('The second result is -->', 1, 2)
('The bid is ', '')
('The ask is ', '1000016000')

Manfred Hampl (m-hampl) said :
#2

You have to be aware that OCR always has some problems distinguishing special characters like ®, © and @.

Your text is somewhat contradictory. You talk about ascii 169 and ')', but as far as I know code 169 (dec) is © (a Latin-1 code point, outside the 7-bit ASCII range), and ')' is ascii code 41 (dec).

The decoding log of your program shows (in the second round) @ 64(dec) instead.

If you use copy/paste on the values on screen, what do you get?
And what are the contents of ascii_text immediately after the "ascii_text = ascii_reg.text()" statement?
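
For reference, the code points discussed in this reply can be checked directly. The following is a minimal sketch in plain Python with no SikuliX calls; the variable names are illustrative only. It also shows why the character dump above contains the pair 194, 169: that is how UTF-8 encodes © as two bytes.

```python
# -*- coding: utf-8 -*-
# Code points of the characters mentioned in this thread.
copyright_sign = u"\u00a9"    # the copyright sign
registered_sign = u"\u00ae"   # the registered sign

print(ord(copyright_sign))    # 169
print(ord(registered_sign))   # 174
print(ord(u")"))              # 41
print(ord(u"@"))              # 64

# UTF-8 encodes the copyright sign as two bytes, 0xC2 0xA9, which
# corresponds to the '\xc2' (194) and '\xa9' (169) entries in the dump.
utf8_bytes = copyright_sign.encode("utf-8")
print(list(bytearray(utf8_bytes)))   # [194, 169]
```

Because the marker occupies two positions in the returned string, find(chr(169)) reports index 1 rather than 0 in the logs above.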

Mark McGuinn (mmcguinn) said :
#3

As requested, the contents of ascii_text after the ascii_reg.text() call when the program works are:

('The ascii text is-->', '\xc2\xa9 40,000 \xc2\xa9 78,000 H')

and when it doesn't is:

('The ascii text is-->', '\xc2\xa9 40,000 @ 78,000 ;')

It is not possible to copy/paste from the screen because the area is a 'button', and if you try to right-click on it to select it, the button is triggered.

Manfred Hampl (m-hampl) said :
#4

What about trying to split the string at the blanks and taking the second and fourth piece? Then it does not matter whether it's \xc2\xa9 or @.
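
A minimal sketch of that idea, assuming the OCR text always has the layout marker, bid, marker, ask, separated by blanks. The helper name parse_bid_ask is hypothetical, and the two sample strings are the ones posted in #3.

```python
def parse_bid_ask(ocr_text):
    """Return (bid, ask) as integers from whitespace-separated OCR text.

    Assumes the pieces are: marker, bid, marker, ask, optional noise.
    The marker pieces are never inspected, so an OCR misread of the
    marker (for example '@') does not affect the result.
    """
    pieces = ocr_text.split()            # split on any run of whitespace
    if len(pieces) < 4:
        raise ValueError("unexpected OCR text: %r" % (ocr_text,))
    bid = int(pieces[1].replace(",", ""))    # second piece
    ask = int(pieces[3].replace(",", ""))    # fourth piece
    return bid, ask


good_read = "\xc2\xa9 40,000 \xc2\xa9 78,000 H"
bad_read = "\xc2\xa9 40,000 @ 78,000 ;"
print(parse_bid_ask(good_read))   # (40000, 78000)
print(parse_bid_ask(bad_read))    # (40000, 78000)
```

In the script from the question, the argument would presumably be the value returned by ascii_reg.text(). If OCR misreads a digit as well (say 'O' for '0'), int() raises ValueError, which the caller could catch in order to retry the read.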

Mark McGuinn (mmcguinn) said :
#5

The suggestion to use the space character as the delimiter instead worked, thanks!
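
A possible variant of the same idea, offered only as a sketch and not something tested in this thread: pull out the digit groups with a regular expression so that neither the marker nor the exact spacing matters. The helper name extract_amounts is hypothetical.

```python
import re


def extract_amounts(ocr_text):
    """Return every digit group in ocr_text as an int, ignoring commas."""
    # A group starts with a digit and may continue with digits or commas.
    groups = re.findall(r"\d[\d,]*", ocr_text)
    return [int(group.replace(",", "")) for group in groups]


print(extract_amounts("\xc2\xa9 40,000 @ 78,000 ;"))   # [40000, 78000]
```

This assumes OCR never turns the marker or the trailing noise into a digit; if it did, an extra number would appear in the list, so checking that exactly two values come back would be a sensible guard.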