How to compare strings from two different text files?

Asked by koventhan

 I want to take one string from file x and another string from file y and compare the two strings.

Example :

 File x - student_list.txt

1. Student Name - Kumar
2. Student ID- 12345
3. Student percentage - 75%

 File y - student_list.txt

1. Student Name - Kumar.A
2. Student ID- 6789
3. Student percentage - 90%

From these files, I want to take the student names and compare them. If both are the same, it should print PASS, otherwise FAIL. Could you please help me with this?

Question information

Language: English
Status: Solved
For: SikuliX
Assignee: No assignee
Solved by: RaiMan

RaiMan (raimund-hocke) said :
#1

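# file1 and file2 are the paths of the two files to compare
lines1 = open(file1).readlines()
lines2 = open(file2).readlines()

# files with a different number of lines can never be equal
shouldFail = len(lines1) != len(lines2)
if not shouldFail:
    for i in range(len(lines1)):
        if lines1[i] != lines2[i]:
            shouldFail = True
            break
if shouldFail:
    print "files do not have equal content"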

koventhan (kingsaa2000) said :
#2

I am very new to this, and I think what I am asking is a very basic question, sorry about that.

I want to grep "Student Name -" from each file (the output of this grep would be Kumar for file1 and Kumar.A for file2) and compare the two outputs.

Like this:

From file1 - Student Name - Kumar

From file2 - Student Name - Kumar.A

Then I want to compare both. The result should be FAIL.

RaiMan (raimund-hocke) said :
#3

do both files only contain these 3 lines?
... or do both have some/many of these blocks
1. Student Name - Kumar
2. Student ID- 12345
3. Student percentage - 75%

koventhan (kingsaa2000) said :
#4

I gave these 3 lines just as an example, but there are many lines, possibly more than 2000.

Among them I want to grep "Student Name - " and get "Kumar" as the output from file1, and likewise from file2, and then compare the two outputs.

koventhan (kingsaa2000) said :
#5

The Student Name line is not the first line and its position is not fixed, so I want to grep the output of "Student Name - " from both files.

RaiMan (raimund-hocke) said :
#6

The solution heavily depends on what you want to do after you have compared 2 names.

If you go sequentially through file1 and grab one name after the other:
Do you want to compare against all names in file2 or only against the next one (which would mean that both have to be in the same sorted order)?
What will you do if there is no match?

Do you want to keep the additional 2 lines from both files on match?

What will you do, if you have a match?

koventhan (kingsaa2000) said :
#7

There is only one "Student Name -" line in each file, but each file has a different student name. So I want to grep this field and compare it with the other file: are the two student names the same or not?

The file looks like this:

Student Name - AAAA
Student ID- 12345
Student percentage - 75%
.....
....
....
.....
.....
.....

.....
.....

There will be 2000+ lines like this, but only one student name field is present (note: it may be on the 1st, the 100th or the 2000th line, so its position is not fixed).

Once I have the output from each file, I need to check whether both files have the same name or different names.

For example, file1 has Kumar and file2 has Siva. The names are different, so the result is false.

RaiMan (raimund-hocke) said :
#8

ok, finally understood ;-)

def getName(file):
    f = open(file)
    name = ""
    for line in f.readlines():
        if line.count("Student Name") == 0: continue
        (head, name) = line.split("-", 1)
        name = name.strip()
        break
    f.close()
    return name

file1 = "some absolute file name"
file2 = "some other absolute file name"

name1 = getName(file1)
name2 = getName(file2)
if name1 == name2:
    print "we have a match:", name1, "in:", file1, "and:", file2
else:
    print "different names:", name1, name2, "in:", file1, "and:", file2

koventhan (kingsaa2000) said :
#9

I do not understand this script. Can you please give a little more information?

my_dir = "C:\\Program Files\\Sikuli X\\"

1st file - Student1_list.txt

It has the following information:

Student Name - Kumar
Student ID- 12345
Student percentage - 75%
...
...
...
...
etc

2nd file - Student2_list.txt

It has the following information:

Student Name - Siva
Student ID- 6789
Student percentage - 90%
...
...
...
...
etc

Please let me know whether the script below is fine or not.

def getName(file):
    f = open(file)
    name = ""
    for line in f.readlines():
        if line.count("Student Name") == 0: continue
        (head, name) = line.split("-")
        name = name.strip()
        break
    f.close()
    return name

file1 = "C:\\Program Files\\Sikuli X\\Student1_list.txt"
file2 = "C:\\Program Files\\Sikuli X\\Student2_list.txt"

name1 = getName(file1)
name2 = getName(file2)
if name1 == name2:
    print "we have a match:", name1, "in:", file1, "and:", file2
else:
    print "different names:", name1, name2, "in:", file1, "and:", file2

koventhan (kingsaa2000) said :
#10

I was using the script below to grep the Student Name output from file1, but I am getting "TypeError::expected str or unicode but got"

my_dir = "C:\\Program Files\\Sikuli X\\"
import re
line1 = open (my_dir+"file1.txt").readlines()
m_obj = re.search(r"Student\s*Name\s*-\s*(\S*)", line1)
print m_obj.group(1)

But if I use it as below, without reading a file, it works fine:

my_dir = "C:\\Program Files\\Sikuli X\\"
import re
line1 = "Student Name - Kumar"
m_obj = re.search(r"Student\s*Name\s*-\s*(\S*)", line1)
print m_obj.group(1)

Please let me know what the issue with my code is.

Best RaiMan (raimund-hocke) said :
#11

--- first
this is my optimized solution with some comments

import os

# finds and returns the name in a file
def getName(file):
    f = open(file)
    name = ""
    for line in f.readlines():
        # if line does not contain the token read next one
        if line.count("Student Name") == 0: continue
        # we split the line at the first hyphen only,
        # so a name that itself contains a hyphen stays intact;
        # the trailing part goes to name
        (head, name) = line.split("-", 1)
        # we strip the leading/trailing whitespace
        name = name.strip()
        # found, so we can leave
        break
    f.close()
    return name

# compares the names in 2 files, prints a log
# and returns the equal name or None if not equal
def compareNames(file1, file2):
    name1 = getName(file1)
    name2 = getName(file2)
    if name1 == name2:
        print "we have a match:", name1, "in:", file1, "and:", file2
        return name1
    else:
        print "different names:", name1, name2, "in:", file1, "and:", file2
        return None

# the basedir for the files
dir = "/Users/rhocke/Desktop/Sikuli/koventan"
# creates the file names
file1 = os.path.join(dir, "file1.txt")
file2 = os.path.join(dir, "file2.txt")
file3 = os.path.join(dir, "file3.txt")

result = compareNames(file1, file2)
print "returns:", result
# one could now decide how to proceed
if not result:
    print "proceed on no match"
else:
    print "proceed on match"

result = compareNames(file1, file3)
print "returns:", result
# one could now decide how to proceed
if not result:
    print "proceed on no match"
else:
    print "proceed on match"

and produces:

different names: Kumar Siva in: /Users/rhocke/Desktop/Sikuli/koventan/file1.txt and: /Users/rhocke/Desktop/Sikuli/koventan/file2.txt
returns: None
proceed on no match
we have a match: Kumar in: /Users/rhocke/Desktop/Sikuli/koventan/file1.txt and: /Users/rhocke/Desktop/Sikuli/koventan/file3.txt
returns: Kumar
proceed on match

file3 has same content as file1

--- second on comment #10
after
line1 = open (my_dir+"file1.txt").readlines()
line1 is a list (array) of the contained lines including line breaks.

... but re.search() needs a string (means one line in this case)
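
To illustrate the point about lists versus strings, here is a minimal sketch that applies the pattern from comment #10 to one line at a time. The helper name find_name and the sample list are made up for illustration; the list only stands in for what readlines() returns. The print call takes a single argument so the snippet should behave the same whether print is a statement or a function.

```python
import re

# same pattern as in comment #10
PATTERN = re.compile(r"Student\s*Name\s*-\s*(\S*)")

def find_name(lines):
    """Return the first captured name in a list of lines, or None."""
    for line in lines:
        # each element is a single string, which is what the regex search expects
        m_obj = PATTERN.search(line)
        if m_obj:
            return m_obj.group(1)
    return None

# hypothetical stand-in for open(...).readlines():
# a list of strings, each still carrying its line break
sample_lines = ["Student ID- 12345\n", "Student Name - Kumar\n"]
print(find_name(sample_lines))
```

Passing the whole list to re.search() in one call is what triggers the TypeError; looping over the elements avoids it.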

You might integrate the usage of RegEx's into my solution (looks more professional ;-)

My solution has the advantage, that the key functions are packed in def()s, so you can concentrate on the workflow.
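
As a rough sketch of how a RegEx might be folded into the def()-based approach, the following variant returns PASS or FAIL as asked in the original question. The names NAME_RE, get_name and compare_names are hypothetical, the line format "Student Name - <name>" is assumed from the examples in this thread, and treating a file without any name line as FAIL is my own assumption, not something the thread specifies.

```python
import re

# assumed line format: "Student Name - <name>"; the name may contain spaces
NAME_RE = re.compile(r"Student\s*Name\s*-\s*(.*)")

def get_name(path):
    """Return the name from the first 'Student Name' line, or "" if none."""
    with open(path) as f:
        # iterate line by line, so each search gets a single string
        for line in f:
            match = NAME_RE.search(line)
            if match:
                # strip the line break and surrounding whitespace
                return match.group(1).strip()
    return ""

def compare_names(path1, path2):
    """Return "PASS" if both files carry the same name, else "FAIL"."""
    name1 = get_name(path1)
    name2 = get_name(path2)
    # assumption: a missing name in either file counts as FAIL
    if name1 and name1 == name2:
        return "PASS"
    return "FAIL"
```

With file1 and file2 built as in the solution above, print(compare_names(file1, file2)) would report FAIL for Kumar versus Siva.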

koventhan (kingsaa2000) said :
#12

Thanks for the detailed information :) :) It is very useful for me since I am very new to scripting. Thanks a lot, RaiMan :) :)

koventhan (kingsaa2000) said :
#13

Thanks RaiMan, that solved my question.