Extracting an IndirectObject stream

Asked by Giancarlo

I have a PDF file, and this is the snippet that concerns me:

558 0 obj
<</Contents 583 0 R/CropBox[0 0 595.22 842]/MediaBox[0 0 595.22 842]/Parent 29 0 R/Resources
  <</ColorSpace <</CS0 563 0 R>>
    /ExtGState <</GS0 568 0 R>>
    /Font<</TT0 559 0 R/TT1 560 0 R/TT2 561 0 R/TT3 562 0 R>>
    /ProcSet[/PDF/Text/ImageC]
    /Properties<</MC0<</MYOBJECT 584 0 R>>/MC1<</SubKey 582 0 R>> >>
    /XObject<</Im0 578 0 R>>>>
  /Rotate 0/StructParents 0/Type/Page>>
endobj
...
...
...
584 0 obj
<</Length 8>>stream

1_22_4_1 --->>>> this is the string I need to extract from the stream

endstream
endobj

I need to extract the stream associated with the IndirectObject named MYOBJECT, that is, the string 1_22_4_1.
I am able to find the IndirectObject this way:

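import pyPdf

# Open in binary mode; a PDF is not a text file.
pdfFile = pyPdf.PdfFileReader(open("file.pdf", "rb"))
# Page 0, because this stream is always stored on the first page.
pageData = pdfFile.getPage(0)["/Resources"]["/Properties"]
indirectObject = None  # stays None if no entry has the key
for x in pageData:
    if "/MYOBJECT" in pageData[x]:
        # .get() returns the unresolved reference
        indirectObject = pageData[x].get("/MYOBJECT")
print(indirectObject)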

This code gives me this output:

IndirectObject(584, 0)

I realize this may not be the best way to find it; it was only a first attempt.
Can you tell me whether I can get the stream associated with the key MYOBJECT? And, if possible, can you tell me how to improve my code?

Thanks in advance,

Question information

Language: English
Status: Solved
For: pyPdf
Assignee: No assignee
Solved by: Giancarlo

Giancarlo (badblock-email) said :
#1

solved
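
The thread does not record how the problem was solved. One plausible approach, sketched here under assumptions, is to resolve the reference with getObject() and then read the stream with getData(); both are methods that pyPdf's IndirectObject and stream objects are understood to provide. The helper name find_property_stream and its key parameter are hypothetical and not part of pyPdf.

```python
def find_property_stream(page, key="/MYOBJECT"):
    """Return the data of the stream stored under `key` in the page's
    /Resources /Properties dictionary, or None if no entry has that key.

    `page` is assumed to behave like a pyPdf page object: a dictionary
    whose indirect references expose getObject() and whose stream
    objects expose getData().
    """
    properties = page["/Resources"]["/Properties"]
    for name in properties:
        entry = properties[name]
        if key in entry:
            # .get() may hand back an unresolved IndirectObject,
            # so resolve it explicitly before reading the stream.
            stream = entry.get(key).getObject()
            return stream.getData()
    return None


# Usage with pyPdf (assumed to be installed):
#   import pyPdf
#   reader = pyPdf.PdfFileReader(open("file.pdf", "rb"))
#   print(find_property_stream(reader.getPage(0)))
```

The returned data may include the line breaks that surround the payload inside the stream, so calling .strip() on the result is probably needed to get exactly 1_22_4_1.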