Not getting full content

Asked by Mr.Show

OK, I have a little problem: I am not getting the full text content. It seems to be cut off right before the end. I know it is not my server, because it happens on your site too.

Sample link:
http://fivefilters.org/content-only/makefulltextfeed.php?url=feeds.feedburner.com%2Fwowebook&max=3

What's missing is the most important part of the articles: the 3 download links at the bottom.

I know it is possible because a company called foobla.com uses it to make a Joomla component that grabs the whole content, including the links. I contacted them to find out what modifications they made to full RSS, but as always it is all about the money, so they basically told me to buzz off.

Point me to the right file and lines of code to look at in your script, and maybe I will be able to figure something out.

One more thing: I did not pass the HttpRequestPool. Does that have anything to do with it? All help is much appreciated. Thanks.

Question information

Language: English
Status: Answered
For: Five Filters
Assignee: No assignee
Keyvan (keyvan) said :
#1

Extraction is not 100% accurate. Depending on how the page is marked up, certain elements might be excluded from the result. Readability, the content extraction code we use, also tries to clean up the detected content block, which can often result in elements being removed from the final result. If you want to have a look at the code, it's in Readability.php.

Oh, and HttpRequestPool support will have no effect on this.
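
To illustrate the kind of clean-up that can drop a block of download links: one heuristic commonly used by Readability-style extractors is link density, the share of a block's text that sits inside links. The sketch below is written in Python purely for illustration; it is not the code in Readability.php, and the function names and the 0.5 threshold are assumptions. Whether this particular heuristic is what removes the links in the sample feed is also an assumption.

```python
from html.parser import HTMLParser


class LinkDensityParser(HTMLParser):
    """Counts all text characters and the ones that sit inside <a> tags."""

    def __init__(self):
        super().__init__()
        self.total_chars = 0
        self.link_chars = 0
        self._anchor_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._anchor_depth > 0:
            self._anchor_depth -= 1

    def handle_data(self, data):
        # Ignore surrounding whitespace so spacing between links is not counted.
        length = len(data.strip())
        self.total_chars += length
        if self._anchor_depth > 0:
            self.link_chars += length


def link_density(html_fragment):
    """Return the fraction of text characters that are inside links (0.0 to 1.0)."""
    parser = LinkDensityParser()
    parser.feed(html_fragment)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.link_chars / parser.total_chars


def looks_like_link_block(html_fragment, threshold=0.5):
    """Hypothetical clean-up rule: flag blocks that are mostly link text."""
    return link_density(html_fragment) > threshold


# Hypothetical fragments: an article paragraph and a block of download links.
article = "<p>This book covers a lot of ground in detail. <a href='#'>More</a></p>"
downloads = (
    "<div><a href='#'>Download 1</a> <a href='#'>Download 2</a> "
    "<a href='#'>Download 3</a></div>"
)

print(looks_like_link_block(article))
print(looks_like_link_block(downloads))
```

A block made up only of links scores the maximum density, so a rule like this would treat it as navigation rather than content. If something similar is at work here, the place to experiment would be the clean-up step in Readability.php.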
