XML Parsing Error: junk after document element

Asked by vladimyr Wilson

XML Parsing Error: junk after document element
Location: http://paagamedia.com/full-text-rss/makefulltextfeed.php?url=http%3A%2F%2Fwww.npr.org%2Frss%2Frss.php%3Fid%3D1002&key=&max=4&submit=Create+Feed
Line Number 2, Column 1:<b>Warning</b>: iconv() [<a href='function.iconv'>function.iconv</a>]: Charset parameter exceeds the maximum allowed length of 64 characters in <b>/chroot/home/paaga/paagamedia.com/html/full-text-rss/makefulltextfeed.php</b> on line <b>136</b><br />
^

I get this error right after installation, the first time I try to create a feed. What can I do to fix it?
Thank you.

Question information

Language:
English
Status:
Answered
For:
Five Filters
Assignee:
No assignee
Keyvan (keyvan) said :
#1

Vladimyr: try replacing the convert_to_utf8 function in makefulltextfeed.php with this one:

function convert_to_utf8($html, $header=null)
{
 $accept = array(
  'type' => array('application/rss+xml', 'application/xml', 'application/rdf+xml', 'text/xml', 'text/html'),
  'charset' => array_diff(mb_list_encodings(), array('pass', 'auto', 'wchar', 'byte2be', 'byte2le', 'byte4be', 'byte4le', 'BASE64', 'UUENCODE', 'HTML-ENTITIES', 'Quoted-Printable', '7bit', '8bit'))
 );
 $encoding = null;
 if ($html || $header) {
  if (is_array($header)) $header = implode("\n", $header);
  if (!$header || !preg_match_all('/^Content-Type:\s+([^;]+)(?:;\s*charset=([^;"\'\n]*))?/im', $header, $match, PREG_SET_ORDER)) {
   // error parsing the response
  } else {
   $match = end($match); // get last matched element (in case of redirects)
   if (!in_array(strtolower($match[1]), $accept['type'])) {
    // type not accepted
    // TODO: avoid conversion
   }
   if (isset($match[2])) $encoding = trim($match[2], '"\'');
  }
  if (!$encoding) {
   if (preg_match('/^<\?xml\s+version=(?:"[^"]*"|\'[^\']*\')\s+encoding=("[^"]*"|\'[^\']*\')/s', $html, $match)) {
    $encoding = trim($match[1], '"\'');
   } elseif(preg_match('/<meta\s+http-equiv=["\']Content-Type["\'] content=["\'][^;]+;\s*charset=([^;"\'>]+)/i', $html, $match)) {
    if (isset($match[1])) $encoding = trim($match[1]);
   }
  }
  if (!$encoding) {
   $encoding = 'utf-8';
  } else {
   if (!in_array(strtolower($encoding), array_map('strtolower', $accept['charset']))) {
    // encoding not accepted
    // TODO: avoid conversion
   }
   if (strtolower($encoding) != 'utf-8') {
    if (strtolower($encoding) == 'iso-8859-1') {
     // replace MS Word smart quotes
     $trans = array();
     $trans[chr(130)] = '&sbquo;'; // Single Low-9 Quotation Mark
     $trans[chr(131)] = '&fnof;'; // Latin Small Letter F With Hook
     $trans[chr(132)] = '&bdquo;'; // Double Low-9 Quotation Mark
     $trans[chr(133)] = '&hellip;'; // Horizontal Ellipsis
     $trans[chr(134)] = '&dagger;'; // Dagger
     $trans[chr(135)] = '&Dagger;'; // Double Dagger
     $trans[chr(136)] = '&circ;'; // Modifier Letter Circumflex Accent
     $trans[chr(137)] = '&permil;'; // Per Mille Sign
     $trans[chr(138)] = '&Scaron;'; // Latin Capital Letter S With Caron
     $trans[chr(139)] = '&lsaquo;'; // Single Left-Pointing Angle Quotation Mark
     $trans[chr(140)] = '&OElig;'; // Latin Capital Ligature OE
     $trans[chr(145)] = '&lsquo;'; // Left Single Quotation Mark
     $trans[chr(146)] = '&rsquo;'; // Right Single Quotation Mark
     $trans[chr(147)] = '&ldquo;'; // Left Double Quotation Mark
     $trans[chr(148)] = '&rdquo;'; // Right Double Quotation Mark
     $trans[chr(149)] = '&bull;'; // Bullet
     $trans[chr(150)] = '&ndash;'; // En Dash
     $trans[chr(151)] = '&mdash;'; // Em Dash
     $trans[chr(152)] = '&tilde;'; // Small Tilde
     $trans[chr(153)] = '&trade;'; // Trade Mark Sign
     $trans[chr(154)] = '&scaron;'; // Latin Small Letter S With Caron
     $trans[chr(155)] = '&rsaquo;'; // Single Right-Pointing Angle Quotation Mark
     $trans[chr(156)] = '&oelig;'; // Latin Small Ligature OE
     $trans[chr(159)] = '&Yuml;'; // Latin Capital Letter Y With Diaeresis
     $html = strtr($html, $trans);
    }
    if (function_exists('iconv')) {
     // iconv appears to handle certain character encodings better than mb_convert_encoding
     $html = iconv($encoding, 'utf-8', $html);
    } else {
     $html = mb_convert_encoding($html, 'utf-8', $encoding);
    }
   }
  }
 }
 return $html;
}
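
If you would rather guard the iconv() call than rely on the regular expressions alone, here is a minimal sketch of a charset check. The warning in the question says iconv() rejects a charset parameter once it reaches the 64-character limit, so the idea is to discard any detected value that is too long or does not look like a charset label. The name sanitize_charset and the allowed-character pattern are my own assumptions and are not part of Full-Text RSS.

```php
<?php
// Hypothetical helper (not part of Full-Text RSS): returns a cleaned,
// lowercased charset label, or null when the detected value is unusable.
function sanitize_charset($charset)
{
    // Strip surrounding whitespace and quotes, then normalise case.
    $charset = strtolower(trim((string) $charset, " \t\r\n\"'"));

    // Stay safely under the 64-character limit named in the iconv() warning.
    if ($charset === '' || strlen($charset) >= 64) {
        return null;
    }

    // Assumption: charset labels are short ASCII tokens such as "iso-8859-1".
    // Anything with spaces, semicolons or markup is treated as a bad match.
    if (!preg_match('/^[a-z0-9][a-z0-9._:+-]*$/', $charset)) {
        return null;
    }

    return $charset;
}

// Usage: a quoted label is cleaned up, an overlong value is rejected.
var_export(sanitize_charset('"ISO-8859-1"'));
echo "\n";
var_export(sanitize_charset(str_repeat('x', 80)));
echo "\n";
```

In convert_to_utf8 you could pass $encoding through a check like this just before the conversion and skip iconv() when it returns null. That would stop the warning from being printed into the feed, which is what breaks the XML.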
