expatreader exception during xhtml

Asked by Luc Saffre on 2011-05-06

I'm getting the following error message...

Error while evaluating the expression "xhtml(HTML)" defined in the "from" part of a statement.
  File "<string>", line 1, in <module>
  File "/var/snapshots/appy-current/appy/pod/renderer.py", line 240, in renderXhtml
    stylesMapping, ns).run()
  File "/var/snapshots/appy-current/appy/pod/xhtml2odt.py", line 501, in run
    self.xhtmlParser.parse(self.xhtmlString)
  File "/var/snapshots/appy-current/appy/shared/xml_parser.py", line 193, in parse
    self.parser.parse(inputSource)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.6/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.6/xml/sax/handler.py", line 38, in fatalError
    raise exception
<class 'xml.sax._exceptions.SAXParseException'>: <unknown>:1:1: not well-formed (invalid token)

It appears in the document `test_result_4.odt`, which is produced by my test.py script at
http://code.google.com/p/lino/source/browse/tests/appy/1/test.py

Oddly, the error occurs only on the customer's Debian box (Python 2.6) and not on my Windows XP notebook (Python 2.7 or 2.6). Any ideas about what might explain this, and how I could get a more useful error message?

Question information

Language: English
Status: Solved
For: Appy
Assignee: No assignee
Solved by: Luc Saffre
Solved: 2011-05-06
Last query: 2011-05-06
Last reply:

Hi Luc,
I've already encountered this kind of behaviour, in other contexts. It seems that Windows-produced (XML) files may sometimes have some garbage leading char(s), but I don't know the details...
Gaetan
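
For anyone wanting to check for such leading characters, here is a minimal sketch of a diagnostic helper. It is not part of Appy: the name `describe_parse_failure` is hypothetical, and it assumes you can get hold of the same string that is handed to the parser. It catches the `SAXParseException` and reports the position together with the type and the first few characters of the input, which is usually more telling than the bare "invalid token" message.

```python
import xml.sax


def describe_parse_failure(source):
    """Try to parse source; return None on success, else a diagnostic string.

    Hypothetical helper for debugging only, not an Appy function.
    """
    try:
        # A bare ContentHandler ignores all events; we only care about errors.
        xml.sax.parseString(source, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return "%s at line %s, column %s; input is %s starting with %r" % (
            exc.getMessage(),
            exc.getLineNumber(),
            exc.getColumnNumber(),
            type(source).__name__,
            source[:20],
        )
    return None
```

Printing the result just before the failing xhtml() call should show whether the input has an unexpected type or starts with something other than `<`.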

Luc Saffre (luc-saffre) said : #2

Gaetan,
thanks for your message, which motivated me to investigate further. I have now found the explanation: this happens when I pass a unicode string to xhtml(). xhtml() (or rather the Python xml.sax parser) requires the XHTML source to be a byte string, either plain ASCII or UTF-8 encoded. unicode strings are not supported here!
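
The workaround can be sketched roughly as follows, assuming the XHTML chunk may arrive either as a unicode string or as an already encoded byte string. The names `to_utf8_bytes`, `TextCollector` and `parse_text` are made up for this illustration and are not Appy functions; the sketch only uses the standard xml.sax module to show that the encoded string parses cleanly.

```python
import xml.sax

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3


def to_utf8_bytes(xhtml_source):
    """Return xhtml_source as a UTF-8 encoded byte string.

    Text (unicode) strings are encoded; byte strings are returned unchanged
    and are assumed to be ASCII or UTF-8 already.
    """
    if isinstance(xhtml_source, text_type):
        return xhtml_source.encode('utf-8')
    return xhtml_source


class TextCollector(xml.sax.ContentHandler):
    """Collects character data so that the parse result can be inspected."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xhtml_source):
    """Parse the source after normalising it to UTF-8 bytes; return its text."""
    handler = TextCollector()
    xml.sax.parseString(to_utf8_bytes(xhtml_source), handler)
    return u''.join(handler.chunks)
```

In a pod template the same idea would amount to encoding the value before it reaches xhtml(), for example by calling something like `to_utf8_bytes` on it in the code that prepares the template context.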

Maybe you'll want to mention this in
http://appyframework.org/podWritingAdvancedTemplates.html