Language-specific special characters are missing

Asked by swp-bhv

Hi,

I now have full-text working (with the Spiegel.de feed, though not with the Tagesschau one), but I have run into the next problem: the German special characters (ä, ö, ü, ß) are missing.

So, the word "müssen" becomes "mssen" etc.

Is there a trick to incorporate them (e.g. format them as HTML entities or such)?

Question information

Language: English
Status: Answered
For: Five Filters
Assignee: No assignee
Keyvan (keyvan) said :
#1

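It's probably a character encoding problem: if the feed's bytes are read with the wrong encoding, non-ASCII characters such as ä, ö, ü and ß can get dropped. The hosted version tries to detect the character encoding and convert to UTF-8: https://answers.launchpad.net/fivefilters/+question/94332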
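
To illustrate the idea, here is a rough Python sketch of "try a few encodings, then work with proper Unicode". It is not the code Full-Text RSS uses. The function names (decode_feed_bytes, to_ascii_entities) and the list of fallback encodings are assumptions made up for this example. The last helper shows the HTML-entity workaround asked about in the question.

```python
from typing import Optional


def decode_feed_bytes(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw feed bytes by trying likely encodings in order.

    `declared` is the charset named by the feed (HTTP header or XML
    declaration), if any. The fallback order is an assumption for this sketch.
    """
    for encoding in (declared, "utf-8", "windows-1252"):
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: try the next candidate.
            continue
    # ISO-8859-1 maps every byte to a character, so this cannot fail.
    return raw.decode("iso-8859-1")


def to_ascii_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# "müssen" as ISO-8859-1 bytes: the ü is the single byte 0xFC.
latin1_bytes = b"m\xfcssen"

# Reading those bytes as UTF-8 and skipping invalid bytes loses the umlaut.
broken = latin1_bytes.decode("utf-8", errors="ignore")  # "mssen"

# Trying the candidates in order recovers the original word.
fixed = decode_feed_bytes(latin1_bytes)  # "müssen"

# ASCII-only output with the umlaut kept as an entity.
entities = to_ascii_entities(fixed)  # "m&#252;ssen"
```

Note that a wrong declared charset can still decode without an error and produce garbage, so real detection needs more than this.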

I'm currently raising money to clean up and release the next version of Full-Text RSS, which will incorporate some of this auto-detection.
