Chinese characters unrecognizable

Asked by Lavande

Sorry for asking the same question in two different languages:
https://answers.launchpad.net/ubuntu/+source/file-roller/+question/109029
Chinese filenames inside RAR files that were compressed on Windows are unrecognizable. I guess the problem is in the conversion between GBK and UTF-8.
There is a workaround that involves adding some lines to the /etc/environment file. I tried it, but it didn't work.
PS: I'm using the en locale.

Question information

Language: English
Status: Answered
For: Ubuntu file-roller
Assignee: No assignee
Simos Xenitellis  (simosx) said :
#1

I think the source of the problem is with the 'poppler' library, which is the library that does PDF support for evince.

That PDF file has the MHei, MKai and MSung Chinese fonts embedded in the document.
If you try with 'xpdf' or 'acrobat', you are able to read the document.

Your first step would be to check whether you can view the document with Evince on Ubuntu 10.04.

If that does not work, then have a look at http://bugs.freedesktop.org/show_bug.cgi?id=22334
It is a bug report for the poppler library. What you can do is file a similar report, attaching this PDF file.
I believe this is the best way to get things forward!

Simos Xenitellis  (simosx) said :
#2

Oh, disregard the above answer. It was meant to go to another question.

In your case, the issue is that the RAR file format does not specify the encoding of the filenames.
The same problem exists with the ZIP file format. For ZIP, it is possible to set a special variable in /etc/environment that tells the system which encoding to assume for the filenames in ZIP files.

I do not know what the status is for the RAR format.
There is a relevant report at
https://bugs.launchpad.net/ubuntu/+source/unrar-nonfree/+bug/379119
The discussion there missed the point, which is that this is a filename encoding issue: RAR should create UTF-8 encoded filenames or, failing that, should specify the encoding.
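
To illustrate the kind of mismatch being discussed, here is a minimal Python sketch. It is not taken from file-roller or unrar. It assumes the archive stored the filename as GBK bytes, and the sample filename and the function names are made up for the illustration.

```python
# Illustration only: how GBK filename bytes turn into gibberish when a
# program guesses the wrong encoding, and how the right guess recovers them.
# The filename and helper names are hypothetical.

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def repair_name(garbled: str, wrong: str = "latin-1", right: str = "gbk") -> str:
    """Undo a wrong decode: turn the text back into bytes, then decode properly."""
    return garbled.encode(wrong).decode(right)


original_name = "测试文件.txt"

# What an archiver on a Chinese Windows system might store (assumption: GBK).
raw_bytes = original_name.encode("gbk")

# A program that assumes Latin-1 never gets an error, only wrong characters.
garbled = raw_bytes.decode("latin-1")

print(is_valid_utf8(raw_bytes))            # these GBK bytes are not valid UTF-8
print(garbled == original_name)            # the wrong guess does not match
print(repair_name(garbled) == original_name)
```

The repair only works when both the wrong guess and the real encoding are known, which is why an archive format that records the encoding (or always uses UTF-8) avoids the problem.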

Lavande (lavande) said :
#3

Thanks for your answer.
But now I think it is file-roller's bug, not unrar's.
I ran "unrar l myfile.rar" in the terminal, and it returned the right information: the Chinese characters were displayed correctly.
When I opened the same file in file-roller, it displayed unrecognizable characters.
I have reported a bug:
https://bugs.launchpad.net/ubuntu/+source/file-roller/+bug/573574
