UTF-8.exp testsuite fails in newlib/newlib-nano

Asked by HermanH

I ran the UTF-8.exp test suite against newlib (and newlib-nano) with the newlib-mb option enabled. The test always fails because mbtowc(...) only decodes UTF-8 sequences of 1 to 4 bytes. UTF-8.exp expects "*U-00200000" on stdout when the UTF-8.c test program passes the byte string {0xf8, 0x88, 0x80, 0x80, 0x80} to mbtowc(...), but the output is always "2.1.5 Invalid".
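
For reference, the expected value U-00200000 can be reproduced by hand if one assumes the legacy five-byte layout (111110xx followed by four 10xxxxxx continuation bytes), which current UTF-8 as defined by RFC 3629 no longer permits. The sketch below is only an illustration of that arithmetic; decode_legacy_5byte is a hypothetical helper and is not part of newlib or of the test suite.

```c
#include <stdint.h>

/* Hypothetical helper (not newlib code): decode one five-byte sequence
   using the legacy layout 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx.
   Returns UINT32_MAX if the bytes do not match that layout. */
uint32_t decode_legacy_5byte(const unsigned char s[5])
{
    uint32_t cp;
    int i;

    if ((s[0] & 0xfc) != 0xf8)      /* lead byte must be 111110xx */
        return UINT32_MAX;

    cp = s[0] & 0x03;               /* two payload bits from the lead byte */
    for (i = 1; i < 5; i++) {
        if ((s[i] & 0xc0) != 0x80)  /* continuation must be 10xxxxxx */
            return UINT32_MAX;
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3f);  /* six payload bits each */
    }
    return cp;
}
```

With the bytes from the test, the only non-zero payload is 0x08 from the second byte, shifted left by 18 bits, which gives 0x00200000.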

Did I miss any build option?

Can this test suite pass when I enable multibyte support in newlib/newlib-nano?

Thank you!

Question information

Language:
English
Status:
Solved
For:
GNU Arm Embedded Toolchain
Assignee:
No assignee
Solved by:
Thomas Preud'homme
Best Thomas Preud'homme (thomas-preudhomme) said :
#1

I'm sorry to say that you didn't miss anything; mbtowc is simply incomplete for UTF-8 at the moment. The code is in newlib/libc/stdlib/mbtowc_r.c and starts at the line:

_DEFUN (__utf8_mbtowc, (r, pwc, s, n, charset, state),

The code has the following overall structure, where ch is the lead byte of the sequence:

1) ch == '\0' /* returns 0 */
2) ch <= 0x7f /* returns 1 */
3) 0xc0 <= ch <= 0xdf /* returns 2 */
4) 0xe0 <= ch <= 0xef /* returns 3 */
5) 0xf0 <= ch <= 0xf4 /* returns 4 */
6) anything else /* returns -1 */
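
The six cases above can be sketched as a small stand-alone function. This is only an illustration of the dispatch on the lead byte, not the newlib source: utf8_lead_len is a hypothetical name, and the real __utf8_mbtowc also validates continuation bytes and tracks conversion state, which is left out here.

```c
/* Hypothetical helper (not newlib code): returns the sequence length
   implied by the lead byte alone, following the six cases listed above. */
int utf8_lead_len(unsigned char ch)
{
    if (ch == '\0')
        return 0;                   /* case 1: NUL character */
    if (ch <= 0x7f)
        return 1;                   /* case 2: single-byte (ASCII) */
    if (ch >= 0xc0 && ch <= 0xdf)
        return 2;                   /* case 3: two-byte sequence */
    if (ch >= 0xe0 && ch <= 0xef)
        return 3;                   /* case 4: three-byte sequence */
    if (ch >= 0xf0 && ch <= 0xf4)
        return 4;                   /* case 5: four-byte sequence */
    return -1;                      /* case 6: everything else */
}
```

The lead byte 0xf8 from the failing test falls into the last case, so the sequence is rejected before any of its continuation bytes are examined.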

Best regards.

HermanH (learhsieh) said :
#2

Thanks Thomas Preud'homme, that solved my question.