SBCL Windows 1.0.29 - SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* ?

Asked by Michael Wessel

Hi all,

I can't figure out a way to change the SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* to :utf-8
on Windows SBCL 1.0.29.

Please consider the following example, which fails on Windows, but works on Linux:

CL-USER> SB-IMPL::*DEFAULT-EXTERNAL-FORMAT*

:CP1252

CL-USER> (setf SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* :utf-8)

:UTF-8

CL-USER> (coerce (mapcar #'code-char '(40 105 110 115 116 97 110 99 101 32 12383 12429 12358 32 40 97 116 45
 109 111 115 116 32 49 32 12365 12423 12358 12384 12356 12434 25345
 12388 41 41) ) 'string)

;; swank:close-connection: encoding error on stream
                           #<SB-SYS:FD-STREAM for "a socket" {23B61D61}>
                           (:EXTERNAL-FORMAT :LATIN-1):
                             the character with code 12383 cannot be encoded.

Of course, but I wanted to use UTF-8, not :LATIN-1...

With the Linux version of SBCL 1.0.29 I get:

* SB-IMPL::*DEFAULT-EXTERNAL-FORMAT*

:UTF-8

* (coerce (mapcar #'code-char '(40 105 110 115 116 97 110 99 101 32 12383 12429 12358 32 40 97 116 45
 109 111 115 116 32 49 32 12365 12423 12358 12384 12356 12434 25345
 12388 41 41) ) 'string)

"(instance たろう (at-most 1 きょうだいを持つ))"
*

I found a number of related posts, but no clear answer to the question.

Regards and thanks in advance

Michael

Question information

Language: English
Status: Answered
For: SBCL
Assignee: No assignee

Nikodemus Siivola (nikodemus) said :
#1

The socket in question has been created by Slime before you changed the *D-E-F*, and is hence unaffected. You can see the same issue in action in the terminal:

> LANG=C sbcl --no-userinit
This is SBCL 1.0.29.11, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* sb-impl::*default-external-format*

:LATIN-1
* (setf sb-impl::*default-external-format* :utf-8)

:UTF-8
* (stream-external-format *standard-output*)

:LATIN-1
* (stream-external-format (open "/tmp/foo" :direction :output))

:UTF-8

In other words, setting the default works fine, but it doesn't do what you would like it to do. In case of Slime the easy answer is

  (setq slime-net-coding-system 'utf-8-unix)

in .emacs. For SBCL running in terminal you can use locale environment variables as above. On Windows there is presumably a way to specify the default codepage as well.

Note also that sb-impl::*default-external-format* is not currently supported -- you can use it, but it is liable to change at some point. (In the future we will likely represent external formats with objects specifying the newline convention in addition to the encoding, which is why the current API is not "official".)
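
For files you open yourself, a minimal sketch of one way to sidestep the default entirely, assuming a Unicode-enabled SBCL: pass :external-format explicitly when creating the stream. The helper names below are made up for illustration and are not part of SBCL.

```lisp
;; Hypothetical helpers; only OPEN's standard :EXTERNAL-FORMAT argument is used.
(defun write-utf-8-file (path string)
  "Write STRING to PATH as UTF-8, regardless of the default external format."
  (with-open-file (out path
                       :direction :output
                       :if-exists :supersede
                       :if-does-not-exist :create
                       :external-format :utf-8)
    (write-string string out)))

(defun read-utf-8-file (path)
  "Read the whole file at PATH as UTF-8 and return it as a string."
  (with-open-file (in path :direction :input :external-format :utf-8)
    ;; FILE-LENGTH is an upper bound on the number of characters,
    ;; so trim the buffer to what READ-SEQUENCE actually filled.
    (let ((text (make-string (file-length in))))
      (subseq text 0 (read-sequence text in)))))
```

This only helps for streams you create; it does nothing for streams such as *STANDARD-OUTPUT* or the Slime socket, which already exist by the time your code runs.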

Michael Wessel (michael-wessel) said :
#2

Dear Nikodemus,

thanks for the very fast reply!

On Wednesday, 17 June 2009 15:29:55, Nikodemus Siivola wrote:

> Nikodemus Siivola proposed the following answer:
> The socket in question has been created by Slime before you changed the
> *D-E-F*, and is hence unaffected. You can see the same issue in action
> in the terminal:
>
> > LANG=C sbcl --no-userinit
>
> This is SBCL 1.0.29.11, an implementation of ANSI Common Lisp.
> More information about SBCL is available at <http://www.sbcl.org/>.
>
> SBCL is free software, provided as is, with absolutely no warranty.
> It is mostly in the public domain; some portions are provided under
> BSD-style licenses. See the CREDITS and COPYING files in the
> distribution for more information.
> * sb-impl::*default-external-format*
>
> :LATIN-1
>
> * (setf sb-impl::*default-external-format* :utf-8)
>
> :UTF-8
>
> * (stream-external-format *standard-output*)
>
> :LATIN-1
>
> * (stream-external-format (open "/tmp/foo" :direction :output))
>
> :UTF-8
>
> In other words, setting the default works fine, but it doesn't do what
> you would like it to do. In case of Slime the easy answer is
>
> (setq slime-net-coding-system 'utf-8-unix)

Unfortunately, this still doesn't work with Slime... I now have the following
in my .emacs:

(defun slime-sbcl ()
  (interactive)

  (setq slime-net-coding-system 'utf-8-unix)
  (setq language-environment "utf-8")

  (setq inferior-lisp-program "c:/sbcl/sbcl.exe")
  (add-to-list 'load-path "c:/slime/")
  (require 'slime)
  (slime-setup)
  (slime))

Although this time it gets a little further:

(progn (load "c:/slime/swank-loader.lisp" :verbose t)
       (funcall (read-from-string "swank-loader:init"))
       (funcall (read-from-string "swank:start-server")
                "c:/DOKUME~1/ADMINI~1/LOKALE~1/Temp/slime.2676"
                :coding-system "utf-8-unix"))

I notice "utf-8-unix" here now. OK.

But then:

CL-USER> (coerce (mapcar #'code-char '(40 105 110 115 116 97 110 99 101 32
12383 12429 12358 32 40 97 116 45 109 111 115 116 32 49 32 12365 12423 12358
12384 12356 12434 25345 12388 41 41) ) 'string)

encoding error on stream
#<SB-SYS:FD-STREAM for "standard output" {23B576C1}>
(:EXTERNAL-FORMAT :CP850):
  the character with code 12383 cannot be encoded.
   [Condition of type SB-INT:STREAM-ENCODING-ERROR]

Backtrace:
  0: (SB-INT:STREAM-ENCODING-ERROR #<SB-SYS:FD-STREAM for "standard output" {23B57801}> 12383)
  1: (SB-IMPL::STREAM-ENCODING-ERROR-AND-HANDLE #<SB-SYS:FD-STREAM for "standard output" {23B57801}> 12383)
  2: (SB-IMPL::OUTPUT-BYTES/CP850 ..)
  3: (SB-IMPL::FD-SOUT ..)
  4: (SB-IMPL::%WRITE-STRING ..)
  5: (SB-IMPL::%WRITE-STRING ..)
  6: (SB-PRETTY:OUTPUT-PRETTY-OBJECT "(instance たろう (at-most 1 きょうだいを持つ))" #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDOUT* {223D3CE1}>)
  7: (PRIN1 "(instance たろう (at-most 1 きょうだいを持つ))" #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDOUT* {223D3CE1}>)
 --more--

Interestingly, this time it was indeed able to produce the string

"(instance たろう (at-most 1 きょうだいを持つ))"

so character encoding worked, but it was then unable to put it to standard
output, which still uses the wrong code page CP850.

> in .emacs. For SBCL running in terminal you can use locale environment
> variables as above. On Windows there is presumably a way to specify the
> default codepage as well.

If I execute sbcl.exe from Windows console, it works if I set the codepage to
65001 (":utf-8"):

C:\sbcl>chcp 65001
Aktive Codepage: 65001.

C:\sbcl>sbcl.exe
This is SBCL 1.0.29, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

This is experimental prerelease support for the Windows platform: use
at your own risk. "Your Kitten of Death awaits!"
* (coerce (mapcar #'code-char '(40 105 110 115 116 97 110 99 101 32
12383 12429 12358 32 40 97 116 45 109 111 115 116 32 49 32 12365 12423 12358
12384 12356 12434 25345 12388 41 41) ) 'string)

"(instance たろう (at-most 1 きょうだいを持つ))"
*

This looks better (but the console is unable to produce the correct
characters, I think).

Any ideas what else I could do to get Slime / Emacs to use a different
external format for standard output?

Regards and thanks

Michael

Nikodemus Siivola (nikodemus) said :
#3

What you are seeing now is an artifact of the incomplete Windows support. Communication between Slime and SBCL normally goes over a socket and not stdout, but on Windows things work a bit differently right now. On top of that, SBCL uses CONSOLE-OUTPUT-CODEPAGE to pick the external format for *STDOUT*.

It's an ugly kludge, but while waiting for the Windows port to mature in this respect, you can use this:

(defun set-default-external-format (external-format)
  ;; Refuse formats SBCL does not know about.
  (assert (sb-impl::find-external-format external-format))
  (setf sb-impl::*default-external-format* external-format)
  ;; Capture anything written to *ERROR-OUTPUT* while the streams are swapped.
  (with-output-to-string (*error-output*)
    ;; Recreate the fd-streams for descriptors 0, 1 and 2 so that they
    ;; pick up the new default external format.
    (setf sb-sys:*stdin*
          (sb-sys:make-fd-stream 0 :name "standard input" :input t :buffering :line))
    (setf sb-sys:*stdout*
          (sb-sys:make-fd-stream 1 :name "standard output" :output t :buffering :line))
    (setf sb-sys:*stderr*
          (sb-sys:make-fd-stream 2 :name "standard error" :output t :buffering :line))
    (setf sb-sys:*tty* (make-two-way-stream sb-sys:*stdin* sb-sys:*stdout*))
    ;; Replay the captured output on the new stderr.
    (princ (get-output-stream-string *error-output*) sb-sys:*stderr*))
  (values))

This also fixes the external format for the essential internal streams.
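
To check whether the workaround took effect, something along these lines might help. It is only a sketch: REPORT-EXTERNAL-FORMATS is a hypothetical helper, and it relies on the same internal names used in this thread, which may change between SBCL versions.

```lisp
;; REPORT-EXTERNAL-FORMATS is a hypothetical helper, not part of SBCL.
(defun report-external-formats ()
  "Return a plist of the default external format and those of the standard fd-streams."
  (list :default sb-impl::*default-external-format*
        :stdin   (stream-external-format sb-sys:*stdin*)
        :stdout  (stream-external-format sb-sys:*stdout*)
        :stderr  (stream-external-format sb-sys:*stderr*)))

;; Compare the report before and after applying the workaround:
;;   (report-external-formats)
;;   (set-default-external-format :utf-8)
;;   (report-external-formats)
```

If the second report still shows a code page such as :CP850 for :STDOUT, the streams were not recreated with the new default.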

Michael Wessel (michael-wessel) said :
#4

On Wednesday, 17 June 2009 17:19:12, Nikodemus Siivola wrote:
> Your question #74497 on SBCL changed:
> https://answers.launchpad.net/sbcl/+question/74497
>
> Status: Open => Answered
>
> Nikodemus Siivola proposed the following answer:
> What you are seeing now is an artifact of the incomplete Windows
> support: communication between Slime and SBCL is normally over a socket
> and not stdout, but on Windows things work a bit differently right now,
> plus the way SBCL uses CONSOLE-OUTPUT-CODEPAGE to pick the external
> format for *STDOUT*
>
> It's an ugly kludge, but while waiting the Windows port to mature in
> this respect:
>
> (defun set-default-external-format (external-format)
>   (assert (sb-impl::find-external-format external-format))
>   (setf sb-impl::*default-external-format* external-format)
>   (with-output-to-string (*error-output*)
>     (setf sb-sys:*stdin*
>           (sb-sys:make-fd-stream 0 :name "standard input" :input t :buffering :line))
>     (setf sb-sys:*stdout*
>           (sb-sys:make-fd-stream 1 :name "standard output" :output t :buffering :line))
>     (setf sb-sys:*stderr*
>           (sb-sys:make-fd-stream 2 :name "standard error" :output t :buffering :line))
>     (setf sb-sys:*tty* (make-two-way-stream sb-sys:*stdin* sb-sys:*stdout*))
>     (princ (get-output-stream-string *error-output*) sb-sys:*stderr*))
>   (values))
>
> Which also fixes the external format for the essential internal streams.

Well done, that works! Thanks a lot. Until the Windows port is more mature,
may I include your fix in a free library that we offer for download? And
when do you think this will be fixed officially?

Thanks again

Michael

Nikodemus Siivola (nikodemus) said :
#5

Yes, you can consider that bit of code to be in the public domain. However, if it breaks due to SBCL changes you get to keep both parts. :)

As for progress of the Windows port -- hard to say. It's inching along, but currently no-one is spending serious man-hours working on it, as far as I know. Let's say at least a month, and probably no longer than 2 years...
