[SU2014] character encodings and the Ruby console
-
Hi,
I've been playing around with character encodings and noticed something in the Ruby console I don't understand. The default encoding of strings in the console seems to be UTF-8, which isn't surprising. Even if I convert a string from UTF-8 to a different encoding, it will still be printed to the console correctly. But even if I then force the encoding of the converted string back to UTF-8 (which doesn't change a single byte in the string) it will still be printed correctly to the console although now the console must think it's a UTF-8 string while it actually isn't.
I'll attach a screen shot with an example:
Does anyone know what's going on here?
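A minimal sketch of the scenario described above, assuming only core Ruby 2.0 string methods (`encode`, `force_encoding`, `valid_encoding?`). The sample string and the choice of ISO-8859-1 as the "different encoding" are illustrative assumptions, not taken from the screenshot:

```ruby
# Illustrative sample text; \u escapes keep the literal UTF-8
# regardless of how the source file itself is saved.
utf8 = "Gr\u00FC\u00DFe"

# Transcode to a single-byte encoding: the bytes change, the characters do not.
latin1 = utf8.encode("ISO-8859-1")

# Relabel the same bytes as UTF-8 without converting them.
forced = latin1.dup.force_encoding("UTF-8")

puts utf8.bytes.to_a.inspect          # byte values of the UTF-8 string
puts latin1.bytes.to_a.inspect        # byte values after transcoding
puts forced.bytes.to_a == latin1.bytes.to_a  # relabeling leaves the bytes alone
puts forced.valid_encoding?           # whether the bytes form valid UTF-8
```

Whatever the console displays, `valid_encoding?` is one way to check whether a string labelled UTF-8 really contains valid UTF-8 bytes.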
-
I haven't looked this up to verify, but I suspect there is some automatic correction going on in the Windows functions that handle the strings. Web browsers do this a lot, since there is a lot of incorrect handling of ASCII, ANSI (Windows-1252) and UTF-8 around the web.
I haven't had time to investigate this myself, but it's been on my list. Is it something that causes issues?
(Other than it can make people think their encoding is correct, even though it's not.)
As long as you keep everything UTF-8 encoded within the SketchUp environment you should be fine.
-
@tt_su said:
(Other than it can make people think their encoding is correct, even though it's not.)
That's what I'm worried about, because I did have issues with encodings in the past. It's kind of annoying that Windows isn't all UTF-8. Between console windows, the file system, and WebDialogs I have to deal with at least three types of encodings.
-
Why do you have to deal with three different encodings?
With Ruby 2.0 you specify the encoding on File operations and strings. The SketchUp API will give you UTF-8 encoded strings. If you save your RB and HTML files with UTF-8 encoding (for local HTML files you need to add a META tag to declare it) then you have a full round trip of UTF-8. The exception is the ENV variable, whose values you need to convert.
-
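As a rough sketch of the advice about specifying the encoding on File operations and converting ENV values: the helper name `to_utf8`, the file name, and the assumption that an ENV value arrives as Windows-1252 bytes are all hypothetical here. The thread does not say which encoding ENV values actually carry inside SketchUp, so the source encoding would need checking on the target system.

```ruby
require "tmpdir"

# Hypothetical helper: reinterpret a string's bytes as `source_encoding`
# (an assumption the caller must supply), then transcode them to UTF-8.
def to_utf8(value, source_encoding)
  value.dup.force_encoding(source_encoding).encode("UTF-8")
end

text = "Gr\u00FC\u00DFe"

# Write and read a file while stating the encoding explicitly.
read_back = Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  File.open(path, "w:UTF-8") { |file| file.write(text) }
  File.read(path, :encoding => "UTF-8")
end

# Stand-in for an ENV value: the bytes of "Müller" in Windows-1252.
raw = [0x4D, 0xFC, 0x6C, 0x6C, 0x65, 0x72].pack("C*")
converted = to_utf8(raw, "Windows-1252")

puts read_back.encoding
puts converted.encoding
puts converted == "M\u00FCller"
```

With a real ENV value, `raw` would be something like `ENV["APPDATA"]`, and the second argument would be whatever encoding those bytes turn out to be in.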
RB and HTML file encoding was never really a problem. But I think I ran into problems when using File Open dialogs and accessing files with special characters in the name (like umlauts). I think the Windows file system sometimes expects CP1252 encoded file names. Then there was another problem with console windows, which are encoded in CP850 or something by default on my system. I'm running some C++ programs with console windows that directly communicate with my SketchUp plugin via sockets. I have WebDialogs where the user can enter text (possibly including umlauts) which is then sent to these C++ programs and displayed in the console window. I think what I did with the console windows was changing the console encoding to CP1252 (UTF-8 is not possible I think). So I guess I'm actually down to two encodings.
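A small sketch of why the console code page matters, assuming Ruby's built-in `IBM850` (CP850) and `Windows-1252` encodings. It only illustrates the byte-level difference; the socket transport and the C++ side are left out:

```ruby
text = "\u00FC" # the umlaut "ü" as a UTF-8 string

# The same character is a different byte in each code page.
cp850  = text.encode("IBM850")
cp1252 = text.encode("Windows-1252")

puts cp850.bytes.to_a.inspect
puts cp1252.bytes.to_a.inspect

# Bytes written as CP1252 but interpreted as CP850 decode to another character.
misread = cp1252.dup.force_encoding("IBM850").encode("UTF-8")
puts misread == text
```

So whichever code page the console uses, the text sent to it has to be transcoded to exactly that code page, or the console has to be switched to match the sender.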
What I still don't understand is how to cleanly get UTF-8 text input from the WebDialogs to Ruby. I think there's some implicit conversion going on between the two.
I'm still in the process of converting my plugin to Ruby 2.0 so the above applies to the old version.
-
@oricatmos said:
What I still don't understand is how to cleanly get UTF-8 text input from the WebDialogs to Ruby. I think there's some implicit conversion going on between the two.
I have not seen any problems with this.
Nor have I seen any problems with the Ruby 2.0 file functions. They should now be calling the "W" (Unicode) versions of the Windows file functions. I understand that if you are communicating with another process that doesn't handle UTF-8 then you'd have a challenge with encoding, but within SketchUp, using UTF-8 all around should be no problem.
Do you have some example snippets that reproduce the issues you describe? I think we need to have something to look at here.
-
I think I should get my plugin working in SU2014 before I try to reproduce the problem. Perhaps everything will just work then.
So please don't take this thread as a bug report.
-
Let us know if you find any issues. The upgrade of Ruby was a big change so we're trying to keep an eye out for potential issues.
-
@oricatmos said:
I'm still in the process of converting my plugin to Ruby 2.0 so the above applies to the old version.
No surprise there. The Ruby 1.8 trunk on PC would hardly handle Unicode at all.
But TIG did write a small library called PCFileTools that might help on older SketchUp versions running Ruby 1.8 on PC.
-
Thanks, that might come in handy!