Upgrading plugins to Ruby 2.0 for SketchUp 2014
-
Didn't Bugra say the two Ruby load paths were encoded in the local code page?
-
@tt_su said:
I've reproduced the crash and it happens in the Importer class wrapper that is supposed to return the result code. That's why it works when you call the method directly and not via the Importer dialog.
This isn't new for SketchUp 2014 - it's been around for a while.
-
@jim said:
This isn't new for SketchUp 2014 - it's been around for a while.
Yup, I saw that when I tested. Bad one - but at least it can be caught. ...one just has to know about it...
But it's fixed in-house now.
-
Update - I have sent a user an .rbz, and he gets the error:
Error: #<ArgumentError: unknown encoding name - ISO-8859-1>
resulting from...
file_data.force_encoding('ISO-8859-1').encode("UTF-8") if defined?(Encoding)
He is using Windows 8, SketchUp 2014, on a PC. How can this be? And what advice can I pass on?
-
Further (on my PC, which has no encoding issue)...
Encoding.default_external = UTF-8
But...
puts file_data.encoding
file_data.force_encoding('ISO-8859-1').encode("UTF-8") if defined?(Encoding)
puts file_data.encoding
Generates...
UTF-8
ISO-8859-1
I would have thought that the second encoding would always be UTF-8? I am trying to understand the insidious world of Encoding (thanks to Ruby 2)! Any advice or explanation would be appreciated.
Context: A major plugin which simply wishes to reliably read a CSV file is currently useless due to the Encoding ISO-8859-1 issue.
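On the two puts results: force_encoding changes the label on the string itself, whereas encode returns a new, converted string and leaves the original alone. In the line as posted the return value of encode is never assigned, so only the relabel is visible afterwards. A minimal sketch of that behaviour, where the sample bytes and the variable names are assumptions made up for illustration:

```ruby
# "caf\xE9" is "café" as ISO-8859-1 bytes, labelled UTF-8 here to mimic
# a line read with a UTF-8 default external encoding (sample data is invented).
file_data = "caf\xE9".dup.force_encoding('UTF-8')

# force_encoding relabels file_data in place; encode builds a NEW string,
# which is thrown away because nothing captures it.
file_data.force_encoding('ISO-8859-1').encode('UTF-8')
label_after_chain = file_data.encoding

# Capturing the return value keeps the converted text.
converted = file_data.encode('UTF-8')

puts label_after_chain    # ISO-8859-1
puts converted.encoding   # UTF-8
```

String#encode! is the in-place variant, if mutating the original is what is wanted.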
-
@marksup said:
Update - I have sent a user an .rbz, and he gets the error:
Error: #<ArgumentError: unknown encoding name - ISO-8859-1>
resulting from...
file_data.force_encoding('ISO-8859-1').encode("UTF-8") if defined?(Encoding)
He is using Windows 8, SketchUp 2014, on a PC. How can this be? And what advice can I pass on?
Could it be that he's using the first release of SU2014? There was a bug there: when SketchUp was opened via an SKP file located on a drive other than the one where SketchUp is installed, parts of the Ruby interpreter weren't initialized - and that includes some of the encodings.
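For anyone wanting to detect this situation up front rather than hit the ArgumentError mid-import, here is a small sketch. encoding_available? is a hypothetical helper, not part of the SketchUp API, and it assumes that Encoding.find fails in the affected build the same way force_encoding did (that has not been verified here):

```ruby
# Hypothetical helper: true if the running interpreter knows the named encoding.
# Encoding.find raises ArgumentError ("unknown encoding name - ...") otherwise.
def encoding_available?(name)
  Encoding.find(name)
  true
rescue ArgumentError
  false
end

puts encoding_available?('ISO-8859-1')
puts encoding_available?('NO-SUCH-ENCODING')
```

A plugin could use a check like this to show the user a message asking them to update SketchUp, or to start it from its shortcut instead of from the .skp file.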
Can you please make sure the user has kept his installation up to date?
-
@marksup said:
file_data.force_encoding('ISO-8859-1').encode("UTF-8") if defined?(Encoding)
By the way, one should really try to avoid using force_encoding. There are some scenarios where one has to because Ruby itself returns strings with the incorrect encoding (like the __FILE__ magic variable), but I often see this pattern when people open files. If that is the case then one should ideally specify the encoding in the IO.open (File.open) arguments.
-
Thanks ThomThom,
I noticed (during a shared Skype screen session) that the user was launching SketchUp via the .skp file - so your explanation is very promising.
Regarding the use of force_encoding and File.open, I am using the following (on the advice of another SketchUcation expert!)...
File.foreach(file_path) do |file_data|
  file_data.force_encoding('ISO-8859-1').encode("UTF-8") if defined?(Encoding)
end
Please can you provide a similar File.open example of what you would prefer/recommend.
Would it be possible for SketchUp to provide a function to reenable foolproof reading of text files? (i.e. to reinstate the automatic encoding recognition functionality that Ruby 2 is missing compared to Ruby 1.8)
-
@marksup said:
Would it be possible for SketchUp to provide a function to reenable foolproof reading of text files? (i.e. to reinstate the automatic encoding recognition functionality that Ruby 2 is missing compared to Ruby 1.8)
There never was any automatic detection - Ruby 1.8 treated all strings as 8-bit byte sequences.
To provide a proper answer to you I need to know a little bit more about what type of files you are opening.
If they are binary files:
filemode = 'rb'
if RUBY_VERSION.to_f > 1.8
  filemode << ':ASCII-8BIT'
end
File.open(file_name, filemode) {|file|
  # read file
}
If you know the file is UTF-8:
filemode = 'rb'
if RUBY_VERSION.to_f > 1.8
  filemode << ':UTF-8'
end
File.open(file_name, filemode) {|file|
  # read file
}
If you know the file is ISO-8859-1 but you want it as UTF-8:
filemode = 'rb'
if RUBY_VERSION.to_f > 1.8
  # external encoding ISO-8859-1, transcoded to UTF-8 on read
  filemode << ':ISO-8859-1:UTF-8'
end
File.open(file_name, filemode) {|file|
  # read file
}
I recommend reading up on the IO class and Encoding class:
http://www.ruby-doc.org/core-2.1.2/IO.html#method-c-new
Forcing an encoding is prone to errors - it's like brute forcing and crossing your fingers hoping it will work. By being clear about what encoding you expect you will catch incorrectly encoded strings early and at the correct points.
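Since the original code used File.foreach rather than File.open, here is a sketch of the same idea for line-by-line reading. File.foreach passes a trailing options hash on to open, so an :encoding option of the form "external:internal" should do the transcoding as each line is read. The temp file and its contents are invented so the example can run on its own, and this form needs Ruby 1.9 or later:

```ruby
require 'tempfile'

# Build a small ISO-8859-1 sample file; byte 0xA3 is the pound sign there.
# (File name and contents are made up for this example.)
sample = Tempfile.new(['prices', '.csv'])
sample.binmode
sample.write("item,price\nbrick,\xA33\n".b)
sample.close

rows = []
# External encoding ISO-8859-1, internal UTF-8: each line arrives as UTF-8.
File.foreach(sample.path, encoding: 'ISO-8859-1:UTF-8') do |line|
  rows << line.chomp.split(',')
end
sample.unlink

p rows.last[1].encoding   # the price field is now a UTF-8 string
```

No force_encoding is needed this way, and a file that cannot be converted raises an error at the read instead of silently producing a mislabelled string.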
-
Hi, thanks, but you may be missing the point...
I would wish to read a simple text (not binary) file, (typically in CSV format).
I do not know what encoding a user may use to create the text file - and most likely neither will the user!
Thus the file may be UTF-8 already, or it may be something else, I just wish to reliably and simply process it whatever the encoding, as was apparently possible with earlier releases.
I have no attachment to force_encoding - merely that it was suggested and it does appear to work, and will happily comply with any preferred alternative.
So, what extra logic is required for anyone and everyone to reliably process any text file (which might include £ or m², for example), regardless of how it might be encoded?
-
Your force_encoding example will only work if the input text (CSV) file is ISO-8859-1. ISO-8859-1 uses exactly one 8-bit byte per character. However, if you open a file that is saved with UTF-8 encoding and it contains characters outside the ASCII range (above 127) then there will be multi-byte characters - when you then force ISO-8859-1 encoding and convert that to UTF-8 you will mangle the characters.
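The mangling is easy to reproduce in the Ruby console. A minimal sketch, with an invented sample string: the pound sign is two bytes (C2 A3) in UTF-8, and relabelling those bytes as ISO-8859-1 turns them into two separate characters before the conversion back to UTF-8.

```ruby
utf8_text = "\u00A3100"   # "£100" as UTF-8: bytes C2 A3 31 30 30

# Pretend the UTF-8 bytes are ISO-8859-1, then convert to UTF-8.
mislabeled = utf8_text.dup.force_encoding('ISO-8859-1')
mangled = mislabeled.encode('UTF-8')

puts mangled            # Â£100
puts utf8_text.length   # 4
puts mangled.length     # 5
```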
If you have no idea of, or control over, the input data then I would use a try/catch approach: first read the file in the encoding it is most likely to be in. If encoding errors are thrown, you can catch them and try the next most likely encoding. You can then fall back to just reading it as binary:
filemode = 'rb'
if RUBY_VERSION.to_f > 1.8
  filemode << ':ASCII-8BIT'
end
File.open(file_name, filemode) {|file|
  file.seek(80, IO::SEEK_SET)
  face_count = file.read(4).unpack('i')[0]
}
Look at these errors and see if they might be thrown:
Encoding::CompatibilityError
Encoding::ConverterNotFoundError
Encoding::InvalidByteSequenceError
Encoding::UndefinedConversionError
EncodingError
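A rough sketch of that fallback chain, for illustration only. read_text_file and write_sample are hypothetical helpers, and the list of fallback encodings is an assumption to adapt to your users. One caveat: opening a file with only an external encoding (for example 'r:UTF-8') just labels the bytes and raises nothing on bad input, so the UTF-8 attempt is checked with valid_encoding? instead of a rescue. The errors listed above come into play once Ruby has to transcode.

```ruby
require 'tempfile'

# Hypothetical helper: returns [text, name_of_encoding_that_worked].
def read_text_file(path, fallbacks = ['ISO-8859-1'])
  # Reading as UTF-8 only labels the bytes, so validity is checked explicitly.
  text = File.open(path, 'r:UTF-8') { |file| file.read }
  return [text, 'UTF-8'] if text.valid_encoding?

  fallbacks.each do |name|
    begin
      # Transcoding on read raises an EncodingError subclass on bad input.
      text = File.open(path, "r:#{name}:UTF-8") { |file| file.read }
      return [text, name]
    rescue EncodingError
      next
    end
  end

  # Last resort: raw bytes, left for the caller to interpret.
  [File.open(path, 'rb:ASCII-8BIT') { |file| file.read }, 'ASCII-8BIT']
end

# Hypothetical helper for the demo: writes raw bytes to a temp file.
def write_sample(bytes)
  file = Tempfile.new('sample')
  file.binmode
  file.write(bytes)
  file.close
  file
end

utf8_file   = write_sample("\u00A3100\n".b)   # pound sign as UTF-8 (C2 A3)
latin1_file = write_sample("\xA3100\n".b)     # pound sign as ISO-8859-1 (A3)

p read_text_file(utf8_file.path)[1]     # name of the encoding that worked
p read_text_file(latin1_file.path)[1]
```

Order matters: ISO-8859-1 assigns a character to every byte value, so it never fails and anything listed after it will never be tried. It also means a file in some other single-byte encoding will be accepted and read with the wrong characters, which is why this can only ever be a best guess.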
If you are not familiar with Unicode and how it's represented in byte data I would recommend reading up on that as well. The reason it has worked for you so far has probably been that you have had ASCII-compatible data. UTF-8 is byte compatible with ASCII, where it uses only one byte per character - but the moment you go outside ASCII (US-ASCII to be precise) you get multi-byte characters.
For your testing purposes I would strongly recommend you test with non-English characters. For good measure make sure you go outside of European languages as well; try Japanese or Chinese, for instance, which can take three or four bytes per character in UTF-8.
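To see the byte counts for yourself, a quick console sketch. The four sample characters are arbitrary picks for illustration: an ASCII letter, the pound sign, a common Japanese character and a rarer CJK character from outside the Basic Multilingual Plane.

```ruby
# Sample characters, written as escapes so the source file stays ASCII.
samples = ['A', "\u00A3", "\u65E5", "\u{20BB7}"]

samples.each do |ch|
  puts "U+%04X takes %d byte(s) in UTF-8" % [ch.ord, ch.bytesize]
end

byte_counts = samples.map { |ch| ch.bytesize }   # [1, 2, 3, 4]
```

Test files containing characters from each of these groups should exercise most of the multi-byte handling in a CSV reader.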