WebDialog encoding bug found!
- 
 @driven said: @aerilius said: ... Then one can use get_element_value to fetch user input of any arbitrary character range.
 except, get_element_value fails pre v2013 and get_element_value works. I got a few old scripts I was trying to update, and was unable to figure out the issue... I think this shows it was broken to begin with and got broken 'differently' with a fix... I'm more inclined to look at using 'unicodeEscape' in the js... before retrieving by either method... john

 It's possible to get deeply confused trying to sort this out.
 Here's what I have observed:

 The UTF-8 byte sequence for élan is \xC3\xA9lan (that is, UTF-8 encodes é as the two-byte sequence \xC3\xA9). On Mac, get_element_value returns this byte sequence exactly in both SU8 and 2014. However, because Ruby 1.8 thinks it is a 5-character string in ASCII-8BIT, it treats \xC3 as Ã and \xA9 as ©, the extended ASCII interpretations of these bytes. Ruby 2.0 happily assumes it is UTF-8 and gets it correct.

 The action callback parameters are handled differently and inconsistently between Windows and Mac. On the Mac, in SU8 the action callback also returns the original UTF-8 5-byte sequence, but somehow when Ruby prints it as a string, it gets it right. Further, it believes this string and the one from get_element_value (that prints as Ã©lan) are equal. This makes no sense to me...

 But in 2014, the original 5-byte UTF-8 string is somehow transmuted into the 8-byte string \xE2\x88\x9A\xC2\xA9lan. Note that \xE2\x88\x9A is the UTF-8 for the square-root sign √ and \xC2\xA9 is UTF-8 for ©. So it appears the translation of the copyright sign is an attempt to handle misreading of the UTF-8 as ASCII-8BIT, but I don't know where that square-root sign byte sequence came from since, as I mentioned above, \xC3 is Ã in ASCII-8BIT, which is \xC3\x83 in UTF-8. It's as if the callback processing code has an incorrect implementation of the transcoding.

 And then, as juantxo reports, SU8 on Windows 7 returns the 5 UTF-8 bytes unconverted, yet again somehow manages to make sense of it in both cases.
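
 One way to check where the 8-byte string could come from: the bytes reported above are exactly what you get when the 5 UTF-8 bytes are read as Mac OS Roman and then transcoded to UTF-8, since in that code page 0xC3 is √ and 0xA9 is ©. Whether the callback code really does this is an assumption; the sketch below (plain Ruby 2.0+, no SketchUp needed, helper name `hex_bytes` is made up) only shows that the bytes line up.

```ruby
# The 5 raw bytes that UTF-8 uses for "élan", labeled as binary.
raw = "\xC3\xA9lan".b

# Assumption: something relabels those bytes as Mac OS Roman
# and then transcodes them to UTF-8.
as_macroman = raw.dup.force_encoding("macRoman").encode("UTF-8")

# The same bytes read as ISO-8859-1 (Latin-1) instead, for comparison.
as_latin1 = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Hypothetical helper: print bytes in the \xNN style used in this thread.
def hex_bytes(str)
  str.bytes.map { |b| format("\\x%02X", b) }.join
end

puts as_macroman              # √©lan (8 bytes)
puts hex_bytes(as_macroman)   # starts with \xE2\x88\x9A\xC2\xA9
puts as_latin1                # Ã©lan (7 bytes)
puts hex_bytes(as_latin1)     # starts with \xC3\x83\xC2\xA9
```

 The Latin-1 reading gives the Ã that was expected; the Mac OS Roman reading gives the √ that was actually seen on the Mac.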
- 
 @slbaumgartner said: ... Further, it believes this string and the one from get_element_value (that prints as Ã©lan) are equal. This makes no sense to me...
 I created this confusion... I don't check for equality, it's a 'puts'. It should be
   @dlg2.execute_script("translation3.textContent='#{param_val}' + ' == ' + '#{elm_val}' + ' is ' + '#{param_val == elm_val}'")
 which returns
   √©lan == élan is false
 sorry, I'll get my coat... john
- 
 Sorry, I didn't update my profile.
 In Windows SketchUp 2013, p(params) returns
 "lid1,élan"
 so it works fine. (I think in SU8 also.)
 The problem is in Windows SU2014, which returns a non-UTF-8 string.
- 
 @juantxo said: Sorry, I didn't update my profile.
 I suspected that... which locale do you use? Does Sketchup.get_locale return en-US? It may have a bearing. john
- 
 Yes, en-US.
- 
 @driven said: @slbaumgartner said: ... Further, it believes this string and the one from get_element_value (that prints as Ã©lan) are equal. This makes no sense to me...
 I created this confusion... I don't check for equality, it's a 'puts'. It should be
   @dlg2.execute_script("translation3.textContent='#{param_val}' + ' == ' + '#{elm_val}' + ' is ' + '#{param_val == elm_val}'")
 which returns
   √©lan == élan is false
 sorry, I'll get my coat... john

 Nah, I should know enough to check your code. With a real test, equality fails in all cases (as it should!).
- 
 @slbaumgartner said: Nah, I should know enough to check your code. With a real test, equality fails in all cases (as it should!).
 it's the tangents that count... now you can write a proper test... john
- 
 The text that displays in your test says UFT-8 in many places. It is UTF-8.
 (The word Format is at the end of Unicode Transformation Format.) (nag, nag)
- 
 @slbaumgartner said: But in 2014, the original 5-byte UTF-8 string is somehow transmuted into the 8-byte string \xE2\x88\x9A\xC2\xA9lan. ... It's as if the callback processing code has an incorrect implementation of the transcoding.
 YES, agree. To me it looks like it IS UTF-8, but Ruby thinks it is some other encoding, and doubly transcodes* it into UTF-8 AGAIN.
 - P.S. - isn't transmute a math term?
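
 If the double-transcoding reading is right, the damage is reversible as long as every byte survived the round trip. A minimal sketch, assuming the wrong intermediate encoding was Mac OS Roman (it fits the bytes quoted above, but that is a guess, and on Windows it would more likely be Windows-1252); `undo_double_encoding` is a made-up helper name, not anything from the SketchUp API.

```ruby
# Undo "UTF-8 bytes misread as `wrong`, then transcoded to UTF-8 again".
# `wrong` is an assumption; pick the encoding that matches your platform.
def undo_double_encoding(str, wrong = "macRoman")
  # Step 1: encode back to the single-byte form, which recovers the original bytes.
  # Step 2: relabel those bytes as what they really were: UTF-8.
  fixed = str.encode(wrong).force_encoding("UTF-8")
  # If the result is not valid UTF-8, the input was not mangled this way.
  fixed.valid_encoding? ? fixed : str
rescue EncodingError
  # The string contains characters that do not exist in `wrong`: leave it alone.
  str
end

mangled = "\u221A\u00A9lan"           # "√©lan", as reported for the 2014 callback on Mac
puts undo_double_encoding(mangled)    # prints "élan"
```

 Strings that were never mangled (plain ASCII, or text with characters outside the single-byte code page) come back unchanged.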
 
- 
 @driven said: except, get_element_value fails pre v2013 and get_element_value works.
 ? It does? Are you sure?
- 
 @dan rathbun said: To me it looks like it IS UTF-8, but Ruby thinks it is some other encoding, and doubly transcodes* it into UTF-8 AGAIN.
 That's similar to a problem we have with __FILE__, $LOAD_PATH, $LOADED_FEATURES and ENV. The UTF-8 byte sequences aren't labeled with the correct encoding.
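
 When the bytes are already correct UTF-8 and only the label is wrong, no transcoding is needed, just relabeling. A small sketch of that idea (Ruby 2.0+; the helper name `relabel_utf8` is mine, and whether it is appropriate for a given __FILE__ or ENV value depends on the platform):

```ruby
# Relabel a string as UTF-8 without touching its bytes,
# but only if the bytes really form valid UTF-8.
def relabel_utf8(str)
  candidate = str.dup.force_encoding(Encoding::UTF_8)
  candidate.valid_encoding? ? candidate : str
end

mislabeled = "\xC3\xA9lan".b       # UTF-8 bytes carrying a binary (ASCII-8BIT) label
fixed = relabel_utf8(mislabeled)

puts fixed.encoding                # UTF-8
puts fixed == "\u00E9lan"          # true
puts fixed.bytes == mislabeled.bytes  # true, the bytes were never changed
```

 The difference from `encode` matters here: `encode` converts bytes from one encoding to another, `force_encoding` only changes what Ruby believes the existing bytes mean.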
- 
 @dan rathbun said: @slbaumgartner said: But in 2014, the original 5-byte UTF-8 string is somehow transmuted into the 8-byte string \xE2\x88\x9A\xC2\xA9lan. ... It's as if the callback processing code has an incorrect implementation of the transcoding.
 YES, agree. To me it looks like it IS UTF-8, but Ruby thinks it is some other encoding, and doubly transcodes* it into UTF-8 AGAIN.
 - P.S. - isn't transmute a math term?

 I was thinking more of alchemy
- 
 @tt_su said: ? It does? Are you sure?
 It certainly looks like it... I cleaned up my initial code as per the Nanny's and the Professor's prodding... and these are the new images...
 and the new script

   # encoding: UTF-8
   def show_problem
     @lang_hash = {'lid1'=>"élan"} # I'm using a hash because it's what I use in my plugin...
     @dlg2 = UI::WebDialog.new("Problem_Main", false, "main_prob", 700, 500, 600, 0, false)
     html = %Q(
       <!DOCTYPE html>
       <html>
       <head>
         <title>Problem_Main</title>
         <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
       </head>
       <body>
         <p>Tested using Sketchup.version #{Sketchup.version}</p>
         <p>Tested on #{RUBY_PLATFORM =~ /(darwin)/ ? ((%x(sw_vers).sub(/ProductName:/,'').sub(/ProductVersion:/,'').sub(/BuildVersion:/,'_'))) : 'windows'}</p>
         <p>click into the box and hit return or change to use another non ASCII-8BIT string...</p>
         <input id="#{@lang_hash.keys[0]}" value="#{@lang_hash.values[0]}" type="text" onkeydown="if (event.keyCode == 13) trans_L8();" title="type text and then 'Enter'">
         <p>below is the return using 'get_element_value'</p>
         <h4 id="translation1"><!-- return appears here --></h4>
         <p>below is the return using 'add_action_callback' </p>
         <h4 id="translation2"><!-- return appears here --></h4>
         <p>below is the return of the query 'add_action_callback' == 'get_element_value'</p>
         <h4 id="translation3"><!-- return appears here --></h4>
         <script type="text/javascript" charset="UTF-8">
           function trans_L8() {
             window.location = 'skp:trans_L8@'+#{@lang_hash.keys[0]}.id+','+(#{@lang_hash.keys[0]}.value);
           }
         </script>
       </body>
       </html>
     )
     @dlg2.set_html html
     RUBY_PLATFORM =~ /(darwin)/ ? @dlg2.show_modal() : @dlg2.show()
     @dlg2.add_action_callback("trans_L8") {|dialog, params|
       param_id = params.split(',')[0].to_s
       param_val = params.split(',')[1].to_s
       callback = 'return using add_action_callback ' # + param_id.to_s + ' = ' + param_val.to_s # commented out to avoid the error
       puts callback
       p(params)
       elm_val = (@dlg2.get_element_value(param_id)).to_s
       element_value = 'return using get_element_value => ' + elm_val
       puts element_value
       @dlg2.execute_script("translation1.textContent='#{(elm_val)}'")
       @dlg2.execute_script("translation2.textContent='#{param_val}'")
       @dlg2.execute_script("translation3.textContent='#{elm_val == param_val}'")
     }
   end
   show_problem
   # load("/Users/johns_iMac/Library/Application Support/SketchUp 2014/SketchUp/Plugins/jcb_ViewPortResize/dev/show_encoding_issue.rb")
   # load("[add your path]/show_encoding_issue.rb")

 and a question? Is there code to get the Windows operating system details? john
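
 One thing to watch in the callback: `params.split(',')[1]` drops everything after a second comma, so typing a value such as `élan, vite` comes back truncated. A hedged sketch of a more forgiving split, assuming the element id itself never contains a comma (`split_params` is a hypothetical helper, not part of the WebDialog API):

```ruby
# Split "id,value" on the FIRST comma only, so commas typed by the user survive.
def split_params(params)
  id, value = params.to_s.split(',', 2)
  [id.to_s, value.to_s]   # to_s guards against a missing id or value (nil)
end

p split_params("lid1,a,b")   # ["lid1", "a,b"]
p split_params("lid1")       # ["lid1", ""]
```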
- 
 a CLUE perhaps...
 Using js codepoints for my input, 'get_element_value' has switched all the separators...

   return using add_action_callback
   "lid1,/uD83D/uDC7D/u20AC/u00A3/u0061/u0009"
   return using get_element_value => \uD83D\uDC7D\u20AC\u00A3\u0061\u0009

 so I modified my code

   @dlg2.add_action_callback("trans_L8") {|dialog, params|
     param_id = params.split(',')[0].to_s
     param_val = params.split(',')[1].to_s
     param_val_sub = params.split(',')[1].gsub('/','\\').to_s
     p param_val_sub
     callback = 'return using add_action_callback ' # + param_id.to_s + ' = ' + param_val.to_s # commented out to avoid the error
     puts callback
     p(params)
     elm_val = (@dlg2.get_element_value(param_id)).to_s
     element_value = 'return using get_element_value => ' + elm_val
     puts element_value
     @dlg2.execute_script("translation1.textContent='#{elm_val}'")
     @dlg2.execute_script("translation2.textContent='#{param_val}'")
     @dlg2.execute_script("translation3.textContent='#{param_val_sub}'")
   }

 and this is the result

 john
- 
 So it seems there are in fact two issues...
 1: return having separators reversed
 2: encoding being mis-read
 I added the .gsub('/','\\') to both types of return, so it works on v8 or v13/v14.
 I added js to convert the string to UNICODE on keydown, before it is retrieved by SU...
 and the full workaround...

   # encoding: UTF-8
   def show_problem
     @lang_hash = {'lid1'=> %Q(élan \uF8FF ümlet)} # I'm using a hash because it's what I use in my plugin...
     @dlg2 = UI::WebDialog.new("Problem_Main", false, "main_prob", 700, 500, 600, 0, false)
     html = %Q(
       <!DOCTYPE html>
       <html>
       <head>
         <title>Problem_Main</title>
         <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
       </head>
       <body>
         <p>Tested using Sketchup.version #{Sketchup.version}</p>
         <p>Tested on #{RUBY_PLATFORM =~ /(darwin)/ ? ((%x(sw_vers).sub(/ProductName:/,'').sub(/ProductVersion:/,'').sub(/BuildVersion:/,'_'))) : 'windows'}</p>
         <p>click into the box and hit return or change to use another non ASCII-8BIT string...</p>
         <input id="#{@lang_hash.keys[0]}" value="#{@lang_hash.values[0]}" type="text" onkeydown=" if (event.keyCode == 13) this.value=unicodeLiteral(this.value);trans_L8();" title="type text and then 'Enter'">
         <p>below is the return using 'get_element_value'</p>
         <h4 id="translation1"><!-- return appears here --></h4>
         <p>below is the return using 'add_action_callback' </p>
         <h4 id="translation2"><!-- return appears here --></h4>
         <h4 id="translation3"><!-- return appears here --></h4>
         <script type="text/javascript" charset="UTF-8">
           function trans_L8() {
             window.location = 'skp:trans_L8@'+#{@lang_hash.keys[0]}.id+','+(#{@lang_hash.keys[0]}.value);
           }
           /* Creates a uppercase hex number with at least length digits from a given number */
           function fixedHex(number, length){
             var str = number.toString(16).toUpperCase();
             while(str.length < length)
               str = "0" + str;
             return str;
           }
           /* Creates a unicode literal based on the string */
           function unicodeLiteral(str){
             var i;
             var result = "";
             for( i = 0; i < str.length; ++i){
               /* You should probably replace this by an isASCII test */
               if(str.charCodeAt(i) > 126 || str.charCodeAt(i) < 32)
                 result += "\\\\" + "u" + fixedHex(str.charCodeAt(i),4);
               else
                 result += str[i];
             }
             return result;
           }
         </script>
       </body>
       </html>
     )
     @dlg2.set_html html
     RUBY_PLATFORM =~ /(darwin)/ ? @dlg2.show_modal() : @dlg2.show()
     @dlg2.add_action_callback("trans_L8") {|dialog, params|
       param_id = params.split(',')[0].to_s
       param_val = params.split(',')[1].gsub('/',"\\").to_s
       callback = 'return using add_action_callback ' # + param_id.to_s + ' = ' + param_val.to_s # commented out to avoid the error
       puts callback
       p(params)
       elm_val = (@dlg2.get_element_value(param_id)).gsub('/',"\\").to_s
       element_value = 'return using get_element_value => ' + elm_val
       puts element_value
       @dlg2.execute_script("translation1.textContent='#{elm_val}'")
       @dlg2.execute_script("translation2.textContent='#{param_val}'")
     }
   end
   show_problem
   # load("/Users/johns_iMac/Library/Application Support/SketchUp 2014/SketchUp/Plugins/jcb_ViewPortResize/dev/show_encoding_issue.rb")
   # load("[add your path]/show_encoding_issue.rb")

 BIG question is will this work on other PCs, or would I need a platform conditional?
 In review,
 I think the http header has the incorrect encoding and as SU is the server it must come from there...
 The internal 'separator' variations are still a mystery... john
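
 One possible explanation for the 'reversed separators' (unverified): the callback value travels inside a `skp:` URL, and some browser engines normalize backslashes in URLs to forward slashes, which would fit the fact that only the add_action_callback output shows them. If that is the cause, an alternative to gsub('/', '\\'), which also rewrites genuine slashes typed by the user, is to percent-encode the value in JavaScript with `encodeURIComponent` and decode it in Ruby. The sketch below shows only the Ruby half. It assumes Ruby 1.9.2+ (so SU2014, not SU8) and that SketchUp passes the percent escapes through undecoded, which I have not checked on every platform; `decode_param` is a made-up name.

```ruby
require 'uri'

# Decode an "id,percent-encoded-value" callback string into [id, value].
def decode_param(params)
  id, value = params.to_s.split(',', 2)
  # decode_www_form_component returns a UTF-8 labeled string by default.
  [id.to_s, URI.decode_www_form_component(value.to_s)]
end

# What encodeURIComponent would produce for:  élan \u00E9 a/b
# (é => %C3%A9, space => %20, backslash => %5C, slash => %2F)
sent = "lid1," + "%C3%A9lan%20%5Cu00E9%20a%2Fb"

id, value = decode_param(sent)
puts id               # lid1
puts value            # élan \u00E9 a/b
puts value.encoding   # UTF-8
```

 Because every non-ASCII character, backslash and slash is sent as plain ASCII, neither the separator swap nor the encoding label can affect the payload on the way in.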
- 
 @driven said: ... and a question? Is there code to get the Windows operating system details?
 Get the default system Encoding:
   Encoding::find("filesystem")
 or
   Encoding::find("locale")
 (On my machine it returns the #<Encoding:Windows-1252> object reference.)
 If you want the Windows version:
   %x[ver]
 On my machine it returns: "Microsoft Windows [Version 6.1.7601]"
 XP is 5.1
 Vista is 6.0
 Win7 is 6.1
 What else do you want to know ?
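
 A small sketch of turning that `ver` string into a name, using only the three version numbers listed above. It assumes the English word "Version" appears in the output (a localized Windows may print something else); `windows_name` and `WINDOWS_NAMES` are illustrative names only.

```ruby
# Version numbers taken from the list above; extend as needed.
WINDOWS_NAMES = { "5.1" => "XP", "6.0" => "Vista", "6.1" => "Win7" }

# Map the text returned by %x[ver] to a short name.
def windows_name(ver_output)
  version = ver_output.to_s[/Version (\d+\.\d+)/, 1]  # e.g. "6.1" from "6.1.7601"
  return "unknown" if version.nil?
  WINDOWS_NAMES.fetch(version, "Windows #{version}")
end

puts windows_name("Microsoft Windows [Version 6.1.7601]")  # Win7
```

 In the dialog this could replace the hard-coded 'windows' string, e.g. `windows_name(%x[ver])` in the non-darwin branch.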
- 
 @dan rathbun said: What else do you want to know ?
 does my last script run on your PC?
 if I swap these in, can you show me a screen shot?
 a more complex input...
   @lang_hash = {'lid1'=> %Q(élan 勢い Schwung импульс)} # I'm using a hash because it's what I use in my plugin...
 and the Windows versioning...
   <p>Tested on #{RUBY_PLATFORM =~ /(darwin)/ ? ((%x(sw_vers).sub(/ProductName:/,'').sub(/ProductVersion:/,'').sub(/BuildVersion:/,'_'))) : %x[ver]}</p>
 also, did you try it without the fix? did you get both returns on v2014? john
- 
 I had not actually run the code at all, before.
 Here goes:
 @driven said: ..., did you try it without the fix? did you get both returns on v2014?
 On SU2014, without the fix:

   load "test/WebDialog_param_encoding_bug.rb"
   true
   return using add_action_callback lid1 = élan
   return using get_element_value => élan

 - After hitting return the elements are NOT replaced in the WebDialog because element.textContent= does not work on MSIE.
 You need a platform code branch and use element.innerText= on PC.

 On SU2014, WITH the fix, AFTER hitting return:

   load "test/WebDialog_param_encoding_fix.rb"
   true
   return using add_action_callback
   "lid1,\\u00E9lan \\uF8FF \\u00FCmlet"
   return using get_element_value => \u00E9lan \uF8FF \u00FCmlet

 First setting:
   Encoding::default_internal="UTF-8"
   UTF-8
 .. has no effect (no difference.)
- After hitting return the elements are NOT replaced in the WebDialog because 
- 
 With the two changes on SU2014. Before hitting ENTER:
 .. but when I click in the text control, and hit the END key, the Ruby console shows:

   return using add_action_callback
   "lid1,\xC3\xA9lan \xE5\x8B\xA2\xE3\x81\x84 Schwung \xD0\xB8\xD0\xBC\xD0\xBF\xD1\x83\xD0\xBB\xD1\x8C\xD1\x81"
   return using get_element_value => élan 勢い Schwung импульс

 After hitting ENTER:
 .. and then the Ruby console shows:

   return using add_action_callback
   "lid1,\\u00E9lan \\u52E2\\u3044 Schwung \\u0438\\u043C\\u043F\\u0443\\u043B\\u044C\\u0441"
   return using get_element_value => \u00E9lan \u52E2\u3044 Schwung \u0438\u043C\u043F\u0443\u043B\u044C\u0441
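
 With the workaround in place, Ruby receives literal \uXXXX text rather than the characters themselves, so it has to be turned back into real characters before use. A minimal sketch, assuming Ruby 1.9+ and that the escapes are JavaScript-style UTF-16 code units (which is what charCodeAt produces, so characters outside the BMP arrive as surrogate pairs); `unescape_js_unicode` is a hypothetical helper.

```ruby
# Turn literal \uXXXX sequences back into UTF-8 characters.
def unescape_js_unicode(str)
  # Handle each run of consecutive escapes together so that a surrogate
  # pair such as \uD83D\uDC7D is decoded as ONE character.
  str.gsub(/(?:\\u\h{4})+/) do |run|
    units = run.scan(/\\u(\h{4})/).flatten.map { |hex| hex.to_i(16) }
    # Pack the code units as big-endian 16-bit values, label them UTF-16BE,
    # and let Ruby transcode to UTF-8. A lone surrogate raises an EncodingError.
    units.pack("n*").force_encoding("UTF-16BE").encode("UTF-8")
  end
end

# Single quotes keep the backslashes literal, as they arrive from the dialog.
puts unescape_js_unicode('\u00E9lan \u52E2\u3044 Schwung')   # élan 勢い Schwung
```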
- 
 thanks dan
 @unknownuser said: You need a platform code branch and use element.innerText= on PC.
 innerText works on mac, so I'll just change that... can you run with the tweak...
 and see if they show in the dialog...
 the extra // in your console return confuses me. I had to add more to escape ruby escaping javascript... john
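
 On the extra escaping in the console output: in the listings above the add_action_callback value is printed with `p`, which shows `String#inspect` output and therefore doubles every backslash, while the get_element_value line is printed with `puts`, which shows the string as it is. Both strings can hold exactly one backslash per escape. A quick check in plain Ruby:

```ruby
value = 'lid1,\u00E9lan'   # single quotes: one literal backslash, then "u00E9"

puts value                 # lid1,\u00E9lan
p value                    # "lid1,\\u00E9lan"

# Count the backslashes actually stored vs. the ones shown by inspect.
puts value.scan('\\').length           # 1
puts value.inspect.scan('\\').length   # 2
```

 So the doubled backslashes in the `p(params)` lines are a display artifact, not extra characters in the data.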