    Encode in 2014 again.

    Developers' Forum
      qiucx

      My function reads strings from a CSV file encoded as UTF-8 without BOM. All the .rb and HTML files are UTF-8 too.

      I am using File.open pathAttrFile, "r:utf-8" to read the CSV file, which contains Simplified Chinese text. My problem is that calling split directly on the UTF-8 string that was read does not work. I have to use fileline.force_encoding("ISO-8859-1").split "," instead, and then it works. I do not know the reason. Can anyone explain it? Thanks.

      
                  fileAttr = File.open pathAttrFile, "r:utf-8"
                  arrayFileLines = fileAttr.readlines
      
                  arrayFileLines.each do |fileline|
                      attrs = fileline.force_encoding("ISO-8859-1").split ","
      .....
      
      

      And if I need to pass the ISO-8859-1 string to HTML, it has to be converted back to UTF-8 using .encode("UTF-8") if defined?(Encoding).

      These conversions are not very convenient. Does anyone have a simpler way to address the issue?

      thanks

        Dan Rathbun

        Try setting Encoding::default_internal = "UTF-8"
        It has not been set (it's nil on startup) because it makes no difference on PC, which is a Ruby Core bug.
        I still think it should be set, just as the external encoding is:
        Encoding::default_external  # => #<Encoding:UTF-8>

        Also make sure you have a meta tag in the head of your HTML file that sets encoding ("charset=".)

        I'm not here much anymore.

          qiucx

          Thanks for your reply. Encoding::default_internal = "UTF-8" and Encoding::default_external = "UTF-8" are now set in my entry file main.rb, and that works: the Simplified Chinese characters are output correctly. However, the split method still does not work as expected. The code and the error log follow. What is the problem with the split method? Thanks.

          
          def getUsersInfoFromFile(fileName, path)
              arrayAttributes = Array.new
              pathAttrFile = Sketchup.find_support_file fileName, path
              if (pathAttrFile)
                  # Open read-only and tag the contents as UTF-8
                  fileAttr = File.open pathAttrFile, "r:utf-8"
                  arrayFileLines = fileAttr.readlines

                  arrayFileLines.each do |fileline|
                      # split raises ArgumentError if the line is not valid UTF-8
                      arrayFileLine = fileline.split ","
                      arrayAttributes << arrayFileLine
                  end
                  fileAttr.close
              end
              return arrayAttributes
          end
          
          

          Also, when arrayFileLines = fileAttr.readlines is changed to arrayFileLines = fileAttr.encode("UTF-8").readlines, the problem remains, but the split error goes away when fileAttr.encode("ISO-8859-1").readlines is used. Can you explain? Thanks again.

          Error:
          #<ArgumentError: invalid byte sequence in UTF-8>
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:130:in `split'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:130:in `readLines'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:138:in `getUsersInfoFromFile'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:27:in `checkLogin'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:97:in `block in login'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:106:in `call'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:106:in `show_modal'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:106:in `login'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/Main.rb:7:in `<top (required)>'
          C:/remove/remove_SketchUp/Tools/RubyStdLib/rubygems/core_ext/kernel_require.rb:45:in `require'
          C:/remove/remove_SketchUp/Tools/RubyStdLib/rubygems/core_ext/kernel_require.rb:45:in `require'
          C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD.rb:1:in `<top (required)>'

            qiucx

            I found a solution, but I do not know why it works.

            fileline.force_encoding("ISO-8859-1").encode("utf-8", replace: nil)

            Can anyone explain it? And does anyone have a better solution?

            thanks.
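
For readers wondering about the difference between the two calls in that workaround, here is a minimal sketch. It assumes some sample bytes (the UTF-8 encoding of "中文,abc"), which are made up for illustration and not taken from the attached CSV. `force_encoding` only changes the label on the string, while `encode` rewrites the bytes.

```ruby
# UTF-8 bytes for "中文,abc" as they might come off disk (sample data, an assumption).
raw = "\xE4\xB8\xAD\xE6\x96\x87,abc".b

# force_encoding only changes the label; the bytes stay the same.
relabelled = raw.dup.force_encoding("ISO-8859-1")
puts relabelled.bytes == raw.bytes  # the bytes are untouched
puts relabelled.length              # one character per byte now

# encode transcodes: every ISO-8859-1 character is rewritten as UTF-8,
# so the byte sequence changes and the Chinese text is no longer readable as-is.
transcoded = relabelled.encode("UTF-8")
puts transcoded.valid_encoding?
puts transcoded.bytesize

# Relabelling the original bytes as UTF-8 keeps the text intact.
intact = raw.dup.force_encoding("UTF-8")
puts intact.split(",").inspect
```

So the workaround always yields a string that split accepts, because every byte is a valid ISO-8859-1 character, but any non-ASCII text in it is no longer the original Chinese.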

              tt_su

              Can you share your file? The error indicates you are opening a file in UTF-8 mode without the file actually being UTF-8. Are you sure it's UTF-8 encoded?
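
One way to check this from Ruby is to read the raw bytes and ask whether they form valid UTF-8. The sketch below is only an illustration: `valid_utf8?` is a hypothetical helper name, and the sample bytes (the text "中文,abc" in UTF-8 and in GBK) are assumptions, not the contents of the actual CSV.

```ruby
# Returns true when the raw bytes form valid UTF-8.
# (valid_utf8? is a hypothetical helper name, not part of the SketchUp API.)
def valid_utf8?(bytes)
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end

# "中文,abc" encoded two ways (sample data, an assumption).
utf8_bytes = "\xE4\xB8\xAD\xE6\x96\x87,abc".b  # real UTF-8
gbk_bytes  = "\xD6\xD0\xCE\xC4,abc".b          # GBK, a Windows "ANSI" code page

puts valid_utf8?(utf8_bytes)  # the UTF-8 bytes pass the check
puts valid_utf8?(gbk_bytes)   # the GBK bytes do not
```

For a real file, something like `valid_utf8?(File.binread(pathAttrFile))` should tell whether the editor actually saved it as UTF-8.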

                qiucx

                The files are attached.


                userAttributes.csv


                UserManager.rb


                Login.zip

                  TIG (Moderator)

                  It's not the RB that is wrongly encoded - that is properly encoded as UTF8-without-BOM anyway.
                  BUT related files like the CSV [and HTML?] are also best when similarly encoded - but they are not.

                  Changing the encoding of the string read in v2014 back and forth makes it acceptable.
                  Note that the encoding change will break for users of earlier SketchUp versions, so you need to check whether it's defined before using it.
                  Also, writing/reading a file in the main folders could have permission issues in earlier versions (<v2014) AND break if the user has a custom folder setup 😕
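
A minimal sketch of such a guard, assuming a hypothetical helper named `to_utf8` (the name is made up for illustration). SketchUp versions before 2014 run Ruby 1.8, where strings have no encoding and `force_encoding` does not exist.

```ruby
# Relabel a string as UTF-8 only when the running Ruby supports encodings.
# (to_utf8 is a hypothetical helper name used for illustration.)
def to_utf8(str)
  if defined?(Encoding) && str.respond_to?(:force_encoding)
    str.dup.force_encoding("UTF-8")
  else
    str  # Ruby 1.8 (SketchUp before v2014): strings carry no encoding
  end
end

line = to_utf8("abc,def".b)
puts line.split(",").inspect
```

The helper works on a copy, so the caller's original string keeps its encoding label.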

                  TIG

                    qiucx

                    BUT related files like the CSV [and HTML?] are also best when similarly encoded - but they are not.

                    I have used Notepad++ to change the files' encoding to UTF-8 without BOM.

                    "also best when similarly encoded - but they are not": how can I make them similarly encoded?

                      TIG (Moderator)

                      Sorry, you misunderstand me.
                      'Similarly encoded' simply means encoded as UTF8-without-BOM - which you say you have now done ?

                      TIG

                        qiucx

                        I have set all files to UTF-8 without BOM.

                        But the error remains:

                        Error: #<ArgumentError: invalid byte sequence in UTF-8>
                        C:/Users/tc/AppData/Roaming/SketchUp/SketchUp 2014/SketchUp/Plugins/CRSD/RequireFiles/UserManager/UserManager.rb:24:in `split'

                        Can anyone help? Thanks.

                          TIG (Moderator)

                          Try this version...
                          I have marked changes with a final ###
                          Also try replacing the Chinese with some Western text in case of issues there ?


                          UserManager.rb

                          TIG

                            qiucx

                            Thanks to all. The problem is solved.

                            The solution was to use UltraEdit's convert function (convert ASCII to UTF-8 (unicode edition)). Originally I used Notepad++ to convert the files to UTF-8 without BOM, and that did not seem to work as expected.

                            Now the split function works correctly without any force_encoding, but the input from HTML still has to be converted using force_encoding("UTF-8").

                            Thanks again to TIG.
