    Ruby "split" in file line reading

    8 Posts 3 Posters 2.7k Views
      shirazbj

      hi,

      I used to use:
      b = file.gets()
      c=b.split(" ")
      to read lines from a data file in text.

      but the following lines don't have a " " between two values when the second value is negative.

      0.208646094863E+000-0.513313617099E-001-0.915625940530E-003 0.180604789668E+000-0.240882759294E-001-0.872767876563E-003

      How do I split them now?

      Thanks in advance

      Cean

        Dan Rathbun

        Try this; it does not matter whether the numbers or exponents are positive or negative:
        b = file.gets()
        # column ranges match the sample line as posted; adjust them to your file
        x = b[0..18].strip.to_f
        y = b[19..38].strip.to_f
        z = b[39..-1].strip.to_f
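If the fields really are fixed-width, the slicing can be generalized so it does not depend on how many values sit on a line. The following is only a sketch: it assumes every value occupies exactly 20 columns, with a sign column that is blank for positive numbers (the sample line above looks as if it lost its leading blank when it was posted). `FIELD_WIDTH`, `SAMPLE_LINE` and `fixed_width_floats` are made-up names for illustration.

```ruby
# Assumed column width: 1 sign column + 19 characters of digits and exponent.
FIELD_WIDTH = 20

# Cuts a line into FIELD_WIDTH-character chunks and converts each to a Float.
def fixed_width_floats(line, width = FIELD_WIDTH)
  line.chomp.scan(/.{1,#{width}}/).map { |field| field.strip.to_f }
end

# Sample line from the question, with the leading blank restored (assumption).
SAMPLE_LINE = " 0.208646094863E+000-0.513313617099E-001-0.915625940530E-003" \
              " 0.180604789668E+000-0.240882759294E-001-0.872767876563E-003\n"

values = fixed_width_floats(SAMPLE_LINE)
puts values.length   # 6
p values
```

If the real file does not keep that blank sign column in front of the first value, the chunks will drift by one character, so check a few lines of the actual data before relying on this.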

        It is also safer to read the whole file into an array, then iterate over the array:
        ` filepath = 'dir/dir2/filename.dat'
        b = IO.readlines(filepath)
        # b is now an Array of text lines
        b.each_with_index do |line,index|
          # process the current line
        end`
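For large data files, a streaming variant may be preferable to loading everything at once. This sketch swaps `IO.readlines` for `File.foreach`, which hands over one line at a time; it assumes a reasonably modern Ruby (older SketchUp versions shipped Ruby 1.8, where this form may not be available). `each_data_line` is a hypothetical helper name, and the temporary file only stands in for the real data file.

```ruby
require 'tempfile'

# Hypothetical helper: yields each non-blank line of the file together with
# its 1-based line number.
def each_data_line(filepath)
  File.foreach(filepath).with_index(1) do |line, number|
    next if line.strip.empty?   # skip blank lines
    yield line.chomp, number
  end
end

# Usage with a throwaway file standing in for the real data file.
Tempfile.create('sample') do |f|
  f.puts "first line", "", "third line"
  f.flush
  each_data_line(f.path) { |line, number| puts "#{number}: #{line}" }
end
```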

        I'm not here much anymore.

          Cleverbeans

          For something like this I like to use regular expressions. They have the advantage of letting you explicitly capture the pattern you're looking for, regardless of slight variations in the files. Here is how I would parse the line into [float,int] pairs; note that this still works even if the values are separated by white space. The pattern matches an optional + or -, then a digit, a decimal point, then more digits until it finds the E, then again an optional + or -, and exactly 3 digits. The String#scan method finds every match in the string and puts it into an array, and the groups delimited by parentheses split each match naturally into a [num,exp] pair. All that's left is to convert them to numerical quantities. Here is the code.

          
          def parse_line(line)
             #accepts a string of numbers in scientific notation
             #returns an array of [float,int] pairs
             pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
             return line.scan(pat).map{|num,exp| [num.to_f, exp.to_i]}
          end
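To make the intermediate step visible, here is what `scan` returns for the first half of the sample line before any conversion. It uses the same pattern as above; `PAIR_PATTERN`, `SAMPLE` and `PAIRS` are names chosen only for this illustration.

```ruby
# Same pattern as in parse_line: group 1 is the mantissa, group 2 the exponent.
PAIR_PATTERN = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/

SAMPLE = "0.208646094863E+000-0.513313617099E-001-0.915625940530E-003"

# With groups in the pattern, scan returns one array of strings per match.
PAIRS = SAMPLE.scan(PAIR_PATTERN)
p PAIRS
# => [["0.208646094863", "+000"], ["-0.513313617099", "-001"], ["-0.915625940530", "-003"]]
```

Everything is still a string at this point, which is why the `map` block is needed to turn each pair into numbers.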
          
          
            Dan Rathbun

            CB.. how about doing the conversion within the method, and returning an [x,y,z] array ?

            
            def parse_line(line)
               #accepts a string of three numbers in scientific notation
               #returns an [x,y,z] array of floats
               pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
               val = line.scan(pat).map{|num,exp| [num.to_f, exp.to_i]}
               # Use SketchUp's Array class extended instance methods;
               # .x(), .y() and .z() to get the 3 members of val array.
               return [ val.x[0]*(10.0**val.x[1]),
                        val.y[0]*(10.0**val.y[1]),
                        val.z[0]*(10.0**val.z[1]) ]
            end
            


              Cleverbeans

              @dan rathbun said:

              CB.. how about doing the conversion within the method, and returning an [x,y,z] array ?

              Certainly can be done. I just didn't presume that the three numbers were coordinate values, or that a line would only have three values. Under those assumptions, however, I would probably do it a little differently, by constructing the value inside the map block, which tightens up the code even more than my original.

              
              def parse_line(line)
                 #accepts a string of numbers in scientific notation
                 #returns an array of floats
                 pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
                 return line.scan(pat).map{|num,exp| num.to_f * 10.0**exp.to_i}
              end
              

              You could also convert to numerical values and then stride the array if the values are not neatly separated by line. I figured there are times when scientific notation may be the preferred output for whatever reason, so I chose that route for the example code. In any event, the regular expression is at the heart of the function and the rest is easily tailored to various needs.
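The "convert, then stride" idea could look roughly like this. It is a sketch that assumes the values come in groups of three and that `String#to_f` accepts the exponent form directly; `VALUE_PATTERN`, `SAMPLE_TEXT` and `parse_triples` are hypothetical names, and the pattern writes the sign class as `[-+]`, a slight tightening of the one above.

```ruby
# One whole number in scientific notation, no capture groups.
VALUE_PATTERN = /[-+]?\d\.\d*E[-+]?\d{3}/

# Scans the whole text, ignoring where the line breaks fall, converts every
# match to a Float, then groups the flat list into [x, y, z] triples.
def parse_triples(text)
  text.scan(VALUE_PATTERN).map(&:to_f).each_slice(3).to_a
end

# Sample values from the question, deliberately broken across lines unevenly.
SAMPLE_TEXT = "0.208646094863E+000-0.513313617099E-001-0.915625940530E-003 0.180604789668E+000\n" \
              "-0.240882759294E-001-0.872767876563E-003\n"

parse_triples(SAMPLE_TEXT).each { |triple| p triple }
```

If the total number of values is not a multiple of three, `each_slice` leaves a shorter final group, so that case would need its own check.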

                Cleverbeans

                After looking at this again, it seems Ruby understands scientific notation natively, so there is no need to split the match into num and exp; it can just be captured as a single value. Here is the modification, which gives a noticeable increase in speed as well. In general I would still expect the regular expression to be slower than index parsing.

                def parse_line(line)
                   #accepts a string of numbers in scientific notation
                   #returns an array of floats
                   pat = /[-|\+]?\d\.\d*E[-|\+]?\d\d\d/
                   return line.scan(pat).map{|num| num.to_f}
                end
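A quick check of that native handling, plus one caveat worth knowing: `String#to_f` never raises, so malformed input quietly becomes 0.0. The stricter `Kernel#Float` is one possible alternative; `strict_float` below is a hypothetical wrapper, not something from the posts above.

```ruby
# String#to_f understands the exponent form directly.
puts "0.208646094863E+000".to_f
puts "-0.513313617099E-001".to_f

# to_f is lenient: text that is not a number converts to 0.0 without complaint.
puts "not a number".to_f   # 0.0

# Hypothetical strict variant: Kernel#Float raises ArgumentError on bad input,
# which is turned into nil here so the caller can decide what to do.
def strict_float(text)
  Float(text)
rescue ArgumentError
  nil
end

p strict_float("-0.915625940530E-003")
p strict_float("not a number")   # nil
```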
                
                  shirazbj

                  I had heard of regular expressions before; this time I tried to understand them. Thanks.

                    Cleverbeans

                    @shirazbj said:

                    I had heard of regular expressions before; this time I tried to understand them. Thanks.

                    Regular expressions are simultaneously very useful and painful to work with. There is a great quote by Jamie Zawinski which goes "Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems." I first ran into them in Dive Into Python 3, which gives a couple of simple case studies, and although the language is slightly different the regular expressions are basically the same.

                    Looking at the regular expression /[-|\+]?\d\.\d*E[-|\+]?\d\d\d/, we have the / on each end marking the start and end of the expression. There are two blocks of [-|\+]?, each of which matches an optional + or - sign. I would read this as "0 or 1 instances of + or -". Then we have \d\.\d*E, which matches a single digit, a decimal point, then an arbitrary number of digits until it finds an E. After that we again have an optional sign followed by exactly 3 digits. So this pattern will match a single numerical value in the given string. The .scan method then constructs an array out of every match. From here you can wrap parts of the pattern in parentheses to create groups, which breaks each match up into its own array. That is what I had done originally, until I realized Ruby would parse scientific notation with the .to_f method.
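The same walk-through can be written into the pattern itself using Ruby's extended (`/x`) mode, which ignores whitespace and allows comments. This is a sketch of a slightly tightened variant rather than the exact pattern above: inside square brackets a `|` is just a literal character, so the sign class is written as `[-+]`. `NUMBER_PATTERN` is a made-up constant name.

```ruby
# Extended mode (/x): whitespace is ignored and # starts a comment.
NUMBER_PATTERN = /
  [-+]?    # optional sign; inside brackets a | would be a literal character
  \d       # exactly one digit before the decimal point
  \.       # a literal decimal point (escaped, since a bare . matches anything)
  \d*      # any number of digits after the point
  E        # the exponent marker
  [-+]?    # optional exponent sign
  \d{3}    # exactly three exponent digits
/x

# Returns every number found in the line as a Float.
def parse_line(line)
  line.scan(NUMBER_PATTERN).map(&:to_f)
end

p parse_line("0.208646094863E+000-0.513313617099E-001-0.915625940530E-003")
```

Because the pattern describes one complete number, it does not matter whether neighbouring values are separated by a space or run straight into each other.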
