Ruby "split" in file line reading

8 Posts 3 Posters 2.7k Views
  • S Offline
    shirazbj
    last edited by 1 Apr 2011, 04:17

    hi,

    I used to use:
    b = file.gets()
    c=b.split(" ")
    to read lines from a data file in text.

    but the following line doesn't have a " " between two values when the second value is negative.

    0.208646094863E+000-0.513313617099E-001-0.915625940530E-003 0.180604789668E+000-0.240882759294E-001-0.872767876563E-003

    How do I split them now?

    Thanks in advance

    Cean

    • D Offline
      Dan Rathbun
      last edited by 1 Apr 2011, 04:37

      Try this; it does not matter whether the numbers or exponents are positive or negative:
      b = file.gets()
      x = b[0..18].strip.to_f
      y = b[19..37].strip.to_f
      z = b[38..-1].strip.to_f

      It is also safer to read the whole file into an array, then iterate the array:
      ` filepath = 'dir/dir2/filename.dat'
      b = IO.readlines(filepath)

      # b is now an Array of text lines

      b.each_with_index do |line,index|
        # process the current line
      end`
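
A minimal sketch of the fixed-width idea, generalized to any number of values per line. It assumes (this is not confirmed in the thread) that every value sits in a 20-character field whose first character is either a minus sign or a blank, so the sample line is taken to start with a space. `FIELD_WIDTH` and `parse_fixed_width` are hypothetical names used only for illustration.

```ruby
# Assumption: each value occupies exactly 20 characters
# (sign or blank, "0.", 12 digits, "E", sign, 3 digits).
FIELD_WIDTH = 20

# Hypothetical helper: cut a line into fixed-width fields, convert each to a Float.
def parse_fixed_width(line, width = FIELD_WIDTH)
  # rstrip drops the newline and trailing blanks but keeps the leading blank
  line.rstrip.scan(/.{1,#{width}}/).map { |field| field.strip.to_f }
end

line = " 0.208646094863E+000-0.513313617099E-001-0.915625940530E-003" \
       " 0.180604789668E+000-0.240882759294E-001-0.872767876563E-003\n"
values = parse_fixed_width(line)
puts values.inspect
```

If the real file uses a different field width, only the constant needs to change; if the width varies, the regular expression approach further down the thread is the safer choice.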

      I'm not here much anymore.

      • C Offline
        Cleverbeans
        last edited by 4 Apr 2011, 16:14

        For something like this I like to use regular expressions. They have the advantage of letting you explicitly capture the pattern you're looking for, regardless of slight variations in the files. Here is how I would parse the line into [float,int] pairs; note that this still works even if the values are separated by white space. The pattern matches an optional + or -, then a digit, a decimal point, then more digits until it finds the E, then again an optional + or -, and exactly 3 digits. The String#scan method finds every match in the string and puts it into an array, and the groups delimited by parentheses split each match naturally into a [num,exp] pair. All that's left is to convert to numerical quantities. Here is the code.

        
        def parse_line(line)
           #accepts a string of numbers in scientific notation
           #returns an array of [float,int] pairs
           pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
           return line.scan(pat).map{|num,exp| [num.to_f, exp.to_i]}
        end
        
        
        • D Offline
          Dan Rathbun
          last edited by 4 Apr 2011, 18:28

          CB.. how about doing the conversion within the method, and returning an [x,y,z] array ?

          
          def parse_line(line)
             #accepts a string of numbers in scientific notation
             #returns an [x,y,z] array of floats
             pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
             val = line.scan(pat).map{|num,exp| [num.to_f, exp.to_i]}
             # Use SketchUp's Array class extended instance methods;
             # .x(), .y() and .z() to get the 3 members of val array.
             return [ val.x[0]*(10**val.x[1]),
                      val.y[0]*(10**val.y[1]),
                      val.z[0]*(10**val.z[1]) ]
          end 
          

          I'm not here much anymore.

          • C Offline
            Cleverbeans
            last edited by 5 Apr 2011, 15:19

            @dan rathbun said:

            CB.. how about doing the conversion within the method, and returning an [x,y,z] array ?

            Certainly can be done. I just didn't presume that the three numbers were coordinate values, or that a line would only have three values. Under those assumptions, though, I would probably do it a little differently, by doing the conversion within the map block, which tightens up the code even more than my original.

            
            def parse_line(line)
               #accepts a string of numbers in scientific notation
               #returns an array of floats
               pat = /([-|\+]?\d\.\d*)E([-|\+]?\d\d\d)/
               return line.scan(pat).map{|num,exp| num.to_f * 10.0**exp.to_i}
            end 
            

            You could also convert to numerical values and then stride the array if the values are not neatly separated by line. I figured there are times when scientific notation may be the preferred output for whatever reason, so I chose that route for the example code. In any event, the regular expression is at the heart of the function and the rest is easily tailored to various needs.
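
As a rough illustration of "convert, then stride", the sketch below flattens every value found in a block of text and groups the result in threes. The helper name `parse_triples` is made up for this example, the group size of 3 is an assumption about the data, and the sign class is written as `[-+]`. It relies on `Enumerable#each_slice` being built in; very old interpreters may need `require 'enumerator'` first.

```ruby
# Hypothetical helper: find every number in the text, convert mantissa and
# exponent to a Float, then group the flat list into [x, y, z] triples.
def parse_triples(text)
  pat = /([-+]?\d\.\d*)E([-+]?\d\d\d)/
  values = text.scan(pat).map { |num, exp| num.to_f * 10.0**exp.to_i }
  values.each_slice(3).to_a
end

text = "0.208646094863E+000-0.513313617099E-001-0.915625940530E-003 " \
       "0.180604789668E+000-0.240882759294E-001-0.872767876563E-003"
triples = parse_triples(text)
triples.each { |x, y, z| puts "x=#{x} y=#{y} z=#{z}" }
```

Because the scan ignores line breaks, the same helper works whether a triple sits on one line or is wrapped across several.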

            • C Offline
              Cleverbeans
              last edited by 5 Apr 2011, 15:57

              After looking at this again, it seems Ruby understands scientific notation natively, so there is no need to split each match into num and exp; it can just be captured as a single value. Here is the modification, which gives a noticeable increase in speed as well. In general I would still expect the regular expression to be slower than index parsing.

              def parse_line(line)
                 #accepts a string of numbers in scientific notation
                 #returns an array of floats
                 pat = /[-|\+]?\d\.\d*E[-|\+]?\d\d\d/
                 return line.scan(pat).map{|num| num.to_f}
              end
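
One way to check the speed claim on your own machine is a quick timing with the standard `benchmark` library. This is only a sketch: `parse_grouped` and `parse_direct` are illustrative names for the two-group and single-match variants, the iteration count is arbitrary, and the timings will differ between Ruby versions.

```ruby
require 'benchmark'

LINE = "0.208646094863E+000-0.513313617099E-001-0.915625940530E-003 " \
       "0.180604789668E+000-0.240882759294E-001-0.872767876563E-003"

# Variant with two capture groups and manual exponent handling
def parse_grouped(line)
  pat = /([-+]?\d\.\d*)E([-+]?\d\d\d)/
  line.scan(pat).map { |num, exp| num.to_f * 10.0**exp.to_i }
end

# Variant that lets String#to_f read the scientific notation directly
def parse_direct(line)
  pat = /[-+]?\d\.\d*E[-+]?\d\d\d/
  line.scan(pat).map { |num| num.to_f }
end

n = 20_000  # arbitrary repeat count
grouped_time = Benchmark.realtime { n.times { parse_grouped(LINE) } }
direct_time  = Benchmark.realtime { n.times { parse_direct(LINE) } }
puts format("grouped: %.3fs  direct: %.3fs", grouped_time, direct_time)
```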
              
              • S Offline
                shirazbj
                last edited by 8 Apr 2011, 02:44

                Heard of regular expressions before; this time I tried to understand them. Thanks.

                • C Offline
                  Cleverbeans
                  last edited by 11 Apr 2011, 15:20

                  @shirazbj said:

                  Heard of regular expressions before; this time I tried to understand them. Thanks.

                  Regular expressions are simultaneously very useful and painful to work with. There is a great quote by Jamie Zawinski which goes "Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems." I first ran into them in Dive into Python 3, which gives a couple of simple case studies, and although the language is slightly different, the regular expressions are basically the same.

                  Looking at the regular expression given by /[-|\+]?\d\.\d*E[-|\+]?\d\d\d/, we have the / on each end marking the start and end of the expression. There are two blocks of [-|\+]? which match an optional + or - sign; I would read this as "0 or 1 instances of + or -". (Strictly, the | inside the brackets is a literal character rather than an "or", so [-+]? would be enough.) Then we have \d\.\d*E, which matches a single digit, a decimal point, then an arbitrary number of digits until it finds an E. Then we again have an optional sign followed by exactly 3 digits. So this pattern matches a single numerical value in the given string. The .scan method then constructs an array out of every match. From here you can wrap parts of the pattern in parentheses to create groups, which breaks each match up into its own array; that is what I had done originally, until I realized Ruby would parse scientific notation with the .to_f method.
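
The same walk-through can be written into the pattern itself using Ruby's extended (`/x`) mode, which ignores whitespace and allows comments inside the expression. This is a sketch rather than the code posted earlier: the constant names `NUMBER` and `NUMBER_GROUPED` are invented here, the sign class is written as `[-+]`, and `\d{3}` stands in for `\d\d\d`.

```ruby
# One number in scientific notation, annotated piece by piece
NUMBER = /
  [-+]?   # optional sign of the mantissa
  \d      # exactly one digit before the decimal point
  \.      # a literal decimal point
  \d*     # any number of digits after it
  E       # the exponent marker
  [-+]?   # optional sign of the exponent
  \d{3}   # exactly three exponent digits
/x

# Same pattern, with parentheses splitting each match into [mantissa, exponent]
NUMBER_GROUPED = /([-+]?\d\.\d*)E([-+]?\d{3})/

sample  = "0.208646094863E+000-0.513313617099E-001"
matches = sample.scan(NUMBER)          # array of matched strings
pairs   = sample.scan(NUMBER_GROUPED)  # array of [mantissa, exponent] string pairs
floats  = matches.map { |m| m.to_f }   # to_f reads the E notation directly

puts matches.inspect
puts pairs.inspect
puts floats.inspect
```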
