Set.insert vs array << x unless array.include?(x)
-
Great info Thom! I had played around with sets and arrays before and I was not impressed with sets. But I did not experiment with amount of unique objects, which apparently has an adverse effect on arrays. Thanks,
Chris
-
-
I'll be damned!
Very interesting Jernej....looks like I need to do some more testing of my script and possibly refactor again.
So while the
Array.include?
is dead slow - the overhead of hash look-up is still faster than just adding everything into one big pile and so a single filtering afterwards... -
looking at the .uniq! source code: http://ruby-doc.org/core/classes/Array.src/M002215.html
/* * call-seq; * array.uniq! -> array or nil * * Removes duplicate elements from _self_. * Returns <code>nil</code> if no changes are made (that is, no * duplicates are found). * * a = [ "a", "a", "b", "b", "c" ] * a.uniq! #=> ["a", "b", "c"] * b = [ "a", "b", "c" ] * b.uniq! #=> nil */ static VALUE rb_ary_uniq_bang(ary) VALUE ary; { VALUE hash, v, vv; long i, j; hash = ary_make_hash(ary, 0); if (RARRAY(ary)->len == RHASH(hash)->tbl->num_entries) { return Qnil; } for (i=j=0; i<RARRAY(ary)->len; i++) { v = vv = rb_ary_elt(ary, i); if (st_delete(RHASH(hash)->tbl, (st_data_t*)&vv, 0)) { rb_ary_store(ary, j++, v); } } RARRAY(ary)->len = j; return ary; }
-
Jernej: how about larger iterations and higher number of random values?
-
How about using Array.uniq! method:
Test 1
t=Time.now;a=[];10000000.times{r=rand(10);a<<r unless a.include?(r)};puts Time.now - t
Result: 12.297
t=Time.now;a=Set.new;10000000.times{a.insert(rand(10))};puts Time.now - t
Result: 15.719
t=Time.now;a=[];10000000.times{r=rand(10);a<<r};a.uniq!; puts Time.now - t
Result: 7.753Test 2
t=Time.now;a=[];100000.times{r=rand(10000);a<<r unless a.include?(r)};puts Time.now-t
Result: 40.97
t=Time.now;a=Set.new;100000.times{a.insert(rand(10000))};puts Time.now-t
Result: 0.377
t=Time.now;a=[];100000.times{r=rand(10000);a<<r};a.uniq!;puts Time.now-t
Result: 0.087 -
@thomthom said:
Jernej: how about larger iterations and higher number of random values?
t=Time.now;a=Set.new;10000000.times{a.insert(rand(10000))};puts Time.now - t
Result: 37.911
t=Time.now;a=[];10000000.times{r=rand(10000);a<<r};a.uniq!; puts Time.now - t
Result: 8.282Still a winner?
-
It's refactoring time!
Nice find!
-
That's all great (using .uniq!) until you start dealing with Point3d objects
In that case, always use Set. -
....or make all of your Point3d's into arrays so they will sort!/uniq! etc as arrays...
-
@tig said:
....or make all of your Point3d's into arrays so they will sort!/uniq! etc as arrays...
But is the overhead of converting the Point3d's into arrays and uniq! faster than using a Set?
-
Who knows ?
Time for you to do another test... -
I probably don't know what I am doing, but I ran the following test, and obtained the
attached results
. I typically usearray.push variable
, and don't understand the situations when the other examples might be used. Btw, when I applied the other forms to my app, it failed in ways that leave me to believe that those forms are data sensitive. Can anyone explaine to a Ruby beginner what's up?t=Time.now a=[] 100000.times do r=rand(10000) a<<r end a.uniq! puts Time.now-t
0.125
t=Time.now a=[] 100000.times do r=rand(10000) a.push r end a.uniq! puts Time.now-t
0.141
t=Time.now a=[] 100000.times do r=rand(10000) a.push r end puts Time.now-t
0.094
t=Time.now a=[] 100000.times do r=rand(10000) a<<r end puts Time.now-t
0.093
Advertisement