python - List lookup faster than tuple? -
in past, when i've needed array-like indexical lookups in tight loop, use tuples, since seem extremely performant (close using n-number of variables). however, decided question assumption today , came surprising results:
in [102]: l = range(1000) in [103]: t = tuple(range(1000)) in [107]: timeit(lambda : l[500], number = 10000000) out[107]: 2.465047836303711 in [108]: timeit(lambda : t[500], number = 10000000) out[108]: 2.8896381855010986 tuple lookups appear take 17% longer list lookups! repeated experimentation gave similar results. disassembling each, found them both be:
in [101]: dis.dis(lambda : l[5]) 1 0 load_global 0 (l) 3 load_const 1 (5) 6 binary_subscr 7 return_value for reference, typical 10,000,000 global variable lookup/returns take 2.2s. also, ran without lambdas, y'know, in case (note number=100,000,000 rather 10,000,000).
in [126]: timeit('t[500]', 't=range(1000)', number=100000000) out[126]: 6.972800970077515 in [127]: timeit('t[500]', 't=tuple(range(1000))', number=100000000) out[127]: 9.411366939544678 here, tuple lookup take 35% longer. what's going on here? tight loops, seems significant discrepancy. causing this?
note decomposition variable (e.g. x,y=t), tuples faster (~6% in few tests less time) , construction fixed number of arguments, tuples crazy faster(~83% less time). don't take these results general rules; performed few minitests going meaningless projects.
in [169]: print(sys.version) 2.7.1 (r271:86882m, nov 30 2010, 09:39:13) [gcc 4.0.1 (apple inc. build 5494)]
tuples faster constructing lists, not accessing them.
tuples should faster access: require 1 less indirection. however, believe main benefit don't require second allocation when constructing list.
the reason lists faster lookups because python engine has special optimization it:
case binary_subscr: w = pop(); v = top(); if (pylist_checkexact(v) && pyint_checkexact(w)) { /* inline: list[int] */ py_ssize_t = pyint_asssize_t(w); if (i < 0) += pylist_get_size(v); if (i >= 0 && < pylist_get_size(v)) { x = pylist_get_item(v, i); py_incref(x); } with optimization commented out, tuples faster lists (by 4%).
note adding separate special-case optimization tuples here isn't necessary idea. every special case in main body of vm loop increases code size, decreases cache consistency, , means every other type of lookup requires branch.
Comments
Post a Comment