Letting them compete three times (a million pops/dels each time):
from timeit import timeit
for _ in range(3):
    t1 = timeit('b.pop(0)', 'b = bytearray(1000000)')
    t2 = timeit('del b[0]', 'b = bytearray(1000000)')
    print(t1 / t2)
Time ratios (Try it online!):
274.6037053753368
219.38099365582403
252.08691226683823
Why is pop that much slower at doing the same thing?
2 Answers
When you run b.pop(0), Python moves all the elements back by one, as you might expect. This takes O(n) time.

When you run del b[0], Python simply increases the start pointer of the object by 1.

In both cases, PyByteArray_Resize is called to adjust the size. When the new size is smaller than half the allocated size, the allocated memory will be shrunk. In the del b[0] case, this is the only point where the data will be copied. As a result, this case will take O(1) amortized time.
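To see the asymptotic difference directly, here is a sketch of my own (timings are machine-dependent): emptying the array from the front with pop(0) shifts the remaining bytes on every call, so the whole run is quadratic, while del b[0] stays linear overall:

```python
# Sketch (not from the answer): empty a bytearray from the front with each
# method and compare total time. pop(0) memmoves the tail on every call
# (O(n) per deletion, O(n^2) total); del b[0] just advances the start
# pointer (O(1) amortized per deletion, O(n) total).
from timeit import timeit

n = 100_000
t_pop = timeit('while b: b.pop(0)', f'b = bytearray({n})', number=1)
t_del = timeit('while b: del b[0]', f'b = bytearray({n})', number=1)
print(f'pop(0): {t_pop:.4f}s   del b[0]: {t_del:.4f}s   '
      f'ratio: {t_pop / t_del:.0f}x')
```

The ratio also grows with n, which is the signature of O(n^2) vs O(n) rather than constant per-call overhead.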
Relevant code:
The bytearray_pop_impl function always calls:

    memmove(buf + index, buf + index + 1, n - index);

The bytearray_setslice_linear function is called for del b[0] with lo == 0, hi == 1, bytes_len == 0. It reaches this code (with growth == -1):
    if (lo == 0) {
        /* Shrink the buffer by advancing its logical start */
        self->ob_start -= growth;
        /*
          0   lo               hi             old_size
          |   |<----avail----->|<-----tail------>|
          |   |<-bytes_len->|<-----tail------>|
          0   new_lo       new_hi           new_size
        */
    }
    else {
        /*
          0   lo               hi               old_size
          |   |<----avail----->|<-----tomove------>|
          |   |<-bytes_len->|<-----tomove------>|
          0   lo           new_hi             new_size
        */
        memmove(buf + lo + bytes_len, buf + hi,
                Py_SIZE(self) - hi);
    }
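The lo == 0 fast path only applies when the deleted range starts at the very beginning. A quick way to see that (my own check; absolute timings vary by machine) is to compare deleting at index 0 against deleting at index 1, which falls into the else branch and memmoves the tail:

```python
# Sketch: del b[0] hits the lo == 0 branch (pointer bump), while del b[1]
# hits the else branch and memmoves the tail on every call.
from timeit import timeit

t_front = timeit('del b[0]', 'b = bytearray(100000)', number=50000)
t_second = timeit('del b[1]', 'b = bytearray(100000)', number=50000)
print(t_second / t_front)  # prints a ratio well above 1
```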
Good answer. Do you know why developers did not choose to increment the pointer also for b.pop(0)? – Jérôme Richard yesterday
@JérômeRichard I've seen it many times. They don't want to optimize uncommon cases. Both to avoid slowing down common cases even slightly (might be a net loss overall!), and to avoid having extra code that needs to be written, tested, reviewed, maintained, read, may introduce errors, and may complicate things. See the original discussion of the slicing optimization for a perfect example of all that. – Kelly Bundy yesterday
Someone had use cases for the slicing optimization (FIFO buffers), and it was somewhat of a fight. Amusingly, at some point they spoke of "popping", but weren't talking about pop() but about removing a slice. They never spoke about popping single bytes, and I do believe that is really uncommon for bytearrays. Even from the end. Especially when the array is as huge as mine. And my speed ratio is also much smaller when the array is much smaller. – Kelly Bundy yesterday
So I suspect nobody ever even desired bytearray's pop(0) to be optimized. I didn't, either; I have no real use for it, I only stumbled upon this when looking for something else. And if I did have a use for it, I could most likely use del instead and would just do that. – Kelly Bundy yesterday
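The del-based workaround mentioned above can be sketched as a tiny helper (my own code, not from the thread): read the byte first, then delete it through the optimized del path:

```python
# Sketch of a pop(0) replacement that uses the fast del path.
def pop_front(b: bytearray) -> int:
    """Return and remove b[0] via del, which advances the start pointer."""
    value = b[0]   # raises IndexError on an empty bytearray, like pop
    del b[0]       # O(1) amortized instead of O(n)
    return value

b = bytearray(b'hello')
print(pop_front(b), b)  # 104 bytearray(b'ello')
```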
@PM2Ring Hmm, I don't think so. We know why slice deletion got implemented, that's in the linked original discussion. I suspect (didn't check) that del with a single index got optimized as a side effect. And optimizing pop was never even mentioned, at least in that discussion. – Kelly Bundy yesterday
I have to admit, I was very surprised by the timings myself. After convincing myself that they were in fact correct, I took a dive into the CPython source code, and I think I found the answer: CPython optimizes del bytearr[0:x] by just incrementing the pointer to the start of the array:
    if (growth < 0) {
        if (!_canresize(self))
            return -1;

        if (lo == 0) {
            /* Shrink the buffer by advancing its logical start */
            self->ob_start -= growth;
You can find the del bytearray[...] logic here (implemented via bytearray_setslice, with values being NULL), which in turn calls bytearray_setslice_linear, which contains the above optimization.
For comparison, bytearray.pop does NOT implement this optimization; see here in the source code.
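The FIFO-buffer use case that motivated the slicing optimization (mentioned in the comments above) looks roughly like this, as a sketch of my own:

```python
# Sketch: using a bytearray as a FIFO byte buffer. Appends go on the end;
# consuming from the front with del buf[:n] hits the pointer-advance
# optimization, so it's O(1) amortized rather than a full copy.
buf = bytearray()
buf += b'header'          # producer side: append at the end
buf += b'payload'
chunk = bytes(buf[:6])    # consumer side: read the first 6 bytes
del buf[:6]               # cheap: advances the logical start
print(chunk, bytes(buf))  # b'header' b'payload'
```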
There's at least an assignment involved in .pop(), but the difference is quite extraordinary. – yesterday
Probably has to do with the fact that pop() has to return the value, hence it has to do reference counting and also convert the return value. Python also takes a lot of time in attribute access (b.pop → find the attribute, and then calling it is expensive, but that can be improved a bit by doing pop = b.pop), and it doesn't have to worry about any of that with del being a single bytecode instruction. – Ashwini Chaudhary yesterday
@AshwiniChaudhary No, those things don't make that much of a difference. Even pop(0) vs b[0]; del b[0]; b.pop was still over 100. – yesterday
This probably goes without saying, but if performance is a concern you should either reverse the array order or get a deq – Tornado547 9 hours ago
@Tornado547 No. Reversing makes it unnatural, confusing, inconvenient, and takes extra time. And collections.deque (if that's what you mean) is far less performant at jobs for bytearrays and is lacking features like substring search. – 9 hours ago