a,b = b,a in python vs std::swap() in C++
I know that a,b = b,a is basically assigning a tuple (a,b) the values of another tuple (b,a). This is, essentially, swapping the values form a to b and from b to a. Thus, causing a "swap".
This is the functionality of the swap() function in C++.
From research, I have seen that C++'s swap() function uses a third temporary variable to perform the swap. I haven't been able to find how is a,b = b,a implemented in python.
How is a,b = b,a implemented?
Does python also use a third temporary variable? If it doesn't, how does it work?
How do both operations compare in terms of speed? I'm guessing that if python also uses a third variable, the difference in execution time would be due to python being interpreted.
Edit: All answers are great, but the community seems to think that Sapan's is the best one. Also thanks to a_guest, whom, although didn't post an answer, gave us a great deal of information in the comments. Also: everyone seems to agree that swap() is faster just because its C++. I don't necessarily agree with that. Python can be very fast if run as a frozen binary.
For tuple assignments, Python uses the stack structure directly:
>>> import dis
>>> def abc(a, b):
... a, b = b, a
...
>>> dis.dis(abc)
2 0 LOAD_FAST 1 (b)
3 LOAD_FAST 0 (a)
6 ROT_TWO
7 STORE_FAST 0 (a)
10 STORE_FAST 1 (b)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
In python, assignments in a target list on the left-hand side are done from left to right.
How is a,b = b,a implemented?
First, b, a creates a tuple. You can verify this using e.g.
>>> tmp = 1, 2
>>> tmp
(1, 2)
Then, the assignment uses sequence unpacking, overwriting the names a, b. Hence the code is basically
>>> tmp = (a, b)
>>> b, a = tmp
How do both operations compare in terms of speed?
This would depend on your implementation of python. If you use CPython (the standard version), then C++ would likely be much faster since it is compiled and optimized.
CPython implementation details
In CPython, the swap is sometimes optimized. For small swaps (<4 elements) it uses an optimized swap
>>> def swap(a, b):
>>> a, b = b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 1 (b)
3 LOAD_FAST 0 (a)
6 ROT_TWO
7 STORE_FAST 0 (a)
10 STORE_FAST 1 (b)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
>>> def swap(a, b, c):
>>> a, b, c = c, b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 2 (c)
3 LOAD_FAST 1 (b)
6 LOAD_FAST 0 (a)
9 ROT_THREE
10 ROT_TWO
11 STORE_FAST 0 (a)
14 STORE_FAST 1 (b)
17 STORE_FAST 2 (c)
20 LOAD_CONST 0 (None)
23 RETURN_VALUE
For swaps of 4 or more elements it does exactly what I wrote above, without optimization.
>>> def swap(a, b, c, d):
>>> a, b, c, d = d, c, b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 3 (d)
3 LOAD_FAST 2 (c)
6 LOAD_FAST 1 (b)
9 LOAD_FAST 0 (a)
12 BUILD_TUPLE 4
15 UNPACK_SEQUENCE 4
18 STORE_FAST 0 (a)
21 STORE_FAST 1 (b)
24 STORE_FAST 2 (c)
27 STORE_FAST 3 (d)
30 LOAD_CONST 0 (None)
33 RETURN_VALUE
Adding to Sapan's answer:
In C++ it might conceptionally use a third variable to swap. But you can see here that the compiler can produce the same assembly as the one shown from python:
void foo(int& a, int& b)
{
std::swap(a, b);
}
turns into
foo(int&, int&):
mov eax, DWORD PTR [rdi]
mov edx, DWORD PTR [rsi]
mov DWORD PTR [rdi], edx
mov DWORD PTR [rsi], eax
ret
https://godbolt.org/g/dRrzg6
Not exactly answer about implementation of python but important background about std::swap.
C++ std::swap will not simple swap two variables using third one. It will use knowledge of internal state to speed it up.
Some example for std::array and std::vector:
https://gcc.godbolt.org/z/MERLGZ
In both cases it will not use third variable to copy whole object but will directly swap internal representation of both types.
In case of array we have branch for case if array have correct alignment or not.
If yes then will swap whole object using two xmm registers, if not it will swap each element separately.
In case of vector we swap internal pointers between two objects.
Another important thing for std::swap is std::move, this is C++11 addition that allow to easy swap two variables that normally can't be copied like std::unique_ptr.
Right now generic version of std::swap look like:
template<typename Tp>
inline void swap(Tp& a, Tp& b)
{
Tp tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
If your type need memory allocation for coping and it support move then no memory allocation will be done there.
Probably most of this things that C++ do, do not apply to how python handle its objects.
You have two buckets and each one has a marble in it. You want to swap the marbles. How are you going to do that?
You take one marble out, put the other one into the empty bucket and put the first one into the now empty bucket. You have two buckets? Wrong, you have a third: your hand.
Translate this to code:
tmp = a;
a = b;
b = a;
You cannot do this, without a third, temporary bucket to hold the old value.
This is the functionality of the swap() function in C++.
From research, I have seen that C++'s swap() function uses a third temporary variable to perform the swap. I haven't been able to find how is a,b = b,a implemented in python.
How is a,b = b,a implemented?
Does python also use a third temporary variable? If it doesn't, how does it work?
How do both operations compare in terms of speed? I'm guessing that if python also uses a third variable, the difference in execution time would be due to python being interpreted.
Edit: All answers are great, but the community seems to think that Sapan's is the best one. Also thanks to a_guest, whom, although didn't post an answer, gave us a great deal of information in the comments. Also: everyone seems to agree that swap() is faster just because its C++. I don't necessarily agree with that. Python can be very fast if run as a frozen binary.
For tuple assignments, Python uses the stack structure directly:
>>> import dis
>>> def abc(a, b):
... a, b = b, a
...
>>> dis.dis(abc)
2 0 LOAD_FAST 1 (b)
3 LOAD_FAST 0 (a)
6 ROT_TWO
7 STORE_FAST 0 (a)
10 STORE_FAST 1 (b)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
In python, assignments in a target list on the left-hand side are done from left to right.
How is a,b = b,a implemented?
First, b, a creates a tuple. You can verify this using e.g.
>>> tmp = 1, 2
>>> tmp
(1, 2)
Then, the assignment uses sequence unpacking, overwriting the names a, b. Hence the code is basically
>>> tmp = (a, b)
>>> b, a = tmp
How do both operations compare in terms of speed?
This would depend on your implementation of python. If you use CPython (the standard version), then C++ would likely be much faster since it is compiled and optimized.
CPython implementation details
In CPython, the swap is sometimes optimized. For small swaps (<4 elements) it uses an optimized swap
>>> def swap(a, b):
>>> a, b = b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 1 (b)
3 LOAD_FAST 0 (a)
6 ROT_TWO
7 STORE_FAST 0 (a)
10 STORE_FAST 1 (b)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
>>> def swap(a, b, c):
>>> a, b, c = c, b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 2 (c)
3 LOAD_FAST 1 (b)
6 LOAD_FAST 0 (a)
9 ROT_THREE
10 ROT_TWO
11 STORE_FAST 0 (a)
14 STORE_FAST 1 (b)
17 STORE_FAST 2 (c)
20 LOAD_CONST 0 (None)
23 RETURN_VALUE
For swaps of 4 or more elements it does exactly what I wrote above, without optimization.
>>> def swap(a, b, c, d):
>>> a, b, c, d = d, c, b, a
>>> dis.dis(swap)
3 0 LOAD_FAST 3 (d)
3 LOAD_FAST 2 (c)
6 LOAD_FAST 1 (b)
9 LOAD_FAST 0 (a)
12 BUILD_TUPLE 4
15 UNPACK_SEQUENCE 4
18 STORE_FAST 0 (a)
21 STORE_FAST 1 (b)
24 STORE_FAST 2 (c)
27 STORE_FAST 3 (d)
30 LOAD_CONST 0 (None)
33 RETURN_VALUE
Adding to Sapan's answer:
In C++ it might conceptionally use a third variable to swap. But you can see here that the compiler can produce the same assembly as the one shown from python:
void foo(int& a, int& b)
{
std::swap(a, b);
}
turns into
foo(int&, int&):
mov eax, DWORD PTR [rdi]
mov edx, DWORD PTR [rsi]
mov DWORD PTR [rdi], edx
mov DWORD PTR [rsi], eax
ret
https://godbolt.org/g/dRrzg6
Not exactly answer about implementation of python but important background about std::swap.
C++ std::swap will not simple swap two variables using third one. It will use knowledge of internal state to speed it up.
Some example for std::array and std::vector:
https://gcc.godbolt.org/z/MERLGZ
In both cases it will not use third variable to copy whole object but will directly swap internal representation of both types.
In case of array we have branch for case if array have correct alignment or not.
If yes then will swap whole object using two xmm registers, if not it will swap each element separately.
In case of vector we swap internal pointers between two objects.
Another important thing for std::swap is std::move, this is C++11 addition that allow to easy swap two variables that normally can't be copied like std::unique_ptr.
Right now generic version of std::swap look like:
template<typename Tp>
inline void swap(Tp& a, Tp& b)
{
Tp tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
If your type need memory allocation for coping and it support move then no memory allocation will be done there.
Probably most of this things that C++ do, do not apply to how python handle its objects.
You have two buckets and each one has a marble in it. You want to swap the marbles. How are you going to do that?
You take one marble out, put the other one into the empty bucket and put the first one into the now empty bucket. You have two buckets? Wrong, you have a third: your hand.
Translate this to code:
tmp = a;
a = b;
b = a;
You cannot do this, without a third, temporary bucket to hold the old value.
Comments
Post a Comment