While studying Python strings, we tried to understand Python String Slicing through an example. We got a very strange [ at least at first look ] result when we used String Slicing to create a copy of a string.
To better understand the scenario and the result, let's start with the basics of how integers and lists behave under assignment and slicing.
>>> m = 10 # Integer m initialised to 10
>>> n = m # Another integer n storing the same value as in m
>>> print m, n
10 10
>>> id(m), id(n)
( 140204825415168 , 140204825415168 ) # <-- Both are of the same id
In the above scenario, when we initialised variable m, Python associated the tag name m with an integer object at memory location [ 140204825415168 ]. ( For this discussion, assume that id represents the memory location; in CPython, id does return the object's address. )
When we created a new variable n and assigned m to it, Python simply attached a second tag name, n, to the same object, which is why both names report the same id. Assignment never copies the value; it only adds another name for an existing object, which avoids [ or delays ] unnecessary memory allocations.
>>> m += 5 # Increment m by 5
>>> print m, n # m is modified, but n retains its original value
15 10
>>> id(m), id(n)
( 140204825415048, 140204825415168 ) # <-- The id or [address location] of m has changed
When we modified m, Python did not change the integer 10 in place, because integers are immutable. Instead, the addition produced an integer object holding 15, and the name m was re-tagged to it, so m and n now point to different memory locations [ illustrated by the different id values ]. This is why two different values are printed from m and n.
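The same experiment can be checked without reading id values by hand, using the is operator, which tests whether two names refer to the same object. This is only a small sketch, written with Python 3's print() function rather than the print statement used in the sessions above.

```python
m = 10
n = m                   # n is bound to the same int object as m
same_before = n is m    # True: one object, two names

m += 5                  # ints are immutable, so m is rebound to another int
same_after = n is m     # False: m and n now name different objects

print(m, n)                      # 15 10
print(same_before, same_after)   # True False
```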
Now, let's do something similar with list variables.
>>> a = [ 1, 2, 3 ] # List a initialised to [ 1, 2, 3 ]
>>> b = a # Another list b storing the same value as in a
>>> print a, b
[1, 2, 3] [1, 2, 3]
>>> id(a), id(b)
( 4342157256 , 4342157256 ) # <-- Both are of the same id
>>> a.append(5) # Append 5 to a
>>> print a, b # Both a and b are modified !!!
[1, 2, 3, 5] [1, 2, 3, 5]
>>> id(a), id(b)
( 4342157256 , 4342157256 ) # <-- Both are of the same id !!!
We see a strange behaviour when we modify a list variable now. The changes done in list variable a are now visible from list variable b as well !!!
The reason for this behaviour is the mutability property of lists. In Python, lists are mutable, i.e., modifiable in place. Python here is using the variable names a and b as aliases [ like references in C++ ] for the same list object in memory, and append changes that object itself instead of creating a new one. This is the standard, expected behaviour in Python for lists. Developers need to be aware of this and write their code accordingly.
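To make the difference between modifying an object and re-tagging a name concrete, here is a small sketch [ again using Python 3's print() function ]. An in-place operation such as append keeps the alias intact, while a = a + [6] builds a new list and moves only the name a to it.

```python
a = [1, 2, 3]
b = a                 # alias: a and b name the same list object

a.append(5)           # in-place mutation, visible through both names
print(b)              # [1, 2, 3, 5]
print(b is a)         # True

a = a + [6]           # builds a new list and rebinds a; b is left behind
print(a)              # [1, 2, 3, 5, 6]
print(b)              # [1, 2, 3, 5]
print(b is a)         # False
```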
If we want the changes made through one list [ a ] not to be visible through another list [ c ] that contains the same data, then we need to create a copy or replica of the original list a. The slice a[:] does exactly that.
>>> a = [ 1, 2, 3 ] # List a initialised to [ 1, 2, 3 ]
>>> b = a # Another list b storing the same value as in a
>>> c = a[:] # List c is a replica / copy of list a
>>> print a, b, c
[1, 2, 3] [1, 2, 3] [1, 2, 3]
>>> id(a), id(b), id(c)
( 4342157256 , 4342157256, 4342380304 ) # <-- Id of c is different from that of a and b
>>> a.append(5) # Append 5 to a
>>> print a, b, c # Only a and b are modified !!!
[1, 2, 3, 5] [1, 2, 3, 5] [1, 2, 3]
>>> id(a), id(b), id(c)
( 4342157256 , 4342157256, 4342380304 ) # <-- Id of c is different from that of a and b
>>> b.append(6) # Append 6 to b
>>> print a, b, c # Only a and b are modified !!!
[1, 2, 3, 5, 6] [1, 2, 3, 5, 6] [1, 2, 3]
>>> id(a), id(b), id(c)
( 4342157256 , 4342157256, 4342380304 ) # <-- Id of c is different from that of a and b
This is one of the major reasons why, when a function gets a list as an input parameter, the best thing to do inside the function is to create a copy of the input list before making any modifications to it. This way, the caller of the function is guaranteed that the input list will not be modified by the function.
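The advice about copying a list argument inside a function could look roughly like this. The function name with_total and its behaviour are made up for illustration, and the sketch uses Python 3's print() function.

```python
def with_total(values):
    """Return a new list: the input values followed by their sum.

    Hypothetical helper for illustration; the caller's list is left untouched.
    """
    result = values[:]           # shallow copy, the same idiom as c = a[:]
    result.append(sum(result))   # mutate only the local copy
    return result


data = [1, 2, 3]
extended = with_total(data)

print(data)       # [1, 2, 3]
print(extended)   # [1, 2, 3, 6]
```

Note that values[:] is a shallow copy: the new list is independent, but if its elements are themselves mutable [ for example, nested lists ], those elements are still shared with the caller. For such cases copy.deepcopy from the standard library is the usual tool.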
Slicing is possible on strings as well. Let's perform the same operations as were done for list slicing on a string now.
>>> x = "Hello" # String x initialised to "Hello"
>>> y = x # Another string y storing the same value as in x
>>> z = x[:] # String z is a replica / copy of string x
>>> print x, y, z
Hello Hello Hello
>>> id(x), id(y), id(z)
( 4342372992 , 4342372992, 4342372992 ) # <-- All three ids are the same !!!
Now, let's append to a string as we did with lists and see the behaviour.
>>> x += "World" # Append "World" to x
>>> print x, y, z # Only x is modified !!!
HelloWorld Hello Hello
>>> id(x), id(y), id(z)
( 4342372896 , 4342372992, 4342372992 ) # <-- Only id of x is changed !!!
String x has been reallocated to new memory !!! [ or at least it has a new id in the above scenario ]. When we append a string like "World" to the existing string in x, Python internally creates a new string object for the result, because strings are immutable and cannot be changed in place. It is this new memory location that is tagged with the variable name x. The earlier memory location [ containing "Hello" ] is no longer accessible through x, although y and z still refer to it.
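The immutability of strings can be observed directly. In this small sketch [ Python 3 syntax ], an attempt to change a character in place fails with a TypeError, while += succeeds only because it builds a new string and re-tags x.

```python
x = "Hello"
y = x                    # y and x name the same str object

raised = False
try:
    x[0] = "J"           # strings do not support item assignment
except TypeError:
    raised = True

x += "World"             # builds a new string and rebinds x; y is untouched

print(raised)            # True
print(x, y)              # HelloWorld Hello
print(x is y)            # False
```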
We would see the same behaviour even if we used a method like replace on x or y as shown below :
>>> y = y.replace("lo", "ipad") # Replace "lo" with "ipad"
>>> print x, y, z # Only y is modified !!!
HelloWorld Helipad Hello
>>> id(x), id(y), id(z)
( 4342372896 , 4342373136, 4342372992 ) # <-- Only id of y is changed !!!
Moral of the story : For strings, Python treats the copy syntax x[:] as just another alias for the object that x refers to, instead of allocating a new string. Since strings are immutable, sharing the object is safe, and it helps in managing the memory more efficiently. [ This is how CPython behaves; the language itself does not promise it. ]
When we say that the string slice x[:] is a copy of string x, we are using the word "copy" in its everyday English sense; it does not mean that Python will allocate new memory and copy the contents of x into it.
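A short check of this point, in Python 3 syntax. The equality test holds in any Python implementation; the identity result for the full slice is what CPython does and should be treated as an implementation detail, not something to rely on.

```python
x = "Hello"
z = x[:]            # full slice of a str

print(z == x)       # True: equal contents
print(z is x)       # True on CPython: the very same object is handed back

part = x[1:]        # a partial slice has different contents
print(part)         # ello
print(part is x)    # False
```

In practice this means string "copies" should be compared with ==, and code should never depend on whether two equal strings share an id.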