When using mutables in Python you have to be careful:
>>> a = {'value': 1}
>>> b = a
>>> a['value'] = 2
>>> b
{'value': 2}
So, you use the copy module from the standard library:
>>> import copy
>>> a = {'value': 1}
>>> b = copy.copy(a)
>>> a['value'] = 2
>>> b
{'value': 1}
That's nice, but it's limited: copy.copy only makes a shallow copy, so nested mutables are still shared, as you can see here:
>>> a = {'value': {'name': 'Something'}}
>>> b = copy.copy(a)
>>> a['value']['name'] = 'else'
>>> b
{'value': {'name': 'else'}}
That's when you need the copy.deepcopy function:
>>> a = {'value': {'name': 'Something'}}
>>> b = copy.deepcopy(a)
>>> a['value']['name'] = 'else'
>>> b
{'value': {'name': 'Something'}}
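To see what is going on under the hood, an identity check helps. This is just a small sketch with made-up variable names (original, shallow, deep): the shallow copy reuses the very same inner dict, while the deep copy gets its own.

```python
import copy

original = {'value': {'name': 'Something'}}
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# The outer dicts are distinct objects in both cases.
outer_is_new = shallow is not original and deep is not original

# The shallow copy still points at the same inner dict...
shares_inner = shallow['value'] is original['value']

# ...while the deep copy has an inner dict of its own.
independent_inner = deep['value'] is not original['value']

print(outer_is_new, shares_inner, independent_inner)
```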
Now, suppose we have a custom class that subclasses the dict type. That's a very common thing to do. Let's demonstrate:
>>> class ORM(dict):
...     pass
...
>>> a = ORM(name='Value')
>>> b = copy.copy(a)
>>> a['name'] = 'Other'
>>> b
{'name': 'Value'}
And again, if you have a nested mutable object you need copy.deepcopy:
>>> class ORM(dict):
...     pass
...
>>> a = ORM(data={'name': 'Something'})
>>> b = copy.deepcopy(a)
>>> a['data']['name'] = 'else'
>>> b
{'data': {'name': 'Something'}}
But oftentimes you'll want to make your dict subclass behave like a regular class so you can access data with dot notation. Like this:
>>> class ORM(dict):
...     def __getattr__(self, key):
...         return self[key]
...
>>> a = ORM(data={'name': 'Something'})
>>> a.data['name']
'Something'
Now here's a problem. If you do that, you lose the ability to use copy.deepcopy, since the class has now been slightly "abused":
>>> a = ORM(data={'name': 'Something'})
>>> a.data['name']
'Something'
>>> b = copy.deepcopy(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.2/lib/python2.7/copy.py", line 172, in deepcopy
    copier = getattr(x, "__deepcopy__", None)
  File "<stdin>", line 3, in __getattr__
KeyError: '__deepcopy__'
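Why a KeyError? The traceback shows copy.deepcopy calling getattr(x, "__deepcopy__", None). The default passed to getattr only kicks in when the lookup raises AttributeError, and our __getattr__ raises KeyError instead, so the error escapes. Here is a small sketch that reproduces this without the copy module at all (the class name KeyErrorAttrs is made up for illustration):

```python
class KeyErrorAttrs(dict):
    # Same trick as the ORM class: attribute access falls through to keys.
    def __getattr__(self, key):
        return self[key]  # raises KeyError for a missing key

obj = KeyErrorAttrs()

try:
    # getattr() with a default only swallows AttributeError.
    getattr(obj, "__deepcopy__", None)
    outcome = "default returned"
except KeyError:
    outcome = "KeyError escaped"

print(outcome)
```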
Hmm... now you're in trouble, and to get yourself out of it you have to define a __deepcopy__ method as well. Let's just do it:
>>> class ORM(dict):
...     def __getattr__(self, key):
...         return self[key]
...     def __deepcopy__(self, memo):
...         return ORM(copy.deepcopy(dict(self)))
...
>>> a = ORM(data={'name': 'Something'})
>>> a.data['name']
'Something'
>>> b = copy.deepcopy(a)
>>> a.data['name'] = 'else'
>>> b
{'data': {'name': 'Something'}}
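One caveat with the version above: it ignores the memo argument and always returns a plain ORM, even for subclasses. A slightly more careful sketch, assuming you want the subclass type preserved and self-referencing data handled, registers the new object in memo before copying the values. Treat the details as one possible way to write it, not the definitive one:

```python
import copy

class ORM(dict):
    def __getattr__(self, key):
        return self[key]

    def __deepcopy__(self, memo):
        # Build an empty instance of the same (sub)class.
        result = type(self)()
        # Register it first so references back to self resolve to the copy.
        memo[id(self)] = result
        for key, value in self.items():
            result[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
        return result

a = ORM(data={'name': 'Something'})
a['me'] = a  # a refers to itself
b = copy.deepcopy(a)
a.data['name'] = 'else'
print(b.data['name'], b['me'] is b)
```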
Yeah!!! Now we get what we want. Messing around with __getattr__ like this is, as far as I know, the only time you have to go in and write your own __deepcopy__ method.
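For what it's worth, there seems to be another way out that doesn't need __deepcopy__ at all: make __getattr__ raise AttributeError for missing keys, which is what getattr expects. A minimal sketch of that idea (AttrDict is a hypothetical name, and this is an alternative to the approach above, not the same technique):

```python
import copy

class AttrDict(dict):
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            # Translate to the exception attribute lookups are expected to raise.
            raise AttributeError(key)

a = AttrDict(data={'name': 'Something'})
b = copy.deepcopy(a)  # no custom __deepcopy__ needed
a.data['name'] = 'else'
print(b.data['name'], type(b).__name__)
```

With this, getattr(x, "__deepcopy__", None) quietly gets its None back and copy.deepcopy falls through to its normal machinery.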
I'm sure hardcore Python language experts can point out lots of intricacies about __deepcopy__, but since I only learned about this today, having it here might help someone else too.