Customize Pickled Data


Some data cannot be pickled, such as lambda functions. Other data can be pickled but should not be, for example because it goes stale quickly.
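As a minimal sketch of the first case, the snippet below tries to pickle an object that holds a lambda. The class `Plain` and the helper `can_pickle` are hypothetical names used only for illustration. The exact exception type can differ between Python versions, so the helper catches several.

```python
import pickle


class Plain:
    """Hypothetical class with no pickling hooks."""

    def __init__(self):
        # A lambda stored on the instance makes the whole object unpicklable.
        self.func = lambda: 7


def can_pickle(obj):
    """Return True if pickle.dumps succeeds, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(can_pickle({'a': 1}))  # a plain dictionary pickles fine
print(can_pickle(Plain()))   # the lambda attribute prevents pickling
```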

What gets pickled can be defined in the __getstate__ method. This method must return something that is picklable.

Its counterpart is __setstate__: it receives whatever __getstate__ returned and has to initialize the object from it.

import pickle


class A(object):
    def __init__(self, important_data):
        self.important_data = important_data
        # Add data which cannot be pickled:
        self.func = lambda: 7
        # Add data which should never be pickled, because it expires quickly:
        self.is_up_to_date = False

    def __getstate__(self):
        return [self.important_data]  # only this is needed

    def __setstate__(self, state):
        self.important_data = state[0]
        self.func = lambda: 7  # just some hard-coded unpicklable function
        self.is_up_to_date = False  # even if it was True before pickling

Now, this can be done:

>>> a1 = A('very important')
>>> s = pickle.dumps(a1)  # calls a1.__getstate__()
>>> a2 = pickle.loads(s)  # calls a2.__setstate__(['very important'])
>>> a2
<__main__.A object at 0x0000000002742470>
>>> a2.important_data
'very important'
>>> a2.func()
7

The implementation here pickles a list with one value: [self.important_data]. That is just an example; __getstate__ could return anything that is picklable, as long as __setstate__ knows how to reverse it. A good alternative is a dictionary of all values: {'important_data': self.important_data}.
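The dictionary alternative might look like the following sketch. The class name `B` is hypothetical. Restoring the state with `self.__dict__.update(state)` is one possible design choice, not the only one.

```python
import pickle


class B(object):
    """Hypothetical variant of A that pickles its state as a dictionary."""

    def __init__(self, important_data):
        self.important_data = important_data
        self.func = lambda: 7        # cannot be pickled
        self.is_up_to_date = True    # should not be pickled

    def __getstate__(self):
        # Keep only the attributes that are worth persisting.
        return {'important_data': self.important_data}

    def __setstate__(self, state):
        # Restore the saved attributes by name, then rebuild the rest.
        self.__dict__.update(state)
        self.func = lambda: 7
        self.is_up_to_date = False


b2 = pickle.loads(pickle.dumps(B('very important')))
print(b2.important_data)
print(b2.is_up_to_date)
```

Because the keys are attribute names, further picklable attributes can be added to the dictionary in __getstate__ without touching index positions in __setstate__.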

The constructor is not called! Note that in the previous example the instance a2 was created in pickle.loads without ever calling A.__init__, so A.__setstate__ had to initialize everything that __init__ would have initialized had it been called.
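One way to observe this is to count constructor calls. This is a small sketch; the class `Tracked` and its `init_calls` counter are hypothetical and exist only for the demonstration.

```python
import pickle


class Tracked(object):
    """Hypothetical class that counts how often __init__ runs."""

    init_calls = 0

    def __init__(self, value):
        Tracked.init_calls += 1
        self.value = value
        self.cache = {}  # derived attribute, rebuilt rather than pickled

    def __getstate__(self):
        return {'value': self.value}

    def __setstate__(self, state):
        self.value = state['value']
        self.cache = {}  # must be recreated here, since __init__ will not run


t1 = Tracked(42)                      # the only __init__ call
t2 = pickle.loads(pickle.dumps(t1))   # creates t2 without calling __init__
print(Tracked.init_calls)             # prints 1
print(t2.value, t2.cache)
```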