Tuesday, May 6, 2008

Overriding __getslice__, __setslice__, __delslice__

Slice accessors __getslice__, __setslice__ and __delslice__ have been deprecated since Python 2.0. But you'd never know it.

This becomes a problem when you decide to override any of these three functions in a custom class.

Slice accessors still show up on lists.

>>> l = range(10, 20)
>>> '__setslice__' in dir(l)
True

And help provides no deprecation warning.

>>> help(l.__setslice__)
Help on method-wrapper object:

__setslice__ = class method-wrapper(object)
| Methods defined here:
|
| __call__(...)
| x.__call__(...) <==> x(...)
|
| __cmp__(...)
| x.__cmp__(y) <==> cmp(x,y)
|
| __getattribute__(...)
| x.__getattribute__('name') <==> x.name
|
| __hash__(...)
| x.__hash__() <==> hash(x)
|
| __repr__(...)
| x.__repr__() <==> repr(x)
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __objclass__
|
| __self__

So you might think that slice overrides work just like every other special method override. But no.

A naive override redefines both __setitem__ and __setslice__.


class Foo(object):

    def __init__(self, *args):
        self._contents = list(args)

    def __setitem__(self, i, x):
        self._contents[i] = x

    def __setslice__(self, i, j, x):
        self._contents[i : j] = x

Which appears to work.

>>> f = Foo(*range(10, 20))
>>> f._contents
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

>>> f[3] = 130
>>> f._contents
[10, 11, 12, 130, 14, 15, 16, 17, 18, 19]

>>> f[4 : 8] = [140, 150, 160, 170]
>>> f._contents
[10, 11, 12, 130, 140, 150, 160, 170, 18, 19]

But when we add print statements ...

    def __setitem__(self, i, x):
        print 'Now in __setitem__.'
        self._contents[i] = x

    def __setslice__(self, i, j, x):
        print 'Now in __setslice__.'
        self._contents[i : j] = x

... we see otherwise.

>>> f[3] = 130
Now in __setitem__.
>>> f._contents
[10, 11, 12, 130, 14, 15, 16, 17, 18, 19]

>>> f[4 : 8] = [140, 150, 160, 170]
Now in __setitem__.
>>> f._contents
[10, 11, 12, 130, 140, 150, 160, 170, 18, 19]

Here the interpreter does not call __setslice__. This might lead us to the sensible conclusion that -- because the method is deprecated -- the interpreter never calls __setslice__ at all. But again no.

When our class additionally redefines __getslice__ ...

class Foo(object):

    def __init__(self, *args):
        self._contents = list(args)

    def __getslice__(self, i, j, stride = None):
        print 'Now in __getslice__.'
        return self._contents[i : j : stride]

    def __setitem__(self, i, x):
        print 'Now in __setitem__.'
        self._contents[i] = x

    def __setslice__(self, i, j, x):
        print 'Now in __setslice__.'
        self._contents[i : j] = x

... we again find otherwise.

>>> f[4 : 8] = [140, 150, 160, 170]
Now in __setslice__.
>>> f._contents
[10, 11, 12, 13, 140, 150, 160, 170, 18, 19]

The interpreter here does call __setslice__.

So what's going on? Does the interpreter call __setslice__ or not? Is __setslice__ really deprecated or not?

To capture what's actually going on here we have to talk about dependencies between the slice getters and setters on one hand and the item getters and setters on the other, and say something like "interpretation of x[i : j] passes any output of x.__getslice__ to x.__setslice__ but -- on failure -- resorts to x.__setitem__ only."

This is a mess and probably justifies deprecation in itself. But as of Python 2.5 we have enough rope to hang ourselves -- the interpreter will call overridden versions of the special slice handlers ... but only sometimes.

The solution is to redefine the three item accessors __getitem__, __setitem__, __delitem__ only and to ignore the corresponding slice handlers entirely.

class Foo(object):

    def __init__(self, *args):
        self._contents = list(args)

    def __getitem__(self, i):
        print 'Now in __getitem__.'
        return self._contents[i]

    def __setitem__(self, i, x):
        print 'Now in __setitem__.'
        self._contents[i] = x

    def __delitem__(self, i):
        print 'Now in __delitem__.'
        del self._contents[i]

Item gets, sets and deletes work as expected.

>>> f._contents
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> f[4]
Now in __getitem__.
14

>>> f[4] = 140
Now in __setitem__.
>>> f._contents
[10, 11, 12, 13, 140, 15, 16, 17, 18, 19]

>>> del(f[4])
Now in __delitem__.
>>> f._contents
[10, 11, 12, 13, 15, 16, 17, 18, 19]

Slice gets, sets and deletes work now too.

>>> f._contents
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> f[4 : 8]
Now in __getitem__.
[14, 15, 16, 17]

>>> f[4 : 8] = [140, 150, 160, 170]
Now in __setitem__.
>>> f._contents
[10, 11, 12, 13, 140, 150, 160, 170, 18, 19]

>>> del(f[4 : 8])
Now in __delitem__.
>>> f._contents
[10, 11, 12, 13, 18, 19]

That this should be so -- that we get slice management for free when we override the item management functions -- is something of a surprise. Stuff like f[4 : 8] is certainly slice notation. So why does the interpreter call the corresponding item handler?

The answer is that f[i], f[i : j], f['bar'] all call __getitem__ and that, in the special case of f[i : j], the interpreter passes a built-in slice instance to __getitem__.

An additional print ...

    def __getitem__(self, i):
        print 'Now in __getitem__.'
        print 'Input is %s.' % i
        return self._contents[i]

... makes the point clearer.

>>> f[4 : 8]
Now in __getitem__.
Input is slice(4, 8, None).
[14, 15, 16, 17]
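Because the slice arrives as an ordinary object, __getitem__ can also inspect it rather than just pass it along. What follows is only a rough sketch, written in Python 3 syntax (an assumption; the sessions above are Python 2), and the class name SliceAwareFoo is made up for illustration. It tells a slice from a plain index with isinstance and uses the built-in slice.indices method to turn the slice into concrete start, stop and step values.

```python
class SliceAwareFoo(object):
    """Hypothetical variant of Foo that handles slice objects by hand."""

    def __init__(self, *args):
        self._contents = list(args)

    def __getitem__(self, key):
        # Slice notation such as f[4 : 8] arrives as a built-in slice instance.
        if isinstance(key, slice):
            # indices() resolves None and negative values against the length.
            start, stop, step = key.indices(len(self._contents))
            return [self._contents[k] for k in range(start, stop, step)]
        # Anything else is treated as a plain index.
        return self._contents[key]


f = SliceAwareFoo(*range(10, 20))
print(f[4])      # single item
print(f[4:8])    # simple slice
print(f[1:8:3])  # extended slice with a stride
```

For a class that merely wraps a list, handing the slice straight to the list (as Foo does) is simpler. Unpacking it by hand is mostly useful when the underlying storage does not understand slices itself.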

The conclusion to all this is to always override the three item accessors __getitem__, __setitem__, __delitem__ and to never override the three (deprecated) slice accessors __getslice__, __setslice__, __delslice__. Why is it easy to get tripped up? Because the help pages don't exactly scream about slice management deprecation. And also because it's possible to write and successfully use f[i : j] for years without realizing that this is now usually interpreted as a standard call to __getitem__ with a built-in slice instance passed in.
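For what it's worth, the same advice carries over to Python 3, where the three slice accessors were removed outright. The sketch below assumes Python 3 and uses a made-up class name, Recorder. It logs which method receives which key instead of printing, which makes it easy to see that slice assignment and slice deletion, strides included, all land in the item accessors.

```python
class Recorder(object):
    """Hypothetical container that records every item-accessor call."""

    def __init__(self, *args):
        self._contents = list(args)
        self.calls = []  # (method name, key) pairs, kept for inspection

    def __getitem__(self, key):
        self.calls.append(('__getitem__', key))
        return self._contents[key]

    def __setitem__(self, key, value):
        self.calls.append(('__setitem__', key))
        self._contents[key] = value

    def __delitem__(self, key):
        self.calls.append(('__delitem__', key))
        del self._contents[key]


r = Recorder(*range(10, 20))
r[4:8] = [140, 150, 160, 170]  # slice assignment goes to __setitem__
del r[::2]                     # strided slice deletion goes to __delitem__
print(r.calls)
print(r._contents)

# Under Python 3 the old slice accessors no longer exist on lists.
print(hasattr(list, '__setslice__'))
```

Nothing here needs a slice-specific method: the list stored in _contents does the real work once it is handed the slice object.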

A good corresponding discussion of just this point (minus the explanations of why the deprecation warning goes missing) appears on pages 165-66 of O'Reilly's Python Cookbook.
