Python uses internal caching for a range of integers to reduce unnecessary overhead from their repeated creation.
In effect, this can lead to confusing behavior when comparing integer identities:
>>> -8 is (-7 - 1)
False
>>> -3 is (-2 - 1)
True
and, using another example:
>>> (255 + 1) is (255 + 1)
True
>>> (256 + 1) is (256 + 1)
False
Wait what?
We can see that the identity (is) operator yields True for some integers (-3, 256) but not for others (-8, 257).
To be more specific, in CPython the integers in the range [-5, 256] are cached internally during interpreter startup and are created only once. As such, they are identical, and comparing them with the identity (is) operator yields True; integers outside this range are (usually) created on the fly, and comparing their identities yields False. This is an implementation detail, so the exact results can vary between interpreters and versions.
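The caching described above can be observed with a small sketch. It assumes a CPython interpreter, and the helper name same_object is made up for illustration. The integers are parsed from strings at runtime because literals written in one expression may be merged by the compiler, which would hide the effect.

```python
# Sketch: observe the small-integer cache with values built at runtime.
# int("...") is used instead of literals so the compiler cannot fold or
# merge the constants. same_object is a hypothetical helper name.

def same_object(text):
    """Parse `text` twice and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second

for text in ("-6", "-5", "256", "257"):
    # Equality holds for every value; identity depends on the implementation.
    print(text, int(text) == int(text), same_object(text))
```

On a typical CPython build the last column is expected to be True for -5 and 256 and False for -6 and 257. Other implementations may print something different, which is exactly why identity should not be relied on here.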
This is a common pitfall because the values used in tests often fall within this range. The code works perfectly in development, then fails in staging (or worse, in production) for no apparent reason.
The solution is to always compare values using the equality (==) operator and not the identity (is) operator.
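A minimal sketch of that rule follows. The function reached_limit and its parameters are hypothetical names chosen for illustration; the integers are built at runtime to mimic values read from input or a database.

```python
# Hypothetical helper, named for illustration only.
def reached_limit(count, limit):
    """Return True when count has the same value as limit."""
    return count == limit  # value comparison, independent of object identity

# Values built at runtime, as they would be when read from external input.
small = int("100")
large = int("100000")

print(reached_limit(small, int("100")))      # True
print(reached_limit(large, int("100000")))   # True

# Identity is still the usual choice for singletons such as None.
result = None
print(result is None)                        # True
```

Because == compares values, the helper behaves the same for small and large integers, whether or not the interpreter happens to cache them.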
Python also keeps references to certain strings (a technique known as interning), which can result in similarly confusing behavior when comparing the identities of strings (i.e. using is).
>>> 'python' is 'py' + 'thon'
True
The string 'python' qualifies for this caching (CPython interns identifier-like string constants), so every reference to the string 'python' uses the same object.
For other strings, comparing identities can yield False even when the strings are equal.
>>> 'this is not a common string' is 'this is not' + ' a common string'
False
>>> 'this is not a common string' == 'this is not' + ' a common string'
True
So, just like the rule for integers, always compare string values using the equality (==) operator and not the identity (is) operator.
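The string behavior can be sketched in the same way. This example assumes CPython and builds the strings at runtime with str.join, so the compiler cannot fold them into a single constant. It also uses sys.intern from the standard library to show explicitly what sharing one object looks like; that call is an addition for illustration and is not needed for ordinary comparisons.

```python
import sys

# Build the strings at runtime so they are not merged at compile time.
parts = ["this is not", " a common string"]
first = "".join(parts)
second = "".join(parts)

print(first == second)   # True: the values are equal
print(first is second)   # implementation-dependent; typically False on CPython

# sys.intern returns one canonical object per string value.
print(sys.intern(first) is sys.intern(second))  # True
```

Only the == comparison and the explicitly interned comparison are reliable; the plain identity check depends on how the interpreter happened to create the objects.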