Hashing

Hash Method Generation

Warning

The overarching theme is to never set the @attrs.define(unsafe_hash=X) parameter yourself. Leave it at None, which means that attrs will do the right thing for you, depending on the other parameters:

  • If you want to make objects hashable by value: use @define(frozen=True).

  • If you want hashing and equality by object identity: use @define(eq=False).

Setting unsafe_hash yourself can have unexpected consequences, so we recommend tinkering with it only if you know exactly what you’re doing.

Under certain circumstances, it’s necessary for objects to be hashable, for example if you want to put them into a set or use them as keys in a dict.

The hash of an object is an integer that represents the contents of an object. It can be obtained by calling hash() on an object and is implemented by writing a __hash__ method for your class.

attrs will happily write a __hash__ method for you (see footnote 1); however, it will not do so by default, because according to the definition from the official Python docs, the returned hash has to fulfill certain constraints:

  1. Two objects that are equal must have the same hash. This means that if x == y, it must follow that hash(x) == hash(y).

    By default, Python classes are compared and hashed by their id. That means that every instance of a class has a different hash, no matter what attributes it carries.

    It follows that the moment you (or attrs) change the way equality is handled by implementing an __eq__ that is based on attribute values, this constraint is broken. For that reason, Python 3 makes a class that has customized equality unhashable. Python 2, on the other hand, will happily let you shoot your foot off. Unfortunately, for backward-compatibility reasons, attrs still mimics the behavior of the (otherwise unsupported) Python 2 if you set unsafe_hash=False.

    The correct way to achieve hashing by id is to set @define(eq=False). Setting @define(unsafe_hash=False) (which implies eq=True) is almost certainly a bug.

    Warning

    Be careful when subclassing! Setting eq=False on a class whose base class has a non-default __hash__ method will not make attrs remove that __hash__ for you.

    It is part of attrs’s philosophy to only add to classes so you have the freedom to customize your classes as you wish. So if you want to get rid of methods, you’ll have to do it by hand.

    The easiest way to reset __hash__ on a class is adding __hash__ = object.__hash__ in the class body.

  2. If two objects are not equal, their hash should be different.

    While this isn’t a requirement from the standpoint of correctness, sets and dicts become less efficient if there are a lot of identical hashes. The worst case is when all objects have the same hash, which effectively turns a set into a list.

  3. The hash of an object must not change.

    If you create a class with @define(frozen=True) this is fulfilled by definition, therefore attrs will write a __hash__ function for you automatically. You can also force it to write one with unsafe_hash=True but then it’s your responsibility to make sure that the object is not mutated.

    This point is the reason why mutable structures like lists, dictionaries, or sets aren’t hashable while immutable ones like tuples or frozensets are: points 1 and 2 require that the hash changes with the contents, but point 3 forbids it.

For a more thorough explanation of this topic, please refer to this blog post: Python Hashes and Equality.

Note

Please note that the unsafe_hash argument’s original name was hash but was changed to conform with PEP 681 in 22.2.0. The old argument name is still around and will not be removed – but setting unsafe_hash takes precedence over hash. The field-level argument is still called hash and will remain so.

Hashing and Mutability

Changing any field involved in hash code computation after the first call to __hash__ (typically this would be after its insertion into a hash-based collection) can result in silent bugs. Therefore, it is strongly recommended that hashable classes be frozen. Beware, however, that this is not a complete guarantee of safety: if a field points to an object and that object is mutated, the hash code may change, but frozen will not protect you.

Hash Code Caching

Some objects have hash codes which are expensive to compute. If such objects are to be stored in hash-based collections, it can be useful to compute the hash code only once and then store the result on the object to make future hash code requests fast. To enable caching of hash codes, pass cache_hash=True to @define. This may only be done if attrs is already generating a hash function for the object.


[1] The hash is computed by hashing a tuple that consists of a unique id for the class plus all attribute values.