Why not…

If you’d like a third party’s account of why attrs is great, have a look at Glyph’s The One Python Library Everyone Needs. It predates type annotations and hence Data Classes, but it masterfully illustrates the appeal of class-building packages.

… Data Classes?

PEP 557 added Data Classes to Python 3.7, and they resemble attrs in many ways.

They are the result of the Python community’s wish to have an easier way to write classes in the standard library that doesn’t carry the problems of namedtuples. To that end, attrs and its developers were involved in the PEP process, and while we may disagree with some minor decisions that have been made, it’s a fine library, and if it stops you from abusing namedtuples, it is a huge win.

Nevertheless, there are still reasons to prefer attrs over Data Classes. Whether they’re relevant to you depends on your circumstances:

  • Data Classes are intentionally less powerful than attrs. There is a long list of features that were sacrificed for the sake of simplicity. The most obvious ones are validators, converters, equality customization, and extensibility in general, but the difference permeates all APIs.

    On the other hand, Data Classes currently do not offer any significant feature that attrs doesn’t already have.

  • attrs supports all mainstream Python versions including PyPy.

  • attrs doesn’t force type annotations on you if you don’t like them.

  • But since it also supports typing, it’s the best way to embrace type hints gradually, too.

  • While Data Classes are implementing features from attrs every now and then, their presence is dependent on the Python version, not the package version. For example, support for __slots__ has only been added in Python 3.10, but it doesn’t do cell rewriting and therefore doesn’t support bare calls to super(). This may or may not be fixed in later Python releases, but handling all these differences is especially painful for PyPI packages that support multiple Python versions. And of course, this includes possible implementation bugs.

  • attrs can and will move faster. We are not bound to any release schedules and we have a clear deprecation policy.

    One of the reasons to not vendor attrs in the standard library was to not impede attrs’s future development.

One way to think about attrs vs Data Classes is that attrs is a fully-fledged toolkit to write powerful classes while Data Classes are an easy way to get a class with some attributes. Basically what attrs was in 2015.

… pydantic?

pydantic is first and foremost a data validation library. As such, it is a capable complement to class building libraries like attrs (or Data Classes!) for parsing and validating untrusted data.

However, as convenient as it might be, using it for your business or data layer is problematic in several ways: Is it really necessary to re-validate all your objects while reading them from a trusted database? In the parlance of Form, Command, and Model Validation, pydantic is the right tool for Commands.

Separation of concerns feels tedious at times, but it’s one of those things that you get to appreciate once you’ve shot your own foot often enough.

… namedtuples?

collections.namedtuples are tuples with names, not classes.1 Since writing classes is tiresome in Python, every now and then someone discovers all the typing they could save and gets really excited. However, that convenience comes at a price.

The most obvious difference between namedtuples and attrs-based classes is that the latter are type-sensitive:

>>> import attrs
>>> C1 = attrs.make_class("C1", ["a"])
>>> C2 = attrs.make_class("C2", ["a"])
>>> i1 = C1(1)
>>> i2 = C2(1)
>>> i1.a == i2.a
True
>>> i1 == i2
False

…while a namedtuple intentionally behaves like a tuple, which means that its type is ignored in comparisons:

>>> from collections import namedtuple
>>> NT1 = namedtuple("NT1", "a")
>>> NT2 = namedtuple("NT2", "b")
>>> t1 = NT1(1)
>>> t2 = NT2(1)
>>> t1 == t2 == (1,)
True

Other often surprising behaviors include:

  • Since they are subclasses of tuple, namedtuples have a length and are both iterable and indexable. That’s not what you’d expect from a class and is likely to mask subtle typo bugs.

  • Iterability also implies that it’s easy to accidentally unpack a namedtuple which leads to hard-to-find bugs.2

  • namedtuples have their methods on your instances whether you like it or not.3

  • namedtuples are always immutable. Not only does that mean that you can’t decide for yourself whether your instances should be immutable or not, it also means that if you want to influence your class’ initialization (validation? default values?), you have to implement __new__() which is a particularly hacky and error-prone requirement for a very common problem.4

  • To attach methods to a namedtuple you have to subclass it. And if you follow the standard library documentation’s recommendation of:

    class Point(namedtuple('Point', ['x', 'y'])):
        # ...
    

    you end up with a class that has two Points in its __mro__: [<class 'point.Point'>, <class 'point.Point'>, <class 'tuple'>, <class 'object'>].

    That’s not only confusing, it also has very practical consequences: for example, if you generate documentation that includes class hierarchies, like Sphinx’s autodoc with show-inheritance. Again: common problem, hacky solution with confusing fallout.

All these things make namedtuples a particularly poor choice for public APIs because all your objects are irrevocably tainted. With attrs your users won’t notice a difference because it creates regular, well-behaved classes.
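
The behaviors listed above can be reproduced with the standard library alone. The following is a small sketch; Point and PositivePoint are hypothetical names chosen for illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# Tuple behavior leaks through: length, indexing, and silent unpacking.
assert len(p) == 2
assert p[1] == 2
x, y = p

# Helper methods are part of every instance's public interface.
assert p._asdict() == {"x": 1, "y": 2}
assert p._replace(x=5) == Point(5, 2)


# Customizing initialization means overriding __new__, not __init__.
class PositivePoint(namedtuple("PositivePoint", ["x", "y"])):
    __slots__ = ()

    def __new__(cls, x, y=0):
        if x < 0:
            raise ValueError("x must not be negative")
        return super().__new__(cls, x, y)


assert PositivePoint(3) == (3, 0)

# The subclassing idiom puts two classes with the same name into the MRO.
mro_names = [cls.__name__ for cls in PositivePoint.__mro__]
print(mro_names)  # ['PositivePoint', 'PositivePoint', 'tuple', 'object']
```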

Summary

If you want a tuple with names, by all means: go for a namedtuple.5 But if you want a class with methods, you’re doing yourself a disservice by relying on a pile of hacks that requires you to employ even more hacks as your requirements expand.

Other than that, attrs also adds nifty features like validators, converters, and (mutable!) default values.

… tuples?

Readability

What makes more sense while debugging:

Point(x=1, y=2)

or:

(1, 2)

?

Let’s add even more ambiguity:

Customer(id=42, reseller=23, first_name="Jane", last_name="John")

or:

(42, 23, "Jane", "John")

?

Why would you want to write customer[2] instead of customer.first_name?

Don’t get me started on what happens when you add nesting. If you’ve never run into mysterious tuples while debugging and had no idea what the hell they meant, you’re much smarter than yours truly.

Using proper classes with names and types makes program code much more readable and comprehensible. Especially when trying to grok a new piece of software or returning to old code after several months.

Extendability

Imagine you have a function that takes or returns a tuple. Especially if you use tuple unpacking (e.g. x, y = get_point()), adding more data means that you have to change every invocation of that function.

Adding an attribute to a class concerns only those who actually care about that attribute.

… dicts?

Dictionaries are not for fixed fields.

If you have a dict, it maps something to something else. You should be able to add and remove values.

attrs lets you be specific about those expectations; a dictionary does not. It gives you a named entity (the class) in your code, which lets you explain in other places whether you take a parameter of that class or return a value of that class.

In other words: if your dict has a fixed and known set of keys, it is an object, not a hash. So if you never iterate over the keys of a dict, you should use a proper class.

… hand-written classes?

While we’re fans of all things artisanal, writing the same nine methods again and again doesn’t qualify. I usually manage to sneak in some typos, and there’s simply more code that can break and thus has to be tested.

To bring it into perspective, the equivalent of

>>> @attrs.define
... class SmartClass:
...    a = attrs.field()
...    b = attrs.field()
>>> SmartClass(1, 2)
SmartClass(a=1, b=2)

is roughly the following (with attrs.define, the ordering methods and __hash__ are opt-in):

>>> class ArtisanalClass:
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
...
...     def __repr__(self):
...         return f"ArtisanalClass(a={self.a!r}, b={self.b!r})"
...
...     def __eq__(self, other):
...         if other.__class__ is self.__class__:
...             return (self.a, self.b) == (other.a, other.b)
...         else:
...             return NotImplemented
...
...     def __ne__(self, other):
...         result = self.__eq__(other)
...         if result is NotImplemented:
...             return NotImplemented
...         else:
...             return not result
...
...     def __lt__(self, other):
...         if other.__class__ is self.__class__:
...             return (self.a, self.b) < (other.a, other.b)
...         else:
...             return NotImplemented
...
...     def __le__(self, other):
...         if other.__class__ is self.__class__:
...             return (self.a, self.b) <= (other.a, other.b)
...         else:
...             return NotImplemented
...
...     def __gt__(self, other):
...         if other.__class__ is self.__class__:
...             return (self.a, self.b) > (other.a, other.b)
...         else:
...             return NotImplemented
...
...     def __ge__(self, other):
...         if other.__class__ is self.__class__:
...             return (self.a, self.b) >= (other.a, other.b)
...         else:
...             return NotImplemented
...
...     def __hash__(self):
...         return hash((self.__class__, self.a, self.b))
>>> ArtisanalClass(a=1, b=2)
ArtisanalClass(a=1, b=2)

which is quite a mouthful and it doesn’t even use any of attrs’s more advanced features like validators or default values. Also: no tests whatsoever. And who will guarantee that you don’t accidentally flip the < in your tenth implementation of __gt__?

It also should be noted that attrs is not an all-or-nothing solution. You can freely choose which features you want and disable those that you want more control over:

>>> @attrs.define
... class SmartClass:
...    a: int
...    b: int
...
...    def __repr__(self):
...        return "<SmartClass(a=%d)>" % (self.a,)
>>> SmartClass(1, 2)
<SmartClass(a=1)>

Summary

If you don’t care and like typing, we’re not gonna stop you.

However, it takes a lot of bias and determined rationalization to claim that attrs raises the mental burden on a project, given how difficult it is to find the important bits in a hand-written class and how annoying it is to ensure you’ve copy-pasted your code correctly across all your classes.

In any case, if you ever get sick of the repetitiveness and drowning important code in a sea of boilerplate, attrs will be waiting for you.


1

The word is that namedtuples were added to the Python standard library as a way to make tuples in return values more readable. And indeed that is something you see throughout the standard library.

Looking at what the makers of namedtuples use it for themselves is a good guideline for deciding on your own use cases.

2

attrs.astuple() can be used to get that behavior in attrs on explicit demand.

3

attrs only adds a single attribute: __attrs_attrs__ for introspection. All helpers are functions in the attrs package. Since they take the instance as their first argument, you can easily attach them to your classes under a name of your own choice.

4

attrs offers optional immutability through the frozen keyword.

5

Although attrs would serve you just as well! Since both employ the same method of writing and compiling Python code for you, the performance penalty is negligible at worst and in some cases attrs is even faster if you use slots=True (which is generally a good idea anyway).