This article is part number 10 of the Readability series.


You know that passing state around different functions by using global variables is bad: it results in spaghetti code, it introduces side-effects to your functions and, well, is just bad practice. Then: don’t make the same mistake when using classes.

The form this manifests in code is by having a particular class method (not necessarily the constructor!) initializing a member field and later having other unrelated methods querying the attribute “out of the blue”.

Consider the following code in which we have the definition of a class to represent a real-world object and a caller test method:

class Table(object):

    def __init__(self, database, label):
        self._database = db
        self._label = label

        self.depth = None
        self.height = None
        self.width = None

    def query_dimensions(self):
        self.depth = self._database.get(self._label, 'depth')
        self.height = self._database.get(self._label, 'height')
        self.width = self._database.get(self._label, 'width')

    def volume(self):
        self.query_dimensions()
        return self.depth * self.height * self.width

def test():
    table = Table('foo', Database.connect())
    table.query_dimensions()
    print table.depth, table.height, table.width
    print table.volume()

Yes, I’ve seen something similar in production. No, this is not good. There are many problems with this design, but let’s focus on the one that is the point of this article.

The specific problem I want to highlight is in the query_dimensions() method: such method is abusing the local attributes as input parameters and is also storing its various return values into attributes. The fact that this method is public is even worse, as illustrated by the need to call it before being able to access public attributes.

Conceptually, a function to retrieve values from a database should be receiving the database and the object identifier as parameters, and return the results of the query. We want the function to not use global state and instead look like this:

@staticmethod
def query_dimensions(database, label):
    depth = database.get(label, 'depth')
    height = database.get(label, 'height')
    width = database.get(label, 'width')
    return depth, height, width

Note that this function has no hidden dependencies: all data read or written is accessed via input parameters or returned via return values.

Now, plugging this into our previous class allows us to “fix” the volume() method to not rely on updating the local variables in a magic manner. However, having changed the function this way exposes that the various public attributes cannot now be accessed from outside. We need to rewrite the code quite heavily:

class Table(object):

    def __init__(self, database, label):
        self._database = db
        self._label = label

        self._depth = None
        self._height = None
        self._width = None

    @staticmethod
    def _query_dimensions(database, label):
        depth = database.get(label, 'depth')
        height = database.get(label, 'height')
        width = database.get(label, 'width')
        return depth, height, width

    def depth(self):
        if self._depth is None:
            self._depth, self._height, self._width = (
                self._query_dimensions(self._database, self._label)
        return self._depth

    def height(self):
        if self._height is None:
            self._depth, self._height, self._width = (
                self._query_dimensions(self._database, self._label)
        return self._height

    def width(self):
        if self._width is None:
            self._depth, self._height, self._width = (
                self._query_dimensions(self._database, self._label)
        return self._width

    def volume(self):
        return self.depth() * self.height() * self.width()

def test():
    table = Table('foo', Database.connect())
    print table.depth(), table.height(), table.width()
    print table.volume()

This new code is certainly much longer than the original code. However, this updated version is easier to follow because the points at which attributes get updated have been made explicit. Also, this offers better encapsulation by avoiding the caller to know how to recompute internal values. And this variant highlights that, maybe, we ought to have a Dimensions type to group the depth, height and width of an object.

To summarize: don’t use attributes for data that only need to be passed around methods. Attempt to make your methods independent from the instance, or at least think about them this way so that you realize what dependencies they have.

Comments from the original Blogger-hosted post: