Variable Scoping in Python 2

21 Apr 2020

While working on a pull request for the Poetry core packaging module, I ran into variable scoping behaviour that I did not expect. Poetry still supports Python 2.7, and several tests failed on Python 2.7 even though all of them passed on Python 3.5 and above.

The tests failed because of an AttributeError:

for file in include.elements:  # type: List[Path]
    # omitted for brevity

    if file.is_dir():
        if self.format in formats:
            for f in file.glob("**/*"):  # type: List[Path]
                rel_path = f.relative_to(self._path)

                if (
                    rel_path not in set([f.path for f in to_add])
>                   and not f.is_dir()
                    and not self.is_excluded(rel_path)
                ):

AttributeError: BuildIncludeFile instance has no attribute 'is_dir'

As you can see on the first line, the type of the for-loop variable is a Path. The Path class has a useful method, Path.is_dir(), to check if it is a directory.

The inner for-loop traverses the directory’s Path objects with an ill-named variable called f. In this for-loop, we check three things: that f’s relative path, rel_path, is not already among the paths in to_add, that f is not a directory, and that rel_path is not excluded. to_add is a list of BuildIncludeFile objects, which themselves have the following attributes: path, parent, relative_path and resolve.

I used a list comprehension to turn the paths of to_add into a set[1] and check whether the traversed file is already in to_add. In this list comprehension, [f.path for f in to_add], I also used f as a variable name. This was my unfortunate mistake.

In Python 2.7, the f in the list comprehension (a BuildIncludeFile) does not merely shadow the outer for-loop’s f (a Path); it rebinds it. A list comprehension in Python 2 has no scope of its own, so once the comprehension has finished iterating, f still refers to the last BuildIncludeFile instance in to_add instead of the traversed Path from the outer for-loop. The second condition of the if-statement, not f.is_dir(), then raises the AttributeError, because I expected a Path but got a BuildIncludeFile, which has no is_dir() method.
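The mechanism can be sketched in a few lines. This is only an illustration: FakePath and FakeIncludeFile are hypothetical stand-ins, not Poetry's real classes, and the explicit for-loop imitates how a Python 2 list comprehension bound its variable in the enclosing scope. Written this way, the snippet behaves the same on Python 2 and Python 3.

```python
class FakePath(object):
    """Hypothetical stand-in for a Path: it has an is_dir() method."""

    def __init__(self, name):
        self.name = name

    def is_dir(self):
        return False


class FakeIncludeFile(object):
    """Hypothetical stand-in for BuildIncludeFile: has .path, no is_dir()."""

    def __init__(self, path):
        self.path = path


to_add = [FakeIncludeFile(FakePath("a.py")), FakeIncludeFile(FakePath("b.py"))]

f = FakePath("c.py")  # plays the role of the outer for-loop variable

# Imitates Python 2's `[f.path for f in to_add]`: the loop variable lives
# in the enclosing scope, so it overwrites the outer `f`.
paths = []
for f in to_add:
    paths.append(f.path)

leaked_type = type(f).__name__      # the outer f is now a FakeIncludeFile
has_is_dir = hasattr(f, "is_dir")   # so calling f.is_dir() would fail
```

After the loop, f is the last element of to_add, which is exactly the state the failing test was in when it called f.is_dir().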

A much simpler example of this phenomenon in Python 2 is the following[2]:

>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(x)  # Python 3 will throw a `NameError` on the other hand.
9

The fix for this variable scoping issue was easy. Renaming the outer for-loop variable from f to something more descriptive like current_file means the list comprehension can no longer overwrite it, which prevents any kind of unexpected scoping behaviour.
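A minimal sketch of the renaming fix, under a few assumptions: IncludeFile is a hypothetical stand-in for BuildIncludeFile, the names candidates, included and new_files are made up for illustration, and PurePosixPath comes from Python 3's pathlib so that nothing touches the file system.

```python
from collections import namedtuple
from pathlib import PurePosixPath

# Hypothetical stand-in for BuildIncludeFile with just a .path attribute.
IncludeFile = namedtuple("IncludeFile", ["path"])

to_add = [IncludeFile(PurePosixPath("pkg/a.py"))]
candidates = [PurePosixPath("pkg/a.py"), PurePosixPath("pkg/b.py")]

new_files = []
for current_file in candidates:
    # The comprehension variable has its own distinct name, so even where
    # it leaks (Python 2) it cannot overwrite `current_file`.
    already_added = set([included.path for included in to_add])
    if current_file not in already_added:
        new_files.append(current_file)
```

With distinct names, the loop variable keeps its Path value regardless of how the interpreter scopes the comprehension.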

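Another possible solution is to use a generator expression, such as set(f.path for f in to_add). A generator expression is evaluated in its own scope, even in Python 2, so its loop variable does not leak into the enclosing scope[3][4].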
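As a rough illustration of the generator-expression variant, the snippet below deliberately reuses the name f. IncludeFile and the string paths are hypothetical stand-ins chosen to keep the example short.

```python
from collections import namedtuple

# Hypothetical stand-in for BuildIncludeFile with just a .path attribute.
IncludeFile = namedtuple("IncludeFile", ["path"])

to_add = [IncludeFile("pkg/a.py"), IncludeFile("pkg/b.py")]

f = "pkg/c.py"  # plays the role of the outer for-loop variable

# A generator expression has its own scope, so this inner `f` is a
# different variable and the outer `f` is left untouched.
already_added = set(f.path for f in to_add)

is_new = f not in already_added
```

Reusing the name still hurts readability, so renaming remains the clearer fix; the generator expression only removes the leak.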

Python 3 gives a list comprehension its own scope, so its loop variable only temporarily shadows a variable of the same name in the outer scope. In Python 2, according to Guido, the list comprehension’s variables “leaked” into the outer scope because it was “an intentional compromise to make list comprehensions blindingly fast”. Guido called this Python’s dirty little secret.
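The difference between the two major versions can be checked with a short script. This is a sketch, not taken from the original pull request; the variable names are arbitrary and the expected value is picked from the running interpreter's major version.

```python
import sys

x = "outer"
squares = [x * x for x in range(4)]

if sys.version_info[0] >= 3:
    # Python 3: the comprehension's x lived in its own scope.
    expected = "outer"
else:
    # Python 2: the comprehension's x leaked and holds the last item.
    expected = 3

unchanged_or_leaked = (x == expected)
```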

It’s often said in the halls of colleges and between ping pong tables in startups that “there are only two hard things in Computer Science: cache invalidation and naming things”[5]. I never thought that the second of these two hard things, naming, would creep up on me as a variable in a list comprehension. 🤦‍♀️


  1. I built the set from to_add because I only needed each BuildIncludeFile.path and not the entire object. Had I not extracted just the paths of the BuildIncludeFile objects, checking for membership of the file in a simple set(to_add) would not have worked, since the two are of different types: Path and BuildIncludeFile.

  2. Thanks to Alex Louden for proofreading and providing this example.

  3. https://docs.python.org/2/reference/executionmodel.html#naming-and-binding

  4. https://docs.python.org/3.8/reference/executionmodel.html#resolution-of-names

  5. A quote by Phil Karlton. Martin Fowler has an entertaining and very short blog post about it.