Date: 2009-11-19 21:45:00
python 2 to 3 upgrade and exception handling

I've just (partially) upgraded Psil to Python 3. I had originally been developing it in Python 2, but a few features of Python 3 make it a better choice. Psil has no particular need to run on Python 2, so I decided not to develop two parallel versions, and not to try running the same code in both Python 2 and 3, but just to upgrade wholesale to Python 3 and not look back.

There were in fact only a few simple changes that were required. The most obvious is the print() function, but I also ran into:

So I certainly got pretty good coverage of the major differences.

The problem I had with exception handling was related to my unorthodox method of handling tail recursion. When I ran a program that used a lot of tail recursion, the memory usage quickly went through the roof! Clearly there was a memory leak, even though Python's garbage collector is generally supposed to handle that for me. The clue to solving this lay in an obscure warning in the documentation for sys.exc_info:

Warning: Assigning the traceback return value to a local variable in a function that is handling an exception will cause a circular reference. This will prevent anything referenced by a local variable in the same function or by the traceback from being garbage collected.

I had read that warning, but I wasn't using the traceback returned by sys.exc_info at all, so I thought it shouldn't be a problem. However, Python 3 now automatically sets a __traceback__ attribute on every exception (see PEP 3134). Because I was calling a function referenced within the exception object itself, the presence of the traceback was creating a huge chain of unfreeable function and exception references.
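For readers who haven't seen the pattern, here is a minimal sketch of what "calling a function referenced within the exception object" can look like. It is not Psil's actual code: the `func` and `call_args` attributes, the `trampoline` loop, and the `count_down` example are all assumptions made for illustration. This stripped-down loop only shows the dispatch pattern and is not meant to reproduce the leak.

```python
class TailCall(Exception):
    """Carries the next function to call and its arguments (assumed layout)."""

    def __init__(self, func, *call_args):
        super().__init__()
        self.func = func
        self.call_args = call_args


def trampoline(func, *call_args):
    """Call func; whenever it raises TailCall, loop and call the target instead."""
    while True:
        try:
            return func(*call_args)
        except TailCall as t:
            # The next function to run is read from the exception object.
            func, call_args = t.func, t.call_args


def count_down(n, steps=0):
    """Toy tail-recursive function: each 'recursive' call is raised, not made."""
    if n == 0:
        return steps
    raise TailCall(count_down, n - 1, steps + 1)


# 10000 "recursive" steps, well past the default recursion limit of 1000.
result = trampoline(count_down, 10000)
```

Because every call is made from the loop in `trampoline`, the Python stack never grows deeper than one call to the target function, whatever the logical recursion depth.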

Fortunately, there was a simple solution:

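        except TailCall as t:
            # Keep our own reference: Python 3 unbinds `t` when the except block ends.
            a = t
            # Drop the traceback so it no longer keeps earlier stack frames alive.
            a.__traceback__ = None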

Setting the __traceback__ attribute explicitly to None releases the reference to previous stack frames and my code no longer leaks memory.
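The effect is easy to observe in isolation. The following is a small sketch, not taken from Psil: `Payload`, `raise_tail_call`, and `capture` are hypothetical names, and the weak reference is there only so the lifetime of a frame-local object can be watched. It shows that a saved exception's traceback pins the frames it passed through, and that clearing `__traceback__` lets them go.

```python
import gc
import weakref


class TailCall(Exception):
    """Stand-in for the exception type used in the snippet above."""


class Payload:
    """Hypothetical object standing in for data held by a stack frame."""


def raise_tail_call(refs):
    local_data = Payload()                # lives only in this frame
    refs.append(weakref.ref(local_data))  # lets us observe its lifetime
    raise TailCall()


def capture():
    refs = []
    try:
        raise_tail_call(refs)
    except TailCall as t:
        saved = t  # `t` itself is unbound when the except block ends
    return saved, refs[0]


exc, payload_ref = capture()

# Walk the traceback: it references the frame of every function it unwound.
frames = []
tb = exc.__traceback__
while tb is not None:
    frames.append(tb.tb_frame.f_code.co_name)
    tb = tb.tb_next

# The frame of raise_tail_call() is still alive, and so is its local variable.
alive_before = payload_ref() is not None

# Clearing the traceback releases the frames and everything they hold.
exc.__traceback__ = None
gc.collect()  # not needed on CPython's refcounting, harmless elsewhere
alive_after = payload_ref() is not None
```

Here `frames` lists `capture` followed by `raise_tail_call`, `alive_before` is true, and `alive_after` is false: as long as the exception keeps its traceback, everything those frames reference stays in memory.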

On the recommendation of the python-dev mailing list, I filed a documentation bug to clarify the warning quoted above.

Finally, I said at first that this was a partial upgrade, because I haven't even addressed the compiler part of Psil (that compiles Psil code to native Python). The modules and interfaces that I was using previously are either gone or changed in Python 3, so a slightly different approach is needed. More on that in a later post.

I thought they introduced cycle GC for this?
There is a cycle collector, but this wasn't a cycle. It was a long list of chained activation records and tracebacks, with an active head. Besides, you don't want to trigger the cycle collector more often than necessary, because it's not all that fast. (I did try a gc.collect() in the inner loop while investigating this, and it still leaked memory, just a lot more slowly.)
Greg Hewgill