Wednesday, November 26, 2008

Making code work in Python 2 and 3

Below are some tips for those interested in writing code that runs on Python 2 and Python 3 -- there are probably many other issues, but those were the ones I ran into while porting the code-completion code in Pydev (so, there's probably going to be a part 2 of this when I go on to port the debugger)

0. This may be one of the most important advices: breathe regularly while doing the porting... and be prepared to have uglier code waiting for you if you want to support both Python 2 and Python 3 -- if you do have a choice, don't try to support both versions of Python -- the way Python 3 is implemented, no one's supposed to do that.

1. Print can NEVER be used (use the write() method from objects or create your own print function -- with a different name and use it everywhere)

Catching exceptions putting the exception in a given var can NEVER be used (there's no compatible way of doing it in a way that's acceptable in both versions, so, just deal with it as you can using the traceback module)

socket.send needs bytearray: socket.send(bytearray(str, 'utf-8'))

4. socket.receive
gets bytes (so, they need to be decoded: socket.recv(size).decode('utf-8')

Some imports:

    import StringIO
    import io as StringIO #Python 3.0

    from urllib import quote_plus, unquote_plus
except ImportError:
    from urllib.parse import quote_plus, unquote_plus #Python 3.0

    import __builtin__
except ImportError:
    import builtins as __builtin__ #Python 3.0

There are way too many others, so, the approach for getting it right is basically running the 2to3 with that import to see the new version of it and then making it work as it used to.

6. True, False assign: in some scripts, to support older python/jython versions, the following construct was used:
__builtin__.True = 1
__builtin__.False = 0

As True and False are keywords now, this assignment will give a syntax error, so, to keep it working, one must do:
setattr(__builtin__, 'True', 1) -- as this will only be executed if True is not defined, that should be ok.

7. The long representation for values is not accepted anymore, so, 2L would not be accepted in python 3.

The solution for something as long_2 = 2L may be something as:
except NameError:
    long = int
long_2 = long(2)

Note that if you want to define a number that's already higher than the int limit, that won't actually help you (in my particular case, that was used on some arithmetic, just to make sure that the number would be coerced to a long, so, that solution is ok -- note: if you were already on python 2.5, that would not be needed as the conversion int -> long is already automatic)

8. raw_input is now input and input should be written explicitly as eval(raw_input('enter value'))
So, to keep backwards compatibility, I think the best approach would be keeping on with the raw_input (and writing the "old input" explicitly, while removing the "new input" reference from the builtins)

except NameError:
    import builtins

    original_input = builtins.input
    del builtins.input
    def raw_input(*args, **kwargs):
        return original_input(*args, **kwargs)
    builtins.raw_input = raw_input

9. The compiler module is gone. So, to parse something, the solution seems to be using ast.parse and to compile, there's the builtins.compile (NOTE: right now, the 2to3 script doesn't seem to get this correctly)

Thursday, November 13, 2008

Pydev & Python 3.0

One thing that should be added in the next release is support for Python 3.0 (aka Python 3k)

So, to cover that, I've started at the most basic part: being able to configure an interpreter with Python 3.0. And you guessed it: print does not work anymore, so, when I started checking how can support be added for Python 3.0 while still retaining compatibility with other versions (as most existent python library probably will have to do), something weird struck me: print is now something banned in the python world...

Not only won't I be able to use the print statement (because it's not available on Python 3.0), but I also won't be able to use the print function (because it's not available in older versions of Python), so, when considering the 2to3 tool, actually, what should be available is a 2toAny, where print is removed altogether in what's probably going to be the actual standard for python: instead of print 'x', the thing to have backward compatibility is writing: sys.stdout.write('x\n') -- which I think is really something sad -- probably a step backwards for python, where printing something used to be sane... unlike java things such as System.out.println()... But now not only it's very similar -- it also gets worse: you have to add the new line yourself.

So, in the 2toAny, what should be done is transforming print varA into sys.stdout.write('%s\n' % (varA,)) and print >> log, varA into log.write('%s\n', (varA,))-- much more pythonic right?!?

Well, aside from that, I wanted to say that besides that particular change, most other things seem to be consistent in taking the language forward... but that one seems just gratuitous breakage of most python code existent (it's not like print is used seldonly in the python world or anyone did actually dislike it).

I'm sorry about the tone of this post, but I must say I'm a bit disapointed with that particular change in Python at this stage... Am I missing something here? Is there another way to really have actual compatibility? -- Maybe defining my own print_ function and using it everywhere? Why did Python "fix" something that not only worked, but that everyone was happy about? (or wasn't?)

How about a python 2toAny tool, does someone know if such a thing exists? It's probably much more important than a 2to3 (as almost all libraries will need to keep supporting old versions and probably only few projects will actually migrate to 3.0 -- at least initially and until the libraries they use are able to support it).