Tuesday, April 21, 2015

Type hinting on Python

If you missed it, it seems right now there's a long thread going on related to the type-hinting PEP:

https://mail.python.org/pipermail/python-dev/2015-April/139221.html

It seems there's a lot of debate and I think the outcome will shape how Python code will look some years from now, so, I thought I'd give one more opinion to juice things up :)

The main thing going for the proposal is getting errors earlier by doing type-checking (which I find a bit odd in the Python world with duck-typing, when many times a docstring could say it's expecting a list but I could pass a list-like object and could get fine with it).

The main point raised against it is is that with the current proposal, the code becomes harder to read.

Example:

def zipmap(f: Callable[[int, int], int], xx: List[int], yy: List[int]) -> List[Tuple[int, int, int]]:

Sure, IDEs -- such as PyDev :) --  will probably have to improve their syntax highlighting so that you could distinguish things better, but I agree that this format is considerably less readable. Note that Python 3 already has the syntax in-place but currently it seems it's very seldomly used -- http://mypy-lang.org/ seems to be the exception :)

Now, for me personally, the current status quo in this regard is reasonable: Python doesn't go into java nor c++ land and keeps working with duck-typing, and Python code can still be documented so that clients knows what kinds of objects are expected as parameters.

I.e.: the code above would be:

def zipmap(f, xx, yy):
    '''
    :type f: callable(tuple(int, int))->int
    :type xx: list(int)
    :type yy: list(int)
    :rtype: list(tuple(int, int, int))
    '''

Many IDEs can already understand that and give you proper code-completion from that -- at least I know PyDev does :)

The only downside which I see in the current status quo is that the format of the docstrings isn't standardized and there's no static check for it (if you want to opt-in the type-checking world), but both are fixable: standardize it (there could be a grammar for docstrings which define types) and have a tool which parses the docstrings and does runtime checks (using the profiler hook for that shouldn't be that hard... and the overhead should be within reasonable limits -- if you're doing type checking with types from annotations, it should have a comparable overhead anyways).

Personally, I think that this approach has the same benefits without the argument against it (which is a harder to read syntax)...

If more people share this view of the Python world, I may even try to write a runtime type-checking tool based on docstrings in the future :)

2 comments:

  1. Anonymous3:50 AM

    What you want is exactly what the are doing: standardizing the format. But instead of letting the syntax inside a docstring they are putting it outside.

    Also (and related with doing it outside docstrings), they are creating a new extension to allow the type hints to exists separately from the code.

    And, everything is optional.

    Seems pretty well rounded from my point of view.

    ReplyDelete
  2. Anonymous4:04 AM

    One of the main weaknesses of Python - maybe the main weakness - is that a lot of errors are not caught at compile time, while they would be in other languages.

    Type hints will help to catch a lot more errors, in IDEs and other tools, before the code is executed. This may not seem much, but for big projects where robustness matters, this is... huge !

    Please do not underestimate the importance of type hints, they are relevant.

    ReplyDelete