Friday, May 20, 2005


Hi all... just to keep you up-to-date...

Right now, I'm planning a better way to configure the pythonpath.

I'm thinking about having:
- System libs, configured together with the Python interpreter;
- Source folders, for each project;
- Project references to other projects (which bring their source folders into the pythonpath);
- External libs (not part of the system nor of any project).
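To make the idea a bit more concrete, here is a minimal sketch of how the four kinds of entries could be combined into a single pythonpath. Everything in it is an assumption for illustration: the `Project` class, the `build_pythonpath` function, the example paths, and the ordering (source folders first, then external libs, then system libs) are hypothetical and are not Pydev's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical project model, for illustration only."""
    name: str
    source_folders: list = field(default_factory=list)
    external_libs: list = field(default_factory=list)
    references: list = field(default_factory=list)  # other Project objects


def build_pythonpath(project, system_libs):
    """Return an ordered, duplicate-free pythonpath for `project`."""
    path = []
    seen_projects = set()

    def add(entry):
        # Keep the first occurrence of each entry.
        if entry not in path:
            path.append(entry)

    def visit(proj):
        # Guard against reference cycles (A -> B -> A).
        if proj.name in seen_projects:
            return
        seen_projects.add(proj.name)
        for folder in proj.source_folders:
            add(folder)
        # A referenced project contributes its source folders.
        for ref in proj.references:
            visit(ref)

    visit(project)
    for lib in project.external_libs:
        add(lib)
    # Assumed ordering: system libs go last.
    for lib in system_libs:
        add(lib)
    return path


core = Project("core", source_folders=["/work/core/src"])
app = Project("app", source_folders=["/work/app/src"],
              external_libs=["/opt/libs/extra"], references=[core])
result = build_pythonpath(app, ["/usr/lib/python2.4"])
print(result)
```

In this sketch, `app` picks up the source folder of `core` through its project reference, without `core` having to be listed as an external lib.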

The system libs part (GUI) is almost ready... in CVS it should already be working, but you cannot do much with it right now :-)

Tuesday, May 10, 2005

Pydev parser... the future??!?!?!

Yesterday I did some 'experiments' with ANTLR as an alternative to the current parser, which uses JavaCC.

Well, let me talk a little more about the current parser (it is borrowed from Jython)...

It uses JavaCC and asdl (which stands for Abstract Syntax Description Language).

Basically, JavaCC builds the asdl structure as it parses the code, so that when parsing finishes you have an asdl tree data-structure. This tree is traversed with the visitor pattern, and that is what is currently used to find things in the code, like tokens, definitions, etc.

The asdl structure gives us very complete information about the code: the tokens are hierarchically structured, and each one has its starting line and column in the code.
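As an illustration of the visitor idea, here is a small sketch that uses Python's own standard `ast` module rather than the Jython/JavaCC parser discussed in this post (the Python AST is also described with an asdl file, but the Java classes Pydev uses are different). The `DefinitionCollector` class and the sample source are made up for the example.

```python
import ast

SOURCE = (
    "class Greeter:\n"
    "    def greet(self, name):\n"
    "        return name\n"
    "\n"
    "def main():\n"
    "    pass\n"
)


class DefinitionCollector(ast.NodeVisitor):
    """Collect (kind, name, start line, start column) for each definition."""

    def __init__(self):
        self.definitions = []

    def visit_ClassDef(self, node):
        self.definitions.append(("class", node.name, node.lineno, node.col_offset))
        self.generic_visit(node)  # keep walking to find nested definitions

    def visit_FunctionDef(self, node):
        self.definitions.append(("def", node.name, node.lineno, node.col_offset))
        self.generic_visit(node)


collector = DefinitionCollector()
collector.visit(ast.parse(SOURCE))
for definition in collector.definitions:
    print(definition)
```

Each visitor method only handles the node type it cares about, and the nesting of the tree is what gives the hierarchy (here, `greet` is found while walking inside `Greeter`).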

The drawbacks (for me) right now in this structure are:

- We do not get the end of the token, just its start;
- We do not get any indentation info, because indent and dedent tokens are not supported in the asdl structure (as it is now);
- There is a huge lack of documentation for asdl;
- In my opinion, JavaCC is not as easy to work with as ANTLR;
- ANTLR seems to have much better 'error recovery' than JavaCC.
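To show the kind of information the first two points are about, here is a short sketch using Python's standard `tokenize` module. This is not the Jython parser nor ANTLR; it is only meant to illustrate what a token stream looks like when every token carries both its start and end position and when INDENT and DEDENT show up as tokens of their own.

```python
import io
import tokenize

SOURCE = "def f(x):\n    return x\n"

# generate_tokens() takes a readline callable and yields TokenInfo tuples.
tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))

for tok in tokens:
    # tok.start and tok.end are (line, column) pairs.
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)
```

With both positions available, a tool can select or replace the exact text of a token, and the INDENT/DEDENT tokens mark where each block opens and closes.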

So, I did some experiments and discovered I could do most of the things I want with ANTLR, but I still need to figure out how I want to handle the code after parsing. I guess I could use ANTLR to generate the asdl data-structure, but as I said, it is missing some things.

Options I have:
- Make ANTLR generate some structure I want. I would need to have all the info available in asdl (otherwise, I won't be able to do things such as refactoring in the future), plus token end and indentation data. Or I could try to extend the asdl structure a bit and keep it, as I think it is easy to deal with.

- Just extend the Jython JavaCC grammar. This would not allow better error recovery; the only thing I would get would be the decorators.

- Use ANTLR to generate the same asdl structure I have now. I would get error recovery and decorators.

Other tools would be built upon this structure, to get completions, definitions, references, etc.

Some notes:
asdlGen location:

Python.asdl location:

To generate the asdl structure from python asdl: python.asdl


Python ANTLR:


Jython (the JavaCC file for the python JavaCC grammar is here):