The first thing I profiled was the grammar: since it is used in many places, I believed that improving it would have a good impact overall.
One of the things we end up doing when testing some stuff is embedding a file -- say, an image -- into a .py file. Some of those files can reach 3-4 MB, and Pydev would take a long time to parse them, so my first target was to speed that up.
Making 'general' changes and hoping things will get better is usually a mistake, as you need a way to measure the impact. So, I started with a file containing a couple of statements and a single 'huge' multiline string, and made parsing that file faster my target.
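To have a reproducible target for that kind of measurement, something along these lines can be used to generate the test file (a minimal sketch; the class name, output file name and size here are just placeholders for illustration, not the actual test code):

```java
import java.io.FileWriter;
import java.io.IOException;

// Generates a .py file with a couple of statements and one huge multiline
// string, roughly the shape of the file used as the optimization target.
public class MakeBenchmarkFile {

    public static void main(String[] args) throws IOException {
        int targetBytes = 3 * 1024 * 1024; // roughly 3 MB of string contents
        String line = "0123456789abcdef0123456789abcdef0123456789abcdef\n";

        StringBuilder buf = new StringBuilder(targetBytes + 64);
        buf.append("a = 10\n");
        buf.append("b = a + 1\n");
        buf.append("data = '''\n");
        while (buf.length() < targetBytes) {
            buf.append(line);
        }
        buf.append("'''\n");

        try (FileWriter out = new FileWriter("huge_string_test.py")) {
            out.write(buf.toString());
        }
        System.out.println("Wrote huge_string_test.py (" + buf.length() + " chars)");
    }
}
```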
After playing around a little in the parser, I discovered that the actual speed loss was not in the grammar itself, but in the Reader that feeds the chars to the tokenizer. Looking at its code, you could see it allocated lots of memory in the process, so I decided to create another reader from scratch, with the help of some unit tests, and the results were pretty impressive (for big files):
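The core idea is simple: read the whole document into a single char buffer once, and then just hand chars out by index, so the tokenizer's reads and backups don't allocate anything. A minimal sketch of that idea (the class and method names are illustrative only, not Pydev's actual reader):

```java
import java.io.IOException;
import java.io.Reader;

// Sketch: the whole document lives in one char[]; reading a char is just an
// index increment, and backing up just moves the index backwards.
public class BufferedCharSource {

    private final char[] buffer;
    private final int length;
    private int pos = 0;

    public BufferedCharSource(Reader reader, int expectedSize) throws IOException {
        char[] buf = new char[Math.max(expectedSize, 16)];
        int len = 0;
        int read;
        while ((read = reader.read(buf, len, buf.length - len)) != -1) {
            len += read;
            if (len == buf.length) {
                // Double instead of growing by a fixed chunk, so the total
                // copy cost stays proportional to the file size.
                char[] bigger = new char[buf.length * 2];
                System.arraycopy(buf, 0, bigger, 0, len);
                buf = bigger;
            }
        }
        this.buffer = buf;
        this.length = len;
    }

    public int readChar() {
        return pos < length ? buffer[pos++] : -1; // -1 signals end of input
    }

    public void backup(int amount) {
        pos -= amount; // the tokenizer pushes back chars it read too far
    }
}
```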
Parsing a huge .py file (3 MB) used to take about 4-5 minutes... now it only takes 2-3 seconds (yeah, the previous approach had an 'exponential' behaviour that depended only on the size of the file, not to mention that it made the garbage collector work a lot harder).
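As a back-of-the-envelope illustration of how that kind of superlinear behaviour can show up (assuming, purely for illustration, a buffer that grows by a fixed chunk and copies everything read so far on each growth -- I'm not claiming that's exactly what the old reader did):

```java
// Rough copy-cost estimate for a reader that grows by a fixed chunk and
// copies its contents on every growth: the total chars copied grow roughly
// with the square of the file size.
public class CopyCostDemo {
    public static void main(String[] args) {
        int chunk = 2048; // fixed growth increment (assumed for illustration)
        for (int mb = 1; mb <= 4; mb++) {
            long size = mb * 1024L * 1024L;
            long growths = size / chunk;
            // Each growth copies everything read so far: chunk + 2*chunk + ...
            long copied = chunk * growths * (growths + 1) / 2;
            System.out.println(mb + " MB file -> ~" + copied + " chars copied");
        }
    }
}
```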
This will be available in 1.0.7 -- but before I release it, I'm still looking for other 'hotspots' to optimize ;-)