davidbau.com Your Compiler Vanishes in A Puff of Logic

May 22, 2009

Your Compiler Vanishes in A Puff of Logic

I can write a program in Java about twice as fast as as I can write the same program C or C++, and I can bang out the same program in Python maybe twice again as fast as I can in Java. In Python, you just get straight to the point and say what you mean by saying things like:

weekly = [sum(daily[j:j+7]) for j in range(0, len(daily), 7)]

In an ideal world, high-level languages like Python would replace all other programming languages.

In principle, a program written in a high-level language like Python should be able to execute faster than one written in a low-level language like C, because fewer of the implementation details are constrained by the programmer. The less the programmer says, the more flexibility the compiler has to speed things up. Of course, in reality high-level languages are much slower than low-level languages, so at Google we write most of our performance-sensitive code in C++. Even though we need to invest maybe four times as much effort.

But the PyPy guys have not given up the dream of fast high-level languages. They are making progress on an epic project to make python - and all interpreted languages - speedier than C. And they are doing it by sticking to their guns: instead of getting in the dirt and implementing a pile of low-level tricks, they are raising the level of abstraction to ever higher heights, writing their compiler, interpreters, profilers and jitters all in highly abstract python, and then applying the compiler to itself to make the final product.

Along the way, they have only needed to resort to one or two easily-explained dirty tricks.

PyPy's progress on a tracing JIT for a python meta-interpreter are documented in a paper by Bolz, Cuni, Fijalkowski, and Rigo. If you are into language technology, it is well worth a read. Their project is akin to engineers who dare to build a ladder to the moon. It seems improbable, but once they do it, it will change the world.

Someday we will all program in python.

Posted by David at May 22, 2009 07:26 PM

Comments

I don't disagree with the vision, but if it works out, it is highly unlikely to be Python for the long run. I'd rather program in virtually any other dynamic language out there.

Posted by: Sam at May 23, 2009 02:12 AM

I disagree with your basic premise:

QUOTE: In principle, a program written in a high-level language like Python should be able to execute faster than one written in a low-level language like C, because fewer of the implementation details are constrained by the programmer. END QUOTE

We have seen this argument many times before, except with "c" and "assembly" substituted for "python" and "c". For decades, people have been saying that it's only a matter of time before C compilers can consistently perform better than hand-written assembly - and yet the reality today is that critical hotspots are still hand-written in assembly. Some people still assert that it is only a matter of time until the compilers get there, but my opinion is that there must be something fundamentally flawed about the argument that it should be possible.

I believe the flaw in the argument is that the higher-level languages are actually *more* constrained, because they have made too many broad promises and are unable to tell when they can violate them without ill effect, even though it is obvious to the programmer. We might think that your example above would be a perfect case in which we could parallelize the loop... but what if sum() had side effects, such that it would produce the wrong result if not evaluated in order? What if sum() didn't have any directly detectable side effects, but called another function which called another function which did... sometimes, from code in a runtime eval? The compiler has no way of knowing this, so it has to assume the worst, and this is why it ends up being unable to make all the optimizations that one thinks it "should" be able to make. The human looks at it and says "duh, it's called 'sum'. If it has side effects, I'm going to smack somebody up the side of the head" and proceeds to optimize in the expected ways.

Python gives assurances that comprehensions such as your example will evaluate in order, as if they were a for loop. You might say that perhaps it shouldn't have. Maybe this isn't an inherent problem, and it's just that the design of python, like many actual languages, just happens to be totally wrong and make too many promises to the programmer. I think that if we did have a language which required the programmer to explicitly specify which assurances they needed on ordering and side effects, however, it would be sufficiently verbose and time-consuming that nobody would want to use it, because it would take four times as long as programming in python, just like C does.

I think what I'm suggesting is that ultimately the resulting execution speed of the program will be proportional to the total amount of information transferred from the programmer to the computer, that writing in a lower-level language is more or less equivalent to writing in higher-level language which is annotated to tell the compiler what it does and doesn't have to assure, and that only incremental gains in this are possible and not order of magnitude gains.

Posted by: at May 23, 2009 09:46 AM

Hey, it ate my name! The last comment was from Terran.

Posted by: Terran at May 23, 2009 09:46 AM

The language you infer to exists and for the problems it solves isn't very verbose. SQ' is part of that class of languages. True that you could still spend more time writing a highly optimized code in some lower level lang. But mostly we don't. At least not for on off queries.

Posted by: jsk at May 23, 2009 10:13 PM

Such a pessimist Terran!

The interesting turn of events in recent years has been the rise of dynamic optimization based on tracing, rather than reliance on pure static analysis.

I agree that the decades have shown that, in practice, you cannot expect a compiler to succeed enough at 'proving correctness' of optimizations to be able to do as well as a human in generating a single copy of the code that runs speedily and correctly in all cases.

However, the tracing approach - optimizing code for the common case rather than requiring that optimizations be always-applicable - shows real promise, and it seems clear that we are just at the beginning of it.

Someday, it seems to me, tracing compilers will be able to figure out things like how to omit the bulk of storage for records that have fields almost always zero. Tracing compilers will be able make optimizations like this that humans are unwilling to, because tracing compilers will be willing to carefully profile actual usage; and they will be willing to emit serpentine complicated adapter code to connect the common cases to the fully-correct case....

There is reason for optimism, if your timeframes are long enough.

Posted by: David at May 24, 2009 02:49 PM

I don't buy it.

If a fast compiled dynamic HLL was going to replace the static compilation crowd then we would have seen Common Lisp replace C/C++ and Java long ago...

Posted by: John Connors at June 20, 2009 01:51 PM

Code in Assembler and even C yields one possible implementation that solves the problem. The compiler is aware of one implementation, but not the original problem. A declarative programming language gives the compiler the problem, and the compiler is free to find the best possible implementation. Unroll the loops, parallelize, offload some number crunching to the GPGPU. These tricks might be beyond the ability of an assembly or C programmer. As hardware becomes more varied, with more features, expect that compilers will become better at optimization, and expect that Just In Time compilation will be the norm. But whether Python is the best balance between declarative, functional and imperative programming remains to be seen.

Posted by: Xalem at June 21, 2009 12:58 AM

I am surprised you point to PyPy, which seems like it will forever remain an academic project. I think Unladen Swallow is more likely to end up becoming the main line.

Posted by: Kevin at June 22, 2009 10:01 AM

Eww, Python. How about... oh, I don't know, ANYTHING else. Perl, even, would be better than Python.

Posted by: dude at June 22, 2009 10:33 AM

If you advocate higher-level languages, like Java over C, and Python over Java, then why stop there? What about an even higher-level language, like Lisp?

(Python is phenomenal at one-liners, as you demonstrate, but I tend to run into trouble using it for large programs. The lack of conditions, macros, symbols, (unhindered) lambdas, etc., end up hurting me more than concise one-liners helps me. It even has great compilers, so it's typically far faster than Python.)

A likely answer: "because its syntax doesn't look familiar to me". And thus, you also have the answer to why people stick with Java and C++ over Python. :-)

Posted by: John at June 22, 2009 11:13 AM

Python is great as executable pseudo-code! It lets me write stuff quickly and accurately. Like David, I find myself more productive with Python than any other language (of course the batteries included aspect helps greatly).

What we need is a very optimising translator that will convert debugged Python code into C / C++ and then can be compiled for production use if execution time is important. Unlike Lisp, et al. I can actually read other people's Python code. As for large programs, appropriate use of packages, modules and classes will tame the largest project.

Of course, if you don't like Python and/or don't agree with it's philosophy, then FINE! use whatever you favour. But it is interesting to note that universities are ditching Scheme & Java for Python as the first programming language. After all, programming still is all about algorithms and data structures.

Posted by: CyberFonic at June 23, 2009 07:56 AM

CyberFonic: I've seen plenty of unreadably bad Python code, and Lisp code that was perfectly clear; in years of using both, I don't think either holds an obvious victory in clarity. Being able to write at the appropriate abstraction level is what makes code readable, and Python does a very good job at that at one level (a higher one than Java), but still lacks fundamental *concepts* that make good Lisp code good at large projects.

Every nontrivial program I've seen (including those in Python) has reimplemented large portions of Common Lisp. "Appropriate use of packages, modules, and classes" are not a substitute for conditions or symbols or macros. Or rather, when you try to reimplement them with just Python classes, you end up only slightly better than Java's "kingdom of nouns". (For example, how you implement conditions in Python, short of Greenspunning yourself an embedded interpreter?)

I don't find the university preference interesting at all. Universities have switched to some pretty shady languages over the years, and the switching seems to correlate, more often than not, with what's trendy in the workplace, not what teaches algorithms best. (In the mid-1990's, did anybody really think that C++ was the best tool for teaching algorithms? Or: Python was better than Java for teaching data structures 5 years ago, too, so why did they wait until Python became popular in industry to switch?) I think that says a lot more about the universities than about the languages.

Posted by: John at June 23, 2009 01:40 PM

In response to the above comment: "What we need is a very optimising translator that will convert debugged Python code into C / C++..."

Shed Skin is an "Optimizing Python-to-C++ Compiler" that compiles ordinary Python code to C++ in order to speed up the code, typically by an order of magnitude or more over standard Python. Still in experimental form, but it works on a variety of real-world examples.

http://code.google.com/p/shedskin/

Posted by: James at June 30, 2009 12:15 PM

What I like so much about jython is that I can use the python language, and leave the optimization up to the JVM, which is continually improving.

I especially like using real threads on multi-core servers, but with rapid development that python is conducive to, and java not so much. native threads in python are incredibly inefficient, they are even less efficient when you have more cores?! http://www.dabeaz.com/python/GIL.pdf -- not so in jython!

I'd like to see the perl guy above tackle a unicode xml transform problems with 16 cores and write that code in less than 30 minutes -- a special bonus: give it to a junior programmer to expand on.

Posted by: dingo at July 12, 2009 12:10 PM