Tuesday, August 31, 2004

The Python paradox and PIL

In a recent remarkable talk, and in a later clarification, Paul Graham said:

Python programmers are smart. It's a lot of work to learn a new programming language. And people don't learn Python because it will get them a job; they learn it because they genuinely like to program and aren't satisfied with the languages they already know.

Today I had to perform a simple image transformation in Python and decided to use the well-known PIL library. While working with PIL I could not help recalling this Python Paradox. The reason?

PIL's API is weird. Not only does it contain stupid bugs, it is also awkward to use. I can't judge the underlying C implementation, which may well be excellent, but the Python wrapper code is far from perfect. It is mediocre at best. So who said Python programmers are smart? :-)

Just to support this rant, here are some issues I discovered while trying to resize an image (a pretty typical use case, isn't it?):
  • to create an image you need a file-like object; a plain string won't work. This is actually very common in Python libraries and can be seen quite often in the standard library as well, but it is still irritating. Why can't the library provide both methods instead of forcing the user to mess with a StringIO buffer? From the implementation point of view, one method could easily be a thin wrapper over the other. It is easy to do and facilitates unit testing. The only excuse could be the laziness of the library author.
  • you have to save the image just to get the image data. To me this feels awkward: why not provide a method to get (or compute, who cares?) the image data directly? The save() method itself makes the problem worse.
  • the save() method is just weird. Firstly, as you may have guessed, it requires a file-like object. Secondly, it requires you to specify the format in which to save the image. That would be OK if the parameter were optional, but it is mandatory! In my case I simply don't know the original format, nor do I care. I just want to resize an image and get the result. As it is, I had to determine the image type first, despite the fact that the resizing routine accepts many different kinds of image files. And the final stroke of untidiness: the format is specified as a string and never explicitly checked for correctness. So if you pass JPG instead of JPEG you get a KeyError whose very informative explanation text is "JPG".
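To show how little code the wrappers I am asking for would take, here is a minimal sketch using only the standard library. Nothing in it is PIL's API: load_from_file and save_to_file are hypothetical stand-ins for a library's file-based entry points, normalize_format and its format table are invented for illustration, and I am assuming a modern Python where io.BytesIO plays the role of the StringIO buffer.

```python
import io

# Hypothetical stand-ins for a library's file-based API (NOT PIL functions).
def load_from_file(fp):
    """Read the raw data from a file-like object."""
    return fp.read()

def save_to_file(data, fp):
    """Write the raw data to a file-like object."""
    fp.write(data)

# Thin wrappers: the bytes-based variants simply delegate to the
# file-based ones, so the caller never touches a buffer.
def load_from_bytes(raw):
    return load_from_file(io.BytesIO(raw))

def save_to_bytes(data):
    buf = io.BytesIO()
    save_to_file(data, buf)
    return buf.getvalue()

# Hypothetical format table; a real library would keep its own registry.
KNOWN_FORMATS = {"JPEG", "PNG", "GIF"}
ALIASES = {"JPG": "JPEG"}

def normalize_format(name):
    """Return the canonical format name, or raise a descriptive error."""
    key = name.strip().upper()
    key = ALIASES.get(key, key)
    if key not in KNOWN_FORMATS:
        raise ValueError(
            "unknown image format %r; expected one of: %s"
            % (name, ", ".join(sorted(KNOWN_FORMATS)))
        )
    return key
```

With something like this in place, load_from_bytes(data) and save_to_bytes(image) cover the plain-string case, and normalize_format("JPG") either maps the name to "JPEG" or fails with a message that names the accepted formats.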

On the positive side, the functionality is excellent: the resizing routine performed fine (after I specified a more sophisticated resampling filter instead of NEAREST).


Anonymous Steven said...

I could not agree more!
All handling of binary data in Python is a mess.

12:45 PM  
