Singleshot performance and other developments

I was dismayed by how slow parsing all that XMP was in Singleshot, when all I really wanted out of it was the keywords that PIL’s IPTC parsing was failing to get right. So I dug up the IPTC spec and wrote my own parser. Not only is it much faster (I don’t have to parse XMP at all, since virtually anything that writes XMP also writes IPTC data), but Singleshot no longer depends on PIL to get IPTC headlines, keywords, etc.

Singleshot was still dependent on PIL for EXIF data, though. In Singleshot 1.0 I used EXIF.py, but it was so slow that for 2.0 I made Singleshot use PIL’s EXIF parser when it was available. I profiled EXIF.py and tweaked a couple of things to get some more speed (around 25%), but it still wasn’t enough. One of the reasons EXIF.py was slow is that it gathered much more data than I actually cared about; it was even decoding the JPEG thumbnails in the EXIF data.
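To give a feel for why the IPTC side mentioned above is cheap to parse, here is a minimal sketch of walking a raw IPTC IIM block in Python. It is not Singleshot's actual parser. It assumes the IIM bytes have already been pulled out of the JPEG (they normally sit inside the Photoshop resource in the APP13 segment), and the name `parse_iim` is made up for illustration. Each dataset starts with the byte 0x1C, followed by a record number, a dataset number, and a two-byte big-endian length; in record 2, dataset 25 holds keywords and dataset 105 the headline.

```python
import struct

# Dataset numbers within IIM record 2 (the application record).
KEYWORDS = 25
HEADLINE = 105

def parse_iim(data):
    """Collect record-2 datasets from a raw IPTC IIM byte string.

    Returns a dict mapping dataset number to a list of raw byte values,
    since datasets such as keywords may repeat.
    """
    found = {}
    pos = 0
    while pos + 5 <= len(data):
        if data[pos] != 0x1C:      # every dataset starts with the 0x1C marker
            break
        record, dataset, length = struct.unpack(">BBH", data[pos + 1:pos + 5])
        pos += 5
        if length & 0x8000:        # extended-length datasets: not handled in this sketch
            break
        value = data[pos:pos + length]
        pos += length
        if record == 2:
            found.setdefault(dataset, []).append(value)
    return found
```

Calling `parse_iim(block).get(KEYWORDS, [])` would then give the keyword list as byte strings, with decoding left to the caller.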

So, using both EXIF.py and the EXIF 2.2 spec as references, I wrote my own EXIF parser that pulled out only the tags I wanted.
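The idea of reading only the wanted tags can be sketched roughly as follows. This is an illustration under stated assumptions, not the parser described above: it assumes the TIFF-structured EXIF block has already been extracted from the JPEG APP1 segment (after the `Exif\0\0` prefix), it only looks at IFD0, and it only handles ASCII values and single SHORT values. The function name `read_ifd0_tags` and the `WANTED` table are hypothetical.

```python
import struct

# A handful of IFD0 tags; everything else is skipped without being decoded.
WANTED = {0x010F: "Make", 0x0110: "Model", 0x0112: "Orientation", 0x0132: "DateTime"}

def read_ifd0_tags(tiff, wanted=WANTED):
    """Read selected ASCII and SHORT tags from IFD0 of a TIFF-structured block."""
    order = {b"II": "<", b"MM": ">"}[tiff[:2]]          # byte-order mark
    magic, ifd_offset = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header")
    (count,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    result = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i                 # each IFD entry is 12 bytes
        tag, typ, n = struct.unpack(order + "HHI", tiff[start:start + 8])
        if tag not in wanted:
            continue                                    # skip unwanted tags cheaply
        field = tiff[start + 8:start + 12]
        if typ == 3 and n == 1:                         # one SHORT, stored inline
            (value,) = struct.unpack(order + "H", field[:2])
        elif typ == 2:                                  # ASCII, NUL-terminated
            if n > 4:                                   # too big to fit inline: field is an offset
                (offset,) = struct.unpack(order + "I", field)
                raw = tiff[offset:offset + n]
            else:
                raw = field[:n]
            value = raw.rstrip(b"\x00").decode("ascii", "replace")
        else:
            continue                                    # other types are out of scope here
        result[wanted[tag]] = value
    return result
```

Because unwanted entries are skipped after reading eight bytes each, nothing like the embedded thumbnail ever gets touched.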

Before doing this, pulling all the metadata out of my tree of test images took around 9 seconds with EXIF.py and 4 seconds with PIL’s EXIF parsing. Now it takes 0.6 seconds.

I also finally finished the work to enable more image types: PNG and GIF are now supported when Singleshot is using PIL. Before I can add them to the ImageMagick processor, I need to dig up how to read the size of PNG and GIF images from the file header; I don’t want to have to fork off a process just to get image height and width.
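For reference, both formats keep their dimensions at fixed offsets near the start of the file, so a header read is enough. In a PNG, the 8-byte signature is followed by the IHDR chunk, whose width and height are 4-byte big-endian integers at bytes 16 to 23. In a GIF, the 6-byte signature (`GIF87a` or `GIF89a`) is followed by width and height as 2-byte little-endian integers. A minimal Python sketch, with hypothetical function names and no claim to be how Singleshot ended up doing it:

```python
import struct

def image_size(header):
    """Return (width, height) from the leading bytes of a PNG or GIF, else None."""
    if (len(header) >= 24
            and header[:8] == b"\x89PNG\r\n\x1a\n"
            and header[12:16] == b"IHDR"):
        # IHDR data starts at byte 16: width, then height, big-endian.
        return struct.unpack(">II", header[16:24])
    if len(header) >= 10 and header[:6] in (b"GIF87a", b"GIF89a"):
        # Logical screen width and height, little-endian.
        return struct.unpack("<HH", header[6:10])
    return None

def image_size_from_file(path):
    """Read just enough of the file to cover either header."""
    with open(path, "rb") as f:
        return image_size(f.read(24))
```

Only 24 bytes are read per file, so there is no need to spawn an external process for this.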