I have been working on converting NOAA NCEP BUFR observation files to netCDF. If you have worked with BUFR data, especially the NCEP variety, you can probably already feel your blood pressure rising. The rest of you lucky ones should hope you never have to.
NCEP has an established library for en/decoding their BUFR data called BUFRLIB. This is a pretty robust library and works very well with their particular flavor of BUFR data. Unfortunately, it is written in FORTRAN. And I mean FORTRAN, not Fortran. The difference being that the former implies the 77 version and the latter implies more recent versions that try to incorporate some (pseudo) Object-Oriented Programming techniques.
After learning OOP (or at least gaining a functional understanding of it) about a decade ago, I… don’t… want… to… go… back. Also, I/O and data manipulation in FORTRAN are very painful to deal with if you have spent any time with Perl, Python, Java, or even C++. So what to do?
I need to convert the data. There’s a supported, well-functioning library that makes me long for the days of rotary phones and bell-bottom jeans – or there’s the intuitive, easy-to-use (iPhone-like) language(s) that would require me to go back and reinvent the wheel (i.e., write base classes that read the raw binary format). But wait, there’s a third option (and probably many more than that)…
A former colleague clued me in to F2PY which wraps Fortran (or FORTRAN) code such that it can be called by Python. Now we’re talking! Except that I didn’t know enough FORTRAN to really get it. (I took one class in the fall of 1992 and promptly core-dumped that first semester of my freshman year, though I did recall how rigid and user-unfriendly that language was and that I didn’t want to use it if I could avoid it.)
But I gave it a shot. The compiling was a real bear for many, many reasons, but I got it to work – sort of. Thanks to the prolific use of COMMON blocks and EQUIVALENCE statements, extracting the data I needed was not as straightforward as I had hoped. Once I did get things to sort of work, it was sloooooooooow. Like 15 times slower. There are many reasons for that, chiefly Fortran’s column-major array ordering and, in particular, multi-dimensional character arrays. So I went back to pure Fortran and explored the pseudo-OOP capabilities.
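To make the array-ordering point concrete, here is a minimal NumPy sketch of the difference between C (row-major) and Fortran (column-major) layout. It does not call BUFRLIB or F2PY at all; the array names are hypothetical and the BUFR mnemonics are only illustrative stand-ins for what a wrapped routine might return. The assumption is that a wrapper expecting Fortran-ordered input has to copy any C-ordered array it is handed, which is one plausible contributor to a slowdown like the one above, not a diagnosis of it.

```python
import numpy as np

# Hypothetical stand-in for a 2-D block of observation values.
c_arr = np.arange(6, dtype=np.float64).reshape(2, 3)  # C (row-major) layout
f_arr = np.asfortranarray(c_arr)                       # same values, column-major

# Same logical contents, different step sizes (in bytes) through memory.
print(c_arr.strides)  # (24, 8): the last index varies fastest
print(f_arr.strides)  # (8, 16): the first index varies fastest

# Converting a C-ordered array to Fortran order allocates a new buffer...
converted = np.asfortranarray(c_arr)
copied = not np.shares_memory(converted, c_arr)

# ...while an array that is already Fortran-ordered is passed through as-is.
passthrough = np.asfortranarray(f_arr)
no_copy = np.shares_memory(passthrough, f_arr)
print(copied, no_copy)

# Multi-dimensional character data: fixed-width 8-byte strings, stored
# column-major here. The mnemonics are illustrative only.
chars = np.array(
    [["TMDB", "PRLC", "WDIR"],
     ["TMDB", "PRLC", "WSPD"]],
    dtype="S8",
    order="F",
)
print(chars.strides)           # (8, 16): 8 bytes per string, columns 16 bytes apart
print(chars[0, 1].decode())    # elements come back as bytes and need decoding
```

If copies like this turn out to matter, one option is to allocate arrays with `order="F"` up front on the Python side so nothing has to be reshuffled at the boundary.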
Truth be told, if I stuck with either the pure Fortran or the F2PY route I would probably be done by now.
However, due to my impatience/ADD, I went back and forth between the two. Though this has been slow and painful, I have learned an awful lot about Fortran, Python, compiling, NumPy arrays, and so forth. I have little doubt that my next iteration will yield the same simultaneous feeling of wisdom and disgust that this one has.
I could go on in great detail about this experience and I might in later posts, but for now I’ll post a few takeaways: