As most of you know, most of my programming work relates to data processing. One of my responsibilities with the Hurricane Research Division, and the H*Wind team to be exact, has been to process data and insert it into our database. So over the years (almost 12) I've written a number of programs to help along the way. A few years ago (before I discovered Python), I wrote a rather significant program for my own use: a generic parser. I wrote it as part of a project to re-analyze Hurricane Katrina. We had data coming in from all over the place and almost none of it was in the same 'format'. I put format in quotes because I mean column order, field delimiters, units, etc.; almost all of the data was ASCII. Anyway, I'm digressing from my point. The parser was designed so that the user (basically me) would simply list the column order and such, and out would pop a file ready for the database.
Now to the point of the blog post title: today (and yesterday) I was trying to figure out why my program wasn't working. The default delimiter in my program is WHITESPACE, meaning any character that is not a letter, digit, or symbol: space, tab, return... Well, in comes this file I was processing. It was generated in Fortran, so my first assumption was that it should be space delimited, but it was not; it was TAB delimited. Okay, no problem, my program can do that. Except it didn't. For some reason, TAB was not being recognized as whitespace... argh. However, when I explicitly tell my program with command-line arguments to accept TAB as a delimiter, it works... argh... Oh well, at least it works, thankfully.
I'm sure the issue has something to do with the file being prepared on an HP-UX machine, transferred to a Linux box, emailed to a Mac box, and then processed. You know somewhere in there something had to get lost in whitespace translation. Anyway, again, the program works, but it is always fun when you write a program to do something and it doesn't work the way it should. Thank goodness for Unix and the many workarounds that are available.
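
When a file has bounced between machines like this, one quick check is to count which whitespace or non-printing characters it actually contains. The following is only a minimal sketch of that idea in Python, not what I actually ran; the function name `count_separator_chars` and the sample record are made up for illustration.

```python
from collections import Counter


def count_separator_chars(text):
    """Count every whitespace or non-printing character found in text.

    Keys are the repr() of each character so that tabs, carriage returns,
    and other invisible characters show up in readable form.
    """
    counts = Counter(ch for ch in text if ch.isspace() or not ch.isprintable())
    return {repr(ch): n for ch, n in counts.items()}


# Hypothetical two-record sample: tab-delimited with Windows-style line endings.
sample = "12.5\t-80.1\t35.0\r\n13.0\t-80.3\t36.2\r\n"
print(count_separator_chars(sample))
```

If something other than a plain space or tab turns up in the counts, that is a good hint about what got mangled in transit.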
Of course, this whole experience is another reason I'm really starting to like Python programming. Maybe I should take the time to rewrite this C program in Python... then again, it works 99% of the time, so if it isn't broken, why fix it?
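
For what it's worth, the delimiter handling itself would be short in Python. This is a rough sketch under my own assumptions, not a port of the C program: `split_fields` and its `delimiter` parameter are hypothetical names, and the sample records are invented. It relies on the fact that `str.split()` with no argument splits on any run of whitespace (tabs included), while an explicit separator keeps empty fields.

```python
def split_fields(line, delimiter=None):
    """Split one record into a list of field strings.

    delimiter=None mimics a WHITESPACE default: any run of spaces, tabs,
    or line endings separates fields. An explicit delimiter such as a tab
    splits on exactly that string and preserves empty fields.
    """
    if delimiter is None:
        return line.split()
    # Strip the line ending first so it does not stick to the last field.
    return line.rstrip("\r\n").split(delimiter)


# Hypothetical records, not real H*Wind data.
print(split_fields("12.5\t-80.1\t35.0\n"))              # ['12.5', '-80.1', '35.0']
print(split_fields("12.5\t\t35.0\n", delimiter="\t"))   # ['12.5', '', '35.0']
```

The second call shows why an explicit TAB option is still worth having: with the whitespace default, the two adjacent tabs would collapse and the empty middle column would silently disappear.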