The Mad Schemes of Dr. Tectonic — LiveJournal
Goings On of Late [Oct. 28th, 2013|11:41 pm]
I didn't post for a bit because my brain was overtaken by a muse (see previous post) and so now I have to catch up. Reverse-chronologically:

This weekend, I got both kitties to wear their harnesses for upwards of an hour! On Saturday and again on Sunday! Nico is happy to wear his (once you get it over his head), but being able to get Ioliel's on and not have her freak out is new and different. Also, I finally got their portraits framed and hung.

On Saturday, Jerry's school had their charity kick-a-thon, and when I saw his Facebook status update about it I thought, "Oh! It's a lovely day. I should go and be supportive and get out of the house." Buuuut... it ran from 1 to 3, I saw the status update a little after 2, and they ran a bit short, so by the time I was showered and clothed and over there, they were all done. But he appreciated the thought, and we had ourselves a little picnic in the grass near the studio, and it was quite nice.

And then on Saturday evening, Bob and Douglas came over. We had dinner at NooCo and froyo at Zinga and played some Carcassonne and Cards Against Humanity and had a lovely and delightful time.

Nothing much worth mentioning during the week last week. I think I started writing a rant about something floating around Facebook and got halfway through and then lost momentum on it. Maybe that was the week before. I saw a talk at work about formal CS methods for scientific programming (unit testing in particular), and I probably could rant about that at some length, but to abbreviate, I want to believe in it, but then every time I try to figure out what it would actually look like, I get frustrated because it all seems pointless and/or confusing and all the examples are stupid contrived nonsense about "here's how to unit test adding two numbers" which is NOT HELPFUL. There's no point in testing that because if you can't rely on addition to work properly, everything you do is doomed. If CS people want to get scientist-programmers to adopt better, more modern methods, we desperately need realistic examples to look at. /rant
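For what it's worth, a sketch of what a less contrived unit test for scientific code might look like: instead of "test that addition works", test a small numerical helper against a physical invariant. The helper and the bounds here are purely illustrative, not anything from the talk:

```python
import math


def area_weighted_mean(values, latitudes):
    """Mean of per-latitude values, weighted by cos(latitude).

    (Illustrative stand-in for the kind of small numerical helper
    that turns up in climate analysis code.)
    """
    weights = [math.cos(math.radians(lat)) for lat in latitudes]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Tests built on physical invariants rather than "2 + 2 == 4":

# 1. Averaging a constant field must return that constant, on any
#    grid -- a conservation property, not a contrived sum.
lats = [-60.0, -30.0, 0.0, 30.0, 60.0]
assert abs(area_weighted_mean([3.5] * 5, lats) - 3.5) < 1e-12

# 2. cos-weighting means a poleward value contributes less than an
#    equatorial one, so the mean is pulled below the unweighted 0.5.
assert area_weighted_mean([0.0, 1.0], [0.0, 60.0]) < 0.5
```

The point of the second test is that you often *do* know qualitative facts about the answer (its sign, its bounds, what it conserves) even when you can't write down the exact value.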

Oh right, that's what the last two weeks at work have been about: with the help of my grand-boss, we got this kriging thing working, cross-validating, and running at a not totally-unreasonable speed. It'll still need to run on a supercomputer to do the whole dataset, but the project is now in motion.

Last weekend I got to play D&D with 8-year-olds. It was fun but just a wee bit disorganized. Kids that age are totally chaos-aligned. I believe we did manage to impart some important lessons, like, "scouting before a fight is useful", and "pay attention to the range of effect on your spells" and "if you insist that your character is sleeping in the river, your character will get sick and be at -2 to everything the next day." I am also pleased to report that I didn't miss on my attack to finish off the main baddie, because it probably would have resulted in TPK of all the 1st-level kiddos. Whoops!

Also last weekend: bonfire at Pyro's! It was a big one - maybe 80 people? Had a fine old time socializing with all the bearfolks. (I enjoy these kinds of events so much, and I really wish I had more to say about them than "I had a great time!" but my brain is no good at retaining the kinds of details that make for interesting anecdotes, so all I'm left with is the bare outlines and a happy but abstract memory...)

And then Tuesday before last I skipped work and went to a birthday lunch for my grandmother, who has just recently turned ninety-nine years old! I had to spell it out, because that's a lot of birthday.

From: dendren
2013-10-29 02:57 pm (UTC)
*imagines your kitties in their harnesses walking all proud amongst the other leatherfolk at Folsom Fair*

congrats to your Gram on ninety-nine years. that is quite a long time :)
(Reply) (Thread)
From: nematsakis
2013-11-05 04:39 am (UTC)
So... unit testing.

Your example "here's how to unit test adding two numbers" is a bit of a strawman. Is that actually what was explained to you? The goal of unit testing is to test functional units. What are the smallest functional units you have in the programs you write? Do you have tests for them?

The principal value I see in unit testing is that it allows you to code with fewer bugs, but more importantly to refactor, repurpose, and fix bugs in long-lived code with much greater confidence than if you don't have tests.
(Reply) (Thread)
From: dr_tectonic
2013-11-05 05:23 am (UTC)
Googling for "unit testing examples", I have definitely seen adding two things as an example.

I get the basic idea of unit testing and I can see how it's useful if you're writing, say, a general-purpose library. My frustration is that it's not at all clear how it applies to scientific programming, and all the examples I can find are CS-oriented and therefore totally unhelpful.

What's the smallest functional unit in my code? I don't know! This is exactly the problem! I don't know how you decide what the smallest sensible chunk that's worth testing is, and nobody can give me any good guidelines.
(Reply) (Parent) (Thread)
From: dr_tectonic
2013-11-05 05:30 am (UTC)
I am vaguely starting to think that the answer is that unit testing per se is actually NOT very useful in scientific programming, that it's something that you use for what I'd call infrastructure programming, and that for scientific programming (which is usually not long-lived or refactored) you actually want to apply the same underlying principles but in a different way, doing something analogous that focuses not on the behavior of the code but instead on the structure of the data, because that's where all the longevity and mutation lie. But my ideas are all still very inchoate.
(Reply) (Parent) (Thread)
From: nematsakis
2013-11-05 05:40 am (UTC)
So... what makes "scientific programming" different from "programming"? I don't have any first-hand experience with this. If your code is really not long-lived, and you're only writing it to process a single data set then it seems silly to write unit-tests. Better to validate a number of key examples from your data set by hand.

I certainly don't write tests for all the code I write. However, if I have a module of code that I expect to be long-lived and called from multiple distinct applications, then I absolutely want tests for it that cover all the functionality of each application. Otherwise, it becomes really painful to modify the code to improve application A without worrying about breaking application B, and so on.
(Reply) (Parent) (Thread)
From: dr_tectonic
2013-11-05 06:28 am (UTC)
So, example:

Today I wrote a program in NCL (a scripting language) that interpolates data from one grid to another using a particular library method. You give it the name of the file that has your input data in it, the name of the output file to store things in, and the name of a file that has all the weights for going from this grid to that grid. (Generated using scripts I wrote last week.)

This will be used to regrid a half-dozen different variables from a half-dozen different simulations, each of which has its own grid. I could code the looping over those different inputs into the script, but it's the kind of thing that I may re-use in a different context, so instead I make it generic, and just pass in the filenames from the command line. (Using a convenience shellscript I wrote a long time ago that takes care of all the tedious argument quoting and so on.) I do the looping on the command line, so I have a file named NOTES that has chunks of commands I can cut-and-paste into an xterm. If that gets messy or long and I need to redo it a lot, it will probably mutate into a proper shell script.

The interpolation script itself is pretty straightforward. It opens up the files and reads in data, hands it off to the library function, takes the results it gets back and cleans them up a little and tacks on some extra metadata, and then writes it all out to a file. If any of those steps goes wrong, the interpreter will probably spew a useful message and die. And I don't think adding tests buys me much -- either it's a problem with the way the code is being used (invocation / inputs), or it's a problem on par with "addition isn't working properly".

If you view the entire program as the smallest usable unit (taking the view that the real programming resides in the way I composite all these little scripts together and apply them to datasets, and treating this one script as basically a function), the problem is that generating a synthetic test case would be vastly more work than the analysis I'm trying to perform. And in some cases, it's not clear that I can even generate an automatically-validated synthetic test case. I can easily visually inspect actual data run through the regridding and verify that it looks "the same" before and after -- but I don't know what the numeric values ought to be if I were to gin up fake data to run through it. And if my output is not a datafile but a visualization... how do you automatically test whether a plot looks sensible? And if the tests aren't automatable, what's the point?
(Reply) (Parent) (Thread)
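One partial answer to the "I don't know what the numeric values ought to be" problem is to test properties of the regridded output rather than exact values. A sketch (in Python rather than NCL, with a toy nearest-neighbour regridder standing in for the real library call):

```python
import numpy as np


def regrid_nearest(field, src_lats, dst_lats):
    """Toy 1-D nearest-neighbour regridder standing in for the real
    library routine; the checks below care only about its properties."""
    idx = np.abs(src_lats[None, :] - dst_lats[:, None]).argmin(axis=1)
    return field[idx]


src = np.linspace(-90, 90, 181)  # 1-degree source grid
dst = np.linspace(-90, 90, 73)   # 2.5-degree destination grid

# Property 1: a constant field must come back constant -- no need to
# know any "correct" interpolated values in advance.
const = np.full_like(src, 7.25)
out = regrid_nearest(const, src, dst)
assert np.allclose(out, 7.25)

# Property 2: a smooth field's mean should survive regridding to
# within a loose tolerance (nearest-neighbour, so not exact).
smooth = np.cos(np.radians(src))
out = regrid_nearest(smooth, src, dst)
assert abs(out.mean() - smooth.mean()) < 0.05
```

The same idea applies to the real pipeline: constant fields, conserved means, and preserved min/max bounds are all checkable without hand-computing the interpolated values.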
From: nematsakis
2013-11-05 03:19 pm (UTC)
So "read some data, call some library functions you don't own, write some data" seems too simple to test. But even such a simple task might have opportunities for testable modules.

For example, you might discover one day that your input files were corrupted in some way that your program passed through silently, giving you garbage data. And so maybe you want to write a validating input routine which will fail in these cases, that you will share among all modules that read the same input files.

Or maybe you own the library module and will be calling it from multiple scripts. In this case, you probably should write tests for it. In general, if you have a function or class that you expect to be long-lived and call from many different scripts, writing tests will help prevent that module from being brittle and keep the scripts running over time as you improve the module.

An example from my work which might be relevant: I write a lot of command line scripts that all take similar types of arguments. Some things I typically want to pass my scripts are dates, date ranges, sequences of primitive values (ints, floats, dates, strings), file paths, etc. I wrote a library of command-line argument validating routines. So now, in my scripts if I want to say "this argument is a float between 0 and 1" it's a one liner, but if the user supplies an invalid value the script fails immediately and gives a useful error message "argument foo should be between 0 and 1, you gave value X".

I wrote lots of test cases for these argument parsers to ensure that they handled various edge cases of valid and invalid input. I think it improved both the design and underlying code.
(Reply) (Parent) (Thread)
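nematsakis's validator idea, sketched in Python with argparse (the original library was presumably something else, and all the names here are hypothetical):

```python
import argparse


def bounded_float(lo, hi):
    """Factory for an argparse ``type`` that enforces lo <= x <= hi,
    failing fast with a message naming the bad value."""
    def parse(text):
        try:
            x = float(text)
        except ValueError:
            raise argparse.ArgumentTypeError(f"{text!r} is not a float")
        if not lo <= x <= hi:
            raise argparse.ArgumentTypeError(
                f"should be between {lo} and {hi}, you gave {x}")
        return x
    return parse


parser = argparse.ArgumentParser()
# The one-liner: the constraint is declared where the argument is.
parser.add_argument("--threshold", type=bounded_float(0.0, 1.0))

args = parser.parse_args(["--threshold", "0.25"])
assert args.threshold == 0.25

# Edge cases like these are exactly what the validator's own unit
# tests would pin down:
ok = bounded_float(0.0, 1.0)
assert ok("0") == 0.0 and ok("1") == 1.0   # bounds are inclusive
try:
    ok("1.5")
except argparse.ArgumentTypeError:
    pass
else:
    raise AssertionError("out-of-range value was accepted")
```

The validator library itself is the long-lived, many-callers module, so it is the natural place for the tests to live.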
From: dr_tectonic
2013-11-06 10:48 pm (UTC)
Thank you! That is really helpful!

Yeah, with my work, it's really not the code that needs testing, it's the data, because that's what breaks things when it changes. Unfortunately it's not practical to validate it on the fly as I read it in from a file. But I am doing a LOT of QC on the data files before they get analyzed, and it seems useful to think about the evolution of that QC process as being, essentially, TDD.

Definitely I will try to keep in mind that random snippets or functions I have that get used in more than one place are good candidates for unit testing, even if it wouldn't make sense to apply the framework to the whole program.
(Reply) (Parent) (Thread)
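The "QC as TDD" idea can be made concrete as a small suite of data checks that grows every time a new data problem is discovered, the way a test suite grows with each bug. A sketch (the field names and plausibility bounds are purely illustrative):

```python
def qc_record(rec):
    """Return a list of problems with one data record; empty means clean.

    Same mindset as unit tests, but aimed at the data (structure,
    ranges, missing values) rather than the code.
    """
    problems = []
    for key in ("time", "lat", "lon", "tas"):
        if key not in rec:
            problems.append(f"missing field: {key}")
    if "lat" in rec and not -90.0 <= rec["lat"] <= 90.0:
        problems.append(f"lat out of range: {rec['lat']}")
    if "tas" in rec and not 150.0 <= rec["tas"] <= 350.0:
        problems.append(f"implausible temperature (K): {rec['tas']}")
    return problems


good = {"time": "2013-10-28", "lat": 40.0, "lon": -105.0, "tas": 281.2}
bad = {"time": "2013-10-28", "lat": 40.0, "tas": 9999.0}

assert qc_record(good) == []
assert qc_record(bad) == ["missing field: lon",
                          "implausible temperature (K): 9999.0"]
```

Each time corrupted data slips through, the "fix" is a new check added here, so the QC suite ratchets forward just like a regression-test suite does.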