Tests from the past
Once upon a time, I worked for a large telecommunications company. In fact, I just wrote a little bit about it over on the Pathfinder Agile Ajax blog, which reminded me that I had been meaning to talk about how my team did testing on my primary project there. This is in the spirit of a real-world process report: doing the best you can under less-than-perfect conditions for test-driven development.
Here’s a brief, necessarily somewhat vague, description of the project. It was a Java Swing UI for configuring an elaborate network. The input was several screens’ worth of information about the topology of the various devices. The output was dozens, if not hundreds, of text configurations for the devices in the network. There was also a validation step, where the configuration would be tested against rules for legal configurations that were frankly quite Byzantine.
Anyway, at the time we’re talking about here, the main part of the application had a 5-tier architecture (developed before I got there, and not from a test-driven process), and creating object clusters for unit tests was prohibitively irritating. With significant investment in creating mock structures, we were able to reduce that to merely very irritating, but even so, large parts of the system, especially in the GUI, were basically opaque to unit testing without a total overhaul.
The output tier was a little better. Output was managed via Velocity templates, and the context objects behind them were developed test-first and were separated enough from the rest of the system that unit tests for the logic were feasible there. However, the actual Velocity output wasn’t easy to test directly: Velocity doesn’t have an easy mechanism for testing partial output in a unit test structure.
Two more points, and then we’ll get to the plan.
- You know how I say that tests are a great device for testing that the program does what the developer thinks it does, but not all that great for testing that it’s doing what it’s supposed to do? Big factor here. Since the output of the program was a hardware configuration file, there were an infinite number of ways that the program could avoid error while still producing flawed output.
- While my immediate management was never anything other than supportive, the stance of upper management was a little weird. They were happy to appropriate the word “agile” to describe their ideal process, but were very down on developers actually doing testing of any stripe. This led to some friction in our system—for example, it took a very long time for us to get a suitably powered continuous integration server. On the plus side, the actual team of developers was super-great.
So… what would you do?
Well, here’s what we did. I’m not prepared to defend this as optimal, but it did mostly seem to work.
- On the existing code, especially the GUI, we largely punted on testing. The effort involved, what with the legacy architecture and all, made setting up unit tests prohibitively expensive. We still tried in the somewhat less encumbered data layers, but a wide swath of functionality was basically trapped in static classes with a lot of interdependencies.
- The Velocity contexts, which were more isolated from the rest of the code, came to have much of the output logic. This code, being somewhat newer, actually tended to be written with tests. We also felt that, given the choice, it was more important to nail down the output.
- The actual Velocity output was complicated to test, so we built two tools to manage it. The first was a golden output test harness. We created something like two dozen sample systems that covered as much of the problem space as we could. The output was verified by our hardware experts and stored in our source tree. The test harness would run the new code and compare the output to the golden versions. The harness (thanks to an excellent job from a fellow team member) ran Beyond Compare on the two directories, making it super-easy to see the changes, approve them, and update the golden version. The bottleneck was not, as you might expect, changing the files themselves. Rather, it was the super-slow source control in use. An update to a common header could easily change hundreds of golden output files. Changing the files would take a minute; updating the source control could take two hours. Really. (Running the tests themselves took about ten minutes: too long to run in a tight loop, but something you could run before checking back in to source control.) Despite the speed issues, this turned out to be a very useful way for us to stay confident in the validity of our output, and also an easy way to give us a context for discussion with our hardware experts.
- That left a separate problem: ensuring that the golden output tests actually covered all the possibilities. Velocity doesn’t have an actual coverage tool, or at least didn’t at the time. So… we wrote a Jester-style mutation tool that randomly changed lines in the Velocity template and checked whether any tests broke. This one did take a while to run, so we only did it sporadically, but it was useful as a sanity check from time to time.
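For the curious, here is roughly what the comparison half of a golden output harness can look like. This is a minimal sketch, not the code we actually ran: the class name `GoldenOutputCheck`, the method names, and the `MISSING`/`UNEXPECTED`/`CHANGED` labels are all made up for illustration, and it only reports which files differ rather than handing the two directories to a visual diff tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: compares freshly generated files against approved "golden" copies. */
final class GoldenOutputCheck {

    /** Collects the relative paths of all regular files under root, in sorted order. */
    static Set<String> relativeFiles(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .map(p -> root.relativize(p).toString().replace('\\', '/'))
                       .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    /** Returns one line per discrepancy; an empty list means the output matches the golden copy. */
    static List<String> compare(Path goldenDir, Path actualDir) throws IOException {
        Set<String> golden = relativeFiles(goldenDir);
        Set<String> actual = relativeFiles(actualDir);

        // Walk the union of both file sets so missing and unexpected files are both reported.
        Set<String> all = new TreeSet<>(golden);
        all.addAll(actual);

        List<String> problems = new ArrayList<>();
        for (String name : all) {
            if (!actual.contains(name)) {
                problems.add("MISSING " + name);      // approved file was not generated
            } else if (!golden.contains(name)) {
                problems.add("UNEXPECTED " + name);   // generated file has no approved copy
            } else {
                List<String> expected =
                        Files.readAllLines(goldenDir.resolve(name), StandardCharsets.UTF_8);
                List<String> produced =
                        Files.readAllLines(actualDir.resolve(name), StandardCharsets.UTF_8);
                if (!expected.equals(produced)) {
                    problems.add("CHANGED " + name);  // contents differ line by line
                }
            }
        }
        return problems;
    }
}
```

A test would generate the configurations for each sample system into a scratch directory, call `compare` against the checked-in golden directory, and fail if the returned list is non-empty. Approving a change then amounts to copying the new files over the golden ones and committing them.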
Again, clearly not optimal, but it did let us get as much of the verification and confidence benefits of testing as we could within the constraints of the system and the environment.
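The mutation check from the list above is also easy to sketch. Again, this is an illustration under assumptions rather than our actual tool: the class `TemplateMutationCheck` is hypothetical, the only mutation it applies is blanking out a single template line, and the "run the golden tests against this mutated template" step is passed in as a plain `Predicate` instead of being wired to Velocity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical sketch of a Jester-style mutation check over template lines. */
final class TemplateMutationCheck {

    /**
     * Blanks out randomly chosen template lines, one at a time, and re-runs the tests.
     * Returns the zero-based indices of lines whose mutation the tests did NOT notice,
     * which suggests the golden samples never exercise those lines.
     */
    static SortedSet<Integer> survivingMutants(List<String> template,
                                               Predicate<List<String>> goldenTestsPass,
                                               int maxMutants,
                                               Random rng) {
        // Only non-blank lines are candidates: blanking a blank line changes nothing.
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < template.size(); i++) {
            if (!template.get(i).trim().isEmpty()) {
                candidates.add(i);
            }
        }
        // Random order, so a limited budget still samples the whole template over time.
        Collections.shuffle(candidates, rng);

        SortedSet<Integer> survivors = new TreeSet<>();
        int budget = Math.min(maxMutants, candidates.size());
        for (int lineIndex : candidates.subList(0, budget)) {
            List<String> mutant = new ArrayList<>(template);
            mutant.set(lineIndex, "");            // the mutation: drop this line's content
            if (goldenTestsPass.test(mutant)) {
                survivors.add(lineIndex);         // tests still pass, so the line is uncovered
            }
        }
        return survivors;
    }
}
```

In practice the predicate would render every sample system with the mutated template and run the golden comparison, which is why a full pass is slow and only worth doing now and then. Any index that comes back is a hint that another sample system is needed.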
And that’s it for this week’s episode of “silly testing things I’ve done in the past.” Thanks for listening.


