The video from [PyCon of the Functional Test Tools Panel](http://pycon.blip.tv/file/1947342/) has been making the rounds, and one remark in particular seems to be gaining traction: [Jason Huggins](http://twitter.com/jhuggins)’s comment that test recorders are “evil”. I didn’t address it while I was on the panel because I have mixed experiences with recorders in the tools I work on, and some of those experiences have left me feeling the same way.
I agree with Jason that recorders that generate code with the expectation that you will be able to **just run** that code as a test are definitely *evil*. Aside from just **not working** most of the time, they create false expectations about test tools. But I disagree that they can’t be implemented in a way that is useful and sidesteps a lot of those false expectations. **Note: On twitter Jason has added that he agrees with my disagreement, probably for the same reasons I’ll highlight below.**
Two tools I’m still an active developer on, [Windmill](http://www.getwindmill.com) and [mozmill](http://code.google.com/p/mozmill), both have recorders, but one I find highly successful and the other I’ve struggled with greatly.
The recorder in the Windmill IDE is phenomenal, almost entirely due to the work that [Adam](http://www.adamchristian.com/) has put into it. The entire test authoring UI in Windmill is *action* based, meaning there is a series of interface simulations you can do with a single variable (an element lookup or, in some cases, a string to eval). Each action is a cell that can be moved around, and new actions can be added above or below it. The same UI is leveraged for running and debugging tests: each cell shows green or red as the test passes or fails.
The reason I like this UI, and similarly simple UIs like [FireUnit’s UI](http://www.mikealrogers.com/archives/327), is that they leverage the relative simplicity of the test APIs. There is no code in the Windmill IDE, just actions, so when you record a set of actions it doesn’t show you code, just the actions **as you interact with the page**.
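To make that concrete, an action-based recorder can store each recorded step as plain data rather than generated code. Here’s a rough sketch of what that looks like — the names and structure are purely illustrative, not Windmill’s actual format:

```python
# Each recorded step is a small data record: a method name plus the
# single variable it needs (usually an element lookup). This structure
# is hypothetical, for illustration only.
actions = [
    {"method": "click", "params": {"id": "login-link"}},
    {"method": "type", "params": {"id": "username", "text": "mikeal"}},
    {"method": "waits.forElement", "params": {"id": "dashboard"}},
]

def describe(action):
    """Render an action the way a cell-based UI might label it."""
    params = ", ".join(f"{k}={v!r}" for k, v in action["params"].items())
    return f"{action['method']}({params})"

for a in actions:
    print(describe(a))
```

Because each step is data, the UI can render it as a cell, reorder it, re-run it individually, and mark it red or green — none of which is practical once the steps have been flattened into generated source code.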
I have to admit, I was pretty shocked by the kinds of feature requests we got after we implemented a recorder in mozmill. Windmill has a few orders of magnitude more users than mozmill, and yet we got far more feature requests that bordered on *“magic”* in mozmill. Not to mention that during the initial usage we got a complaint or bug report almost every time a recorded test didn’t play back and pass without any additional work. It wasn’t until today that I started to understand why we saw such a big difference in expectations.
One problem is that when you take a bunch of opaque background recorder magic and just serialize it all to code, it creates a context switch for the user. They didn’t watch the actions come in as they were being recorded, so they have no idea where they came from, and they stop and look at the code. Instead of a series of small steps, each interaction with the page creating a line of code, the user thinks of this as two steps: do stuff, test is written. This leaves a lot of users disconnected from the editing process.
Another problem with generating code is that it’s expected to be valid. If a failure occurs the only thing to do is to show the exception information, which highlights the line that failed. Generated code is always hairy and difficult to understand, so you just stare at it, at a loss as to how to fix it. But the code isn’t where to look to fix this problem; the **page** is where you look to figure out what went wrong. The fact that you’ve gone from the exception information back to the code makes it less likely that you’ll move your attention **again** back to the page. Placement is a key factor here as well: if you can, you should also try to place the product side by side with the test interface.
Windmill’s workflow is entirely different from mozmill’s. You record actions and they pop up in the IDE as you create them. Since the actions are created as you interact with the page, it’s more transparent where they came from. Then you just play back the actions, and as they run the editable cells go red or green. This keeps the focus on the failing action without moving attention from code to exceptions, and allows you to watch the page as the passes and failures occur. Once a failure occurs, it’s a little more likely now that you’ll see the cell go red before some ajax request has finished rendering the piece of UI you tried to click on. Then, hopefully, you will intuitively add an action above the click that waits for the element to get rendered.
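That “wait before you act” fix boils down to a polling loop: keep checking for the element until it appears or a timeout expires. A minimal, generic sketch of the pattern — not Windmill’s or mozmill’s actual API:

```python
import time

def wait_for(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns truthy or the timeout elapses.

    Generic illustration of the wait-for-element pattern; real tools
    wrap a loop like this around a DOM element lookup.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# Simulate an element that "renders" after a short async delay,
# the way a slow ajax request would.
rendered_at = time.monotonic() + 0.3
assert wait_for(lambda: time.monotonic() >= rendered_at)
```

In a cell-based UI this is just one more action inserted above the click, rather than a line of code the user has to diagnose from a stack trace.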
The mile-high view of both recorders doesn’t make them seem much different. You *can* do the same set of tasks with each, but a series of small differences reduces the chances that you’ll fall off the intended pattern of use, which is to record a set of actions and use them as an outline to create a fully working test. Those small differences also create entirely different expectations in your users.
The funny thing is, Adam and I have never thought about it this way. All we ever tried to do was keep things simple and keep the test creation, editing, running, and debugging workflows integrated, and this is just a happy side effect.