Regression testing

We’ve had quite a few releases of Krita now (1.4.1 in July 2005, 1.4.2 in October 2005, 1.5 in April 2006, 1.5.1 in May 2006, 1.5.2 in July 2006, 1.6 in October 2006, 1.6.1 in November 2006), and we’ve got 1.6.2 coming up for January and 2.0 for sometime in 2007. The dot-zero releases, 1.5 and 1.6, were accompanied by alpha, beta and rc releases. And still, with every release (except for 1.4.1) we had regressions: features that worked in the previous release but didn’t work in the new one.

Do we suck, are we too incompetent to code good software, or is there another problem? Well, I’ve got hubris enough to believe that we don’t suck, that we, on the contrary, have a great team of hackers who dedicate most, and sometimes even all, of their leisure time to working on Krita.

The problem is that software is complex: fixing a bug may cause a bug, improving usability for one feature may kill an unrelated feature. The code that determines the size of a selection is a case in point: we use that in a lot of places for a lot of different things. It’s used in adjustment layers, in masks, in selections. Our architecture is not fragile, we just reuse a lot of functionality. That cuts two ways: a fix often fixes a number of unrelated bugs, and a fix may hurt a number of very nearly unrelated features. It’s the only way we can provide all the functionality Krita offers in a mere 70,000 lines of code.

The problem is testing. And this problem is hard. For one thing, a developer is not a tester. A developer starts with a mental (or sometimes paper) model of how a certain feature is going to work. For me, that model is almost geographical, like a 3D landscape with features, landmarks and connections. From that model, the code is written, and then we usually exercise the code a few times to see if it conforms to the model. Given that we know the model behind the code, it is next to impossible for us developers to come up with ways of exercising the code that don’t follow the model. For another thing, we’re really, really pressed for time. Krita developers seldom have the time to use Krita for any real tasks: I haven’t touched my real oil paints for more than a year now, let alone started a good painting in Krita.

So — what’s the solution? Is there any solution? Where should the solution come from?

To me, it’s obvious that we need real regression testing prior to a release. All functionality of Krita needs to be exercised — every feature needs to be used — and we need to keep track of what works and what fails. Alpha and beta releases don’t work for that. People installing an alpha or a beta generally do so to see whether a promised new feature is what they need. They don’t test the whole application rigorously. I think I have an idea that could work out. It combines the team spirit of our translators, the pride involved with buzz, cvs and bugzilla statistics and the accessibility of bugzilla (which, despite all claims, is not bad).

I would like a web application a little like bugzilla, where test cases with test scripts can be added for every application. The goal is to have the test cases completely cover the application’s feature set. Then, when a release is looming, an application is put in test mode. At first, the application has been tested 0%. Everyone with an account can join the, say, Krita 2.0 Test Sprint, and pick test cases. The goal is to reach 100% of tests executed, but it’s okay if several tests are run more than once, by different people. Tests that fail are mailed to the relevant developers’ mailing list. Successes, too, of course.
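To make that a bit more concrete, here is a minimal sketch of what such a data model might look like. It is plain Python rather than Django models, and every class, field and method name in it (ScriptedCase, CaseRun, ReleaseSprint, coverage and so on) is made up for illustration; it is not the proposal I actually started on. It only tracks which test cases have been run and how far along the sprint is, and leaves out accounts and the mails to the developers’ list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScriptedCase:
    """One test case: a feature plus the script a tester follows by hand."""
    case_id: int
    feature: str
    script: str


@dataclass(frozen=True)
class CaseRun:
    """The result one tester reported for one test case."""
    case_id: int
    tester: str
    passed: bool


@dataclass
class ReleaseSprint:
    """An application release that has been put in test mode."""
    application: str
    release: str
    cases: list[ScriptedCase] = field(default_factory=list)
    runs: list[CaseRun] = field(default_factory=list)

    def record(self, case_id: int, tester: str, passed: bool) -> CaseRun:
        """Store a result; the same case may be run by several people."""
        if case_id not in {case.case_id for case in self.cases}:
            raise ValueError(f"unknown test case: {case_id}")
        run = CaseRun(case_id, tester, passed)
        self.runs.append(run)
        return run

    def coverage(self) -> float:
        """Percentage of test cases that were executed at least once."""
        if not self.cases:
            return 0.0
        executed = {run.case_id for run in self.runs}
        return 100.0 * len(executed) / len(self.cases)

    def failures(self) -> list[CaseRun]:
        """Runs that failed, i.e. the ones the developers want to hear about first."""
        return [run for run in self.runs if not run.passed]


# Hypothetical example data: four test cases, two of them executed.
sprint = ReleaseSprint(
    application="Krita",
    release="2.0",
    cases=[
        ScriptedCase(1, "selections", "Make a rectangular selection and invert it."),
        ScriptedCase(2, "adjustment layers", "Add an adjustment layer and toggle it."),
        ScriptedCase(3, "masks", "Create a mask and paint on it."),
        ScriptedCase(4, "filters", "Apply a blur filter to a layer."),
    ],
)
sprint.record(1, "alice", True)
sprint.record(1, "bob", False)   # a second run of the same case is fine
sprint.record(3, "alice", True)

print(sprint.coverage())         # 50.0
print(len(sprint.failures()))    # 1
```

Coverage counts a test case once no matter how many people ran it, so repeated runs add confidence without inflating the percentage.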

The whole thing can be jazzed up with statistics, adding a little gentle competition between applications in test mode, chat forums, IRC channels and all the other things that build a community.

I have started coding on it, using Django, but my web application skills are meager, and besides, I need to flakify Krita in a hurry. So, there isn’t much more than a proposal for the data model. I might pick it up again, but I’d much rather hack on Krita, which is why I’m writing this blog. Any volunteers? I’m not wedded to Django, and I’m prepared to install any web app environment for a test setup. If it pans out, we can look for real hosting.