Rethinking DevCreek

I recently started drinking the DevCreek Kool-Aid.

Like most of us at Intelliware, I’ve been using the DevCreek plugin for Eclipse for about a year — ever since the first (internal) versions were published. Because DevCreek started as TestCreek, the data collection has always been biased toward collecting information about unit test runs. And, I must confess, I haven’t had the “ooo, wow” moment from the data that has been collected.

There have been moments that were interesting. When we started “ranking” projects, there was some exciting (although, admittedly, shallow) sense of competition between projects. “Ooo, look! Our project is beating out Kevin’s project. Quick, run the tests some more and leave them in the dust!”

And I have used DevCreek for evil purposes, too. When the DevCreek “proximity alert” starts up on a test that I’m changing, I’ve sometimes doubled my efforts to get ready to check in, because I know that someone else is changing the same class, and the one rule of check-in is that the developer who arrives last must make amends.

But at the end of the day, if you had asked me whether I thought DevCreek was a big deal, I would have been forced to say, “no.”

I’m not sure, precisely, what’s changed recently, but I now find myself putting DevCreek at the centre of a large number of scenarios.

Part of it is my attraction to visualizing data. Although I haven’t yet found the Unit Test graph that I keep wanting to come back to again and again, the recent Bugzilla charts seem really interesting. This chart, from my current project, tells me that we’re generally keeping on top of bugs and enhancement requests, because the open bugs remain relatively constant, even though the total number keeps growing:

And this graph tells me that, in general, when our users find bugs (in their testing cycles), they don’t tend to be big deals.

I do find myself coming back to these charts over and over again.

David and I have talked about some additional charts that I’d love to see on the Bugzilla front, as well, but like all things, “There are requirements, and there are resources…”

I suppose I also keep thinking about it in the context of more generalized charting services, such as IBM’s Many Eyes. The idea of Many Eyes is interesting: they offer a good variety of “out of the box” chart types, and you can pour your data into them and share the results with the world. So you can find something like the E-coli outbreaks in the U.S. between 1990 and 2004:

You get an attractive, interactive graph that provides a lot of data. You can hover over a data point and discover interesting facts, like the fact that all reported E-coli outbreaks involving cake happened in churches. Next time your church offers you cake, you might want to look askance.

Charting seems to be an emerging field: Google recently created a charting API that you can use to embed charts in, say, blog posts just by invoking a magic URL. Here’s an example:
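
Something along these lines does the trick (these particular parameters are illustrative, not the exact chart I embedded): cht picks the chart type (lc is a line chart), chs sets the pixel size, and chd carries the data points:

    http://chart.apis.google.com/chart?cht=lc&chs=300x150&chd=t:10,25,40,80,70&chxt=x,y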

Unlike the Many Eyes chart, above, I have no “screen shot” of this chart: it’s created, on the fly, by Google when it interprets the URL I’ve sent it.

But think about that from the point of view of what DevCreek does. These charting services offer sexy charts; you provide the data. So, now, here is what I think is the genius of DevCreek. It’s a generalized data repository into which I can store data, especially data that changes over time. Combined with neat-o charting, it can be amazingly powerful.

Very recently, I’ve been thinking about DevCreek as offering a time dimension to data that existing tools only give me in “point-in-time” form. For example, there are a number of tools that provide information about one of my favourite topics: dependency analysis. JDepend provides me with information about package dependencies and dependency cycles. Compuware’s Package Dependency Analysis tool in OptimalAdvisor takes dependency analysis one step further by providing layerization information:

One thing that’s always been missing from this analysis is the time dimension. How has my layerization changed over time? How has the complexity evolved? (When did things go horribly, horribly wrong?) And it’s also been annoying that I have to go through the process of generating the data (sure, sometimes the tools can be easy to set up, but it still only happens when I think of doing it). With the right kind of DevCreek client, I can get a snapshot of that data fairly frequently, and, when I’m interested, walk up and see the effect of change on the project over time.
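
To make that concrete, here’s a rough sketch of the kind of client I’m imagining. The JDepend calls are the real API; DevCreekPublisher is a hypothetical stand-in for whatever the publish mechanism actually turns out to be:

    import java.io.IOException;
    import java.util.Collection;
    import java.util.Iterator;

    import jdepend.framework.JDepend;
    import jdepend.framework.JavaPackage;

    public class DependencySnapshotClient {

        public static void main(String[] args) throws IOException {
            JDepend jdepend = new JDepend();
            jdepend.addDirectory("build/classes"); // wherever the compiled classes live

            Collection packages = jdepend.analyze();

            for (Iterator i = packages.iterator(); i.hasNext();) {
                JavaPackage p = (JavaPackage) i.next();
                // One event per package, per run; over many runs, these
                // snapshots accumulate into exactly the time series I want.
                DevCreekPublisher.publish("dependency.instability", p.getName(), p.instability());
                DevCreekPublisher.publish("dependency.cycle", p.getName(), p.containsCycle() ? 1 : 0);
            }
        }
    }

    // Hypothetical: a stand-in for however DevCreek actually receives events.
    class DevCreekPublisher {
        static void publish(String metric, String subject, double value) {
            System.out.println(metric + "\t" + subject + "\t" + value);
        }
    }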

And that ties in nicely with another set of thoughts I’ve been having; these other thoughts have been rolling around in my head under the title, “Rethinking the Build”.

Here’s what that’s about. I’ve been using Maven on the last few projects I’ve been involved with. Maven is not without wrinkles, but one of the things I like is having access to a shared build server (currently running Continuum), which helps me with a number of nagging questions that keep coming up when I work on small projects such as proofs of concept or smaller component projects. Questions like, “after this proof-of-concept finishes, what happens to the machines we used? Do we leave a build machine running around for all eternity? If not, can anyone restore the build machine to its happy place if we, say, ghost it and walk away?”

Now, a common build machine requires me to rethink what a build is and what it does. At Intelliware, we’ve been running automated builds for as long as I’ve been with the company. Back in 2000, we bodged together build processes using shell scripts. Around 2001, we started using Ant. And more recently, Cruise Control has been a big part of the mix. But in the past, we kind of figured that anything you could do in an Ant script was a worthy candidate for inclusion in the build process.

So our builds do a lot of stuff. Compile; package; run tests; deploy to application servers; load database home states; run in-server tests (what we often confusingly call “integration” tests) using Amakihi, HttpUnit, or Selenium; run performance tests; and so on.

When you have a shared build machine, it sometimes becomes necessary to think about which parts of that process belong on a shared machine and which parts aren’t easily shared. And sometimes that re-thinking relates to “what parts of this process are really only interesting from the point of view of an active development project?” Anyway, there’s a lot to say about that thought process, but it’s mostly tangential to what I’m saying about DevCreek.

But here’s an area where the thinking overlaps. Several years ago, I added a performance test process to a project build. It was a neat idea. We had been changing database structures a lot, and we needed really good performance, so we tracked the application’s performance over time as the application evolved. Because we had a build process, and we could do anything, really, with Ant, I introduced some crazy steps in our build process. I ran the performance tests, gathered the test results, aggregated them with previous test runs, and then checked the results into the repository so that we never lost the data. Then I charted the results, and pushed the charts to a web server so that I could take a look on a periodic basis.

Now, that process required me to make the build do some pretty rude stuff. CVS commits in the middle of the build, and crazy, crazy stuff like that. And although I tried to make some parts of the process generic, it was all pretty specific to my project and my project’s build process. (The structure of the performance data and the process of creating charts from it were bundled up and packaged as part of Amakihi, where they have languished, unloved, ever since.)

If I were doing the same thing today, rather than push this data, rudely, into the CVS or Subversion repository, I’d look at publishing it out to DevCreek. And I’d love it if there were a predefined structure for performance data, and either some canned performance charts or an easy way to define my own (or an easy way to push the data out to Many Eyes or some other charting service).
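
A minimal sketch of what I mean, reusing the hypothetical DevCreekPublisher stand-in from the dependency sketch above (the metric name is likewise invented):

    public class PerformanceTestRun {

        public static void main(String[] args) {
            long start = System.currentTimeMillis();
            runScenario();
            long elapsed = System.currentTimeMillis() - start;

            // Instead of a CVS commit in the middle of the build, publish the
            // measurement; aggregation over time happens on the DevCreek side.
            DevCreekPublisher.publish("performance.scenario.checkout", "nightly-build", elapsed);
        }

        private static void runScenario() {
            // exercise the interesting part of the application here
        }
    }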

But why stop there? I mean, I’m aware of some performance issues with my project when it’s deployed in low-bandwidth environments. It’s something that needs to be tracked and resolved. But, hey, why think in terms of a performance test environment? Why don’t I just routinely gather performance data from actual, running production systems and push it out to DevCreek? I don’t have to worry about having some way of receiving performance data from running installations for analysis: DevCreek is the mechanism by which I do that. Once DevCreek has the data, I just walk up and use the charts.
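
In a web application, that gathering could be as small as a servlet filter; again, DevCreekPublisher is my hypothetical stand-in, and the metric name is made up:

    import java.io.IOException;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;

    // Times every request in the running production system and publishes
    // the elapsed time as an event.
    public class ResponseTimeFilter implements Filter {

        public void init(FilterConfig config) throws ServletException {
        }

        public void doFilter(ServletRequest request, ServletResponse response,
                FilterChain chain) throws IOException, ServletException {
            long start = System.currentTimeMillis();
            try {
                chain.doFilter(request, response);
            } finally {
                long elapsed = System.currentTimeMillis() - start;
                // Safe cast in an ordinary web application.
                String uri = ((HttpServletRequest) request).getRequestURI();
                DevCreekPublisher.publish("performance.production.responseTime", uri, elapsed);
            }
        }

        public void destroy() {
        }
    }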

Suddenly, DevCreek seems like it can offer me more, more, more! And I’m enthused about it in a way that I wasn’t back when it was just a matter of being the number one project from the point of view of running unit tests.
