Every now and then I get emails from computer science researchers (often students), asking me to participate in their particular study. Sometimes I participate, sometimes I don’t, often I’m thinking to myself: Couldn’t you ask me what you should be researching in the first place? While that is an inappropriate response for these emails, I wanted to write down my ideas for researchers, in public, so that they can either find them on their own or I can at least link to them.
If you have ideas that would fit in this list, let me know.
How can better testing tools help write better software? This area seems to have so much untapped potential. Most open source testing tools are developed and improved in very small increments, of the “this feature would help with this use case” kind, but it’s rare to find features that make big leaps. While I still haven’t used it, Intern is an interesting tool, because it integrates all kinds of frontend testing use cases into a single suite, and usually integration, especially if we think of it as fusion, can set free extra energy.
One example of a feature in a testing tool that is rather unlikely to happen in regular open source development scenarios is the reordering feature in QUnit. It’s not a well known feature, since its behaviour is rather subtle. It might also not be as useful as I’m thinking it is, since I haven’t been able to do any research to verify that. But I think it’s an interesting example anyway. And since I came up with it (more or less) and implemented it, I can tell you how it happened.
Some years ago, I was listening to a podcast with Kent Beck, one of the original authors of JUnit, as a guest. He was talking about a project he had been working on for some time, called JUnit MAX. It was a plugin for the Eclipse IDE, providing a better test runner. One significant feature of this plugin was its reordering of tests after previous runs. Kent was talking about how research showed that the odds of a failing test failing again on the next run are much higher than the odds of a passing test starting to fail. So for shorter feedback loops, it makes sense to run the tests that previously failed first on the next run. JUnit MAX would also consider the time it takes for various tests to run, and run the fastest failing tests before the slower failing tests.
After hearing that, I figured that I could implement a subset of that feature in QUnit, reordering test runs based on previous failures (ignoring the secondary sort on runtime). While it required some rather complicated changes to make the queue flexible enough to reorder tests and still maintain the original output order, the result was a faster feedback loop with no additional effort needed by the developer using QUnit. The only time this failed was when tests weren’t atomic, that is, test B relied on test A running first. Non-atomic tests are a significant code smell by themselves, and in a way this feature helped uncover them. In the end, it still required an option to turn the reordering feature off.
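To make the idea concrete, here is a minimal sketch of that reordering in Python. It is not QUnit’s or JUnit MAX’s actual code; the names `CaseRecord`, `reorder`, `reorder_with_runtime` and `run_and_report` are made up for illustration, and I’m assuming the runner remembers, for each test, whether it failed last time and how long it took.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    """What a runner might remember about one test (hypothetical structure)."""
    name: str
    failed_last_run: bool = False
    last_duration: float = 0.0  # seconds, as measured on the previous run


def reorder(tests):
    """Previously failing tests first; everything else keeps its declared order."""
    # sorted() is stable, so ties keep their original relative order.
    return sorted(tests, key=lambda t: not t.failed_last_run)


def reorder_with_runtime(tests):
    """Like reorder(), but the fastest previously failing tests run first."""
    return sorted(
        tests,
        key=lambda t: (
            not t.failed_last_run,
            t.last_duration if t.failed_last_run else 0.0,
        ),
    )


def run_and_report(tests, execute):
    """Execute in the reordered sequence, but report in the declared order."""
    results = {}
    for test in reorder(tests):
        results[test.name] = execute(test)
    return [(test.name, results[test.name]) for test in tests]
```

The first function corresponds to the subset described above, the second adds the secondary sort on runtime, and `run_and_report` shows one simple way to keep the output order stable even though the execution order changes. Tests that depend on each other would still break under this scheme, which is why an off switch remains necessary.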
I’m optimistic that there are plenty of ways to write better testing tools, and academic research could provide ideas and do verification of those ideas.
Elm is known for spectacularly good error messages, almost obsessively improving the compiler, even with a separate repository collecting “broken Elm programs / data to improve error messages”.
Consider these two scenarios.
Scenario A: You get an email with the subject “the site doesn’t work”, and no body. You don’t even know which site this refers to.
Scenario B: You get an entry in your issue tracking system, with some descriptive fields, but most importantly, a link. You click that link, which brings you to your app, along with some debugging tool. One element of that tool is a slider that lets you go back in time. It also shows you various relevant variables. Going back in time, you can see what interactions the user did before running into the issue they reported. You can quickly identify the issue based on the steps and the state of the system.
Now, which of these two would you prefer if you get paid for solving issues (not so much for spending time on solving them)? Scenario B supposedly is also something Elm provides (or at least helps with). But I don’t see why other platforms shouldn’t be able to provide at least partial support for that kind of error reporting and debugging.
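As a rough illustration of what partial support could look like, here is a small Python sketch of the recording side of scenario B. This is not how Elm does it; `SessionRecorder` and its methods are hypothetical names, and I’m assuming the app’s state is a plain data structure that can be copied after each user interaction.

```python
import copy


class SessionRecorder:
    """Keeps a snapshot of the state after every user interaction."""

    def __init__(self, initial_state):
        # Step 0 is the state before the user did anything.
        self._snapshots = [("<start>", copy.deepcopy(initial_state))]

    def record(self, action, state):
        """Store the action label and a copy of the state it produced."""
        self._snapshots.append((action, copy.deepcopy(state)))

    def __len__(self):
        return len(self._snapshots)

    def state_at(self, step):
        """The state as it was at a given slider position."""
        return copy.deepcopy(self._snapshots[step][1])

    def actions_until(self, step):
        """The interactions that led up to a given slider position."""
        return [action for action, _ in self._snapshots[1:step + 1]]


# Usage: record a short session, then move the "slider" back in time.
state = {"cart": []}
recorder = SessionRecorder(state)
state["cart"].append("book")
recorder.record("add book", state)
state["cart"].clear()
recorder.record("clear cart", state)
```

A bug report would then carry the recorded snapshots, and the debugging tool would only need `state_at` and `actions_until` to drive the slider. Copying the full state on every interaction is the simplest thing that could work; a real tool would likely need something smarter for large states.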
If a code review finds no issues, does it mean that the reviewer didn’t bother looking too much? Or that there are no issues? How can you review the reviewer? Are there metrics or tools that could help people become better at reviewing? A reviewer that points out a lot of style issues might never catch actual bugs. A reviewer who always ignores style issues might have a hard time catching bugs due to style issues.
This seems like an area with a lot of untapped potential, but it’s hard to tell without some research.
Collaborating within a team can vary a lot by physical distance, from pair-programming at the same desk, to being on different sides of the planet in wide apart timezones. There are a lot of options in-between those two extremes. Martin Fowler does a great job of explaining the various options and some of their strengths and weaknesses in Remote versus Co-located Work:
There isn’t a simple dichotomy of remote versus co-located work, instead there are several patterns of distribution for teams each of which has different trade-offs and effective techniques suitable for them. While it’s impossible to determine conclusive evidence, my sense is that most groups are more productive working in a co-located manner. But you can build a more productive team by using a distributed working model, because it gives you access to a wider talent pool.
In my experience, once you’ve built a team where everyone can work remotely, you get the benefits of that model even if parts of your team are (sometimes) together in one room. For example, assume a core team of 5 people shares an office. Every other week one of those people needs to work from home, to take care of family, deal with home repairs, or for one of many other reasons. Usually these activities don’t need an entire day off (and they’re not real vacation!), so being able to work from home just as effectively as in the office makes this much less of an issue.
That said, there are no clear answers on how to actually build an effective non-co-located team. There are various technical solutions that I have found to work quite well, but I’ve never done any research to evaluate the ones I’m using or to study alternatives. Coming up with a list of options and ways to evaluate them seems like an interesting research topic to me. Only some of this is specific to software development. For example, communication tools that can properly format and highlight code snippets or error logs are more specific to developers.
All of the above have changed over time, and the bug reporting scenario B I described above has supposedly been possible with Smalltalk years (decades?) ago. I suspect that there’s a lot of history to study in software development even in just the last 10 years. From a historian’s perspective, the whole industry might be too young to study effectively, but that shouldn’t prevent studying trends like “best practices” in architecture, tools, project management, design, or any of the topics mentioned above. A good example is the Born for it article on martinfowler.com, looking at “How the image of software developers came about”.
A big thank you to Enes Fazli for reviewing drafts of this article.