The University of Toronto runs a course introducing the practice of software development to graduate students in science and engineering. This is the Software Carpentry course, which most recently ran as an intensive three-week session during July 2009. The course notes are available online, and on review you can appreciate the pragmatic approach the course takes. It covers the core development practices and processes that should enable a solo developer or small team to write a working small to medium-sized system while reducing the chance of going off the rails. Most of this aligns with the way Intelliware sees development, and this content could well provide some guidance to our co-op students and possibly full-time hires.
As part of the course the supervising professor, Greg Wilson, organized an afternoon conference on the theme of scientific programming and the internet: Science 2.0. The afternoon included six presentations, mostly focused on the scientific programming related to the course; however, the internet infused most of the talks, so there was a lot to interest programmers in industry too. The presenters were from North America and Europe, all accomplished, a few of whom I had known about before; outstanding for an afternoon adjunct to a university course.
- C. Titus Brown – Choosing Infrastructure and Testing Tools for Scientific Software Projects – (deVilla)
- Cameron Neylon – A Web Native Research Record: Applying the Best of the Web to the Lab Notebook – (SlideShare, deVilla)
- Michael Nielsen – Doing Science in the Open: How Online Tools are Changing Scientific Discovery
- David Rich – Using “Desktop” Languages for Big Problems
- Victoria Stodden – How Computational Science is Changing the Scientific Method – (deVilla)
- Jon Udell – Collaborative Curation of Public Events – (SlideShare)
A sample of some of the items that caught my interest during the afternoon:
Cameron Neylon, a research scientist from the UK, discussed approaches for moving the laboratory notebook into the internet era and his own work in the field. The lab notebook provides the working notes for the researcher and also evidence for publication. One of the interesting aspects here was allowing the notebook to include data submitted by the devices involved in the experiment: for example, when different chemicals are mixed together, the details of those chemicals, the precise date and time, the volumes, the operator, and details of the environment such as temperature. The machines become responsible for recording their own activity, and as a result the constituent parts of the experiment, the machinery, and the operations become represented in the virtual space. From this the precise steps for all the related experiments can be found, along with the history of the machine, successes and failures, and so on: all the material needed to recreate the experiment. A lot of this ties into commercial needs for cheap data collection and sharing.
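To make the idea of machines recording their own activity a little more concrete, here is a minimal sketch of what a device-submitted notebook entry might look like. It is my own illustration, not anything shown in the talk: the class names, field names, instrument ids, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class InstrumentEvent:
    """One action logged by a lab device (hypothetical field names)."""
    instrument_id: str
    operator: str
    timestamp: datetime
    chemicals: dict        # chemical name -> volume in millilitres
    temperature_c: float   # ambient temperature when the action ran


@dataclass
class Notebook:
    """An append-only record that devices write their events into."""
    events: list = field(default_factory=list)

    def record(self, event):
        self.events.append(event)

    def history(self, instrument_id):
        """Return the events logged by one instrument, oldest first."""
        matching = [e for e in self.events if e.instrument_id == instrument_id]
        return sorted(matching, key=lambda e: e.timestamp)


# Made-up sample data: two pipetting steps and one mixing step.
notebook = Notebook()
notebook.record(InstrumentEvent(
    "pipette-1", "alice",
    datetime(2009, 7, 1, 14, 5, tzinfo=timezone.utc),
    {"buffer": 2.0}, 21.5))
notebook.record(InstrumentEvent(
    "pipette-1", "alice",
    datetime(2009, 7, 1, 14, 0, tzinfo=timezone.utc),
    {"saline": 5.0}, 21.4))
notebook.record(InstrumentEvent(
    "mixer-2", "bob",
    datetime(2009, 7, 1, 14, 10, tzinfo=timezone.utc),
    {"buffer": 2.0, "saline": 5.0}, 21.6))

pipette_history = notebook.history("pipette-1")
```

With entries like these, the history of a single machine or the steps of a single experiment become a simple query over the record rather than something reconstructed from handwritten notes.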
In a side comment it was mentioned that the effective announcement that water had been discovered on another planet for the first time, quite a significant event, came from the Mars Phoenix Twitter feed.
Perhaps others are doing this, but he showed a Creative Commons-style licence for the slide show at the beginning of the presentation; you can see a capture at the start of deVilla’s notes.
Victoria Stodden covered the difficulty of reproducing experimental data in computational science. The volume of generated data might be huge, for example from climate models; the application code might be very complicated; the execution environment has to be recreated; others need enough understanding of the code to follow the intricacies of its behaviour and to develop variations; and running a huge model means matching the sometimes substantial computing capacity it requires.
Intelliware has struggled with the related problem of customers being able to take over development of the software that is written for them. The code is not enough. Some of the solutions that we have talked about, such as prepackaged VMware machines suitable for development or for running the service, were covered by the presenter.
Michael Nielsen documented how common internet communication systems are still being explored and turned to other uses. Following one thread, the blog of the mathematician Terry Tao presents a series of extensive articles across the field of mathematics, often starting with introductory material, moving on to a summary of past explorations, and leaving off with conjectures from the sharp edge of the field. These blog items are quite different in scope and coverage from what you typically see. It seems that a number of leading mathematicians have similar blogs, and that these blogs are driving alternative forms of collaboration within the field. They are restructuring expert attention, a scarce resource. This gets more explicit with websites such as InnoCentive, which model this as a market for expert attention. Solution seekers put out a request and a reward for a solution.