Putting It Out There – Making the Transition to Open Source Software Development

Issue: Spring 2011

Wade Sheldon (GCE)

I have spent a significant portion of my scientific career developing and customizing computer software, both to process and analyze research data, and to build systems to disseminate these data to others. Throughout this time I did what the majority of scientists do, and kept this code mostly to myself. There were many reasons for my closed development approach, from the practical ("the code isn't sufficiently documented for someone else to use") to the paranoid ("I don't have time to answer questions or help people use it") to the proprietary ("why should I give away my hard work for free"). But looking back, one of the primary drivers for my attitude was a negative experience early in my career when I found myself competing against my own software for salary money, and lost. A former research colleague found it more cost-effective to hire an undergraduate student to run my software (developed for another project and shared) than to include me on the new project as a collaborator. Although that issue was eventually overcome, it had a lasting impact on my attitude regarding giving away source code.

After joining the LTER Network in 2000, though, I started interacting with software developers at LNO, other LTER sites and partner organizations (e.g. NCEAS) who were strong open source advocates. Although I still bristled at the thought of sharing my code, the advantages of collaborative software development and sharing scientific software were becoming clear. In 2002 I made a tentative effort at code sharing by releasing a compiled version of a MATLAB software package I developed for GCE data management (GCE Data Toolbox, https://gce-svn.marsci.uga.edu/trac/GCE_Toolbox). Interest in this software was strong, resulting in over 3000 downloads by Fall 2010, but the lack of source code access quickly became an issue. I frequently received email requests for access to portions of the code base for various projects, and reviewers of our GCE-II renewal proposal simultaneously praised us for developing sophisticated QA/QC software and chastised us for not allowing other scientists to inspect the algorithms. During this time I was also encouraged by a colleague to take an online software development course geared toward scientists (Software Carpentry, http://software-carpentry.org/), and began to use more community-developed tools for code management and development, including Subversion (http://subversion.tigris.org/), Python (http://www.python.org/) and Trac (http://trac.edgewall.org/). The final straw was an article in Nature (Barnes, 2010) that made an eloquent argument for publishing all scientific software code, warts and all, to allow others to engage with your research.

So in October 2010 I established a policy (with executive committee approval) of releasing all GCE software code as open source under a GPLv3 license (http://gplv3.fsf.org/). As described elsewhere in this issue (Chamblee and Sheldon, 2011), I also began openly sharing database binaries and scripts, web application code and analytical software with other LTER sites (e.g. CWT, MCR, SBC) as well as contributing source code to cross-site software development projects in LTER (ProjectDB, Personnel Database). While I still don't consider myself a staunch open source advocate, and definitely see a place for proprietary software (e.g. to ensure developer compensation, attribution and user support), my experience with "putting it out there" has been fairly painless so far. Requests for help with software code have not increased, and may actually have decreased as people have more opportunity to investigate the code themselves. Although I still worry about getting proper attribution for my work, and cringe when people delve into my code and report back about problems they've found, I know that the software is better as a result.

Although your mileage may vary, if you've been holding back on releasing your software because "it's not good enough", I encourage you to read Nick Barnes' article and reconsider. If you need extra encouragement (or a fun read), I also suggest looking at "The CRAPL: An academic-strength open source license" (http://matt.might.net/articles/crapl/). My favorite part of the CRAPL is "III. 3. You agree to hold the Author free from shame, embarrassment or ridicule for any hacks, kludges or leaps of faith found within the Program." So please keep that in mind if you delve into GCE software in the future.

References:

Barnes, N. 2010. Publish your computer code: it is good enough. Nature, 467, 753 (doi:10.1038/467753a, http://www.nature.com/news/2010/101013/full/467753a.html)

Chamblee, J.F. and Sheldon, W. 2011. Systems Upgrade through Technology Transfer across LTERs: Who Benefits? LTER Databits: Information Management Newsletter of the Long Term Ecological Research Network. Spring 2011 Issue. (http://databits.lternet.edu/)