Review: Common Errors in Ecological Data Sharing

Issue: Fall 2013

Hope Humphries (NWT)

Kervin, Karina E., William K. Michener, and Robert B. Cook. 2013. Common errors in ecological data sharing. Journal of eScience Librarianship 2(2):Article 1. http://dx.doi.org/10.7191/jeslib.2013.1024.

This study identifies common errors in data organization and metadata completeness that were discovered by reviewers of data papers published in the Ecological Society of America’s (ESA) Ecological Archives. ESA’s Ecology publishes the abstract of a data paper; Ecological Archives contains the data sets themselves and accompanying metadata, allowing for long-term access to data sets, which can be updated. An average of about 20 errors was identified per data paper, although many of these were simple editing errors. The authors grouped the errors according to the Data Life Cycle elements described by Michener and Jones (2012). Over 90% of papers had errors in the Collection and Organization category (i.e., collection methods, site/time descriptions, inclusion of all relevant variables) and the Description category (i.e., ascribing metadata to the data). The pervasiveness and number of errors in the data sets analyzed are perhaps surprising, considering that the data sets were specifically submitted for publication in a data archive; one might therefore expect that extra attention had been paid to their completeness prior to submission. However, the careful scrutiny that reviewers gave these data sets was no doubt a factor in unearthing problems that might otherwise have gone unrecognized.

Information managers are all too aware of missing information and errors in data and metadata, but this study’s results could be used to raise awareness among scientists and students about the kinds of problems they should watch for. As a best practice, the authors emphasize the importance of recording all details about the study context, data collection, quality control, and analysis throughout the course of a research project rather than attempting to reconstruct them after the fact. Saving data in a non-proprietary format, such as ASCII text, is noted as important. Organizations that offer data management training and educational tools are also identified.
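As a rough illustration of the non-proprietary-format recommendation, the sketch below writes a small table as comma-separated ASCII text using Python's standard csv module. The function name write_ascii_table, the column names, and the observations are hypothetical and chosen only for this example; they do not come from the paper under review.

```python
import csv
import io


def write_ascii_table(rows, fieldnames, out):
    """Write rows (a list of dicts) as comma-separated text to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()    # the header row names every variable
    writer.writerows(rows)  # one line per observation


# Hypothetical observations, for illustration only.
rows = [
    {"site": "A", "date": "2013-07-01", "temp_c": 12.5},
    {"site": "B", "date": "2013-07-01", "temp_c": 9.8},
]

buffer = io.StringIO()
write_ascii_table(rows, ["site", "date", "temp_c"], buffer)
text = buffer.getvalue()

# Raises UnicodeEncodeError if any non-ASCII character is present.
text.encode("ascii")
```

Passing an open file in place of the in-memory buffer would save the same text to disk. Units and variable definitions (here, degrees Celsius for the hypothetical temp_c column) would still need to be recorded in the accompanying metadata.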

Reference:

Michener, William K., and Matthew B. Jones. 2012. Ecoinformatics: Supporting ecology as a data-intensive science. Trends in Ecology & Evolution 27(2):85-93. http://dx.doi.org/10.1016/j.tree.2011.11.016.