Healthy Tensions, Challenges in Achieving Data Sharing

Fall 2010

M. Gastil-Buhl (MCR)

Review: 'Infrastructuring Ecology: Challenges in Achieving Data Sharing' by Karen S. Baker and Florence Millerand, Chapter 6 in Parker, John N., Penders, Bart, and Niki Vermeulen (Eds.) October 2010. 'Collaboration in the New Life Sciences.' London: Ashgate. ISBN: 978-0-7546-7870-0

"Ecological data remain closely tied to traditional disciplinary knowledge .... Yet data initiatives today frequently focus on data reuse ... outside the domain ...." The authors explore "as to whether models of data sharing can be borrowed and imported with equal success in ... the environmental sciences characterized by research data and data practices that are highly heterogeneous and complex." Using the LTER as a case study, the authors demonstrate how ecological data can be more complex and diverse than physical data or data within one constrained domain of biology. The authors observe that "interpreting datasets of diverse types brought together across multiple scales can be a research project in and of itself, a tacit underappreciated part of the scientific process of knowledge building. That is, the mechanics of assembling data in a central location differs from the frequently iterative work of processing and reformatting data in order to be able to interpret and to evaluate an integrated result." They go on to deconstruct the data lifecycle in two contexts: internal use, and reuse outside the data's original context. The authors analyze concrete examples of past efforts in data integration and data sharing, and posit lessons to inform our current practices.

Selected highlights:

"Contemporary cyberinfrastructure initiatives are throwing light on data and data practices in the sciences in two principal ways: first, in promoting larger-scale scientific collaboration and second, in making new arrangements for data sharing and more formal digital data publication."

"In formalizing the data analysis and the data curation subcycles (of the data lifecycle), metadata ... make visible and organize knowledge currently held tacitly." These subcycles are "iterative processes comprised of planned tasks and ... unanticipated irregularities". There are "healthy tensions involving local context-sensitive impulses to accommodate and remote curation-driven impulses to standardize data differences together with the mix of analysis-intensive research impulses to learn from anomalies and data-intensive synthetic efforts to learn from patterns."


This book chapter is worth an Information Manager's time to read. Although the study used ethnographic methods, for us as Information Managers it reads as an insightful introspection. The longitudinal perspective helped me, as a relative newcomer, to better understand our current challenges. In the next proposal or IM plan I write, I will be able to cite an up-to-date reference for the statement 'Information Managers have studied the challenges to achieving data sharing and can build upon those lessons.'