Sensor and sensor data management best practices released
Corinna Gries (NTL), Don Henshaw (AND), Renee F. Brown (SEV), Richard Cary (CWT), Jason Downing (BNZ), Christopher Jones (NCEAS), Adam Kennedy (AND), Christine Laney (JRN), Mary Martin (HBR), Jennifer Morse (NWT), John Porter (VCR), Jordan Read (USGS), Andrew Rettig (University of Cincinnati), Wade Sheldon (GCE), Scotty Strachan (University of Nevada), Branko Zdravkovic (University of Saskatchewan)
Rapid advances and decreasing costs in sensor technology, wireless communication, data processing speed, and data storage capacity have enabled widespread deployment of automated environmental sensing systems. Basic environmental processes can be monitored continuously in habitats ranging from very remote to urban, providing information at unprecedented temporal and spatial resolution. Although the research questions that may be answered with these data are very diverse (Porter et al. 2009), the design, establishment, and maintenance of most environmental sensor systems, and the resulting data handling, have much in common.
Because sensor networks are becoming ubiquitous in ecological research and therefore require a new set of skills, approaches, applications, and technologies, researchers from the Northeastern Ecosystem Research Cooperative (NERC) and LTER information managers jointly organized a workshop in 2011 at Hubbard Brook Experimental Forest with participants from many projects currently implementing sensor networks. An earlier publication reported on the need for an online resource guide as identified during that workshop (Henshaw et al., 2012). The publication by Campbell et al. (2013) on streaming data quality was a major product of that workshop, as was the basic outline for a best practices guide. To date, four sensor training workshops have followed, focusing on remote data acquisition as well as on strategies for managing sensor networks and software tools for handling streaming data. The Remote Data Acquisition (RDA) sensor training workshops, co-sponsored by LTER and the UNM Sevilleta Field Station, focused on the field aspect of environmental sensor networks and included hands-on training in basic electronics, photovoltaics, wireless telemetry networks, and datalogger programming. Another RDA workshop will be offered in January 2015; more information can be found at http://sevfs.unm.edu/workshops/rda2015.html. These workshops were supplemented by two training workshops that focused on managing streaming data. They were co-sponsored by LTER, NCEAS, DataONE, UCSD, and SDSC and involved trainers from projects as wide ranging as LTER, the Kepler Workflow System, the GCE Data Toolbox for Matlab, CUAHSI, and DataTurbine. They also included introductions to other open source tools, such as the R statistics package, and to the approach NEON is taking.
Extensive training materials were developed for these workshops, including PowerPoint presentations, example data, and manuals, and some presentations were recorded live. All of these materials can be accessed at http://wiki.esipfed.org/index.php/Workshop_Materials (Henshaw and Gries, 2013).
A working group of practitioners experienced in the entire life cycle of streaming sensor data (sensor network establishment, remote data acquisition, data storage, quality control and assurance, and data access) has met regularly in person and via an online forum to initiate the 'Wiki Process' for assembling the Best Practices document. The guide builds on the collective experience of working group members as well as on the earlier workshops described above. The working group has released initial documents on the Earth Science Information Partners (ESIP) Federation wiki page (http://wiki.esipfed.org/index.php/EnviroSensing_Cluster).
In its current version, this document on best practices for sensor networks and sensor data management provides information for establishing and managing a fixed environmental sensor network for on- or near-surface point measurements with the purpose of long-term or “permanent” environmental data acquisition. It does not cover remotely sensed data (satellite imagery, aerial photography, etc.), although a few marginal cases where this distinction is not entirely clear are discussed, e.g., phenology and animal behavior webcams. The best practices covered in this document may not all apply to temporary or transitory sensing efforts such as distributed “citizen science” initiatives, which do not focus on building infrastructure. Furthermore, it is assumed that the scientific goals for establishing a sensor network have been thought out and discussed with all members of the team responsible for establishing and maintaining the sensor network; that is, the appropriateness of certain sensors or installations for answering specific questions is not discussed. Information is provided here for various stages of establishing and maintaining an environmental sensor network: planning a completely new system, upgrading an existing system, improving streaming data management, and archiving the quality controlled data.
Each chapter in this guide is structured to provide a general overview of its subject, an introduction to the methods used, and a list of best practice recommendations based on the preceding discussion. Case studies provide specific examples of implementations at certain sites.
Sensor Site and Platform Selection considers environmental issues, site accessibility, system specifications, site layout, and common points of failure.
Sensor Data Acquisition outlines considerations and methods for automating real-time acquisition of environmental sensor data from remote locations.
Sensor Management Tracking and Documentation outlines the importance of communication between field and data management personnel, because field events may alter the data streams and need to be documented.
Sensor Data Management Middleware discusses software features for managing streaming sensor data.
Sensor Data Quality discusses preventative approaches to minimize data inaccuracies, as well as quality control and data management practices to identify and properly document problematic data and data quality levels.
Sensor Data Archiving introduces different approaches and repositories for archiving and publishing data sets of sensor data.
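To illustrate the kind of automated check that the Sensor Data Quality chapter is concerned with, here is a minimal Python sketch that assigns a quality flag to each reading in a stream. It is not taken from the guide: the flag codes ("OK", "M", "R", "S"), the thresholds, and the names Reading and flag_readings are all hypothetical and would need to be adapted to a site's own conventions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    timestamp: str
    value: Optional[float]  # None stands for a missing observation
    flag: str = ""          # filled in by flag_readings


def flag_readings(readings: List[Reading], lo: float, hi: float,
                  max_step: float) -> List[Reading]:
    """Assign a hypothetical quality flag to each reading, in order.

    "M"  = missing value
    "R"  = outside the plausible range [lo, hi]
    "S"  = jump larger than max_step from the last good value
    "OK" = passed all checks
    """
    last_good = None  # most recent value that passed all checks
    for r in readings:
        if r.value is None:
            r.flag = "M"
        elif not (lo <= r.value <= hi):
            r.flag = "R"
        elif last_good is not None and abs(r.value - last_good) > max_step:
            r.flag = "S"
        else:
            r.flag = "OK"
            last_good = r.value
    return readings
```

Flags are recorded alongside the values rather than replacing them, so the raw data stay intact and the reason for questioning a value remains documented.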
This guide is a living document: an open source, community-supported resource that implements the 'Wiki Process' mentioned above. Everybody is invited to contribute knowledge and experience, provide updates and corrections, or start a completely new chapter with currently missing information. Anyone can register an account with the ESIP wiki and, upon approval, may edit existing content. The 'Wiki Process' for amassing knowledge in an organized fashion is well documented for Wikipedia (see links below), where documentation and guidelines are provided for reaching consensus through editing or through discussion. Each subject area may be discussed in the wiki on the 'discussion' tab, or on the ESIP EnviroSensing Cluster mailing list. Overall, most of the 'Ten Simple Rules for Editing Wikipedia' apply here as well. We would particularly like to encourage contributions that describe existing local systems (i.e., 'case studies'). Personal experiences and evaluations of products, sensors, software, etc. have not been included yet, but they are considered valuable and should be voiced in this forum to the degree that they might help others. A glossary of terms would be useful, as would a list of pertinent publications. Other areas currently not well covered are implementations of OGC Sensor Web standards, aspects of citizen involvement in sensor applications, and cutting edge developments that we are not aware of. We hope this effort will provide a forum for lively discussion of the latest developments in sensor technology and that you will find the existing information useful enough to share your knowledge in return.
Campbell, John L., Rustad, Lindsey E., Porter, John H., Taylor, Jeffrey R., Dereszynski, Ethan W., Shanley, James B., Gries, Corinna, Henshaw, Donald L., Martin, Mary E., Sheldon, Wade M., Boose, Emery R., 2013. Quantity is nothing without quality: Automated QA/QC for streaming sensor networks. BioScience 63(7): 574-585.
CUAHSI - Consortium of Universities for the Advancement of Hydrologic Science, Inc. http://cuahsi.org/
GCE Matlab Data Toolbox by Wade Sheldon https://gce-lter.marsci.uga.edu/public/im/tools/data_toolbox.htm
Henshaw, D., C. Gries, R. Brown, J. Downing, 2012. SensorNIS: Community engagement to build a resource guide for managing sensor networks and data. DataBits Fall 2012. http://databits.lternet.edu/fall-2012/sensornis-community-engagement-build-resource-guide-managing-sensor-networks-and-data
Henshaw, D., C. Gries, 2013. Sensor Networks Training Conducted at LNO. DataBits Spring 2013. http://databits.lternet.edu/spring-2013/sensor-networks-training-conducted-lno
Kepler Project https://kepler-project.org/
LTER - Long-Term Ecological Research Network http://lternet.edu/
Porter, J.H., E. Nagy, T.K. Kratz, P. Hanson, S.L. Collins, P. Arzberger, 2009. New eyes on the world: advanced sensors for ecology. BioScience 59(5): 385-397.
NCEAS - National Center for Ecological Analysis and Synthesis http://www.nceas.ucsb.edu/
NEON - National Ecological Observatory Network http://www.neoninc.org/
SDSC - San Diego Supercomputer Center https://www.sdsc.edu/
UCSD - University of California San Diego, specifically the California Institute for Telecommunications and Information Technology (Calit2) http://www.calit2.net/
UNM - University of New Mexico, specifically the Sevilleta Field Station http://sevfs.unm.edu/
Wikipedia:Ten Simple Rules for Editing Wikipedia, http://en.wikipedia.org/wiki/Wikipedia:Ten_Simple_Rules_for_Editing_Wikipedia
Wikipedia:List of policies and guidelines, http://en.wikipedia.org/wiki/Wikipedia:List_of_policies_and_guidelines