Richard Cary (CWT), John F. Chamblee (CWT)
Over the last 18 months the Coweeta LTER Information Management Team has been working to upgrade our sensor network so that we can stream the majority of our sensor data and publish it to the web in near-real time. This article explains our motivations for pursuing this course, outlines our overall strategy for securing funding and planning the upgrade, and then describes the processes and tools we have used in implementation. Our approach is characterized by the use of off-the-shelf software from sensor manufacturers (Coweeta standardized on Campbell Scientific sensors decades ago) and the use of the GCE Data Toolbox for Matlab, available at https://gce-svn.marsci.uga.edu/trac/GCE_Toolbox. We believe that by focusing on the adoption and use of existing, well-tested products, we have significantly reduced the time and financial resources necessary to undertake an upgrade and expansion of our sensor network.
The Coweeta LTER (CWT) currently operates ten sensor-based field stations within the Coweeta Hydrologic Laboratory, near Otto, North Carolina. These stations measure a variety of parameters, including soil moisture, soil and air temperature, and photosynthetically active radiation. Historically, these sites have been based on Campbell Scientific CR10X dataloggers, and technicians would make monthly visits to each site to manually download the data. This method of data acquisition has a number of drawbacks, including high labor costs, diminished data quality due to undetected station down-time, and slow data publication rates.
The intensive labor costs associated with manual collection are tied to the time a technician must take to drive and then hike to each site. Not only does this effort consume time that could be spent on other projects, it also limits the number of sites that can be in operation at one time. A labor-intensive manual collection process also affects data quality. Since sites are only checked once a month, there is no way to know the status of the site until the monthly check occurs. This can lead to significant downtime due to any number of issues such as battery failure, sensor failure, vandalism, and interference from an animal or falling vegetation. These factors can result in the loss of up to one month of data.
Finally, a manual collection process implies, at least in the case of Coweeta, a manual data post-processing system for aggregating data and conducting quality checks. When accompanied by the slow rate of collection, these manual data aggregation and quality check processes reduce the frequency with which data are published – delays that can, in turn, delay research.
Pilot Project to Upgrade Equipment and Grant to Upgrade and Expand our Regional Network
To address these issues, CWT information managers, technicians, and our Lead PI, Ted Gragson, submitted two proposals, both of which were funded, to implement a near-real time data streaming system for all our field stations. Our pilot project was funded by a $20,000 equipment supplement from the NSF and focused on the deployment of three stations, each using a different approach to data transmission and capture. The pilot was based on implementing workflows that were already in place at the GCE LTER. Based on our understanding of their work and the preliminary planning from the early stages of the pilot project, we also developed a plan to upgrade our entire sensor network and expand that network to encompass a greater proportion of CWT’s regional study area. We formalized this plan into an NSF Field Station and Marine Lab (FSML) grant proposal that we submitted in January of 2012.
We began streaming data at our first pilot site in June 2012. At each station, we replaced the CR10X dataloggers with CR1000 models, equipped them with wireless communication equipment, and implemented fully automated data processing, QA/QC, and publication of data products to the web using the GCE Data Toolbox for Matlab. We received word that the FSML grant was funded in August 2012.
The pilot project was successful in substantially increasing the frequency and quality of updates to data from these two sites, while at the same time significantly reducing labor costs for processing the data. Thanks to additional funding from the FSML grant, we have, as of this writing, upgraded the eight remaining stations at the Coweeta Hydrologic Laboratory and installed three new stations at Mars Hill College in Mars Hill, North Carolina. By the end of the year, we will have a network of fifteen sites in operation, and we anticipate operating approximately 30 fully automated stations using these methods by August 2014. The remainder of this article provides details on the opportunities provided to us by the GCE Data Toolbox and the off-the-shelf software available from Campbell Scientific.
Data Transmission using LoggerNet
The Coweeta LTER is located in the southern Appalachian Mountains and has very dense vegetation cover, along with steep, mountainous terrain. This terrain plays a large part in determining the wireless communication method employed at each site. Within the Coweeta Hydrologic Laboratory, 900 MHz radios are the ideal option wherever line of sight can be established. Cellular modems are our preferred option outside of Coweeta, where cellular service is available. If neither option is available, we plan to use the GOES (Geostationary Operational Environmental Satellites) transmission system.
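The order of preference described above can be written down as a small decision function. The following Python sketch is only an illustration of that ordering; the `Site` fields and the function name are hypothetical and are not part of LoggerNet or the GCE Data Toolbox.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of a candidate station location."""
    name: str
    inside_coweeta_lab: bool     # within the Coweeta Hydrologic Laboratory
    radio_line_of_sight: bool    # clear path to a base station or repeater
    cellular_service: bool       # usable cellular signal at the site


def choose_transmission(site: Site) -> str:
    """Return the preferred telemetry method, in the order given in the text."""
    if site.inside_coweeta_lab and site.radio_line_of_sight:
        return "900 MHz radio"
    if site.cellular_service:
        return "cellular modem"
    # Last resort: one-way satellite transmission.
    return "GOES satellite"


# Example with made-up sites.
for candidate in (
    Site("ridge", True, False, True),
    Site("valley", True, True, False),
    Site("remote", False, False, False),
):
    print(candidate.name, "->", choose_transmission(candidate))
```

In practice the choice also depends on field testing of signal strength, which a simple rule like this cannot capture.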
We use the Campbell Scientific LoggerNet software to manage most of the remote portions of the sensor network. LoggerNet handles scheduled data retrieval from the remote stations, includes an editor for creating datalogger programs, and can also be used to perform real-time station status checks and update datalogger programs remotely if needed. Other solutions are available, but we’ve found this off-the-shelf software to be intuitive and user friendly, and we believe it has helped us get this project off the ground more quickly than we might have otherwise.
Each cellular transmission station upgrade begins with replacing the datalogger with a Campbell Scientific CR1000 model and installing a cellular modem with a Yagi directional antenna to transmit the data. Data transmission and timing are handled through LoggerNet, and data are downloaded every 24 hours to an offsite workstation at the University of Georgia. Our pilot site is located at a high elevation near the top of a mountain and is also under dense canopy cover. Normally, cell coverage is virtually nonexistent within the Coweeta Hydrologic Laboratory. However, we learned that at some high-elevation sites, the cellular signal from the opposite side of the mountain is strong enough to establish a connection sufficient to transmit the relatively small amount of data involved.
Each radio transmission site upgrade, like the cellular upgrade, includes the installation of a CR1000 datalogger. Radio sites use a 900 MHz radio with either an omnidirectional antenna or a Yagi directional antenna, depending on whether the site is a radio repeater station. In addition, a radio base station was established inside one of the on-site CWT LTER offices at the Coweeta Hydrologic Laboratory. The base station consists of a radio connected to a Yagi antenna directed at the repeater station, and an Ethernet adapter. Connections to the radio network can be made through the LoggerNet software, which is also configured for daily data retrieval.
We have conducted one test of GOES satellite data streaming using an existing sensor array operated by the US Forest Service at the Coweeta Hydrologic Laboratory. Streaming data through the GOES system is more challenging for a variety of logistical and technical reasons. GOES users must have a federal sponsor (graciously provided, in our case, by our partners at the Coweeta Hydrologic Laboratory). In addition, while it is possible to download data using the natively supported GOES LRGS (Local Readout Ground Station) software, configuring this software for use with an automated workflow proved to be difficult. Instead, we contacted a local National Weather Service (NWS) office and requested that our sensor of interest be included in the Hydrometeorological Automated Data System (HADS). This allowed us to use the HADS support in the GCE Data Toolbox, but it required a data documentation process managed through intensive communication with the local NWS office.
NWS personnel are highly professional and wonderful to work with, but the GOES system requires an additional investment of setup time compared to the cellular or radio options. For this reason, and because GOES bandwidth is limited nationwide and the communication is only one way, we recommend GOES transmission only when directly managed options are unavailable.
Data Processing and Publishing with GCE Data Toolbox for Matlab
Once data are made available on a local computer by LoggerNet, the focus shifts to configuring the GCE Data Toolbox for Matlab. This software was developed by Wade Sheldon at the Georgia Coastal Ecosystems LTER (https://gce-svn.marsci.uga.edu/trac/GCE_Toolbox) as a way to manage data by providing tools for QA/QC checks and error flagging, metadata creation and management, data transformation, and the creation and publication of data products. These tasks can be fully automated, making the GCE Toolbox an ideal solution for data streaming. While the Toolbox is available free of charge, it operates within the Matlab environment and requires a Matlab license.
Configuring the toolbox for the data streaming was fairly straightforward for two reasons:
- The toolbox comes with a set of Demonstration products (located in the “Demo” folder) that are built for data streaming. The demo products can serve as a tutorial so that users can do a “dry run” of streaming, processing, and publishing a sample data set. However, they are also built to be copied into the userdata folder of the Toolbox (where the Toolbox can access custom and site-based projects) so that they can be modified and expanded to meet local needs.
- Primary features of the Toolbox are custom import filters and metadata templates that can be written, stored, manipulated, and copied using the Toolbox’s native interface.
We could have begun by using the Toolbox's standard text import filter to import the raw delimited ASCII data files into the Toolbox. However, an important part of the hardware upgrade was the switch to CR1000 dataloggers, which use the Campbell TOA5 file format as a standard. Since one of the products available in the Toolbox’s Demo folder is a TOA5 import filter, this format is easy to use with the GCE Toolbox. Once LoggerNet retrieved a file for us to work with, it took only a couple of minutes to pull in the file.
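For readers unfamiliar with TOA5, the layout is a comma-delimited text file with four header lines (station and logger information, column names, units, and processing type) followed by data records that begin with TIMESTAMP and RECORD columns. The Python sketch below shows one way such a file could be read. It is not the Toolbox's import filter, and the station name, column names, and values in the sample are invented for illustration.

```python
import csv
import io
import math

# Invented sample in TOA5 layout; every name and value is made up.
SAMPLE = '''"TOA5","DemoStation","CR1000","12345","CR1000.Std.22","CPU:demo.CR1","1234","Hourly"
"TIMESTAMP","RECORD","AirTC_Avg","VWC_Avg"
"TS","RN","Deg C","m^3/m^3"
"","","Avg","Avg"
"2012-06-01 00:00:00",0,18.42,0.231
"2012-06-01 01:00:00",1,"NAN",0.229
'''


def _convert(name, value):
    """Keep timestamps and other text as strings; parse numbers as floats."""
    if name == "TIMESTAMP":
        return value
    try:
        return float(value)  # the missing-value marker "NAN" becomes float nan
    except ValueError:
        return value


def read_toa5(stream):
    """Parse a TOA5 text stream into (header info, column metadata, rows)."""
    reader = csv.reader(stream)
    environment = next(reader)  # line 1: format tag, station, logger model, ...
    names = next(reader)        # line 2: column names
    units = next(reader)        # line 3: units
    processing = next(reader)   # line 4: processing type (Smp, Avg, ...)
    if environment[0] != "TOA5":
        raise ValueError("not a TOA5 file")
    columns = [
        {"name": n, "units": u, "processing": p}
        for n, u, p in zip(names, units, processing)
    ]
    rows = [
        {name: _convert(name, value) for name, value in zip(names, record)}
        for record in reader
        if record  # skip blank lines
    ]
    return environment, columns, rows


environment, columns, rows = read_toa5(io.StringIO(SAMPLE))
missing = sum(math.isnan(row["AirTC_Avg"]) for row in rows)
print(environment[1], [c["name"] for c in columns], "missing:", missing)
```

Because the units and processing type travel with the file, much of the variable-level metadata can be captured at import time rather than typed in by hand.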
Once we had a standard GCE Toolbox Matlab file, we could enter dataset and variable metadata, as well as flagging criteria. When all of these metadata are entered, they can be saved as a metadata template and stored for re-application as part of a data harvest workflow. GCE Toolbox-driven data harvest, post-processing, storage, and publication are handled by three main GCE Toolbox functions that are in turn supported by a wide array of additional files.
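To make the idea of a reusable template with flagging criteria concrete, here is a minimal Python sketch. It does not use the Toolbox's template format or flag syntax; the template structure, the variable names, the ranges, and the flag codes ("M" for missing, "I" for out of range) are all assumptions chosen for illustration.

```python
import math

# Hypothetical template: variable name -> units and an acceptable range.
TEMPLATE = {
    "AirTC_Avg": {"units": "Deg C", "min": -40.0, "max": 50.0},
    "VWC_Avg": {"units": "m^3/m^3", "min": 0.0, "max": 1.0},
}


def flag_value(value, rule):
    """Return 'M' for missing, 'I' for out of range, '' otherwise."""
    if value is None or math.isnan(value):
        return "M"
    if value < rule["min"] or value > rule["max"]:
        return "I"
    return ""


def apply_template(rows, template):
    """Return copies of the rows with a per-variable 'flags' dict attached."""
    flagged = []
    for row in rows:
        flags = {
            name: flag_value(row.get(name), rule)
            for name, rule in template.items()
        }
        flagged.append({**row, "flags": flags})
    return flagged


# Example with made-up records: one clean, one with a gap and a bad reading.
records = [
    {"AirTC_Avg": 18.4, "VWC_Avg": 0.23},
    {"AirTC_Avg": float("nan"), "VWC_Avg": 1.7},
]
for record in apply_template(records, TEMPLATE):
    print(record["flags"])
```

Storing the criteria separately from the data is what allows the same checks to be re-applied, unchanged, to every newly retrieved file.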
The data harvester (data_harvester.m) is a generic workflow program. It receives arguments specifying the source file path, import filter, metadata template, and publication file paths for a given data set. Using parameters provided in other supporting scripts, it retrieves the data from the data source, applies the import filter and metadata template, and then publishes the data to the required location. Users can save copies of this program and modify them to meet their own needs, adding workflow steps that are not covered by the other stored components of the data harvest system.
The harvest timer (harvest_timers.mat) is a GCE Data Toolbox data structure that stores information on the arguments used to run a data harvest script, and the frequency and timing with which data harvesters operate. The harvest timers are controlled by additional “start_harvester” and “stop_harvester” commands.
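The division of labor between a generic harvester and a timer can be sketched in a few lines of Python. This is a conceptual analogue only, not a translation of data_harvester.m or of the harvest timer structure; the function names, the stand-in import filter and template, and the sample data are all hypothetical.

```python
import csv
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def harvest(source_path, import_filter, template, publish_path):
    """Import a file, apply a template to each row, write CSV; return row count."""
    with open(source_path, newline="") as src:
        rows = import_filter(src)
    rows = [template(row) for row in rows]
    if not rows:
        return 0
    with open(publish_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def is_due(last_run, interval, now):
    """True when a harvest has never run or the interval has elapsed."""
    return last_run is None or now - last_run >= interval


def csv_filter(stream):
    """Stand-in import filter: plain CSV with a header row."""
    return list(csv.DictReader(stream))


def add_site(row):
    """Stand-in metadata template: tag each row with a made-up site code."""
    return {**row, "site": "DEMO1"}


# Example run against a temporary directory with invented data.
workdir = Path(tempfile.mkdtemp())
source = workdir / "raw.csv"
source.write_text("timestamp,air_temp\n2012-06-01 00:00,18.4\n")
published = workdir / "published.csv"

last_run = None
now = datetime(2012, 6, 2)
if is_due(last_run, timedelta(hours=24), now):
    count = harvest(source, csv_filter, add_site, published)
    last_run = now
    print("harvested", count, "row(s) to", published.name)
```

Passing the import filter and template in as arguments is what keeps the harvester generic: supporting a new station means supplying different arguments, not writing a new workflow.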
The standard data harvester file is configured to generate publication-ready web pages with links to data and customized plots that let users do preliminary data exploration. However, in order to actually publish the data, information managers must edit two sets of files, all available in the demo directory of any standard GCE Toolbox download. There are two Matlab script files and six standard XML and CSS stylesheets:
- The Matlab .m files are harvest_info.m and harvest_plot.m:
  - The harvest info file contains parameters for the web file paths and links that should be established as part of the data harvest process.
  - The harvest plot file contains information on the content and type of plots or graphs that should be generated for publication.
- The look and feel of websites generated during a data harvest are managed with the following XML and stylesheets:
By default, the harvest_info, harvest_plot, and stylesheet templates are configured to work with the demonstration data included in the Demo folder within the Toolbox. However, the harvest_plot and harvest_info files are based on case statements that can easily be copied within the file to expand the number of datasets the info and plot files are generating. These two files manage all configured harvests. Editing the stylesheets should be standard fare for any LTER information manager.
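The "copy a case to add a dataset" pattern can be illustrated with a short Python analogue. This sketch does not reproduce the contents of harvest_info.m; the function name, dataset names, directories, and URLs are placeholders invented for the example.

```python
def lookup_harvest_info(dataset):
    """Return publication settings for one dataset (all values hypothetical)."""
    settings = {
        "demo_met": {
            "web_dir": "/var/www/streaming/demo_met",
            "base_url": "http://example.org/streaming/demo_met",
        },
        # Adding a dataset means copying a block and changing its values.
        "demo_soil": {
            "web_dir": "/var/www/streaming/demo_soil",
            "base_url": "http://example.org/streaming/demo_soil",
        },
    }
    try:
        return settings[dataset]
    except KeyError:
        raise ValueError(f"no harvest settings configured for {dataset!r}") from None


print(lookup_harvest_info("demo_soil")["base_url"])
```

Keeping every dataset's settings in one file means that a single lookup serves all configured harvests, at the cost of that file growing with the network.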
Summary and Conclusions
As we have worked to upgrade the CWT LTER sensor network, we have found that the solutions provided by both LoggerNet and the GCE Toolbox have accelerated the pace at which we can make data available. To date, we have reliable streaming data for eight sites, all of which are available for download at http://coweeta.uga.edu/streaming. As we move forward, we will not only be adding more stations, but will also be increasing the complexity and sophistication of post-processing at each site to ensure maximum data quality. We believe all of these goals can be accomplished with the framework we have outlined.
In addition, one of the Coweeta LTER PIs recently observed that, in adopting the tools we are now using, we have essentially created a “wall plug” that any CWT LTER investigator can use to their advantage when integrating sensor data into their research. Given the amount of effort that would be involved to do the same thing without these tools, we concur with this assessment and are pleased with the increases we are seeing in our capacity to scale up in order to meet growing demand.