Systems Upgrade through Technology Transfer across LTERs: Who Benefits?

Issue: Spring 2011

John F. Chamblee (CWT) and Wade Sheldon (GCE)

Introduction

In 2009, the Coweeta LTER site began planning a complete web and information system redesign. In an early preparatory step, John Chamblee (CWT IM) and Ted Gragson (CWT LPI) met with Wade Sheldon (GCE Lead IM) to discuss potential use of GCE technology for this effort. Both CWT and GCE LTER sites are administered at the University of Georgia, and attempts to forge closer ties between CWT and GCE have been underway since GCE was first established in 2000. The need for a system upgrade presented a fresh opportunity to push the effort forward. After discussion and demos of GCE software, and with the approval of both project leaderships, we agreed to collaborate on adapting several GCE databases and web applications for Coweeta’s use, as well as the GCE Data Toolbox software for processing Coweeta field data. Although work continues, initial products of this collaboration are now implemented on the new CWT website, unveiled in April 2011 (http://coweeta.uga.edu).

The collaboration between Coweeta and Georgia Coastal has, for the most part, been a one-directional transfer of technology. The Coweeta LTER was able to replace out-of-date information architectures with a pre-built system that was more suitable for the present and future needs of LTER information management. If we were to answer the question of “Who benefits?” solely by examining the direct receipt of products by one LTER from another, then the answer must be that Coweeta received the lion’s share of return on the investment in collaboration. However, if we add to our evaluation measures of overall product adaptability and long-term potential benefits beyond our two sites, then the benefits appear more evenly distributed between the two collaborators.

When the question of “Who benefits?” is considered with regard to the broader community, three answers stand out.  In addition to providing a model for future collaborations, this project has introduced some novel approaches to technology that could be adopted elsewhere in the network. We have also learned that there are some additional steps we might take to make it easier to adopt GCE technology in the future.

Direct benefits for Coweeta

GCE provided a series of applications and databases that were adapted for use in Coweeta systems. The products included, in order of adoption, the GCE Data Toolbox for MATLAB (before it was available under GPL), the GCE_Biblio publication catalog database, the GCE_Metabase data catalog, and the GCE_Submission and GCE_Access databases (which track research projects, additional project resources, and data use). In addition, we were provided an ASP code stack to make views from these databases accessible via the Web. We also developed a shared hosting agreement through which Coweeta could use its LAMP (Linux, Apache, MySQL, PHP) server environment to access GCE-originated ASP pages via a reverse proxy to a GCE IIS server. Finally, Chamblee received many dozens of hours of technical support and mentoring in Information Management. We can break down the benefits Coweeta received from this technology transfer in several ways:

1)  Coweeta did not have to “re-invent the wheel” and invest time and money in a series of new systems, but instead was able to capitalize on existing systems and put into use applications and database designs that had been proven through years of production use.
2)  Since GCE was willing to host CWT ASP pages, Coweeta IM was able to maintain its LAMP architecture and only had to invest in MS SQL infrastructure.
3)  Coweeta was presented with an established and high standard against which to measure their systems. By using our respective data models as a basis of comparison, it was possible to conduct a fine-grained analysis of both data availability and data structure and, in so doing, to document the areas in which Coweeta would need to improve its data management practices.
4)  Coweeta was able to take advantage of a long available but underutilized opportunity for collaboration between LTERs on the same campus, which, in this case, provided the means for informal on-the-job-training. Coweeta experienced a great deal of turnover during the last funding cycle, and Chamblee has only been on the job for two years. By working closely with a more experienced Information Manager, Chamblee was able to gain a broader understanding of the issues that affect Information Management across the LTER Network. In addition, since Chamblee is from a domain science background (archaeology and historical geography), the challenge of adapting GCE’s complex information management architecture to Coweeta provided a crash course in several new technologies.

The tools developed by GCE and Sheldon consist of databases grounded on strongly typed data models and highly modular and well-documented application code stacks. This is true of both the MS SQL / ASP applications and the GCE MATLAB Toolbox. This approach to design has unexpected benefits for both incoming and experienced IMs.

While there is no “manual” for most of the GCE applications, per se, the embedded documentation, combined with GCE’s modularized approach to design, provides what a linguist might call a “creole,” or hybrid language, of documentation that blends English, various programming languages, and entity-relationship symbology. With these tools, a domain scientist who is reasonably comfortable with programming languages and database models can trace out the operational and logical connections and teach themselves the applications. This saves training time for the people providing the technology, and it occasionally helps to locate areas of potential improvement within the technology. By attempting to reverse-engineer programs, the neophyte may also expose an occasional inconsistency in application logic simply by stumbling across a thread that they cannot trace to its conclusion (although these “insights” can also, at times, be due to a simple lack of understanding).

Indirect benefits for Georgia Coastal Ecosystems

Although the direct benefits for Coweeta were substantial, the somewhat uneven appearance of the cost-benefit balance between GCE and Coweeta is deceiving. In exchange for providing existing technologies and services to Coweeta, GCE received valuable feedback that provided the justification for adjusting or even re-thinking of several design decisions related to GCE’s database structures. Moreover, the ease with which GCE’s interfaces have been adopted by another site located in a very different biome has also vindicated many of the decisions behind GCE’s overall design strategy.

GCE undertook two major revisions of the Information Management system that were at least partially motivated by the feedback Sheldon received from Coweeta. After Coweeta adopted the GCE MATLAB toolbox, Coweeta had some difficulty, overcome with Sheldon’s help, adapting the functions that import data logger arrays into MATLAB files.  Sheldon ultimately rewrote these scripts and provided a more modularized approach to a system that was originally designed to handle only the workflows and preprogrammed data logger arrays at the Georgia Coastal site. After this experience, Sheldon took it upon himself to look at several other routines and re-evaluate them for similar issues.

In addition, GCE had long been considering a revision of the entity-relationship model for the GCE_Metabase. The three principal issues to be addressed were keyword management, instrumentation metadata, and the handling of multi-entity or non-tabular data sets. Before the database was revised, keyword fields were distributed across the database, and tables for individual keyword fields were tied to other fields in parent-child relationships. The new model includes a master keyword table consisting of both keywords and scopes, which define the range of situations for which the keywords apply. Instrumentation was previously handled using a strongly typed model designed to describe the specific domain of instruments that GCE employed, but the new model accommodates simple textual descriptions as well. Finally, a one-to-one relationship between tables and data sets was revised to accommodate many-to-many relationships between data sets, entities (e.g. tables) and files. Data set entities can be time-series data stored in multiple files, a single file containing a multi-table relational database (as with a Microsoft Access application), or bundles of GIS data in vector or raster formats.
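The revised relationships described above can be illustrated with a minimal sketch. The actual GCE_Metabase schema is not reproduced in this article, so every table and column name below is hypothetical, and SQLite stands in for MS SQL purely so the example is self-contained. It shows a single keyword table that carries a scope column, and junction tables that allow many-to-many links between data sets, entities, and files.

```python
import sqlite3

# All table and column names are hypothetical; they only illustrate the
# kinds of relationships described in the text, not the real GCE_Metabase.
SCHEMA = """
CREATE TABLE keyword (
    keyword_id INTEGER PRIMARY KEY,
    keyword    TEXT NOT NULL,
    scope      TEXT NOT NULL,   -- range of situations in which the keyword applies
    UNIQUE (keyword, scope)
);
CREATE TABLE data_set  (data_set_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE entity    (entity_id   INTEGER PRIMARY KEY, name  TEXT NOT NULL, kind TEXT NOT NULL);
CREATE TABLE data_file (file_id     INTEGER PRIMARY KEY, path  TEXT NOT NULL);
-- Junction tables: a data set may hold many entities, an entity many files.
CREATE TABLE data_set_entity (
    data_set_id INTEGER REFERENCES data_set(data_set_id),
    entity_id   INTEGER REFERENCES entity(entity_id),
    PRIMARY KEY (data_set_id, entity_id)
);
CREATE TABLE entity_file (
    entity_id INTEGER REFERENCES entity(entity_id),
    file_id   INTEGER REFERENCES data_file(file_id),
    PRIMARY KEY (entity_id, file_id)
);
"""

def build_demo():
    """Create an in-memory database with one time series split across two files."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO keyword (keyword, scope) VALUES (?, ?)",
                     [("discharge", "theme"), ("discharge", "measurement")])
    conn.execute("INSERT INTO data_set VALUES (1, 'Example stream gauge series')")
    conn.execute("INSERT INTO entity VALUES (1, 'discharge', 'table')")
    conn.executemany("INSERT INTO data_file VALUES (?, ?)",
                     [(1, "discharge_2009.csv"), (2, "discharge_2010.csv")])
    conn.execute("INSERT INTO data_set_entity VALUES (1, 1)")
    conn.executemany("INSERT INTO entity_file VALUES (?, ?)", [(1, 1), (1, 2)])
    return conn

def files_for_data_set(conn, data_set_id):
    """Return every file path reachable from a data set through its entities."""
    rows = conn.execute(
        """SELECT f.path
           FROM data_set_entity dse
           JOIN entity_file ef ON ef.entity_id = dse.entity_id
           JOIN data_file f    ON f.file_id = ef.file_id
           WHERE dse.data_set_id = ?
           ORDER BY f.path""",
        (data_set_id,),
    )
    return [row[0] for row in rows]
```

Calling `files_for_data_set(build_demo(), 1)` walks both junction tables, which is the step a one-to-one table-per-data-set design could not express.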

Coweeta’s contribution to these revisions is most clear in the case of the data set entity revision. Since Coweeta's IM is a GIS-intensive operation, Coweeta was able to provide several use cases against which the data set entity model could be tested. Coweeta was also able to provide feedback on the revised ASP applications that took advantage of the new model and, as was the case with the MATLAB toolbox, provide a use case for instrumentation documentation. The difference across biomes between the two sites also allows for broader testing of the new keyword system.

Long-Term benefits for the LTER Network

The LTER network has the potential to benefit from this collaboration in two ways. First, Coweeta and GCE have demonstrated that it is possible for multiple LTER sites to co-develop a multi-platform web application system and host it smoothly and securely, using reverse-proxy logic to transparently leverage software between sites. Moreover, Coweeta has developed a novel strategy for integrating LTER-specific resources (data catalogs, bibliographies, personnel lists, etc.) with other types of more static content using a low cost hybrid approach that combines proprietary web applications with an off-the-shelf content management system (Drupal).
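The routing decision behind the reverse-proxy arrangement can be sketched as follows. This is only an illustration of the idea, not the sites' configuration: the path prefix and both host names are hypothetical, and a production setup would express the same rule in the web server's own proxy configuration rather than in application code.

```python
# Hypothetical mapping of URL path prefixes to the remote server that owns them.
PROXIED_PREFIXES = {
    "/asp/": "http://iis.example.org",   # stands in for the remote IIS host
}

# Hypothetical local backend that serves everything else (e.g. the CMS pages).
LOCAL_BACKEND = "http://localhost:8080"

def upstream_url(path):
    """Return the backend URL that should serve the given request path.

    Requests under a proxied prefix are forwarded to the remote server;
    visitors still see only the front-end site's address.
    """
    for prefix, backend in PROXIED_PREFIXES.items():
        if path.startswith(prefix):
            return backend + path
    return LOCAL_BACKEND + path
```

Under these assumptions, a request for `/asp/catalog.asp` is forwarded to the remote host, while `/about` stays on the local server.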

Rather than developing an entirely new framework using HTML, PHP, or ASP, Coweeta adopted Drupal for managing static resources such as site histories, facility description, driving directions, etc. Once the Drupal pages were in place, ASP-based pages were “re-skinned” to match using include files derived from the Drupal site template. These include files can be edited whenever the site undergoes a large structural change (e.g., a main menu revision), propagating the changes to all ASP-based pages on the site. This hybrid approach, which Chamblee is calling DrASPal, leverages the PHP and JavaScript tools that are included in Drupal and has proven both flexible and reliable.
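The include-file pattern described above can be sketched in a few lines. The fragments and names below are hypothetical stand-ins for include files derived from a site template; the point is only that every application page is wrapped by the same shared header and footer, so a single edit to an include reaches all pages.

```python
from string import Template

# Hypothetical shared fragments, standing in for include files derived
# from the content management system's site template.
INCLUDES = {
    "header": "<div id='header'><ul class='menu'>$menu</ul></div>",
    "footer": "<div id='footer'>$site_name</div>",
}

def render_page(body, menu_items, site_name="Example LTER"):
    """Wrap application-generated body markup in the shared header and footer."""
    menu = "".join(f"<li>{item}</li>" for item in menu_items)
    header = Template(INCLUDES["header"]).substitute(menu=menu)
    footer = Template(INCLUDES["footer"]).substitute(site_name=site_name)
    return header + body + footer
```

Changing the menu list passed to `render_page` (or the header fragment itself) alters every page that uses it, which mirrors how a main menu revision propagates through the include files.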

While the primary participants in this collaboration have been GCE and Coweeta, we have also worked with the Moorea Coral Reef and Santa Barbara Coastal LTERs and included them in our discussions. Together, M. Gastil Buhl and Margaret O’Brien have been working to adopt the GCE ER-models for use at their sites, but in a PostGreSQL framework driven by Perl. As they make progress, they will open up the opportunity for other sites to adopt the GCE database models without any investment in IIS or MS SQL-based technologies.

Moving forward, there are some additional changes that would be worthwhile to pursue. At present, the database instances at each site are referenced with three-letter LTER prefixes (e.g. GCE_Metabase, CWT_Metabase, etc.). If other sites are interested in adopting these databases, investment in a more generic naming structure might be worth pursuing, especially since sites are likely to continue hosting their own local database server instances. Any such revision would need to take into consideration the cost involved in altering the SQL-based views that stand between web users and the actual database tables. These views often include hard-coded variables, such as server URLs and database instance names. Given appropriate resources, the GCE databases could be restructured so that such variables are stored in tables that are populated when the database instance is first established and that can be easily edited in the event a migration is needed.
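One way to picture the proposed restructuring is a view that reads the server URL from a settings table instead of hard-coding it. The sketch below is an assumption-laden illustration: the table, view, and column names are hypothetical, the URLs are placeholders, and SQLite is used in place of MS SQL so the example runs on its own.

```python
import sqlite3

def build_catalog(server_url):
    """Create a demo database whose view builds links from a settings table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical settings table, populated when the instance is set up.
        CREATE TABLE site_setting (name TEXT PRIMARY KEY, value TEXT NOT NULL);
        CREATE TABLE data_set (accession TEXT PRIMARY KEY, title TEXT NOT NULL);
        -- The view contains no hard-coded server URL.
        CREATE VIEW data_set_link AS
            SELECT d.accession,
                   d.title,
                   s.value || '/data/' || d.accession AS url
            FROM data_set d
            JOIN site_setting s ON s.name = 'server_url';
    """)
    conn.execute("INSERT INTO site_setting VALUES ('server_url', ?)", (server_url,))
    conn.execute("INSERT INTO data_set VALUES ('DS-001', 'Example data set')")
    return conn

def migrate(conn, new_url):
    """Point the instance at a new server by editing one settings row."""
    conn.execute("UPDATE site_setting SET value = ? WHERE name = 'server_url'", (new_url,))

def link_for(conn, accession):
    """Look up the public URL the view generates for one data set."""
    row = conn.execute(
        "SELECT url FROM data_set_link WHERE accession = ?", (accession,)
    ).fetchone()
    return row[0]
```

With this layout, a migration touches one row in `site_setting` and leaves every view definition unchanged.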

In addition, since it was necessary for Coweeta to take a full-scale approach to redesigning its web site, Coweeta was able to review not just the site design and architecture, but also the entire site’s content. This opportunity was particularly timely because of the large number of revisions that have taken place recently with regard to NSF policies concerning data citations and data management plans, as well as the proliferation of regulations at universities nationwide concerning research involving human and animal subjects. Coweeta developed its documentation on these topics by comparatively examining and citing other sites' documents as well as LNO and NSF documents and policies. This new documentation provides a good test case for the network and a potential example for other sites to adopt.

Conclusions

For Coweeta, the list of benefits provided here is relatively short in terms of descriptive text, but that is because the benefits are so overwhelmingly clear and straightforward. In terms of the time and cost saved relative to developing a new system, these benefits cannot be over-emphasized, nor can we over-emphasize the training value these systems hold for a newcomer. For GCE, the benefits include feedback on their designs and opportunities to pursue upgrades they were already considering in a context involving much broader use cases. In addition, the success of this collaboration validated the effort and strategic thinking behind GCE’s database and software designs, which made it possible to port them across sites and work contexts.

For the LTER Network, the long-term benefit of this collaboration is a model for cross-site technology transfer and development. The GCE/CWT collaboration was successful because the technologies in question were suitable for the purposes to which they were being put and because the principals strove to achieve mutual benefit whenever possible, accepting that the benefits would sometimes be unequal, at least in the short term. Over time, this spirit of collaboration is bound to produce benefits that outstrip those already achieved.