The GeoNIS: Adding Geospatial Capabilities to the NIS

Issue: Spring 2012

Aaron Stephenson (NTL)

Introduction

The LTER Network Information System (NIS, https://nis.lternet.edu/) is intended to provide a number of tools and services to promote data access and availability. These include standardized approaches to metadata management and data access, programs and workflows to create and maintain integrated derived datasets, and applications for data discovery, access, and use. These services will be enabled by the Provenance Aware Synthesis Tracking Architecture (PASTA) framework, the core component of the NIS that harvests site metadata and data into the NIS. The initial development of the NIS focuses on supporting well-documented tabular data only, leaving more complex data (such as geospatial data) to a later date. Many LTER sites have considerable geospatial data holdings; at some sites, geospatial data constitute the majority of their datasets. Rather than wait for PASTA to support geospatial data, for which no implementation date has been set, the LTER GIS Working Group intends to build a geospatial module for the NIS so that these data can be harvested, stored, and made accessible through the NIS. This module is called the GeoNIS.

Vision

Why a GeoNIS? Location is everything. LTER researchers need to be able to discover, search, and access data sets and related research results across LTER and other areas of the world. Most projects need a geographic framework that helps place the project in context. What soils are similar or different? What is the elevation and aspect? Are study sites similar, or do they contrast with one another? The GeoNIS will help the LTER network build the geographic framework within and between our sites and assist with the synthesis process.

The GeoNIS is intended to provide dynamic harvesting and archiving of site-based data and metadata, support value-added products, and include the ability to generate synthetically derived data products. The GeoNIS mirrors the design of PASTA with the additional capability to store and process geographic data. We will test the capabilities of the PASTA data cache to store spatial data before they are ingested into the GeoNIS.

Using automated workflows triggered by an event listener, the GeoNIS will ingest geospatial data from the PASTA data cache into a geodatabase, allowing data to be immediately usable by clients. Uses might include interactive mapping, geoprocessing (transforming and analyzing data to produce new data), or simply making data available for filtering (by location or attribute) so that only the portion of interest is downloaded rather than the entire dataset.
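As a rough illustration of filtering by location or attribute, the following Python sketch selects point features that fall inside a bounding box and match a set of attribute values. The Feature class, the select function, and the sample coordinates are hypothetical stand-ins for illustration only; they are not part of the GeoNIS design or of any ArcGIS API.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A point feature: decimal-degree coordinates plus free-form attributes."""
    lon: float
    lat: float
    attributes: dict = field(default_factory=dict)


def in_bbox(feature, min_lon, min_lat, max_lon, max_lat):
    """True if the feature lies inside (or on the edge of) the bounding box."""
    return min_lon <= feature.lon <= max_lon and min_lat <= feature.lat <= max_lat


def select(features, bbox=None, where=None):
    """Return features inside bbox (if given) whose attributes equal where (if given)."""
    selected = []
    for f in features:
        if bbox is not None and not in_bbox(f, *bbox):
            continue
        if where is not None and any(f.attributes.get(k) != v for k, v in where.items()):
            continue
        selected.append(f)
    return selected


# Illustrative sample points only; not real GeoNIS records.
plots = [
    Feature(-89.41, 43.07, {"site": "NTL", "type": "lake"}),
    Feature(-89.70, 45.99, {"site": "NTL", "type": "bog"}),
    Feature(-122.25, 44.21, {"site": "AND", "type": "stream"}),
]

# Location filter and attribute filter combined.
subset = select(plots, bbox=(-90.0, 43.0, -89.0, 46.0), where={"type": "lake"})
```

A client would then download only the features in subset instead of the whole dataset.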

Archiving of datasets will be a central piece of the GeoNIS. Each time a new version of a dataset is harvested by the PASTA harvester, that version will be ingested into the GeoNIS geodatabase. The goal is to enable every version of a dataset to be accessible to clients.
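One way to picture this archiving goal is a store that never overwrites: each harvested revision is added alongside the earlier ones. The sketch below is a minimal in-memory illustration in Python, assuming a hypothetical VersionArchive class and a made-up dataset identifier; the actual GeoNIS would rely on the geodatabase for this.

```python
from collections import defaultdict


class VersionArchive:
    """Keeps every ingested revision of every dataset (in memory, for illustration)."""

    def __init__(self):
        # dataset_id -> {revision number: payload}
        self._versions = defaultdict(dict)

    def ingest(self, dataset_id, revision, payload):
        """Store a new revision; refuse to overwrite one that already exists."""
        if revision in self._versions[dataset_id]:
            raise ValueError(f"{dataset_id} revision {revision} is already archived")
        self._versions[dataset_id][revision] = payload

    def revisions(self, dataset_id):
        """List the archived revision numbers in ascending order."""
        return sorted(self._versions.get(dataset_id, {}))

    def get(self, dataset_id, revision=None):
        """Return one revision, or the newest one when revision is None."""
        versions = self._versions.get(dataset_id)
        if not versions:
            raise KeyError(dataset_id)
        if revision is None:
            revision = max(versions)
        return versions[revision]


archive = VersionArchive()
# "example-dataset" is a hypothetical identifier.
archive.ingest("example-dataset", 1, "boundary polygons, first release")
archive.ingest("example-dataset", 2, "boundary polygons, corrected")
```

Clients that ask for no particular revision receive the newest one, while older revisions remain retrievable by number.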

By integrating GIS data across sites into one repository, the GeoNIS will provide researchers with  the ability to create new products and services, and provide the spatial framework for cross-site science. For example, locations where data collection took place can be coupled with information from external services (such as the Geographic Names Information System) to create a gazetteer that would be used to assign spatial keywords to LTER datasets. Another example is creating maps on demand by assembling LTER and non-LTER GIS resources via web mapping services.
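The gazetteer idea can be sketched as a nearest-place lookup: given a sampling location, return the names of gazetteer entries within some radius and use them as spatial keywords. The Python sketch below uses the haversine great-circle distance; the place names, coordinates, and the 10 km default radius are invented for illustration and do not come from the Geographic Names Information System.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def spatial_keywords(lon, lat, gazetteer, radius_km=10.0):
    """Names of gazetteer places within radius_km of the point, nearest first."""
    hits = [
        (haversine_km(lon, lat, place_lon, place_lat), name)
        for name, place_lon, place_lat in gazetteer
    ]
    return [name for distance, name in sorted(hits) if distance <= radius_km]


# Hypothetical gazetteer entries: (name, lon, lat).
gazetteer = [
    ("Example Lake", -89.40, 43.10),
    ("Example Bog", -89.69, 46.00),
    ("Far Ridge", -122.20, 44.20),
]

keywords = spatial_keywords(-89.41, 43.07, gazetteer)
```

The returned names could then be attached to a dataset's metadata as spatial keywords.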

Components

The GeoNIS will consist of several connected components, with future links to PASTA.

[Figure: A diagram showing the components of the GeoNIS]

Site Data:

LTER sites will contribute to the GeoNIS by preparing EML-compliant metadata and spatial data packages. The metadata will be harvested in the same way as other LTER metadata, and links within the metadata to the associated spatial data packages will be used to harvest the spatial data into the GeoNIS. The data packages will include the digital data files as well as GIS-specific metadata, if available. Access to both metadata formats will ensure that spatial data are included in LTER data catalog searches as well as within specialized GIS applications.
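To make the harvesting step concrete, the sketch below pulls download links for spatial entities out of an EML document using Python's standard XML parser. The EML fragment is a heavily trimmed, hypothetical example (the package ID, entity name, and URL are placeholders), and the function name spatial_data_links is an assumption for illustration. It looks only at the spatialVector and spatialRaster entity types and at URLs under physical/distribution/online.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EML document; real metadata carries far more detail.
SAMPLE_EML = """
<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.0"
         packageId="knb-lter-xxx.1.1" system="knb">
  <dataset>
    <title>Example study plots</title>
    <spatialVector>
      <entityName>Study plots</entityName>
      <physical>
        <objectName>plots.zip</objectName>
        <distribution>
          <online>
            <url function="download">https://example.org/data/plots.zip</url>
          </online>
        </distribution>
      </physical>
    </spatialVector>
  </dataset>
</eml:eml>
"""


def spatial_data_links(eml_text):
    """Return (entity type, entity name, URL) for each spatial entity download link."""
    root = ET.fromstring(eml_text.strip())
    links = []
    for entity_type in ("spatialVector", "spatialRaster"):
        for entity in root.iter(entity_type):
            name = (entity.findtext("entityName") or "").strip()
            for url in entity.iterfind("physical/distribution/online/url"):
                if url.text:
                    links.append((entity_type, name, url.text.strip()))
    return links


links = spatial_data_links(SAMPLE_EML)
```

Each returned URL identifies a spatial data package that a harvester could fetch and hand to the ingestion workflows.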

Ingestion Workflows:

Ingestion workflows, triggered by the PASTA event listener, consist of operations that extract, transform, and load spatial data from a variety of file formats into the GeoNIS geodatabase. These scripts will be written in Python so that both ArcGIS and operating system tools can be used to automate the ingestion of site data into the GeoNIS. Through the use of the ArcGIS Data Interoperability Extension (http://www.esri.com/software/arcgis/extensions/datainteroperability/inde...), thousands of data formats can be converted and ingested by the GeoNIS.
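The extract, transform, and load pattern described above can be sketched with the Python standard library alone. This is not the GeoNIS implementation and makes no ArcGIS calls: a plain list stands in for the geodatabase, only point data in GeoJSON and CSV (with lon and lat columns) are handled, and all function names are hypothetical.

```python
import csv
import json
from pathlib import Path


def read_geojson(path):
    """Extract point features from a GeoJSON FeatureCollection (points only)."""
    with open(path, encoding="utf-8") as fh:
        collection = json.load(fh)
    for feat in collection["features"]:
        lon, lat = feat["geometry"]["coordinates"][:2]
        yield {"lon": float(lon), "lat": float(lat),
               "attributes": dict(feat.get("properties") or {})}


def read_csv_points(path):
    """Extract points from a CSV file with 'lon' and 'lat' columns."""
    with open(path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            lon = float(row.pop("lon"))
            lat = float(row.pop("lat"))
            yield {"lon": lon, "lat": lat, "attributes": dict(row)}


# Transform step: every reader emits the same record layout, chosen by file extension.
READERS = {".geojson": read_geojson, ".json": read_geojson, ".csv": read_csv_points}


def ingest(path, store):
    """Load one file into store (a list standing in for the geodatabase)."""
    path = Path(path)
    reader = READERS.get(path.suffix.lower())
    if reader is None:
        raise ValueError(f"unsupported format: {path.suffix}")
    records = list(reader(path))
    store.extend(records)
    return len(records)
```

Supporting another format in this sketch means adding one reader function and one entry in READERS, which loosely mirrors how a format-conversion extension widens what a workflow can accept.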

Temporary Data Storage:

A set of local file folders on the GeoNIS server will be used for the temporary storage of spatial data files while they are being operated on by workflows.
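A minimal Python sketch of this kind of scratch storage, assuming a hypothetical helper named process_in_scratch: the source file is copied into a temporary folder, a workflow step operates on the copy, and the folder is deleted when the step finishes. The folder prefix is arbitrary.

```python
import shutil
import tempfile
from pathlib import Path


def process_in_scratch(source_file, work):
    """Copy source_file into a temporary folder, run work(staged_path), then clean up.

    The temporary folder and everything in it are removed when the block exits,
    even if work raises an exception.
    """
    source_file = Path(source_file)
    with tempfile.TemporaryDirectory(prefix="geonis_") as scratch:
        staged = Path(scratch) / source_file.name
        shutil.copy2(source_file, staged)
        return work(staged)
```

Because the workflow step touches only the staged copy, the harvested original is left unmodified.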

Geodatabase:

An ArcSDE database will store geospatial data (both site data and synthesized data) for the GeoNIS.  ArcSDE supports multiuser reading and editing, data versioning, and archiving.

GIS Server:

ArcGIS Server will provide the ability to create, manage, and distribute web services, which can be accessed by desktop, mobile, and web applications. Several kinds of services can be published, including OGC, KML, many kinds specific to ArcGIS (map, globe, image, geoprocessing, etc.), and more. A list is available online at http://goo.gl/JzAIt.

Web Services:

A variety of web services will be populated with GeoNIS data, allowing any number of clients to access the data. At first this will be limited to map and image services, but eventually geoprocessing services will also be developed.
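If a map service were exposed through the OGC Web Map Service (WMS) interface, a client could request a rendered map with a GetMap URL. The Python sketch below assembles such a URL for WMS 1.3.0. The endpoint and layer name in the usage line are placeholders, the function name is hypothetical, and nothing here reflects an actual GeoNIS service.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   crs="CRS:84", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap request URL.

    bbox is (min_x, min_y, max_x, max_y) in the axis order of crs;
    CRS:84 uses longitude, latitude order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and layer name.
url = wms_getmap_url("https://example.org/wms", ["sites"], (-90, 43, -89, 46), 512, 512)
```

Any WMS-capable desktop, mobile, or web client issues requests of this general shape, which is what lets many kinds of clients share one service.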

Data Portal:

The GeoNIS data portal will provide the ability to discover and access GeoNIS data through a variety of interfaces: thematically grouped links, a graphical mapping interface, and a textual search interface. This could be accomplished through the open source ESRI GeoPortal Server (http://www.esri.com/software/arcgis/geoportal/index.html), directly in the NIS Data Portal, or through some combination of the two.

Links to other NIS modules:

Other NIS modules could be linked to the GeoNIS, such as ClimDB, HydroDB, or SiteDB, for more in-depth analysis or for mash-up applications.

Additionally, the GIS Working Group is developing best practices for smooth operation of the GeoNIS. Topics include data packaging, attribute definitions, symbology, coordinate systems, and data structures, among others.

Next Steps

One of the highest-priority tasks for the LTER GIS Working Group is to request endorsements from IMExec, NISAC, and the Executive Board. Following that, work will begin in earnest on building the GeoNIS. Phases will include building the geospatial software stack (database, server, web services, and applications), writing workflow programs (including connections to PASTA), building the data portal, and finally creating value-added products. The GIS Working Group invites any interested members of the community to assist with this project, especially members of the web services group, NIS Tiger Teams, and the various network database groups.