
Preparing Spatial Data and Associated Metadata for the GeoNIS

Issue: 
Spring 2012

Theresa Valentine (AND)

A primary charge of the LTER Network is making data products widely available on-line. Traditionally this system has focused on tabular data and left spatial datasets to be organized in separate systems, with different methods and standards for documentation. The GIS Working Group has been working on resources to better integrate spatial data into the network and to help site Information Managers, researchers, and students create and access these datasets and their associated metadata.

Metadata is often defined as data about data. Wikipedia defines it as “structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource”. The standard for geospatial data has been the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM). In 2010, the FGDC endorsed 65 non-Federally authored standards for metadata and is moving towards an international ISO suite of standards (FGDC standards). The FGDC recommends the following: “If you have a metadata collection whose contents can be accessed as XML or metadata management software that supports ISO metadata… consider converting your FGDC metadata in XML format into the ISO XML format … and use an ISO metadata editor tool to create and update it.” If this weren’t complicated enough, the standard for LTER is the Ecological Metadata Language (EML). In addition, the commercial GIS software companies have created their own metadata documentation systems, and EML and other metadata standards continue to evolve, adding new versions to keep track of. Keeping up with all the changes can be difficult and frustrating for Information Managers and GIS specialists, and it is often difficult to crosswalk between the different standards without losing track of where the most current information is located or falling out of compliance with Network policies.

It is, however, important to remember that creating valid EML will ensure that site spatial data is discoverable in searches of the Network Information System (NIS) and at local sites. This article provides an abridged guide to the steps needed to create valid EML documents for spatial data, while keeping complete metadata within your local GIS databases and dealing with the legacy of existing CSDGM metadata. The workflow will also help the user prepare data packages for inclusion in the GeoNIS database (see article on the GeoNIS).

What is Spatial Data?

Webopedia defines “spatial data” as follows: “Also known as geospatial data or geographic information it is the data or information that identifies the geographic location of features and boundaries on Earth, such as natural or constructed features, oceans, and more. Spatial data is usually stored as coordinates and topology, and is data that can be mapped. Spatial data is often accessed, manipulated or analyzed through Geographic Information Systems (GIS).” LTER sites generate this kind of data, but they also hold collections of remotely sensed imagery, aerial photography, computer models, historical maps, visualizations, and study site locations. The GeoNIS will be able to incorporate all LTER data that have been referenced to real-world coordinates.

Summary of workflow:

The following is a brief summary of the workflow involved in creating metadata for spatial data. The complete workflow can be found at the project website (see the link to the project page at the end of this article). The final product will be a complete package of data and metadata products ready for inclusion in the LTER GeoNIS.

  1. Create metadata: FGDC or Esri format. The place to start is with a metadata editor for documenting the spatial data. Some commercial GIS software (ArcGIS, for example) has a built-in metadata editor; however, you can also use stand-alone tools. A listing of tools is maintained at the FGDC website: http://www.fgdc.gov/metadata/geospatial-metadata-tools . The choice of tool is less important than preserving as much detail as possible and following the LTER Best Practices for EML guidelines (http://im.lternet.edu/im_practices/metadata/guides) for titles, abstracts, methods, placement of URLs, etc.


  2. Export to FGDC or ArcGIS metadata xml file: The metadata tool should allow the user to export to different formats. The important point is that the documentation needs to be in xml format. One of the following two file types is needed for the next step:
    1. FGDC CSDGM xml file
    2. ArcGIS metadata xml file

  3. Customize esr102eml21.xsl stylesheet and prepare for transformation. The stylesheet is the tool you need to transform your xml metadata document into EML. You will need to edit the default stylesheet to meet the needs of your site. This will allow you to automate some of the repeating information and the machine-generated identifiers associated with your site. This editing is done with an xml editor such as Oxygen. The site Information Manager should be able to help with this step. A couple of edits that should be considered:
         3.a.  The Intellectual Rights EML section. This is where you can express the data usage policies. If the original metadata contains no such information, the stylesheet will populate the intellectual rights with the LTER Network Data Policies. If you think you need special policies, this is the section you need to edit.  Further guidance is found in the comments near the corresponding section of the stylesheet.
         3.b.  The scope for EML's package ID. Since the Esri and FGDC formats are completely oblivious to this identifier, it needs to be hard coded in the stylesheet.  Do your site a favor and change the scope to "knb-lter-yoursiteacronym", and you'll save yourself a bit of post-editing. The rest of the packageID, the revision and numeric identifier, requires post-edit work.

  4. Complete the transformation. Once your stylesheet has been updated, you can run the transformation to convert the metadata into an EML xml document. There are several options for running the transformation, and they are all documented on the project page. There might be some formatting errors at this stage that need to be corrected.
          4.a  The creator/metadataProvider/contact details.  Esri tends to lump the first, middle, and last names into one field and one tag, but EML has separate placeholders for first name (givenName) and last name (surName).  Since the XSLT cannot decide which part is the last name, it places the whole string into the mandatory last name. Please fix it accordingly. You may have to perform these edits in several places in the resulting EML. Use caution, as this will not be flagged as an error by any editor or validating tool.
          4.b  The identifier and revision part of the packageID.  You need to assign these numbers according to the Metacat LTER and site protocols.

  5. Run xml document through the EcoInformatics Parser and correct errors. The new EML xml file will need to be checked for errors using the EcoInformatics Parser available at: http://knb.ecoinformatics.org/emlparser/ . The parser will check your EML document and report any formatting errors, which you then go back and correct. It’s important to note that the parser looks only for formatting errors. It will not let you know if you have problems with your content (spelling, missing data, and incomplete entries).


  6. Prepare your data for the geospatial data package. The idea of data packaging is to prepare spatial data sets that can be harvested for ingestion into the GeoNIS geospatial database. Best practices for the contents of a geospatial data package are included in the Best Practices document.
    http://im.lternet.edu/sites/im.lternet.edu/files/Best%20Practices%20for%20documenting%20geospatial%20data_2.docx 

  7. Prepare final documents for harvest by Metacat. The data package needs to be placed in the location specified in your EML document. The EML document will be harvested by Metacat and the GeoNIS workflows will download the data package, unpack it, and add it to the GeoNIS geospatial database.
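
Several of the manual checks described in steps 3 through 7 (the packageID scope, identifier, and revision; full names lumped into surName; the location of the data package) can be partly scripted before the document is harvested. The following is a minimal Python sketch under stated assumptions, not part of the project's tooling: the function name review_eml is hypothetical, the packageID pattern assumes the form "knb-lter-siteacronym.identifier.revision", and the check for a data package location simply looks for any non-empty url element.

```python
import re
import xml.etree.ElementTree as ET

# Assumed packageId shape: scope.identifier.revision, e.g. "knb-lter-and.1234.1".
# Adjust the pattern to match your site and Metacat protocols.
PACKAGE_ID = re.compile(r"^knb-lter-[a-z]+\.\d+\.\d+$")


def local(tag):
    """Strip any XML namespace prefix of the form {uri} from a tag name."""
    return tag.rsplit("}", 1)[-1]


def review_eml(xml_text):
    """Return a list of warnings for things a validating parser will not flag."""
    root = ET.fromstring(xml_text)
    warnings = []

    # Step 3.b / 4.b: the scope, identifier, and revision must all be present.
    package_id = root.get("packageId", "")
    if not PACKAGE_ID.match(package_id):
        warnings.append(
            f"packageId '{package_id}' is not scope.identifier.revision")

    urls = []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "individualName":
            # Step 4.a: a surName containing spaces and no givenName
            # probably holds a full name that needs to be split by hand.
            parts = {local(c.tag): (c.text or "").strip() for c in elem}
            sur = parts.get("surName", "")
            if "givenName" not in parts and " " in sur:
                warnings.append(f"surName '{sur}' may hold a full name")
        elif name == "url" and (elem.text or "").strip():
            urls.append(elem.text.strip())

    # Step 7: the data package must be reachable at a location named in the EML.
    if not urls:
        warnings.append("no url element found for the data package")
    return warnings
```

To use it, read the contents of the transformed EML file and pass them to review_eml; each returned string points at something to fix by hand. The sketch only flags suspect entries and leaves the splitting of names to a person, since a script cannot reliably decide which part of a name is the last name any better than the XSLT can.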

Future plans:

Test the GeoNIS workflows to ensure that the data package meets requirements, and validate spatial data EML for quality. This is critical to ensure that the data and metadata can be ingested into a central repository, and that the spatial data is searchable through Network tools. Continue working with LTER information managers and GIS specialists to make sure that the Best Practices reflect the workflow and processes required to prepare valid EML documents for spatial data.

Create a stylesheet for Esri native metadata xml. The current transformation stylesheet should be modified to use native Esri metadata documents. This will remove the intermediate step of translating to FGDC format before moving to EML. This intermediate step can cause content to be dropped when using Esri metadata tools. The FGDC to EML translator is still valid for users who use non-Esri metadata tools and create FGDC CSDGM metadata files.

Automate update process. Most LTER sites would benefit from the development of a tool that would produce a new EML document when metadata is updated in ArcCatalog. This project would require funding for a programmer.

Look at other GIS metadata programs and metadata editors. The current effort has been primarily focused on working with Esri GIS software metadata tools, as most of the LTER sites have access to the software. There are a few sites that are using other programs, and a list of those resources would be beneficial.

Prepare for new ISO format. The FGDC move to an international format will cause some ripple effects through spatial metadata tools. We have seen some of this in the recent changes to Esri software as it becomes more ISO-centric. New versions of software will require updates to the stylesheet and best practices.

Link to Project page:

http://im.lternet.edu/project/GIS_document

References:

Wikipedia Metadata standards reference: http://en.wikipedia.org/wiki/Metadata_standards

Webopedia: http://www.webopedia.com/TERM/S/spatial_data.html

FGDC Content Standard for Digital Geospatial Metadata (CSDGM): http://www.fgdc.gov/metadata/geospatial-metadata-standards#csdgm

FGDC standards: http://www.fgdc.gov/metadata/geospatial-metadata-standards#fgdcendorsedisostandards