
Using EML in Your Local Data Catalog

Issue: Spring 2010

Margaret O'Brien (SBC)

All LTER sites create EML to contribute to the LTER network, but it appears that only a few use their EML in their own data catalogs (see below for links). It is possible to use the EML you contributed to the network to provide content for your local data catalog. The benefits are obvious: your local data catalog will show the same information as the Network's, and it is less work to use an existing metadata resource than to create another. All modern web-scripting languages have libraries for reading and transforming XML for web page content. The process described here is similar to the display of projectDB documents using eXist (Sheldon, 2009).

In 2003, the PISCO project installed a Metacat data catalog at UC Santa Barbara, and SBC became an "early adopter." We have continued to use the PISCO Metacat as our primary data catalog, and our EML was replicated to the KNB and LTER Metacat servers. Needless to say, SBC enjoyed the control and learning environment that this relationship provided. But recently, to reduce load on the PISCO IM systems, SBC began employing the LTER Network Office Metacat installation as its primary metadata repository. As part of that process, we created our own EML-display application instead of using a pre-installed Metacat "skin".

In a nutshell, we needed three things:

  • Access to EML documents through a URL
  • Local XSL stylesheets to transform the XML into HTML
  • A transformation routine, which could be PHP, Perl, ASP, JSP, Java, etc.

In detail:

  • XML document
    This step is already done for you: in addition to delivering HTML, Metacat can deliver your EML as XML through a URL like this:
    http://metacat.lternet.edu/knb/metacat?action=read&qformat=xml&docid=knb-lter-sbc.5

    The URL contains parameter-value pairs to designate the requested Metacat action ("action"), the output format, in this case XML ("qformat"), and the document ("docid"). For the current purposes, "action" and "qformat" are fixed to "action=read" and "qformat=xml"; only "docid" changes.
  • XSL transformation stylesheets
    As part of our association with PISCO, SBC was already using a set of XSL stylesheets which were more modular than those shipped with Metacat and used by the LNO. As part of this project, SBC's stylesheets were further adapted to use a "tabbed" format for the major metadata sections. The XSL uses parameters ("<xsl:param>") to control which parts of the EML document are displayed. There is one main stylesheet (e.g., "eml-2.0.1.xsl") which handles the calls to templates for other modules. These stylesheets are fairly generic and easily portable (see below).
  • Transformer
    A web scripting language is needed to conduct the transform (passing on parameter values as necessary) and to send the resulting HTML to the browser. The basic steps are:
    1. Read in the XML source
    2. Read in the XSL stylesheet
    3. Transform the XML with the XSL and send the results to the browser
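Step 1 starts from the Metacat URL described in the first item above. As a minimal sketch in Python (not the Perl used for SBC's project), that URL can be assembled from a document identifier. The function name and constant are hypothetical; the endpoint and the fixed "action" and "qformat" values are taken from the example URL in the text.

```python
from urllib.parse import urlencode

# Base endpoint copied from the example URL in the article.
METACAT_BASE = "http://metacat.lternet.edu/knb/metacat"


def metacat_xml_url(docid: str, base: str = METACAT_BASE) -> str:
    """Return the URL that asks Metacat for one EML document as XML."""
    # "action" and "qformat" are fixed for this use; only "docid" varies.
    query = urlencode({"action": "read", "qformat": "xml", "docid": docid})
    return f"{base}?{query}"


print(metacat_xml_url("knb-lter-sbc.5"))
```

Keeping the base address as a default argument means another site could point the same function at a different Metacat server without editing the function body.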
Fig. 1. Sample code for reading XML and XSL documents, and transforming to HTML using Perl CGI.

Perl was used for SBC's project, but nearly all modern web-scripting languages include an XML library. Figure 1 shows sample code to process an XML document in Perl. This script operates on a URL which resembles: http://sbc.lternet.edu/cgi-bin/showDataset.cgi?docid=knb-lter-sbc.10
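A display script of this kind has to pull the "docid" value out of the request before it can fetch anything. The following Python sketch shows one way that might look. The function name and the validation pattern are assumptions made for illustration; they are not taken from SBC's Perl script or from Metacat's own rules for identifiers.

```python
import re
from urllib.parse import parse_qs

# Hypothetical pattern: a scope, an identifier, and an optional revision,
# for example "knb-lter-sbc.10" or "knb-lter-sbc.10.2".
DOCID_PATTERN = re.compile(r"[A-Za-z0-9-]+\.\d+(\.\d+)?")


def docid_from_query(query_string: str) -> str:
    """Extract the "docid" value from a CGI query string and check its shape."""
    values = parse_qs(query_string).get("docid", [])
    # Reject requests with no docid, several docids, or an unexpected format.
    if len(values) != 1 or not DOCID_PATTERN.fullmatch(values[0]):
        raise ValueError("missing or malformed docid")
    return values[0]


print(docid_from_query("docid=knb-lter-sbc.10"))
```

Checking the value before using it keeps arbitrary text from the browser out of the URL that the script later sends to the repository.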

In practice, the program variables are likely to be configured outside of the script. Only one parameter ("docid") is passed in this sample. Since EML documents can be quite long and complex, the XSL stylesheets also use several display parameters, which the transforming script will need to handle as well. Your XML library will have instructions for passing parameters into the transformation.
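The three basic steps, plus the passing of a display parameter, can be sketched in Python as follows. This assumes the third-party lxml library rather than the Perl XML library used for SBC's project. The tiny XML document, the stylesheet, and the parameter name "displaymodule" are stand-ins invented for illustration so that the example runs without network access; they are not SBC's stylesheets or a real EML record.

```python
from lxml import etree  # third-party library; an assumption, not the author's toolchain

# Stand-in for an EML document (real EML is far larger and namespaced).
SAMPLE_XML = b"""<dataset>
  <title>Example dataset title</title>
  <creator>Example creator</creator>
</dataset>"""

# Stand-in stylesheet; "displaymodule" is a hypothetical parameter name.
SAMPLE_XSL = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:param name="displaymodule" select="'summary'"/>
  <xsl:template match="/dataset">
    <div>
      <xsl:choose>
        <xsl:when test="$displaymodule = 'people'">
          <p><xsl:value-of select="creator"/></p>
        </xsl:when>
        <xsl:otherwise>
          <h1><xsl:value-of select="title"/></h1>
        </xsl:otherwise>
      </xsl:choose>
    </div>
  </xsl:template>
</xsl:stylesheet>"""


def transform(xml_bytes: bytes, xsl_bytes: bytes, **params: str) -> str:
    """Apply an XSLT 1.0 stylesheet to an XML document and return the output text."""
    xml_doc = etree.fromstring(xml_bytes)                  # step 1: read the XML source
    stylesheet = etree.XSLT(etree.fromstring(xsl_bytes))   # step 2: read the XSL stylesheet
    # strparam() wraps a Python string so XSLT receives it as a string value
    # instead of evaluating it as an XPath expression.
    quoted = {name: etree.XSLT.strparam(value) for name, value in params.items()}
    return str(stylesheet(xml_doc, **quoted))              # step 3: transform


print(transform(SAMPLE_XML, SAMPLE_XSL))                           # default view
print(transform(SAMPLE_XML, SAMPLE_XSL, displaymodule="people"))   # parameter selects another view
```

In a real deployment the two byte strings would come from the repository URL and from stylesheet files on disk, and the returned HTML would be written to the browser after the appropriate Content-Type header.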

Figure 2 is a screenshot of the output for an SBC EML dataset, which shows a view of the default "Summary" tab. The stylesheets also contain <xsl:include> statements for other page components, such as headers. As a test of portability, the MCR site has already installed both the Perl script and XSL stylesheets (Fig. 3, showing the "responsible-parties" modules). The configurations for application host and other parameters are set in a settings file called "eml-settings.xsl", also called by an <xsl:include>. MCR chose to further customize the display by editing the default CSS file.

Fig. 2. Screenshot of SBC LTER's "Summary and Data Links" view of a dataset using new XSL stylesheets.

Fig. 3. Screenshot of an MCR LTER dataset with the "People and Organizations" view using SBC's XSL stylesheets and Perl CGI.

Since the script calls its XML source from the central LTER repository, any EML document in that repository can be transformed through this display. Simply replace the "docid" value in the URL above to view a dataset from your site through SBC's interface. During testing, non-SBC EML was viewed regularly in an effort to ensure that the stylesheets remained broadly applicable.

The current XSL stylesheets for EML that ship with Metacat are somewhat basic, and many of us have expressed a desire to see a more modular or tabbed display. PISCO and SBC intended to extend their stylesheet project, but lacked the resources to complete it.

Although the stylesheets are currently located in SBC's code repository, they could be further developed as a network resource and kept centrally. With input from our site scientists and other users, a group could streamline and standardize them according to recommended practices, contribute them to a broader community, and possibly offer them as another choice among the Metacat "skins".

A Sample of LTER site data catalogs which use EML:

References

Sheldon, W. 2009. Getting started with eXist and XQuery. Databits, Spring 2009.

Acknowledgements

Chris Jones (PISCO) did more than 90% of the work on the original PISCO/SBC stylesheets in 2005, which are still available as a Metacat "skin".