
Using the OBOE Ontology to Describe Dataset Attributes

Issue: Fall 2010

Margaret O'Brien (SBC)

Discussions have begun in the Network on the need to standardize dataset attributes where possible. This work matters because the attribute description is what dataset users focus on, and it is probably the most important part of a dataset when it comes to enabling integration. As we develop attributes, we should also consider linking them to our two emerging registries, Units and the Controlled Vocabulary. One way to accomplish such a linkage is through an ontology, a technology which is quickly becoming the basis of the Semantic Web. As we design our attribute registry, it would be advantageous to keep some features of an ontology in mind.

The power of an ontology comes from the relationships that can be defined. There are simple parent-child relationships, as in a taxonomic tree (“a copepod ‘is-a’ crustacean”), the periodic table (“a nitrogen isotope ‘is-a’ element”), or ad hoc relationships like “a lake ‘is-a’ water body”. Other, more subtle relationships can also be expressed, such as “a tree branch ‘is-part-of’ a tree”. OBOE stands for “Extensible Observation Ontology”; because it makes the observation central, it can be used for annotations at the attribute level. Following is a short introduction to one feature of the OBOE ontology which LTER could consider as we develop dataset attribute descriptions.

Dataset attributes can be defined as OBOE “Measurements” which have these four basic components: Entity, Characteristic, Standard, and Protocol.

1. Entity

 

The thing that was observed by the measurement. In the case of LTER, the Entity will often be one of the controlled vocabulary keywords, such as those listed below. An ontology also allows synonyms; in Fig. 1, "silicic acid" and "silicate" are equivalent.

  • Substances: ammonium, nitrate, bicarbonate, antimony, carbon
  • Habitats: aquatic ecosystem, forest, benthic, basins, arctic, clearcut
  • Organisms or their parts: bacteria, bark, beetle, arthropod, ascomycete, zooplankton, leaf
  • Concepts or processes: biodiversity, atmospheric deposition, carbon cycling, primary production

SBC OBOE entity example

Figure 1. Graph of an OBOE Entity tree for SBC substances.
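
The parent-child and synonym relationships described above can be sketched in a few lines of Python. This is only an illustration, not how OBOE itself stores terms: the term names, the `IS_A` and `SYNONYMS` tables, and the helper functions are all hypothetical.

```python
# Hypothetical entity hierarchy: each term points to its parent ("is-a").
IS_A = {
    "Ammonium": "Substance",
    "Nitrate": "Substance",
    "SilicicAcid": "Substance",
    "Substance": "Entity",
}

# Synonyms map alternate labels onto one canonical term.
SYNONYMS = {"silicate": "SilicicAcid", "silicic acid": "SilicicAcid"}


def canonical(label):
    """Return the canonical term for a label, resolving known synonyms."""
    return SYNONYMS.get(label.lower(), label)


def ancestors(term):
    """Follow 'is-a' links from a term up to the root of the tree."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if `candidate` is an ancestor of `term` (after synonym lookup)."""
    return candidate in ancestors(canonical(term))
```

With these tables, `is_a("silicate", "Substance")` is true because "silicate" first resolves to the same term as "silicic acid", and that term sits under "Substance".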

2. Characteristic

 

A property of the Entity that can be measured. Characteristics might be dimensionless, like a name or a type (Fig. 2). Physical characteristics (like amount, length or concentration) will have dimensions, and the characteristic will be tied to allowable units (see Standard, below).

SBC OBOE characteristic example

Figure 2. Graph of an OBOE Characteristics tree.

3. Standard

A reference for comparing measurements, e.g., from a dictionary of units, place names or taxa. Some measurements may have more than one standard; for example, a measurement of Ammonium Concentration may allow many choices of unit. The OBOE core ontology imported the units from the LTER Unit Dictionary in 2010 and added their dimensions. Other units may be added as needed. There is also a mechanism to record conversions between units, such as for ammonium from amount (moles) to mass (grams) (Fig. 3).

(a) SBC OBOE unit example
(b) SBC OBOE unit conversion example

Figure 3. Definition of the standard (in this case a unit) MilligramPerMeterSquaredPerDay (a), and a definition of the conversion between a substance, Ammonium, in moles to grams (b).
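
As a rough illustration of the amount-to-mass conversion in Fig. 3(b), the sketch below converts moles of ammonium to grams using an approximate molar mass. The table and function names are hypothetical, and the sketch assumes the conversion refers to the mass of the ammonium ion itself (NH4+), not the mass of nitrogen alone; the OBOE definition may differ.

```python
# Approximate molar mass of the ammonium ion (NH4+), in grams per mole:
# one nitrogen (about 14.007) plus four hydrogens (about 1.008 each).
MOLAR_MASS_G_PER_MOL = {"Ammonium": 14.007 + 4 * 1.008}


def moles_to_grams(substance, moles):
    """Convert an amount in moles to a mass in grams."""
    return moles * MOLAR_MASS_G_PER_MOL[substance]


def grams_to_moles(substance, grams):
    """Convert a mass in grams back to an amount in moles."""
    return grams / MOLAR_MASS_G_PER_MOL[substance]
```

Keeping the conversion factor attached to the substance, as the ontology does, means the same unit pair (moles and grams) can be reused for other substances by adding one table entry each.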

4. Protocol

The prescribed method for obtaining the measurement. A measurement can have only one Protocol. In ontologies, terms can have multiple parent terms, so a Protocol may belong to multiple trees. In the example, the Protocol for Dissolved Organic Carbon Concentration can be found under a laboratory ("Carlson Lab Protocols"), under "Protocols by Constituent", and under "Protocols for Elemental Analysis" (Fig. 4).

multiple parent classes for the Dissolved Organic Carbon Concentration Protocol

Figure 4. A protocol may belong to multiple trees.
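
A term with several parents turns the tree into a graph. The sketch below shows one way to collect every ancestor of such a term. The identifiers are adapted from the labels in Fig. 4, while the shared root `Protocol` and the `PARENTS` table are assumptions made for illustration.

```python
# Hypothetical parent table: a term may list more than one parent.
PARENTS = {
    "DissolvedOrganicCarbonConcentrationProtocol": [
        "CarlsonLabProtocols",
        "ProtocolsByConstituent",
        "ProtocolsForElementalAnalysis",
    ],
    "CarlsonLabProtocols": ["Protocol"],
    "ProtocolsByConstituent": ["Protocol"],
    "ProtocolsForElementalAnalysis": ["Protocol"],
}


def all_ancestors(term):
    """Collect every ancestor of a term, visiting each one only once."""
    seen = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS.get(parent, []))
    return seen
```

Because visited terms are tracked in `seen`, the shared root is reported once even though it is reachable through all three branches.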

As a final example, here is the full description of a measurement called "Concentration of Ammonium" (Fig. 5). In the SBC OBOE extension, it has two subclasses, for fresh water and saline water. If SBC adds another measurement of ammonium that requires a different protocol (e.g., in anoxic porewater), we could add a third subclass. Since the OBOE measurement description includes many of the same elements as an EML attribute, it is possible to map between the two systems. Not shown here are additional rules for precision and range, which are applied at the most granular level.

SBC OBOE measurement example

Figure 5. Description of the measurement "Concentration of Ammonium".
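
To make the four components concrete, here is a minimal sketch of a measurement record and a mapping onto an attribute-like dictionary. Everything in it is an assumption for illustration: the `Measurement` class, the protocol and unit names, and the dictionary keys, which are only loosely modeled on EML attribute elements and are not the EML schema. Precision and range rules are not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Measurement:
    name: str
    entity: str                  # the thing observed
    characteristic: str          # the property measured
    standards: Tuple[str, ...]   # allowable units; there may be several
    protocol: str                # exactly one protocol per measurement


# Two hypothetical variants that differ only in protocol.
FRESH = Measurement(
    "AmmoniumConcentrationFreshWater", "Ammonium", "Concentration",
    ("micromolePerLiter", "milligramPerLiter"), "FreshWaterAmmoniumProtocol",
)
SALINE = Measurement(
    "AmmoniumConcentrationSalineWater", "Ammonium", "Concentration",
    ("micromolePerLiter", "milligramPerLiter"), "SalineWaterAmmoniumProtocol",
)


def to_attribute(measurement, unit):
    """Map a measurement onto an attribute-like dict for one chosen unit."""
    if unit not in measurement.standards:
        raise ValueError(f"{unit} is not allowed for {measurement.name}")
    return {
        "attributeName": measurement.name,
        "attributeDefinition":
            f"{measurement.characteristic} of {measurement.entity}",
        "unit": unit,
        "method": measurement.protocol,
    }
```

A measurement in anoxic porewater would be one more `Measurement` instance with its own protocol, mirroring the third subclass described above.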

Many communities in the life and physical sciences have designed ontologies for their specific needs, and have found semantic technology to be very powerful. As the LTER Network considers the uses and requirements of attribute standardization, it would be advantageous to consider how an ontology might be used to connect our attributes to our other registries. There are several advantages to choosing to work with an extensible ontology such as OBOE. Since 2008, the Semtools project has been working with an LTER site’s EML datasets as a use case for an OBOE extension and for tool development. The examples here are from SBC’s OBOE extension, which has been developed with the awareness that many of its concepts and terms have broad applicability. A related project, the Semantic Observations Network (SONet), aims to develop and ratify a community-driven core ontology for representing observational data, and is considering several ontologies and observational models (including OBOE). By working with one or both of these groups, LTER could maximize its success with this emerging technology.

For more information: