Managing Controlled Vocabularies with "TemaTres"

Issue: Spring 2011

John Porter (VCR)

The Controlled Vocabulary Working Group has been developing a controlled vocabulary of science keywords for use in LTER datasets. As part of the process, the working group evaluated software for managing controlled vocabularies and higher-level structures (polytaxonomies, thesauri, and ontologies). A surprisingly wide variety of software is available, much of it designed for large institutional use and correspondingly quite expensive (in the tens of thousands of dollars per year). Needless to say, we focused primarily on low-cost or open-source software. Three major candidates were considered. The "Protege" ontology management software was the most comprehensive, supporting the widest array of lexical structures. However, its large number of features also made it harder to learn and was overkill for the relatively simple goal of developing a polytaxonomy or thesaurus. The "MultiTes" software is commercial, with a single PC license going for roughly $250 (much more expensive enterprise versions are also available). It had the appeal of also being used by the National Biological Information Infrastructure (NBII) Thesaurus project. The final software package we evaluated was TemaTres, an open-source, web-based thesaurus management package using PHP and MySQL that came from a library school in Argentina. Ultimately, the relative simplicity and web accessibility of the TemaTres interface, combined with a rich set of underlying capabilities, carried the day.

As we have come to use TemaTres for a greater variety of tasks, we continue to be impressed by its flexibility and features. These include a simple but functional user interface for editing and browsing, sophisticated search capabilities, and the ability to export all or part of the thesaurus in a number of standardized forms. Although we have not yet used the export capability extensively, the availability of good export forms was a key feature, because it allows us to easily move data from TemaTres into other tools, such as MultiTes or Protege, if we choose to do so in the future. TemaTres also supports a more rudimentary import capability, accepting taxonomies in tab-indented files (although the import fails if there are trailing tabs) or terms in the SKOS XML schema. The main challenges we have encountered are that some of the menus remain in Spanish even when English is selected as the interface language, and that the documentation of some of the more advanced features (such as the ability to link terms between two different vocabularies) is very spotty.
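
Because the tab-indented import fails on trailing tabs, it can help to clean a file before loading it. The following is a minimal sketch in Python, not part of TemaTres itself; the function name `clean_tab_indented` is hypothetical, and dropping whitespace-only lines is an assumption about what the importer prefers.

```python
def clean_tab_indented(text: str) -> str:
    """Remove trailing tabs/spaces from each line of a tab-indented taxonomy.

    Leading tabs are preserved because they encode the hierarchy level.
    Lines that contain only whitespace are dropped (an assumption, not a
    documented TemaTres requirement).
    """
    cleaned = []
    for line in text.splitlines():
        stripped = line.rstrip()  # strips trailing whitespace only
        if stripped:
            cleaned.append(stripped)
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    # Made-up example terms; note the stray trailing tabs.
    raw = "ecosystems\t\n\tforests\t\t\n\t\twetlands\n"
    print(repr(clean_tab_indented(raw)))
```

The cleaned text can then be saved and supplied to the import form in place of the original file.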

Importantly, TemaTres features a rich set of web services that provide searching and retrieval capabilities for external programs (http://www.r020.com.ar/tematres/wiki/doku.php?id=:en:tematres:web_services_terminologicos). Several programs work quite well with TemaTres, specifically "VisualVocabulary", which provides a graphical view of the structure of taxonomies; "TemaTresView", which provides a JavaScript-based interface for browsing; and "Thesaurus Web Publisher", which provides browsing and formatting capabilities coupled to configurable searches using other search engines.
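
To give a sense of how an external program might call these web services, here is a minimal Python sketch that only builds a request URL. It assumes the services are reached through a `services.php` script that takes `task` and `arg` query parameters; those names should be checked against the web services documentation linked above, and the base URL in the example is a placeholder rather than a real server.

```python
from urllib.parse import urlencode


def build_service_url(base_url: str, task: str, arg: str) -> str:
    """Build a web-service request URL for a TemaTres vocabulary.

    Assumption: the endpoint is '<base_url>/services.php' and accepts
    'task' and 'arg' query parameters. Verify against the TemaTres
    web services documentation before relying on this.
    """
    query = urlencode({"task": task, "arg": arg})
    return f"{base_url.rstrip('/')}/services.php?{query}"


if __name__ == "__main__":
    # Placeholder base URL; 'search' is an assumed task name.
    print(build_service_url("http://example.org/vocab", "search", "soil moisture"))
```

Fetching and parsing the response is left out here because the response format depends on the TemaTres version and the task requested.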

With the help of the LTER Network Office, the Controlled Vocabulary Working Group has established a "vocabulary server" for the LTER-wide vocabulary at http://vocab.lternet.edu/vocab, including the supporting visualization software (e.g., http://vocab.lternet.edu/thesauruswebpublisher, http://vocab.lternet.edu/visualvocabulary/lter/ and http://vocab.lternet.edu/TematresView/view_thesaurus.php). Some LTER sites have also requested access to the TemaTres software for developing additional controlled vocabularies for site-specific purposes. Fortunately, setting up a new controlled vocabulary in TemaTres is extremely simple, so site requests can be easily accommodated (e.g., http://vocab.lternet.edu/vocab/luq). Additionally, although we are only beginning to fully understand how to use them, the web services include capabilities for linking terms between different vocabularies, so that changes to LTER-wide and site-specific vocabularies can be more easily coordinated. The simple structure of the MySQL database underlying TemaTres also makes it relatively easy to harvest keywords from multiple sites for the purpose of identifying new candidate terms for the LTER-wide controlled vocabulary.
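
As an illustration of the keyword-harvesting idea, the Python sketch below ranks keywords that are used at several sites but are missing from the LTER-wide vocabulary. It assumes the keywords have already been extracted from each site's database into plain lists; the function name `candidate_terms`, the `min_sites` threshold, and the sample data are all hypothetical and do not reflect the working group's actual procedure.

```python
from collections import Counter


def candidate_terms(site_keywords, vocabulary, min_sites=2):
    """Return (keyword, site_count) pairs for keywords not in the vocabulary.

    site_keywords: dict mapping a site code to a list of keyword strings.
    vocabulary: iterable of terms already in the controlled vocabulary.
    min_sites: minimum number of sites that must use a keyword.
    Matching is case-insensitive and ignores surrounding whitespace.
    """
    known = {term.strip().lower() for term in vocabulary}
    counts = Counter()
    for keywords in site_keywords.values():
        # Use a set so each keyword counts at most once per site.
        for keyword in {k.strip().lower() for k in keywords}:
            if keyword and keyword not in known:
                counts[keyword] += 1
    selected = [(kw, n) for kw, n in counts.items() if n >= min_sites]
    # Most widely used first; ties broken alphabetically.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up keywords for illustration only.
    sites = {
        "VCR": ["Salt Marsh", "sea level", "nitrogen"],
        "LUQ": ["hurricanes", "sea level", "Nitrogen"],
        "AND": ["sea level", "hurricanes", "old growth"],
    }
    print(candidate_terms(sites, vocabulary=["nitrogen"]))
```

Counting sites rather than total occurrences keeps a single site with many datasets from dominating the candidate list.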