
Fall 2002

Featured in this issue:

Featured in this issue are two articles about wireless technology, a description of a minimalist approach to creating dynamic web pages for displaying database content, and a seasoned user's discussion of implementing ArcIMS. Other articles describe the innovative GCE Data Toolbox for Matlab and the future of the All-Site Bibliography.

DataBits continues as a semi-annual electronic publication of the Long Term Ecological Research Network. It is designed to provide a timely, online resource for research information managers and is produced under a rotating co-editorship. It is available through web browsing as well as hardcopy output. The LTER mail list IMplus receives notification of each DataBits publication. Others may subscribe by sending email to databits-request@lternet.edu with the two lines "subscribe databits" and "end" as the message body. To communicate suggestions, articles, and/or interest in co-editing, send email to databits-ed@lternet.edu.

----- Co-editors: Kristin Vanderbilt (SEV), Tim Bergsma (KBS)

Featured Articles


GCE Data Toolbox for Matlab® -- Platform-independent tools for metadata-driven semantic data processing and analysis

- Wade Sheldon, Georgia Coastal Ecosystems LTER (GCE)

In the Spring 2001 issue of DataBits I described the initial development of the GCE Data Toolbox, an integrated set of Matlab functions for dynamic analysis and documentation of tabular data sets stored in a standardized data structure format (see References). Development has continued steadily on these tools since then, and a suite of graphical user interface (GUI) applications was recently added to provide convenient access to most of the capabilities of the toolbox for users unfamiliar with the Matlab programming language. These GUI applications use standard menus, graphical controls, and dialog boxes for input and are compatible with any operating system that supports the Matlab environment, including Microsoft Windows (9x, NT, 2000, XP), Linux, Solaris, and Macintosh OS X. Complete descriptions of the functions and screenshots of the GUI applications are now available on the GCE website (http://gce-lter.marsci.uga.edu/lter/research/tools/data_toolbox.htm), so the remainder of this article will focus on recent innovations in metadata-driven semantic data processing, potential applications of this technology, and plans to add support for metadata stored in Ecological Metadata Language 2 format (see http://knb.ecoinformatics.org/software/eml/).

Semantic Data Processing

The guiding design philosophy of the GCE Data Toolbox is that ecological metadata should remain inextricably linked to the data set it describes, and that the metadata should be used to determine which operations are appropriate for any given data value based on the type of information it represents. I refer to this philosophy as metadata-driven semantic data processing.

One of the most important roles that semantic processing technology can serve is to protect the validity of data and calculations throughout all processing steps. For example, GCE Data Toolbox functions query column descriptor metadata for operations ranging from generating formatting instructions for numerical display and export, to determining which statistical procedures are appropriate when aggregating and summarizing data sets, to confirming column compatibility when merging and joining multiple structures, to validating new entries made in the data editor. This semantic approach greatly reduces the potential for data contamination compared to the standard spreadsheet and database analyses often used by ecologists, because those programs either freely coerce values between disparate formats or protect values based only on gross data format characteristics. For instance, a relational database query combining temperature data in °C with data in °F would not generate an error as long as the column data types are compatible; in contrast, these columns could not be joined or merged by GCE Data Toolbox functions unless the units were first standardized.
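
The difference is easy to sketch. The fragment below is purely illustrative (the function and field names are hypothetical, not actual GCE Data Toolbox code); it shows the kind of unit check implied above, in which a merge consults each column's units descriptor before combining any values:

    function merged = merge_columns(col1, col2)
    %MERGE_COLUMNS Combine two data columns only if their unit metadata agrees.
    %  Hypothetical sketch -- not actual GCE Data Toolbox code.  Each input is
    %  assumed to be a structure with 'units' and 'values' fields.

    if ~strcmp(col1.units, col2.units)
       % a unit mismatch (e.g. degC vs. degF) blocks the merge outright
       error(['Incompatible units: ' col1.units ' vs. ' col2.units ...
          ' -- standardize units before merging'])
    end

    merged = col1;
    merged.values = [col1.values ; col2.values];

A spreadsheet, by contrast, would simply stack the two columns and leave the mismatch for someone to notice later.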

Semantic processing based on metadata also supports intelligent application automation. Many GCE Data Toolbox functions and GUI applications automatically identify candidate data columns without user intervention, based on data storage type, variable type, and numerical type metadata descriptors and column units. Examples include date/time and geographic coordinate inter-conversions and column unit conversions. This approach is also used in the data mapping application to identify georeference columns by column name and variable type and then automatically project between coordinate systems, when necessary, based on column units. Automatic unit conversion capability will also be added to relational join and merge functions in the near future, further simplifying the creation of synthetic data sets without loss of validity.

An important prerequisite for semantic processing, of course, is that metadata information remain synchronized with the data it describes. This is a major challenge when data is processed using disparate programs, but is accomplished automatically and transparently by GCE Data Toolbox functions. All data structure changes are logged by date to a history field, preserving the full context of all data processing. Column descriptors are also dynamically updated each time metadata is displayed or exported, and value codes are automatically generated and documented in the metadata when text fields or QA/QC flags are encoded as numerals for export in formats that don't support mixed alphanumeric characters (e.g. Matlab numerical matrices). Toolbox functions also add ancillary processing information to relevant metadata fields when appropriate, such as documentation of equations used for automatic unit conversions in the Calculations field of the Data section. These steps ensure that metadata remains relevant and useful regardless of how many updates or transformations have been performed on a given data set.

Potential Applications

The GCE Data Toolbox was primarily developed to process and package GCE-LTER data for automated analysis and distribution, but many other uses of this technology are possible. Metadata-based semantic mediation permits highly specific processing based on simple generic commands, making these tools ideal for many automated batch processing tasks. The dual user interface design (command-line and GUI) and platform-independence of the Matlab language also provide broad compatibility and flexibility.

Generic data import filters are provided for parsing delimited ASCII files and arrays and matrices stored in Matlab binary files, and data can also be imported directly from relational database tables, views and stored procedures via SQL queries (requires the optional Matlab Database Toolbox). Metadata can be imported from tokenized headers on ASCII files along with the data or manually entered into a GUI editor. The toolbox also includes support for user-editable metadata templates, in which column descriptor and general metadata are defined in advance and then applied to newly imported data by matching column names, data types and units with template entries. Multiple data and metadata export formats are supported as well, including delimited ASCII, CSV, Matlab (both arrays and matrices), and table insert/update for SQL databases. This combination of import and export capabilities, metadata template support, and automated analyses should allow these tools to be used in a very wide range of data acquisition, processing and presentation applications.

Recent enhancements to Matlab itself, such as seamless integration with Java classes and data types (introduced in version 6) and the addition of native XML and XSLT support, network data access, and timer objects (introduced in version 6.5), also open up many exciting possibilities. For example, the new Matlab 'urlread' and 'urlwrite' functions allow data to be retrieved from any data store accessible by URL using standard HTTP GET and POST requests. Together, the urlwrite and timer functions allow fully functional, automated data harvesters to be programmed with a few lines of code once import scripts and metadata templates are defined for the data source. Just such a harvester was recently implemented for meteorological and hydrographic data from the USGS real-time monitoring station at Meridian, Georgia, and similar harvesters will soon be implemented for the NOAA NDBC buoy off Sapelo Island and the USGS gauging station on the Altamaha River. The harvester imports and validates raw data, applies metadata via template, performs QA/QC flagging based on criteria specified in the metadata for each column, performs bulk English-to-metric unit conversions, automatically adds a serial date column calculated from individual date component columns, archives the raw and processed data, appends the new data to a cumulative data structure, removes any duplicate entries from overlapping harvests, and regenerates weekly and monthly plots of key parameters. All of the operations listed above are implemented with a few generic toolbox commands that can easily be edited and reused for other data sources.
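
As a rough illustration of this pattern (the URL, file names, and processing function below are placeholders, not the actual GCE harvester code), a single harvest cycle reduces to a short function built around urlwrite:

    function run_harvest
    %RUN_HARVEST One harvest cycle: fetch, archive, and process new raw data.
    %  Hypothetical sketch of the pattern described above -- the URL and the
    %  process_raw_file function are placeholders, not GCE code.

    url = 'http://waterdata.usgs.gov/nwis/some_station';   % placeholder data source
    rawfile = ['raw_' datestr(now,30) '.txt'];             % time-stamped archive name

    urlwrite(url, rawfile);        % retrieve and archive the raw data
    process_raw_file(rawfile)      % placeholder: import, apply template, QA/QC, plot

Scheduling it is then a one-liner using a timer object, e.g. start(timer('TimerFcn','run_harvest','Period',3600,'ExecutionMode','fixedRate')) for an hourly harvest.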

Support for EML

Now that the LTER Information Managers have collectively agreed to support Ecological Metadata Language (EML) version 2 as a network-wide metadata exchange standard, a question that naturally follows is how EML support can be added to existing metadata-based tools such as the GCE Data Toolbox. The metadata standard used by the toolbox is nominally based on the content standard for non-geospatial metadata recommended in the 1995 Ecological Society of America Committee on the Future of Long-term Ecological Data report, as described in Michener (2000; see References). Primary data descriptor fields, such as column name, units, description, data storage type, variable type, numerical type and precision, are stored in dedicated structure fields and managed along with the data columns themselves. The remainder of the metadata is stored in a parsable three-column array of section names, field names, and field values, which can be searched and updated by toolbox functions, manually edited in a GUI application, and formatted for display using a simple style language.

This metadata scheme is flexible and extensible, but differs from EML 2 in two important ways: 1) a much flatter hierarchy of fields, designed to store child elements primarily as preformatted blocks of text; and 2) lower granularity of some elements, such as person names and geographic information. Both of these differences stem from the fact that the toolbox metadata standard is optimized for final storage and formatting of metadata derived from another primary source, such as a relational database management system, with basic support for parsing and searching so that metadata from multiple structures can be updated and merged. In contrast, EML 2 is designed to store a much wider range of metadata information of varying complexity in a more modular fashion.

With these differences in mind, the natural first step towards providing EML 2 support is to develop an import filter that parses EML documents, extracts the column descriptors, and formats the remainder of the metadata to support a simpler schema based on sections and fields as described above. Now that Matlab natively supports XML and XSLT, this can be accomplished by writing XSLT templates that convert EML documents to tokenized text headers already supported by the ASCII import filter. Work on this approach is already planned, and will coincide with efforts to develop EML support for GCE metadata stored in the GCE metadata database. Plans to add support for exporting EML from the GCE Data Structures are less certain, but will be explored as these other technologies are implemented. For the time being, experimental XML metadata export functions have been developed which may be extended to provide limited EML export support in the future.
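
In outline, the planned import path could look like the following (the stylesheet and file names here are hypothetical; none of this is released toolbox code):

    % Hypothetical sketch of the planned EML import path; the stylesheet and
    % file names are placeholders, not released GCE Data Toolbox code.

    emlfile    = 'dataset_eml.xml';     % EML 2 metadata document
    stylesheet = 'eml2header.xsl';      % XSLT: EML -> tokenized text header
    headerfile = 'dataset_header.txt';  % header readable by the ASCII import filter

    xslt(emlfile, stylesheet, headerfile);   % native XSLT support in Matlab 6.5

    % the generated header would then be prepended to the data file and parsed
    % by the existing tokenized-header ASCII import filter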

Conclusion

The GCE Data Toolbox for Matlab has proven extremely useful in all phases of data acquisition, processing, analysis, presentation and distribution at the GCE LTER site during the past two years. With the addition of easy-to-use GUI applications, program documentation, and planned support for standard EML metadata, we hope other sites will be able to benefit from these efforts as well.

References

Michener, William K.  2000.  Metadata.  Pages 92-116 in: Ecological Data - Design, Management and Processing.  Michener, William K. and James W. Brunt, eds.  Blackwell Science Ltd., Oxford, England.

Sheldon, W.M.  2001.  A Standard for Creating Dynamic, Self-documenting Tabular Data Sets Using Matlab®.  DataBits: An electronic newsletter for Information Managers. (http://intranet.lternet.edu/archives/documents/Newsletters/DataBits/01spring/)

LTER All-Site Bibliography 2002 - Update

- James W. Brunt and Troy Maddux, LTER Network Office (NET)

The first serious discussion of an LTER all-site bibliography (ASBIB) took place at the Information Management Committee meeting in Toronto, Canada in 1989. At that time, the task of building a network-wide compilation of LTER site publication information was daunting to most site information managers because of the heterogeneity of site solutions for storing and managing bibliographic data. Caroline Bledsoe, working then with the LTER Network Office in Seattle, proposed that IMs create the LTER All-Site Bibliography as a group project. After the waves of excitement for this new proposal calmed, Caroline Bledsoe and Harvey Chinn pursued implementation of this vision with funding from the LTER Network Office. The vision has evolved, as in punctuated equilibrium, with periods of relative inactivity after some major jumps. However, to date, none of the implementations has been maintainable without a lot of work by site and network IMs. At present, we are standardizing on a proprietary end-user software package to take advantage of its web-posting solution. The problem is that neither this nor the previous solution has had the power of an open relational database management system behind it: a big drawback when attempting to provide any value-added components to this very useful data set. We are now at the point where storage and maintenance of an all-site bibliography is much more feasible; we can do it with minimal impact on the sites and in some cases actually provide a valuable service to the sites. But getting there has been a trial.

The Bledsoe/Chinn model:

Harvey Chinn programmed custom UNIX shell and awk scripts to parse each of the (then) 18 LTER site bibliography formats. Each format was text, which meant that some sites were already converting from some other format to a unique text format - the first source of divergence. The scripts converted each of these unique text formats to a "standard" format that was really a laundry list of attribute-value pairs from all the sites. This wasn't a bibliographic standard into which the entries were being placed but a loosely standardized list that included numerous ambiguities - a second source of divergence. Sites would submit their bibliographies to the central site via email or ftp, and then the scripts for each site would be run to produce the "LTER" bibliographic format. All 18 of the common files were then indexed together with WAIS - a pre-WWW information service product. The indexing produced a bibliography product that was searchable via Gopher and later via the WWW. This system is documented in Chinn and Bledsoe (1997).

The Current Model:

The all-site bibliography is currently housed in an EndNote library file compiled from all of the bibliography entries sent to the Network Office by all 24 sites. The decision was made to use EndNote because it was already being used by a majority of sites and it allowed LTER references to be deployed to the web in a relatively quick and painless manner. Sites that didn't have EndNote were provided with a copy by the LTER Network Office. Reference Web Poster (RWP) is the software package that handles the web display, which can be found on the intranet page http://intranet.lternet.edu under All-Site Bibliography. EndNote files are processed by hand and concatenated into a master list, which is pulled into RWP. The search capabilities in RWP are limited, and you cannot query the library for accounting and statistical purposes. The only way to separate sites is to use one of the EndNote fields to encode the site information.

The Emerging Model:

Taking a strategic look at the needs for an all-site bibliography reveals a variety of potential and actual uses:

  • Accounting of publications (lingua franca) for the LTER Network Program
  • Facilitation of cross-site and synthetic studies
  • Generating new interest in LTER sites and LTER research

In addition, there is a need for some features useful to sites and other organizations, including:

  • Site/organization-specific searching and exporting
  • Turn-key bibliography for site information management systems (See Box 1)

Inherent in these goals is the challenge of maintaining integrity in the information and not forcing sites to do anything special with their data (e.g., encoding their site name in certain fields).

The emerging model for the all-site bibliography takes these needs and challenges into account by providing a variety of input and output channels organized around a flexible relational database infrastructure (Figure 1). A user can manage and manipulate a site bibliography entirely through the web interfaces, maintain it in their own relational database management system (exposing the bibliography through Web Services), or use EndNote - or other software producing an EndNote export file - in a variety of configurations.

Back End: The new model for the LTER All-Site Bibliography is to house the data in a SQL Server database management system that reproduces the EndNote format exactly; sites can still use EndNote as their desktop bibliography application.

Front End: The web interfaces to the bibliography database are being built for:

  1. Single entry of a reference
  2. Bulk upload of numerous references
  3. Updating an existing reference in the database
  4. Deleting a reference from the database

ASBIB Interfaces

The Single Entry Page. The script that generates this page will ask you to select the reference type (e.g. book, journal article) you are entering and then will dynamically generate the page with the fields associated with that reference type in EndNote. Next you will be asked to select your LTER site from a pull-down list. The script will check to see whether the reference you are entering shares the same name, author, date, and site as an existing record in the database. If it does, you can change the record that is in the database (if you have permission); you can leave the existing reference record and add another site association to it so that it will show up with your site's references; you can discard the pending reference; or you can put the pending reference in the database as a new record.

The Bulk Upload Page. This script will allow you to select a properly formatted file from your local computer and upload it to the database. A report is returned to the screen with the number of records entered and number of duplicates. Duplicates are displayed on the screen and the questions mentioned above will have to be answered before they are processed. Eventually, this will also handle the configuration for dynamic database connections.

The Update/Delete Page. After proper authentication, this script will display information about records selected for update or deletion. If you are satisfied that you are looking at the right reference you can answer yes to proceed with the update or delete.

The Search/View Page. The script that generates this page is patterned loosely after the Reference Web Poster interface but will make searching easier and more full-featured. You will enter a search string and will be able to specify which fields you want to search; or you can choose your site from a pull-down list to get a quick list of all of your site's publications.

Figure 1. Diagrammatic representation of the planned LTER All-Site Bibliography

Status: The new database is 95% complete - we're loading the current 16,000 entries and testing various configurations. This is being done with a Perl program that will become the central component of the on-demand bulk upload page after the initial loading of the database. The Perl program, "asbibloader", is complete, but we are re-writing it to take advantage of some new Perl features. Asbibloader contains the quality assurance component that checks to see whether a record already exists in the database, and we are tweaking this feature to make it more robust. In addition, it parses authors out into a separate table to allow for faster search indexing and for links from LTER researchers to publications and vice-versa. We currently receive authors in many different forms: one author to a line; all authors on one line separated by commas; all on one line with commas and an "and" before the last author; all on one line delimited by semicolons; and so on. Sometimes the forms are even mixed within one set of references. Parsing these without losing data has been our most recent challenge. We are hoping that providing this feature back to the user will result in a gradual homogenization of method.
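
To make the parsing problem concrete, the splitting logic for the unambiguous cases amounts to something like the sketch below (illustrative only, and not the Perl code in asbibloader; purely comma-delimited lists are left alone because commas also separate surnames from initials):

    function authors = split_authors(str)
    %SPLIT_AUTHORS Split a raw author field into one name per cell array entry.
    %  Illustrative sketch only -- not the Perl code used in asbibloader.
    %  Handles the newline-, semicolon-, and " and "-delimited forms described
    %  above; purely comma-delimited lists are ambiguous and returned unsplit.

    str = regexprep(str, '\s+and\s+', ';');            % treat " and " as a delimiter
    str = strrep(str, sprintf('\n'), ';');             % one-author-per-line form
    authors = regexp(str, ';', 'split');               % split on semicolons
    authors = regexprep(authors, '^\s+|\s+$', '');     % trim whitespace around names
    authors = authors(~cellfun('isempty', authors));   % drop empty entries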

Several other features deserve mention. We've tested the Cocoon connection to the CAP LTER bibliographic database and are currently mapping EML 2.0 to the database schema. We're mapping the Z39.50 bib-1 attribute set to the database and coding the web interfaces for the asbibloader and the individual entries. We would like to roll this out in January, if the force be with us.

References Cited

Chinn, Harvey and Caroline Bledsoe.  1997.  Internet Access to Ecological Information: The U.S. LTER All-Site Bibliography Project.  BioScience 47(1).

BOX 1: A Bibtime Story

Ingrid the information manager needs to keep track of the publications produced by her site, but she wants to do the least amount of work possible to get the desired results: the references need to be searchable from the web by site, author, year, or keyword; they need to be locally available in a form that EndNote can read so researchers can easily prepare publications for various publishing outlets; and they need to be available to the greater ecological community.

What does Ingrid do? She puts all her references into EndNote export format, bulk uploads them to the LTER All-Site Bibliography, and pulls them back out in EndNote export format. She compares this with what she started with. Any problems she encounters, she reports to the Network Office, where the amazing people make things all better. She keeps a copy locally in case those bozos at the Network Office…I mean, just in case of emergencies…and every time she adds a new reference to the All-Site Bibliography through the convenient web interface, she updates her local copy.

So Ingrid just has to keep the references reasonably current and she gets:

  1. The ability to search on any field in the database record
  2. An up-to-date EndNote export file whenever she or a researcher wants it
  3. The ability to search her site's references or any LTER site's publications or all LTER site publications
  4. Statistics on publications through time, by author, across sites

Design and Rationale for a Minimalist Dynamic Datasite

- Tim Bergsma, Kellogg Biological Station (KBS)

In the Fall 2001 Databits, Wade Sheldon explained that not only are dynamic web pages a great way to get data out of databases, but also databases are a great way to update and control dynamic pages (Database Techniques for Creating Maintenance-free Web Pages). Here, I'll try to show that a tight relationship between dynamic pages and databases can give you a fully-functional datasite with only two pages! Then I'll suggest some reasons for adopting the minimalist approach, even if only in part.

Design

Let's start by defining a critical concept: the zero-content web page. We'll assume that your primary data is tabular, and is already in relational databases. You could write a dynamic server page (JSP, ASP, CGI, other?) for each data table, using appropriate connection parameters. If you provide about the same features for each table (nice header, nice footer, links to your homepage, whatever) you'll quickly notice that your various pages are nearly identical, except for the connection parameters. Why not have your page get even the connection parameters from a database? Suddenly, your pages are identical in structure and function, and can all be collapsed into a single "page-that-serves-tables". I call this a zero-content web page because it gets all its data, even connection data, from a database. (Actually, "near-zero": you'll need to hardcode connection information for the "master" database). You can use this page to get different tables by passing a "lookup" parameter in the URL.
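
To make the lookup step concrete, it amounts to something like the sketch below (written with Matlab Database Toolbox calls purely for illustration, with hypothetical table, column, and account names; the actual KBS implementation is a JSP page and is not shown here):

    function serve_table(lookup)
    %SERVE_TABLE Sketch of the zero-content pattern: all content, even the
    %  connection parameters, comes from a database.  Hypothetical -- table,
    %  column, and account names are placeholders, not the KBS JSP code.

    % hardcode only the "master" connection
    master = database('masterdb', 'webuser', 'secret');

    % look up connection parameters and table name for the requested product
    curs = fetch(exec(master, ['select dsn, username, password, tablename ' ...
       'from data_catalog where product = ''' lookup '''']));
    p = curs.Data;                       % {dsn, user, password, tablename}
    close(curs); close(master);

    % connect to the product's own database and retrieve the table
    conn = database(p{1}, p{2}, p{3});
    curs = fetch(exec(conn, ['select * from ' p{4}]));
    disp(curs.Data)                      % a real page would format this as HTML
    close(curs); close(conn);

Everything specific to a given dataset lives in one row of the master catalog; the page itself never changes.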

A second critical concept is a design principle regarding how your datasite's workload is distributed among pages: organize pages by function rather than by dataset. We already have a page that serves tables; we also need a page that serves metadata (or at least pointers to metadata). There: a complete datasite with only two pages! For a crude example of a page-that-serves-tables, see http://lter.kbs.msu.edu/Data/table.jsp?Product=KBS002-001&limitBy=Year&order=desc. For a page-that-serves-metadata, see http://lter.kbs.msu.edu/Data/LTER_Metadata.jsp?Dataset=all.

I hear you saying, "Hmmm...kinda like a meal of tofu and bran flakes: complete, but not very satisfying." I agree. You'll probably want to add pages to serve images, maps, personnel, citations, or some other specialized entity type. And you'll almost certainly want to add control and navigation pages, as Wade described earlier, so that your users never have to type all that url-parameter stuff by hand. For especially rich data sets, you may want to add pages that have dataset-specific functions. You could also add pages for submitting rather than retrieving data. But chances are, you will always have fewer data functions than data sets. So organizing by function helps you achieve a "minimal" datasite.

Rationale

The reasons for adopting a minimalist approach to datasite design are all variations on the theme "low maintenance". Properties of a minimalist dynamic datasite include the following.

  • Maintainability. There are fewer pages, so there are fewer places to look for an error, if one occurs (my site dropped from 60-100 dynamic pages to less than 10). The pages themselves need no editing when the database content changes. Also, there are fewer links to maintain: by authoring just two hyperlinks, you can make every data "page" point to its corresponding metadata "page", and vice versa.
  • Adoptability. A minimalist design, in whole or in part, could be easier to re-implement at another site.
  • Intelligibility. With just a few pages, you leave less of a mess for the next data manager. And, as someone has pointed out, the next manager might just be you.
  • Extensibility. If you want to add a data manipulation feature for all your data tables, you only have to add it in one place.
  • Scalability. No matter how many datasets you accrue, the size of your datasite never needs to change.
  • Consistency. Since, for instance, all your data tables are served by the same page, all your data "pages" have the same look and feel, by default.

Conclusion

The power of the minimalist site design derives from two well-known principles: the principle of code reuse in computer science and the principle of normalization in database theory. Writing a single page that serves many tables is simply an example of code reuse. Getting all content from a relational database means it can be represented exactly once, and therefore definitively (normalization/data reuse). Organizing a dynamic datasite into a few function-oriented pages can greatly decrease the maintenance burden, which in turn enables greater functionality. While few sites will actually limit themselves to just two pages, the minimalist approach could yield benefits wherever applied.

Note

I use the term "datasite" to represent an integrated subset of a website that is dedicated to providing formal data. At my site, a secretary maintains the administrative part of the web, and I mainly concern myself with the part devoted to delivery of research products.

What Every Information Manager Should Know About Wireless

- John Porter, Virginia Coast Reserve (VCR)

First of all, wireless spread-spectrum networking is not magic. Sure, it works in mysterious ways with radios leaping from frequency to frequency. Sure, it can do amazing things – like connecting a remote field site at speeds usually seen only in a LAN-wired building. Sure, wireless wizards pull off miraculous tricks. However, it is not magic…. (I’m almost sure)!

Wireless networking is accomplished using several different frequencies and technologies with different capabilities. First let's talk capabilities. There are two basic modes of communication using spread spectrum radio – serial and IP. The first replicates the use of a serial cable to connect devices. In fact, you can think of it as one really, really long cable! It is used by hooking up a device with serial RS232 output (e.g., a data logger) to a radio in the field, then connecting a paired radio to the serial port on a PC in your lab. As far as the PC or logger is concerned, they are just hooked together by a cable – they don't notice the radios in the system. Speeds can be high (in serial terms), up to 115,000 bits per second, which is more than fast enough for most data loggers. There are elaborations that can be used. FreeWave radios (the Cadillac of serial spread spectrum radios) have options that allow a master radio to query various "slave" radios – just as you might have the serial port on your computer hooked to a switch box connected to various serial devices. The one difference is that it can be set up to be automatic – you don't need to be there to turn the switch.

The second mode of communication, IP, uses spread spectrum radios as if they were hubs or network cards on a wired net. The most common standard is called 802.11b or "Wi-Fi". Radios using this standard can interoperate (although not all features from different vendors may be supported) and exchange data at high speeds, typically 11 megabits per second (slower if the radio connection is weak). There are also specialized network hubs and bridges that use proprietary, non-standard protocols that may provide higher speeds or greater ranges. The wireless equivalent of a network hub is called an "access point." It contains a radio capable of communicating with up to 255 wireless clients. It is typically then hooked to a wired LAN with a standard network cable. The equivalent of a wired network card is usually a PCMCIA card suitable for use in a laptop computer, or a small box connected via a USB port. Again, as far as the PC is concerned, it is simply connected to the network as if by a wire. For serial devices rather than PCs, there are "bridges" that allow you to go from serial to IP. The trick then is to get software on a PC to talk via IP to that device. Fortunately, many hardware manufacturers (e.g., Campbell Scientific) are realizing this and providing capabilities in their software for IP/serial connections.

Regardless of the mode (serial or IP) you use, radio frequencies are relevant. First, you don’t need a license to run wireless networking. The positive side of this is that the networking is therefore free – you don’t need to pay a cell-phone company. The downside of this is that the frequencies where unlicensed spread spectrum networking is allowed are “garbage frequencies” where no one wanted to purchase the rights to the spectrum (usually because of undesirable characteristics). In the US, these are in frequencies around 900 MHz, 2.4 GHz and 5 GHz (outside the US, often only 2.4 and 5 GHz are available). These frequencies are used for a variety of other applications (e.g., portable phones, microwave ovens).

A special challenge for ecological researchers is that all these frequencies require line-of-sight between radios and are easily blocked by vegetation. For example, a microwave oven works by heating water molecules using a strong 2.4 GHz radio signal. When your wireless LAN card uses a much weaker signal in that same frequency to transmit data through trees, it ends up heating up the water in the leaves and stems which eats up the minimal power in the signal. 900 MHz is a little better in this respect, but neither frequency will punch its way through either a forest or a hillside! Adding to the challenge, spread spectrum radios are limited by law to low power (less than 1 watt), but most equipment makers, driven by a desire to increase battery life in laptop computers (the main wireless market), are using much less than that – almost all are less than 0.1 watts and some are at 0.03 watts.

Deploying spread spectrum radios in real-life ecological field situations requires some cleverness and trial and error. First, site selection is critical. If you can get high enough on a tower or hilltop, you can provide that critical line-of-sight between radio antennas. Ranges can also be improved by using directional antennas that channel all the available power into a beam aimed at your receiver. Additionally, the power of transmissions from very low-power wireless network cards can be boosted by external amplifiers up to the full allowable 1 watt. Depending on that critical line-of-sight, this allows data to be sent at high speeds for several miles. Combining these techniques can provide realistic ranges in the tens of kilometers.
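
A rough free-space link budget shows why. Using the standard free-space path loss formula and assuming a receiver sensitivity near -100 dBm (both the formula and the sensitivity figure are textbook assumptions, not measurements from any LTER deployment), a 1 watt (30 dBm) 900 MHz radio with a 9 dBi Yagi on one end and a 6 dBi omni on the other still has margin to spare over a 22.5 km (14 mile) path:

    \mathrm{FSPL} = 32.44 + 20\log_{10}(d_\mathrm{km}) + 20\log_{10}(f_\mathrm{MHz}) \approx 32.44 + 27.0 + 59.1 \approx 118.6~\mathrm{dB}

    P_\mathrm{rx} = P_\mathrm{tx} + G_\mathrm{tx} + G_\mathrm{rx} - \mathrm{FSPL} \approx 30 + 9 + 6 - 118.6 \approx -73.6~\mathrm{dBm}

In practice, cable losses, obstructions, and fade margin eat into that figure quickly, which is why the line-of-sight and antenna advice above matters so much.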

With respect to strategies for deploying wireless, here are a couple of pointers based on our experience at the VCR/LTER. First, try to define your current and future needs as clearly as possible before starting: you need to know your destination before you start a journey. Second, find out as much as you can by talking to folks who are already involved in wireless projects. We benefited immensely from the expertise that Dave Hughes and Tom Williams brought to the project. Their WWW site (http://wireless.oldcolo.com) has lots of valuable resources and even step-by-step diaries from different projects in which they were involved. Finally, experiment! During our stepwise deployment of wireless technologies at the VCR/LTER, we tried out a lot of different radios and antennas, including a set of FreeWave radios borrowed from John Vande Castle at the LTER Network Office. We figure that if a radio or antenna turns out not to work in one situation, another situation will be found that can still use the equipment – so nothing will be wasted.

Wild about Wireless at the VCR/LTER

- Tom Williams, Old Colorado City Communications

- John Porter and Phil Smith, Virginia Coast Reserve (VCR)

Introduction

Among the technologies with the biggest potential to change the way we do business as ecologists are the new advances in wireless communications. Although ecologists have been using licensed VHF radios for data transfer for decades, the lower costs, higher speeds and easier use of spread spectrum technologies have opened the door for whole new classes of uses. With LAN-level speed, data flows can go beyond numbers to include images and sound, or arrays of sensors that have extraordinarily high data rates. We are on the cusp of advances that will allow us to deploy arrays of low-cost, lightweight wireless sensors for monitoring micro- as well as macro-sites. However, our focus here is on the task of providing low-cost wireless connections to our remote field site.

Our goals for the VCR/LTER wireless project were ambitious and threefold:

  1. To allow transmission of meteorological and other digital data sources from our island study sites back to our researchers
  2. To provide access to real-time weather radar and other web-based information sources for researchers and technicians in our study areas
  3. To support video-teleconferencing for use with classes and real-time interactions between researchers, students and technicians located at both island and mainland locations

We were extremely fortunate in three respects. First and foremost, we received huge amounts of help (and even equipment) from Dave Hughes and Tom Williams of the NSF-funded Biological Sciences by Wireless Project (http://wireless.oldcolo.com/; note: this site has a wealth of practical information on wireless). Second, our site is well-suited to line-of-sight communications – with little topography or tall vegetation to block radio signals. Finally, we had access to lookout towers located at both ends of Hog Island (a principal VCR/LTER research site). With the help of Dave and (especially) Tom, we were able to achieve all our goals, with development of an 11 Mbps wireless LAN on Hog Island, linked to the mainland at E1 (2 Mbps) speeds. So far, this has allowed us to deploy WWWcams to monitor research sites (http://www.VCRLTER.virginia.edu/wwwcam), receive real-time meteorological data, conduct prototype videoconferencing sessions from the island, and establish working LAN connections on our boats.

As part of their project, Tom Williams made diary entries tracking the progress of the system development. Here are some excerpts (the full diary, with technical details and photos, is available at http://wireless.oldcolo.com/biology/OysterMenu.htm). Editors' comments are in brackets:

Diary #38 [The first step was to prove the feasibility of a wireless link to the island using a 115 kbps serial FreeWave radio]

During the previous week they [the VCR/LTER staff] had purchased equipment, ladders, safety gear, tomato stakes, and everything else they could think of for work on an 80-foot tall unused fire tower on Hog Island, where we hope to create a radio relay/hub….. The link we attempted is some 14 miles, mostly over Hog Island Bay. At that distance there is no guarantee of success, especially with the unknowns of reflection and absorption presented by saltwater, sandbars, and a small stand of trees…..The LTER staff didn't like the idea of this writer [Tom Williams] climbing a rusty old fire tower, so I stayed at the farmhouse and operated a radio with laptop and a 9 dB Yagi antenna, pointed towards the tower. Due to the steep pitch of the roof we also decided to make the first test from the lower-but-flatter porch roof/balcony (some 20 feet shy of the farmhouse's pinnacle). We kept an extension ladder handy in case we could not get a connection at that height…..While we wanted to use a Yagi -- directional antenna -- at the farmhouse, the preference for Broadwater Tower was/is to use an omnidirectional antenna, making the tower accessible from anywhere on the island as needed to extend the link. Phil Smith thus took both a 9 dB Yagi and a 6 dB omni to the top of the tower, as well as a radio, battery, and loopback plug…. Around 5 PM, during final preparation to start the test, we noticed that the farmhouse radio already had its green light on, indicating a more-or-less workable connection -- and that with the Yagi lying on the porch roof/balcony floor. It was intermittent, but it was green. A more deliberate aiming of the farmhouse Yagi yielded a solid connection to the omnidirectional antenna on Broadwater Tower…. We thus determined that we could easily establish a reliable fourteen mile link between the VCR/LTER headquarters and Broadwater Tower, using one watt, 900 MHz, 115,000 bit per second serial-port-only FreeWave radios and a 9 dBi Yagi antenna at the farmhouse and either another Yagi or a 6 dBi omni on Broadwater Tower.

Diary #39 [The next step was to try some radios that support full networking (e.g., IP, LAN) as compared to a serial link]

On November 20 we tested two IP-ready radios -- the NovaRoam 900 and the Wi-Lan Hopper Plus model 22-09…..We tested both radios over the 14 mile distance between the VCR/LTER headquarters in the 'Farmhouse', across Hog Island Bay to the Broadwater Tower, using the same antennas as we had previously used to test FreeWave radios (see Diary 38): 9 dBi Yagi antennas at both sites, plus a 6 dBi omnidirectional antenna on Broadwater Tower…..The Wi-Lan radios failed to link up, even Yagi-to-Yagi. The only encouraging indication was with the unit on Broadwater Tower, on which a lone orange light showed that the tower radio was receiving data from its partner. No such result at the LTER headquarters, however. The 500 mW radios could not span the distance in both directions….The NovaRoam radios did link up when set for a lower speed of 159 kbps, the recommended setting for over 10 miles. Although the Yagi-to-omni link would not work, Yagi-to-Yagi did -- particularly when the antennas were horizontally polarized….. Even though the NovaRoam radios passed the test that the Wi-Lan units failed, we decided to give the Wi-Lan radios a second chance because of the higher bandwidth that would be needed if we were to run any live video applications. This decision was reinforced by telephone consultation with John Kinghorn of Wi-Lan's Technical Support department, who informed us that (1) the orange light at Broadwater Tower indicated that a link had been half-established, and (2) a higher-gain Yagi antenna would likely solve the problem. We were, in short, very close to a robust link…..On December 7, Tom Williams went back to Oyster, and everyone performed another test of the Wi-Lan radios, this time using 13 dBi Yagi antennas made by Cushcraft. As we had hoped, the extra 4 dB of gain at each end paid off nicely with a solid link between the farmhouse (headquarters) and Broadwater Tower.

Diary #45 [Our link to the mainland proven feasible, the next step was to install and tune it and start working on establishing 802.11b (Wi-FI) links within the island]

On April 2, 2002, we installed two radios at Broadwater Tower: first the Wi-Lan 900 MHz backhaul, and then the first of Hog Island's two Zcomax 802.11b access points….

The 900 MHz Wi-Lan backhaul was set up using the 13 dBi directional (Yagi) antennas detailed in Diary 39. The connection is robust, at least in terms of signal strength. So far there have been no reports of downtime due to signal interference by weather. However, wind, which is fairly constant at the top of Broadwater Tower, caused the Yagi to oscillate like an accordion reed, bouncing up and down about 6 times per second….

We not-too-coincidentally experienced about 20% dropped packets. The antenna oscillation was stopped by running thin rope from the antenna diagonally to the railing, thus dampening the vibration and solving the dropped packet problem……Having gotten both the backhaul and the initial 802.11b connection to work, we sent out a celebratory e-mail to other participants and returned to the boat. Unfortunately, the tide had gone out and we were stranded a few hundred feet off Hog Island for an additional three hours. To help redeem the time, we used a laptop computer to watch (and listen in stereo to) streaming video of a rock concert by the alternative group Indigenous, which was streamed via RealPlayer at a sustained data rate of 300 kilobits per second.

Diary #46 [Linking to the tower on the north end of Hog Island]

On April 4, 2002 we began the extension of the network to Machipongo Station at the north end of Hog Island, using 802.11b radios for both the cloud and the uplink….. Since it was going to be an unamplified link, we used a 24 dBi Radiall/Larsen 0.6 meter solid dish to make the 9 kilometer link back to Broadwater Tower….. In retrospect, we could probably have used a less expensive (and less expansive) antenna for the uplink to Broadwater Tower; tests showed that you could make the link acceptably using a 14 dBi panel antenna costing $60 (versus almost $400 for the 0.6 meter dish).

Diary #47 [Getting LTER boats on the Internet]

On August 9 we set up one of the LTER's boats for mobile high-speed Internet access. Anticipated uses include instantaneous tide information, weather reports, correspondence and reporting back to the lab, videoconferencing with the lab or with others, and entertainment while stuck at low tide (see Diary 45)…. The next day we performed a range survey from the boat. When one does this from a car, driving around town looking for unsecured 802.11b access points and logging their locations and signal strengths, it is called "war driving." Indeed, we used a classic war driving tool for this survey: a program called NetStumbler which, in addition to recording signal strength, noise level, access point name, and just about every other datum that is broadcast by an access point, also has the very nice feature of working with a GPS unit. As a result, once we had the GPS unit connected and the laptop appropriately aware of its location (serial port 1, 9600 bits per second, and so on), all we had to do was drive around the island…. In the end, we gathered some 50,000 data points.…. Machipongo Station's signal is visible from nearly everywhere, while the signal from Broadwater Tower is sporadic and often quite weak. The conclusion we drew is that for this application the stronger Orinoco access point is a significantly better bet.

Diary #48 [Linking to Meteorological Station]

We decided to connect a meteorological ("met") site near Machipongo Station at the north end of Hog Island. Since Hog Island has an 802.11b network, we sought an alternative to the Campbell NL-100, the Ethernet-to-CSIO converter which we had used for bridging between the Internet and the wireless networks in Alaska and Wisconsin…. Our aim this time was to convert a CSIO (serial) connection directly to 802.11b radio at the met station… Most "solutions" to serial-over-802.11b are kludges, consisting of a single-port terminal server (which converts serial to Ethernet) connected to a separate Ethernet-to-802.11b converter, selling for the combined cost of the two modules plus an extra 50% for the "added value." Orinoco, on the other hand, offered a single unit (the EC-S) that combines both functions, and at a significantly lower cost than the competition. The EC-S has a DCE serial connection, meaning it is designed to connect to a PC or laptop (or other DTE equipment) using a straight-through DB9 serial cable….. We connected the EC-S to a Campbell data logger via Campbell's SC929 cable, which is designed to provide a direct connection between a laptop or other PC and a Campbell CSIO port…. The installation was successful. From anywhere on the Internet, it was now possible to access the Machipongo meteorological station.

ArcIMS 4.0 - an "Out-of-the-Box" Internet Mapping Solution?

- Brent Brock, Konza Prairie LTER (KNZ)

Internet mapping continues to gain popularity and ESRI's ArcIMS has emerged as the "software of choice" for providing this capability. The latest release of ArcIMS 4.0 provides some significant new capabilities over its predecessors but deploying these new features is no walk in the park. The following is my experience with installing, upgrading, and using ArcIMS on the Konza LTER web server beginning with version 3.0 two years ago.

Basic Internet Map Server Installation - case history

Installing ArcIMS 3.0 for the first time was a blood-curdling experience. I did manage to make it work, but only after three days of chasing down bug work-arounds and other tweaks. I won't frighten readers with the gory details because, fortunately, ESRI fixed all of the problems I had experienced in the next release. With ArcIMS 3.1 the documentation had improved significantly, so by following the moderately complicated installation procedure to the letter, I was successful on my first attempt. One very large change in version 3.1 was support for the Tomcat servlet engine, a big improvement over the free version of JRun that I used in the initial installation. Although Tomcat can be integrated into most popular web servers, we run Tomcat as its own service. I've seen a couple of articles in the ESRI knowledge base that suggest ArcIMS may be less stable when running on an integrated Tomcat installation. I don't know why this would be true, but what we have works so I am sticking with it for now.

Installation of the basic components of ArcIMS 4.0 is very similar to version 3.1. Like 3.1, anyone proficient at following directions should be able to have a basic IMS server operational within a few hours. Likewise, upgrading to 4.0 from a previous version is fairly simple since your servlet engine will already be installed and ESRI provides instructions for saving and importing your old configuration. I was able to upgrade and have all of my existing map services operational within 20 minutes. You do need to pay attention to your version of the Java Runtime Environment (JRE), though. The current release of ArcIMS does not support JRE 1.4, so you may have to install a back revision of the JRE to make things work. However, simply upgrading the basic components provides few gains in functionality without installing the new components.

Implementing the new tools in ArcIMS 4.0

Major enhancements in ArcIMS 4.0 are the Metadata Explorer and ArcMap Server. The metadata explorer provides a nice interface for viewing metadata and accessing data. It is integrated with ArcGIS so you can manage the XML-based metadata files in ArcCatalog and publish them on the server using drag-and-drop. ArcMap Server translates ArcMap mxd files for use in ArcIMS services. This is a big improvement over the clunky interface and limited rendering capabilities of the ArcIMS web authoring and designing tools.

Implementing each of these new components presents significant challenges. The first consideration is that these components are designed for use with ArcGIS 8.2 or later, so an upgrade may be required on at least one workstation. Second, the Metadata Explorer requires metadata stored in SDE. Lastly, ArcMap Server and ArcGIS cannot be installed on the same box. This last limitation is particularly significant because it presents the same data mapping challenges familiar to anyone who has tried to move an ArcView project file to another computer. ArcMap seems to have poor support for UNC pathnames, leaving the options of either copying mxd files and associated data to the ArcIMS server box without changing pathnames, or using SDE. Since the Metadata Explorer requires SDE anyway, this seemed like the obvious choice. Although SDE for Coverages is supported, I opted to install SQL Server on our GIS box and run ArcSDE. Even with SDE installed I was unable to get a connection from the ArcMap Server until I discovered I needed to add an entry for the remote SDE service in the "services" file on the ArcIMS server. With that fix, the ArcMap Server has performed beautifully and allows us to publish ArcMap data frames and layouts directly to the Internet without having to attempt to duplicate them in ArcXML. Installing the Metadata Explorer required a certain amount of Zen to achieve success. The stumbling blocks were too many to list, but I suggest anyone contemplating installing the Metadata Explorer first spend some time perusing the voluminous discussion threads about it on the ArcIMS Discussion Forum before attempting the installation.

Figure: Topology of the KNZ Internet Mapping Service

ArcIMS Capabilities

To answer the title question of whether ArcIMS is an out-of-the-box Internet mapping solution: the answer is yes and no. With a little luck it is possible to install the basic ArcIMS components and serve simple maps on the Internet using the canned tools. With this option you have a choice of serving maps using HTML or Java viewers. The HTML viewer sends maps to the client as JPEG or GIF images, so it is lightweight and requires only a standard web browser. The Java viewer requires the Java Runtime Environment and viewer software installed on the client but provides a richer set of features. Maps can be served as features (rather than static map images), which allows users to do things like manipulate the symbol palettes of individual layers or add layers from other sources, such as local shapefiles or other Internet map services. Unfortunately, the tools for developing these basic services (ArcIMS Author and ArcIMS Designer) are rather cumbersome to use and limited in their rendering capabilities. Nevertheless, at Konza we have found these tools useful for generating simple map graphics for publications or fieldwork and for viewing thematic layers like soils or burn history data. We provide both Java and HTML versions of simple maps at: http://www.konza.ksu.edu/data_catalog/gis/konza_prairie_interactive_maps.htm.

ArcMap Server improves map-rendering capabilities. With ArcMap Server, creation of high quality maps is greatly simplified, and sophisticated renderings like shaded relief, transparent fills, or gradient fills are a snap. However, I wouldn't classify this as "out-of-the-box" because of the potential difficulties in implementation. Additionally, ArcMap Server is an image server: only static map images are sent to the client, so users cannot manipulate the symbol palettes of the layers. Similarly, Metadata Explorer provides an attractive interface for metadata browsing if one can overcome the deployment hurdles. Unfortunately, the Metadata Explorer is not very customizable at this point, so you are limited to the Geography Network style of metadata display, which leaves many metadata elements hidden. Konza's Metadata Explorer can be viewed at: http://www.konza.ksu.edu/MetadataExplorer/explorer.jsp.

Moving beyond "out-of-the-box", ArcIMS is a powerful development tool that supports sophisticated applications. Developers have their choice of using native ArcXML or one of the available application connectors (Java, ActiveX or ColdFusion). Using the ActiveX connector I developed an elegant solution to the problem of retrieving fire histories at Konza Prairie (http://www.konza.ksu.edu/maps/BurnQuery.asp). This application runs a spatial query on our burn layers, so the records returned are based on location rather than watershed names (which may change or move over time). Installing the connector was very simple and took only a few minutes. Writing the code was fairly straightforward, but the ActiveX connector object model has a few rough edges. For example, ArcIMS returns the values of date fields as milliseconds since 1/1/1970, which is not very handy. This required a custom VBScript function to convert dates to a readable format. Also, the connector has no provision for sorting the records returned by a query filter, so I stole a chunk of JavaScript to sort the records on the client. These hurdles notwithstanding, development of this application was time well spent because it provides an improved query tool for users while reducing database maintenance to a single table that is automatically updated when the GIS layer is edited.
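
For reference, the conversion is simple arithmetic: divide the millisecond count by 86,400,000 to get days and add the result to the 1/1/1970 epoch. In Matlab notation (shown only to illustrate the arithmetic; the Konza application uses a custom VBScript function as noted above):

    % convert an ArcIMS date value (milliseconds since 1/1/1970) to a date string
    ms = 1041379200000;                              % example value
    datestr(datenum(1970,1,1) + ms/86400000)         % returns '01-Jan-2003'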

In summary, ArcIMS provides a powerful suite of Internet mapping tools. Information managers simply wanting to post basic interactive maps on the Internet should find ArcIMS a useful "out-of-the-box" solution. Although implementation of the basic features from scratch is somewhat complicated, the documentation is good and most administrators should have things up and running with one to two days of effort. IMs planning to deploy the new features in ArcIMS should expect to spend considerable time planning their deployment and chasing down undocumented glitches. IMs looking for more advanced Internet mapping capabilities will have to develop their own, but ArcIMS provides the tools needed to do it. ArcIMS ships with many good sample files to get the developer started, or you can simply adapt one of the Site Starters available for free from ESRI. Finally, the ArcIMS discussion forums are great places to look for help.

News Bits


BES Website Regarded as Curriculum

- Jonathon Walsh, Baltimore Ecosystem Study (BES)

The Baltimore Ecosystem Study website was reviewed by Science Netlinks and has been approved for use in classrooms. Science Netlinks is a website for science educators created by the American Association for the Advancement of Science [Website: http://www.aaas.org/]. It is part of a partnership called MarcoPolo [Website: http://www.marcopolo-education.org/ ] that provides Internet content for K-12 teachers. You can read their review of the BES website at http://www.sciencenetlinks.com/resources_individual.cfm?DocID=341&Grade=... Benchmark=5. The review was based on the criteria listed at http://www.sciencenetlinks.com/criteria.htm.

You can see how the BES website is featured in the Science Netlinks session called: Urban Ecology 1 at http://www.sciencenetlinks.com/lessons.cfm?BenchmarkID=4&DocID=276

ELTOSA Conference and Informatics Workshop

- Kristin Vanderbilt (SEV)

The LTER Network Office sponsored several scientists to attend the Environmental Long-Term Observatories of Southern Africa (ELTOSA) meeting held July 21-24, 2002 on Inhaca Island, Mozambique. The conference was entitled "Long Term Ecological Research for Human Development and Conservation of Biodiversity." LTER scientists attending included:

  • Dan Childers (FCE)
  • Craig Harris (KBS)
  • Laura Huenneke (JRN)
  • Stephen Macko (VCR)
  • Ernesto Mancera (FCE)
  • Sonia Ortega (NET)
  • Deb Peters (JRN, SEV)
  • Bob Waide (NET)
  • Peter McCartney (CAP)
  • Bill Michener (NET)
  • John Porter (VCR)
  • Kristin Vanderbilt (SEV)
  • Amanda Knoff (VCR)

Following the ELTOSA conference, an Informatics Training Workshop was held July 25 and 26 in Maputo, Mozambique at Eduardo Mondlane University. Bill Michener received funding for this workshop, and invited John Porter, Peter McCartney, and Kristin Vanderbilt to co-teach it with him.

Topics covered included:

  • Metadata
  • Hardware and software considerations for information management systems
  • Data sharing policies
  • Web authoring
  • Quality assurance
  • Quality control
  • SQL
  • Design and implementation of databases

Eighteen individuals from six countries (Mozambique, Namibia, Kenya, Botswana, Tanzania, and South Africa), ranging from graduate students to high-level program directors, attended the course. Each student had their own PC in a modern computer lab for the duration of the course. Students were interested and enthusiastic, and the workshop was rewarding for both students and instructors.

SCI2002: Ecoinformatics Challenges at International Conference

- Karen Baker, Palmer LTER (PAL)

The SCI2002 conference, attended by over 1,000 scientists in a wide variety of disciplines with interests in systemics, cybernetics and informatics, was kicked off with a keynote plenary talk on "The Ecoinformatics Challenge: Meeting Ecological Information Needs for the Site, Network, and Community" by John Porter, Karen Baker and Susan Stafford.

To stimulate the exchange of developments in information management and to promote cross-domain dialogue, LTER information managers participated in two Ecoinformatics Challenge sessions at the SCI2002 meeting (http://www.iiis.org/sci2002) held July 14-18 in Orlando, Florida. Twelve LTER papers were published in the conference proceedings, which are available online at http://intranet.lternet.edu/committees/information_management/sci_2002/

Highlights of the conference included "Best Paper" distinctions awarded to the following LTER contributions: "The Future Of Ecoinformatics in Long Term Ecological Research" by James Brunt, Peter McCartney, Karen Baker, Susan Stafford; "Integrating Ecological Data: Tools and Techniques" by John Porter and Kenneth Ramsey; "Designing Web Database Applications for Ecological Research" by Dan Smith, Barbara Benson and David Balsiger.

Good Reads


A History of the Ecosystem Concept in Ecology

- Karen Baker, Palmer LTER (PAL)

Frank Golley, 1993. A History of the Ecosystem Concept in Ecology, Yale University Press, 254p.

Context, often provided by nonlinear historical events, helps us gain insight into a concept. As LTER community members, we benefit from the sweep and the depth of Frank Golley's presentation of ecosystem science. As information managers, we benefit from his recognition of the need for information management in combining, extending and passing on the data that science gathers. One historical note worth mentioning, because it highlights an important distinction sometimes lost in the tacit understanding of our current research environment, is that LTER is not an acronym for Long-Term Ecosystem Research (p. 118) but rather for Long-Term Ecological Research. Is this an important distinction? Golley provides organizational examples, contrasting the business model for big science programs with a more academic approach. LTER, as a network, is a community organization model that explicitly adopts an integrative embrace of ecology, avoiding potential misunderstandings over the multiple levels of meaning and history associated with the term 'ecosystem research'. The LTER undertaking is an ongoing re-balancing of understandings generated by the multiple views afforded by the spectrum from reductionist to holistic, by the elements juxtaposed with the whole.

Ecological Vignettes

- Karen Baker, Palmer LTER (PAL)

Eugene Odum, 1998. Ecological Vignettes: Ecological Approaches to Dealing with Human Predicaments, Harwood Academic Publishers, 269p.

Eugene Odum's 'Ecological Vignettes' brings to mind Rachel Carson's 'Silent Spring' in that it presents the particular along with some general ramifications. The book is divided into two parts, with vignettes followed by more detailed essays that reference the scientific literature. The vignettes synthesize insights from a broad ecological career that has focused over the years on local as well as global systems, with some sensitivity to political, economic, and social ramifications. The scaling of knowledge grounded by experience with watershed studies provides a much needed articulation of issues at the ecosphere level. Odum builds from a starting point, his determining factor, that the human population has reached the maximum carrying capacity of the earth as a whole. With a nontraditional, multi-tier presentation format, Odum invites and then supports participation from a broad audience by providing access to information in an approachable form. From the dark side of technology to the tyranny of small decisions, the book provides an often elusive bigger picture relevant to individual reflection as well as national action.

Calendar



October 16-17, 2002 LTER Network Office Site Review

October 20 - November 2, 2002 Ecoinformatics Training Course for OBFS Personnel, Sevilleta LTER

November 21-22, 2002 MetaDiversity III: Global Access for Biodiversity Through Integrated Systems, Philadelphia (http://www.nfais.org/EventDetails.asp?EventID=12)

February 4-5, 2003 IMEXEC Meeting, NCEAS