
Spring 2000

Featured in this issue:

Data management tools & resources, 2000 All-Scientists Meeting.

DataBits continues as a semi-annual electronic publication of the Long Term Ecological Research Network. It is designed to provide a timely, online resource for research information managers and to incorporate rotating co-editorship. Availability is through web browsing as well as hardcopy output. LTER mail list IMplus will receive DataBits publication notification. Others may subscribe by sending email to databits-request@lternet.edu with two lines "subscribe databits" and "end" as the message body. To communicate suggestions, articles, and/or interest in co-editing, send email to databits-ed@lternet.edu.

----- Co-editors: Denise Steigerwald (MCM) and Ned Gardiner (CWT)

Featured Articles


Plans for the 2000 All Scientists Meeting Workshops on Information Management

- Barbara Benson (NTL)

LTER information managers have been busy planning workshops for this summer's All Scientists Meeting. The abstracts below were taken from this web site: http://www.lternet.edu/allsci2000/. It is possible to register to bring a poster and/or attend the workshops via this site. Things are shaping up for an interesting meeting.

DATA MANAGEMENT & MOVEMENT

DM-1 Title: The partnership between long-term ecological research and information management: successes and challenges.

Organizers: Barbara Benson (NTL), Dick Olson (DACC-ORNL), John Magnuson (NTL)

Since the establishment of the LTER network and partly because of its existence, ecology has shifted from the traditional study of a site or an event by an individual to a much broader approach that includes networks of sites and communities of investigators carrying out modeling, synthesis, assessments, and long-term ecological research. A key factor in this fundamental change is the dramatic increase in the application of computer science to ecology. This workshop is designed to generate a productive dialogue among scientific researchers and information managers. Invited speakers will set the stage by presenting an overview of successful partnerships between ecological research and information management. We will distill the crucial components of successful information management both at individual sites and for intersite research. As information technology continues to evolve and the research agenda broadens, new challenges will need to be met by the partnership. We will attempt to articulate what these areas of growth will be. Projected products from this workshop include a web page with a summary of discussions and possibly a summary article in a journal such as Bioscience.

DM-2 Title: Advanced Communications and Networking: Opportunities and Challenges for LTER/ILTER

Organizers: Bill Chang (NSF), Tony Fountain (SDSC), and John Vande Castle (NET)

This panel discussion focuses on the opportunities and challenges that advanced communications and networking bring to the LTER/ILTER community. Advanced Internet and information technologies (IT) have redefined our lives. They influence everything we do and will change how we, as scientists, conduct our research, communicate our findings, educate our students, and serve our communities. Short (10-15 minute) presentations followed by a panel discussion will cover the following topics:

  1. Wireless Data Acquisition and Communication - Dave Hughes (Old Colorado City Communications)
  2. Laboratory and Field Station Networking Advances - (To be invited)
  3. Regional and National Networking vBNS/I2 and Beyond - John Jamison (STAR-TAP/Juniper)
  4. Integrated Data Management and Analysis - Reagan Moore (SDSC)
  5. Potential LTER/ILTER Applications of these Advances - Tony Fountain (SDSC/LTER)
  6. Challenges and Bottlenecks - John Vande Castle (NET)

Please join us!! Together we will redefine how these new technologies can help us accomplish our work!!

DM-3 Title: Ecological informatics: Innovative tools and technologies.

Organizers: Hap Garritt (PIE) and John Porter (VCR)

This workshop will provide introductions to new technologies that can aid in the input, management and analysis of ecological information resources. A panel of experts will provide information on innovative applications of new technologies and answer questions from workshop participants. Where feasible, demonstrations will be available for hands-on testing by workshop participants. Topics to be addressed include: display and query of LTER Network data resources, wireless networking, linking bar code and GPS technologies, data modeling tools, micro data loggers, network collaboration technologies, voice recognition and implementing webcams.

DM-4 Title: The LTER Network Information System and beyond.

Organizer: Peter McCartney (CAP)

Biological informatics is evolving in an environment of rapid technological change. Some of the challenges facing us are scaling our data management infrastructure to accommodate the sheer quantity of data that will be generated in the decades to come, improving access to heterogeneous data sources, and developing more intelligent applications that make data more usable for the diverse array of end-user communities. This workshop examines some of the current efforts by the LTER team to build a Network Information System for integrating data across the LTER sites. It then looks at several new projects both within and without the LTER network that complement and expand the NIS goals through new technologies such as machine-parsable metadata, knowledge-based software design, advanced networking tools, and visualization methods. Several issues will be addressed in the workshop, including

  1. the role of extended partnerships to tackle large development projects with diverse technical needs
  2. Mechanisms for identifying end-user needs in designing informatics applications
  3. Security, property rights, and management issues associated with building integrated data access systems
  4. Achieving new goals while maintaining backward compatibility with legacy data resources and software systems.

DM-5 Title: GIS on the internet and LTER: a frontier for research, applications, and solutions.

Organizer: Ned Gardiner (CWT)

This workshop will formally present internet Geographical Information Science (GIS) technologies to the LTER community. Principal investigators, students, and computer specialists comprise the speakers. Talks and demonstrations will expose participants to planning, implementing, and expanding GIS applications on the World Wide Web. The focus will be on research, including planning new and ongoing projects, data visualization, sharing data, and communicating across the large distances that typically separate scientists who collaborate in the LTER network and beyond. Investigators will observe and consider the role of internet-based GIS in bringing LTER science to bear on regional ecosystem analyses. Information managers will bring their own expertise and will leave with a broader vision of internet GIS applications and solutions for their own work. We will provide timely examples of how live, web-based GIS will continue to enhance any long term ecological research program.

In addition, look for the workshop titled "Learning from LTER Data in K-12 Classrooms". It is being organized by Marianne Krasny, and will include discussions of using the Internet and computer technologies to enhance learning as well as partnering with LTER information managers in developing activities.

Providing controllable WWW animations using JAVASCRIPT

- John H. Porter, Virginia Coast Reserve LTER

Animations of data or imagery of your site can be a handy way to convey a mass of information to potential users. However, the options have been somewhat limited. You could create an MPEG or .mov file, but then the users needed to download an appropriate viewer. You could create an animated GIF file, but then there was no way to control the speed of display. Finally, you could create a JAVA applet (and have your users go away while they wait for it to load). However, I recently came across a JAVASCRIPT application that is widely used at weather sites and can be easily adapted to a variety of applications. The jsImagePlayer© from BASTaRT was written by Martin Holecko in the Czech Republic. It is a short script that displays a sequence of images and provides controls for direction, speed and looping.

I've adapted the script and related HTML (mainly by moving some buttons around and relabeling them) into some interesting applications. The first was to provide easy on-line animation of a simulation model of the movement of Hog Island produced by Guofan Shao. Provided are two resolutions for those with large or small screens. That page also features versions of the model animated with JAVA, GIFs and MPEGs for comparison. I think you'll agree that the jsImagePlayer provides a maximum of functionality with a minimum of fuss (see below for how much fuss was involved).

A second application was the development of a "virtual webcam" animating a sequence of images of the future site of our new field laboratory taken in 1997. It allows users to pan back and forth and alter the speed of display. For the record, our lab will be in the copse of trees in frames 24 and 25. The original images were created by dumping one frame per second from our digital video camera and converting them to .gif files. I've also used the jsImagePlayer© to display sequences of nesting locations of colonial waterbirds - the dots on the map seem to jump around as each year's locations are displayed, but that is an ongoing project so I can't give you a URL for that WWW page yet.

Now, the big question, how long did it take to set this up? Actually, not long at all. You can download the jsImagePlayer© just by viewing and saving the source of any of the WWW pages listed above. If you want the pristine version from BASTaRT, you will want to download his example. Feel free to use my mildly tweaked version (hitting play at the end automatically rewinds; start and end keys don't have frame numbers on them) from either of my example pages.

Once you have the jsImagePlayer© code, create a numbered sequence of GIF or JPG files that will serve as the source images. File names should have a prefix, a number and then .gif or .jpg. For example, im2.jpg might be the start of a sequence (im is the prefix, 2 is the number) and im99.jpg might be the end of a sequence (again, im is the prefix, 99 is the number). In a clearly labeled section at the beginning of the JavaScript code, you set the image name ("im" in the example above), the number of the first image (2), the number of the last image (99), the image_type (jpg or gif) and the height and width of the animation images. You then just need to go to the HTML section at the end of the code that contains the "img src=..." for displaying the first image and set the name of the first image and the proper dimensions. It may be possible to do this last step automatically by running the "animation" function in the code, but I found it easier just to modify the HTML code myself.
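The naming convention and playback controls described above can be sketched in a few lines. This is a Python illustration only, not the jsImagePlayer source: the function names frame_names and next_frame are hypothetical, and the wrap-around behavior is an assumption about how looping and direction control might work.

```python
def frame_names(prefix, first, last, image_type):
    """Return the file names for a numbered image sequence, e.g. im2.jpg .. im99.jpg."""
    return [f"{prefix}{n}.{image_type}" for n in range(first, last + 1)]


def next_frame(current, first, last, direction=1, loop=True):
    """Advance one frame forward (direction=1) or backward (direction=-1).

    With loop=True the sequence wraps around; otherwise it stops at the ends.
    """
    candidate = current + direction
    if candidate > last:
        return first if loop else last
    if candidate < first:
        return last if loop else first
    return candidate


# The sequence from the example above: prefix "im", frames 2 through 99, JPG files.
names = frame_names("im", 2, 99, "jpg")
print(names[0], names[-1], len(names))
```

A gap in the numbering (say, a missing im57.jpg) would show up as a broken frame during playback, so generating the expected names like this is also a quick way to check that every file exists before publishing the page.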

In summary, jsImagePlayer© is a freely available tool (as long as you acknowledge its authorship) that can have you up and running interesting animations on your WWW site in about a half-hour! Enjoy!

Review of Dragon NaturallySpeaking Preferred continuous-speech dictation system with digital handheld recorder for use as a data entry tool

- John P. Anderson, Jornada LTER

This is a general overview of speech transcription options with system requirement specifics, feature identification, and personal observations in my initial exploration of using this product to convert digitally recorded field data to text files. It is not a comprehensive review or comparison of speech transcription options. More complete information can be found at the web site for Dragon Systems, Inc., http://nuance.com/dragon/index.htm.

I. General comments:

  1. I have not tested this product yet under actual field conditions.
  2. The digital recorder is very light, small, and comfortable to hold. It has a PAUSE button, which allows for continuation of recording within an individual file without having to formally insert new comments.
  3. Use of a headset microphone with a rigid boom mic is necessary for consistent, accurate transcription of recorded data. It keeps the mic at a fixed distance from the mouth, matching the position used when the user profile speech files were created (done once for each user). While the recorder can be used without the headset mic (it has a built-in mic), I don't think it would be possible under normal field use to maintain a consistent position.
  4. I have successfully tested the Dragon NaturallySpeaking Preferred software for use in data entry directly into a Word document (saved as a text file). It also has been used successfully in form-based data entry (though not by me yet; see http://nuance.com/dragon/index.htm for testimonials of different applications of the software).
  5. I have been less successful with accurate data transcription when using the digital recorder. However, I feel pretty confident that this can be corrected by working with Dragon Systems tech help.
  6. The recorder creates a digital audio file (Wave file) that can be used to listen to the recording as it is transcribed by Dragon NaturallySpeaking (DN) software or reviewed at a later time. Transcription errors can be caught and corrected (manually or by voice) through visual checks in this fashion. Of course, programmatic QA/QC can and should also be done. Programmatic QA/QC must be done using methods other than with Dragon Systems products. It may require an additional parsing program to get the data into a form that could be used by existing QA/QC programs.
  7. The Voice It Link software included with Dragon NaturallyMobile Recorder (the digital recorder I'm using) is used to transfer and transcribe the recorded information directly into Dragon NaturallySpeaking which uses the user speech files to do this. See 5. below.
  8. Voice commands allow movement within the information being transcribed or edited, as well as voice editing, moving, formatting, and revising of text.
  9. You must initially train Dragon NaturallySpeaking to understand your recorded speech. To do this, you create user speech files for recorded dictation. This may take an hour or more for the basic training but only has to be done once for each user. It may require a longer period when there is specialized vocabulary to be added (for example, species acronyms).
  10. It may or may not be cost effective for short-term seasonal workers because of training time necessary for Dragon NaturallySpeaking to understand each user.
  11. A microphone is included with the product.
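The "additional parsing program" mentioned in item 6 might look something like the sketch below. Everything about the transcript layout is an assumption made for illustration: the line format ("plot 3 species BOER count 12"), the four-letter species codes, and the function names are hypothetical and would need to match whatever dictation convention a site actually adopts.

```python
import re

# Hypothetical transcript layout: one observation per line,
# e.g. "plot 3 species BOER count 12". Not a Dragon Systems format.
LINE_PATTERN = re.compile(r"^plot (\d+) species ([A-Z]{4}) count (\d+)$")


def parse_transcript(text):
    """Split transcribed text into records and collect lines that do not parse."""
    records, rejects = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = LINE_PATTERN.match(line)
        if match:
            plot, species, count = match.groups()
            records.append({"plot": int(plot), "species": species, "count": int(count)})
        else:
            # Keep the line number so the audio file can be checked at that point.
            rejects.append((line_no, line))
    return records, rejects


def check_species(records, known_codes):
    """Return records whose species code is not in the list of known codes."""
    return [r for r in records if r["species"] not in known_codes]
```

Lines that fail to parse (for example, a number transcribed as the word "three") are set aside with their line numbers rather than dropped, so they can be compared against the Wave file and corrected by hand.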

[Some information below is excerpted from Dragon NaturallySpeaking Preferred Getting Started manual.]

II. Initial use of Dragon NaturallySpeaking (DN):

  1. A user speech file must be created for each user. This need only be done once. It includes information about the user's voice, pronunciation, and use of words. DN uses this to correctly transcribe the user's speech. Additional users can be added at any time. Individual speech files are maintained separately.
  2. Vocabulary:
    • Different vocabulary lists are used depending on RAM and processor speed. Better vocabularies provide better speech-recognition accuracy but require more computer memory and a faster processor; i.e., the more RAM and the faster the processor the better the speech-recognition accuracy. I don't know at what level of system resources improvements level out.
    • You can customize your vocabulary to reflect words, acronyms, etc. you use as well as your writing style by using Vocabulary Builder. You can do this by importing documents in any of the following formats: text files (.TXT), Rich text format files (.RTF), Word files (.DOC), and HTML files (.HTM and .HTML). Any new words found in documents selected for including in vocabulary are displayed and they can be individually selected or deselected for inclusion in vocabulary. These can be ordered alphabetically or by frequency within the document.

NOTE: Each time you run Vocabulary Builder it undoes the effect of the last time you ran it; it is not cumulative. I believe Dragon NaturallySpeaking Professional (I'm only reviewing DN Preferred here) will allow you to build separate, specialized vocabularies that are used in conjunction with the standard DN vocabulary, which corrects this deficiency in the DN Preferred edition.

III. Requirements for Dragon NaturallySpeaking Preferred:

  1. at least 133 MHz Pentium (faster processors improve speed and accuracy)
  2. at least 64 MB of RAM to take full advantage of BestMatch technology features (which increase speed and accuracy). Otherwise, minimum 32 MB on Windows 95 or 98; minimum 48 MB on Windows NT.
  3. Windows 95, 98, or NT 4.0+ with Service Pack 3.
  4. Hard disk requirements:
    • about 75 MB to install Dragon NaturallySpeaking (includes Help files, which are recommended since they provide the most complete documentation).
    • additional 95 MB hard disk space to run Dragon NaturallySpeaking and train your user speech files (one set of user files).
    • additions to above basic requirements:
      1. +10 MB for each additional user
      2. +2 MB for Dragon NaturalWord for Microsoft Word
      3. +4 MB for Dragon NaturalWord for Corel WordPerfect
      4. +16 MB for Text-to-Speech utility
      5. +22 MB for BestMatch technology support
    • other system requirements:
      1. 16-bit sound card or built-in audio system with input quality equal to or greater than the Creative Labs Sound Blaster 16. (See http://nuance.com/dragon/index.htm for an up-to-date list of supported hardware.)
      2. Speakers, for multimedia Help system, Quick Tour, Text-to-Speech utility, and dictation playback. They are not required for speech recognition.
      3. CD-ROM drive for installation CD.
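As a rough planning aid, the hard disk figures above can be added up for a given installation. The sketch below simply sums the numbers listed in this section; treating them as strictly additive is an assumption (the manual gives the install size as "about 75 MB"), and the function and option names are hypothetical.

```python
# Figures in MB, taken from the hard disk requirements listed above.
BASE_INSTALL_MB = 75          # install, including Help files
RUN_AND_TRAIN_MB = 95         # run the program and train one set of user files
PER_ADDITIONAL_USER_MB = 10   # each user beyond the first
OPTIONAL_MB = {
    "naturalword_word": 2,
    "naturalword_wordperfect": 4,
    "text_to_speech": 16,
    "bestmatch": 22,
}


def disk_space_mb(users=1, options=()):
    """Estimate total disk space in MB, assuming the listed figures simply add up."""
    if users < 1:
        raise ValueError("at least one user is required")
    total = BASE_INSTALL_MB + RUN_AND_TRAIN_MB
    total += PER_ADDITIONAL_USER_MB * (users - 1)
    total += sum(OPTIONAL_MB[name] for name in options)
    return total


# Three field workers sharing one PC, with BestMatch and Text-to-Speech installed.
print(disk_space_mb(users=3, options=["bestmatch", "text_to_speech"]))
```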

IV. Voice It Link software.

  1. This is the bridge between the digital voice recorder (Dragon NaturallyMobile Recorder) and the Dragon NaturallySpeaking voice-to-text software on the pc. It allows creation of user named files and folders on the voice recorder and the ability to sort and organize those files.
  2. Requirements for Voice It Link software:
    • Windows 95. I've used it successfully on Windows 98. Windows NT unknown. (There may be upgrades that have occurred since I purchased software that explicitly state that it's compatible with Windows 98 and NT.)
    • Best with 166 MHz Pentium MMX and 32 MB of RAM for voice-to-text functions
    • 10 MB on hard disk for basic functions
    • 100 MB for voice-to-text functions.
    • Serial port with DB-9 connection
    • Industry standard sound adapter, such as SoundBlaster, that permits playback of audio files.
    • 3.5 inch floppy for software installation

V. Dragon NaturallyMobile Recorder (Voice-to-text recorder (VTR))

  1. Comes with 4 MB (40 minutes) internal memory. Uses industry-standard 3.3V SSFDC-type SmartMedia cards in 2 MB (20 minutes), 4 MB (40 minutes), and 8 MB (81 minutes) sizes.
  2. Provides visual display of relative amount of memory used. Provides flashing warning when about a minute of memory is left. Beeps when memory is full.
  3. Tracks info on each file recorded including its length, date and time created, and date and time last modified.
  4. Holds up to 99 folders on its internal memory or card memory. Each folder can hold up to 99 files (or recordings).
  5. Battery level displayed
  6. Provides display of remaining minutes of recording time.
  7. Can vary speed of playback (slowdown or speedup).
  8. Add, insert, or delete comments at any point. Can mark the beginning and end of specific sections of comments to be deleted.
  9. Bookmark comments to come back to later. These are indexed for quickly locating again.

Tools for Data Management - an XML-based Metadata Editor in Java

- By Rudolf Nottrott, NCEAS and University of California Natural Reserve System

Synthetic research in ecology depends heavily on the integration of large numbers of heterogeneous data sets. Underpinning the integration are complete metadata for each data set. At NCEAS, we have created a prototype system for creating data documentation and automated data processing based on XML-structured metadata. The system demonstrates new functionality in support of automated data processing, such as quality assurance, and data presentation. Additional functionality envisioned for the system includes content discovery, as well as data conversion and integration for cross-site, synthetic research.

We developed an XML-based metadata editor (MDE) in Java as the central part of the system (for more system details see www.nceas.ucsb.edu/ecoinformatics). The editor uses configurable XML Document Type Declarations (DTDs) for an Ecological Metadata Language (EML) developed at NCEAS and based on a standard proposed by the Ecological Society of America. EML describes data set schema, naming conventions, attribute range and type information. Investigators use the editor to produce metadata that comply with EML. Future functions envisioned for the editor include metadata conversion between standards and access to data associated with metadata. Because the editor is configurable with any valid DTD, investigators are free to use metadata standards other than EML. In particular, the Federal Geographic Data Committee's (FGDC) Biological Data Profile of the Content Standards for Digital Geospatial Metadata is likely to supersede EML (see http://www.fgdc.gov, FGDC-STD-001.1-1999), and can be used to configure the editor when its final form is available.
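To illustrate how range and type information in structured metadata could drive automated quality assurance, here is a minimal Python sketch. It is not the NCEAS editor or its processing code, and the element and attribute names in the sample XML are invented for this example; they are not taken from EML or any published DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata layout for illustration only; these element and
# attribute names are NOT taken from EML or any published DTD.
METADATA_XML = """
<dataset name="lake_temperature">
  <attribute name="depth_m" type="float" min="0" max="25"/>
  <attribute name="temp_c" type="float" min="-2" max="40"/>
</dataset>
"""


def load_rules(xml_text):
    """Read the allowed numeric range for each attribute from the metadata."""
    root = ET.fromstring(xml_text)
    rules = {}
    for attr in root.findall("attribute"):
        rules[attr.get("name")] = (float(attr.get("min")), float(attr.get("max")))
    return rules


def check_row(row, rules):
    """Return a list of problems found in one data row (a dict of strings)."""
    problems = []
    for name, (low, high) in rules.items():
        raw = row.get(name)
        try:
            value = float(raw)
        except (TypeError, ValueError):
            # Missing or non-numeric value where the metadata expects a number.
            problems.append(f"{name}: not a number ({raw!r})")
            continue
        if not low <= value <= high:
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    return problems
```

The point of the design is that the checking code contains no knowledge of any particular data set: swapping in a different metadata document changes the rules without touching the program.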

While the editor has been a valuable first step in our efforts to overcome the barriers to researchers attempting to synthesize ecological data across space and time, much work remains to be done. That work is presently being undertaken in the framework of an NSF Knowledge and Distributed Intelligence (KDI) grant entitled "A Knowledge Network for Biocomplexity; Building and Evaluating a Metadata-based Framework for Integrating Heterogeneous Scientific Data". The grant was awarded as a result of a collaborative proposal by NCEAS, LTER, SDSC and Texas Tech University (Reichman, Brunt, Helly, Jones, Willig). Components of the present metadata editor will likely be incorporated into the metadata management client of the system proposed under KDI.

News Bits


Virtual Tour Using Quicktime Authoring

- Karen Baker (PAL)

Quicktime Authoring (QTVR) software supports production of a virtual tour in the form of scenes via an online window with 360-degree panorama views and hotspots linked to other panoramas or web pages. The method requires a digital camera with special tripod mount and the QTVR software application package (cost ~$300) for the Macintosh computer. The QuickTime Player is available free for viewers (www.apple.com/quicktime).

Examples of specific equipment and software ordered include:

  1. The camera with increased memory for field photo storage and with cable for download (www.cameraworld.com; Nikon Coolpix 950 digital camera ~$800; compact flash 64meg fast RAM card ~$200; AA 4 charger w/4 aa recharg/gold bat ~$40; Cool Pix AC ADAT.EH-30 F/CP-900/900S ~$50)
  2. The tripod accessories and software (www.kaidan.com; Kaidan-Quick Tilt leveler-auto leveling QPXL-1 $120; Kaidan-Panoramic VR tripod head $250 Kiwi+ KW2-disk type 18; Kaidan Apple QuickTime VR Authoring Studio ~$275)
  3. The tripod (www.simacorp.com; Sima tripod ST23c $45)

With this equipment, 18 pictures are taken to give a 360-degree view, and the software provides the technique for stitching the JPG files together.

References about the method include the book "The Quicktime VR book" by Susan Kitchens and the web sites http://www.quicktvr.com and www.apple.com/quicktime/qtvr.

BioQuest Education Materials

- Karen Baker (PAL)

The LTER Education Committee held a workshop at the Kellogg Biological Station in November of 1999. Representatives of BioQuest/Beloit College gave an introduction to their digital libraries and database developments, i.e., BIRDD and Biology Workbench. The BioQuest Curriculum Consortium is a community of bioscience educators and researchers established in 1986 to focus on undergraduate biology education and science curricula reform. The Consortium's goal is to let students learn about science by participating in it, considering complex problems using scientific data and methods. The philosophical framework includes problem-posing, problem-solving, and persuasion of peers. The group gathers databases, creates software interfaces and develops curriculum in order to make data available on CD for use in the classroom. Information is on the web for ordering existing educational materials and CDs (http://bioquest.org). The group solicits new module contributions and looks for partners with databases that BioQuest could augment with curriculum design and could include in their distributions via Academic Press. Contacts: bioquest@beloit.edu; 608-363-2743.

Central Arizona - Phoenix Biological Databases & Informatics (CAP BDI) Project

- Peter McCartney (CAP)

A proposal titled "Networking our Research Legacy" was awarded by the NSF Biological Databases and Informatics program to P. McCartney, C. Gries, T. Craig, N. Grimm and C. Redman (CAP). The project will run 3 years with a budget of $720K. The goal is to develop an information management infrastructure centered at Arizona State University that integrates diverse environmental databases and provides a suite of access tools that allow users with differing interests, backgrounds and computational skills to more seamlessly locate and use data that have been produced or acquired by ecological researchers in the CAP study area. The use of structured metadata in machine-readable XML format represents a common thread between these tools, which range from metadata-creation tools to web-based data query and download solutions. Two full-time programmers (one for database programming, the other for interface design) will be hired along with two RA students each year. The project will work collaboratively on ecological metadata standards with the KDI project awarded to NET, NCEAS and SDSC.

LTER Site Information System Elements Survey

- Karen Baker (PAL)

At their August 1999 summer meeting, the LTER information managers gathered information about their existing individual site information system designs in preparation for KDI and metadata activities. Categories of inquiry include data, metadata, personnel and bibliography. Survey results have been transcribed from paper to digital form and posted online at http://frazil.icess.ucsb.edu/im. The survey has been implemented as a relational database with web interface permitting

  1. Sites to update their site information management system descriptions annually
  2. Data to be viewed by site or by subject.

The production survey is implemented on a PC running Windows NT with the IIS 4.0 web server. It uses the Access database management system with a CGI/Perl web interface.
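The two capabilities listed above (annual updates by each site, and viewing by site or by subject) map naturally onto a single relational table. The sketch below uses Python's built-in sqlite3 module as a stand-in for the Access database and CGI/Perl interface actually used; the table layout, column names, and function names are hypothetical, not the survey's real schema.

```python
import sqlite3


def build_survey_db():
    """Create an in-memory database with one row per site, subject, and year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE response ("
        "site TEXT NOT NULL, subject TEXT NOT NULL, year INTEGER NOT NULL, "
        "answer TEXT, PRIMARY KEY (site, subject, year))"
    )
    return conn


def update_response(conn, site, subject, year, answer):
    """Insert a site's answer, or overwrite it if that year's entry already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO response (site, subject, year, answer) VALUES (?, ?, ?, ?)",
        (site, subject, year, answer),
    )


def view_by_site(conn, site):
    """All answers from one site, ordered by subject and year."""
    return conn.execute(
        "SELECT subject, year, answer FROM response WHERE site = ? ORDER BY subject, year",
        (site,),
    ).fetchall()


def view_by_subject(conn, subject):
    """All sites' answers for one subject (e.g. 'metadata'), ordered by site and year."""
    return conn.execute(
        "SELECT site, year, answer FROM response WHERE subject = ? ORDER BY site, year",
        (subject,),
    ).fetchall()
```

Keying each row on site, subject, and year means an annual update adds a new row rather than erasing the previous year's description, so changes in a site's system can be traced over time.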

Bioinformatics Whitepaper

- Peter McCartney (CAP)

The IM executive committee reviewed the status of a whitepaper on bioinformatics that was prepared a year ago as a long-term vision statement about informatics in ecology and the role to be played by LTER information management. It was recognized that many of the funding initiatives anticipated during the preparation of that document are already becoming a reality and that the core concept of the paper (the need for cross-institutional partnerships in informatics research) is already being validated by the collaborations forged within and between the current KDI (both NCEAS and KU) and BDI informatics projects. Therefore, it was decided that the paper would be edited and revised from its current state with the intent to submit to Bioscience, or a similar high-profile ecology forum, so that our message about the changing role of information management in ecological science would receive the exposure it needs to influence decision-making about the design of major new research initiatives. The current version will be edited by McCartney and several members of the IM executive committee to augment a manuscript describing the LTER Network Information System (Baker et al). The white paper will also describe new information management initiatives and projects. We will solicit comments on the paper from all LTER information managers.

LTER Site Description Directory Update

- Karen Baker (PAL)

The availability of basic information about site characteristics for the LTER network of sites is addressed by the LTER Site Description Directory module (SiteDB). General site descriptive information includes specifics such as history, classification type, latitude, longitude, area, elevation, and site contacts. The database, originally developed as a single form relational database (MiniSQL) with web access (Lite) on Unix, is being redesigned and ported by the Palmer site in coordination with the LTER Network Office. The new schema involves a multiple table relational database (Access) with plans for use of web portable interface software (CGI/Perl) on a PC/NT. Development includes consideration of interfaces with other site directories. The LTER Site Description Directory is organized into general, site-specific and themed views. Entries are to be maintained by the individual LTER sites through web forms and input is moderated to ensure security. The directory, planned as part of the LTER network information system module, will support single or multiple site searches for information retrieval and comparison.

Planning Workshops for the National Ecological Observatory Network

- Barbara Benson (NTL)

The National Science Foundation is holding a series of planning workshops for a proposed program, the National Ecological Observatory Network (NEON). The first planning workshop was held at Archbold Biological Station, Florida from 9-12 January 2000. As proposed, NEON would establish 10 observatories around the country that will serve as national research platforms for integrated studies in field biology. Current plans anticipate a call for proposals for the first three NEON sites this fall.

The second NEON infrastructure planning workshop was held March 9-12, 2000 at the San Diego Supercomputer Center. Several LTER information managers were present (James Brunt, NET; Barbara Benson, NTL; John Porter, VCR). Participants reviewed "how the NEON observatories will provide a state-of-the-art infrastructure to support interdisciplinary, integrated research and allow scientists to conduct comprehensive, continental-scale experiments on ecological systems. Specific components of NEON observatories, as well as broader network needs were discussed. These components include site-based experimental infrastructure, natural history archive facilities, analytical instrumentation, communication networks, and computational facilities."

The second day of the meeting was the Information Technology Frontiers Symposium. The presentations included many cutting-edge technologies: web-based distributed systems for accessing data, web-based integrated database and analysis environments, wireless technology, data handling and information discovery systems for large scientific data collections, usability challenges to accessing ecological data, integrated modeling across scales, visualizing data in a dynamic temporal context, shared virtual realities. Powerpoint presentations for these talks can be found at http://www.sdsc.edu/NEON/presentations.html. Video clips can be viewed at http://www.sdsc.edu/NEON/video.html.

Good Reads


How To Manage Data Badly (Parts 1 & 2)

- Darrell Blodgett (BNZ)

  1. Hale, S.S. 1999. How To Manage Data Badly (Part 1). Bulletin of the Ecological Society of America, 80 (4): pp. 265-268.
    • (Online PDF file at The Ecological Society of America: pages 20-23).
    • How To Manage Data Badly (Part 1) is a sarcastic set of ten rules for the database manager or administrator to follow in order to "manage data badly". The rules will probably be recognized by data managers who have seen or experienced fully or partially compliant systems in the past. Rules such as "Rule 1. One world, one database" and "Rule 2. Users are losers" give you a good idea of the humorous content of the article.
  2. Hale, S.S. 2000. How To Manage Data Badly (Part 2). Bulletin of the Ecological Society of America, 81 (1): pp. 101-103.
    • (Online PDF file at The Ecological Society of America: pages 1-3).
    • How To Manage Data Badly (Part 2) describes rules 11 through 16, which are rules for scientists to ensure that their data is managed badly. Again, these rules are familiar to data managers who deal with the small percentage of scientists who adhere to one or more of these rules. The article concludes by encouraging database managers and scientists to work together following the 16 principles to achieve widespread recognition. The epilogue goes on to describe the importance of good data management and lists several things that are needed to do a better job of managing data.

FAQ


Where can I find information on how to copyright data, web material, or other documents?

- Karen Baker (PAL)

With the increasing production of web sites and CD products, the question of copyright arises for the information manager. US law provides copyright as a form of protection to the authors of "original works of authorship". Copyright exists from the time a work is created in fixed form. With works made during employment, the employer (not the employee) is considered the author. The owner of the copyright can authorize use, reproduction, modification and distribution. A copyright notice often includes a statement that the material is provided on an "as is" basis with respect to warranties.

Online help includes a US Copyright Office web site: http://lcweb.loc.gov/copyright and a readable synthesis and FAQ by Terry Carroll: http://www.aimnet.com/~carroll/copyright/faq.home.html. A university often provides copyright information via the offices of publication, public counsel and/or technology transfer. These offices may have a copyright disclosure form for material that needs to be marketed or licensed. Copyright registration is not required, but registration with the university provides the benefit of having an advocate to pursue copyright infringements.

A recently generated Palmer CD carried the following copyright:

"Copyright © January 2000, The Regents of the University of California. Permission to use, copy, modify and distribute the contents of this CD for educational, research and non-profit purposes, without fee, and without a written agreement is hereby granted provided the copyright notice and acknowledgement of the Palmer Long-Term Ecological Research archive appear on all copies. Any other use of this material without permission is prohibited. The CD contents are provided on an "as is" basis without any warranty."

I have a large number of CDs to label. What are my options?

- Karen Baker (PAL)

Since technology has developed to the point that a CD can be created at a reasonable cost either through duplication or replication, it is important to consider the CD packaging.

Making an optimum master disk (i.e. with a track-at-once session rather than multisession recording) is just the beginning of the CD creation process. A traditional plastic CD case provides for a front insert and/or back plus spine inserts, which can be generated with software associated with the CD burner. Because of the cost of generating and handling these inserts, an alternative design would be to use a slim jewel case with no inserts, putting all the cover information on the label or the CD itself.

There are a variety of label types to consider, listed here from least to most expensive:

  • Paper
  • Inkjet
  • Silkscreen
  • Offset printing

The first two are typical options when disks are duplicated while the latter two are used when disks are replicated.

A paper label can be generated on any PC, given the purchase of special paper and software, but questions exist about the longevity of the paper solution.

The inkjet option provides a full-color solution using a midrange-cost printer (~$800-$2000), which can be found in small production shops today (although perhaps destined to appear in copy centers within the year, much like the historical development of large-format plotter availability). This solution uses a blank CD purchased with a white ink background on which colors are printed. Production costs are approximately $3 per disk including the jewel case, copy and label with an order of more than 100.

The silkscreen option begins in the same cost range, but the cost increases with the number of colors, and it requires a production run of more than 500 CDs.

Offset printing is most expensive but allows for full color (CMYK) printing.

Note: a CD is an original product so copyright exists with its production. University report series already have begun to issue technical report numbers to CD publications although a paper copy of a brief overview of the contents is often requested.

What guidance is there for new LTER information managers?

- Karen Baker (PAL)

The addition of three new sites to the network in the year 2000 prompted the creation of an informal online guide for new LTER Information Managers: http://www.icess.ucsb.edu/lter/dm/projects/imguide

Topics covered include:

  • Communications
  • Administration contributions requiring attention
  • Research contributions requiring attention
  • References

The references give a good overview of the history of individual and network efforts with respect to data management. This synthesis of information will be updated on an ongoing basis. Plans are developing to add the FAQ and GOOD-READ entries from the LTER IM Newsletter, "DataBits", to this guide to provide additional background for information managers new to the LTER network.

Calendar


LTER Metadata Standards Committee Meeting 18-20 Feb 00

The LTER Metadata Standards Committee (NIS group) met immediately after the information managers executive committee meeting to focus on metadata. The goals included:

  1. Review existing standards for metadata content and format
  2. Develop recommendations for a standard / set of standards for LTER site metadata
  3. Develop implementation plans for bringing LTER sites into compliance with these recommendations within a reasonable time frame

LTER Information Manager Executive Committee Meeting 17 Feb 00

The information managers executive committee (imexec@lternet.edu), formerly known as datatask@lternet.edu, met at the network office for a productive one-and-a-half-day meeting, followed by a two-day meeting of the network information system (NIS) working group. Topics covered included the following:

  1. The group asked itself the perennial question faced by all LTER scientists, "what do we have on our plate vs. what should we have on our plate?"
  2. The IM workshops planned for the upcoming All Scientists Meeting were discussed (see separate article in this issue).
  3. Several NSF initiatives (ITR (information technology research, formerly IT2), NEON and KDI) were discussed, as well as the research partnership with SDSC and NACSE.
  4. Peter McCartney described his recent successful DBI proposal and how it fits in and is complementary with the work that Matt Jones and Rudolf Nottrott are doing on metadata via NCEAS.
  5. We had a follow-up discussion on the White Paper: The Future of Informatics in LTER: Our Vision. We discussed plans for an appropriate publishing outlet for a stream-lined article. Peter McCartney will take the lead and all within the IM executive committee are invited to provide meaningful input.
  6. Barbara Benson brought Dick Olson's proposal for ESA's long-term study section to us for our input. Dick has received funding from ESE to develop a database of long-term research sites compiled from existing inventories and would like the LTER data management community to comment on how to build it to maximize its utility and overall usefulness.
  7. Karen Baker demonstrated the development and current status of SITEDB. This prompted a thorough discussion of many of the issues associated with the development of individual modules, i.e., prototypes, of the NIS. Several of these issues were later discussed within the NIS working group meeting.
  8. Don Henshaw discussed the status and needed next steps of the CLIMDB (climate database).
  9. James Brunt discussed ASBIB (all-site bibliography). Again, many similar issues underlie all three NIS prototype development efforts, and it is anticipated that we, as a group, will continue the discussions on the balance between site leadership and network office support in prototype development activities, such as these, for network infrastructure. The issue of scalability and maintenance must be addressed in a proactive way to ensure the future development of important modules - otherwise they will languish.

(Submitted by Susan Stafford on behalf of the IM executive committee (Barbara Benson, Mike Hartman, Don Henshaw, Darrell Blodgett, Ned Gardiner, Peter McCartney and Karen Baker). We were joined by Denise Steigerwald and John Campbell from the NIS working group (metadata standards committee)).

LTER Coordinating Committee Meeting October 1-2, 1999

The LTER Coordinating Committee (CC) met October 1-2, 1999 at the Hubbard Brook LTER site. The meeting combined planning for the 2000 LTER All-Scientists Meeting with a science topic: "Patterns and Control of Net Primary Productivity Across Biomes." Discussion of the All-Scientists Meeting focused on developing a network funding model that would maximize participation by LTER scientists across the entire network. The decision was to go with an individualized system that coupled funds provided by NET with the anticipated travel costs from each site. There was also substantial time spent discussing NEON (National Ecological Observatory Network) and the opportunities it might present LTER sites. Gus Shaver was elected to replace outgoing EXEC committee member Ray Smith, whose 3-year term had expired, and there were discussions about ways to provide a memorial for Tom Callahan.

The Net Primary Productivity (NPP) workshop featured presentations on primary production in a number of different biomes, including grasslands, forests, lakes and marine systems. Of special interest to information managers, at the spring 1999 CC meeting, there was extensive discussion of the need to link science topics at future CC meetings to the development of network-wide standards for measurements. The NPP workshop included a breakout discussion group for NPP standards for the LTER Network.

Of particular interest to information managers, the LTER Network Office is working with OBFS (the Organization of Biological Field Stations) on developing a position at NET that would focus on OBFS connectivity and ecoinformatics issues. There was also a discussion of the KDI grants held by U. Kansas and NCEAS and the opportunities afforded by developing a Partnership for Biodiversity Informatics in conjunction with NCEAS.

The tour of Hubbard Brook was also not without its interest for information management. The tour included the extensive archive of physical samples maintained at HBR. Check http://www.vcrlter.virginia.edu/images/lter_network/HBR for details.

Calendar

  • 00 May LTER Metadata Standards Committee Meeting
  • 00 Aug 02-04 All Sci Meeting; Snowbird, Utah
  • 00 Aug 01-04 LTER IM Meeting; Snowbird, Utah
  • 00 Aug 06-11 Ecological Society of America Meeting; Snowbird, Utah