Note on Category Formation

Fall 2010

Karen Baker (CCE, PAL)

New categories that arise within a community are noteworthy because they represent ways of organizing information and reflect current community understanding. Categories may initially be understood only vaguely yet still become useful, partly because their definitions emerge somewhat naturally over time. The Ocean Informatics team that works with PAL and CCE LTER has found it useful to adopt two categories that exist within the LTER community, ‘signature’ and ‘core’, and to work on distinguishing the two during this period of category formation in the broader network. Their definitions will likely change or sharpen over time.

On site websites and within data catalogs, LTER datasets are being tagged as ‘signature’ and ‘core’ datasets. In CCE and PAL use, the ‘signature’ datasets are those that represent a time series long enough to be significant for that type of data; the ‘core’ datasets are those that focus on LTER themes. These two categories are explicitly recognized as non-exclusive; that is, a site dataset may be in one, both, or neither category.

Currently, two uses of the ‘signature’ and ‘core’ categories are illustrated on the CCE site data page. First, representative examples of signature and core datasets are presented. These serve both as an introduction to site data and as an access point to the data system. Under each category, a graph is presented for the generalist interested in browsing, and a dataset link into the data catalog aids the more engaged participant ready to explore. This data presentation design evolved from a collaborative effort by a site scientist, graduate students, and information managers to review the project website.

The second use of these categories highlights the system functionalities built around keywords. As our information system architecture is designed to accommodate multiple sets of keywords, a set of nine keywords has been created for LTER-specific categorization. This set contains the original five LTER core science themes (disturbance patterns, movement of inorganic matter, movement of organic matter, population studies, primary production), three additional site-designated core keywords (education/outreach, information management, and social science), and the recent addition of ‘signature’. For the data presentation discussed above, it has been important to convey that representative examples are given rather than all datasets in each category. In the data catalog, however, all the datasets identified by keywords are available to be retrieved.
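
As a rough illustration of how such a keyword set might support both non-exclusive tagging and catalog retrieval, the following Python sketch may help. The nine keywords are those listed above; the dataset names, the catalog structure, and the function names are hypothetical and are not drawn from the actual PAL or CCE information system. Treating any of the eight theme keywords as marking a ‘core’ dataset is likewise an assumption made for the example.

```python
# Hypothetical sketch: the keywords come from the article, but the catalog
# contents, dataset names, and function names are invented for illustration.

LTER_CORE_THEMES = frozenset({
    "disturbance patterns",
    "movement of inorganic matter",
    "movement of organic matter",
    "population studies",
    "primary production",
})

SITE_CORE_KEYWORDS = frozenset({
    "education/outreach",
    "information management",
    "social science",
})

SIGNATURE = "signature"

# The full LTER-specific keyword set: 5 + 3 + 1 = 9 keywords.
LTER_KEYWORD_SET = LTER_CORE_THEMES | SITE_CORE_KEYWORDS | {SIGNATURE}

# Invented catalog: each dataset carries zero or more keywords from the set.
CATALOG = {
    "example-dataset-a": {"primary production", "signature"},  # both categories
    "example-dataset-b": {"population studies"},               # core only
    "example-dataset-c": {"signature"},                        # signature only
    "example-dataset-d": set(),                                # neither
}


def is_core(keywords):
    """Assumption: a dataset is 'core' if it carries any core keyword."""
    return bool(set(keywords) & (LTER_CORE_THEMES | SITE_CORE_KEYWORDS))


def is_signature(keywords):
    """A dataset is 'signature' if it carries the 'signature' keyword."""
    return SIGNATURE in keywords


def datasets_with(keyword, catalog=CATALOG):
    """Return the sorted names of all datasets tagged with the keyword."""
    if keyword not in LTER_KEYWORD_SET:
        raise ValueError(f"unknown keyword: {keyword!r}")
    return sorted(name for name, kws in catalog.items() if keyword in kws)
```

Because each dataset holds a set of keywords rather than a single category label, membership in ‘signature’ and ‘core’ is independent, which matches the non-exclusive definition; a call such as datasets_with("signature") retrieves every tagged dataset rather than only representative examples.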

The definition of a particular category or set of categories may change over time. Because development cycles run at different speeds at different levels, keeping categories loosely defined or ‘in formation’ at the community level avoids ‘premature standardization’ without precluding their definition and immediate use at the site level. Site-level changes can be made quickly, whereas community interest and consensus take time to develop. Though a site must stay in sync with definitions at the community level, the PAL and CCE use of ‘signature’ and ‘core’ categories gives us experience that may inform the development of these or related categories in broader semantic arenas. Our local work may be seen as contributing to network-wide category formation.