
Design and Rationale for a Minimalist Dynamic Datasite

Issue: Fall 2002

- Tim Bergsma, Kellogg Biological Station (KBS)

In the Fall 2001 Databits, Wade Sheldon explained that dynamic web pages are a great way to get data out of databases, and that databases are in turn a great way to update and control dynamic pages (Database Techniques for Creating Maintenance-free Web Pages). Here, I'll try to show that a tight relationship between dynamic pages and databases can give you a fully functional datasite with only two pages! Then I'll suggest some reasons for adopting the minimalist approach, even if only in part.

Design

Let's start by defining a critical concept: the zero-content web page. We'll assume that your primary data are tabular and already stored in relational databases. You could write a dynamic server page (JSP, ASP, CGI, or similar) for each data table, using appropriate connection parameters. If you provide about the same features for each table (nice header, nice footer, links to your homepage, whatever), you'll quickly notice that your various pages are nearly identical, except for the connection parameters. Why not have your page get even the connection parameters from a database? Suddenly, your pages are identical in structure and function, and can all be collapsed into a single "page-that-serves-tables". I call this a zero-content web page because it gets all its data, even connection data, from a database. (Actually, "near-zero": you'll still need to hard-code connection information for the "master" database.) You can use this one page to serve different tables by passing a "lookup" parameter in the URL.
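As a rough illustration of the idea, here is a minimal Python sketch of a page-that-serves-tables. It is not the KBS implementation (the example pages at that site are JSP); Python and SQLite are used only to keep the example self-contained. The `catalog` table, its column names, the `DEMO-001` lookup key, and the sample yield data are all hypothetical. The only hard-coded connection detail is the path to the "master" database; everything else is looked up.

```python
import os
import sqlite3
import tempfile


def build_demo(dirname):
    """Create a hypothetical data database and a master catalog that points to it."""
    data_path = os.path.join(dirname, "yields.db")
    data = sqlite3.connect(data_path)
    data.execute("CREATE TABLE yield (year INTEGER, crop TEXT, kg_ha REAL)")
    data.executemany(
        "INSERT INTO yield VALUES (?, ?, ?)",
        [(2000, "corn", 8100.0), (2001, "corn", 7900.0)],
    )
    data.commit()
    data.close()

    master_path = os.path.join(dirname, "master.db")
    master = sqlite3.connect(master_path)
    # One catalog row per served table: lookup key -> connection parameters.
    master.execute(
        "CREATE TABLE catalog ("
        "product TEXT PRIMARY KEY, db_path TEXT, table_name TEXT, title TEXT)"
    )
    master.execute(
        "INSERT INTO catalog VALUES (?, ?, ?, ?)",
        ("DEMO-001", data_path, "yield", "Demo crop yields"),
    )
    master.commit()
    master.close()
    return master_path


def serve_table(master_path, product):
    """Serve any catalogued table, given only the lookup key from the URL."""
    master = sqlite3.connect(master_path)
    try:
        row = master.execute(
            "SELECT db_path, table_name, title FROM catalog WHERE product = ?",
            (product,),
        ).fetchone()
    finally:
        master.close()
    if row is None:
        raise KeyError(f"unknown product: {product}")

    db_path, table_name, title = row
    data = sqlite3.connect(db_path)
    try:
        # The table name comes from the trusted catalog, never from the URL.
        cursor = data.execute(f'SELECT * FROM "{table_name}"')
        columns = [description[0] for description in cursor.description]
        rows = cursor.fetchall()
    finally:
        data.close()
    return title, columns, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        title, columns, rows = serve_table(build_demo(demo_dir), "DEMO-001")
        print(title, columns, rows)
```

Adding a new table to such a site would mean inserting one more row into the catalog; the page code itself would not change.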

A second critical concept is a design principle regarding how your datasite's workload is distributed among pages: organize pages by function rather than by dataset. We already have a page that serves tables; we also need a page that serves metadata (or at least pointers to metadata). There: a complete datasite with only two pages! For a crude example of a page-that-serves-tables, see http://lter.kbs.msu.edu/Data/table.jsp?Product=KBS002-001&limitBy=Year&order=desc. For a page-that-serves-metadata, see http://lter.kbs.msu.edu/Data/LTER_Metadata.jsp?Dataset=all.
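The function-oriented layout can be sketched as a small dispatcher: one handler per function, with the dataset chosen entirely by URL parameters. This Python sketch is only an illustration under stated assumptions. The page names and the `Product` and `Dataset` parameters are taken from the example URLs above, but the handler names, the returned dictionaries, and the default of `all` for a missing `Dataset` are hypothetical. Other parameters such as `limitBy` and `order` are passed through without interpretation, since their exact meaning is not described here.

```python
from urllib.parse import parse_qs, urlsplit


def table_page(params):
    """Page-that-serves-tables: one handler for every data table."""
    if "Product" not in params:
        raise ValueError("the table page needs a Product parameter")
    options = {key: value for key, value in params.items() if key != "Product"}
    return {"function": "table", "lookup": params["Product"], "options": options}


def metadata_page(params):
    """Page-that-serves-metadata: one handler for every dataset."""
    # Assumption: a missing Dataset parameter means "list everything".
    return {"function": "metadata", "lookup": params.get("Dataset", "all")}


# The whole datasite: two functions, regardless of how many datasets exist.
PAGES = {
    "table.jsp": table_page,
    "LTER_Metadata.jsp": metadata_page,
}


def dispatch(url):
    """Route a request to a page by function, passing along its URL parameters."""
    parts = urlsplit(url)
    page_name = parts.path.rsplit("/", 1)[-1]
    handler = PAGES.get(page_name)
    if handler is None:
        raise LookupError(f"no such page: {page_name}")
    # Keep the first value of each parameter; repeated parameters are ignored.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return handler(params)


if __name__ == "__main__":
    print(dispatch("http://lter.kbs.msu.edu/Data/table.jsp"
                   "?Product=KBS002-001&limitBy=Year&order=desc"))
    print(dispatch("http://lter.kbs.msu.edu/Data/LTER_Metadata.jsp?Dataset=all"))
```

Note that `PAGES` grows only when a new function is added, never when a new dataset is added.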

I hear you saying, "Hmmm...kinda like a meal of tofu and bran flakes: complete, but not very satisfying." I agree. You'll probably want to add pages to serve images, maps, personnel, citations, or some other specialized entity type. And you'll almost certainly want to add control and navigation pages, as Wade described earlier, so that your users never have to type all that URL-parameter stuff by hand. For especially rich datasets, you may want to add pages that have dataset-specific functions. You could also add pages for submitting rather than retrieving data. But chances are, you will always have fewer data functions than datasets. So organizing by function helps you achieve a "minimal" datasite.

Rationale

The reasons for adopting a minimalist approach to datasite design are all variations on the theme "low maintenance". Properties of a minimalist dynamic datasite include the following.

  • Maintainability. There are fewer pages, so there are fewer places to look for an error, if one occurs (my site dropped from 60-100 dynamic pages to fewer than 10). The pages themselves need no editing when the database content changes. Also, there are fewer links to maintain: by authoring just two hyperlinks, you can make every data "page" point to its corresponding metadata "page", and vice versa.
  • Adoptability. A minimalist design, in whole or in part, could be easier to re-implement at another site.
  • Intelligibility. With just a few pages, you leave less of a mess for the next data manager. And, as someone has pointed out, the next manager might just be you.
  • Extensibility. If you want to add a data manipulation feature for all your data tables, you only have to add it in one place.
  • Scalability. No matter how many datasets you accrue, the size of your datasite never needs to change.
  • Consistency. Since, for instance, all your data tables are served by the same page, all your data "pages" have the same look and feel, by default.
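The "two hyperlinks" point under Maintainability can be sketched in Python as two link templates that are filled in from the database rather than written by hand. The page names and parameter names follow the example URLs given earlier; the `CATALOG` mapping, the `DEMO` identifiers, and the function names are hypothetical stand-ins for a lookup in the master database.

```python
from urllib.parse import urlencode

# Hypothetical product -> dataset mapping; in practice this would be
# a query against the master database, not a literal in the page.
CATALOG = {
    "DEMO-001": "DEMO",
    "DEMO-002": "DEMO",
}


def metadata_link(product):
    """The one hyperlink authored on the table page: data -> metadata."""
    return "LTER_Metadata.jsp?" + urlencode({"Dataset": CATALOG[product]})


def table_links(dataset):
    """The one hyperlink authored on the metadata page: metadata -> data."""
    return [
        "table.jsp?" + urlencode({"Product": product})
        for product, owner in sorted(CATALOG.items())
        if owner == dataset
    ]


if __name__ == "__main__":
    print(metadata_link("DEMO-001"))
    print(table_links("DEMO"))
```

Because both templates read from the same mapping, a newly catalogued table would be cross-linked in both directions without any page being edited.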

Conclusion

The power of the minimalist site design derives from two well-known principles: the principle of code reuse in computer science, and the principle of normalization in database theory. Writing a single page that serves many tables is simply an example of code reuse. Getting all content from a relational database means each piece of content can be represented exactly once, and therefore definitively (normalization, or data reuse). Organizing a dynamic datasite into a few function-oriented pages can greatly decrease the maintenance burden, which in turn frees up effort for adding functionality. While few sites will actually limit themselves to just two pages, the minimalist approach could yield benefits wherever it is applied.

Note

I use the term "datasite" to represent an integrated subset of a website that is dedicated to providing formal data. At my site, a secretary maintains the administrative part of the website, and I mainly concern myself with the part devoted to delivery of research products.