Mason Kortz (PAL, CCE)
Collaboration in a distributed environment is a cornerstone of the LTER - and of the LTER IMC. This article reviews a collaborative design meeting held in early 2011 to design and develop an updated LTER personnel database.
The LTER Network Office has recently started funding product-oriented working groups – working groups that are focused on the creation of a specific scientific, technical, or organizational product. The working groups may be created for the express purpose of creating a product, or they may be formed as a subset of an existing, long-term working group that pursues a general theme across many projects. The LNO funding available to the product-oriented working groups can be used for travel and meeting expenses, providing these groups with an opportunity for face-to-face interaction.
The Web Services Working Group (WSWG) was recently awarded funding for a product-oriented subgroup focused on redesigning the LTER personnel database, and we chose to use our funded meeting time for application design and development. The personnel database represents a continuation of a collaborative model that started with the ProjectDB and Unit Registry projects: applications that are designed and developed by the community, with LNO input, to fulfill a network-level role. The first part of this article describes the meeting process and its outcomes, including both design decisions and applications. The final section reviews the benefits of targeted design and development meetings and considers the role such meetings can play in the LTER network.
The PersonnelDB design and development meeting was held from February 21st to 25th, 2011, at the LTER Network Office in Albuquerque, NM. The meeting was attended by six information managers from different sites – Sven Bohm, Gastil Buhl, Corinna Gries, Mason Kortz, Wade Sheldon, and Jonathan Walsh – as well as several members of the Network Office – James Brunt, Mark Servilla, Marshall White, and Yang Xia. The meeting was broken into two segments. The first two days were dedicated to making design decisions; days three through five focused on beginning the development process. All of the participants listed above attended the design portion of the meeting. The development portion was attended by Sven Bohm, Mason Kortz, and Wade Sheldon.
Days 1-2: Design and Discussion
During the first two days of the meeting, participants were tasked with creating an application specification that could later be used as a reference by the development team. The group began by scoping the new PersonnelDB database, web service, and interfaces. We discussed potential use cases from the site, network, and public perspectives and determined which of these use cases did or did not fall in the purview of the PersonnelDB. Scoping also included the discussion of which personnel would be included in the database, and who would have access to maintain these records.
As decisions were made and the scope narrowed, the group shifted towards a more technical design perspective. Using the use cases we had previously discussed, the group designed a data model that could represent all of the necessary data and relationships in the personnel database. From this data model we designed two implementations: a relational database schema to store personnel data, and an XML schema to exchange data between the web service and its clients.
Having established design specifications for the data storage and exchange mechanisms, the group worked on more specific technical questions. We created a basic REST syntax for the web service, along with general guidelines for extending the syntax if necessary. We also discussed features for the user interfaces to the personnel service, and mocked up a search interface with a workflow diagram. These were reviewed and compiled with the use cases and data model specifications from the earlier discussions to create a design specification as reference for the development phase.
Throughout the meeting we discussed organizational design as well as technical design. Specifically, we considered the roles and responsibilities of the information managers, network office managers, and network personnel in maintaining the personnel database. The group also discussed the maintenance of the servers on which the development of the PersonnelDB, and future WSWG projects, would take place.
Days 3-5: Development
The last three days of the workshop focused on developing the PersonnelDB application, using the specifications produced from the first two days. The first task was to set up a development environment in the LNO computational infrastructure; this was done by creating a virtual server and giving the PersonnelDB development team access to manage data and applications on the server. After establishing the server environment, development was done on a MySQL database, PHP entity model and web service implementation, and the XML schema, including XPath and XSLT code for searching and displaying PersonnelDB records.
During the development portion of the meeting, the design specifications continued to evolve as we implemented them. Because of the detail we put into the specifications in the first part of the meeting, most of these changes were relatively small and dealt with technical implementation issues. Although several of the meeting participants were not present during the development phase, we were able to continue the design discussion over email and through VTC meetings.
The design and development meeting at LNO was extremely productive, but the PersonnelDB project required work beyond even a very productive week! Part of the wrap-up during the last day was to outline ongoing work and assign tasks to meeting participants. These tasks fell into two general categories: development and documentation. Ongoing development work includes finalizing and testing the PersonnelDB web service and creation of search and management interfaces to the service. Documentation tasks include user manuals, schema and code documentation, and review and analysis of the design process (such as this article). All of this work is supported, in part, by information management buyout time funded by the LNO.
The week-long meeting at LNO resulted in several important design decisions, as well as progress on the PersonnelDB application. Here, I describe the most notable decisions made and the reasoning behind them, as well as the application components and documentation that have been, or are being, produced.
Centralized Web Service Enabled Database: The personnel database redesign was focused on creating an application that was useful in both the network and site contexts. To do this, we decided on a centralized database that would be hosted at LNO, providing a single authoritative source for personnel and roles throughout the network. In order to support the distributed environment of the LTER sites, we also decided on web service access to the data. This allows any site to use the contents of the centralized database as part of their local data system without having to maintain a duplicate copy of the database. For those sites that choose to maintain their authoritative personnel information locally, the web service interface also enables bi-directional synchronization, so sites can easily pull data from or push data to the centralized database.
Personnel/Profile Database Split: In scoping the PersonnelDB project, the group divided information about LTER network members into two categories: personnel information and profile information. Personnel information is organizational in nature, and includes a person’s roles, site affiliations, contact information, and status as active or inactive within the network. Profile information includes areas of research or technical expertise, working group membership, and collaborations with other LTER members. The PersonnelDB application will handle personnel information, while profile information will be handled by a proposed future application. The profile application would refer back to the PersonnelDB for personnel information.
Because of the organic growth of the existing personnel database, personnel and profile information were mixed together. The decision to split them was not one of the expected goals of the WSWG during this meeting; rather, the decision was made after analysis and discussion of our use cases. Personnel information requires an authoritative source for generating NSF reports, managing contacts for network emails, and integration with other applications. Thus the canonical list of persons in roles at sites needs more control, notification of changes, and archiving of history. Profile information is voluntary, and is more dynamic. Information such as research interests and collaborations will evolve as LTER members self-identify, and so a less rigidly controlled application seems appropriate.
Multiple Roles per Person: One of the most universally recognized needs in the updated personnel database was support for multiple roles per person, as the number of individuals associated with multiple sites has increased over time. Discussion led us to identify two types of roles: NSF roles (drawn from a standard list used in NSF reports) and local roles (roles that are unique to the LTER network or even a particular site). The former represents a rigid controlled vocabulary; the latter is more flexible, with managers being able to assign roles freely as well as create new roles. Each person in the database must have one or more NSF roles, and optionally may have any number of local roles as well. The division of roles into NSF and local influenced the management system (see below).
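The role rules above can be illustrated with a minimal sketch. It is not the PersonnelDB entity model (which is written in PHP); the class names, field names, and the sample NSF role vocabulary below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Placeholder controlled vocabulary (an assumption); the actual NSF role
# list used by the PersonnelDB is not given in this article.
NSF_ROLES = {"Principal Investigator", "Graduate Student"}


@dataclass
class Role:
    name: str
    site: str
    kind: str  # "nsf" (controlled vocabulary) or "local" (free-form)


@dataclass
class Person:
    name: str
    primary_email: str
    roles: list = field(default_factory=list)


def validate(person):
    """Return a list of rule violations; an empty list means the record is valid."""
    problems = []
    nsf_roles = [r for r in person.roles if r.kind == "nsf"]
    if not nsf_roles:
        problems.append("at least one NSF role is required")
    for role in nsf_roles:
        if role.name not in NSF_ROLES:
            problems.append("unknown NSF role: " + role.name)
    return problems


# A person with one NSF role at one site and one local role at another.
example = Person(
    "Example Person",
    "person@example.org",
    [
        Role("Graduate Student", "SITE_A", "nsf"),
        Role("Field Crew Lead", "SITE_B", "local"),
    ],
)
print(validate(example))
```

Local roles are deliberately left unchecked here, reflecting the flexibility managers have to create new ones.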
Management Access: Part of the organizational design work was discussing the permissions and responsibilities in managing personnel data. The group agreed that role and contact information should be shared between sites to avoid repetition of data when a person is involved with more than one site, but this opened the door for many questions about primary and secondary sites and the management access associated with each one. Ultimately we decided on the following three-tier management scheme:
- LNO managers can create personnel and edit identity (name and primary email) information, assign NSF roles, assign contact information, and designate contact information as primary.
- Site managers can create personnel and edit identity information for anyone with a role at their site. They can also assign and edit local roles and contact information, provided the role or contact information is associated with their site. This means that a site user can add information to any person in the database, but cannot remove or edit information created by another site.
- Personnel will be able to edit their own professional information in the proposed profile application.
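One possible reading of this scheme, expressed as permission checks, is sketched below. The dictionary layout, tier labels, and function names are hypothetical, and the sketch assumes that NSF roles are editable only by LNO managers and local roles only by a manager of the owning site; the production service may draw these lines differently.

```python
def can_edit_identity(user, person):
    """LNO managers may edit anyone; site managers only people with a role at their site."""
    if user["tier"] == "lno":
        return True
    if user["tier"] == "site":
        return any(role["site"] == user["site"] for role in person["roles"])
    return False


def can_edit_role(user, role):
    """NSF roles belong to LNO managers; local roles to the site that owns them."""
    if role["kind"] == "nsf":
        return user["tier"] == "lno"
    return user["tier"] == "site" and user["site"] == role["site"]


# Hypothetical users and records for illustration.
lno_manager = {"tier": "lno", "site": None}
site_a_manager = {"tier": "site", "site": "SITE_A"}
site_b_local_role = {"kind": "local", "site": "SITE_B"}
person = {"roles": [site_b_local_role]}

print(can_edit_identity(site_a_manager, person))    # no role at SITE_A
print(can_edit_role(site_a_manager, site_b_local_role))
```

Ordinary personnel fall through to False in both checks, since their self-service editing is planned for the separate profile application.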
‘No Delete’ Policy: In the interest of preserving a record of previous LTER personnel, and previous positions held by current LTER personnel, the group decided on a ‘no delete’ policy for the PersonnelDB database. Instead, whenever a person, role, or contact information element would be deleted, it is flagged as inactive. This decision was mainly supported by the need for other resources, such as bibliographies, to reference personnel who are no longer active in the LTER network.
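A 'no delete' policy of this kind is commonly implemented with an active flag. The sketch below uses Python's built-in sqlite3 module as a stand-in for the MySQL database; the table layout and column names are assumptions, not the actual PersonnelDB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " active INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO person (name) VALUES (?)",
    [("Person A",), ("Person B",)],
)


def deactivate(conn, person_id):
    """Flag the record as inactive instead of issuing a DELETE."""
    conn.execute("UPDATE person SET active = 0 WHERE id = ?", (person_id,))


def active_people(conn):
    """Names of people currently flagged as active."""
    rows = conn.execute("SELECT name FROM person WHERE active = 1 ORDER BY id")
    return [row[0] for row in rows]


deactivate(conn, 1)
print(active_people(conn))
# The inactive record is still stored, so other resources can reference it.
print(conn.execute("SELECT COUNT(*) FROM person").fetchone()[0])
```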
Custom XML Exchange: When the WSWG members initially discussed XML as an exchange mechanism, we proposed using the Party element from the EML specification. As we compared EML to our use cases, we realized that some of the information we needed to encode, including roles, active and inactive data, and multiple name aliases, could not be represented in EML without extending the schema. Furthermore, the structure of the "Party" element would require repetition of name and identifier information with each role or contact information element. After weighing the complexity of extending EML against the time to draft a new schema, we decided to create a custom schema that more closely mirrored the PersonnelDB data model, but used elements borrowed from EML, allowing for easy conversion back to an EML Party element.
Web Service: The primary product developed by the PersonnelDB team is a web service providing access to personnel information. The web service application consists of several sub-components: a MySQL database, a PHP entity model that abstracts access to the database, a PHP service layer that accepts and processes HTTP requests, and an authentication/authorization mechanism that uses the LNO LDAP server. The PersonnelDB web service supports both read and write interactions, and provides the point of contact between user interfaces and the actual personnel data.
XML Tools: A library of XML tools is a third product type. This library, which can be used with the outputs of the PersonnelDB web service, is being created to assist developers in creating their own personnel-enabled sites and applications. These XML tools include XPath queries for common sorting and subsetting actions and XSL stylesheets for transforming PersonnelDB outputs into other useful formats, including EML and HTML.
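As an illustration of the kind of subsetting such queries support, the sketch below runs two XPath-style selections with Python's standard xml.etree.ElementTree module. The sample document, including its element and attribute names, is invented for this example and does not reflect the actual PersonnelDB XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical service output; element and attribute names are assumptions.
SAMPLE = """
<personnel>
  <person active="true">
    <name>Person A</name>
    <role type="nsf" site="SITE_A">Graduate Student</role>
  </person>
  <person active="false">
    <name>Person B</name>
    <role type="local" site="SITE_B">Field Technician</role>
  </person>
</personnel>
"""

root = ET.fromstring(SAMPLE)

# Subset 1: names of people flagged as active.
active_names = [
    p.findtext("name") for p in root.findall("./person[@active='true']")
]

# Subset 2: names of people holding any role at a given site.
site_b_names = [
    p.findtext("name")
    for p in root.findall("./person")
    if p.find("role[@site='SITE_B']") is not None
]

print(active_names)
print(site_b_names)
```

ElementTree implements only a subset of XPath, so a full XPath or XSLT processor would be needed for the sorting and transformation tasks the library also covers.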
Manuals: User manuals are in development for both the search interface and the management interface. Developer documentation is also being produced. The PersonnelDB XML schema is already fully annotated, with documentation available via the LNO subversion server. A full description of the REST web service interface will also be released for developers who wish to integrate the PersonnelDB service with their own applications. Finally, the data model, MySQL implementation, and PHP code base will be documented to facilitate maintenance and upgrades of the application in the future.
Reviews: In addition to technical documentation, the WSWG will be producing documentation of the organizational elements of the PersonnelDB project. The first such documentation was a report made to NISAC in March 2011. The second piece of documentation is this report, providing a summary and retrospective of the design and development meeting. An additional report will be made on release of the service and interfaces. Finally, a living document of bug reports and suggestions will be maintained on the LTER IMC web site.
Benefits of the Co-located Design and Development Meetings
Beginning a project in a distributed environment like the LTER network can be difficult – without a tangible starting point, participants are often at a loss on where to begin. To counter this, many distributed projects begin with one member assuming the role of project leader and creating an initial prototype product. This provides the group with a starting point, but it also reduces the benefit of having multiple designers. Beginning a design project with a co-located design meeting is one approach to addressing this issue. Personal interaction encourages participants to contribute proactively, rather than just responding to the project leader’s work. This leads to a greater investment in the project, which carries over beyond the initial meeting and into the distributed work that follows. In this way, a design meeting can be very important for overcoming the inertia with which distributed projects must cope.
In a distributed environment, focusing the attention of all participants on a single task or issue at the same time can be a daunting task. Because of this, it is often unfeasible to have all members contribute to every decision. This defeats one of the primary purposes of group design – incorporating a broad and diverse set of perspectives. In a co-located meeting, each participant can contribute, and do so in an environment that encourages discussion and dialogue. This leads to more informed decision making, and thus to a more robust design for the project.
Another benefit of a co-located design meeting is the speed with which information is exchanged, challenges are recognized and discussed, and decisions are made. In a distributed environment, participants’ schedules often do not align, so collecting group feedback or reaching consensus may take days or weeks. The real-time interaction provided by a co-located design meeting, as well as the lack of distractions, helps focus the group on the task at hand. A design group can accomplish in a few days work that could take months in a distributed environment.
Co-located development provides its own benefits. Initially, it may seem that a day of distributed development work is just as useful as a day of co-located development work. However, development work is rarely the rote implementation of an existing design. As development work proceeds, issues that were not recognized during the design phase may be uncovered or technical limitations may require redesign of the new product. In this way, development benefits from co-location just as design does. Additionally, collaborative development allows for an extremely useful exchange of development methodology in the form of quick, informal code reviews – something that is very difficult to coordinate in a distributed environment.
Overall, a focused design and development meeting is an excellent way to begin a project. The work that can be accomplished in a week, or even in a few days, provides momentum to keep a project moving forward in a distributed environment. Furthermore, the ability to have many contributors working together on the design and development of a project leads to informed decisions and high quality products that can serve as the foundation for future, distributed work.