Becoming an Information Professional: A Student Experience with UIUC MLIS Program’s Data Curation Specialization
Chung-Yi Hou, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
As the volume, format types, and sources of data increase rapidly with the invention and improvement of new scientific instruments, the ability to manage and curate data is becoming more important as well. The skills and knowledge required to provide stewardship for digital data are especially crucial, because the rate of digital data generation and usage significantly outpaces the number of trained information professionals who are available to support the varying data requirements of the different research domains. In fact, many major studies have reported the need for a sustainable framework and policies that allow ongoing programs to develop and train skilled information professionals. For example, the Research Information Network (RIN) and the British Library proposed that policy-makers need to help with “producing appropriate, effective and sustainable models for training and careers in managing data” (Research Information Network and the British Library, 2009, p. 52). Likewise, the European Commission published a report in 2010 emphasizing the need for data scientists and their expertise by placing the following call to action:
"We urge that the European Commission promote, and the member-states adopt, new policies to foster the development of advanced-degree programmes at our major universities for this emerging field of data science. We also urge the member-states to include data management and governance considerations in the curricula of their secondary schools, as part of the IT familiarisation programmes that are becoming common in European education." (European Commission, 2010, p. 32)
Additional studies and efforts by Library and Information Science schools, as highlighted by the “Preparing the workforce for digital curation: The iSchool perspective” panelists at the 9th International Digital Curation Conference (Hedstrom et al., 2014), also continue to help discover, improve, and expand the education and development opportunities for information professionals.
Among the many different programs that are implemented to address the need to prepare information professionals, the Graduate School of Library and Information Science (GSLIS) at the University of Illinois at Urbana-Champaign (UIUC) takes a “leading role in both data curation education and research” (GSLIS The iSchool at Illinois, n.d., p. 2) by offering a degree specialization in Data Curation. Specifically, the Specialization in Data Curation program was developed in 2006 through a grant from the Institute of Museum and Library Services (IMLS), and can be earned “either as part of an ALA-accredited Master of Science (MS) degree or, for students who have already completed their master’s degree, as part of a Certificate of Advanced Study (CAS) degree” (GSLIS The iSchool at Illinois, n.d., para 3). The program defines Data Curation as the “active and ongoing management of data throughout its entire lifecycle of interest and usefulness to scholarship, science, and education” (Cragin et al., 2007). By focusing on the collection, representation, management, and preservation of data, the program “brings together library and information science and archival theory as well as digital technologies for information discovery, access, and re-use” (GSLIS The iSchool at Illinois, n.d., p. 2). In addition, since UIUC’s Center for Informatics Research in Science and Scholarship (CIRSS) oversees the Specialization in Data Curation program, the Center injects further data curation education and research opportunities into the program. As a result, UIUC GSLIS expects that graduates of the Specialization in Data Curation program will be trained and develop the necessary expertise to be employed “across a range of institutions, including museums, data centers, libraries and institutional repositories, archives, and throughout private industry” (GSLIS The iSchool at Illinois, n.d., para 3) to help curate and manage the data and the associated requirements.
The knowledge and skills gained from the Specialization in Data Curation program can be applied to data in the sciences, humanities, and social sciences, and students can select elective courses beyond the three required core courses to tailor the program to their personal interests and professional goals. In addition, students can participate in projects hosted by CIRSS to gain hands-on curation experience. One example is the Data Curation Education in Research Centers (DCERC) program, which is led by UIUC in collaboration with the University of Tennessee and the National Center for Atmospheric Research (NCAR), a premier national research center with state-of-the-art data operations and services. The goal of the DCERC program is to develop a sustainable and transferable model of data curation education for master’s and doctoral students in Library and Information Science. To achieve this goal, a key part of the program has been summer internships for master’s students at NCAR. During the internships, the students are expected to complete a data curation project relating to scientific research at NCAR. The students are also paired with both science and data mentors based on their areas of interest and technical skill level in order to further understand NCAR’s research and data environment. Furthermore, the internship offers the students opportunities to participate in other activities, such as conferences, talks, and workshops, which allow them to explore additional topics relating to the practices and policies of data curation and management.
For the author, the DCERC program at NCAR provided the opportunity to work with mentors at the NCAR Computational and Information Systems Laboratory’s (CISL) Research Data Archive (RDA) and the NCAR Research Applications Lab (RAL) to make a unique climate reanalysis dataset publicly available, accessible, and usable. The dataset contains three-dimensional hourly analyses in netCDF format for the global atmospheric state from 1985 to 2005 (a total of 184,080 files) on a 40 km horizontal grid (0.4° grid increment) with 28 vertical levels. The dataset therefore provides a detailed representation of local forcing and the diurnal variation of processes in the planetary boundary layer, which allows and promotes studies of new climate characteristics. During the project, the author focused on three specific areas of the data curation process: data quality verification, metadata description harvesting, and provenance information documentation. When the curation project started, five years had passed since the data files were generated. Although the Principal Investigator (PI) had written a user document, the document had not been maintained. Furthermore, the PI had moved to a new institution, and the remaining team members had been reassigned to other projects. These factors made data curation in the author’s focus areas especially challenging, and the project consequently provided the author with a realistic environment in which to understand and practice the methodologies for resolving data curation issues in a scientific research setting. By the end of the eight-week internship, the author was able to make the dataset available, accessible, and usable through the data’s landing page at RDA. The project illustrated that proper and dedicated resources must be invested in the curation process in order to give datasets the best chance to fulfill their potential to support scientific discovery.
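As a rough illustration of the kind of completeness check that data quality verification can involve, the sketch below counts the hourly time steps between the start of 1985 and the end of 2005. It assumes one file per hourly analysis, which is not stated explicitly here but is consistent with the reported total of 184,080 files. The function name is hypothetical and is not taken from the project's actual tooling.

```python
from datetime import datetime, timedelta


def expected_hourly_files(first_year: int, last_year: int) -> int:
    """Count hourly time steps from Jan 1 of first_year through Dec 31 of last_year.

    Assumption: the archive holds exactly one file per hourly analysis.
    """
    start = datetime(first_year, 1, 1)
    end = datetime(last_year + 1, 1, 1)  # exclusive upper bound
    # Floor-dividing two timedeltas yields an integer number of hours.
    return (end - start) // timedelta(hours=1)


# 21 years including 5 leap years = 7,670 days = 184,080 hours
print(expected_hourly_files(1985, 2005))
```

Comparing such an expected count against the number of files actually present in an archive is one simple way to detect missing or duplicated time steps before publication.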
Equally important, the project team also reflected on the following key experiences with the data curation process:
- The data curator’s skills and knowledge informed decisions, such as those on file format and structure and on workflow documentation, that had a significant, positive impact on the ease of the dataset’s management and long-term preservation.
- Use of data curation tools, such as the Data Curation Profiles Toolkit’s guidelines, revealed important information for promoting the data’s usability and enhancing preservation planning.
- Involving data curators at each stage of the data curation life cycle, instead of only at the end, could improve the efficiency of the curation process.
As data and their associated management and curation requirements grow, the expertise of trained information professionals will become more important in meeting these challenges. It will be important to continue to raise awareness of, and emphasize the need for, a framework and policies that support the training and professional development of information professionals in the area of data curation. Meanwhile, through her dedicated academic coursework in data curation at UIUC and practical experience with the Research Data Archive at NCAR, the author provides an example of how current efforts are already making a positive impact on the next generation of information professionals, who are willing to take on these responsibilities and welcome the opportunity to do so.
References
Cragin, M.H., Heidorn, P.B., Palmer, C.L., & Smith, L.C. (2007). An Educational Program on Data Curation. Poster, Science and Technology Section of the annual American Library Association conference. Washington, D.C., June 25, 2007. Available: http://hdl.handle.net/2142/3493.
European Commission. (2010, October 6). Riding the wave – How Europe can gain from the rising tide of scientific data – Final report of the High Level Expert Group on Scientific Data. Retrieved from http://ec.europa.eu/information_society/newsroom/cf/dae/document.cfm?action=display&doc_id=707
Graduate School of Library and Information Science The iSchool at Illinois. (n.d.). Specialization in data curation. Retrieved from http://www.lis.illinois.edu/academics/degrees/specializations/data_curation
Graduate School of Library and Information Science The iSchool at Illinois. (n.d.). Specialization in data curation program overview. Retrieved from http://webdocs.lis.illinois.edu/comm/recruitment_pdfs/Specialization-Data-Curation.pdf
Hedstrom, M., Larsen, R., Palmer, C., DeRoure, D., & Lyon, L. (2014, February 25). Preparing the workforce for digital curation: The iSchool perspective. Panel discussion presented at the 9th International Digital Curation Conference, San Francisco, CA.
Research Information Network and the British Library. (2009, November 2). Patterns of information use and exchange: case studies of researchers in the life sciences. Retrieved from http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie