Digitization Workflows

This video (07:20) on digitization workflows identifies five clusters (or stages) in the process of digitizing natural history collection objects using digital images. These stages can easily be adapted to other biodiversity data sources. If you are unable to watch the embedded video, you can download it locally (MP4 - 26.8 MB).
As the video highlights, digitization protocols vary from institution to institution, but it is essential that the chosen protocol is agreed, documented and respected.

We do not teach digitization per se during the workshop, as it could easily stand as a week-long course on its own. Instead, we focus on a basic introduction to biodiversity data capture. However, we want to provide you with resources on digitization, as we know many participants are interested in it.

There are many ways to organize digitization efforts, so digitization can seem daunting at first. It is important to remember that, in most cases, someone else has already tried to digitize the same types of specimens and objects that you plan to work with. In this exercise we introduce you to some practical digitization workflow resources to help you get started. These will also form the basis for the work we will do in the workshop on selecting, modifying and assessing workflows.

Some steps in the process may include:

  • Pre-digitization curation and staging: This includes the preparation of the data source for the digitization process, including the assignment of unique identifiers that will help to refer to the source without error and to keep all derived information together.

  • Image capture: This includes a fair amount of planning, not only on the image capture itself (e.g. definition of the work sequence, selection of adequate hardware), but also on how and where the images will be stored and handled.

  • Image processing: This includes quality control, file conversion, etc.

  • Electronic data capture: This is the core of the digitization process and involves capturing key information in a database. The video highlights that the most common method of entering the information is through a keyboard, but more and more institutions are turning to advanced data entry technologies.

  • Georeferencing: Geographical information is very important for biodiversity analysis, so digitization projects should seek to extract the most accurate geographical information possible.
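To make a few of the steps above more concrete, here is a minimal Python sketch. It is not taken from the video or from any particular institution's protocol. It assumes a hypothetical `SpecimenRecord` structure that receives a unique identifier at the staging step, names image files after that identifier so derived files stay together, and converts label coordinates written in degrees, minutes and seconds into decimal degrees. All class, field and function names are illustrative assumptions, and your own protocol may use a different identifier scheme or file naming convention.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpecimenRecord:
    """Hypothetical record keeping all information derived from one object together."""

    catalog_number: str
    # Assumption: a random UUID serves as the unique identifier assigned at staging.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))
    image_files: List[str] = field(default_factory=list)
    label_data: Dict[str, str] = field(default_factory=dict)
    decimal_latitude: Optional[float] = None
    decimal_longitude: Optional[float] = None

    def add_image(self, extension: str = "tif") -> str:
        """Register a new image whose file name starts with the record's identifier."""
        sequence = len(self.image_files) + 1
        filename = f"{self.identifier}_{sequence:03d}.{extension}"
        self.image_files.append(filename)
        return filename


def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal degrees.
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    record = SpecimenRecord(catalog_number="EXAMPLE-0001")
    record.add_image()          # master image
    record.add_image("jpg")     # converted derivative
    record.label_data["locality"] = "Example locality"
    record.decimal_latitude = dms_to_decimal(29, 39, 0, "N")
    record.decimal_longitude = dms_to_decimal(82, 19, 30, "W")
    print(record)
```

Because every file name and data field hangs off the same identifier, a later quality-control or georeferencing pass can locate everything that belongs to one object without relying on folder structure alone.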

Integrated Digitized Biocollections (iDigBio) is the coordination centre for the United States National Resource for Advancing Digitization of Biodiversity Collections (ADBC). It leads a nationwide effort to make data and images for millions of biological specimens available in a standard electronic format for the research community, government agencies, students, educators, and the general public. iDigBio has produced several videos that discuss the digitization process.

There are other videos in the iDigBio series that you may be interested in, if you wish to learn more about specific workflows for different specimen types: