Building: Main Building 1st Floor
Room: Salone degli Oceani
Last modified: 2013-10-02
Abstract
The Naturalis Biodiversity Center produces high volumes of media files. A project to digitize collection objects produces on average 0.6 terabytes (TB) of data (20,000 images and PDF files) each day. The original files are retained as backup in offline storage at the Netherlands Institute for Sound and Vision. Derived media files in lower resolutions are created for daily use and stored in the cloud. To process up to 1.2 TB of media files each day, harvesting, storage and processing run fully concurrently and can handle multiple media producers at the same time. A Media Server indexes the media files and serves them by URL; each media file gets a unique ID. Videos are served as streams. The Media Library has processed 1.6 million media files since it went into production three months ago. Naturalis aims to move all its media assets to this central Media Library facility.
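To make the volumes and the concurrent design more concrete, the following is a minimal Python sketch, not the Naturalis implementation. It assumes decimal terabytes, and the helper names (`media_id`, `process`, `ingest`), the hash-based ID scheme and the producer names are hypothetical illustrations only. It works out the average file size and the sustained throughput implied by the figures above, and shows one way several producers could be processed concurrently while each file receives a unique ID.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

TB = 10**12  # bytes per terabyte (assumption: decimal units)
SECONDS_PER_DAY = 24 * 60 * 60


def average_file_size_mb(total_tb: float, file_count: int) -> float:
    """Average size per file in megabytes (decimal)."""
    return total_tb * TB / file_count / 10**6


def sustained_rate_mb_per_s(total_tb: float) -> float:
    """Throughput in MB/s needed to handle total_tb within one day."""
    return total_tb * TB / SECONDS_PER_DAY / 10**6


def media_id(producer: str, filename: str) -> str:
    """Hypothetical ID scheme: a stable hash of producer and file name."""
    key = f"{producer}/{filename}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()[:16]


def process(item):
    """Placeholder for one unit of work on a single media file."""
    producer, filename = item
    # A real pipeline would create the lower-resolution derivative
    # and upload it to storage at this point.
    return media_id(producer, filename), f"{producer}/{filename}"


def ingest(items, workers: int = 4) -> dict:
    """Process files from any number of producers concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process, items))


if __name__ == "__main__":
    print(average_file_size_mb(0.6, 20_000))   # about 30 MB per file
    print(sustained_rate_mb_per_s(1.2))        # about 13.9 MB/s
    print(ingest([("scanner-a", "img1.tif"), ("scanner-b", "img1.tif")]))
```

Under these assumptions, 0.6 TB spread over 20,000 files is roughly 30 MB per file, and a peak of 1.2 TB per day corresponds to a sustained rate of roughly 13.9 MB/s.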
The next phase of the Media Library focuses on the development of a Search Facility. This includes a Drupal content management system for adding metadata and for searching and browsing the media assets. The metadata is stored in the TDWG Audubon Core format, a new media metadata standard that is soon to be released. Each media file can have only one ‘owner system’: the Naturalis collection or species registration system in which the metadata for the object and its associated media files is maintained. The owner system's metadata is combined with additional metadata generated by the Media Library and presented to the user as one integrated Audubon Core record. To index and integrate the media metadata, a Search Index is created using the open source Apache Solr search platform.
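The integration step described above can be sketched as a simple merge of two metadata sources. This is only an illustration under stated assumptions: the function name `integrated_record`, the `owner_system` field, the example URL and the rule that owner-system values win on conflict are all hypothetical. The prefixed keys are written in the style of Audubon Core terms but should be checked against the published standard.

```python
def integrated_record(owner_system: str,
                      owner_metadata: dict,
                      generated_metadata: dict) -> dict:
    """Merge owner-system and Media Library metadata into one flat record.

    Assumption: when both sources supply the same field, the value from
    the owner system takes precedence.
    """
    record = dict(generated_metadata)
    record.update(owner_metadata)
    # Each media file has exactly one owner system, so a single value suffices.
    record["owner_system"] = owner_system
    return record


if __name__ == "__main__":
    # Metadata maintained in the (hypothetical) owner system.
    owner = {"dcterms:title": "Herbarium sheet, specimen label side"}
    # Metadata generated by the Media Library (illustrative values).
    generated = {
        "dcterms:identifier": "abc123",
        "dc:format": "image/jpeg",
        "ac:accessURI": "https://example.org/media/abc123",
    }
    print(integrated_record("collection-registration", owner, generated))
```

A flat dictionary of this kind could then be submitted to a search index as a single document, so that users search one integrated record rather than two separate sources.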