An iDigBio perspective on Darwin Core Archives
Building: Grand Hotel Mediterraneo
Room: America del Nord (Theatre I)
Date: 2013-10-30 04:35 PM – 04:45 PM
Last modified: 2013-10-05
Abstract
In this paper, the iDigBio technical team will present our experiences as first-time implementers of Darwin Core Archive-based data processes, covering both the strengths of the ecosystem and format and some of their weaknesses for our use cases. We will start with a broad overview of the Darwin Core Archive ecosystem and the resources available to first-time implementers looking for guidance and examples of functional processes. We will also discuss our early immersion in standards and practices as we became familiar with the domain and the other projects within it. We feel that the Darwin Core Archive excels in this phase, and that GBIF’s strong leadership is one of the format’s greatest assets. Next, we will describe some of our early difficulties with the existing state of the tools and practices by discussing features that our community partners requested and the challenges we encountered in examining how to implement them. Many of these difficulties arose at points where the iDigBio project seeks to expand upon the existing work of GBIF, and we view them mostly as the natural growing pains of a format seeking to encompass more use cases. The single biggest barrier to fitting our workflow smoothly into the Darwin Core Archive was the lack of tool support for media objects as a core type in IPT. We will then discuss the more general issues that arose during the initial implementation phase of the project and the compromises we made in solving them. These issues are less related to the format itself but still have important implications for how files are constructed. In many cases they are projections of broader community issues into the domain of the format, such as the presence of strong identifiers or semantic issues.
Finally, we will present potential solutions or improvements to the tools, process, and ecosystem that would make the Darwin Core Archive format more accessible to first-time implementers. One key area that we feel could be improved, and to which we can contribute, is the expansion of support for Darwin Core Archives into languages other than Java. We will also provide recommendations, based on our use cases, for how the format could be modified or extended. One route would be to promote some of the pieces that currently exist as extensions to first-class entities. For example, if resource relationships could be expected to exist in every file, we could eliminate the concept of a core file entirely.
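
As an illustration of what support outside Java might look like, the following is a minimal Python sketch that reads the meta.xml descriptor of a Darwin Core Archive and reports its core file and extensions. It is not iDigBio's implementation: the sample descriptor, the file names, and the helper names `describe_table` and `describe_archive` are all assumptions made for this example, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace used by meta.xml descriptors in the Darwin Core text guidelines.
DWC_TEXT_NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical descriptor: an occurrence core plus one multimedia extension.
SAMPLE_META = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" ignoreHeaderLines="1">
    <files><location>occurrence.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/Multimedia" ignoreHeaderLines="1">
    <files><location>multimedia.csv</location></files>
    <coreid index="0"/>
    <field index="1" term="http://purl.org/dc/terms/identifier"/>
  </extension>
</archive>
"""


def describe_table(element):
    """Summarize one <core> or <extension> element as a plain dict."""
    location = element.find("dwc:files/dwc:location", DWC_TEXT_NS)
    # The core names its key column <id>; extensions point back with <coreid>.
    key = element.find("dwc:id", DWC_TEXT_NS)
    if key is None:
        key = element.find("dwc:coreid", DWC_TEXT_NS)
    fields = {
        int(f.get("index")): f.get("term")
        for f in element.findall("dwc:field", DWC_TEXT_NS)
        if f.get("index") is not None  # fields with only a default have no column
    }
    return {
        "row_type": element.get("rowType"),
        "location": location.text if location is not None else None,
        "key_index": int(key.get("index")) if key is not None else None,
        "fields": fields,
    }


def describe_archive(meta_xml):
    """Parse a meta.xml string into its core table and list of extensions."""
    root = ET.fromstring(meta_xml)
    core = root.find("dwc:core", DWC_TEXT_NS)
    return {
        "core": describe_table(core) if core is not None else None,
        "extensions": [
            describe_table(ext)
            for ext in root.findall("dwc:extension", DWC_TEXT_NS)
        ],
    }


if __name__ == "__main__":
    summary = describe_archive(SAMPLE_META)
    print(summary["core"]["location"])
    for ext in summary["extensions"]:
        print(ext["row_type"], ext["location"])
```

The sketch also shows the asymmetry discussed above: media records can only appear here as an extension keyed to a row in the core file, not as a table in their own right.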