Missouri Botanical Garden Open Conference Systems, TDWG 2014 ANNUAL CONFERENCE

Biodiversity Data Quality tailored for Scientific Workflows
Christian Gendreau, Allan Koch Veiga, David P. Shorthouse, Antonio Mauro Saraiva

Building: Elmia Congress Centre, Jönköping
Room: Rum 10
Date: 2014-10-28 05:07 PM – 05:20 PM
Last modified: 2014-10-03


To address some of the data quality challenges in biodiversity data, a new conceptual model and software implementations are under development. Both can be used to design powerful, reproducible scientific workflows that improve, measure and validate Biodiversity Data Quality (BDQ).

The conceptual model can be used to identify user needs and to describe them in a homogeneous way. Based on the conceptual model, reusable software can be designed or customized to deal with specific BDQ issues. The resulting implementations and customizations, possibly from multiple organisations, could then be composed into scientific workflows that tackle BDQ in a wide range of contexts.

Usage may range from adopting standards and best practices based on the conceptual model to calling services exposed by the implemented tools, such as the validation of an entire Darwin Core Archive (DwC-A). Through configuration and customization, we want to achieve better reusability across software platforms and to build a repository of validations and assertions contributed by the community. The framework should also offer an extension system to support functionality that goes beyond validation and assertion, such as a suggestion feature that could, for example, propose an ISO date representation for the various date formats received.
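
As an illustration only, the validation and suggestion ideas above could be sketched as follows in Python. This is not the framework's actual API: the registry `VALIDATIONS`, the functions `register`, `is_iso_date`, `suggest_iso_date` and `check_record`, and the list of candidate input formats are all hypothetical names and assumptions introduced for this sketch. Only the Darwin Core term `eventDate` comes from the standard. Day-first ordering is assumed for ambiguous dates such as 03/04/2014.

```python
from datetime import datetime
from typing import Callable, Dict, Optional

# Hypothetical registry: Darwin Core term -> validation function.
VALIDATIONS: Dict[str, Callable[[str], bool]] = {}

# Assumed input formats, tried in order; day-first is an arbitrary choice here.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d", "%d.%m.%Y")


def register(term: str):
    """Decorator that adds a validation function to the registry."""
    def decorator(func: Callable[[str], bool]) -> Callable[[str], bool]:
        VALIDATIONS[term] = func
        return func
    return decorator


@register("eventDate")
def is_iso_date(value: str) -> bool:
    """Validation: True only for a strict YYYY-MM-DD calendar date."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return False
    # Round-trip comparison rejects lenient matches such as "2014-1-5".
    return parsed.isoformat() == value


def suggest_iso_date(value: str) -> Optional[str]:
    """Suggestion: return an ISO 8601 date string, or None if no format fits."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def check_record(record: Dict[str, str]) -> Dict[str, dict]:
    """Run every registered validation on a record and attach suggestions."""
    report = {}
    for term, validate in VALIDATIONS.items():
        if term not in record:
            continue
        value = record[term]
        valid = validate(value)
        report[term] = {
            "valid": valid,
            "suggestion": None if valid else suggest_iso_date(value),
        }
    return report
```

In this sketch a workflow step would call `check_record` on each row of an occurrence file and collect the reports. Keeping the suggestion separate from the validation mirrors the idea that suggestions are an extension rather than part of the assertion itself.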

We will present the conceptual model and a software implementation that can be consumed by scientific workflow software to tackle particular BDQ issues. We will also present our ideas about how the scientific workflows community can use the conceptual model to deal with BDQ.