Missouri Botanical Garden Open Conference Systems, TDWG 2014 ANNUAL CONFERENCE

DaRWIN (Data Research Warehouse Information Network)
Franck Theeten, Marielle Adam, Paul-André Duchesne, Yann Chambert, Cathy Emery, Philippe Vignaux, Son Du, Didier Van den Spiegel

Last modified: 2014-09-25

Abstract


DaRWIN (Data Research Warehouse Information Network) is an online system for the management of natural history collections, developed and maintained by the Royal Belgian Institute of Natural Sciences (RBINS) in Brussels since 2010. Its open source code is available in a public Git repository: https://github.com/naturalsciences/Darwin/ (see also http://naturalsciences.github.io/Darwin/).

DaRWIN provides the complete suite of services needed to manage large collections. It is built on the Symfony and Doctrine frameworks and on a relational model implemented in PostgreSQL/PostGIS. This model is partly based on the ABCD (Access to Biological Collection Data) standard, which is also used to import data in batches.
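As a rough illustration of batch-wise ABCD import, the sketch below reads unit identifiers from an ABCD document and groups them into fixed-size batches. It is not DaRWIN's import code (which is written in PHP on Symfony): the namespace URI is assumed to be that of ABCD 2.06, the sample document and its identifiers are made up, and `iter_unit_ids` and `batched` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from itertools import islice

# Assumption: ABCD 2.06 namespace; check it against the schema version in use.
ABCD_URI = "http://www.tdwg.org/schemas/abcd/2.06"
ABCD_NS = {"abcd": ABCD_URI}

# Hand-written sample document with made-up identifiers.
SAMPLE = f"""<abcd:DataSets xmlns:abcd="{ABCD_URI}">
  <abcd:DataSet>
    <abcd:Units>
      <abcd:Unit><abcd:UnitID>EXAMPLE-0001</abcd:UnitID></abcd:Unit>
      <abcd:Unit><abcd:UnitID>EXAMPLE-0002</abcd:UnitID></abcd:Unit>
      <abcd:Unit><abcd:UnitID>EXAMPLE-0003</abcd:UnitID></abcd:Unit>
    </abcd:Units>
  </abcd:DataSet>
</abcd:DataSets>"""


def iter_unit_ids(xml_text):
    """Yield the UnitID of every Unit element found in an ABCD document."""
    root = ET.fromstring(xml_text)
    for unit in root.iter(f"{{{ABCD_URI}}}Unit"):
        unit_id = unit.findtext("abcd:UnitID", namespaces=ABCD_NS)
        if unit_id is not None:
            yield unit_id.strip()


def batched(items, size):
    """Group any iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for number, batch in enumerate(batched(iter_unit_ids(SAMPLE), 2), start=1):
        # A real importer would insert each batch in one database transaction.
        print(f"batch {number}: {batch}")
```

Processing units in batches keeps memory use bounded and lets a failed batch be retried without restarting the whole import.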

DaRWIN has been the main database system of the RBINS since 2010 and was adopted by the Royal Museum for Central Africa in Tervuren, Belgium, in 2013. Its close fit with the ABCD standard and the organisation of the Symfony framework make it possible to extend DaRWIN with modules that group related functionality, which may interest scientists and curators looking for a powerful and complete solution.

Three of these modules are:

-an Excel template that converts large files into ABCD documents. This tool can also perform uniqueness and syntax checks on the data.

-a client connecting to the taxonomic web services of CoL (Catalogue of Life), WoRMS (World Register of Marine Species) and GBIF (Global Biodiversity Information Facility). It checks the validity of taxa, verifies their higher classification and links them to bibliographical resources, if any. It also tags synonyms and links them to their accepted names, detects homonyms and uses the ‘fuzzy matching’ functionality offered by GBIF to suggest corrections for misspelled or unrecognized names. Reports are produced as Excel files in the ABCD standard, with global statistics and a colour scheme to identify each case.

-a fully configurable label-printing module. It connects two XML representations, one of the data and one of the label structure, to a jQuery API. Labels are created from dynamically generated CSS and XQuery instructions.
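The case-by-case reporting of the taxonomic client can be sketched as follows. This is a minimal illustration and not the module's actual logic: the field names `matchType` and `status` and their values are modelled on the GBIF species match response and should be verified against the GBIF API documentation, while the case labels, the colour mapping and the sample records are hypothetical.

```python
from collections import Counter

# Hypothetical colour scheme; the abstract does not say which colours are used.
CASE_COLOURS = {
    "valid": "green",
    "synonym": "blue",
    "suggested correction": "orange",
    "higher rank only": "yellow",
    "unrecognized": "red",
    "needs review": "grey",
}


def classify_match(match):
    """Map one name-matching result (a dict) to a report case."""
    match_type = match.get("matchType", "NONE")
    status = match.get("status")
    if match_type == "NONE":
        return "unrecognized"
    if match_type == "FUZZY":
        return "suggested correction"
    if match_type == "HIGHERRANK":
        return "higher rank only"
    if status == "SYNONYM":
        return "synonym"
    if status == "ACCEPTED":
        return "valid"
    return "needs review"


def summarize(matches):
    """Count how many submitted names fall into each report case."""
    return Counter(classify_match(match) for match in matches)


if __name__ == "__main__":
    # Hand-written records for illustration, not real service output.
    records = [
        {"name": "Puma concolor", "matchType": "EXACT", "status": "ACCEPTED"},
        {"name": "Felis concolor", "matchType": "EXACT", "status": "SYNONYM"},
        {"name": "Puma concolr", "matchType": "FUZZY", "status": "ACCEPTED"},
        {"name": "Xyz abc", "matchType": "NONE"},
    ]
    for record in records:
        case = classify_match(record)
        print(record["name"], "->", case, f"({CASE_COLOURS[case]})")
    print(dict(summarize(records)))
```

Keeping the classification separate from the service call means the same rules could be applied to results from CoL or WoRMS once they are mapped to a common record shape.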

These modules can be seen both as constituent parts of the DaRWIN system and as reusable external tools with their own API. DaRWIN is thus both a complete solution and a flexible ecosystem that supports the development of auxiliary tools reusable in several contexts.