Last modified: 2011-10-03
Abstract
At the 2008 TDWG conference, Tim Robertson presented some of the challenges encountered when attempting to consume large collections of occurrence data provided over the widely used protocols of TAPIR [1], DiGIR [2] and BioCASe [3]. A solution to these challenges was proposed in the form of a simpler, less verbose data exchange format. This proposal led to GBIF [4] joining the Darwin Core (DwC) [5] drafting team in 2009, and the eventual design of the Darwin Core Archive (DwC-A) format. In 2010, GBIF expanded the DwC-A to support checklists and further enhanced it with the inclusion of dataset descriptive data according to the GBIF metadata profile [6]. With these key features in place, GBIF engaged in targeted outreach to its Participants and promoted the adoption of the DwC-A, while simultaneously providing a number of guides and tools to support creation and publishing.
Since its introduction in mid-2010, DwC-A has seen steady adoption, most significantly, within the GBIF network, by publishers of large datasets. While the number of DwC-A publishers is still small relative to the number using established protocols like DiGIR, the volume of data they publish is large: in 2010, DwC-A accounted for 93 million occurrence records published in the GBIF network, representing 35% of the GBIF index and almost as many records as were published via DiGIR. That number continues to grow in 2011. A significant portion of 2011 DwC-A publishing can be attributed to the version 2.0 release of GBIF’s own Integrated Publishing Toolkit (IPT) [7] in early 2011, which simplifies publishing and uses the DwC-A exclusively as its publishing format.
While the DwC-A format has substantially improved the GBIF indexing process by simplifying data exchange and reducing the time it takes, the community must still address several challenges. These include whether other core types (e.g. a spatial core) need to be supported, and the social challenges surrounding extension and vocabulary governance.
[1] http://www.tdwg.org/dav/subgroups/tapir/1.0/docs/tdwg_tapir_specification_2010-05-05.htm
[3] http://www.biocase.org/products/protocols/
[5] http://www.tdwg.org/standards/450/