Missouri Botanical Garden Open Conference Systems, TDWG 2014 ANNUAL CONFERENCE

StanDAP-Herb develops a standard process for extracting metadata from digitised herbarium specimens
Agnes Kirchhoff, Walter G. Berendsohn, Ulrich Bügel, Fernando Chaves, Cailin Guan, Markus Lindhorst, Dominik Röpert, Eduard Santamaria, Karl-Heinz Steinke, Hangyan Zheng

Building: Elmia Congress Centre, Jönköping
Room: Rydbergsalen
Date: 2014-10-27 11:30 AM – 11:50 AM
Last modified: 2014-10-03

Abstract


On herbarium sheets, data such as plant name, collection site, collector, barcode and accession number are mostly found on labels glued to the sheet. The data are thus visible in images taken of the specimen. Currently, they are mostly entered manually into collection databases. The StanDAP-Herb project (Standard Data Acquisition Process), funded by the DFG (German Research Foundation), is developing a standard process for the (semi-)automatic detection of metadata on herbarium specimens, to replace time-consuming manual data entry as far as possible. Image processing software detects objects such as labels or barcodes on the digitised record and classifies them. Text objects are transformed into structured information using text mining algorithms. For handwriting, author identification is attempted. The project evaluates and enhances existing software so that it complies with standard interfaces and can be integrated into an open software architecture based on established IT standards. The software modules thus become available for workflow processing, in order to verify data quality, facilitate data discovery and enhance the application of collection data in research.
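
As a minimal illustration of the text mining step described above, the following Python sketch turns the transcribed text of a label into structured fields. It is not the StanDAP-Herb implementation: the record schema, the regular expressions and the sample label are all assumptions made for this example, and real labels vary far more than these patterns allow.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelRecord:
    """Structured fields recovered from one label (hypothetical schema)."""
    barcode: Optional[str] = None
    collector: Optional[str] = None
    collection_date: Optional[str] = None  # ISO 8601 date, YYYY-MM-DD


# Assumed patterns, chosen only for this example:
# a barcode is 1-4 capital letters, an optional space, then 6-10 digits;
# a collector follows "leg." or "coll."; a date is written as D.M.YYYY.
BARCODE_RE = re.compile(r"\b([A-Z]{1,4}) ?(\d{6,10})\b")
COLLECTOR_RE = re.compile(r"\b(?:leg|coll)\.\s*([^\n,;]+)", re.IGNORECASE)
DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def extract_label_fields(text: str) -> LabelRecord:
    """Fill a LabelRecord from label text; unmatched fields stay None."""
    record = LabelRecord()

    match = BARCODE_RE.search(text)
    if match:
        # Normalise by dropping the optional space between prefix and digits.
        record.barcode = match.group(1) + match.group(2)

    match = COLLECTOR_RE.search(text)
    if match:
        record.collector = match.group(1).strip()

    match = DATE_RE.search(text)
    if match:
        day, month, year = match.groups()
        record.collection_date = f"{year}-{int(month):02d}-{int(day):02d}"

    return record


if __name__ == "__main__":
    # Invented sample label text, standing in for the output of a text recognition step.
    sample = "Herbarium Example\nB 100012345\nCarex nigra\nleg. A. Example, 12.6.1902"
    print(extract_label_fields(sample))
```

Fields that cannot be matched are left empty rather than guessed, so that a later manual verification step can concentrate on the gaps.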

The project addresses a large proportion of scientific collections: approximately 22 million herbarium specimens exist as botanical reference objects in Germany, about 500 million worldwide.