Missouri Botanical Garden Open Conference Systems, TDWG 2013 ANNUAL CONFERENCE

Expanding the Biodiversity Heritage Library
William Ulate

Building: Grand Hotel Mediterraneo
Room: America del Nord (Theatre I)
Date: 2013-10-29 11:56 AM – 12:09 PM
Last modified: 2013-10-25


BHL has been continuously expanding in the quantity and types of its content, its geographical coverage and the services it provides, answering requests from diverse communities, including scientists and particularly taxonomists.  BHL currently includes more than 41 million pages, 118,000 volumes and almost 63,000 titles, plus, since March of this year, almost 95,000 articles from BioStor.  The number of taxonomic name occurrences within the text has also increased substantially with the new services from the Global Names Architecture, now totalling more than 150 million appearances of names.

The incorporation of segments has brought the challenge of deduplicating new article titles contributed by our providers.  Solving this task by clustering segments together has allowed us to categorize these relations, opening the door to new functionality for our everyday end users.  Technically, there are further paths that could be followed if enough resources were available, such as assigning unique identifiers to legacy articles and tagging and extracting entities from the text.
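
As a rough illustration of clustering segments to find duplicate article titles, the following is a minimal sketch in Python. It is not BHL's actual method: the record fields (`id`, `title`, `year`), the function names and the rule of grouping by normalized title plus year are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_title(title):
    """Reduce a title to a comparable key (hypothetical rule, not BHL's)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, then collapse every run of non-alphanumerics to one space.
    return re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).strip()


def cluster_segments(segments):
    """Group segment ids whose normalized title and year coincide."""
    clusters = defaultdict(list)
    for seg in segments:
        key = (normalize_title(seg["title"]), seg.get("year"))
        clusters[key].append(seg["id"])
    # One list of ids per cluster, in order of first appearance.
    return list(clusters.values())


# Hypothetical records, invented for illustration only.
segments = [
    {"id": 1, "title": "Notes on the Flora of Perú", "year": 1901},
    {"id": 2, "title": "notes on the flora of Peru.", "year": 1901},
    {"id": 3, "title": "A new species of Carabus", "year": 1888},
]
print(cluster_segments(segments))
```

Exact matching on a normalized key is the simplest possible choice here. A real pipeline would likely also compare authors, container title and page ranges, and tolerate OCR noise with fuzzy matching.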

The wish list goes on, including citation services, segments linking out to other repositories, crowdsourced OCR improvements, DOIs for legacy articles, and tagging and extracting entities, among other cool things.