Missouri Botanical Garden Open Conference Systems, TDWG 2016 ANNUAL CONFERENCE

Understanding mass flowering of dipterocarps through semantic occurrence information extraction
Roselyn Santos Gabud, Riza Theresa Batista-Navarro, Vladimir Mariano, Eduardo Mendoza, Sandra Yap

Building: CTEC
Room: Auditorium
Date: 2016-12-06 09:45 AM – 10:00 AM
Last modified: 2016-10-15


Forest restoration and rehabilitation are a challenge in biodiversity conservation: given the complex and long reproductive cycles of forest trees, they require an understanding of data collected over long periods and across large geographic areas. In the Philippines, the lowland tropical forests, composed primarily of dipterocarp species, are among the most threatened ecosystems in the world. Dipterocarps, which belong to the family Dipterocarpaceae, are economically and ecologically important because of their timber value and their contribution to wildlife habitat, climatic balance and the regulation of water release. They exhibit supra-annual mass flowering events that occur at irregular intervals of two to ten years, possibly synchronously across Asia. To understand the mass flowering of dipterocarps in the context of their effective natural regeneration and reforestation, we propose to exploit the large volume of biodiversity records that exist in text form in taxonomic literature, scholarly articles, books and agency reports. We aim to develop and employ information extraction methods to augment structured observation data with occurrence information captured from the literature. To this end, we have developed a schema for the semantic annotation of taxon names, geographic locations, dates, habitat descriptions, authorities, and names of herbaria (in the case of collected specimens) to aid in determining the distribution of dipterocarps. Our proposed schema furthermore captures the species' reproductive state, to enable the derivation of phenological patterns and the identification of factors that trigger mass flowering. In this way, we enable the generation of more comprehensive time-series occurrence data that include information on the reproductive maturity and habitat conditions of dipterocarps. This will facilitate further knowledge discovery tasks focused on the restoration of dipterocarp forests.
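
As a rough illustration of how an annotation schema of the kind described above might be represented once extracted, the following Python sketch defines one occurrence record and counts flowering records per year. It is not the authors' schema or implementation: the class name `OccurrenceAnnotation`, its field names, the helper `flowering_counts_by_year`, and the state label "flowering" are all assumptions made for this example, and the sample records are invented values, not real observations.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class OccurrenceAnnotation:
    """One literature-derived occurrence record (hypothetical field names)."""
    taxon_name: str
    location: str
    year: Optional[int] = None                # year of the observation, if stated
    habitat: Optional[str] = None             # free-text habitat description
    authority: Optional[str] = None           # taxonomic authority
    herbarium: Optional[str] = None           # only for collected specimens
    reproductive_state: Optional[str] = None  # e.g. "flowering", "fruiting"


def flowering_counts_by_year(
    records: Iterable[OccurrenceAnnotation],
) -> Dict[int, int]:
    """Count records annotated as flowering, grouped by year (ascending)."""
    counts = Counter(
        r.year
        for r in records
        if r.year is not None and r.reproductive_state == "flowering"
    )
    return dict(sorted(counts.items()))


# Invented records for illustration only; they are not real observations.
examples = [
    OccurrenceAnnotation("Shorea contorta", "Site A", 1990,
                         reproductive_state="flowering"),
    OccurrenceAnnotation("Dipterocarpus grandiflorus", "Site B", 1990,
                         reproductive_state="flowering"),
    OccurrenceAnnotation("Shorea contorta", "Site A", 1993,
                         reproductive_state="fruiting"),
    OccurrenceAnnotation("Shorea contorta", "Site C", 1996,
                         reproductive_state="flowering"),
    OccurrenceAnnotation("Shorea contorta", "Site C",
                         reproductive_state="flowering"),  # undated: skipped
]

print(flowering_counts_by_year(examples))
```

Grouping by year is only one possible aggregation; a time series built this way could, under the same assumptions, be broken down further by location or habitat to look for synchrony across sites.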