Missouri Botanical Garden Open Conference Systems, TDWG 2013 ANNUAL CONFERENCE

Coupling phenotype descriptions and phylogenetic trees: from SDD to ontologies via graph databases
Anaïs Grand, Régine Vignes Lebbe, Eduardo Miranda, André Santanchè

Building: Grand Hotel Mediterraneo
Room: America del Nord (Theatre I)
Date: 2013-11-01 09:00 AM – 09:15 AM
Last modified: 2013-10-08

Abstract


Characters are at the heart of the taxonomist’s tasks: discovering, describing, naming, comparing and characterizing new taxa, classifying them according to their phylogenetic relationships, and studying their history, diversity and distribution. Taxonomic work produces a huge amount of data (e.g., phenotype descriptions and morphological data matrices), stated in free text and digitally represented in many semi-structured standards that often cannot be interconnected. A semantic framework is therefore needed to integrate characters across studies, and ontologies are one of the promising choices to address this need. We face two challenges in this context: (i) how to relate several unconnected ontologies to be used in ontology-based descriptions; and (ii) how to map and reuse the huge amount of existing resources developed before ontologies were adopted. To address (i), we present a semantic representation of characters, with a unifying meta-model that can be superimposed over existing bio-ontologies, disciplining their relations and favoring their integration. Because converting taxonomic data into ontologies is not a straightforward task, to address (ii) we are implementing an intermediate step between semi-structured phenotypic descriptions and ontologies, based on graph databases. In the Semantic Web context, an ontology in RDF (Resource Description Framework) / OWL (Web Ontology Language) is essentially a graph in which the nodes and relations are objects and properties following some class model. Texts and labels in natural language appear as complementary documentation for human consumption. We mapped the SDD (Structured Descriptive Data) format to the graph model, remodeling semi-structured descriptions into a graph abstraction in which the data are linked, which enables coupling phylogenetic trees and phenotype descriptions.
Graph databases are less schema-dependent and, since an ontology is also a graph, the mapping from the original graph towards an ontology becomes a sequence of graph transformations. This graph model was designed to be published on the Web following a Linked Data approach. Practical experiments are illustrated with a study of fossil ferns, using the programs Xper2 (for descriptions), which is compatible with the SDD standard, and LisBeth (for phylogenetics).
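
As a rough illustration of the idea, the sketch below stores a few description elements and a small tree in one in-memory graph, queries across both, and then applies one graph transformation that emits RDF-style triples. It is a minimal sketch under stated assumptions, not the authors' implementation: the node kinds (Taxon, Character, State, Clade), the relation names (hasState, exhibits, hasChild), the placeholder taxa and the `http://example.org/` base URI are all hypothetical, and a plain Python structure stands in for a graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny in-memory labelled graph (stand-in for a graph database)."""
    nodes: dict = field(default_factory=dict)  # node id -> {"kind", "label"}
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def add_node(self, node_id, kind, label):
        self.nodes[node_id] = {"kind": kind, "label": label}

    def add_edge(self, source, relation, target):
        self.edges.append((source, relation, target))


def build_example():
    """Build a hypothetical graph holding descriptions and a tree together."""
    g = Graph()
    # Description side (the kind of content an SDD file carries).
    g.add_node("char_frond_division", "Character", "frond division")
    g.add_node("state_pinnate", "State", "pinnate")
    g.add_node("state_bipinnate", "State", "bipinnate")
    g.add_edge("char_frond_division", "hasState", "state_pinnate")
    g.add_edge("char_frond_division", "hasState", "state_bipinnate")
    g.add_node("taxon_A", "Taxon", "Taxon A")  # placeholder taxon
    g.add_node("taxon_B", "Taxon", "Taxon B")  # placeholder taxon
    g.add_edge("taxon_A", "exhibits", "state_pinnate")
    g.add_edge("taxon_B", "exhibits", "state_bipinnate")
    # Phylogenetic side: one internal node grouping both taxa.
    g.add_node("clade_1", "Clade", "node 1")
    g.add_edge("clade_1", "hasChild", "taxon_A")
    g.add_edge("clade_1", "hasChild", "taxon_B")
    return g


def states_under(g, clade_id):
    """Collect the states exhibited by every taxon below a tree node."""
    stack, found = [clade_id], set()
    while stack:
        current = stack.pop()
        for source, relation, target in g.edges:
            if source != current:
                continue
            if relation == "hasChild":
                stack.append(target)   # descend the tree
            elif relation == "exhibits":
                found.add(target)      # cross over to the description side
    return found


def to_triples(g, base="http://example.org/"):
    """One graph transformation: emit (subject, predicate, object) triples."""
    triples = []
    for node_id, props in g.nodes.items():
        triples.append((base + node_id, "rdf:type", base + props["kind"]))
        triples.append((base + node_id, "rdfs:label", props["label"]))
    for source, relation, target in g.edges:
        triples.append((base + source, base + relation, base + target))
    return triples
```

Because tree edges and description edges live in the same graph, a single traversal such as `states_under(build_example(), "clade_1")` can move from a tree node to the character states of its descendants. The natural-language labels survive only as `rdfs:label` values, matching their role as documentation for human readers.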