Building: Windsor Hotel
Room: Acacia Tent
Date: 2015-09-29 02:00 PM – 02:15 PM
Last modified: 2015-08-29
Abstract
Biodiversity scientists and conservationists are interested in the diversity and distribution not only of species, but also of their phenotypes and the environments in which they occur. Characteristics of organisms or species (e.g., phenotypes, traits, habitats) are generally described using free text, often quite subjectively (e.g., “petals small” or “mesic environment”). Even when quantitatively measured, traits can be difficult to interpret because measurement methods vary. Researchers wanting to reuse trait data may need to assess each dataset individually to determine its fitness for purpose, at a considerable cost in time. As the number and size of biodiversity datasets grow, so does the need to link trait data to a solid semantic framework, in order to make it discoverable and reusable in a more automated fashion than is currently possible. This talk will focus on the semantics of morphological or physiological traits and the data that describe them. Other talks within this symposium will address the semantics of environmental characteristics and taxon names, as well as the technical and infrastructure aspects of collection, observation, and trait data.
There have been multiple efforts to describe the semantics of traits, in terms of both the traits themselves and how to represent observations of traits. The most common way of representing traits in ontologies is as a combination of some entity and some quality, also known as an EQ statement. This basic pattern can be expanded to cover more complex traits, but there are cases (such as rates or fluxes) that require alternate representations. The semantics of trait data is even more complex, requiring information about the entity that bears the trait, the trait itself, the process of measuring the trait, and the data that result from the measurement process, plus the provenance of all the pieces involved. The Extensible Observation Ontology (OBO-E), several ontologies based on the Observation and Measurements standard (O&M), and the Biological Collections Ontology (BCO) have all attempted to describe the complexity of observing trait data in ontologies, each with different strengths. A recent effort to align these ontologies revealed fundamental similarities among all of them, as well as areas that were difficult to align. The challenging areas suggest places where there may still be work to be done on the semantics, or where driving use cases or applications have led to different solutions. Working out shared solutions, both technical and semantic, to the challenges of trait data is a key step in building biodiversity standards that will work for the next era of biology.
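As a rough illustration of the pieces listed above, the following Python sketch models a trait as an entity plus a quality (an EQ statement) and wraps it in an observation record that carries the bearer, the measurement process, the resulting datum, and provenance. The class names, field names, and example values are hypothetical and are not drawn from OBO-E, O&M, or BCO; real ontologies would use term identifiers rather than the plain-text labels used here.

```python
from dataclasses import dataclass, field


@dataclass
class EQStatement:
    """A trait expressed as entity + quality (hypothetical structure)."""
    entity: str   # plain-text label for the entity, e.g. "petal"
    quality: str  # plain-text label for the quality, e.g. "length"

    def label(self) -> str:
        # Human-readable trait name built from the two parts.
        return f"{self.entity} {self.quality}"


@dataclass
class TraitObservation:
    """One measured value of a trait, with its context (hypothetical structure)."""
    bearer: str          # the organism or specimen that bears the trait
    trait: EQStatement   # the trait being observed
    protocol: str        # the process used to measure the trait
    value: float         # the datum produced by the measurement
    unit: str            # unit of the datum
    provenance: dict = field(default_factory=dict)  # who/when/source details


# Illustrative values only; none of these come from a real dataset.
obs = TraitObservation(
    bearer="specimen-001",
    trait=EQStatement(entity="petal", quality="length"),
    protocol="caliper measurement",
    value=4.2,
    unit="mm",
    provenance={"recorded_by": "example observer"},
)

print(obs.trait.label(), obs.value, obs.unit)
```

A free-text description such as “petals small” leaves the method and scale implicit, whereas a record of this shape keeps the trait, the method, and the value separate, so they can be compared across datasets. Traits such as rates or fluxes would not fit the simple entity-plus-quality class and would need a different representation.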