Building: Grand Hotel Mediterraneo
Room: America del Nord (Theatre I)
Date: 2013-10-29 04:00 PM – 04:20 PM
Last modified: 2013-10-06
Abstract
The Encyclopedia of Life (EOL, http://eol.org) is undergoing a major transformation in order to better serve scientific discovery. We present how TraitBank™ uses Darwin Core and other standards to manage structured attribute (trait) data across the tree of life. Though we take a machine-readable semantic approach, we also emphasize ease of use for humans, even if they are not informaticists. In addition to aggregating trait data from existing literature and databases, we anticipate a rapid rise in annotations about attributes on specimens and citizen science observations.
To manage trait data, EOL both uses and extends Darwin Core. At the heart of each record is an Occurrence, where the identity of the taxon and the context in which the trait was observed or measured may be recorded (e.g., geospatial information, dates, life stages, IndividualCount). The Darwin Core class MeasurementOrFact holds the basics of the trait measured and some other metadata. In particular, MeasurementType describes what was measured (a URI drawn from an ontology, or a local URI if the term is not yet part of an ontology) and MeasurementValue holds a number or a term from a controlled vocabulary or ontology. Measurement metadata might include, for example, Unit (from the Units of Measurement Ontology, UO), Accuracy, and MeasurementMethod. If the MeasurementType involves a statistical operation, e.g., a mean or a logarithmic transformation, this is indicated with a field referencing an ontology such as the Semanticscience Integrated Ontology (SIO). As with other content on EOL, rich attribution metadata uses fields from Dublin Core and Darwin Core. In some cases, Occurrences are part of multi-occurrence Events with their own metadata. Interactions among species, for example predator-prey relationships, are handled using a new extension named Associations. This extension is similar to MeasurementOrFact but uses AssociationType instead of MeasurementType to indicate the type of relationship among taxa, and its values are references back to other rows in the Occurrence extension.
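The record structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TraitBank's actual schema or code: the class layout, the attribute names, the `describe_association` helper, and every `example.org` URI are hypothetical placeholders, and the species are chosen purely as an example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass
class Occurrence:
    """Taxon identity plus the context in which a trait was observed."""
    occurrence_id: str
    scientific_name: str
    life_stage: Optional[str] = None
    event_date: Optional[str] = None
    individual_count: Optional[int] = None


@dataclass
class MeasurementOrFact:
    """One trait value attached to an Occurrence."""
    occurrence_id: str
    measurement_type: str                 # URI saying what was measured
    measurement_value: Union[float, str]  # a number or a vocabulary term
    unit: Optional[str] = None            # URI of the unit, if any
    measurement_method: Optional[str] = None
    statistical_method: Optional[str] = None  # e.g. URI for "mean"


@dataclass
class Association:
    """A relationship whose value points at another Occurrence row."""
    occurrence_id: str
    association_type: str                 # URI for the relationship type
    target_occurrence_id: str


def describe_association(
    assoc: Association, occurrences: Dict[str, Occurrence]
) -> Tuple[str, str, str]:
    """Resolve both ends of an Association to taxon names."""
    source = occurrences[assoc.occurrence_id]
    target = occurrences[assoc.target_occurrence_id]
    return (source.scientific_name, assoc.association_type,
            target.scientific_name)


# Hypothetical rows; all URIs below are placeholders, not real ontology terms.
occurrences = {
    "occ1": Occurrence("occ1", "Panthera leo", life_stage="adult"),
    "occ2": Occurrence("occ2", "Equus quagga"),
}
body_mass = MeasurementOrFact(
    occurrence_id="occ1",
    measurement_type="http://example.org/terms/bodyMass",
    measurement_value=190.0,
    unit="http://example.org/units/kilogram",
    statistical_method="http://example.org/stats/mean",
)
assoc = Association("occ1", "http://example.org/terms/preysOn", "occ2")
```

Keeping the target of an Association as an Occurrence identifier, rather than a bare taxon name, means the second organism can carry its own context (life stage, date, location) in the same way as the first.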
The structured data interface on EOL is designed both for scientists who need access to all the rich metadata and for non-scientists who want overviews of "quick facts" and definitions. Direct addition of data, search, download, and API functions are in active development. For now, most data are provided via import from other databases or by transforming datasets found in the literature into Darwin Core Archive-like spreadsheets.
Our approach is compatible with existing Darwin Core-based processes. Thus, as collections annotate specimens with attributes, this information can easily be included as MeasurementOrFacts or Associations in their Darwin Core Archives. Similarly, citizen science observation projects can (and some iNaturalist Projects already do) ask observers to provide more than just the basic time-and-place information about the organisms they are seeing. Each project may choose to control the MeasurementTypes used for annotation; existing usage on EOL will allow projects to identify and choose commonly used MeasurementTypes.
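As a rough illustration of how existing usage could guide a project's choice of MeasurementTypes, the sketch below counts how often each MeasurementType URI appears in a set of records and returns the most frequent ones. It assumes the records are already available as dictionaries with a `measurementType` key; the function name, the record layout, and the `example.org` URIs are hypothetical and do not represent an EOL API.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def common_measurement_types(
    records: Iterable[Dict[str, str]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the top_n most frequently used MeasurementType URIs."""
    counts = Counter(
        row["measurementType"] for row in records
        if row.get("measurementType")  # skip rows with no type recorded
    )
    return counts.most_common(top_n)


# Hypothetical MeasurementOrFact rows; URIs are placeholders.
records = [
    {"measurementType": "http://example.org/terms/bodyMass"},
    {"measurementType": "http://example.org/terms/bodyMass"},
    {"measurementType": "http://example.org/terms/bodyMass"},
    {"measurementType": "http://example.org/terms/wingLength"},
    {"measurementType": "http://example.org/terms/wingLength"},
    {"measurementType": "http://example.org/terms/flowerColor"},
    {"measurementType": ""},
]

top_types = common_measurement_types(records, top_n=2)
```

A project could present such a ranked list to its organizers as a starting vocabulary, which encourages reuse of the same URIs across datasets.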
More information about how TraitBank uses semantics to foster interoperability will be covered in the third Semantics for Biodiversity Symposium.