TraitBank: Integrating Trait Data Across The Tree of Life
Katja Sabine Schulz, Jennifer Hammock, Cynthia Parr

Date: 2014-10-30 11:45 AM – 12:00 PM
TraitBank (http://eol.org/traitbank) is an open digital repository for organism traits. Supported by the Encyclopedia of Life (EOL) technical infrastructure, TraitBank currently provides access to over 8 million measurements and facts about the distribution, ecology, life history, physiology and morphology of more than 1.5 million taxa. Data sources include major biodiversity information systems (e.g., IUCN - International Union for Conservation of Nature, OBIS - Ocean Biogeographic Information System, PBDB - Paleobiology Database), literature supplements (e.g., from the Dryad Digital Repository, Ecological Archives, Pangaea), label data from natural history collections, and legacy/unpublished data sets from individual scientists and projects. Data types range from individual specimen measurements and results of a particular study to summary data from large surveys and comprehensive reviews. Some of the data are derived from text mining projects. Access to trait data is provided on EOL taxon pages and through a data search. Each record is accompanied by available metadata on provenance, measurement methods, sampling parameters, etc. TraitBank data can also be downloaded as CSV files or through a JSON-LD (JavaScript Object Notation for Linked Data) service. Reuse and redistribution of data are encouraged, with attribution to the original sources.

TraitBank is semantically enhanced through links to domain-specific ontologies and controlled vocabularies, such as the Plant Trait Ontology (TO), Vertebrate Trait Ontology (VT), Environment Ontology (ENVO), Phenotypic Quality Ontology (PATO), and Darwin Core. This approach provides an explicit context for each record and imposes a common structure on data derived from heterogeneous sources. Since not all biodiversity data captured by TraitBank can yet be mapped to ontologies, the EOL team also collaborates with experts to bridge gaps in current knowledge representation systems. As more taxon- or subject-specific trait databases emerge, TraitBank will complement these efforts by serving both as a potential data source and as a data consumer that integrates trait data across biological subdisciplines. The emerging semantic framework will facilitate data discovery, support queries across data sets, and advance data integration and exchange among projects, thus making more biodiversity data available for use in scientific and policy-oriented applications.
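
As an illustration of how linking records to shared vocabulary terms can impose a common structure on heterogeneous sources, the following minimal Python sketch maps source-specific field labels to a single term identifier. It is not TraitBank's implementation: the names `TERM_MAP`, `TraitRecord`, and `normalize` are hypothetical, and the `example.org` URIs are placeholders rather than real ontology identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from source-specific field labels to shared term URIs
# and units. The example.org URIs are placeholders, not real ontology terms.
TERM_MAP = {
    "body mass (g)": ("http://example.org/term/body_mass", "g"),
    "weight_grams": ("http://example.org/term/body_mass", "g"),
    "habitat": ("http://example.org/term/habitat", None),
}


@dataclass(frozen=True)
class TraitRecord:
    taxon: str            # scientific name as given by the source
    predicate: str        # shared term URI identifying the trait
    value: str            # measurement or fact, kept as text
    unit: Optional[str]   # unit of measurement, if any
    source: str           # provenance: where the record came from


def normalize(taxon: str, field: str, value, source: str) -> Optional[TraitRecord]:
    """Return a TraitRecord, or None if the field has no mapping yet."""
    key = field.strip().lower()
    if key not in TERM_MAP:
        return None  # unmapped fields are left for later curation
    predicate, unit = TERM_MAP[key]
    return TraitRecord(taxon, predicate, str(value), unit, source)


# Two sources label the same trait differently but end up with one predicate.
a = normalize("Vulpes vulpes", "Body mass (g)", 5500, "source A")
b = normalize("Vulpes vulpes", "weight_grams", "5500", "source B")
```

Once both records carry the same predicate, a single query on that term retrieves them together while the `source` field preserves attribution. Fields without a mapping return `None`, mirroring the case where data cannot yet be mapped to an ontology.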