Building: Elmia Congress Centre, Jönköping
Room: Rum 10
Date: 2014-10-30 11:30 AM – 11:45 AM
Last modified: 2014-10-14
Abstract
In response to a US National Science Foundation challenge to improve the speed at which we “Assemble, Visualize, and Analyze the Tree of Life,” the Next Generation Phenomics project seeks to develop and adapt tools to assemble large phenomic datasets in a rapid and automated way. The project consists of computer vision, natural language processing, and crowdsourcing components. The Computer Vision (CV) team is developing methods that automate the extraction and annotation of phenomic characters from digital images using computer learning approaches. The new CV algorithms can discern the presence or absence of features and assess their spatial relationships and appearance. The Natural Language Processing (NLP) group is developing software to transform digitized taxonomical descriptions into taxon/character matrices for phylogenetic analyses. In addition, because microbial descriptions often differ radically from those of other organisms, the NLP group is developing supervised learning strategies to extract phenomic characters from microbial descriptions. Finally, the crowdsourcing team has developed software, The Evolution Project, that works with MorphoBank to present images of character states to crowds for scoring. Several experiments are underway to estimate the quality of the coding done by non-experts. Here we explore the impact and challenges of using these methods to obtain phenomic data on a large scale across the Tree of Life.