Missouri Botanical Garden Open Conference Systems, TDWG 2011 Annual Conference

Reconciling Non-Overlapping Coverage Among Downloadable Plant Names
William Halliday Piel

Last modified: 2011-09-20

Abstract


Cyberinformatics in the life sciences has made huge strides in recent years through the convergence of several technological advances: inexpensive portable computers and GPS devices, data exchange standards, digitization of biological collections, and the dramatic growth of molecular data. Synthesis and inference from these data depend on reliable, unambiguous identifiers. Prime among them are species names, which serve as a critical nexus among most data in the life sciences. For this semantic glue to work, biologists need access to complete, comprehensive taxonomic dictionaries of names so that they can detect misspellings and reconcile synonyms. To this end, we have built an online service, the Taxonomic Name Resolution Service (TNRS), that corrects spelling errors and reconciles semantic heterogeneity using the GNI parser and TAXAMATCH technologies, and we have populated it with plant names from Tropicos. We have tested this service against a large list of plant names compiled from three of the most popular downloadable taxonomies (NCBI, ITIS, and USDA Plants) to assess the ability of the TNRS to reconcile names exclusive to each of these taxonomies. A remarkably large number of plant names are not shared by all three sources, but in many cases the TNRS succeeds in providing the missing intelligence to create a crosswalk among them, thereby creating a comprehensive dictionary service that is greater than the sum of its parts.
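
The idea of non-overlapping coverage among name lists can be illustrated with a minimal Python sketch. It is not the TNRS implementation and does not call the TNRS, the GNI parser, or TAXAMATCH. It only shows one way to find names shared by all sources and names exclusive to each one, after a trivial normalization of case and whitespace. The function names (`normalize`, `coverage`), the source labels, and the assignment of names to sources are hypothetical and chosen purely for illustration.

```python
def normalize(name: str) -> str:
    """Collapse runs of whitespace and lowercase the name.

    This is a deliberately crude stand-in for real name parsing and
    fuzzy matching; it is an assumption of this sketch.
    """
    return " ".join(name.split()).lower()


def coverage(sources: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return the names shared by every source and those exclusive to each.

    `sources` maps a source label to its list of raw name strings.
    """
    normalized = {
        label: {normalize(n) for n in names} for label, names in sources.items()
    }
    result = {"shared_by_all": set.intersection(*normalized.values())}
    for label, names in normalized.items():
        # Union of every other source's names.
        others = set().union(
            *(s for other, s in normalized.items() if other != label)
        )
        result[f"only_{label}"] = names - others
    return result


# Hypothetical toy lists; these do not reflect the actual contents
# of any real taxonomy.
sources = {
    "source_a": ["Quercus alba", "Acer rubrum", "Zea mays"],
    "source_b": ["quercus  alba", "Acer rubrum", "Pinus strobus"],
    "source_c": ["Quercus alba", "Pinus strobus", "Poa annua"],
}

result = coverage(sources)
for key in sorted(result):
    print(key, sorted(result[key]))
```

Exact matching after normalization catches only trivial differences. Misspellings and synonyms would still appear as exclusive names here, which is the gap a resolution service is meant to close.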