Missouri Botanical Garden Open Conference Systems, TDWG 2016 ANNUAL CONFERENCE

How to standardize a dataset to Darwin Core with OpenRefine
Dimitri Brosens, Peter Desmet

Building: CTEC
Room: TecnoAula 2
Date: 2016-12-06 02:00 PM – 03:30 PM
Last modified: 2016-10-16

Abstract


Whether you are a biodiversity data publisher or user, you have probably encountered messy data: variations of the same value, inconsistent date formats, incomplete geospatial information, etc. As a nontechnical person, how do you explore, let alone clean and standardize such data?

In this workshop, we will teach you how. With the free, open source tool OpenRefine (formerly Google Refine), you will learn how to 1) import a dataset, 2) explore it with facets, 3) clean and standardize it to Darwin Core by clustering and splitting, and 4) export it back as Simple Darwin Core, all in easy, repeatable steps. We will also show you how to link your data to the GBIF taxonomic backbone or the Encyclopedia of Life by using external services and crosslinking, and we will try to find decimal coordinates using the Google or MapQuest web services. Intrigued? Join us and we are sure that you will become an OpenRefine adept! Note: this workshop consists of a theoretical and a hands-on session, so bring your own computer and data.
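
To give a feel for what clustering does to "variations of the same value", here is a minimal Python sketch of key-collision clustering, loosely modelled on the fingerprint method that OpenRefine offers. It is an approximation for illustration, not OpenRefine's actual implementation; the function names and the sample recordedBy values are made up for this sketch.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a key-collision key in the spirit of OpenRefine's fingerprint method."""
    # Trim, lowercase, and strip accents by decomposing and dropping non-ASCII marks.
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, then sort and de-duplicate the whitespace-separated tokens.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values whose fingerprints collide; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Hypothetical messy recordedBy values (invented for this sketch).
recorded_by = ["Smith, Jane", "jane smith", "Jane  Smith.", "John Doe"]
clusters = cluster(recorded_by)
print(clusters)
```

The three spellings of the same collector end up in one cluster, which you could then merge into a single standardized value; "John Doe" has no variants and is left alone.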