Missouri Botanical Garden Open Conference Systems, TDWG 2016 ANNUAL CONFERENCE

Biodiversity informatics and the agricultural data management landscape
Cyndy Parr

Building: Computer Science
Room: Computer Science 3
Date: 2016-12-06 09:00 AM – 09:15 AM
Last modified: 2016-10-16


Historically, ecological and biodiversity researchers have focused on the basic patterns and processes of populations, communities, and ecosystems, with minimal attention paid to the role of humans. Human impacts have instead been addressed in the more applied sciences of conservation, medicine, and agriculture. In recent years, however, boundaries between applied and basic sciences have blurred. There is general recognition that our future is best served by science that seeks to understand systems in their true, full contexts. Societies cannot live sustainably without an understanding of the biosphere and of how human behavior and management practices might affect it. Data infrastructure (e.g., data management systems, metadata standards, ontologies) must therefore accommodate use cases that span managed and "pristine" systems. In this talk we describe the challenges faced by agricultural research communities, which share some domain-specific data needs with basic biodiversity and ecology research communities but also share needs with the social science and biomedical communities. Big data in agriculture includes both real-time environmental data and high-throughput genomics and phenomics data. Long-term data includes social science surveys, repeated crop rotation experiments, and basic monitoring of soil, water, and weather conditions. Battling emerging pests or adapting cropping or ranching activities to climate change requires an understanding of wild relatives and microbial ecology.
We sketch out a landscape of loosely coupled data and analysis infrastructures and policies that are being developed to address these challenges, with special focus on the United States. Some parts of this landscape are centered at the US National Agricultural Library (NAL), e.g., the Ag Data Commons, i5K workspace, and Life Cycle Assessment Commons. Other parts are found elsewhere in the US Department of Agriculture (e.g., the Long Term Agroecosystem Research initiative and the National Institute of Food and Agriculture's data science program). Other government agencies, universities, and private organizations all play critical roles. Some parts of the landscape are already familiar to the biodiversity informatics community, but agricultural use cases can help all of us work together on best practices and interoperable systems. Collectively, we can identify and address gaps in standards and services for machine-readable data dictionaries, thesauri, and ontologies. We can strengthen the use of Globally Unique Identifiers and build on public access mandates and advances in high-performance computing to promote text and data mining and modeling. We can build a living knowledge landscape that serves and promotes both basic and applied research.