Last modified: 2016-09-19
Abstract
Unstructured biodiversity data, such as the legacy knowledge of researchers, bushmen, fishermen, and local guides, are largely missing from current databases, so there is untapped potential to add value to the data already available. These data reside in specialists’ mental models or field notes and tend to be lost in the process of information transfer. An example emerged during a requirements elicitation about malaria incidence: it was observed that the presence of swamp rice grass (Leersia hexandra) in a river indicates the likely occurrence of malaria in the region, since this vegetation provides suitable conditions for the proliferation of Anopheles darlingi, the malaria vector. Because this knowledge was acquired through years of the expert’s experience, it is not expressed in traditional databases, yet it is relevant information that can help guide future actions. Before a mosquito collection event, it is recommended to search an entomological database of mosquito records for occurrences in the specific region, which improves field collection efficiency. Another recommendation is to search the database for other regions with properties similar to the habitat described. Such tacit knowledge could be put to better use by inferring structured data from it and organizing those data in knowledge-structuring instruments such as ontologies.
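As an illustration only, the malaria example can be written as two explicit rules applied to subject-predicate-object facts. This is a minimal sketch and not OntoBio's actual model: the predicate names (hasVegetation, locatedIn, suitableHabitatFor, likelyOccurrenceOf), the identifiers river_A and region_X, and the function infer are all hypothetical and chosen for this example.

```python
from typing import Set, Tuple

# A fact is a (subject, predicate, object) triple.
# All predicate and entity names below are hypothetical, not OntoBio vocabulary.
Triple = Tuple[str, str, str]


def infer(facts: Set[Triple]) -> Set[Triple]:
    """Apply two hand-written expert rules until no new fact is produced."""
    known = set(facts)
    while True:
        new: Set[Triple] = set()
        for s, p, o in known:
            # Rule 1: swamp rice grass in a water body implies a habitat
            # suitable for the malaria vector.
            if p == "hasVegetation" and o == "Leersia hexandra":
                new.add((s, "suitableHabitatFor", "Anopheles darlingi"))
            # Rule 2: a vector habitat located in a region implies a likely
            # occurrence of malaria in that region.
            if p == "suitableHabitatFor" and o == "Anopheles darlingi":
                for s2, p2, o2 in known:
                    if s2 == s and p2 == "locatedIn":
                        new.add((o2, "likelyOccurrenceOf", "malaria"))
        if new <= known:  # fixed point reached: nothing new was derived
            return known
        known |= new


field_notes = {
    ("river_A", "hasVegetation", "Leersia hexandra"),
    ("river_A", "locatedIn", "region_X"),
}
derived = infer(field_notes) - field_notes
```

Here derived holds the facts that were implicit in the expert's observation and are now explicit and queryable. In an ontology, comparable rules would typically be expressed as axioms or rule-language statements rather than as procedural code.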
OntoBio, a formal ontology applied to biodiversity data, was developed in a research initiative involving IComp/UFAM (Instituto de Computação/Universidade Federal do Amazonas) and the Biological Collection Program of the Instituto Nacional de Pesquisas da Amazônia (INPA). It produced important results, using already validated technologies, for the adoption of formal ontologies for knowledge acquisition and integration in the biodiversity domain.
During the development of OntoBio, much of the experts’ knowledge that was not present in the structured databases supporting the ontology went unrepresented and was thus lost. Empirical evidence indicated that this knowledge could be essential for adding semantic expressiveness to ontologies. A conceptual framework was therefore developed to aggregate scientific tacit knowledge into ontologies. The new version of OntoBio incorporates more semantics into the model, along with features that allow its use in more complex applications.
Integrating scientific data in a more meaningful way will allow bioscience to advance in areas that have not yet been investigated. For this reason, it is necessary to integrate the specialists’ mental models into existing biodiversity databases and into those already integrated in OntoBio.