Last modified: 2011-09-20
Abstract
This demo is aimed mainly at current and prospective users of TapirDotNET (http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirDotNET), an application developed by Kevin Richards in 2007 that implements the TDWG Access Protocol for Information Retrieval (TAPIR). TAPIR allows different data sources and data structures to be mapped to predefined conceptual schemas such as DarwinCore, thus providing harmonized access to diverse data resources. TapirDotNET includes a web-based configuration interface and a client application. The configuration interface provides all the settings needed for the mapping process, assuming relational or tabular data. It is used to define the source database, tables, and columns where the resources to be mapped are stored, and to define the mapping itself, i.e., the relationship between the conceptual schema employed (e.g., DarwinCore) and the provider's own data structure. A successful configuration and mapping process results in an instance of the TAPIR protocol that allows queries against the underlying resources. TapirDotNET uses two types of mapping: the FixedValueMapping type supplies a fixed value without a database connection, and the SingleColumnMapping type gives access to a column in a database table. With TapirDotNET it is therefore possible to map relational data structures onto biodiversity data standards.
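The two mapping types described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not TapirDotNET code: the class names are borrowed from the abstract, while the `resolve` method, the `map_rows` helper, the `specimen` table, and the choice of DarwinCore terms are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class FixedValueMapping:
    """Always yields the same constant; no database access is needed."""
    value: str

    def resolve(self, row):
        return self.value


@dataclass
class SingleColumnMapping:
    """Yields the value of one column of the current source row."""
    column: str

    def resolve(self, row):
        return row[self.column]


def map_rows(conn, table, mapping):
    """Translate every row of `table` into a dict keyed by concept name.

    `table` is assumed to come from trusted configuration, not user input.
    """
    conn.row_factory = sqlite3.Row  # lets us address columns by name
    rows = conn.execute(f"SELECT * FROM {table}")
    return [
        {concept: m.resolve(row) for concept, m in mapping.items()}
        for row in rows
    ]


# Hypothetical source table, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimen (id INTEGER, species TEXT, site TEXT)")
conn.execute("INSERT INTO specimen VALUES (1, 'Fagus sylvatica', 'Hainich')")

# Hypothetical mapping from DarwinCore terms to the source structure.
mapping = {
    "basisOfRecord": FixedValueMapping("HumanObservation"),
    "scientificName": SingleColumnMapping("species"),
    "locality": SingleColumnMapping("site"),
}

records = map_rows(conn, "specimen", mapping)
print(records)
```

In this sketch the mapping is a plain dictionary from target concept to mapping object, so adding a further mapping type only requires another class with a `resolve` method.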
Recently, however, XML has taken on an ever-increasing role as a storage format for primary biodiversity data. Unfortunately, TapirDotNET does not support mapping XML data resources, so it has not been possible so far to map XML-based data resources to biodiversity data standards using this tool.
In our work, we have extended the tool with new functionality. This extension, called TapirDotNETXml, is currently being tested extensively and will be made available as open source software via SourceForge. It supports standards-compliant sharing of XML data. TapirDotNETXml reads an arbitrary XML schema and displays all text nodes defined there in the mapping section of the configuration tool. The user can then select individual nodes and map them to the target schema. For this purpose, TapirDotNETXml adds two mapping types to those of the original TapirDotNET. The XmlNodeMapping type allows access to individual nodes in an XML file. The XmlNodesDependsValuesMapping type provides access through a node of an XML file that is linked to a node in another XML table column. With the XmlNodesDependsValuesMapping type, relations between the metadata and primary data of the XML data structure can be described for the mapping. The configuration interface of TapirDotNETXml also makes it possible to load and import various standard schemas, e.g., DarwinCore.
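The idea behind listing text nodes and mapping them individually can be sketched as follows. This is an illustration in Python, not the TapirDotNETXml implementation: for simplicity it walks a small sample instance document rather than an XML schema, it covers only the XmlNodeMapping case, and the sample data, the helper names `text_node_paths` and `xml_node_mapping`, and the chosen DarwinCore terms are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with a metadata part and primary data records.
SAMPLE = """<dataset>
  <metadata><owner>Example Group</owner></metadata>
  <record><species>Fagus sylvatica</species><plot>P1</plot></record>
  <record><species>Picea abies</species><plot>P2</plot></record>
</dataset>"""


def text_node_paths(root):
    """Collect the distinct paths of leaf elements that carry text."""
    paths = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}" if prefix else elem.tag
        children = list(elem)
        is_text_leaf = not children and (elem.text or "").strip()
        if is_text_leaf and path not in paths:
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def xml_node_mapping(record, node_path):
    """Return the text of the node at `node_path`, relative to `record`."""
    return record.findtext(node_path)


root = ET.fromstring(SAMPLE)

# Step 1: show the user which text nodes are available for mapping.
available = text_node_paths(root)
print(available)

# Step 2: the user maps selected nodes to target concepts (hypothetical choice).
mapping = {"scientificName": "species", "locality": "plot"}
records = [
    {concept: xml_node_mapping(rec, path) for concept, path in mapping.items()}
    for rec in root.findall("record")
]
print(records)
```

A mapping that relates metadata to primary data, as XmlNodesDependsValuesMapping is described to do, would additionally need a rule linking a node in one part of the document to a node in another; that is left out here because the abstract does not specify how the link is expressed.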
In the demo, we will show how TapirDotNETXml can be used to map data from our Biodiversity Exploratories Information System (BExIS) (https://exploratories.bgc-jena.mpg.de) to DarwinCore.
BExIS is an integrated research platform developed for a large collaborative research project in Germany, the Biodiversity Exploratories. In BExIS, we use a document-oriented XML approach for handling metadata and primary data. BExIS has its own project-specific metadata schema. This schema supports flexible data structures and data syntax for primary data, ensured by user-driven configuration at the level of the individual data set. We will access this schema with TapirDotNETXml and show how the mapping to DarwinCore can be configured. We will show the resulting mapping as well as examples of accessing data via this mapping.
Technical requirements: an internet connection and a projector (if the demo is given in a formal setting, which we would prefer).