Missouri Botanical Garden Open Conference Systems, TDWG 2011 Annual Conference

Pl@ntNote v2 - a new generation of software for storage and exchange of heterogeneous botanical data
Benjamin LIENS, Samuel Dufour, Philippe BIRNBAUM, Pierre Bonnet, Daniel Barthelemy, Jean-François Molino

Last modified: 2011-10-11

Abstract


The development of computer networks offers exciting possibilities for exchanging botanical data between separate teams. A major obstacle, however, is that botanical data has mainly been stored in relational databases over the past decades. The limitation of this approach lies in the rigidly structured nature of data in these systems: botanists depend on the database tables and fields chosen by the software development team, which makes it difficult to accommodate the needs of different teams. Even when two or more teams agree on a common structure, exchanging data between their databases raises numerous problems: i) efficient implementation of globally unique identifiers to distinguish new data from data shared between two databases, ii) efficient data synchronization between databases, iii) conflict resolution when data is edited separately in both databases.

 

To address the first issue, data modularity, we have chosen to develop new data management and exchange software based on the innovative NoSQL (“Not Only SQL”) paradigm. This approach provides systems in which fields can be added dynamically to individual records. Our development team chose the NoSQL database CouchDB because it addresses data modularity and, having been designed for networked use from the outset, also offers data synchronization facilities. In this database, data is stored as documents, each identified by an identifier that is unique across the network, and data synchronization and conflict detection are managed automatically.
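
The document model described above can be illustrated with a minimal in-memory sketch. It assumes a simplified picture of CouchDB: real CouchDB documents do carry `_id` and `_rev` fields, but revisions there are strings tracked in a revision tree, whereas this sketch uses a plain integer counter. The helper names (`new_document`, `update`, `in_conflict`) and the sample field values are hypothetical and serve only as illustration.

```python
import uuid

def new_document(**fields):
    """Create a schemaless record with a globally unique identifier."""
    return {"_id": uuid.uuid4().hex, "_rev": 1, **fields}

def update(doc, **changes):
    """Return an edited copy; the revision counter records that an edit happened."""
    return {**doc, **changes, "_rev": doc["_rev"] + 1}

def in_conflict(local, remote, base_rev):
    """Two copies conflict when both were edited since the shared base
    revision and their contents now differ."""
    return (local["_rev"] > base_rev
            and remote["_rev"] > base_rev
            and local != remote)

# Illustrative record; the field names are not taken from Pl@ntNote.
specimen = new_document(taxon="Araucaria columnaris", collector="team A")

# Each team adds a different field to its own copy, with no schema migration.
copy_a = update(specimen, height_m=32.5)
copy_b = update(specimen, locality="New Caledonia")

# Both copies diverged from the same base revision, so they are in conflict.
conflict = in_conflict(copy_a, copy_b, base_rev=specimen["_rev"])
```

Because the identifier is generated independently of any central server, two databases can create records separately and still recognize a shared record when they later exchange data.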

 

We have developed the Pl@ntNote v2 application as a web-based interface on top of the CouchDB database. The data management process offered to the end user involves two distinct phases. First, an administrator describes the data structure, choosing the entities to be observed and the hierarchical relationships between them. Regular users can then record their data in this format, as in classical client/server software. If the team wants to send data to, or retrieve data from, another Pl@ntNote database, the software offers a data exchange system that lets users choose the granularity of the exchange, from the data structure alone, to a subset of the data, to the whole database. An interface allows users to edit data when a conflict is detected.
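
The two-phase process (an administrator defines entities and their hierarchy, then users record data against that definition) might look roughly like the following sketch. The abstract does not give Pl@ntNote's actual structure format, so the `STRUCTURE` layout, the entity and field names, and both helper functions are assumptions made for illustration.

```python
# Hypothetical structure definition: each entity names its parent and its fields.
STRUCTURE = {
    "plot": {"parent": None, "fields": {"name"}},
    "individual": {"parent": "plot", "fields": {"tag", "taxon"}},
    "organ": {"parent": "individual", "fields": {"kind", "length_cm"}},
}

def path_to_root(entity, structure=STRUCTURE):
    """List the entity and its ancestors, from the entity up to the top level."""
    path = []
    while entity is not None:
        path.append(entity)
        entity = structure[entity]["parent"]
    return path

def validate(record, structure=STRUCTURE):
    """Return a list of problems found in a record; an empty list means valid."""
    spec = structure.get(record.get("entity"))
    if spec is None:
        return [f"unknown entity: {record.get('entity')!r}"]
    problems = []
    # Every non-root entity must point to a record of its parent entity.
    if spec["parent"] is not None and "parent_id" not in record:
        problems.append(f"missing parent_id (expected a {spec['parent']})")
    # Only fields declared by the administrator are accepted.
    unknown = set(record.get("data", {})) - spec["fields"]
    if unknown:
        problems.append(f"fields not in structure: {sorted(unknown)}")
    return problems

good = {"entity": "individual", "parent_id": "plot-001", "data": {"tag": "A12"}}
bad = {"entity": "organ", "data": {"color": "red"}}
```

Since the structure is itself plain data, it could be exchanged between teams on its own, which corresponds to the coarsest granularity mentioned above.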

Finally, we also make it possible to work disconnected from the network. The web-based software remains usable on the user’s computer; in that case data is stored locally until the network is reachable again, at which point the data exchange feature of Pl@ntNote can be used with a remote database.
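
One way such a deferred exchange could be triggered is through CouchDB's replication endpoint, which accepts a JSON body naming a `source` and a `target` and, optionally, a `doc_ids` list restricting replication to selected documents. The abstract does not say whether Pl@ntNote uses this endpoint, so the sketch below is only an assumption; the database names, host and document identifiers are hypothetical, and the code builds the request body without sending it.

```python
import json

def replication_request(source, target, doc_ids=None):
    """Build the JSON body for a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target}
    if doc_ids:
        # Restrict the exchange to a chosen subset of documents.
        body["doc_ids"] = list(doc_ids)
    return json.dumps(body, sort_keys=True)

# Hypothetical database names and host, for illustration only.
LOCAL = "plantnote_local"
REMOTE = "https://example.org/plantnote_shared"

# Push everything recorded offline once the network is reachable again.
push_all = replication_request(LOCAL, REMOTE)

# Push only selected documents, a finer granularity of exchange.
push_subset = replication_request(LOCAL, REMOTE, doc_ids=["plot-001", "plot-002"])
```

Swapping `source` and `target` would describe retrieval from the remote database instead of sending to it.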

 

We expect this software to open new possibilities for building original datasets with a strong increase in scientific value. Moreover, we believe that sharing data structures will help teams collaborate to find a structure adapted to their needs, avoiding a time-consuming analysis of data description needs repeated from one team to the next, and facilitating collaboration between those teams.