The EDIT Platform for Cybertaxonomy - an integrated software environment for biodiversity research data management
The Platform for Cybertaxonomy, developed as part of the EU Network of Excellence EDIT (European Distributed Institute of Taxonomy), is an open-source software framework covering the full breadth of the taxonomic workflow, from fieldwork to publication. It provides tools for full, customized access to taxonomic data, for data editing and management, and for collaborative teamwork. At the core of the platform is the Common Data Model (CDM), a comprehensive information model covering all relevant data domains: names and classifications, descriptive data (morphological and molecular), media, geographic information, literature, specimens, persons, and external resources. The model adheres to community standards developed by the Biodiversity Information Standards organization TDWG. Apart from its role as a software suite supporting the taxonomic workflow, the platform serves as an information broker for a broad range of taxonomic data, providing solid and open interfaces that include a Java programmer's library and a CDM REST service layer. In the context of the DFG-funded "Additivity" project ("Achieving additivity of structured taxonomic character data by persistently linking them to preserved individual specimens", DFG project number 310530378), we are developing components for capturing and processing formal descriptions of specimens, as well as algorithms for aggregating data from individual specimens in order to compute species-level descriptions. Well-defined and agreed descriptive vocabularies referring to structures, characters and character states are instrumental in ensuring the consistency and comparability of measurements. We will address this need with a new EDIT Platform module for specifying vocabularies based on existing ontologies for descriptive data. To ensure that these vocabularies can be re-used in different contexts, we are planning an interface to the Terminology Service developed by the German Federation for Biological Data (GFBio).
The Terminology Service provides a standards-aware, harmonised access point for the distributed or locally stored ontologies required for biodiversity research data management, archiving and publication processes. The interface will work with a new OWL export function of the CDM library, which provides EDIT Platform vocabularies in a format that can be read by the import module of the Terminology Service. In addition, the EDIT Platform will be able to import semantic concepts from the Terminology Service through its API while keeping a persistent link to the original concept. With an active pipeline between the EDIT Platform and the GFBio Terminology Service, terminologies originating from the taxonomic research process can be re-used in different research contexts as well as for the semantic annotation and integration of existing research data processed by the GFBio archiving and data publication infrastructure.

KEYWORDS: taxonomic computing, descriptive data, terminology, inference

REFERENCES:
1. EDIT Platform for Cybertaxonomy. http://www.cybertaxonomy.org (accessed 17 May 2018).
2. Ciardelli, P., Kelbert, P., Kohlbecker, A., Hoffmann, N., Güntsch, A. & Berendsohn, W. G., 2009. The EDIT Platform for Cybertaxonomy and the Taxonomic Workflow: Selected Components, in: Fischer, S., Maehle, E., Reischuk, R. (Eds.): INFORMATIK 2009 – Im Focus das Leben. GI-Edition: Lecture Notes in Informatics (LNI) – Proceedings 154. Köllen Verlag, Bonn, pp. 28; 625-638.
3. Müller, A., Berendsohn, W. G., Kohlbecker, A., Güntsch, A., Plitzner, P. & Luther, K., 2017. A Comprehensive and Standards-Aware Common Data Model (CDM) for Taxonomic Research. Proceedings of TDWG 1: e20367. https://doi.org/10.3897/tdwgproceedings.1.20367.
4. EDIT Common Data Model. https://dev.e-taxonomy.eu/redmine/projects/edit/wiki/CommonDataModel (accessed 17 May 2018).
5. Biodiversity Information Standards TDWG. http://www.tdwg.org/ (accessed 17 May 2018).
6. Henning, T., Plitzner, P., Güntsch, A., Berendsohn, W. G., Müller, A. & Kilian, N., 2018. Building compatible and dynamic character matrices – Current and future use of specimen-based character data. Bot. Lett. https://doi.org/10.1080/23818107.2018.1452791.
7. Diepenbroek, M., Glöckner, F., Grobe, P., Güntsch, A., Huber, R., König-Ries, B., Kostadinov, I., Nieschulze, J., Seeger, B., Tolksdorf, R. & Triebel, D., 2014. Towards an Integrated Biodiversity and Ecological Research Data Management and Archiving Platform: The German Federation for the Curation of Biological Data (GFBio), in: Plödereder, E., Grunske, L., Schneider, E., Ull, D. (Eds.): Informatik 2014 – Big Data Komplexität meistern. GI-Edition: Lecture Notes in Informatics (LNI) – Proceedings 232. Köllen Verlag, Bonn, pp. 1711-1724.
8. Karam, N., Müller-Birn, C., Gleisberg, M., Fichtmüller, D., Tolksdorf, R. & Güntsch, A., 2016. A Terminology Service Supporting Semantic Annotation, Integration, Discovery and Analysis of Interdisciplinary Research Data. Datenbank-Spektrum 16(3): 195-205. https://doi.org/10.1007/s13222-016-0231-8.