Managing and publishing fungal community barcoding data by use of the process-oriented schema MOD-CO and a GFBio data publication pipeline
The need to fulfil the FAIR guiding principles for data management and publication [1] directly affects researchers, i.e., data producers as well as data managers. Data management has to be set up well at an early stage of the data life cycle. This is demonstrated by the best-practice work- and dataflow 'Fungal community barcoding data', which has been established as a side product of the project 'GBOL 2 Mycology' within the German Barcode of Life initiative (https://www.bolgermany.de/). The work- and dataflow was set up by applying the newly published MOD-CO schema, version 1.0, which has been implemented as an instance of the database application DiversityDescriptions for data management and for making data compliant with the GFBio infrastructure for data archiving and publication. The comprehensive conceptual schema MOD-CO ('Meta-Omics Data of Collection Objects'), version 1.0, was published as a Linked Open Data representation in spring 2018 [2]. The process-oriented schema describes operations and object properties along the work- and dataflow, from gathering environmental samples, through the various transformation, transaction, and measurement steps in the laboratory, up to sample and data publication and archiving. By supporting various kinds of relationships, the MOD-CO schema allows for the concatenation of individual records of the operational steps along a workflow. The MOD-CO descriptor structure in version 1.0 comprises 653 descriptors (concepts) and 1,810 predefined descriptor states, organised in 37 concept collections. The published version 1.0 is available as various schema representations of identical content (https://www.mod-co.net/wiki/Schema_Representations). This schema has been implemented as a data structure in the relational database DiversityDescriptions (DWB-DD) (https://diversityworkbench.net/Portal/DiversityDescriptions), a generic component of the Diversity Workbench environment (https://diversityworkbench.net).
DWB-DD is considered appropriate for use as a LIMS (Laboratory Information Management System) and ELN (Electronic Laboratory Notebook) for organising 'Fungal community barcoding data' and similar data collections in molecular laboratories. Its data export interface guides the generation of data and metadata in the formats CSV and XML, the latter following the SDD metadata schema, extended by metadata elements from the EML and ABCD standards; for community standards see: https://gfbio.biowikifarm.net/wiki/Data_exchange_standards,_protocols_and_formats_relevant_for_the_collection_data_domain_within_the_GFBio_network. The research data themselves are organised according to the MOD-CO data schema. The data package of the work- and dataflow 'Fungal community barcoding data' will be submitted to GFBio once it has been checked for GFBio compliance, and will be published under a Creative Commons license. Suggestions for standardised citation will be provided, a DOI assigned, and long-term data archiving ensured.

KEYWORDS: DiversityDescriptions, German Barcode of Life (GBOL), German Federation for Biological Data (GFBio), MOD-CO conceptual schema, use case for community barcoding data

REFERENCES:
1. Wilkinson, M.D. et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. – Sci. Data 3: 160018. DOI: 10.1038/sdata.2016.18.
2. Rambold, G., Yilmaz, P., Harjes, J., Link, A., Glöckner, F.O., Triebel, D. 2018. MOD-CO schema – a conceptual schema for processing sample data in meta’omics research (version 1.0). http://mod-co.net/wiki/MOD-CO_Schema_Reference.