Ten outcomes to improve informatics interoperability in cyber/e-Infrastructures for biodiversity and ecological sciences: (through the use case of Essential Biodiversity Variables)
Data products for Essential Biodiversity Variables (EBVs) must be (re-)producible and comparable for any geographic area, small or large, fine-grained or coarse; at a temporal scale determined by need and/or the frequency of available observations; at a point in time in the past, present day or in the future; as appropriate, for any species, assemblage, ecosystem, biome, etc.; using data for that area/topic that may be held by any one of, or across multiple, research/data infrastructures; using harmonized, widely accepted protocols (workflows) capable of being executed in any infrastructure; by any (appropriate) person anywhere. To date, the GLOBIS-B project (www.globis-b.eu) has established that there are technical needs for: i) common dimensional structure, packaging and metadata descriptions of EBV data products; ii) consistent quality checking and assertion across data from different sources that contribute to EBVs; iii) EBV workflows with a common representation that is independent of the underlying computational infrastructure; and iv) use of standard mechanisms for recording the provenance of EBV data products. However, too little is presently known about how the technical production of EBV data products will work in practice. Experimental implementation work is necessary, both to show what is technically feasible and useful, and to reveal what is really needed. We must, for example, agree on the details of both the compact data/file structure for EBV data products and the programmatic interfaces to those data products. Experimental work must eventually lead to formal standardisation. Scientists, infrastructure providers, informaticians and GEO BON Working Groups must jointly address the specific problems of moving from limited, experimental, proof-of-concept studies (such as the Atlas of Living Australia / Global Biodiversity Information Facility (GBIF) invasive species case study) to first trials producing and using real data products with real users.
Beyond first trials, they must jointly move to more robust solutions that scale out and up, and that provide the basis for long-term support to GEO BON across a wide range of EBV classes. Satisfying the EBV use case serves as a means of generally improving informatics interoperability among the diverse cyber / e-Infrastructures supporting biodiversity science and ecology. It is desirable to guide participating providers without restricting their autonomy to achieve what is needed in ways appropriate to their own business. We present ten specific outcomes we want to see achieved, with the overall mission being the ability to deploy and execute standard workflows for preparing, publishing and preserving fit-for-use EBV data products that are comparable with one another. Achieving these outcomes would significantly improve the ability of infrastructure providers to support the EBV production process.