Supervised learning methods to predict species interactions based
Species interaction networks, such as pollination networks, host-phage networks and food webs, are key tools for studying ecological communities. Networks are attractive to biologists because they provide a sound mathematical description of a system and come with a large toolkit for analyzing properties such as stability, diversity and dynamics. Species interaction networks can be obtained experimentally or through field observations. Modern techniques such as DNA barcoding and camera traps, coupled with large databases, further contribute to the popularity of networks in ecology. In practice, however, a collected network rarely contains all in situ interactions, as this would require an infeasibly large sampling effort. Species distributions also shift, for example due to climate change, which creates new potential interactions. Predicting such interactions is therefore of great importance, for example to anticipate the effect of exotic species on an ecosystem. In our work, we study how supervised machine learning can be used to predict new species interactions. Based on an observed network, we learn a function that takes as input the descriptions of two species (e.g. traits, phylogenetic similarity or a morphological description) and predicts whether these two species are likely to interact. This pairwise learning framework is based on kernels; similar methods have been highly successful for predicting molecular networks and in recommender systems, as used by companies such as Netflix and Amazon. We have shown that these methods can detect missing interactions in many different types of species interaction networks. A major focus of our work is on how the accuracy of these models can be estimated realistically. Our methods are available in an R package called xnet, making them easy to use for ecology researchers.
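
To make the idea of kernel-based pairwise learning concrete, the following is a minimal sketch in Python with NumPy. It is not the xnet API, and the abstract does not name a specific algorithm: two-step kernel ridge regression is used here as an assumed, illustrative instance of a kernel method for pairs. The trait values, the Gaussian kernel, the regularization values and all function names are hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of trait matrix X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def two_step_krr(Y, K, G, lam_k=0.1, lam_g=0.1):
    """Smooth Y over both species sets: F = H_k Y H_g^T, H = K (K + lam I)^-1."""
    H_k = K @ np.linalg.inv(K + lam_k * np.eye(K.shape[0]))
    H_g = G @ np.linalg.inv(G + lam_g * np.eye(G.shape[0]))
    return H_k @ Y @ H_g.T


# Hypothetical toy data: one trait per species (made up for illustration).
pollinator_traits = np.array([[1.0], [1.2], [3.0], [3.2]])  # e.g. tongue length
plant_traits = np.array([[1.1], [3.1], [3.3]])              # e.g. corolla depth

# Observed network: rows are pollinators, columns are plants, 1 = interaction seen.
Y = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)  # pair (3, 2) is treated as unobserved

K = rbf_kernel(pollinator_traits)  # similarity among pollinators
G = rbf_kernel(plant_traits)       # similarity among plants
F = two_step_krr(Y, K, G)          # one score per (pollinator, plant) pair

print(np.round(F, 2))  # higher scores suggest more plausible interactions
```

In this toy setup, the withheld pair (3, 2) is expected to score higher than pair (3, 0), because pollinator 3 resembles pollinator 2 and plant 2 resembles plant 1, both of which take part in observed interactions. Scores on pairs that were deliberately withheld from the observed network are one simple way to check such a model, in the spirit of the accuracy estimation discussed above.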