Biophotonics aims to understand and investigate the characteristics of biological samples based on their interaction with incident light. Over the past decades, numerous biophotonic technologies have been developed that deliver various kinds of biological and chemical information about the studied samples. Such information is usually contained in high-dimensional data that need to be translated into high-level information such as disease biomarkers. This data translation is not straightforward, but it can be achieved using advances in computer and data science. The scientific contributions presented in this thesis cover two main aspects of data science in biophotonics: the design of experiments, and data-driven modeling and validation. For the design of experiments, the contributions focus on estimating the sample size required for group differentiation and on evaluating the influence of experimental factors in unbalanced multifactorial designs. Both methods were designed for multivariate data and were tested on Raman spectral datasets. Thereafter, automatic detection and identification were investigated for three diagnostic tasks by combining several image processing techniques with machine learning (ML) algorithms. In the first task, an improved ML pipeline to predict the antibiotic susceptibilities of E. coli bacteria from bright-field microscopic images was presented and evaluated. Then, transfer learning-based classification of bladder cancer was demonstrated using blue light cystoscopic images. Finally, different ML techniques and validation strategies were combined to perform the automatic detection of breast cancer based on a small dataset of nonlinear multimodal images. The obtained results demonstrate the benefits of data science tools in improving experimental planning and the translation of biophotonic data into high-level information for various biophotonic technologies.