Data quality assessment for system identification in the age of big data and Industry 4.0

As the amount of data stored from industrial processes increases with the demands of Industry 4.0, there is growing interest in finding uses for the stored data. However, before the data can be used, its quality must be determined and appropriate regions extracted. Initially, such testing was done manually using graphs or basic rules, such as the value of a variable. With large data sets, such an approach will not work, since the amount of data to be tested and the number of potential rules are too large. Therefore, there is a need for automated segmentation of the data set into different components. Such an approach has recently been proposed and tested using various types of industrial data. Although the industrial results are promising, many questions remain unanswered, including how to handle a priori knowledge, how to deal with over- or undersegmentation of the data set, and how to set appropriate thresholds for a given application. Solving these problems will provide a robust and reliable method for determining the data quality of a given data set.



