Laure Berti-Équille (Abstract and Short Bio)
ML-Based Data Cleaning: Current Solutions and Challenges
With the success of machine learning (ML) techniques in database research, ML has already shown tremendous potential to reshape the foundations, algorithms, and models of several data management tasks, such as error detection, data cleaning, data integration, and query inference. Parts of the data preparation, standardization, and cleaning processes, such as data matching and deduplication, could be automated by having an ML model "learn" and predict the matches routinely. This talk will survey recent trends in applying machine learning solutions to improve data cleaning, one of the most crucial and difficult tasks in data management, and will present the next research challenges at the convergence of machine learning and data management.