Deep Explorations: an exploratory study of deep learning applications in the water sector

This project is an exploratory study to assess the value of deep learning (DL) for KWR and the water sector in general. The different types of deep learning, with their strengths and weaknesses, will be investigated and compared. The technique will be applied to two case studies: (1) data mining in customer complaints received by drinking water utilities, and (2) data mining in infrared spectroscopy data for microplastics analysis and polymer classification. The second case can later be extended to other areas, such as chromatography, UV absorption and pattern recognition in data sets.

What is deep learning?

Deep learning (DL) is a family of machine learning (ML) algorithms based on artificial neural networks with many hidden layers, which can derive more complex relations between input and output than other ML techniques. As a result, DL has proved highly effective in applications such as image and speech processing. In these applications, DL techniques typically outperform both more traditional artificial neural networks (ANNs), which rely on a limited number of hidden layers, and other algorithms such as Random Forest or Support Vector Machines (SVM).
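To make the role of hidden layers concrete, the following is a minimal sketch in plain Python and is not part of the project itself. It assumes a toy problem (the XOR function) and hand-picked weights; the function names (step, single_unit, two_layer, fits_xor) are hypothetical and chosen only for illustration. It suggests that a single unit without a hidden layer cannot reproduce XOR for any weights on a small search grid, whereas a network with one hidden layer of two units can. Real DL models stack many more layers and learn their weights from data rather than having them set by hand.

```python
from itertools import product

# Toy target: XOR, a relation that no single linear threshold unit can represent.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(x):
    """Threshold activation: 1 if the input is positive, otherwise 0."""
    return 1 if x > 0 else 0


def single_unit(x1, x2, w1, w2, b):
    """One unit without any hidden layer: a weighted sum followed by a threshold."""
    return step(w1 * x1 + w2 * x2 + b)


def two_layer(x1, x2):
    """One hidden layer with two units; the weights are hand-picked for illustration."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # "OR but not AND" equals XOR


def fits_xor(predict):
    """Check whether a predictor reproduces all four XOR cases."""
    return all(predict(x1, x2) == y for (x1, x2), y in XOR.items())


# Search a coarse grid of weights and biases for the single unit (-4.0 to 4.0 in steps of 0.5).
grid = [i / 2 for i in range(-8, 9)]
single_unit_can_fit = any(
    fits_xor(lambda a, b, w1=w1, w2=w2, bias=bias: single_unit(a, b, w1, w2, bias))
    for w1, w2, bias in product(grid, repeat=3)
)

print("single unit fits XOR:", single_unit_can_fit)
print("network with hidden layer fits XOR:", fits_xor(two_layer))
```

The grid search is only a demonstration for the weights it covers; the general result, that XOR is not linearly separable, is what makes the hidden layer necessary in this toy case.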

What is the value of deep learning for the water sector?

In this project we want to investigate the potential and value of DL for the water sector. This fits well with the trend at KWR and in the water sector as a whole, where ML approaches are increasingly used to answer (research) questions. To investigate this technology, we will draft an overview of the different types of DL and their strengths and weaknesses. This will allow us to select topics or problems to which DL has high potential to contribute. Furthermore, two cases have been selected based on the properties of DL, namely that it excels when complex relationships exist between input and output and/or when large amounts of data are available.

These cases are:

  • The analysis of text data from customer communications with water utilities, in order to automatically extract the topic of each message. These topics could be cross-referenced with data on pipe failures and network maintenance, if available. This would allow early detection of problems and of decreased service levels in the water distribution network. At the same time, it would help to improve customised customer contact.
  • The analysis of infrared spectroscopy data in the field of microplastics identification. The output of this use case will consist of various deep and ensemble learning models for accurate classification of the polymers in microplastics found in the environment, as well as the knowledge and skills needed to derive these models. This will make it possible to apply advanced data analysis and classification algorithms to complex chemical data. The experience acquired will be relevant to numerous other research areas within KWR (e.g., other chemical analyses, and predicting the toxicity of chemicals and their removal efficiency based on quantitative structure-activity relationship (QSAR) models).

A new tool in the researcher’s toolbox?

With the experience gained from these use cases we will gain insight into the possible applications of DL within the research of KWR and the water sector. We will learn to recognise when DL is an appropriate tool for addressing different problems across the water sector. By the nature of the technique, these possible applications span disciplines. As an outcome, we will make this technique more accessible within the organisation. As such, it can become a valuable tool for KWR researchers, a new tool in the researcher’s toolbox, to better serve our clients: the water sector.