Big Data Epidemiology

The scientific literature reports research results, some of them contradictory, on the relationship between the quality of drinking water and public health. Examples include the relationships between water hardness and the incidence of cardiovascular disease, between manganese and intellectual development, and between arsenic and lung cancer. Although risk assessments of individual chemical substances point to wide safety margins between actual exposure and the (provisional) health-based exposure limits, considerable uncertainty remains about chronic exposure and about complex mixtures of chemicals at low concentrations.

An epidemiological approach can help to clarify the relationship between drinking water quality and health. The growing availability of national databases containing spatially detailed health data opens up opportunities to study the associations between drinking water and health characteristics efficiently.


In this project we applied the Big Data Epidemiology approach to drinking water, alongside other approaches such as risk assessment based on toxicological data or bioassays. To clarify the possible relationship between drinking water quality and human health in an epidemiological study, we applied advanced statistical methods. Adjustments were made for confounding factors, as well as for sex, age and socio-economic status.
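The project description does not say which statistical methods were used. As a minimal sketch of what adjusting for a confounder can look like, the Python example below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled over strata (here, two hypothetical age groups). The class and function names are assumptions made for illustration, and the counts are invented to show the effect; they are not project data or results.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """2x2 table for one confounder level (e.g. one age group)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int


def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s.exposed_cases for s in strata)
    b = sum(s.exposed_noncases for s in strata)
    c = sum(s.unexposed_cases for s in strata)
    d = sum(s.unexposed_noncases for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel pooled odds ratio, adjusted for the stratifying factor."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = (s.exposed_cases + s.exposed_noncases
             + s.unexposed_cases + s.unexposed_noncases)
        numerator += s.exposed_cases * s.unexposed_noncases / n
        denominator += s.exposed_noncases * s.unexposed_cases / n
    return numerator / denominator


# Invented counts: exposure is more common in the older group, which also
# has a higher disease risk, so age confounds the crude comparison.
EXAMPLE_STRATA = [
    Stratum(exposed_cases=10, exposed_noncases=90,
            unexposed_cases=40, unexposed_noncases=360),   # younger group
    Stratum(exposed_cases=120, exposed_noncases=280,
            unexposed_cases=30, unexposed_noncases=70),    # older group
]

if __name__ == "__main__":
    print("crude OR:   ", round(crude_odds_ratio(EXAMPLE_STRATA), 2))
    print("adjusted OR:", round(mantel_haenszel_odds_ratio(EXAMPLE_STRATA), 2))
```

In this invented example the crude odds ratio suggests an association, while the age-adjusted odds ratio does not, because the odds ratio within each age group is 1. A real analysis would typically adjust for several factors at once, for instance with a regression model.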


We connected detailed national data on health and mortality to information about the drinking water of millions of Dutch men and women. This approach gave us the largest possible sample size and the widest contrast in exposure levels available in the Netherlands. The research focused on structuring and combining different datasets, with the aim of obtaining information about the possible relationship between chronic exposure to chemical substances through drinking water and health.
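How the datasets were actually linked is not described here. The Python sketch below illustrates one plausible way of combining them, assuming that both the health records and the water quality data carry a shared area identifier. The field names (`area`, `persons`, `cases`, `hardness_mmol_l`) and all values are hypothetical.

```python
def link_by_area(health_records, water_by_area):
    """Attach each area's water-quality value to the health records of that area.

    Returns (linked, unmatched) so that records without exposure data
    remain visible instead of being dropped silently.
    """
    linked = []
    unmatched = []
    for record in health_records:
        exposure = water_by_area.get(record["area"])
        if exposure is None:
            unmatched.append(record)
        else:
            # Copy the record and add the exposure value (hypothetical field).
            linked.append({**record, "hardness_mmol_l": exposure})
    return linked, unmatched


# Invented example data, for illustration only.
HEALTH_RECORDS = [
    {"area": "A", "persons": 1000, "cases": 12},
    {"area": "B", "persons": 2000, "cases": 18},
    {"area": "C", "persons": 1500, "cases": 15},
]
WATER_BY_AREA = {"A": 1.2, "B": 2.4}  # no water data for area C

if __name__ == "__main__":
    linked, unmatched = link_by_area(HEALTH_RECORDS, WATER_BY_AREA)
    for row in linked:
        print(row)
    print("unmatched areas:", [r["area"] for r in unmatched])
```

Returning the unmatched records separately makes it easy to check how much of the population is lost in the linkage step, which matters when the aim is the largest possible sample size.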


The project provided insight into the added value of a Big Data Epidemiology approach in determining the relationship between exposure to chemical substances via drinking water and human health, compared with risk assessment per individual chemical substance. The results were shared with the water utilities.