Article

Cost–Benefit Prediction of Asset Management Actions on Water Distribution Networks

1
Unité Mixte de Recherche Gestion Territoriale de l’Eau et de l’Environnement (GESTE) IRSTEA-ENGEES, 1 quai Koch, 67070 Strasbourg, France
2
CSIP-ICube, Université de Strasbourg, 3-5, rue de l’Université, 67084 Strasbourg, France
*
Author to whom correspondence should be addressed.
Water 2019, 11(8), 1542; https://doi.org/10.3390/w11081542
Submission received: 19 June 2019 / Revised: 12 July 2019 / Accepted: 15 July 2019 / Published: 25 July 2019
(This article belongs to the Special Issue Advances in Modeling and Management of Urban Water Networks)

Abstract

The potential costs and benefits of a combination of asset management actions on the water distribution network are predicted. Two types of actions are considered: maintenance actions and renewal actions. Leak detection and the repair of failures on connections and pipes define the set of potential maintenance actions. Renewal actions concern connections, pipes, and meters. All these actions are the model's decision variables, used to determine a trade-off between two objectives: (i) the maximization of the water efficiency rate and (ii) the minimization of the total cost of the actions to be carried out on the water system. The objective functions are assessed by an artificial neural network (ANN) trained on the French mandatory database «SISPEA». A non-dominated sorting genetic algorithm (NSGA-II) is coupled to the ANN to reach the set of compromise solutions representing the potential actions. Applied to a real water distribution system in the southeast of France, the proposed decision model indicates that improving the water efficiency rate (WER) in the short term requires increasing operation expenditures (OPEX), which represent 99% of the total cost. Results show a threshold effect, which implies that the budget must be used in a certain way to improve performance. The decision maker can choose a solution from the generated Pareto front according to the budget constraint and the targeted WER.

1. Introduction

Water utility performance monitoring is widely addressed in the literature. The IWA initiative carried out by Ref. [1] to build key performance indicators (KPIs) led to the emergence of national mandatory databases in several countries in order to improve the management of water utilities and ensure transparency towards stakeholders and users. However, KPIs are generally measured on an ex-post basis in order to assess whether the conducted policy achieved the planned goals; in the case of a mismatch, corrective actions can then be planned. This way of managing can be expensive in terms of time and money.
One possible improvement to avoid this mismatch is the use of a decision-aiding model to predict KPIs based on potential decisions and a set of explanatory data. A possible shortfall is the absence of data collection at the scale of the water utility, which makes it difficult to train and fit a prediction model. The existence of an information system (IS) seems to be a prerequisite for the assessment and the prediction of KPIs. This shortfall is gradually being resolved. Over the last two decades, the development of sensor technologies and information and communications technology (ICT) has encouraged water utilities to install smart devices in order to monitor water systems in real time and collect information about their operation. The relevance of adopting smart water systems and the potential benefits in terms of leak management, water quality monitoring, and energy savings are discussed in Ref. [2]. Smart systems generate a large quantity of data, which are not always exploited in the decision-making process. Data gathering improves the water utility information system (IS) and constitutes a prerequisite for prospective analysis. The current research addresses the assessment of KPIs in an ex-ante way, exploiting the data made available by the emergence of mandatory databases and the deployment of smart devices in water systems. The current paper aims at answering the following question: How can the existing data collections or IS be exploited for prospecting asset management actions and assessing their costs and benefits in an ex-ante way?
For any planning of asset management actions, the assessment of expected costs and benefits is recommended because it helps to moderate decisions. The importance of cost–benefit quantification in the determination of the optimal maintenance time is underlined in Ref. [3]. Models for asset management of water pipes seem to be driven by the estimation of the optimal date of renewal based on the deterioration of the asset, the assessment of whole life costing [4], the achievement of a critical threshold for the number of breaks [5] or the rendered service (pressure, flow, quality) under economic or technical constraints [6,7]. Pipe renewal planning considering multiple objectives can be achieved by genetic algorithms [8]. The problem of water pipe renewal planning based on a cost–benefit approach is addressed in Ref. [9]. The authors define five items of benefit. These items cover the benefit of reducing the repair cost, the benefit of avoiding potential damages from water supply interruption for domestic and non-domestic users, and the benefit of avoiding the social cost of road unavailability. The optimal time for pipe renewal is reached when expected benefits are greater than costs.
The use of genetic programming for pipe break prediction is discussed in Ref. [10]. The authors develop an economic-based model for pipe replacement. They assume that there exist two categories of models for pipe break prediction: physically-based models, which aim at identifying the physical causes of breaks, and statistical models, which analyze historical data to identify explanatory variables.
The use of machine learning seems relevant to tackle prediction problems. Between 2006 and 2016, the use of Artificial Neural Networks (ANNs) has increased in the drinking water sector, particularly for modeling the infrastructure and water quality [11]. ANNs address water quality problems by modeling chlorine concentration [12]. To improve leakage management, hydraulic and water quality data collected from sensors are used to fit ANNs for detecting and locating leakage in Yorkshire Water’s Keighley distribution system [13]. A principal component analysis (PCA) and ANN was carried out to predict the leakage ratio in the drinking water system using six effective parameters: pipe deterioration ratio, the volume of water supplied, pipe length, mean pipe diameter, the number of leaks, and an energy ratio [14]. Authors show the advantage of coupling ANN with PCA. To estimate the magnitude and the location of leaks, ANNs were trained on different sets of input data (pressure and flow rate) collected from sensors installed in the piping network [15].
It appears from the literature review that, regardless of the output variable to predict, the training of ANNs in the drinking water sector is done at the local scale by using series of monitoring data collected by sensors distributed across the network. What can be done when monitoring data are absent? A partial answer is given by Ref. [16], who investigated the training of ANNs not on monitoring data but on aggregated data or KPIs, representing high-level data gathered in mandatory databases. The authors establish cause–effect relationships between KPIs. They compare the use of ANN and multiple regression analysis (MRA) for calibrating a decision model that is able to predict the water efficiency ratio from a set of nine mandatory indicators considered as input variables.
In the context of absence or paucity of low-level monitoring data, the current work improves the model developed in Ref. [16] by prospecting asset management actions based on high-level data represented by ex-post KPIs measured at the scale of the water utility.
We assume that the proposed model can be adapted to the context of smart water systems, where monitoring data are available at a low-level scale. The main added value of the proposed model is its ability to prospect asset management actions by measuring KPIs in an ex-ante way using an adaptation of ANNs and a multi-objective genetic algorithm. The prediction model can be fitted with a multiset of data from several water utilities or a national database of mandatory KPIs such as SISPEA (French context) and the IS of the water utility. This can be very helpful when there are not enough monitoring data at the scale of the water utility.
The paper is organized into five sections. The current section proposes a literature review of asset management of water pipes, the use of ANNs for KPI prediction, and the use of genetic algorithms for optimization. Section 2 defines the objective functions and the mathematical formulation of the considered problem. The characteristics of the ANN and NSGA II are also detailed. Section 3 illustrates the use of the developed model on a real case study and shows how it is carried out. Section 4 discusses the results and the main added value of the model. Finally, the last section concludes the paper.

2. Materials and Methods

This paper focuses on the prediction of two KPIs considered as objective functions: (1) the water efficiency rate, considered as a benefit, and (2) the total cost, obtained as the sum of OPEX and capital expenditures (CAPEX). The considered costs result from the implementation of asset management actions: renewal of pipes, connections and meters on the one hand; and leak detection and the repair of connections and pipes on the other hand. The prediction of KPIs is ensured by an adaptation of ANNs coupled with the multi-objective genetic algorithm NSGA II [17].

2.1. The Water Efficiency Rate (WER)

In the French context, the WER is a mandatory KPI calculated for each water utility according to the decree of May 2007 [18]. It measures the ratio between the billed and distributed water. The prediction model uses the theoretical model developed in Ref. [16] to establish relationships between WER (output) and nine other mandatory KPIs (Input) considered as explanatory variables. Table 1 lists the explanatory variables with their corresponding code (taken from SISPEA) and their link with asset management actions.
The assessment of WER requires the analysis of the yearly hydraulic balance of the whole network. Table 2 lists the required variables.
To calculate WER, the explanatory variables listed in Table 1 must be calculated or estimated. WER can be estimated indirectly from the linear leakage, which encompasses four types of losses: losses due to metering errors $W_m$, losses due to leaks on main pipes $W_p$, losses due to leaks on connections $W_c$, and losses due to invisible leaks $W_i$. We assume that losses due to metering errors $W_m(t)$ can be calculated by Equation (1):
$$W_m(t) = W_b(t) \times \varepsilon_m(t) \tag{1}$$
with:
$$\varepsilon_m(t) = \varepsilon_m(t-1) \times \frac{\overline{Age}_m(t)}{\overline{Age}_m(t-1)} \tag{2}$$
By considering the meter renewal rate, $\varepsilon_m(t)$ is calculated by Equation (3):
$$\varepsilon_m(t) = \varepsilon_m(t-1) \times (1 - r_m) \tag{3}$$
where $r_m$ is the rate of annual meter renewal in percentage per year as listed in Table 3.
Losses due to leaks on pipes are computed from the estimated number of leaks on pipes, from which the effect of pipe renewal is subtracted:
$$W_p(t) = MTTR_{vl}(t) \times d_p(t) \times \left[ n_p(t) - r_p(t) \times L_{net}(t) \times r_b(t) \right] \tag{4}$$
Analogously, losses due to leaks on connections in a given year, $W_c(t)$, are computed from the estimated number of leaks on connections minus the effect of connection renewal:
$$W_c(t) = MTTR_{vl}(t) \times d_c(t) \times \left[ n_{lc}(t) - r_c(t) \times n_{ct}(t) \times r_{cb}(t) \right] \tag{5}$$
The model also involves water losses $W_i(t)$ caused by invisible leaks. Equation (6) indicates how they are calculated:
$$W_i(t) = MTTR_{inv}(t) \times d(t) \times \left[ n_{inv}(t) \times \left( 1 - \alpha \times r_p(t) - (1 - \alpha) \times r_c(t) \right) - r_d(t) \times L_{net}(t) \right] \tag{6}$$
Asset management actions in terms of renewal (pipes, connections) and leak detection have an impact on leaks. These actions decrease the number of invisible leaks and the mean time to repair; this assumption is introduced by Equation (6). The total water loss for year t, $W_l(t)$, is the sum of all types of water losses, as shown in Equation (7):
$$W_l(t) = W_m(t) + W_p(t) + W_c(t) + W_i(t) \tag{7}$$
Based on the previous equations, the linear leakage index can be computed according to Equation (8):
$$LLI(t) = \frac{W_l(t)}{L_{net}(t)} \tag{8}$$
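As an illustration, the loss balance of Equations (1) and (4) to (8) can be sketched in a few lines of Python. The function name, the argument names, and the numbers in the usage note are assumptions made for this example only; they are not data from the case study, and units must simply be kept consistent.

```python
def water_losses(w_b, eps_m, mttr_vl, d_p, n_p, r_p, l_net, r_b,
                 d_c, n_lc, r_c, n_ct, r_cb,
                 mttr_inv, d, n_inv, alpha, r_d):
    """Return (W_m, W_p, W_c, W_i, W_l, LLI) for one year."""
    # Eq. (1): losses due to metering errors
    w_m = w_b * eps_m
    # Eq. (4): visible leaks on pipes, reduced by the effect of pipe renewal
    w_p = mttr_vl * d_p * (n_p - r_p * l_net * r_b)
    # Eq. (5): visible leaks on connections, reduced by connection renewal
    w_c = mttr_vl * d_c * (n_lc - r_c * n_ct * r_cb)
    # Eq. (6): invisible leaks, reduced by renewal and by leak detection
    w_i = mttr_inv * d * (
        n_inv * (1 - alpha * r_p - (1 - alpha) * r_c) - r_d * l_net
    )
    # Eq. (7): total water loss
    w_l = w_m + w_p + w_c + w_i
    # Eq. (8): linear leakage index
    lli = w_l / l_net
    return w_m, w_p, w_c, w_i, w_l, lli
```

With purely illustrative inputs (for instance `w_b=1000`, `eps_m=0.02`, `l_net=100`), the function returns each loss component separately, which makes it easy to see which term dominates the linear leakage index.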
The average renewal rate of water mains over the last 5 years, $\bar{r}_p(t)$ (code: P107.2), measures the mean value of the annual renewal rate of water pipes (without connections) over the last 5 years. This includes renewed, reinforced and rehabilitated pipes but does not take into account maintenance actions such as pipe repairs. It is calculated by Equation (9):
$$\bar{r}_p(t) = \frac{\sum_{i=0}^{3} r_p(t - i - 1) + r_p(t)}{5} \tag{9}$$
with $r_p(t - i - 1)$ for $i \in [0, 3]$ being the annual renewal rates of pipes from the previous 4 years (known), and $r_p(t)$ the annual renewal rate envisaged.
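A minimal sketch of Equation (9), assuming the four known annual rates are passed as a list; the function name and the sample values in the usage note are hypothetical.

```python
def average_renewal_rate(previous_rates, planned_rate):
    """Five-year mean renewal rate: four known years plus the envisaged year."""
    if len(previous_rates) != 4:
        raise ValueError("exactly four previous annual rates are expected")
    return (sum(previous_rates) + planned_rate) / 5
```

For example, with previous rates of 0.5, 0.6, 0.7 and 0.8 (% per year) and an envisaged rate of 0.9, the five-year average is 0.7.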
The remaining explanatory variables: number of users (VP.056), linear density of users (VP.228), billed metered domestic consumption (VP.063), volume of unmetered consumption (VP.221), billed metered consumption (VP.232), volume produced + volume imported (VP.234) are estimated based on water utility manager opinion, historical data and Monte Carlo analysis using a uniform distribution function as explained in Ref. [16].
When low-level data are lacking, we advise using Equation (7) to estimate the mean and standard deviation of the following parameters: leakage flow rate, number of hidden leaks, and repair time for both pipes and connections over an observation period of at least 5 years. The obtained values represent a set of feasible solutions that satisfy the yearly hydraulic balance over the observation period.
The numbers of visible breaks and leaks on pipes and connections are assumed to be available as local data from the water utility. To account for the uncertainty of estimation, a Monte Carlo analysis is implemented using Equation (7), where a set of parameters and variables of the equation are randomly generated as shown in Figure 1. In the absence of data concerning the characteristics of leaks, normal distribution functions are used to randomly generate the flow rate, the number of leaks and the time to repair. This analysis provides a potential range of values for the parameters of Equation (7), which makes the estimation of water losses possible for prediction purposes. Figure 1 illustrates the required steps to estimate annual water losses.
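The random generation step can be sketched as follows. This is only an illustration under stated assumptions: the three parameters are drawn independently from normal distributions, negative draws are truncated at zero, and the loss for one draw is taken as repair time × flow rate × number of leaks, as in the leak terms of the loss equations. The function name and its arguments are hypothetical, and the steps of Figure 1 are not reproduced.

```python
import random
import statistics


def simulate_leak_losses(n_draws, flow_mean, flow_sd, leaks_mean, leaks_sd,
                         repair_mean, repair_sd, seed=0):
    """Monte Carlo estimate (mean, standard deviation) of leak losses."""
    rng = random.Random(seed)  # seeded for reproducibility
    losses = []
    for _ in range(n_draws):
        # Assumption: independent normal draws, truncated at zero
        flow = max(0.0, rng.gauss(flow_mean, flow_sd))
        leaks = max(0.0, rng.gauss(leaks_mean, leaks_sd))
        repair = max(0.0, rng.gauss(repair_mean, repair_sd))
        losses.append(repair * flow * leaks)
    return statistics.mean(losses), statistics.stdev(losses)
```

The returned mean and standard deviation give a range of plausible loss values that can be compared with the yearly hydraulic balance.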

2.2. The Total Annual Cost

The total annual cost (CTot) of decisions or of a policy defined by asset management actions is calculated by Equation (10). The variables required for cost calculations are summarized in Table 3.
CTot = CAPEX + OPEX
OPEX are derived from curative maintenance actions of repairing pipes and connections, on the one hand, and preventive maintenance actions of leak detection, on the other hand; Equation (11) summarizes the annual maintenance costs as follows:
OPEX = Cpipe_reparation + Cconnection_reparation + Cleak_detection
Each component of the maintenance cost is displayed in Equation (12) as follows:
OPEX = Crep × (np + nd) + Crep × nlc + Cdet × ldet
CAPEX measure the cost of asset management actions in terms of pipes, connections and meters renewal as indicated in Equation (13):
CAPEX = Cpipe_renewal + Cconnection_renewal + Cmeter_renewal
Equation (13) becomes as follows when each component of investment cost is displayed:
CAPEX = Cp × rp × lnet + Ccon × rc × nlc + Cmeter × rm × nm
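The cost side can be sketched directly from Equations (10), (12) and (14). The function names and the numbers in the usage note are assumptions for illustration; the arguments mirror the symbols of the equations as written above, and unit costs must match the units of the quantities they multiply.

```python
def opex(c_rep, n_p, n_d, n_lc, c_det, l_det):
    """Eq. (12): repair of pipes and connections plus leak detection."""
    return c_rep * (n_p + n_d) + c_rep * n_lc + c_det * l_det


def capex(c_p, r_p, l_net, c_con, r_c, n_lc, c_meter, r_m, n_m):
    """Eq. (14): renewal of pipes, connections and meters."""
    return c_p * r_p * l_net + c_con * r_c * n_lc + c_meter * r_m * n_m


def total_cost(opex_value, capex_value):
    """Eq. (10): total annual cost."""
    return capex_value + opex_value
```

For instance, `opex(1000, 5, 2, 3, 200, 82)` gives 26,400 in the chosen currency unit, to which the CAPEX of the same year is added to obtain CTot.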

2.3. The Artificial Neural Network (ANN)

A neural network is composed of multiple perceptrons and is called a deep neural network when the number of hidden layers is greater than or equal to 2 [19]. We use a multi-layer neural network to predict the WER from nine KPIs considered as input [16]. Figure 2 illustrates a perceptron representing a layer in an ANN.
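The value assigned to neuron i in Figure 2 can be calculated by Equation (15) as follows:
$$\mathrm{neuron}_i^{k} = \mathrm{relu}\left( w_{i,1} \times \mathrm{neuron}_1^{k-1} + w_{i,2} \times \mathrm{neuron}_2^{k-1} + w_{i,3} \times \mathrm{neuron}_3^{k-1} + b_i^{k-1} \right) \tag{15}$$
The rectified linear unit function $\mathrm{relu}$ is given by Equation (16):
$$\mathrm{relu}(x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x \geq 0 \end{cases} \tag{16}$$
The vector $\mathrm{neuron}^{k}$ that groups all the values assigned to the neurons in layer k is calculated as follows:
$$\mathrm{neuron}^{k} = \mathrm{relu}\left( \begin{bmatrix} w_{0,0} & \cdots & w_{0,n} \\ \vdots & \ddots & \vdots \\ w_{i,0} & \cdots & w_{i,n} \end{bmatrix} \begin{bmatrix} \mathrm{neuron}_0^{k-1} \\ \vdots \\ \mathrm{neuron}_n^{k-1} \end{bmatrix} + \begin{bmatrix} b_0^{k-1} \\ \vdots \\ b_i^{k-1} \end{bmatrix} \right) \tag{17}$$
Equation (17) becomes:
$$\mathrm{neuron}^{k} = \mathrm{relu}\left( M_w^{k-1} \times \mathrm{neuron}^{k-1} + b^{k-1} \right) \tag{18}$$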
where:
  • $M_w$ is the matrix of weights;
  • $\mathrm{neuron}$ is the vector of neuron values;
  • k is the index of the layer;
  • i is the number of neurons in the kth layer;
  • n is the number of neurons in the (k − 1)th layer;
  • b is the bias vector.
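The layer computation of Equation (18) can be sketched in plain Python as a matrix-vector product followed by the relu activation. The function names and the small weights in the usage note are illustrative assumptions, not the fitted network.

```python
def relu(x):
    """Rectified linear unit, Eq. (16)."""
    return x if x >= 0 else 0.0


def layer_forward(weights, biases, previous):
    """Eq. (18): one row of weights and one bias per neuron of layer k."""
    return [
        relu(sum(w * v for w, v in zip(row, previous)) + b)
        for row, b in zip(weights, biases)
    ]
```

With `weights = [[1, -1], [0.5, 0.5]]`, `biases = [0, 1]` and `previous = [1, 2]`, the first neuron receives a negative pre-activation and is clipped to zero, while the second outputs 2.5. Stacking such calls layer by layer gives the forward pass of the whole network.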
The output value of the ANN can be computed by Equation (18). In our case, the output is a single neuron that produces the water efficiency rate WER. The value of this neuron depends on the values of the previous neuron layers and the associated weights and biases.
Values of the previous layers also depend on weights and biases as well as input variables. Since the input variables are known, the objective is to determine the optimal values of the weights and biases to give a good prediction.
To do this, during the learning phase, the prediction is compared to the real value. Weights and biases are adjusted until a satisfactory error is obtained. The error is commonly calculated with a loss function noted $L$. For regression problems, the function $L$ corresponds to the mean square error, which averages the squared differences between the observed and predicted values:
$$L(y_i, \hat{y}_i) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{19}$$
where n is the number of samples, $y_i$ is the observed value for sample i, and $\hat{y}_i$ is the corresponding predicted value.
To minimize the loss function $L$, we use the Adagrad optimizer, which modifies weights and biases in order to minimize the error. Adagrad, short for adaptive gradient algorithm, was introduced by Ref. [20]. During the learning process, the weights are updated according to Equation (20):
$$\Delta w_i(t) = - \frac{\eta}{\sqrt{G_i(t) + \varepsilon}} \times \frac{\partial L}{\partial w_i}(t) \tag{20}$$
with:
$$\begin{cases} G_i(t) = G_i(t-1) + \left( \dfrac{\partial L}{\partial w_i}(t) \right)^2 \\ G_i(0) = 0 \end{cases} \tag{21}$$
The term $\eta / \sqrt{G_i(t) + \varepsilon}$ is the effective learning rate, with $\eta$ being the initial learning rate. The term $\partial L / \partial w_i (t)$ is the gradient (partial derivative of the loss function with respect to the weight). By this definition, $G_i$ is a monotonically increasing function, so the effective learning rate is monotonically decreasing. Note that $G_i$ and the effective learning rate are different for each weight.
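A single Adagrad update can be sketched as below, assuming the common form in which the accumulated squared gradient sits under a square root together with a small constant. The function name, the default learning rate and the default constant are choices made for this example.

```python
import math


def adagrad_step(weights, grads, accum, lr=0.1, eps=1e-8):
    """Apply one Adagrad update; returns (new_weights, new_accumulators)."""
    new_weights, new_accum = [], []
    for w, g, g_sum in zip(weights, grads, accum):
        g_sum = g_sum + g * g                  # accumulate squared gradients
        step = lr / math.sqrt(g_sum + eps)     # per-weight effective learning rate
        new_weights.append(w - step * g)       # move against the gradient
        new_accum.append(g_sum)
    return new_weights, new_accum
```

On the toy loss (w − 3)² starting from w = 0, the gradient is −6, so the first update moves w by approximately the initial learning rate towards 3. Later steps shrink as the accumulator grows, which is the decreasing effective learning rate described above.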
Figure 3 illustrates the ANN built for our prediction model. It is designed for the nine input explanatory variables representing KPIs (with French mandatory codes), two hidden layers with the same number of neurons as the input layer. The output layer considered as the output of the model is only composed of the neuron corresponding to the water efficiency rate, WER (code: P104.3).
Since the number of input samples is more than 10,000, the model is trained with a batch size of 200. The batch size corresponds to the number of samples that will be propagated through the neural network. After propagation, weights and biases are updated in order to decrease the error. Once all training samples are passed once through the network, this counts as 1 epoch. The network training is done by performing multiple epochs.

2.4. NSGA II and the Problem Formulation

The problem to solve concerns the optimization of asset management actions in an ex-ante way in order to maximize the WER and minimize the annual total cost. The decision variables measure the level of actions in terms of pipe, connection, and meter maintenance and renewal. We consider that the variables ldet, rc, rp, and rm are the most relevant for the decision maker in terms of asset management. The problem can be formulated as follows:
$$\text{Maximize } f_1(l_{det}, r_c, r_p, r_m) = WER(t) \tag{22}$$
$$\text{Maximize } f_2(l_{det}, r_c, r_p, r_m) = \frac{1}{C_{Tot}(t)} \tag{23}$$
constrained by:
$$l_{det\_min} \leq l_{det} \leq l_{det\_max} \tag{24}$$
$$r_{p\_min} \leq r_p \leq r_{p\_max} \tag{25}$$
$$r_{c\_min} \leq r_c \leq r_{c\_max} \tag{26}$$
$$r_{m\_min} \leq r_m \leq r_{m\_max} \tag{27}$$
The values of the upper and lower limits of the decision variables are defined according to the water utility manager's expectations. By considering the two fitness functions f1 and f2, NSGA II attempts to find the best 4-tuples (ldet, rc, rp, rm) from a population of potential solutions. The population size is set in advance, and the values of the 4-tuple elements are generated randomly between the lower and upper boundaries to initialize the population, as shown in Figure 4.
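The initialization step can be sketched as follows, assuming uniform sampling between the bounds (the text only states that values are generated randomly). The bounds in `BOUNDS` are hypothetical placeholders, not the limits used in the case study.

```python
import random

# Hypothetical (lower, upper) limits for (l_det, r_c, r_p, r_m)
BOUNDS = {
    "l_det": (0.0, 82.0),
    "r_c": (0.0, 0.05),
    "r_p": (0.0, 0.03),
    "r_m": (0.0, 0.20),
}


def init_population(size, bounds, seed=None):
    """Draw `size` random 4-tuples, each gene within its own bounds."""
    rng = random.Random(seed)
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in bounds.values())
        for _ in range(size)
    ]
```

Calling `init_population(200, BOUNDS, seed=1)` returns 200 candidate 4-tuples, each of which can then be scored with the two fitness functions.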

2.4.1. The Concept of Non-Dominance

NSGA II implements the concept of dominance to reach potential solutions. The concept of dominance is well defined in Ref. [21]. Two definitions can be considered. The first one considers two solutions: solution X1 dominates solution X2 if both of the following conditions are true: (i) solution X1 is not worse than X2 for all the objectives, and (ii) solution X1 is strictly better than X2 for at least one objective. These conditions are summarized in Equations (28) and (29).
$$\forall i \in \{1, 2\}: f_i(X_1) \geq f_i(X_2) \tag{28}$$
$$\exists j \in \{1, 2\}: f_j(X_1) > f_j(X_2) \tag{29}$$
The second definition considers as non-dominated solutions those that are not dominated by any member of the considered population.
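Both definitions translate directly into code when every objective is to be maximized, as in the problem formulation. The function names and the sample fitness pairs in the usage note are illustrative.

```python
def dominates(fa, fb):
    """True if fitness vector fa dominates fb (all objectives maximized)."""
    not_worse = all(a >= b for a, b in zip(fa, fb))
    strictly_better = any(a > b for a, b in zip(fa, fb))
    return not_worse and strictly_better


def non_dominated(points):
    """Keep the points that no other point of the set dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For the fitness pairs (WER, 1/CTot) = (0.8, 1/500), (0.7, 1/400) and (0.6, 1/600), the third is dominated by the other two, while the first two are mutually non-dominated: one has the higher WER, the other the lower cost.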

2.4.2. The Crowding Distance

To sort solutions, NSGA II uses a crowding distance [17,22,23]. It is used to estimate the density of solutions surrounding an individual in the population by considering the difference of the objective values of the nearest neighbor as shown in Figure 5. It is an estimate of the size of the largest cuboid enclosing point k, without including any other point in the population. In the following sections, the term individual designates a potential solution.
Let F be the size of the front. For interior individuals, the crowding distance is calculated from the difference between the objective values of the two nearest neighbors:
$$d_i = \sum_{m=1}^{M} \frac{f_{i+1}^{m} - f_{i-1}^{m}}{f_{max}^{m} - f_{min}^{m}} \tag{30}$$
The edge individuals, i.e., the first and the last individuals in the rank, are assigned a large (infinite) distance to ensure that boundary points are always selected, as shown by Equation (31):
$$d_0 = d_{F-1} = \infty \tag{31}$$
where M is the number of objectives, $f_i^{m}$ is the ith fitness value in the mth objective, and $f_{max}^{m}$ and $f_{min}^{m}$ are the maximum and minimum values of the mth objective (in the non-dominated set).
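The classic crowding distance can be sketched as below. The sketch follows the usual NSGA II procedure of sorting the front separately for each objective, which is an assumption about the implementation; the function name is hypothetical, and the dynamic variant is not covered.

```python
import math


def crowding_distances(front):
    """Crowding distance of each fitness tuple in a non-dominated front."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):
        # Indices of the front sorted by the m-th objective
        order = sorted(range(n), key=lambda i: front[i][m])
        f_min, f_max = front[order[0]][m], front[order[-1]][m]
        # Boundary individuals always get an infinite distance
        dist[order[0]] = dist[order[-1]] = math.inf
        if f_max == f_min:
            continue  # degenerate objective: no contribution
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][m] - front[order[pos - 1]][m]
            dist[order[pos]] += gap / (f_max - f_min)
    return dist
```

For the front `[(0, 4), (1, 2), (2, 1), (4, 0)]`, the two extreme points get an infinite distance and the two interior points each get 1.25.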
This formulation maintains diversity in the population by eliminating redundant individuals but suffers from a loss of both vertical and horizontal diversity, as explained by Ref. [22]. To improve the diversity in the final front, an improvement of the crowding distance was proposed by Ref. [23] by defining a dynamic crowding distance:
$$Dd_i = \frac{d_i}{\log\left( \dfrac{1}{V_i} \right)} \tag{32}$$
with:
$$V_i = \sum_{m=1}^{M} \left( \left| f_{i+1}^{m} - f_{i-1}^{m} \right| - d_i \right)^2 \tag{33}$$
The dynamic crowding distance is computed for each individual in the non-dominated set. The individual which has the lowest dynamic crowding distance is removed. The dynamic crowding distance is updated after each removal. These operations are repeated until the size of the non-dominated set is equal to the population size.

2.4.3. The Selection Method

Once individuals have been assessed and sorted, k elements of the population are taken as candidates for the mating pool, where k designates the tournament size [17]. Random selection is the particular case of tournament selection with k = 1. For k > 1, the selection method is called tournament selection. The k individuals are compared to each other based on their rank and crowding distance. The best individual is added to the mating pool. The operation is repeated a second time to obtain two individuals in the mating pool, as shown in Figure 6. Selected solutions are subject to crossover and mutation to create offspring. Tournament selection is repeated until enough offspring have been created.
There exist other selection operators in which individuals are chosen in proportion to their fitness value, such as roulette wheel selection (RWS). An individual is selected with a probability equal to the ratio between its fitness value and the sum of the fitness values of the individuals in the mating pool [24].
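The two operators can be sketched as follows. The sketch assumes that tournament candidates are drawn without replacement and that roulette wheel selection works on one non-negative scalar fitness per individual; how the two objectives are combined into that scalar is not specified here, so it is left to the caller. All names are hypothetical.

```python
import random


def tournament_select(population, ranks, crowding, k=2, rng=random):
    """Pick k candidates at random; keep the lowest rank, then the largest distance."""
    candidates = rng.sample(range(len(population)), k)
    best = min(candidates, key=lambda i: (ranks[i], -crowding[i]))
    return population[best]


def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitness))
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if fit > 0 and pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off
```

With k = 1 the tournament reduces to a purely random pick, which matches the remark above on random selection.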

2.4.4. The Crossover

A crossover uses the information of two parents in the population to obtain one child [25]. There are different possible recombinations, and several authors have compared them on different problems [26,27]. There is no consensus in the literature on the relative effectiveness of single-point and multi-point crossover; it depends on the particularities of the problem. The danger of comparing algorithms by their performance on a small sample is underlined in Ref. [28]. The authors advise integrating problem-specific knowledge into the functioning of the algorithm; this integration can also concern crossover operators. In our case, there are two objective functions, and the only constraints are the upper and lower bounds of the 4-tuple variables. Hence, we choose the flat crossover, which is a widely used crossover method [29]. Considering two parents in the current population:
$$Parent_1 = (l_{det}^{1}, r_p^{1}, r_c^{1}, r_m^{1}) \tag{34}$$
$$Parent_2 = (l_{det}^{2}, r_p^{2}, r_c^{2}, r_m^{2}) \tag{35}$$
and a random vector:
$$r = (r_1, r_2, r_3, r_4) \tag{36}$$
with random values $r_i \in [0, 1]$. The ith child $(l_{det}^{i}, r_p^{i}, r_c^{i}, r_m^{i})$ is a linear combination of the two parents:
$$l_{det}^{i} = r_1 \times l_{det}^{1} + (1 - r_1) \times l_{det}^{2} \tag{37}$$
$$r_p^{i} = r_3 \times r_p^{1} + (1 - r_3) \times r_p^{2} \tag{38}$$
$$r_c^{i} = r_2 \times r_c^{1} + (1 - r_2) \times r_c^{2} \tag{39}$$
$$r_m^{i} = r_4 \times r_m^{1} + (1 - r_4) \times r_m^{2} \tag{40}$$
Table 4 shows an example of the offspring that two parents can give by applying this crossover.
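A minimal sketch of the flat crossover, assuming one independent random coefficient per decision variable; the function name and the parent values in the usage note are illustrative.

```python
import random


def flat_crossover(parent1, parent2, rng=random):
    """Return one child; each gene is a random convex combination of the parents."""
    child = []
    for a, b in zip(parent1, parent2):
        r = rng.random()                 # r in [0, 1), drawn per gene
        child.append(r * a + (1 - r) * b)
    return tuple(child)
```

For parents `(82.0, 0.01, 0.02, 0.10)` and `(40.0, 0.03, 0.05, 0.20)`, every gene of the child lies between the corresponding genes of its parents, so a child of two feasible parents automatically satisfies the bound constraints.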

2.4.5. The Mutation

The mutation is an operator that modifies an individual to explore the entire search space [25] and to escape from local optima thanks to small changes in the values of the 4-tuple variables. It is used to maintain diversity in the population of potential solutions. We use the polynomial mutation introduced by Ref. [30]. For each variable, there is a mutation probability. The mutation probability is set at 1/4 since each solution is represented by a 4-tuple. There is one mutation per offspring on average.

2.4.6. The Selection of Offspring

Once the crossovers and mutations have been performed, we end up with a population of P individuals and a population of P offspring. The combined set has size 2P and must be reduced to P individuals. This selection is made by keeping the best individuals, as required by NSGA II [17]. In this way, the next generation will be better than the previous one, or equivalent if no offspring is better than the current population. This is called elitist selection. To select the best individuals, we define an operator $\prec_n$ based on the individual domination rank $rank_p$ and the dynamic crowding distance $Dd_p$. The partial order $\prec_n$ is defined as:
$$p \prec_n q \ \text{ if } \ (rank_p < rank_q) \ \text{ or } \ \left( (rank_p = rank_q) \ \text{ and } \ Dd_p > Dd_q \right) \tag{41}$$
The individual with the lower rank, according to the non-dominated sorting algorithm, is preferred. If two individuals have the same rank, the one which is located in the lower density of solutions is preferred.
The 2P individuals are first sorted in ascending order of the rank obtained by the non-dominated sorting algorithm. Then, individuals of equal rank are sorted in descending order of dynamic crowding distance. The next generation is filled in this order until the new population contains P individuals.
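This elitist reduction amounts to a sort on the pair (rank, crowding distance) followed by truncation. The sketch below assumes that the ranks and distances have already been computed for the merged parent and offspring set; the function name is hypothetical.

```python
def select_next_generation(individuals, ranks, distances, size):
    """Keep the `size` best individuals: lowest rank first, then largest distance."""
    order = sorted(
        range(len(individuals)),
        key=lambda i: (ranks[i], -distances[i]),
    )
    return [individuals[i] for i in order[:size]]
```

With individuals `["a", "b", "c", "d"]`, ranks `[1, 0, 0, 1]` and distances `[3, 1, 2, 5]`, keeping three individuals returns `["c", "b", "d"]`: both rank-0 individuals survive, and the remaining slot goes to the rank-1 individual in the less crowded region.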

2.4.7. Performance Metrics

The effectiveness of the model depends on its ability to ensure diversity and a good distribution and spread of solutions. To evaluate the distribution, we use the spacing index (SP) introduced by Ref. [31]. For a population of P individuals (potential solutions), we first calculate $d_i$, the minimum over all other solutions of the sum of the absolute differences in objective fitness values between the ith solution and that solution, as shown in Equation (42), and $\bar{d}$, the mean value of $d_i$, calculated by Equation (43):
$$d_i = \min_{k,\ k \neq i} \left\{ \sum_{m=1}^{M} \left| f_i^{m} - f_k^{m} \right| \right\} \tag{42}$$
$$\bar{d} = \frac{\sum_{i=1}^{|P|} d_i}{|P|} \tag{43}$$
Therefore, SP is obtained by Equation (44):
$$SP = \sqrt{\frac{1}{|P| - 1} \sum_{i=1}^{|P|} \left( d_i - \bar{d} \right)^2} \tag{44}$$
SP is used to evaluate the spacing between the different solutions. If the distance between each solution is the same, then the SP value will be zero. Thus, a value of zero or near zero indicates a good distribution of solutions on the Pareto front. The spread index was proposed by Ref. [17]:
$$\Delta = \frac{d_f + d_l + \sum_{i=1}^{|P|} \left| d_i - \bar{d} \right|}{d_f + d_l + (|P| - 1) \times \bar{d}} \tag{45}$$
$d_f$ and $d_l$ are the Euclidean distances between the extreme solutions and the obtained Pareto solutions. A $\Delta$ value close to 0 means that the solutions are well dispersed along the Pareto front.
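The spacing index can be sketched as below, assuming the square-root form of the metric; the function name and the sample front in the usage note are illustrative. The spread index is not sketched because it also requires the extreme solutions of the reference front.

```python
import math


def spacing(front):
    """Spacing index SP of a list of fitness tuples (at least two points)."""
    # d_i: smallest sum of absolute objective differences to any other point
    d = [
        min(
            sum(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(front) if j != i
        )
        for i, p in enumerate(front)
    ]
    d_bar = sum(d) / len(d)
    return math.sqrt(sum((x - d_bar) ** 2 for x in d) / (len(d) - 1))
```

For the evenly spaced front `[(0, 3), (1, 2), (2, 1), (3, 0)]`, every point is at the same distance from its nearest neighbor, so `spacing` returns 0; moving one point off the regular grid yields a strictly positive value.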

3. Case Study

The model is implemented on a real water distribution network in the south of France. According to the 2016 data, the water system delivers 700,000 m3 of drinking water to about 6300 users through a network 82 km long. We take the asset management actions currently implemented by the water utility as the baseline solution. It can be summarized as one leak detection campaign over the entire network (ldet = 82 km), which detects 14 leaks on average. Annual renewal rates are: rp = 0.71%, rc = 2.5% and rm = 10%. With these actions, WER = 76.90% for a total cost of 551,493 €, split into 70% CAPEX and 30% OPEX. We aim at improving WER by conducting alternative asset management actions at lower costs than commonly used strategies. Before searching for compromise solutions, the ANN is fitted on the SISPEA database. SISPEA is a mandatory French database that gathers 26 KPIs from more than 12,000 water utilities between 2006 and 2016. The data were split into two samples: 70% is used to fit the ANN model, and the remaining 30% is used for validation.

3.1. Artificial Neural Network Fitting

The calibration of the ANN requires the definition of a set of parameters that improve its accuracy. As discussed in Ref. [16], many simulations are carried out in order to determine the most appropriate values for the number of hidden layers, the number of neurons per layer, and the type of activation function. The selected ANN has three hidden layers with 144, 36, and 9 neurons, respectively. The chosen activation function is relu for all neurons. The variables required to estimate water losses for the year (N + 1) at the local scale (see Equation (7)) are generated based on expert opinion and Monte Carlo analysis. Table 5 compares the observed and predicted values of WER for the period between 2010 and 2016.
According to Table 5, the ANN predicts the WER with high accuracy; the estimation error ranges between −1.27% and +0.13%.
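The error figures can be checked directly from the Table 5 values. The sketch below assumes that the estimation error is the relative difference (predicted − observed) / observed; the source does not state the definition, but this one reproduces the tabulated values after rounding.

```python
# (year, observed WER %, ANN-predicted WER %) copied from Table 5
TABLE_5 = [
    (2010, 65.9, 65.7),
    (2011, 63.2, 62.4),
    (2012, 61.4, 60.7),
    (2013, 75.0, 75.1),
    (2014, 75.1, 75.1),
    (2015, 76.9, 76.8),
    (2016, 76.6, 76.6),
]


def relative_error_pct(observed, predicted):
    """Relative estimation error in percent (assumed definition)."""
    return 100.0 * (predicted - observed) / observed


errors = {year: round(relative_error_pct(obs, pred), 2) for year, obs, pred in TABLE_5}
worst_under = min(errors.values())  # largest under-prediction
worst_over = max(errors.values())   # largest over-prediction
```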

3.2. Parameters of the NSGA II

Implementing NSGA II requires choosing the selection method and setting the population size of potential solutions. Using the performance indicators summarized in Table 5, we compare tournament selection with tournament size k = 2, random selection, and roulette wheel selection. The mean values and standard deviations are calculated over 10 tests for each selection method; the results are summarized in Table 6. The population size is fixed at 200, and the number of generations is set at 25.
Tournament selection and roulette wheel selection seem more efficient than random selection in terms of both distribution and spread.
Roulette wheel selection gives the best results: both the mean and the standard deviation of the two performance metrics are the lowest. Note that the standard deviation for tournament selection is greater than for random selection.
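The three selection methods compared above can be sketched as follows. The example assumes a single scalar fitness per individual, where higher is better. In NSGA-II the fitness would be derived from the non-domination rank and the crowding distance, which the sketch does not reproduce. All function names are hypothetical.

```python
import random


def random_selection(population, fitness, rng):
    """Pick one individual uniformly at random, ignoring fitness."""
    return rng.choice(population)


def tournament_selection(population, fitness, rng, k=2):
    """Draw k distinct individuals and keep the fittest one."""
    contestants = rng.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitness[i])
    return population[winner]


def roulette_wheel_selection(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]


rng = random.Random(42)
population = ["a", "b", "c", "d"]
fitness = [1.0, 2.0, 3.0, 4.0]  # hypothetical scalar fitness values
picks = [roulette_wheel_selection(population, fitness, rng) for _ in range(1000)]
```

With these weights, roulette wheel selection draws "d" about four times as often as "a", whereas random selection would draw them equally often.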

3.3. Population Evolution

The objective of the model is to get closer to the true and unknown Pareto front. Over the generations, the population should move closer to the Pareto front. The starting population is generated randomly between the limits set for the 4-tuple values of the decision variables as presented in Table 7.
Figure 7 shows the evolution of the potential solutions composing the population (size P = 200) over 25 generations using roulette wheel selection. The optimal front is reached quickly: the population improves significantly in the first generations, then very slowly, and the front stabilizes over the last five generations. The obtained front confirms the relevance of using roulette wheel selection.
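Generating the starting population within the bounds of Table 7 can be sketched as below. The network length of 82 km is taken from the case study description and is an assumption of this example, as are the names `BOUNDS`, `random_individual`, and `initial_population`. Renewal rates are expressed as fractions rather than percentages.

```python
import random

L_NET_KM = 82.0  # network length quoted in the case study (assumed here)

# (lower, upper) bound of each decision variable, following Table 7
BOUNDS = {
    "ldet": (0.0, 12 * L_NET_KM),  # km investigated by leak detection per year
    "rp": (0.0, 0.05),             # pipe renewal rate
    "rc": (0.0, 0.05),             # connection renewal rate
    "rm": (0.0, 0.10),             # meter renewal rate
}


def random_individual(rng):
    """Draw one 4-tuple of decision variables uniformly within its bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}


def initial_population(size, seed=None):
    """Build a random starting population of the requested size."""
    rng = random.Random(seed)
    return [random_individual(rng) for _ in range(size)]


population = initial_population(200, seed=1)
```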

3.4. Problem Resolution

The tests performed guide the choice of the selection, crossover, and mutation operators. To solve the considered problem (see Equations (22) and (23)), NSGA II is implemented with a randomly generated initial population of 1000 individuals. The crossover probability is set to 90%, and the offspring are generated using the flat crossover. Polynomial mutation is used with a mutation index of 1 and a mutation probability of 25%. The new population is selected using roulette wheel selection. The number of generations is 25.
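The settings listed above can be gathered into a small configuration, together with a sketch of the mutation step. The code assumes the basic polynomial mutation form described by Deb [24,30], with the result clipped to the variable bounds. The variant actually used in the study is not specified, and the names below are hypothetical.

```python
import random

# Settings quoted in the text
POP_SIZE = 1000
N_GENERATIONS = 25
CROSSOVER_PROB = 0.90
MUTATION_PROB = 0.25
MUTATION_INDEX = 1.0


def polynomial_mutation(x, lower, upper, rng, eta=MUTATION_INDEX, prob=MUTATION_PROB):
    """Mutate one real-valued gene with the basic polynomial mutation (assumed variant)."""
    if rng.random() >= prob:
        return x  # gene left unchanged
    u = rng.random()
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
    mutated = x + delta * (upper - lower)
    return min(max(mutated, lower), upper)  # clip to the allowed range


rng = random.Random(7)
# Hypothetical gene: a pipe renewal rate of 1% bounded by [0, 5%]
samples = [polynomial_mutation(0.01, 0.0, 0.05, rng, prob=1.0) for _ in range(500)]
```

A low index such as 1 spreads the mutated values widely around the parent, which favours exploration of the search space.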
Figure 8 illustrates the Pareto front obtained for the considered objectives. The blue dots forming the front in the middle correspond to the average value of the water efficiency rate predicted by the ANN model. The red dots forming the upper and lower fronts define the limits of the 95% confidence interval. Each point of the front represents a 4-tuple of the constrained decision variables of the problem: the rate of pipe renewal, the rate of connection renewal, the rate of meter renewal, and the length investigated by leak detection. Table 8 details some of the solutions composing the Pareto front.
The comparison of the baseline solution (current practice) with the proposed solutions indicates that current practice does not offer a compromise between cost and performance. It costs more than all the solutions listed in Table 8 while reaching WER = 76.9%. Another interesting analysis concerns the split of expenditures between CAPEX and OPEX. The water utility favors investment by increasing CAPEX (70%), whereas our model advises increasing OPEX to 99% of total expenditures (according to Table 8).
The results show a significant influence of leak detection and repair actions on the WER. This is an intuitive result, but the main advantage of the proposed model is its capacity to predict the effects of actions on the WER. The length of the water system under consideration is about 86 km. The lengths investigated by leak detection per year in Table 8 correspond to approximately one, two, and three times the total length of the network. The total cost is divided into two parts, CAPEX and OPEX. OPEX is largely due to leak detection and asset repairs, while CAPEX is low because investments are low. In the short term, the model shows that when efficiency is already high, the main way to improve WER is to search for leaks. Indeed, leak detection improves the water efficiency rate more efficiently and for an acceptable cost, compared to renewal actions. Renewal actions start to have an impact when performance values and asset condition are low.
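The comparison with the baseline can be expressed as a Pareto dominance check on the two objectives, cost (to minimize) and WER (to maximize). The sketch below uses the baseline figures from the case study and the (total cost, WER) pairs of Table 8. It treats the predicted and observed WER values as directly comparable, which is an assumption, and the function name `dominates` is hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one.

    Each solution is a (total cost in euros, WER in %) pair.
    """
    cost_a, wer_a = a
    cost_b, wer_b = b
    no_worse = cost_a <= cost_b and wer_a >= wer_b
    strictly_better = cost_a < cost_b or wer_a > wer_b
    return no_worse and strictly_better


BASELINE = (551_493, 76.9)  # current practice
TABLE_8 = [(223_302, 74.5), (309_838, 79.0), (413_868, 81.6)]

# Solutions of Table 8 that are cheaper than the baseline and reach a higher WER
dominating = [s for s in TABLE_8 if dominates(s, BASELINE)]
```

Under this assumption the second and third solutions of Table 8 dominate the baseline, while the first is cheaper but reaches a lower WER.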

4. Discussion

Predictions of WER (outputs) obtained for management actions (inputs) seem to be coherent with practice. In fact, Table 8 shows a positive correlation between WER and total cost, which confirms that more money must be spent to enhance performance. There seems to be a threshold effect between expenditures and performance: even if we nearly double the budget (from about 223 k€ to 414 k€, see Table 8), performance increases by only about 7 percentage points. This result is important because it indicates that even if the budget is available, it has to be spent in a certain manner and shared adequately between investment and maintenance actions. Another point of interest concerns the share of OPEX in the total expenditures. OPEX represents 99% of the total costs, which implies that to improve WER in the short term, it is recommended to spend more on maintenance actions than on investment. The advantage of the proposed approach is that it drives the decision by indicating the type of maintenance actions to implement. For the studied case, Table 8 indicates that leak detection and the repair of leaks seem to be efficient. The upper limits for the decision variables were defined as follows: the limit of the leak detection rate corresponds to a total inspection of the entire network each month (12 per year); the renewal rate of pipes and connections is limited to 5% per year (5 times the actual rate), considering an average asset lifespan of 50 years (ambitious); and the renewal rate of meters is limited to 10% (two times the actual rate), which corresponds to an average meter lifespan of 10 years.
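The figures discussed above can be recomputed from Table 8. The sketch below derives the OPEX share of each solution and the marginal cost per additional WER point between consecutive solutions. The variable names are hypothetical, and the marginal cost is only an illustrative way to read the threshold effect.

```python
# (total cost, CAPEX, OPEX, predicted WER %) copied from Table 8, costs in euros
TABLE_8 = [
    (223_302, 2_837, 220_465, 74.5),
    (309_838, 3_935, 305_903, 79.0),
    (413_868, 4_536, 409_332, 81.6),
]

# Share of OPEX in the total cost of each solution, in percent
opex_share_pct = [100.0 * opex / total for total, _, opex, _ in TABLE_8]

# Extra euros spent per additional WER point between consecutive solutions
marginal_cost = [
    (nxt[0] - cur[0]) / (nxt[3] - cur[3])
    for cur, nxt in zip(TABLE_8, TABLE_8[1:])
]

budget_ratio = TABLE_8[-1][0] / TABLE_8[0][0]  # most expensive vs cheapest solution
wer_gain = TABLE_8[-1][3] - TABLE_8[0][3]      # gain in percentage points
```

The marginal cost roughly doubles, from about 19,200 € to about 40,000 € per WER point, while the budget ratio of about 1.85 buys 7.1 points in total.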
The definition of the NSGA II parameters requires expertise and should be driven by tests. The results show the relevance of comparing different selection and crossover operators. For the current study, roulette wheel selection seems more pertinent than the other methods. The population size and the number of generations are also important parameters to fit. The procedure followed aims to guide the implementation of the approach by: (i) defining suitable operators for a fixed-size population (P = 200); (ii) testing a range of values for the number of generations (5 to 25); and (iii) increasing the population size from 200 to 1000 for a given number of generations. Even if we cannot generalize the obtained results, it seems that this procedure improves the shape of the Pareto front and makes it less discontinuous and more uniform. This can be interpreted as an improvement in the consistency of the front, as shown in Figure 8.
The variety of solutions offered by the Pareto front constitutes a set of potential actions to implement depending on the context, constraints, and objectives to reach. This constitutes a valuable mitigation tool for decision makers and stakeholders.
Another advantage of the prediction model is that it can be coupled with NSGA II to guide the search for the most relevant solutions. Even if the results are encouraging, some aspects remain to be investigated. The dynamics of the model are not currently addressed: how can the planning of actions be improved from year to year by updating the input data? Another important aspect concerns the long-term effect of asset management actions; it appears that maintenance actions significantly improve the value of WER at a low total cost. This is relevant in the short term, but it should not be read as encouraging non-investment. Under-investment exposes the water utility to a risk, namely the deterioration of the assets and of the delivered service. One possible improvement is to constrain the rate of asset renewal during the search for solutions in order to avoid significant asset aging due to disinvestment.

5. Conclusions

The present research is an encouraging improvement of our ANN-based model for predicting KPIs. The proposed improvement confirms that it is possible to predict and optimize KPIs for water utilities by coupling ANN and NSGA II when local data are lacking. Many aspects should be checked in relation to the characteristics and parameters of the ANN, such as the number of hidden layers, the number of neurons, and the activation functions. For NSGA II, the population size and the types of selection, crossover, and mutation have to be fixed before implementing the prediction model. All these aspects can make the model difficult for a water utility to implement because it requires specific skills. The current model should be simplified so that water utilities can implement it easily.
The absence of local data is addressed by using a national mandatory database and Monte Carlo simulations; this can be useful in the short term. We demonstrate to water utility managers the usefulness of data for prediction; this should encourage them to improve their information systems (IS) and to move toward a smart water system that captures real-time data to support decision making. The interpretation of the results should take into account the context of the water utility. Preferring maintenance actions over renewal actions can be relevant when the value of WER is high, which means that the assets are in good condition and do not require renewal. This can be acceptable in certain conditions but is not adequate when the assets have deteriorated. For example, if assets are in good condition and the water system is young, it is not necessary to check the network by leak detection. The context and condition of the network have to be considered when the bounds of the decision variables are set. Their lower bounds may be set to thresholds other than 0 to avoid asset aging in the long term.
The variation of WER and cost shown by the Pareto front seems realistic and offers the decision maker a valuable variety of potential solutions.
Further research will explore the reproducibility of the developed approach for other KPIs by defining the set of input variables and how the ANN model and NSGA II can help to predict them. For example, the SISPEA database contains 25 additional KPIs that merit being predicted in the same way as WER. We intend to explore the possibility of adapting the current model into a general methodology for predicting water utility KPIs.

Author Contributions

A.N. conceived the methodology, analyzed and interpreted the results, and contributed to writing, reviewing, and editing the paper. J.B. implemented the methodology, performed the data processing and model fitting, and participated in writing the paper.

Funding

The work presented is part of the French project “SPHEREAU”, grant number AAP FUI n° 22. It was funded by “bpifrance”, the basin water agency “Agence de l’Eau Rhin Meuse”, French regional authorities “Région Centre Val de Loire” and “la region Grand Est”.

Acknowledgments

We are grateful to the manager of the utility who helped us in our research and provided us with data, information and advice.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Alegre, H.; Hirner, W.; Baptista, J.M.; Parena, R. Performance Indicators for Water Supply Services. Manual of Best Practice Series; IWA Publishing: London, UK, 2000; p. 146.
  2. Mutchek, M.; Williams, E. Moving Towards Sustainable and Resilient Smart Water Grids. Challenges 2014, 5, 123–137.
  3. Walski, T.M.; Pelliccia, A. Economic analysis of water main breaks. J. Am. Water Work. Assoc. 1982, 74, 3140–3147.
  4. Skipworth, P.J. Whole Life Costing for Water Distribution Network Management; Thomas Telford: London, UK, 2002.
  5. Park, S.W.; Loganathan, G.V. Methodology for economically optimal replacement of pipes in water distribution systems: 1. Theory. KSCE J. Civ. Eng. 2002, 6, 539–543.
  6. Shamir, U.; Howard, C.D.D. Analytical approach to scheduling pipe replacement. J. AWWA 1979, 71, 248–258.
  7. Kleiner, Y.; Adams, B.J.; Rogers, J.S. Selection and scheduling of rehabilitation alternatives for water distribution systems. Water Resour. Res. 1998, 34, 2053–2061.
  8. Halhal, D.; Walters, G.A.; Ouazar, D.; Savic, D.A. Water Network Rehabilitation with Structured Messy Genetic Algorithm. J. Water Resour. Plan. Manag. 1997, 123, 137–146.
  9. Kim, K.; Seo, J.; Hyung, J.; Kim, T.; Kim, J.; Koo, J. Economic-based approach for predicting optimal water pipe renewal period based on risk and failure rate. Environ. Eng. Res. 2018, 24, 63–73.
  10. Xu, Q.; Chen, Q.; Ma, J.; Blanckaert, K. Optimal pipe replacement strategy based on break rate prediction through genetic programming for water distribution network. J. Hydro Environ. Res. 2013, 7, 134–140.
  11. O’Reilly, G.; Bezuidenhout, C.; Bezuidenhout, J.J. Artificial neural networks: Applications in the drinking water sector. Water Sci. Technol. Water Supply 2018, 18, 1869–1887.
  12. Cuesta Cordoba, G.A.; Tuhovcak, L.; Taus, M. Using Artificial Neural Network Models to Assess Water Quality in Water Distribution Networks. Procedia Eng. 2014, 70, 399–408.
  13. Mounce, S.R.; Machell, J. Burst detection using hydraulic data from water distribution systems with artificial neural networks. Urban Water J. 2006, 3, 21–31.
  14. Dongwoo, J.H.P.; Gyewoon, C. Estimation of Leakage Ratio Using Principal Component Analysis and Artificial Neural Network in Water Distribution Systems. Sustainability 2018, 10, 750.
  15. Caputo, A.C.; Pelagagge, P.M. An inverse approach for piping networks monitoring. J. Loss Prev. Process. Ind. 2002, 15, 497–505.
  16. Nafi, A.; Brans, J. Prediction of Water Utility Performance: The Case of the Water Efficiency Rate. Water 2018, 10, 1443.
  17. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evolut. Comput. 2002, 6, 182–197.
  18. Legifrance. Available online: https://www.legifrance.gouv.fr/affichTexte.do?cidTexte= ORFTEXT000000274838URL (accessed on 12 June 2019).
  19. Kamiński, K.; Kamiński, W.; Mizerski, T. Application of Artificial Neural Networks to the Technical Condition Assessment of Water Supply Systems. Ecol. Chem. Eng. S 2017, 24, 31–40.
  20. Duchi, J.; Hazan, E.; Singer, Y. Adaptive Sub-gradient Methods for Online Learning and Stochastic Optimization. J. Mach. Learn. Res. 2011, 12, 2121–2159.
  21. Ding, L.; Zeng, S.; Kang, L. A fast algorithm on finding the non-dominated set in multi-objective optimization. In Proceedings of the Congress on Evolutionary Computation (CEC’03), Canberra, ACT, Australia, 8–12 December 2003.
  22. Yang, L.; Guan, Y.; Sheng, W. A novel dynamic crowding distance based diversity maintenance strategy for MOEAs. In Proceedings of the International Conference on Machine Learning and Cybernetics (ICMLC), Ningbo, China, 9–12 July 2017; IEEE: Piscataway, NJ, USA, 2017; Volume 1, pp. 211–216.
  23. Luo, B.; Zheng, J.; Xie, J.; Wu, J. Dynamic crowding distance: A new diversity maintenance strategy for MOEAs. In Proceedings of the Fourth International Conference on Natural Computation, Jinan, China, 18–20 October 2008; IEEE: Piscataway, NJ, USA, 2008; Volume 1, pp. 580–585.
  24. Deb, K. Multi-Objective Optimization Using Evolutionary Algorithms; John Wiley & Sons: Hoboken, NJ, USA, 2001.
  25. Lim, S.M.; Sultan, A.B.; Sulaiman, N.; Mustapha, A.; Leong, K.Y. Crossover and Mutation Operators of Genetic Algorithms. Int. J. Mach. Learn. Comput. 2017, 7, 9–12.
  26. Mendes, J.A. Comparative study of crossover operators for genetic algorithms to solve the job shop scheduling problem. WSEAS Trans. Comput. 2013, 12, 164–173.
  27. Liagkouras, K.; Metaxiotis, K. An Elitist Polynomial Mutation Operator for Improved Performance of MOEAs in Computer Networks. In Proceedings of the 22nd International Conference on Computer Communication and Networks (ICCCN), Nassau, Bahamas, 30 July–2 August 2013; pp. 1–5.
  28. Wolpert, D.H.; Macready, W.G. No Free Lunch Theorems for Optimization. IEEE Trans. Evolut. Comput. 1997, 1, 67–82.
  29. Sharapov, R.R. Genetic Algorithms: Basic Ideas, Variants and Analysis. In Vision Systems: Segmentation and Pattern Recognition; Goro, O., Ashish, D., Eds.; Springer: Berlin, Germany, 2009; pp. 407–422.
  30. Deb, K.; Deb, D. Analysing mutation schemes for real-parameter genetic algorithms. Int. J. Artif. Intell. Soft Comput. 2014, 4, 1–28.
  31. Schott, R.J. Fault Tolerant Design Using Single and Multicriteria Genetic Algorithm Optimization. Master’s Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1995.
Figure 1. Steps for annual water losses estimation, adapted from Ref. [16].
Figure 2. Example of a single perceptron.
Figure 3. Neural network configuration with two hidden layers.
Figure 4. Flowchart of the fast, elitist, non-dominated sorting genetic algorithm (NSGA-II).
Figure 5. The crowding distance of individual in the front. Amended from Ref. [17].
Figure 6. The tournament selection.
Figure 7. Evolution of Pareto front after 25 generations.
Figure 8. The Pareto front obtained from 1000 individuals and 25 generations.
Table 1. Explanatory variables for efficiency rate.

| Asset Management Actions | SISPEA Code | Explanatory Variables (Indicators) |
|---|---|---|
| Metering and metering error | VP.056 | Number of users |
| | VP.228 | Linear density of users |
| | VP.063 | Billed metered domestic consumption |
| | VP.221 | Volume of unmetered consumption |
| | VP.232 | Billed metered consumption |
| | VP.234 | Volume produced + Volume imported |
| Leakage and water losses | VP.225 | Average network efficiency rate over last 3 years |
| | P106.3 | Linear leakage index on distribution mains (LLI) |
| Pipes renewal | P107.2 | Average renewal rate of water mains over the last 5 years |
Table 2. List of variables required for hydraulic balance.

| Symbol | Definition of the Variable | Unit |
|---|---|---|
| $W_b$ | Annual volume of water-billed metered consumption | m³ |
| $W_m$ | Annual volume of water loss due to metering error | m³ |
| $W_p$ | Annual volume of water loss due to leaks on main pipes | m³ |
| $W_c$ | Annual volume of water loss due to leaks on connections | m³ |
| $W_i$ | Annual volume of water loss due to invisible leaks | m³ |
| $\varepsilon_m$ | Metering error in percentage | % |
| $\overline{Age}_m$ | Average age of meters | # |
| $MTTR_{vl}$ | Mean time to repair visible leaks | |
| $MTTR_{inv}$ | Mean time to repair hidden leaks | |
| $d_p$ | Average flow rate for a leak on pipe | L/s |
| $d_c$ | Average flow rate for a leak on connection | L/s |
| $d$ | Average flow rate for a hidden leak | L/s |
| $n_{inv}$ | Number of invisible breaks/leaks | # |
| $n_p$ | Number of breaks/leaks on pipes per year | # |
| $n_{lc}$ | Number of breaks/leaks on connections per year | # |
| $n_c$ | Number of connections | # |
| $r_b$ | Pipe breakage rate | #/km |
| $r_{cb}$ | Connection breakage rate | #/km |
| $r_d$ | Leak detection efficiency rate | % |
| $\alpha$ | Invisible leakage rate on main pipes/connections | % |
| $L_{net}$ | Network length | km |
Table 3. List of required variables for cost calculations.

| Symbol | Definition of the Variable |
|---|---|
| Crep | Cost of a leak repair in € per unit |
| Cdet | Cost of leak detection in € per km |
| Cmeter | Cost of a meter in € per unit |
| Ccon | Cost of a connection renewal in € per unit |
| Cp | Cost of pipe renewal in € per km |
| nlc | Number of leaks on connections per year |
| np | Number of leaks on pipes per year |
| nd | Number of leaks detected by leak detection |
| nm | Number of installed meters |
| lnet | Length of the network |
| ldet | Length of the network investigated by leak detection |
| rc | Rate of annual connections renewal in percentage per year |
| rp | Rate of annual pipe renewal in percentage of the length renewed per year |
| rm | Rate of annual meter renewal in percentage per year |
Table 4. Crossover and offspring generation.

| Individual | ldet | rp | rc | rm |
|---|---|---|---|---|
| Parent 1 | 74.18 | 0.0042 | 0.0083 | 0.028 |
| Parent 2 | 82.07 | 0.0031 | 0.0068 | 0.035 |
| Random vector, r | 0.61 | 0.77 | 0.75 | 0.55 |
| Offspring 1 | 77.27 | 0.0040 | 0.0079 | 0.031 |
| Offspring 2 | 78.98 | 0.0033 | 0.0072 | 0.032 |
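The values in Table 4 can be reproduced with a short sketch. It assumes that the first offspring is the gene-wise combination r·Parent 1 + (1 − r)·Parent 2 and that the second offspring swaps the weights. This rule is inferred from the tabulated numbers rather than quoted from the text, and the function name `blend_crossover` is hypothetical.

```python
# Values copied from Table 4, in the order (ldet, rp, rc, rm)
PARENT_1 = [74.18, 0.0042, 0.0083, 0.028]
PARENT_2 = [82.07, 0.0031, 0.0068, 0.035]
R = [0.61, 0.77, 0.75, 0.55]  # random vector, one weight per gene


def blend_crossover(p1, p2, r):
    """Return two offspring as complementary weighted averages of the parents."""
    child_1 = [ri * a + (1.0 - ri) * b for a, b, ri in zip(p1, p2, r)]
    child_2 = [(1.0 - ri) * a + ri * b for a, b, ri in zip(p1, p2, r)]
    return child_1, child_2


offspring_1, offspring_2 = blend_crossover(PARENT_1, PARENT_2, R)
```

The computed genes match Table 4 to within rounding of the random vector, and every offspring gene lies between the corresponding parent values.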
Table 5. Comparison between observed and predicted WER between 2010 and 2016.

| Year | Observed (%) | Predicted-ANN (%) | Estimation Error (%) |
|---|---|---|---|
| 2010 | 65.9 | 65.7 | −0.30 |
| 2011 | 63.2 | 62.4 | −1.27 |
| 2012 | 61.4 | 60.7 | −1.14 |
| 2013 | 75 | 75.1 | 0.13 |
| 2014 | 75.1 | 75.1 | 0.00 |
| 2015 | 76.9 | 76.8 | −0.13 |
| 2016 | 76.6 | 76.6 | 0.00 |
Table 6. Performance comparison of three selection methods in terms of distribution and spread.

| Selection Method | Statistic | Space Index | Spread Index |
|---|---|---|---|
| Random | Mean | 0.127 | 0.313 |
| Random | Std | 0.025 | 0.091 |
| Roulette Wheel | Mean | 0.113 | 0.259 |
| Roulette Wheel | Std | 0.023 | 0.074 |
| Tournament | Mean | 0.115 | 0.286 |
| Tournament | Std | 0.035 | 0.130 |
Table 7. Definition of decision variable constraints.

| Decision Variables | Lower Bound | Upper Bound |
|---|---|---|
| ldet | 0 | 12 × Lnet |
| rp | 0 | 5% |
| rc | 0 | 5% |
| rm | 0 | 10% |
Table 8. Trade-off solutions with regard to considered constraints and objective.

| rp | rc | rm | Lnet (km) | Total Cost (€) | CAPEX (€) | OPEX (€) | WER (%) |
|---|---|---|---|---|---|---|---|
| 4 × 10⁻⁵ | 8.1 × 10⁻⁵ | 4.1 × 10⁻³ | 86.5 | 223,302 | 2837 | 220,465 | 74.5 |
| 4.6 × 10⁻⁵ | 8.2 × 10⁻⁴ | 7.3 × 10⁻³ | 162.6 | 309,838 | 3935 | 305,903 | 79.0 |
| 5.4 × 10⁻⁵ | 9.5 × 10⁻⁴ | 8.4 × 10⁻³ | 260.2 | 413,868 | 4536 | 409,332 | 81.6 |

Share and Cite

MDPI and ACS Style

Nafi, A.; Brans, J. Cost–Benefit Prediction of Asset Management Actions on Water Distribution Networks. Water 2019, 11, 1542. https://doi.org/10.3390/w11081542

