A Fast Elitist Non-Dominated Sorting Genetic Algorithm for Multi-Objective Optimization: NSGA-II

Kalyanmoy Deb, Samir Agrawal, Amrit Pratap, and T Meyarivan
Kanpur Genetic Algorithms Laboratory (KanGAL), Indian Institute of Technology Kanpur, Kanpur, PIN 208 016, India
{deb,samira,apratap,mary}@iitk.ac.in http://www.iitk.ac.in/kangal
KanGAL Report No. 200001
Abstract. Multi-objective evolutionary algorithms which use non-dominated sorting and sharing have been mainly criticized for their (i) $O(MN^3)$ computational complexity (where $M$ is the number of objectives and $N$ is the population size), (ii) non-elitism approach, and (iii) the need for specifying a sharing parameter. In this paper, we suggest a non-dominated sorting based multi-objective evolutionary algorithm (we call it the Non-dominated Sorting GA-II or NSGA-II) which alleviates all three of the above difficulties. Specifically, a fast non-dominated sorting approach with $O(MN^2)$ computational complexity is presented. Second, a selection operator is presented which creates a mating pool by combining the parent and child populations and selecting the best (with respect to fitness and spread) $N$ solutions. Simulation results on five difficult test problems show that the proposed NSGA-II is able to find a much better spread of solutions in all problems compared to PAES, another elitist multi-objective EA which pays special attention to creating a diverse Pareto-optimal front. Because of NSGA-II's low computational requirements, elitist approach, and parameter-less sharing approach, NSGA-II should find increasing applications in the years to come.
1 Introduction
Over the past decade, a number of multi-objective evolutionary algorithms (MOEAs) have been suggested [9,3,5,13]. The primary reason for this is their ability to find multiple Pareto-optimal solutions in one single run. Since the principal reason why a problem has a multi-objective formulation is that no single solution can simultaneously optimize all objectives, an algorithm that gives a large number of alternative solutions lying on or near the Pareto-optimal front is of great practical value.
The Non-dominated Sorting Genetic Algorithm (NSGA) proposed in Srinivas and Deb [9] was one of the first such evolutionary algorithms. Over the years, the main criticisms of the NSGA approach have been as follows:
High computational complexity of non-dominated sorting: The non-dominated sorting algorithm in use until now is $O(MN^3)$, which for large population sizes is very expensive, especially since the population needs to be sorted in every generation.
Lack of elitism: Recent results [12, 8] show clearly that elitism can speed up the performance of the GA significantly; it also helps to prevent the loss of good solutions once they have been found.
Need for specifying the sharing parameter $\sigma_{share}$: Traditional mechanisms for ensuring diversity in a population, so as to get a wide variety of equivalent solutions, have relied heavily on the concept of sharing. The main problem with sharing is that it requires the specification of a sharing parameter ($\sigma_{share}$). Though there has been some work on dynamic sizing of the sharing parameter [4], a parameterless diversity preservation mechanism is desirable.
In this paper, we address all of these issues and propose a much improved version of NSGA which we call NSGA-II. From the simulation results on a number of difficult test problems, we find that NSGA-II has a better spread in its optimized solutions than PAES [6], another elitist multi-objective evolutionary algorithm. These results encourage the application of NSGA-II to more complex and real-world multi-objective optimization problems.
2 Elitist Multi-Objective Evolutionary Algorithms
In the study of Zitzler, Deb, and Thiele [12], it was clearly shown that elitism helps in achieving better convergence in MOEAs. Among the existing elitist MOEAs, Zitzler and Thiele's [13] strength Pareto EA (SPEA), Knowles and Corne's Pareto-archived evolution strategy (PAES) [6], and Rudolph's [8] elitist GA are well known.
Zitzler and Thiele [13] suggested an elitist multi-criterion EA with the concept of non-domination in their strength Pareto EA (SPEA). They suggested maintaining an external population at every generation storing all non-dominated solutions discovered so far, beginning from the initial population. This external population participates in genetic operations. At each generation, a combined population with the external and the current population is first constructed. All non-dominated solutions in the combined population are assigned a fitness based on the number of solutions they dominate, and dominated solutions are assigned a fitness worse than the worst fitness of any non-dominated solution. This assignment of fitness makes sure that the search is directed towards the non-dominated solutions. A deterministic clustering technique is used to ensure diversity among non-dominated solutions. Although the implementation suggested in [13] is $O(MN^3)$, with proper book-keeping the complexity of SPEA can be reduced to $O(MN^2)$. An important aspect of this study and subsequent studies [12, 11] is that they clearly show the importance of introducing elitism in evolutionary multi-criterion optimization.
Knowles and Corne [6] suggested a simple MOEA using an evolution strategy (ES). In their Pareto-archived ES (PAES) with one parent and one child, the child is compared with the parent. If the child dominates the parent, the child is accepted as the next parent and the iteration continues. On the other hand, if the parent dominates the child, the child is discarded and a new mutated solution (a new child) is found.
However, if the child and the parent do not dominate each other, the choice between the child and the parent considers the second objective of keeping diversity among obtained solutions. To maintain diversity, an archive of non-dominated solutions is maintained. The child is compared with the archive to check if it dominates any member of the archive. If yes, the child is accepted as the new parent and the dominated solution is eliminated from the archive. If the child does not dominate any member of the archive, both parent and child are checked for their nearness to the solutions of the archive. If the child resides in the least crowded region in the parameter space among the members of the archive, it is accepted as a parent and a copy is added to the archive. Later, they suggested a multi-parent PAES with similar principles as above. The authors have calculated the worst case complexity of PAES for $N$ evaluations as $O(aMN)$, where $a$ is the archive length. Since the archive size is usually chosen proportional to the population size $N$, the overall complexity of the algorithm is $O(MN^2)$.
Rudolph [8] suggested, but did not simulate, a simple elitist multi-objective EA based on a systematic comparison of individuals from parent and offspring populations. The non-dominated solutions of the offspring population are compared with those of the parent population to form an overall non-dominated set of solutions, which becomes the parent population of the next iteration. If the size of this set is not greater than the desired population size, other individuals from the offspring population are included. With this strategy, he has been able to prove the convergence of this algorithm to the Pareto-optimal front. Although this is an important achievement in its own right, the algorithm lacks motivation for the second task of maintaining diversity of Pareto-optimal solutions. An explicit diversity preserving mechanism must be added to make it more usable in practice. Since the determination of the first non-dominated front is $O(MN^2)$, the overall complexity of Rudolph's algorithm is also $O(MN^2)$.
3 Elitist Non-dominated Sorting Genetic Algorithm (NSGA-II)
The non-dominated sorting GA (NSGA) proposed by Srinivas and Deb in 1994 has been applied to various problems [10, 7]. However, as mentioned earlier, there have been a number of criticisms of the NSGA. In this section, we modify the NSGA approach in order to alleviate all the above difficulties. We begin by presenting a number of different modules that form part of NSGA-II.
3.1 A fast non-dominated sorting approach
In order to sort a population of size $N$ according to the level of non-domination, each solution must be compared with every other solution in the population to find if it is dominated. This requires $O(MN)$ comparisons for each solution, where $M$ is the number of objectives. When this process is continued to find the members of the first non-dominated class for all population members, the total complexity is $O(MN^2)$. At this stage, all individuals in the first non-dominated front are found. In order to find the individuals in the next front, the solutions of the first front are temporarily discounted and the above procedure is repeated. In the worst case, the task of finding the second front also requires $O(MN^2)$ computations. The procedure is repeated to find the subsequent fronts. As can be seen, the worst case (when there exists only one solution in each front) complexity of this algorithm is $O(MN^3)$. In the following we describe a fast non-dominated sorting approach which will require at most $O(MN^2)$ computations.
First, for each solution we calculate two entities: (i) $n_i$, the number of solutions which dominate the solution $i$, and (ii) $S_i$, a set of solutions which the solution $i$ dominates. The calculation of these two entities requires $O(MN^2)$ comparisons. We identify all those points which have $n_i = 0$ and put them in a list $\mathcal{F}_1$. We call $\mathcal{F}_1$ the current front. Now, for each solution $i$ in the current front we visit each member ($j$) in its set $S_i$ and reduce its $n_j$ count by one. In doing so, if for any member $j$ the count becomes zero, we put it in a separate list $\mathcal{H}$. When all members of the current front have been checked, we declare the members in the list $\mathcal{F}_1$ as members of the first front. We then continue this process using the newly identified front $\mathcal{H}$ as our current front. Each such iteration requires $O(N)$ computations. This process continues till all fronts are identified. Since at most there can be $N$ fronts, the worst case complexity of this loop is $O(N^2)$. The overall complexity of the algorithm now is $O(MN^2) + O(N^2)$ or $O(MN^2)$.
It is worth mentioning here that although the computational burden has reduced from $O(MN^3)$ to $O(MN^2)$ by performing systematic book-keeping, the storage has increased from $O(N)$ to $O(N^2)$ in the worst case.
The fast non-dominated sorting procedure, fast-nondominated-sort, when applied on a population $P$ returns a list of the non-dominated fronts $\mathcal{F}$.
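The bookkeeping described above (a domination count $n_p$ and a dominated set $S_p$ per solution) can be sketched in Python roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the use of index lists, and the assumption that all objectives are minimized are choices made here for the example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Assumed minimization: a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_nondominated_sort(objs: Sequence[Sequence[float]]) -> List[List[int]]:
    """Return the fronts as lists of indices into objs, best front first."""
    n = len(objs)
    dominated_set = [[] for _ in range(n)]  # S_p: solutions that p dominates
    count = [0] * n                         # n_p: number of solutions dominating p
    fronts: List[List[int]] = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated_set[p].append(q)
            elif dominates(objs[q], objs[p]):
                count[p] += 1
        if count[p] == 0:                   # nobody dominates p: first front
            fronts[0].append(p)
    i = 0
    while fronts[i]:
        next_front = []                     # the list H in the text
        for p in fronts[i]:
            for q in dominated_set[p]:
                count[q] -= 1
                if count[q] == 0:
                    next_front.append(q)
        i += 1
        fronts.append(next_front)
    return fronts[:-1]                      # drop the trailing empty front


example = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(fast_nondominated_sort(example))
```

The two nested loops over the population account for the $O(MN^2)$ comparisons; the while-loop only decrements counters, which is where the saving over repeated full comparisons comes from.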
fast-nondominated-sort($P$)
    for each $p \in P$
        for each $q \in P$
            if ($p \prec q$) then                          // if $p$ dominates $q$ then
                $S_p = S_p \cup \{q\}$                     // include $q$ in $S_p$
            else if ($q \prec p$) then                     // if $p$ is dominated by $q$ then
                $n_p = n_p + 1$                            // increment $n_p$
        if $n_p = 0$ then                                  // if no solution dominates $p$ then
            $\mathcal{F}_1 = \mathcal{F}_1 \cup \{p\}$     // $p$ is a member of the first front
    $i = 1$
    while $\mathcal{F}_i \neq \emptyset$
        $\mathcal{H} = \emptyset$
        for each $p \in \mathcal{F}_i$                     // for each member $p$ in $\mathcal{F}_i$
            for each $q \in S_p$                           // modify each member from the set $S_p$
                $n_q = n_q - 1$                            // decrement $n_q$ by one
                if $n_q = 0$ then $\mathcal{H} = \mathcal{H} \cup \{q\}$   // if $n_q$ is zero, $q$ is a member of a list $\mathcal{H}$
        $i = i + 1$
        $\mathcal{F}_i = \mathcal{H}$                      // current front is formed with all members of $\mathcal{H}$
3.2 Density Estimation
To get an estimate of the density of solutions surrounding a particular point in the population we take the average distance of the two points on either side of this point along each of the objectives. This quantity
$i_{distance}$ serves as an estimate of the size of the largest cuboid enclosing the point $i$ without including any other point in the population (we call this the crowding distance). In Figure 1, the crowding distance of the $i$-th solution in its front (marked with solid circles) is the average side-length of the cuboid (shown with a dashed box).
Fig. 1. The crowding distance calculation is shown. (Figure labels: cuboid; axes $f_1$ and $f_2$; points $0$, $i-1$, $i$, $i+1$, $l$.)
The following algorithm is used to calculate the crowding distance of each point in the set $\mathcal{I}$:
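crowding-distance-assignment($\mathcal{I}$)
    $l = |\mathcal{I}|$                                        // number of solutions in $\mathcal{I}$
    for each $i$, set $\mathcal{I}[i]_{distance} = 0$          // initialize distance
    for each objective $m$
        $\mathcal{I}$ = sort($\mathcal{I}$, $m$)               // sort using each objective value
        $\mathcal{I}[1]_{distance} = \mathcal{I}[l]_{distance} = \infty$   // so that boundary points are always selected
        for $i = 2$ to $(l - 1)$                               // for all other points
            $\mathcal{I}[i]_{distance} = \mathcal{I}[i]_{distance} + (\mathcal{I}[i+1].m - \mathcal{I}[i-1].m)$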
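As a rough illustration of the crowding-distance-assignment procedure, the following Python sketch accumulates, for each objective, the gap between the two neighbours of every interior point and gives the boundary points an infinite distance. The function name and the list-of-tuples representation are assumptions made for this example; like the pseudocode in this report, the sketch does not normalize the objective ranges.

```python
import math
from typing import List, Sequence


def crowding_distance(front: Sequence[Sequence[float]]) -> List[float]:
    """Crowding distance of each point in one front (objective vectors)."""
    l = len(front)
    distance = [0.0] * l                    # initialize distance
    if l == 0:
        return distance
    for m in range(len(front[0])):          # for each objective m
        # indices of the front sorted by the m-th objective value
        order = sorted(range(l), key=lambda i: front[i][m])
        # boundary points get infinite distance so they are always selected
        distance[order[0]] = distance[order[-1]] = math.inf
        for k in range(1, l - 1):           # all other points
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            distance[order[k]] += gap
    return distance


print(crowding_distance([(0, 4), (1, 2), (3, 1), (4, 0)]))
```

Sorting index lists rather than the points themselves keeps the returned distances aligned with the input order, which is convenient when the distances are later attached to individuals.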
In crowding-distance-assignment, $\mathcal{I}[i].m$ refers to the $m$-th objective function value of the $i$-th individual in the set $\mathcal{I}$. The complexity of this procedure is governed by the sorting algorithm. In the worst case (when all solutions are in one front), the sorting requires $O(MN \log N)$ computations.
3.3 Crowded Comparison Operator
The crowded comparison operator ($\geq_n$) guides the selection process at the various stages of the algorithm towards a uniformly spread out Pareto-optimal front. Let us assume that every individual $i$ in the population has two attributes.
1. Non-domination rank ($i_{rank}$)
2. Local crowding distance ($i_{distance}$)
We now define a partial order $\geq_n$ as:
$i \geq_n j$ if ($i_{rank} < j_{rank}$) or (($i_{rank} = j_{rank}$) and ($i_{distance} > j_{distance}$))
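The ordering just defined can be written as a small comparison function. The sketch below is illustrative only: the `Individual` class and the function names are hypothetical, and the comparator is adapted for Python's `sorted` so that preferred individuals come first.

```python
import math
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass
class Individual:
    """Hypothetical container for the two attributes used by the operator."""
    rank: int        # non-domination rank, 1 is best
    distance: float  # local crowding distance, larger is better


def crowded_better(i: Individual, j: Individual) -> bool:
    """True if i is preferred to j under the crowded comparison."""
    return i.rank < j.rank or (i.rank == j.rank and i.distance > j.distance)


def crowded_cmp(i: Individual, j: Individual) -> int:
    """Three-way comparator: preferred individuals sort first."""
    if crowded_better(i, j):
        return -1
    if crowded_better(j, i):
        return 1
    return 0


pop = [Individual(2, math.inf), Individual(1, 0.5), Individual(1, 2.0)]
ordered = sorted(pop, key=cmp_to_key(crowded_cmp))
print([(p.rank, p.distance) for p in ordered])
```

Two individuals with equal rank and equal distance compare as ties, which is why the relation is only a partial order.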

Under this ordering, between two solutions with differing non-domination ranks we prefer the point with the lower rank. Otherwise, if both the points belong to the same front then we prefer the point which is located in a region with fewer points (the size of the cuboid enclosing it is larger).
3.4 The Main Loop
Initially, a random parent population $P_0$ is created. The population is sorted based on the non-domination. Each solution is assigned a fitness equal to its non-domination level (1 is the best level). Thus, minimization of fitness is assumed. Binary tournament selection, recombination, and mutation operators are used to create a child population $Q_0$ of size $N$. From the first generation onward, the procedure is different. The elitism procedure for $t \geq 1$ and for a particular generation is shown in the following:
$R_t = P_t \cup Q_t$                                    // combine parent and children population
$\mathcal{F}$ = fast-nondominated-sort($R_t$)            // $\mathcal{F} = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$, all non-dominated fronts of $R_t$
until $|P_{t+1}| \geq N$                                 // till the parent population is filled
    crowding-distance-assignment($\mathcal{F}_i$)        // calculate crowding distance in $\mathcal{F}_i$
    $P_{t+1} = P_{t+1} \cup \mathcal{F}_i$               // include $i$-th non-dominated front in the parent pop
Sort($P_{t+1}$, $\geq_n$)                                // sort in descending order using $\geq_n$
$P_{t+1} = P_{t+1}[0 : N]$                               // choose the first $N$ elements of $P_{t+1}$
$Q_{t+1}$ = make-new-pop($P_{t+1}$)                      // use selection, crossover and mutation to create a new population $Q_{t+1}$
$t = t + 1$
First, a combined population $R_t = P_t \cup Q_t$ is formed. The population $R_t$ will be of size $2N$. Then, the population $R_t$ is sorted according to non-domination. The new parent population $P_{t+1}$ is formed by adding solutions front by front, starting from the first front, till the size reaches or exceeds $N$. Thereafter, the solutions of the last accepted front are sorted according to $\geq_n$ and the first $N$ points are picked. This is how we construct the population $P_{t+1}$ of size $N$. This population of size $N$ is now used for selection, crossover and mutation to create a new population $Q_{t+1}$ of size $N$. It is important to note that we use a binary tournament selection operator but the selection criterion is now based on the niched comparison operator $\geq_n$.
Let us now look at the complexity of one iteration of the entire algorithm. The basic operations being performed and the worst case complexities associated with them are as follows:
1. Non-dominated sort is $O(MN^2)$,
2. Crowding distance assignment is $O(MN \log N)$, and
3. Sort on $\geq_n$ is $O(2N \log(2N))$.
As can be seen, the overall complexity of the above algorithm is $O(MN^2)$.
The diversity among non-dominated solutions is introduced by using the crowding comparison procedure which is used in the tournament selection and during the population reduction phase. Since solutions compete with their crowding distance (a measure

of density of solutions in the neighborhood), no extra niching parameter (such as $\sigma_{share}$ needed in the NSGA) is required here. Although the crowding distance is calculated in the objective function space, it can also be implemented in the parameter space, if so desired [1].
It is interesting to note here the connection of this algorithm with the algorithm proposed by Rudolph [8]. Since the non-dominated front finding algorithm used in Rudolph's algorithm is $O(MN^2)$ for each front, Rudolph controls the complexity of his algorithm by working with just the first few fronts in the parent and the child populations and treating the rest of the individuals in the child population on par. With the availability of a fast non-dominated sorting algorithm we can now afford to combine the parent and child populations and do a complete sort to identify all the fronts and allocate fitness accordingly.
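The population reduction step of Section 3.4 (combine, sort into fronts, fill front by front, then truncate by rank and crowding distance) can be sketched as below. This is a hedged illustration rather than the authors' code: all names are hypothetical, minimization is assumed, and for brevity the fronts are found by naive repeated peeling instead of the fast non-dominated sort, which yields the same fronts at a higher cost.

```python
import math
from typing import Dict, List, Sequence

Objs = Sequence[Sequence[float]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fronts_of(objs: Objs) -> List[List[int]]:
    """Naive peeling (swapped in for brevity); same fronts as the fast sort."""
    remaining = list(range(len(objs)))
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(objs[q], objs[p]) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts


def crowding(objs: Objs, front: List[int]) -> Dict[int, float]:
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        order = sorted(front, key=lambda i: objs[i][m])
        dist[order[0]] = dist[order[-1]] = math.inf
        for k in range(1, len(order) - 1):
            dist[order[k]] += objs[order[k + 1]][m] - objs[order[k - 1]][m]
    return dist


def next_parents(objs: Objs, n: int) -> List[int]:
    """Indices into the combined population R_t that form P_{t+1}."""
    rank: Dict[int, int] = {}
    dist: Dict[int, float] = {}
    chosen: List[int] = []
    for r, front in enumerate(fronts_of(objs), start=1):
        if len(chosen) >= n:                # parent population is filled
            break
        dist.update(crowding(objs, front))
        for i in front:
            rank[i] = r
        chosen.extend(front)
    # lower rank first, then larger crowding distance first
    chosen.sort(key=lambda i: (rank[i], -dist[i]))
    return chosen[:n]


combined = [(0, 4), (1, 2), (3, 1), (4, 0), (2, 3), (5, 5)]
print(next_parents(combined, 3))
```

Creating the child population from the selected parents (tournament selection, crossover, mutation) is left out, since it does not affect how elitism and diversity preservation interact in this step.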
4 Results
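We compare NSGA-II with PAES on five test problems (minimization of both objectives):
MOP2:
$$f_1(x) = 1 - \exp\left(-\sum_{i=1}^{n} \left(x_i - \frac{1}{\sqrt{n}}\right)^2\right), \quad f_2(x) = 1 - \exp\left(-\sum_{i=1}^{n} \left(x_i + \frac{1}{\sqrt{n}}\right)^2\right), \quad -4 \le x_1, x_2, x_3 \le 4 \tag{1}$$
MOP3:
$$f_1(x) = \left[1 + (A_1 - B_1)^2 + (A_2 - B_2)^2\right], \quad f_2(x) = \left[(x + 3)^2 + (y + 1)^2\right] \tag{2}$$
where
$$A_1 = 0.5 \sin 1 - 2 \cos 1 + \sin 2 - 1.5 \cos 2$$
$$A_2 = 1.5 \sin 1 - \cos 1 + 2 \sin 2 - 0.5 \cos 2$$
$$B_1 = 0.5 \sin x - 2 \cos x + \sin y - 1.5 \cos y$$
$$B_2 = 1.5 \sin x - \cos x + 2 \sin y - 0.5 \cos y$$
MOP4:
$$f_1(x) = \sum_{i=1}^{n-1} \left(-10 \exp\left(-0.2 \sqrt{x_i^2 + x_{i+1}^2}\right)\right), \quad f_2(x) = \sum_{i=1}^{n} \left(|x_i|^{0.8} + 5 \sin(x_i)^3\right), \quad -5 \le x_1, x_2, x_3 \le 5 \tag{3}$$
EC4:
$$f_1(x) = x_1, \quad 0 \le x_1 \le 1; \qquad f_2(x) = g\left(1 - \sqrt{x_1 / g}\right), \quad -5 \le x_2, \ldots, x_{10} \le 5 \tag{4}$$
where
$$g(x) = 91 + \sum_{i=2}^{10} \left(x_i^2 - 10 \cos(4 \pi x_i)\right)$$
EC6:
$$f_1(x) = 1 - \exp(-4 x_1) \sin^6(6 \pi x_1), \quad 0 \le x_i \le 1, \; i = 1, \ldots, 10; \qquad f_2(x) = g\left(1 - (f_1 / g)^2\right) \tag{5}$$
where
$$g(x) = 1 + 9 \left[\left(\sum_{i=2}^{10} x_i\right) / 9\right]^{0.25}$$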
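To make the last two test problems concrete, here is a minimal Python sketch of EC4 and EC6 as they are usually stated in the literature; the function names are assumptions for this example. For ten variables the term $1 + 10(n-1)$ used for $g$ in EC4 equals 91, and the divisor $n - 1$ in EC6 equals 9.

```python
import math
from typing import Sequence, Tuple


def ec4(x: Sequence[float]) -> Tuple[float, float]:
    """EC4: many local fronts; the global front has g = 1."""
    g = 1.0 + 10.0 * (len(x) - 1) + sum(
        xi * xi - 10.0 * math.cos(4.0 * math.pi * xi) for xi in x[1:]
    )
    f1 = x[0]
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2


def ec6(x: Sequence[float]) -> Tuple[float, float]:
    """EC6: non-uniformly distributed, non-convex front."""
    f1 = 1.0 - math.exp(-4.0 * x[0]) * math.sin(6.0 * math.pi * x[0]) ** 6
    g = 1.0 + 9.0 * (sum(x[1:]) / (len(x) - 1)) ** 0.25
    f2 = g * (1.0 - (f1 / g) ** 2)
    return f1, f2


# With x_2 = ... = x_10 = 0 both problems give g = 1, i.e. a point on the
# global Pareto-optimal front.
print(ec4([0.0] * 10))
print(ec6([0.0] * 10))
```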
Since the diversity among optimized solutions is an important matter in multi-objective optimization, we devise a measure based on the consecutive distances among the solutions of the best non-dominated front in the final population. The obtained set of the first non-dominated solutions is compared with a uniform distribution and the deviation is computed as follows:
$$\Delta = \sum_{i=1}^{|\mathcal{F}_1|} \frac{|d_i - \bar{d}|}{|\mathcal{F}_1|} \tag{6}$$
In order to ensure that this calculation takes into account the spread of solutions in the entire region of the true front, we include the boundary solutions in the non-dominated front $\mathcal{F}_1$. For discrete Pareto-optimal fronts, we calculate a weighted average of the above metric for each of the discrete regions. In the above equation, $d_i$ is the Euclidean distance between two consecutive solutions in the first non-dominated front of the final population in the objective function space. The parameter $\bar{d}$ is the average of these distances. The deviation measure $\Delta$ of these consecutive distances is then calculated for each run. An average of these deviations over 10 runs is calculated as the measure ($\Delta$) for comparing different algorithms. Thus, it is clear that an algorithm having a smaller $\Delta$ is better, in terms of its ability to widely spread solutions in the obtained front.
For all test problems and with NSGA-II, we use a population of size 100, a crossover probability of 0.8, and a mutation probability of $1/n$ (where $n$ is the number of variables). We run NSGA-II for 250 generations. The variables are treated as real numbers and the simulated binary crossover (SBX) [2] and the real-parameter mutation operator are used. For the (1+1)-PAES, we have used an archive size of 100 and depth of 4 [6]. A mutation probability of 0.01 is used. In order to make the comparisons fair, we have used 25,000 iterations in PAES, so that the total number of function evaluations in NSGA-II and in PAES is the same.
Table 1 shows the deviation from an ideal (uniform) spread ($\Delta$) and its variance in 10 independent runs obtained using NSGA-II and PAES. We show two columns for each test problem. The first column presents the $\Delta$ value of 10 runs and the second column shows its variance. It is clear from the table that in all five test problems NSGA-II has found a much smaller $\Delta$, meaning that NSGA-II is able to find a distribution of solutions closer to a uniform distribution along the non-dominated front.
The variance columns suggest that the obtained $\Delta$ values are consistent in all 10 runs.
Table 1. Comparison of mean and variance of deviation measure $\Delta$ obtained using NSGA-II and PAES
Algorithm  MOP2              MOP3              MOP4              EC4               EC6
           Mean    Variance  Mean    Variance  Mean    Variance  Mean    Variance  Mean    Variance
NSGA-II    0.361   0.00068   0.445   0.00043   0.387   0.00164   0.383   0.00099   0.365   0.01613
PAES       1.609   0.00671   1.341   0.00495   1.087   0.00687   1.563   0.05723   1.195   0.05151
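For readers who want to reproduce a figure comparable to those in Table 1, the deviation measure of Equation (6) might be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the front is ordered by sorting its objective vectors (adequate for two objectives), the boundary solutions are assumed to be already included in the input, and the sum is normalized by the number of consecutive distances.

```python
import math
from typing import Sequence


def spread_deviation(front: Sequence[Sequence[float]]) -> float:
    """Mean absolute deviation of consecutive distances from their average."""
    pts = sorted(front)                     # order along the front by f_1
    d = [math.dist(a, b) for a, b in zip(pts, pts[1:])]
    if not d:                               # fewer than two points
        return 0.0
    d_bar = sum(d) / len(d)
    return sum(abs(di - d_bar) for di in d) / len(d)


# Evenly spaced points give zero deviation; uneven spacing does not.
print(spread_deviation([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]))
print(spread_deviation([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))
```

A value near zero therefore indicates a nearly uniform spacing along the obtained front, which is the sense in which smaller values in Table 1 are better.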
In order to have a better understanding of how these algorithms are able to spread solutions over the non-dominated front, we present the entire non-dominated front found by NSGA-II and PAES in two of the above five test problems. Figures 2 and 3 show that NSGA-II is able to find a much better distribution than PAES on MOP4.
In EC4, converging to the global Pareto-optimal front is a difficult task. As reported elsewhere [11], SPEA converged to a front with
$g = 4.0$ in at least one out of five
different runs. With NSGA-II, we find a front with
$g = 3.5$ in one out of five different
runs. Figure 4 shows the non-dominated solutions obtained using NSGA-II and PAES for EC6. Once again, it is clear that NSGA-II is able to better distribute its population along the obtained front than PAES. It is worth mentioning here that with a similar number of function evaluations, SPEA, as reported in [11], had found only five different solutions in the non-dominated front.
Fig. 2. Non-dominated solutions obtained using NSGA-II on MOP4 (axes: $f_1$, $f_2$).
Fig. 3. Non-dominated solutions obtained using PAES on MOP4 (axes: $f_1$, $f_2$).
5 Conclusions
In this paper, we have proposed a computationally fast elitist multi-objective evolutionary algorithm based on a non-dominated sorting approach. On five difficult test problems borrowed from the literature, it has been found that the proposed NSGA-II outperforms PAES, another multi-objective EA with the explicit goal of preserving spread on the non-dominated front. With the properties of a fast non-dominated sorting procedure, an elitist strategy, and a parameterless approach, NSGA-II should find increasing attention and applications in the near future.
Acknowledgements
The authors acknowledge the support provided by the All India Council for Technical Education, India during the course of this study.

Fig. 4. Obtained non-dominated solutions with NSGA-II and PAES on EC6 (axes: $f_1$, $f_2$; series: Pareto-optimal front, NSGA-II, PAES).
References
1. Deb, K. (1999) Multi-objective genetic algorithms: Problem difficulties and construction of test functions. Evolutionary Computation, 7(3), 205–230.
2. Deb, K. and Agrawal, R. B. (1995) Simulated binary crossover for continuous search space. Complex Systems, 9, 115–148.
3. Fonseca, C. M. and Fleming, P. J. (1993) Genetic algorithms for multi-objective optimization: Formulation, discussion and generalization. In Forrest, S., editor, Proceedings of the Fifth International Conference on Genetic Algorithms, pages 416–423, Morgan Kaufmann, San Mateo, California.
4. Fonseca, C. M. and Fleming, P. J. (1998) Multiobjective optimization and multiple constraint handling with evolutionary algorithms, Part II: Application example. IEEE Transactions on Systems, Man, and Cybernetics: Part A: Systems and Humans, 38–47.
5. Horn, J., Nafpliotis, N., and Goldberg, D. E. (1994) A niched Pareto genetic algorithm for multi-objective optimization. In Michalewicz, Z., editor, Proceedings of the First IEEE Conference on Evolutionary Computation, pages 82–87, IEEE Service Center, Piscataway, New Jersey.
6. Knowles, J. and Corne, D. (1999) The Pareto archived evolution strategy: A new baseline algorithm for multiobjective optimisation. Proceedings of the 1999 Congress on Evolutionary Computation, Piscataway, New Jersey: IEEE Service Center, 98–105.
7. Mitra, K., Deb, K., and Gupta, S. K. (1998) Multiobjective dynamic optimization of an industrial Nylon 6 semibatch reactor using genetic algorithms. Journal of Applied Polymer Science, 69(1), 69–87.
8. Rudolph, G. (1999) Evolutionary search under partially ordered sets. Technical Report No. CI-67/99, Dortmund: Department of Computer Science/LS11, University of Dortmund, Germany.
9. Srinivas, N. and Deb, K. (1995) Multi-objective function optimization using non-dominated sorting genetic algorithms. Evolutionary Computation, 2(3), 221–248.
10. Weile, D. S., Michielssen, E., and Goldberg, D. E. (1996) Genetic algorithm design of Pareto-optimal broad band microwave absorbers. IEEE Transactions on Electromagnetic Compatibility, 38(4).
11. Zitzler, E. (1999) Evolutionary algorithms for multiobjective optimization: Methods and applications. Doctoral thesis ETH NO. 13398, Zurich: Swiss Federal Institute of Technology (ETH), Aachen, Germany: Shaker Verlag.
12. Zitzler, E., Deb, K., and Thiele, L. (in press) Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation, 8.
13. Zitzler, E. and Thiele, L. (1998) Multiobjective optimization using evolutionary algorithms - A comparative case study. In Eiben, A. E., Bäck, T., Schoenauer, M., and Schwefel, H.-P., editors, Parallel Problem Solving from Nature, V, pages 292–301, Springer, Berlin, Germany.
