Ant Colony Optimization

  • Ant colony optimization (ACO) is a probabilistic technique for solving computational problems.
  • It is a population-based metaheuristic (a metaheuristic is a high-level, problem-independent algorithmic framework that provides a set of guidelines or strategies for developing heuristic optimization algorithms) that can be used to find approximate solutions to difficult optimization problems.
  • Ant colony optimization is a class of optimization algorithms modeled on the actions of an ant colony.

 “Ant Colony Optimization (ACO) studies artificial systems that take inspiration from the behavior of real ant colonies and which are used to solve discrete optimization problems.” 

The whole concept of ant colony optimization is to minimize path length and power consumption.

This algorithm belongs to the family of ant colony algorithms within swarm intelligence methods, and it constitutes a metaheuristic optimization. Initially proposed by Marco Dorigo in 1992 in his PhD thesis, the first algorithm aimed to find an optimal path in a graph, based on the behavior of ants seeking a path between their colony and a source of food. ACO performs a model-based search and shares some similarities with estimation of distribution algorithms.

Ant Foraging Behavior

  • Ants navigate from nest to food source. Ants are almost blind!
  • Incapable of achieving complex tasks alone. 
  • Rely on the phenomena of swarm intelligence for survival.
  • Shortest path is discovered via pheromone trails.
  • Each ant moves at random, and pheromone is deposited on its path.
  • Use stigmergic communication via pheromone trails (two individuals interact indirectly when one of them modifies the environment and the other responds to the new environment at a later time; this is stigmergy).
  • More pheromone on a path increases the probability of that path being followed.
  • What emerges is a form of autocatalytic behavior: the more ants follow a trail, the more attractive that trail becomes for being followed. The process is thus characterized by a positive feedback loop, where the probability of a discrete path choice increases with the number of times the same path was chosen before. 
 Autocatalysis is a positive feedback loop that drives the ants to explore promising aspects of the search space over less promising areas.
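The positive feedback loop can be illustrated with a small mean-field sketch of the classic double-bridge setting: ants choose between a short and a long path with probability proportional to pheromone, and the short path collects more pheromone per unit time. All numbers here (deposit amounts, step count) are arbitrary illustrative choices, not values from the original experiments.

```python
def double_bridge(steps=200, deposit_short=2.0, deposit_long=1.0):
    """Mean-field sketch of the double-bridge experiment: ants choose one of
    two paths with probability proportional to its pheromone; the short path
    is reinforced more per unit time, so the choice probability snowballs."""
    pher = {"short": 1.0, "long": 1.0}   # start unbiased
    for _ in range(steps):
        p_short = pher["short"] / (pher["short"] + pher["long"])
        # expected deposit this step on each path (autocatalysis: more
        # pheromone -> higher choice probability -> more pheromone)
        pher["short"] += deposit_short * p_short
        pher["long"] += deposit_long * (1 - p_short)
    return pher

trails = double_bridge()
# after a few hundred steps nearly all pheromone sits on the short path
```

Even though both paths start with equal pheromone, the asymmetric reinforcement rate is enough to drive almost all traffic onto the short path.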


This is why ACO algorithms are called autocatalytic positive feedback algorithms!

Figure: Foraging behaviour of ants.
A pheromone is a secreted or excreted chemical factor that triggers a social response in members of the same species. Pheromones are chemicals capable of acting like hormones outside the body of the secreting individual, to impact the behavior of the receiving individuals.

Swarm Intelligence

  • Collective system capable of accomplishing difficult tasks in dynamic and varied environments without any external guidance or control and with no central coordination
  • Achieving a collective performance which could not normally be achieved by an individual acting alone
  •  Constituting a natural model particularly suited to distributed problem solving

Working

  • In ACO, a set of software agents called artificial ants search for good solutions to a given optimization problem. 

    To apply ACO, the optimization problem is transformed into the problem of finding the best path on a weighted graph. The artificial ants (hereafter ants) incrementally build solutions by moving on the graph. 

    The solution construction process is stochastic and is biased by a pheromone model, that is, a set of parameters associated with graph components (either nodes or edges) whose values are modified at runtime by the ants. 

Explaining ACO through an example

The easiest way to understand how ant colony optimization works is by means of an example. We consider its application to the traveling salesman problem (TSP).

The first ACO algorithm was called the Ant System, and it was aimed at solving the travelling salesman problem, in which the goal is to find the shortest round-trip linking a series of cities.

In the TSP a set of locations (e.g. cities) and the distances between them are given. The problem consists of finding a closed tour of minimal length that visits each city once and only once. 

The TSP is NP-hard and is a classical combinatorial optimization problem on which new algorithms are commonly tested.
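As a concrete representation, a TSP instance can be stored as a symmetric distance matrix, and a closed tour evaluated by summing consecutive edges plus the edge back to the start. The matrix below is made up for illustration.

```python
# A minimal TSP representation: symmetric distance matrix (made-up values).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_length(tour, dist):
    """Length of a closed tour: consecutive edges plus the edge
    returning from the last city to the first."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

print(tour_length([0, 1, 3, 2], dist))  # 2 + 4 + 3 + 9 = 18
```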

To apply ACO to the TSP, 

  • We consider the graph defined by associating the set of cities with the set of vertices of the graph. This graph is called the construction graph.
  • Since in the TSP it is possible to move from any given city to any other city, the construction graph is fully connected and the number of vertices is equal to the number of cities. 
  • We set the lengths of the edges between the vertices to be proportional to the distances between the cities represented by these vertices and we associate pheromone values and heuristic values with the edges of the graph. 
  • Pheromone values are modified at runtime and represent the accumulated experience of the ant colony, while heuristic values are problem-dependent values that, in the case of the TSP, are set to the inverse of the lengths of the edges.

The ants construct the solutions as follows. 

  • Each ant starts from a randomly selected city (vertex of the construction graph). Then, at each construction step it moves along the edges of the graph. 
  • Each ant keeps a memory of its path, and in subsequent steps it chooses among the edges that do not lead to vertices that it has already visited. 
  • An ant has constructed a solution once it has visited all the vertices of the graph. 
  • At each construction step, an ant probabilistically chooses the edge to follow among those that lead to yet unvisited vertices. 
  • The probabilistic rule is biased by pheromone values and heuristic information: the higher the pheromone and the heuristic value associated to an edge, the higher the probability an ant will choose that particular edge. 
  • Once all the ants have completed their tour, the pheromone on the edges is updated. 
  • Each of the pheromone values is initially decreased by a certain percentage. Each edge then receives an amount of additional pheromone proportional to the quality of the solutions to which it belongs (there is one solution per ant).

This procedure is repeatedly applied until a termination criterion is satisfied.  
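The construction-and-update cycle described above can be sketched as a minimal Ant System-style implementation. The parameter values, city count, and distance matrix below are illustrative choices, not prescribed by the method.

```python
import random

def aco_tsp(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """Minimal Ant System-style solver for the symmetric TSP."""
    rng = random.Random(seed)
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]                       # pheromone values
    eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)]
           for i in range(n)]                                 # heuristic = 1/distance
    best_tour, best_len = None, float("inf")

    def tour_length(t):
        return sum(dist[t[k]][t[(k + 1) % n]] for k in range(n))

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            start = rng.randrange(n)                          # random start city
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:                                  # build a full tour
                i = tour[-1]
                cand = sorted(unvisited)
                weights = [tau[i][j] ** alpha * eta[i][j] ** beta for j in cand]
                nxt = rng.choices(cand, weights=weights)[0]   # probabilistic step
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append(tour)
        for i in range(n):                                    # evaporation
            for j in range(n):
                tau[i][j] *= (1 - rho)
        for t in tours:                                       # deposit, one per ant
            L = tour_length(t)
            if L < best_len:
                best_tour, best_len = t, L
            for k in range(n):
                i, j = t[k], t[(k + 1) % n]
                tau[i][j] += 1.0 / L
                tau[j][i] = tau[i][j]                         # symmetric instance
    return best_tour, best_len

# made-up 4-city instance; the optimal closed tour 0-1-3-2 has length 18
dist = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]
tour, length = aco_tsp(dist)
```

The termination criterion here is simply a fixed iteration count; a real implementation would typically also stop on stagnation or a time budget.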

Formal definition of a combinatorial optimization problem 

The first step for the application of ACO to a combinatorial optimization problem (COP) consists in defining a model of the COP as a triplet. A model P = (S, Ω, f ) of a combinatorial optimization problem consists of:

  • S is a search space defined over a finite set of discrete decision variables;
  • Ω is a set of constraints among the variables; and
  • f : S → R⁺₀ is an objective function to be minimized (since maximizing over f is the same as minimizing over −f, every COP can be described as a minimization problem).

The search space S is defined as follows. A set of discrete variables X_i, i = 1, …, n, with values v_i^j ∈ D_i = {v_i^1, …, v_i^|D_i|}, is given. Elements of S are full assignments, that is, assignments in which each variable X_i has a value v_i^j assigned from its domain D_i. The set of feasible solutions S_Ω is given by the elements of S that satisfy all the constraints in the set Ω. A solution s* ∈ S_Ω is called a global optimum if and only if f(s*) ≤ f(s) for all s ∈ S_Ω. The set of all globally optimal solutions is denoted by S*_Ω ⊆ S_Ω. Solving a COP requires finding at least one s* ∈ S*_Ω.

Algorithm

  • In the ant colony optimization algorithms, an artificial ant is a simple computational agent that searches for good solutions to a given optimization problem. To apply an ant colony algorithm, the optimization problem needs to be converted into the problem of finding the shortest path on a weighted graph. 
  • In the first step of each iteration, each ant stochastically constructs a solution, i.e. the order in which the edges in the graph should be followed. 
  • In the second step, the paths found by the different ants are compared. 
  • The last step consists of updating the pheromone levels on each edge.

    Pseudo Code
    begin
       Initialization;
       while termination condition not met do
         ConstructAntSolutions;
         DaemonActions;         /* optional */
         UpdatePheromones;
       end
    end

    Initialization: set parameters, initialize pheromone variables.
    ConstructAntSolutions: let ants construct solutions.
    DaemonActions: optionally improve candidate solutions.
    UpdatePheromones: make solution components belonging to good solutions more desirable for ants in following iterations.

Convergence

For some versions of the algorithm, it is possible to prove that it is convergent (i.e., able to find the global optimum in finite time). The first convergence proof for an ant colony algorithm was given in 2000 for the graph-based ant system (GBAS), and later for the ACS and MMAS algorithms. As with most metaheuristics, it is very difficult to estimate the theoretical speed of convergence. A performance analysis of a continuous ant colony algorithm with respect to its various parameters (edge selection strategy, distance measure metric, and pheromone evaporation rate) showed that its performance and rate of convergence are sensitive to the chosen parameter values, and especially to the value of the pheromone evaporation rate.[31] In 2004, Zlochin and his colleagues[32] showed that ACO-type algorithms can be assimilated to methods of stochastic gradient descent, the cross-entropy method, and estimation of distribution algorithms. They proposed unifying these metaheuristics under the term "model-based search".

Describe how ants are able to find the shortest path to a food source 
The ants rely on trails of pheromones to find their way. The researchers suggest that the chemical trail might initially be random, but converges on the optimum route over time. This process illustrates self-organization and evolution, in which all possible routes are whittled down to the fastest one.

When two points (say, two nests, or a nest and a food source) need to be connected, ants may start out tracing several winding pheromone paths among them. ... Short trails enable ants to make more trips, less time elapses between passes, so these trails end up marked more strongly. The shortest trail emerges.

The role of pheromone evaporation in dynamic optimization problems (DOPs) is to improve the adaptation capabilities of the algorithm. ... Therefore, pheromone evaporation helps to eliminate pheromone trails that may misguide ants without destroying any knowledge gained from previous environments.
 
Ants foraging for food leave pheromone trails on their paths. When going back to the nest ants also leave pheromones, but their concentrations may depend on food quality. Additionally, pheromones evaporate over time. 

Model-based Search
  • In model-based search algorithms, candidate solutions are generated using a parameterized probabilistic model that is updated using the previously seen solutions in such a way that the search will concentrate in the regions containing high quality solutions. 
  • The term “model” is used here to denote an adaptive stochastic mechanism for generating candidate solutions, and not an approximate description of the environment, as done, for example, in reinforcement learning (Sutton and Barto, 1998). 
  • The general approach is described schematically in Figure 1. 
  • Some of the early works exploiting the model-based approach, such as ant colony optimization (Dorigo, 1992; Dorigo et al., 1996; Dorigo and Di Caro, 1999) and population-based incremental learning (Baluja and Caruana, 1995), do not provide an explicit description of the model-based idea.
 
Common extensions 

  • Ant system (AS)
  • Ant colony system (ACS)
  • Elitist ant system (EAS)
  • Max-min ant system (MMAS)
  • Rank-based ant system (ASrank)
  • Continuous orthogonal ant colony (COAC)
  • Recursive ant colony optimization

Ant system

  • Ant system (AS) was the first ACO algorithm to be proposed in the literature (Dorigo et al. 1991, Dorigo 1992, Dorigo et al. 1996). 
  • Its main characteristic is that the pheromone values are updated by all the ants that have completed the tour. 
  • Solution components c_ij are the edges of the graph, and the pheromone update for τ_ij, that is, for the pheromone associated with the edge joining cities i and j, is performed as follows:

τ_ij ← (1 − ρ) · τ_ij + Σ_{k=1}^{m} Δτ_ij^k ,

where ρ ∈ (0, 1] is the evaporation rate, m is the number of ants, and Δτ_ij^k is the quantity of pheromone laid on edge (i, j) by the k-th ant:

Δτ_ij^k = 1/L_k if ant k used edge (i, j) in its tour, and Δτ_ij^k = 0 otherwise,

where L_k is the tour length of the k-th ant.

When constructing the solutions, the ants in AS traverse the construction graph and make a probabilistic decision at each vertex. The transition probability p(c_ij | s_p^k) of the k-th ant moving from city i to city j is given by:

p(c_ij | s_p^k) = (τ_ij^α · η_ij^β) / Σ_{c_il ∈ N(s_p^k)} (τ_il^α · η_il^β)  if j ∈ N(s_p^k), and 0 otherwise,

where N(s_p^k) is the set of components that do not yet belong to the partial solution s_p^k of ant k, and α and β are parameters that control the relative importance of the pheromone versus the heuristic information η_ij = 1/d_ij, where d_ij is the length of component c_ij (i.e., of edge (i, j)).
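In code, the AS transition rule amounts to normalizing the products τ^α · η^β over the feasible (unvisited) cities. The pheromone and heuristic tables below are made-up toy values.

```python
def as_transition_probs(i, feasible, tau, eta, alpha=1.0, beta=2.0):
    """Ant System transition rule: p(j) is proportional to
    tau[i][j]**alpha * eta[i][j]**beta over the feasible cities,
    and zero for already-visited cities."""
    weights = {j: tau[i][j] ** alpha * eta[i][j] ** beta for j in feasible}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# toy tables: city 1 has both more pheromone and a better heuristic from city 0
tau = [[1.0, 2.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
eta = [[0.0, 0.5, 0.25], [0.5, 0.0, 1.0], [0.25, 1.0, 0.0]]
probs = as_transition_probs(0, [1, 2], tau, eta)
# probs[1] = (2 * 0.5**2) / (2 * 0.5**2 + 1 * 0.25**2) = 8/9
```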

Ant colony system

  • The first major improvement over the original ant system to be proposed was ant colony system (ACS), introduced by Dorigo and Gambardella (1997). 
  • The first important difference between ACS and AS is the form of the decision rule used by the ants during the construction process. 
  • Ants in ACS use the so-called pseudorandom proportional rule: the probability for an ant to move from city i to city j depends on a random variable q uniformly distributed over [0, 1] and a parameter q_0; if q ≤ q_0, then, among the feasible components, the component that maximizes the product τ_il · η_il^β is chosen; otherwise, the same probabilistic rule as in AS is used.

 

  • This rather greedy rule, which favors exploitation of the pheromone information, is counterbalanced by the introduction of a diversifying component: the local pheromone update. The local pheromone update is performed by all ants after each construction step. Each ant applies it only to the last edge traversed:

τ_ij ← (1 − φ) · τ_ij + φ · τ_0 ,

where φ ∈ (0, 1] is the pheromone decay coefficient, and τ_0 is the initial value of the pheromone.

  • The main goal of the local update is to diversify the search performed by subsequent ants during one iteration. 
  • In fact, decreasing the pheromone concentration on the edges as they are traversed during one iteration encourages subsequent ants to choose other edges and hence to produce different solutions. 
  • This makes it less likely that several ants produce identical solutions during one iteration.
  • Additionally, because of the local pheromone update in ACS, the minimum values of the pheromone are limited.
  • As in AS, in ACS a pheromone update, called the offline pheromone update, is also performed at the end of the construction process.
  • ACS offline pheromone update is performed only by the best ant, that is, only edges that were visited by the best ant are updated, according to the equation:
τ_ij ← (1 − ρ) · τ_ij + ρ · Δτ_ij^best ,

where Δτ_ij^best = 1/L_best if the best ant used edge (i, j) in its tour, and Δτ_ij^best = 0 otherwise. L_best can be set either to the length of the best tour found in the current iteration (iteration-best, L_ib) or to the length of the best solution found since the start of the algorithm (best-so-far, L_bs).
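The two ACS rules just described can be sketched as follows; the parameter values (q_0, φ, τ_0, β) are typical but arbitrary choices here, not prescribed values.

```python
import random

def acs_next_city(i, feasible, tau, eta, beta=2.0, q0=0.9, rng=random):
    """Pseudorandom proportional rule: exploit with probability q0,
    otherwise fall back to AS-style biased exploration."""
    if rng.random() <= q0:
        # exploitation: argmax of tau * eta**beta over feasible cities
        return max(feasible, key=lambda j: tau[i][j] * eta[i][j] ** beta)
    weights = [tau[i][j] * eta[i][j] ** beta for j in feasible]
    return rng.choices(list(feasible), weights=weights)[0]

def acs_local_update(tau, i, j, phi=0.1, tau0=0.01):
    """Local pheromone update on the edge just traversed:
    tau_ij <- (1 - phi) * tau_ij + phi * tau0, which lowers the edge's
    pheromone and so diversifies the search within one iteration."""
    tau[i][j] = (1 - phi) * tau[i][j] + phi * tau0
    tau[j][i] = tau[i][j]   # symmetric TSP
```

With q_0 close to 1 the rule is strongly greedy, which is exactly why the local update is needed as a counterweight.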

MAX-MIN ant system

  • MAX-MIN ant system (MMAS) is another improvement, proposed by Stützle and Hoos (2000), over the original ant system idea. 
  • MMAS differs from AS in that (i) only the best ant adds pheromone trails, and (ii) the minimum and maximum values of the pheromone are explicitly limited (in AS and ACS these values are limited implicitly, that is, the value of the limits is a result of the algorithm working rather than a value set explicitly by the algorithm designer).

The pheromone update equation takes the following form:

τ_ij ← (1 − ρ) · τ_ij + Δτ_ij^best ,

where Δτ_ij^best = 1/L_best if the best ant used edge (i, j) in its tour, and Δτ_ij^best = 0 otherwise, with L_best the length of the tour of the best ant. As in ACS, L_best may be set (at the algorithm designer's discretion) either to L_ib or to L_bs, or to a combination of both.

The pheromone values are constrained between τ_min and τ_max by verifying, after they have been updated by the ants, that all pheromone values are within the imposed limits: τ_ij is set to τ_max if τ_ij > τ_max, and to τ_min if τ_ij < τ_min. It is important to note that the pheromone update equation of MMAS is applied, as in AS, to all the edges, while in ACS it is applied only to the edges visited by the best ant.

The minimum value τ_min is most often chosen experimentally (although some theory on how to define its value analytically has been developed in (Stützle & Hoos 2000)). The maximum value τ_max may be calculated analytically, provided that the optimal ant tour length is known. In the case of the TSP, τ_max = 1/(ρ · L*), where L* is the length of the optimal tour. If L* is not known, it can be approximated by L_bs. It is also important to note that the initial value of the trails is set to τ_max, and that the algorithm is restarted when no improvement is observed for a given number of iterations.

This algorithm controls the maximum and minimum pheromone amounts on each trail. Only the global-best tour or the iteration-best tour is allowed to add pheromone to its trail. To avoid stagnation of the search algorithm, the range of possible pheromone amounts on each trail is limited to an interval [τ_min, τ_max]. All edges are initialized to τ_max to force a higher exploration of solutions. The trails are reinitialized to τ_max when nearing stagnation.
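The MMAS update-and-clamp step can be sketched as follows. This is a minimal version; the τ_max ≈ 1/(ρ · L_best) default follows the approximation described above when the optimal tour length is unknown.

```python
def mmas_update(tau, best_tour, best_len, rho=0.02, tau_min=0.01, tau_max=None):
    """MAX-MIN Ant System update: evaporate everywhere, deposit only on
    the best tour's edges, then clamp every value into [tau_min, tau_max]."""
    n = len(tau)
    if tau_max is None:
        tau_max = 1.0 / (rho * best_len)       # approximation when L* unknown
    for i in range(n):                         # evaporation on all edges
        for j in range(n):
            tau[i][j] *= (1 - rho)
    for k in range(len(best_tour)):            # deposit: best ant only
        i, j = best_tour[k], best_tour[(k + 1) % len(best_tour)]
        tau[i][j] += 1.0 / best_len
        tau[j][i] = tau[i][j]                  # symmetric TSP
    for i in range(n):                         # clamp into [tau_min, tau_max]
        for j in range(n):
            tau[i][j] = min(tau_max, max(tau_min, tau[i][j]))
    return tau
```

The explicit clamp is what distinguishes this update from the AS and ACS rules, where the limits arise only implicitly.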

  • MMAS is an ACO algorithm derived from the Ant System, and it differs from the Ant System in various aspects. 

  • MMAS algorithm achieves strong exploitation of the search history by allowing only the best solutions to add pheromone during the pheromone trail update. 

  • The use of a rather simple mechanism for limiting the strengths of the pheromone trails effectively avoids premature convergence of the search.

  • MMAS can easily be extended by adding local search algorithms.

  • MMAS is currently one of the best performing ACO algorithms for the TSP and the Quadratic Assignment Problem (QAP).

The key to achieving the best performance of ACO algorithms is to combine improved exploitation of the best solutions found during the search with an effective mechanism for avoiding early search stagnation. MAX-MIN Ant System, which has been developed specifically to meet these requirements, differs in three key aspects from AS.

 

  • To exploit the best solutions found during an iteration or during the run of the algorithm, after each iteration only one single ant adds pheromone. This ant may be the one that found the best solution in the current iteration (iteration-best ant) or the one which found the best solution from the beginning of the trial (global-best ant).

  • To avoid stagnation of the search, the range of possible pheromone trails on each solution component is limited to an interval [τ_min, τ_max].

  • Additionally, the pheromone trails are deliberately initialized to τ_max, achieving in this way a higher exploration of solutions at the start of the algorithm.



Points to be Noted:

  1. Four modifications with respect to AS:

  • Strongly exploits the best tours found. (This may lead to stagnation, so…)

  • Limits the possible range of pheromone values.

  • Pheromone values are initialized to an upper limit.

  • Pheromone values are reinitialized when the system approaches stagnation.

  2. After all ants have constructed a solution, pheromone values are updated. (Evaporation is the same as in AS.)

  3. Lower and upper limits on pheromones limit the probability of selecting a city.

  4. Initial pheromone values are set to the upper limit, resulting in initial exploration.

  5. Occasionally, pheromones are reinitialized.


Differences with Ant System:

  1. Best only offline Pheromone Update

  2. Min and Max values of the pheromone are explicitly limited

  • τ_ij is constrained between τ_min and τ_max (explicitly set by the algorithm designer).

  • After the pheromone update, τ_ij is set to τ_max if τ_ij > τ_max, and to τ_min if τ_ij < τ_min.

Elitist ant system

In this algorithm, the global-best solution deposits pheromone on its trail after every iteration (even if this trail has not been revisited), along with all the other ants. The objective of the elitist strategy is to direct the search of all ants toward constructing solutions that contain links of the current best route.

Rank-based ant system (ASrank)

All solutions are ranked according to their length. Only a fixed number of the best ants in the iteration are allowed to update their trails. The amount of pheromone deposited is weighted for each solution, such that solutions with shorter paths deposit more pheromone than solutions with longer paths.
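A sketch of the rank-based deposit scheme: the (w − r)/L_r weighting below follows Bullnheimer et al.'s AS_rank as commonly described, but treat the exact weights and parameter names here as assumptions for illustration.

```python
def rank_based_update(tau, tours_and_lengths, best, rho=0.1, w=6):
    """Rank-based AS sketch: sort this iteration's tours by length; the
    r-th ranked ant (r = 1..w-1) deposits (w - r)/L_r on its edges, and
    the best-so-far tour deposits w/L_best."""
    n = len(tau)
    for i in range(n):                                 # evaporation
        for j in range(n):
            tau[i][j] *= (1 - rho)

    def deposit(tour, amount):
        for k in range(len(tour)):
            i, j = tour[k], tour[(k + 1) % len(tour)]
            tau[i][j] += amount
            tau[j][i] = tau[i][j]                      # symmetric TSP

    ranked = sorted(tours_and_lengths, key=lambda tl: tl[1])[: w - 1]
    for r, (tour, length) in enumerate(ranked, start=1):
        deposit(tour, (w - r) / length)                # shorter tours deposit more
    best_tour, best_len = best
    deposit(best_tour, w / best_len)                   # best-so-far, heaviest weight
    return tau
```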

Applications

  • Ant colony optimization algorithms have been applied to many combinatorial optimization problems, ranging from quadratic assignment to protein folding and vehicle routing, and many derived methods have been adapted to dynamic problems, problems in real variables, stochastic problems, multi-objective problems, and parallel implementations.  
  • It has also been used to produce near-optimal solutions to the travelling salesman problem.  
  • Routing in telecommunication networks 
  • Traveling Salesman 
  • Graph Coloring 
  • Scheduling 
  • Constraint Satisfaction

Advantages

  • Inherent parallelism 
  • Efficient for Traveling Salesman Problem and similar problems 
  • Can be used in dynamic applications (adapts to changes such as new distances, etc) 
  • Positive Feedback accounts for rapid discovery of good solutions 
  • Distributed computation avoids premature convergence 
  • The greedy heuristic helps find acceptable solutions in the early stages of the search process. 
  • The collective interaction of a population of agents.

Disadvantages

  • Theoretical analysis is difficult 
  • Sequences of random decisions (not independent) 
  • Probability distribution changes by iteration 
  • Research is experimental rather than theoretical 
  • Time to convergence uncertain (but convergence is guaranteed!) 
  • Slower convergence than other heuristics 
  • Has performed poorly on TSP instances larger than 75 cities 
  • No centralized processor to guide the AS towards good solutions

Why include ACO in Multi Agent Learning? 

  • Can be used to solve similar problems, under certain conditions 
  • Is in a sense multi-agent (swarm) based, and shows learning behaviour. 
  • Is very popular due to its nice analogy, and is well researched.

Ant Colony Optimization (ACO) is a metaheuristic for solving hard combinatorial optimization problems 

  • A heuristic algorithm trades optimality, completeness, accuracy and/or precision for speed. 
  • A metaheuristic is a top-level general strategy which guides other heuristics to search for feasible solutions in domains where the task is hard.


http://www.scholarpedia.org/article/Ant_colony_optimization 

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms 
