Week 07: Local and Global Search

An optimization problem has a set of variables, a domain for each variable, and an objective function over total assignments.

The goal is to find the assignment that optimizes (maximizes or minimizes) the value of the objective function.

A constrained optimization problem adds a set of constraints that determine which assignments are allowed.

Local Search

Local search algorithms iteratively improve a state.

A single current state is kept in memory and in each iteration, we move to one of its neighbors to improve it.

Most local search algorithms are greedy. Two such algorithms are hill climbing and greedy descent.

e.g. TSP: start with any complete tour and perform pairwise exchanges (pick two edges, remove them, and reconnect their endpoints the other way round - a 2-opt move). This type of approach can get close to the optimal solution quickly.
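A minimal sketch of one such exchange, assuming a tour is a list of city indices and dist is a hypothetical distance function:

import random

def two_opt_step(tour, dist):
  # pick two edges (tour[i], tour[i+1]) and (tour[j], tour[j+1]); reversing
  # the segment between them swaps which vertices those edges connect
  i, j = sorted(random.sample(range(len(tour) - 1), 2))
  old = dist(tour[i], tour[i + 1]) + dist(tour[j], tour[j + 1])
  new = dist(tour[i], tour[j]) + dist(tour[i + 1], tour[j + 1])
  if new < old:  # greedy: only keep the exchange if the tour gets shorter
    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
  return tour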

Local Search for CSPs

A CSP can be reduced to an optimization problem.

If each variable is assigned a value, a conflict is an unsatisfied constraint: the goal is to find an assignment with zero conflicts.

Hence, the heuristic function to be minimized will be the number of conflicts.

Example: n-queens problem

Put n queens on an n by n board such that no two queens can attack each other.

Heuristic: number of pairs of queens that can attack each other.

One queen is placed on each column, and queens can only move up and down within their column. Each state therefore has $n(n-1)$ neighbors (each of the $n$ queens can move to $n-1$ other locations).
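A minimal sketch of the heuristic, assuming a state is encoded as a list where state[c] is the row of the queen in column c:

def num_conflicts(state):
  # count pairs of queens that share a row or a diagonal
  # (queens are in distinct columns by construction)
  conflicts = 0
  for c1 in range(len(state)):
    for c2 in range(c1 + 1, len(state)):
      same_row = state[c1] == state[c2]
      same_diagonal = abs(state[c1] - state[c2]) == c2 - c1
      if same_row or same_diagonal:
        conflicts += 1
  return conflicts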

Variants of Greedy Descent

Issues

Can get stuck in local optima/flat areas (plateaus) of the landscape - randomized greedy descent can sometimes help:

Random restarts: when stuck, restart the search from a new random assignment.

Random walks: occasionally move to a random neighbor instead of the best one.

These two make the search global (see the sketch below).
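A sketch of randomized greedy descent combining both ideas; random_state, neighbors, and the heuristic h are hypothetical problem-specific helpers:

import random

def randomized_greedy_descent(random_state, neighbors, h, walk_prob=0.1):
  state = random_state()
  while h(state) > 0:  # zero conflicts means a solution
    if random.random() < walk_prob:
      state = random.choice(neighbors(state))  # random walk step
    else:
      best = min(neighbors(state), key=h)  # greedy step
      # stuck at a local minimum or plateau: random restart
      state = best if h(best) < h(state) else random_state()
  return state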

A total assignment is called an individual.

Maintain a population of k individuals instead of one and update each individual at every stage. If an individual is a solution, it can be reported (and the search can stop).

With random-restart local search, the search restarts from a new random state whenever it reaches a non-zero local minimum, and finishes when a solution is found.

With parallel search, if any individual finds a solution, all individuals can be stopped.

Simulated Annealing

Given a current node $n$, a randomly chosen neighbor $n'$, and a temperature $T$:

The probability of adopting the new value (when it is worse, i.e. $h(n') > h(n)$) is:

$$e^{(h(n) - h(n'))/T}$$

As the temperature is reduced, the probability of accepting a worsening change decreases.
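A sketch of the full loop, assuming hypothetical neighbors and heuristic h, and a simple geometric cooling schedule:

import math
import random

def simulated_annealing(state, neighbors, h, T=1.0, cooling=0.99, T_min=1e-3):
  while T > T_min and h(state) > 0:
    candidate = random.choice(neighbors(state))
    delta = h(state) - h(candidate)  # positive if the candidate is better
    # improvements are always accepted (e^(delta/T) >= 1 when delta >= 0);
    # worsening moves are accepted with probability e^(delta/T)
    if delta >= 0 or random.random() < math.exp(delta / T):
      state = candidate
    T *= cooling  # reduce the temperature over time
  return state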

Gradient Descent

Objective function must be (mostly) differentiable. A runnable sketch (using the autograd package as one way to obtain $\nabla f$; the step size and tolerance are illustrative):

from autograd import grad
import autograd.numpy as np

def gradient_descent(f, initial_guess, step_size=0.01, epsilon=1e-6):
  # f is a multi-variable objective function, so grad(f)(x) returns a vector
  x = initial_guess  # x is a vector
  while np.linalg.norm(grad(f)(x)) > epsilon:
    # the steepest ascent is along the gradient, so walk against it
    x = x - step_size * grad(f)(x)
  return x
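For example, minimizing a simple quadratic with the setup above:

f = lambda x: np.dot(x, x)  # minimum at the origin
print(gradient_descent(f, np.array([3.0, 4.0])))  # converges to roughly [0, 0]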

Evolutionary Algorithms

Requires: a representation of individuals, a fitness function, a selection mechanism, and reproduction operators (crossover and mutation).

Flow

After initializing the population, repeat: evaluate the fitness of each individual, select parents, produce offspring via crossover and mutation, and form the next generation (a sketch follows).
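A sketch of this flow; select, crossover, and mutate are hypothetical stand-ins for the operators described below:

def evolve(population, fitness, select, crossover, mutate, generations=100):
  for _ in range(generations):
    # elitism: carry the fittest individual over unchanged (see below)
    next_generation = [max(population, key=fitness)]
    while len(next_generation) < len(population):
      parent1 = select(population, fitness)
      parent2 = select(population, fitness)
      for child in crossover(parent1, parent2):
        next_generation.append(mutate(child))
    population = next_generation[:len(population)]
  return max(population, key=fitness)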

Evaluation/Fitness Function

A value relating to how ‘good’ an individual is.

Used for: selecting which individuals become parents, deciding which individuals survive into the next generation, and checking for termination.

Selection

Better individuals should have a higher chance of surviving and breeding.

Roulette wheel

Each individual is assigned a slice of the roulette wheel, with the 'angle' proportional to its fitness; the wheel is spun n times to select n individuals.

def roulette_wheel_select(population, fitness, r):
  # r is one 'spin' of the wheel: a random number in [0, 1)
  total_fitness = sum(fitness(individual) for individual in population)

  rolling_sum = 0

  for individual in population:
    rolling_sum += fitness(individual)
    # return the individual whose slice of the wheel r lands in
    if rolling_sum > total_fitness * r:
      return individual

Tournaments

n individuals (where n is the size of the tournament) are chosen randomly: the fittest one is selected as a parent.

If n is one, it is equivalent to randomly selecting an individual.

If n is the population size, it is completely deterministic.
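A sketch, assuming higher fitness is better:

import random

def tournament_select(population, fitness, n):
  # choose n individuals uniformly at random; the fittest of them wins
  return max(random.sample(population, n), key=fitness)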

Elitism

Keeps at least one copy of the fittest individual so far for the next generation - ensures the best fitness will not decrease over generations.

Reproduction Operators

Crossover

Two parents produce two offspring. 1, 2 or n crossover points are generated:

One-point crossover: the child has parent one's chromosomes up until the random crossover point, and the other parent's chromosomes after it (see the sketch below).

n-point crossover: at each of n separate points, copying swaps from one parent to the other.
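A sketch of one-point crossover for individuals encoded as lists of genes:

import random

def one_point_crossover(parent1, parent2):
  point = random.randrange(1, len(parent1))  # the random crossover point
  child1 = parent1[:point] + parent2[point:]
  child2 = parent2[:point] + parent1[point:]
  return child1, child2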

Mutation

Mutation gives a low probability of randomly changing the gene of a child.

Mutation brings in diversity, whereas crossover combines existing candidates to (hopefully) produce better children.
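A sketch for list-encoded individuals; random_gene is a hypothetical helper that samples a fresh value from the gene's domain:

import random

def mutate(child, random_gene, p=0.01):
  # each gene is independently replaced with a small probability p
  return [random_gene() if random.random() < p else gene for gene in child]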

Randomly Generating Programs (Trees)

Mutation: pick a random node and replace the subtree with a randomly-generated subtree.

Crossover: pick a random node in each parent and exchange the subtrees rooted at those nodes.
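A sketch of both operators, representing a program as nested lists such as ['+', 'x', ['*', 2, 'x']] (a hypothetical encoding; trees are modified in place):

import random

def subtrees(tree):
  # yield (parent, index) pairs addressing every non-root node
  for i, child in enumerate(tree[1:], start=1):
    yield tree, i
    if isinstance(child, list):
      yield from subtrees(child)

def mutate_tree(tree, random_tree):
  # pick a random node and replace its subtree with a randomly-generated one
  parent, i = random.choice(list(subtrees(tree)))
  parent[i] = random_tree()

def crossover_trees(tree1, tree2):
  # pick a random node in each parent and exchange the subtrees rooted there
  parent1, i1 = random.choice(list(subtrees(tree1)))
  parent2, i2 = random.choice(list(subtrees(tree2)))
  parent1[i1], parent2[i2] = parent2[i2], parent1[i1]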