Hill Climbing: A simple optimization method

by Amit


An optimization problem can usually also be modelled as a search problem, since we are searching for the optimum solution from among the solution space. Without any loss of generality, we can assume that our optimization problems are of the maximization category.

So, here is the hill climbing technique of search:

  1. Start with an initial solution, also called the starting point. Set current point as the starting point
  2. Make a move to a next solution, called the move operation
  3. If the move is a good move, then set the new point as the current point and repeat (2). If the move is a bad move, terminate. The last current solution is the possible optimum solution.

Move operation: The move operation is problem dependent. In a discrete optimization problem, such as the Travelling Salesman Problem, a move operation would probably shuffle a couple of positions in the original solution. In problems where the search space is continuous, techniques such as those used in steepest ascent/descent methods are used to obtain the next solution.

Good/Bad Move: A move is said to be good, if a point obtained by the move operation improves the quality of the solution, as compared to the previous solution. A bad move is defined similarly.

Quality of solution: In all optimization problems, there is always atleast on quantity (in single-objective optimization problems) that we want to improve. A quality of solution is judged on the basis of this quantity. In a function maximization problem, the function value itself is a measure of the quality.

Illustration:

Let us now consider a simple one-peaked function f(x) = -x*x and we seek the maximum of this function. This is a plot of the function:

Sample one-peak function

  • Let A be the starting point,  and let the move operation take us to point  B (pretty far away; usually it will be much closer to A but it will serve the purpose here). Consider the quality of the two solutions, A & B
  • B definitely improves our search, since the function value at B is more than A, we keep moving, setting B as the current point
  • Our next move operation takes us to C (our desired solution, but we don’t know that yet). Now, compare B & C. Since C is definitely a good move, so, we set the current point to C and keep moving
  • Our next move takes us to D. On comparing C &D, we see that it was a bad move, Hence we terminate. Hence, the current point, C is our solution.

The following plot shows the results obtained via simple Hill-climbing on the above function:

Hill clmbing results on a simple one-peak function (The green points are the ones obtained via a Hillclimbing search)

Simplicity is a hallmark of the Hill climbing search procedure. However, its simplicity comes at a price.  In a multi-modal landscape, Hill climbing procedure will terminate the first peak it finds. It may be the global maximum, but it could also be a local maxima.  This is illustrated below:

Case I:

Hill Climbing on a multimodal function – I

Case II:

Hill climbing on multimodal function- II

As you can see from the figures above, depending on your starting point, you will reach the nearest peak of your function. That is definitely a weak point in a multimodal landscape. To overcome this limitation  random restart hill climbing algorithm, i.e with a random initial point has been proposed (by who?).

Hill climbing has been used on its own for many optimization problems (See http://scholar.google.co.in/scholar?q=hill+climbing) as well as as a local search operator in global search algorithms, like Evolutionary Algorithms (GAs, ES, etc).

This is a simple Python implementation of hill climbing that I used for illustration. It contains some of the functions that I plotted above as well.

Added: 13-April-2012

Hill-climbing with Multiple Solutions

As you have noticed earlier, the classic hill climbing will not go beyond the first peak it reaches. In a multi-modal landscape this can indeed be limiting. I made some simple changes to the above algorithm to allow hill-climbing to go beyond the first peak it reaches. They are:

  • Multiple starting points: Instead of searching for the highest peak, starting from a single peak, search from multiple points. This simulates a parallel search, similar in principle to the way population based algorithms function. This is also sometimes referred to as a random-restart hill-climbing.
  • Random move when stuck: If a particular point gets stuck on its uphill climb, a random noise is added to it. A point is allowed to be stuck for a specified number of times, before beginning with the next point, if any.

With these changes, I saw that the algorithm is now able to find the highest peak with two simple 1-D functions:

Sample Function-1

Sample Function – 2

The updated code is now available here Some other minor changes in the code are given as comments. The code also reports the best solution reached with each of the multiple starting points: (the output below is for the sample function 1)

Best Solution with 1.142805 :: 0.260280
Best Solution with 0.857778 :: 0.503222
Best Solution with 1.156183 :: 0.260280
Best Solution with 0.412196 :: 0.780970
Best Solution with 1.626643 :: 0.108062
Best Solution with 1.647178 :: 0.108062
Best Solution with 1.306945 :: 0.260280
Best Solution with 0.320459 :: 0.972905
Best Solution with 1.041339 :: 0.260280
Best Solution with 0.655546 :: 0.780970

One final plot to show how the multiple points searched the peaks:

Solutions from each point goes to its nearest peak

Solutions from each point goes to its nearest peak, effectively forming a parallel search. Its interesting to note that how it finds all the peaks. This is a very naive way of finding all the optima in a Multi-modal optimization algorithm.

Please feel free to drop a comment/email to discuss this.