Today, we would be discussing about some other heuristic search strategies available and also would try to understand about heuristic functions and how they can be constructed.
- a. Greedy best-first Search
- b. A* Search
The above search strategies were discussed in the previous article.
As the name implies, this type of search tries to reduce the memory requirements while searching based on heuristics. With regard to A* search which we briefly described last time, one of its main drawbacks is the high amount of memory consumption. It has to keep all generated nodes at each iteration in the memory, so when we run this algorithm on a large scale problem, it will run out of memory causing trouble. Some of such memory-bounded algorithms are described in detail as follows.
In general, Iterative-deepening search is a step by step procedure where the problem space is expanded until a certain pre-specified limit has been reached. In the case of A* search the evaluation function is defined as follows.
f(n) = g(n) + h(n)
- g(n) – cost to reach the node
- h(n) – cost to each node from the goal node
In this case, when we consider the iterative-deepening A* search the pre-defined limit for the iterative deepening search is an upper limit on the value of the evaluation function (f(n)). In the case where, a node in concern has a comparatively lower f(n) value to the cut-off limit, then that node would be expanded and if the algorithm reaches a node which has a f(n) value exceeding the limit, the algorithm will cease to expand on that node on that particular path and move on to finding other paths to reach the goal node. By imposing this kind of cut-off limit for the evaluation function cost, the nodes which are not feasible to be in the path would not be put on to memory and only the nodes within the paths which are feasible and cost effective would be stored in memory. This would lead to a lower memory requirement than using A* search on its whole.
A recursive algorithm is one which calls the same algorithm within itself with different input arguments at each time until a certain objective function has been reached. RBFS is a simple such recursive algorithm which tries to do the operation of standard best-first search (which was explained last time) but using only a limited memory space. In this type of best-first search the evaluation function (f(n)) in concern is the combination of:
- g(n) – cost to reach the node
- h(n) – cost to each node from the goal node
RBFS algorithm keeps track of the evaluation function value (i.e. f(n) value) of the best alternative path available from any ancestor of the current node. If the current node’s evaluation function value exceeds this limit, the recursive step makes the algorithm take the alternative path instead. When such recursion occurs, RBFS algorithm replaces the previous f-value of each node along its path with the best f-value of its children. By doing this, the memory requirement is limited, because it does not need to keep all the expanded nodes in memory. It only needs to store the evaluation function values of best alternative path from each ancestor node of the current node and do a comparison with the current value and update it if required, making the memory requirement decrease. Although this looks quite feasible, there is another drawback in RBFS algorithm. That is it uses too little memory and is not capable of utilizing all the available memory if required. So, in another sense, it wastes available memory in a system since this algorithm cannot utilize it but rather use very limited memory to perform its operations. Therefore, some other algorithms have come in to play which use limited memory but also have the ability to utilize all available memory, which are described in detail in the following sections.
SMA algorithm works quite similar to the general A* search algorithm with the only exception being, it expanding the best leaf only until the memory is full. When it reaches the point of memory being full, it has to drop an existing node in memory to fill in the new one. In order to do this, SMA algorithm follows a simple strategy where it drops the worst leaf node already in memory and adds in the new node. ‘Worst leaf node’ means the leaf node in the memory which has a highest evaluation function value which is f(n), which implies it costs more to expand on that node in order to reach the goal. Therefore, this algorithm removes it and adds the better node in the memory. There might be a scenario where all the leaf nodes have the same f-value. In such as case, how does the SMA algorithm select which node is the worst and which node is the best since same node could be selected for both expansion and deletion? The solution is as follows. SMA* solves this by expanding newest best leaf and deleting oldest worst leaf, which means it deletes the node with the same f-value but was in memory for the longest time and adds the latest node. In this type of SMA search, all the limited amount memory that is available would be used to tackle the problem without wasting any resources.
The heuristic function is based on the problem in concern. If the goal of a certain problem is finding the lowest distance path from one place to another, then the heuristic function is obviously the lowest distance from each node to the goal node. The problems in practice are not so simple and obvious most of the time. Therefore, it is essential to know some important aspects that should be considered when coming up with a good heuristic function to do the searching on that problem space. Some important pointers when designing a heuristic function are as follows.
- It should never overestimate the true solution cost to the goal – For example, if the most feasible path to the goal node from the starting node costs ‘x’ no: of steps, then the heuristic function should not exceed that limit from any node for it to be a suitable heuristic function.
- The quality of the heuristic mostly depends on the ‘effective branching factor’ value – Following is the definition for effective branching factor. If the total number of nodes generated by A* search for a problem space is N, and the solution tree depth is d, then the effective branching factor b is defined using the following equation
N + 1 = 1 + b + b2 + b3 + … … … … … … . + bd
Therefore after removing 1 from both sides of the equation;
N = b + b2 + b3 + … … … … … … . + bd
The above is a geometric series. So to find the sum of a geometric series the equation is as follows. Hope you all can remember your year 10-11 Mathematics for doing this.
Where r – common ratio, n – no: of terms
Therefore in our equation no: 2 can be re-written as:-
Then an approximate solution for b can be obtained mathematically, which is called the effective branching factor. It is found out that that for a well-designed heuristic, the value for the effective branching factor would be close to 1.
Over the past few months and from today’s article, we discussed a lot about searching strategies available in problem solving which is a vital component in Artificial Intelligence. We learnt about problem solving basics, formulation of problems and solutions, uninformed (blind) search strategies and informed (heuristic) search strategies throughout this article series. Hope you were able to gain some understanding and develop an interest towards this area which paves the basics to AI. From the next article onwards, we will be presenting more about new machine learning techniques in AI such as neural networks, classification and clustering and so on. So, stay tuned!
Artificial Intelligence – A modern Approach, Second edition, Stuart Russell & Peter Norvig