Chathra Hendahewa has been a graduate student at Rutgers – The State University of New Jersey since fall 2008 and is currently in her second year of studies. She is working towards her Doctorate in Computer Science, and her major research interests are Pattern Classification with applications to financial systems, Machine Learning and Cognitive Sciences. She was the first ever International Fulbright Science & Technology award scholar from Sri Lanka. Chathra graduated from the Faculty of Information Technology, University of Moratuwa, in 2007. She is also a passed finalist and a multiple gold medalist of CIMA (Chartered Institute of Management Accountants – UK). She has more than a year of professional experience as a Business Systems Analyst at IFS R&D and has completed an internship developing modules for the ‘Sahana’ Disaster Management System. She loves reading biographies, mysteries and science fiction and listening to music in her free time.
 

Informed Search Strategies

07/28/2010 12:42 am By Chathra Hendahewa

In the last article on Artificial Intelligence, which appeared in diGIT’s June 2010 issue, we discussed different types of ‘Informed search’ strategies. Just to recall, this type of searching uses extra problem-specific knowledge, beyond the problem definition itself, to find the best solution. Informed search typically finds good solutions more efficiently than uninformed search. You might also remember that this type of search is also known as ‘Heuristic search’. Last time we mainly discussed two such searches, ‘Greedy Best-First Search’ and ‘A* Search’; please read the last article if you have forgotten what they are all about.

Today, we will discuss some other heuristic search strategies and also try to understand heuristic functions and how they can be constructed.

Types of Informed Search Continued
1. Best-First Search
  • a. Greedy best-first Search
  • b. A* Search

The above search strategies were discussed in the previous article.

2. Memory-bounded heuristic search

As the name implies, this type of search tries to reduce the memory requirement while searching based on heuristics. One of the main drawbacks of the A* search we briefly described last time is its high memory consumption: it has to keep all generated nodes in memory at each iteration, so on a large-scale problem it quickly runs out of memory. Memory-bounded algorithms address this problem, and some of them are described in detail below.

a. Iterative-deepening A* search (IDA*)

In general, iterative-deepening search is a step-by-step procedure in which the problem space is expanded until a certain pre-specified limit is reached. In the case of A* search, the evaluation function is defined as follows.

f(n) = g(n) + h(n)

where:

  • g(n) - cost to reach node n from the start node
  • h(n) - estimated cost from node n to the goal node

In iterative-deepening A* search, the pre-defined limit is an upper bound (a cut-off) on the value of the evaluation function f(n). If the f(n) value of the node in concern is below the cut-off limit, that node is expanded; if the algorithm reaches a node whose f(n) value exceeds the limit, it stops expanding along that particular path and moves on to other paths towards the goal node. If no solution is found within the current cut-off, the search is repeated with the cut-off raised to the smallest f(n) value that exceeded it in the previous iteration. By imposing this kind of cut-off on the evaluation function cost, nodes that cannot lie on a feasible, cost-effective path are never stored in memory, so the memory requirement is much lower than that of plain A* search.
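To make the procedure concrete, here is a minimal IDA* sketch in Python. It is only an illustration of the idea described above, not code from this article or the referenced textbook; the graph representation (a dictionary mapping each node to a list of (neighbour, step cost) pairs), the heuristic table h and the node names in the example are assumptions made for the sketch.

# Illustrative IDA* sketch; graph, h, start and goal are hypothetical inputs.
def ida_star(graph, h, start, goal):
    bound = h[start]                       # initial cut-off: f(start) = 0 + h(start)
    while True:
        result = _search(graph, h, [start], 0, bound, goal)
        if isinstance(result, list):       # a path was found within the bound
            return result
        if result == float('inf'):         # nothing left to expand: no solution
            return None
        bound = result                     # raise the cut-off to the smallest f that exceeded it

def _search(graph, h, path, g, bound, goal):
    node = path[-1]
    f = g + h[node]
    if f > bound:
        return f                           # prune, but report f so it can set the next cut-off
    if node == goal:
        return path
    minimum = float('inf')
    for neighbour, cost in graph.get(node, []):
        if neighbour in path:              # avoid cycling on the current path
            continue
        result = _search(graph, h, path + [neighbour], g + cost, bound, goal)
        if isinstance(result, list):
            return result
        minimum = min(minimum, result)
    return minimum

For example, with graph = {'A': [('B', 1), ('C', 3)], 'B': [('G', 5)], 'C': [('G', 1)]} and h = {'A': 3, 'B': 4, 'C': 1, 'G': 0}, calling ida_star(graph, h, 'A', 'G') returns the cheapest path ['A', 'C', 'G'].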

b. Recursive best-first search (RBFS)

A recursive algorithm is one that calls itself with different input arguments each time until a certain termination condition is reached. RBFS is a simple recursive algorithm that tries to do the work of the standard best-first search (which was explained last time) using only a limited amount of memory. In this type of best-first search, the evaluation function f(n) is again the combination of:

  • g(n) - cost to reach node n from the start node
  • h(n) - estimated cost from node n to the goal node

The RBFS algorithm keeps track of the evaluation function value (i.e. the f(n) value) of the best alternative path available from any ancestor of the current node. If the current node’s evaluation function value exceeds this limit, the recursion unwinds and the algorithm takes the alternative path instead. When such unwinding occurs, RBFS replaces the f-value of each node along the abandoned path with the best f-value of its children, so that it can decide later whether that path is worth re-expanding. This limits the memory requirement, because the algorithm does not need to keep all expanded nodes in memory; it only stores the evaluation function values of the best alternative path from each ancestor of the current node, compares them with the current value and updates them if required. Although this looks quite attractive, RBFS has another drawback: it uses too little memory and cannot make use of additional memory even when it is available. In that sense it wastes memory available in the system, since it performs its operations with only a very limited amount. Therefore, other algorithms have come into play which are memory-bounded but can also utilise all the available memory; one of them is described in the next section.
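Below is a minimal RBFS sketch in Python, reusing the same hypothetical graph and heuristic dictionaries as the IDA* example above. It is only an illustration of the backed-up f-value idea, not exact pseudocode from the article or the textbook.

# Illustrative RBFS sketch; graph, h, start and goal are hypothetical inputs.
INF = float('inf')

def rbfs(graph, h, start, goal):
    def recurse(node, path, g, f_node, f_limit):
        if node == goal:
            return path, f_node
        # Each successor entry is [f, neighbour, g]; f inherits the parent's backed-up value.
        successors = []
        for neighbour, cost in graph.get(node, []):
            if neighbour in path:
                continue
            successors.append([max(g + cost + h[neighbour], f_node), neighbour, g + cost])
        if not successors:
            return None, INF
        while True:
            successors.sort()                              # best (lowest f) entry first
            best_f, best_node, best_g = successors[0]
            if best_f > f_limit:
                return None, best_f                        # fail, backing up the best f
            alternative = successors[1][0] if len(successors) > 1 else INF
            result, successors[0][0] = recurse(best_node, path + [best_node], best_g,
                                               best_f, min(f_limit, alternative))
            if result is not None:
                return result, successors[0][0]

    path, _ = recurse(start, [start], 0, h[start], INF)
    return path

With the same example graph and heuristic, rbfs(graph, h, 'A', 'G') also returns ['A', 'C', 'G'], but at any moment it only keeps the current path and the f-values of its siblings in memory.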

c. Simplified Memory-bounded A* search (SMA*)

The SMA* algorithm works quite similarly to the general A* search algorithm, with one exception: it expands the best leaf only until memory is full. At that point it has to drop an existing node from memory to make room for the new one. To do this, SMA* follows a simple strategy: it drops the worst leaf node already in memory and adds the new node. The ‘worst leaf node’ is the leaf node in memory with the highest evaluation function value f(n), which implies that it costs the most to expand that node in order to reach the goal. SMA* therefore removes it and adds the better node to memory. There might be a scenario where all the leaf nodes have the same f-value; in such a case, how does SMA* decide which node is the worst and which is the best, since the same node could be selected for both expansion and deletion? SMA* solves this by expanding the newest best leaf and deleting the oldest worst leaf: among nodes with the same f-value, it deletes the one that has been in memory the longest and expands the one added most recently. In this way, SMA* uses all of the limited memory that is available to tackle the problem without wasting any resources.
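The short Python fragment below sketches only this leaf-selection rule (‘expand the newest best leaf, delete the oldest worst leaf’), not the full SMA* algorithm; full SMA* additionally backs the f-value of a forgotten node up into its parent so that the node can be regenerated later if needed. The class and its names are hypothetical.

# Illustrative sketch of SMA*'s frontier bookkeeping only.
class BoundedFrontier:
    """Keeps at most 'capacity' leaf entries of the form [f, insertion_order, node]."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counter = 0
        self.entries = []

    def add(self, node, f):
        if len(self.entries) >= self.capacity:
            # Memory full: drop the oldest leaf among those with the worst (highest) f.
            worst_f = max(e[0] for e in self.entries)
            oldest_worst = min((e for e in self.entries if e[0] == worst_f),
                               key=lambda e: e[1])
            self.entries.remove(oldest_worst)
        self.counter += 1
        self.entries.append([f, self.counter, node])

    def pop_best(self):
        # Expand the newest leaf among those with the best (lowest) f.
        best_f = min(e[0] for e in self.entries)
        newest_best = max((e for e in self.entries if e[0] == best_f),
                          key=lambda e: e[1])
        self.entries.remove(newest_best)
        return newest_best[2]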

Heuristic functions

The heuristic function depends on the problem at hand. If the goal of a certain problem is to find the shortest path from one place to another, then an obvious heuristic function is the straight-line distance from each node to the goal node. Problems in practice are not so simple and obvious most of the time, so it is essential to know the important aspects that should be considered when coming up with a good heuristic function for searching a problem space. Some important pointers when designing a heuristic function are as follows.

  • It should never overestimate the true cost of reaching the goal (this property is called admissibility) – For example, if the cheapest path from a given node to the goal node costs ‘x’ steps, then the heuristic value at that node should not exceed x for it to be a suitable heuristic function.
  • The quality of a heuristic can be measured by its ‘effective branching factor’ – The effective branching factor is defined as follows. If the total number of nodes generated by A* search for a problem is N, and the depth of the solution is d, then the effective branching factor b is the value that satisfies the following equation:

N + 1 = 1 + b + b^2 + b^3 + ... + b^d     ... (1)

Therefore, after subtracting 1 from both sides of the equation:

N = b + b^2 + b^3 + ... + b^d     ... (2)

The right-hand side of equation (2) is a geometric series. The sum of a geometric series is given by the following equation (hope you can all remember your year 10-11 Mathematics for this):

S = a(r^n - 1) / (r - 1)

where a - first term, r - common ratio, n - no. of terms.

Therefore, equation (2) can be re-written as:

N = b(b^d - 1) / (b - 1)

An approximate solution for b can then be obtained numerically; this value is called the effective branching factor. It has been found that for a well-designed heuristic, the value of the effective branching factor is close to 1.
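As a quick illustration, the small Python function below estimates b numerically by bisection for given values of N and d. The function name and the numbers in the example are ours, not from the article.

# Illustrative sketch: solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection.
def effective_branching_factor(N, d, tolerance=1e-6):
    def total(b):
        return sum(b ** i for i in range(d + 1))

    low, high = 1.0, float(N)
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if total(mid) < N + 1:
            low = mid         # too few nodes generated: b must be larger
        else:
            high = mid        # too many nodes generated: b must be smaller
    return (low + high) / 2.0

For example, if a search generates N = 52 nodes for a solution at depth d = 5, effective_branching_factor(52, 5) evaluates to roughly 1.92.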

Stay tuned!

Over the past few months, and in today’s article, we have discussed a great deal about the search strategies used in problem solving, which is a vital component of Artificial Intelligence. Throughout this article series we learnt about problem-solving basics, formulation of problems and solutions, uninformed (blind) search strategies and informed (heuristic) search strategies. Hope you were able to gain some understanding and develop an interest in this area, which forms the foundation of AI. From the next article onwards, we will be presenting more about machine learning techniques in AI, such as neural networks, classification and clustering. So, stay tuned!

References
Stuart Russell & Peter Norvig, Artificial Intelligence: A Modern Approach, Second Edition.

 
