Rule-Based Systems and Identification Trees

By ai-depot | November 23, 2002

Building Rule-Based Systems with Identification Trees

Semantic Network

A semantic network is the most basic structure upon which an identification tree (hereafter referred to as an ID tree) is based. Simply put, a semantic network consists of nodes representing objects and links representing the relations between those objects.

In this sample semantic network, Zoom is a feline; a feline is a mammal; a mammal is an animal. Zoom chases a mouse; a mouse is a mammal; a mammal is an animal. Zoom eats fish; fish is an animal. The relations are written on the lines: is a, is an, eats, chases. The nodes (circles) are ‘objects’.
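One simple way to hold such a network in code is as a set of labelled links. Here is a minimal Python sketch of the Zoom example above; the triple representation is an assumption, not part of the original article:

 # A semantic network as (subject, relation, object) triples,
 # using the nodes and relations from the Zoom example.
 links = {
     ("Zoom", "is a", "feline"),
     ("feline", "is a", "mammal"),
     ("mammal", "is an", "animal"),
     ("Zoom", "chases", "mouse"),
     ("mouse", "is a", "mammal"),
     ("Zoom", "eats", "fish"),
     ("fish", "is an", "animal"),
 }

 def relations_of(node):
     """Return every (relation, object) pair whose subject is `node`."""
     return [(rel, obj) for subj, rel, obj in links if subj == node]

 print(relations_of("Zoom"))
 # e.g. [('is a', 'feline'), ('chases', 'mouse'), ('eats', 'fish')] -- sets are unordered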

Semantic Tree

At the next level of complexity is the semantic tree, which is simply a semantic network with a few additional conditions and terms. Each node has a parent to which it is linked (with the exception of the root node, which is its own parent and needs no link). Each link connects a parent node to one of its child nodes. A single parent node may have multiple children, but no child may have multiple parents. Nodes with no children are the leaf nodes. The difference between a tree and a network is this: a network can have loops, a tree cannot.

The root node is marked as such. It is its own parent, and is parent to A and B. A is child to the root and parent to C. B is child to the root and parent to D and E. C is a child of A and has no children of its own, making it a leaf node. D is parent to F, which is parent to leaf nodes I and J. E is parent to leaf nodes G and H.
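The single-parent condition is straightforward to enforce in code. Here is a minimal sketch of the tree described above, using a hypothetical TreeNode class (not from the original article):

 class TreeNode:
     """A node in a semantic tree: exactly one parent, any number of children."""
     def __init__(self, name, parent=None):
         self.name = name
         self.parent = parent if parent is not None else self  # the root is its own parent
         self.children = []
         if parent is not None:
             parent.children.append(self)

     def is_leaf(self):
         return not self.children

 # The tree from the figure: the root is parent to A and B, and so on.
 root = TreeNode("root")
 a = TreeNode("A", root); b = TreeNode("B", root)
 c = TreeNode("C", a)
 d = TreeNode("D", b); e = TreeNode("E", b)
 f = TreeNode("F", d)
 i = TreeNode("I", f); j = TreeNode("J", f)
 g = TreeNode("G", e); h = TreeNode("H", e)

 print([n.name for n in (c, g, h, i, j) if n.is_leaf()])  # ['C', 'G', 'H', 'I', 'J']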

Decision Tree

Above the semantic tree comes the decision tree. Each node of a decision tree is linked to a set of possible solutions. Each parent node, that is, each node that is not a leaf (and thus has children), is associated with a test that splits its set of possible answers into subsets, one for each of the test's possible outcomes.

Each non-leaf node serves as a test to lead to one of the leaf outcomes.
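In code, each non-leaf node can be a test plus one subtree per outcome. Here is a minimal sketch with hypothetical animal attributes; the tests are illustrative, not from the article:

 def classify(node, sample):
     """Walk the tree: apply each node's test and follow the matching
     branch until a leaf (an outcome) is reached."""
     while not node.get("leaf", False):
         outcome = node["test"](sample)
         node = node["branches"][outcome]
     return node["answer"]

 # Hypothetical two-test tree over animal attributes.
 tree = {
     "test": lambda s: s["has_fur"],
     "branches": {
         True: {"test": lambda s: s["chases_mice"],
                "branches": {True:  {"leaf": True, "answer": "feline"},
                             False: {"leaf": True, "answer": "other mammal"}}},
         False: {"leaf": True, "answer": "fish"},
     },
 }

 print(classify(tree, {"has_fur": True, "chases_mice": True}))  # feline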

Identification Trees

Last, but not least, an ID tree is a decision tree in which all possible divisions are created by training the tree against a list of known data. The purpose of an ID tree is to take a set of sample data, classify that data, and construct a series of tests that classify an unknown object based on like properties.

Training ID Trees

First, the tree must be created and trained. It must be provided with sufficient labeled samples, which are used to build the tree itself by dividing the samples into subsets based on their features. The sets of samples at the leaves of the tree define a classification. The tree is built according to Occam's Razor, which (adapted for ID trees) states that the smallest tree consistent with the training samples is the best predictor. To find the smallest tree, one could generate every possible tree for the given data set, examine each one, and choose the smallest. However, this is expensive and wasteful. The practical solution, therefore, is to greedily build one small tree:

 At each node, pick a test such that its branches are as close as possible to a single classification:
     For each candidate test, split the samples into subsets by test outcome
     Select the test that minimizes the disorder of its subsets

Then:

 Until each leaf node contains a set that is homogeneous or nearly homogeneous:
   Select a leaf node whose set is non-homogeneous
   Split that set into two or more subsets, choosing the split that minimizes disorder

Since the goal of an ID tree is to generate homogeneous subsets, we want to measure how non-homogeneous the subsets created by each test are. The test that minimizes the disorder is the one that divides the samples into the cleanest categories. Disorder is calculated as follows:

Average disorder = Σb (nb/nt) × ( −Σc (nbc/nb) log2(nbc/nb) )

 Where:
  nb is the number of samples in branch b
  nt is the total number of samples in all branches
  nbc is the number of samples in branch b belonging to class c

For an example of training an ID tree, see Appendix C.
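Separately from the appendix example, here is a minimal Python sketch of the disorder formula and the greedy test-selection step above; the sample data and candidate tests are hypothetical:

 import math

 def average_disorder(branches):
     """Average disorder of a test, where `branches` is a list of lists of
     class labels -- one list per branch (see the formula above)."""
     n_t = sum(len(b) for b in branches)       # total samples over all branches
     total = 0.0
     for b in branches:
         n_b = len(b)
         if n_b == 0:
             continue
         # Branch disorder: -sum over classes c of (nbc/nb) * log2(nbc/nb)
         disorder = 0.0
         for c in set(b):
             p = b.count(c) / n_b
             disorder -= p * math.log2(p)
         total += (n_b / n_t) * disorder
     return total

 def best_test(samples, labels, tests):
     """Greedy step: return the candidate test whose branches have the
     least average disorder."""
     def branches_for(test):
         split = {}
         for sample, label in zip(samples, labels):
             split.setdefault(test(sample), []).append(label)
         return list(split.values())
     return min(tests, key=lambda t: average_disorder(branches_for(t)))

 # A perfectly clean split has disorder 0; a 50/50 split has disorder 1.
 print(average_disorder([["yes", "yes"], ["no", "no"]]))  # 0.0
 print(average_disorder([["yes", "no", "yes", "no"]]))    # 1.0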

ID Trees to Rules

Once an ID tree has been constructed successfully, it can be used to generate a rule set that performs the same classifications as the ID tree. This is done by creating one rule for each path from the root to a leaf of the ID tree.

For an example of this, see Appendix D.
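Here is a minimal sketch of that path-to-rule translation, assuming the nested-dict tree representation used in the decision-tree sketch earlier; the (conditions, result) rule format is also an assumption:

 def tree_to_rules(node, conditions=()):
     """Emit one (conditions, result) rule per root-to-leaf path."""
     if node.get("leaf", False):
         return [(conditions, node["answer"])]
     rules = []
     for outcome, child in node["branches"].items():
         # Each branch contributes one condition: this node's test gave this outcome.
         rules.extend(tree_to_rules(child, conditions + ((node["test"], outcome),)))
     return rules

 # With the animal tree from earlier, this yields one rule per leaf,
 # e.g. "if has_fur and chases_mice then feline".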

Pruning Unnecessary Conditions

If a rule has conditions that are inconsequential to its outcome, discard them, thus simplifying the rule (and improving efficiency). This is accomplished by showing that the outcome is independent of the given condition. Events A and B are independent if the probability of event B does not change given that event A occurs; in terms of conditional probability:

P(B | A) = P(B)

This states that the probability of event B given that event A occurs is equal to the probability that event B occurs by itself. If this holds true, then event A does not affect whether or not event B occurs. If A is a condition and B is a result, then A can be discarded without affecting the rule.

For an example of this, see Appendix E.
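The independence check can be estimated from the training samples. Here is a minimal sketch, where `samples` is a hypothetical list of records and the condition and result are passed in as predicates:

 def prob(event, samples):
     """Estimate P(event) as the fraction of samples in which it holds."""
     return sum(1 for s in samples if event(s)) / len(samples)

 def prob_given(b, a, samples):
     """Estimate P(B | A) over only the samples in which A holds.
     Assumes at least one sample satisfies A."""
     given = [s for s in samples if a(s)]
     return sum(1 for s in given if b(s)) / len(given)

 def condition_is_redundant(condition, result, samples, tolerance=0.0):
     """If P(result | condition) == P(result), the condition can be discarded."""
     return abs(prob_given(result, condition, samples) - prob(result, samples)) <= tolerance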

Pruning Unnecessary Rules

If two or more rules share the same end result, you may be able to replace them with a single rule that fires when no other rule fires:

 if (no other rule fires)
 then (execute these common actions)

If there is more than one such group of rules, replace only one group; which one is determined by some heuristic tiebreaker. Two such tiebreakers follow:

  • Replace the larger of the two groups. If group A has six rules that share a common result and group B has only five, replacing the larger group A will eliminate more rules and simplify the rule base the most.
  • Replace the group with the highest average number of conditions per rule. While more rules may remain, the rules that remain will be simpler because they have fewer conditions. For example, given the rules:
     if (x) and (y) and (z) then (A)
     if (m) and (n) and (o) then (A)
    
     vs. 
    
     if (p) then (Z)
     if (q) then (Z)
    

    You would want to replace the first set with:

     if (no other rule fires)
     then (A)
    

For an example of this, see Appendix F.
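Here is a minimal sketch of picking which group to replace, using the same (conditions, result) rule format as before; both tiebreakers above become key functions, and the names are illustrative:

 from collections import defaultdict

 def pick_group_to_replace(rules, tiebreaker="size"):
     """Group rules by shared result and pick the group to replace with
     a default 'if no other rule fires' rule."""
     groups = defaultdict(list)
     for conditions, result in rules:
         groups[result].append(conditions)
     # Only groups of two or more rules are candidates for replacement.
     candidates = {r: g for r, g in groups.items() if len(g) >= 2}
     if not candidates:
         return None
     if tiebreaker == "size":
         # Tiebreaker 1: the biggest group eliminates the most rules.
         key = lambda r: len(candidates[r])
     else:
         # Tiebreaker 2: the group whose rules average the most conditions.
         key = lambda r: sum(len(c) for c in candidates[r]) / len(candidates[r])
     return max(candidates, key=key)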

With enough training data, an ID tree can be created which, in turn, can be used to create a rule base for classification. From then on, using forward chaining, a new entity can be introduced as an assertion in the knowledge base and classified just as the ID tree would classify it. Using backward chaining, one could instead gather evidence to support that a given classification is valid.
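As a closing sketch, forward chaining over the same (conditions, result) rules might look like this, with the pruned group handled by a default result (all names are illustrative):

 def forward_chain(sample, rules, default=None):
     """Fire the first rule whose conditions all hold for `sample`;
     otherwise fall back to the 'if no other rule fires' result."""
     for conditions, result in rules:
         if all(test(sample) == outcome for test, outcome in conditions):
             return result
     return default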
