Rule-Based Systems and Identification Trees

By ai-depot | November 23, 2002

Conclusion & Appendices

Conclusion

I have heard a few people, including some of my classmates, say that rule-based and expert systems are obsolete, and that ID trees are a thing of the past. Granted, this is not the direction most research is moving in, but that does not negate the existing accomplishments of these architectures. As it stands, expert rule-based systems are the most widely used and accepted AI in the world outside of games. The fields of medicine, finance and many others have benefited greatly from the intelligent use of such systems. Combining rule-based systems with ID trees holds great potential for many more.

Appendices

Appendix A — Forward-Chaining Example: Medical Diagnosis

Assertions (Working Memory):

 A1: runny nose
 A2: temperature=101.7
 A3: headache
 A4: cough

Rules (Rule-Base):

 R1:    if   (nasal congestion)
             (viremia)
        then diagnose (influenza)
             exit

 R2:    if   (runny nose)
        then assert (nasal congestion)

 R3:    if   (body-aches)
        then assert (achiness)

 R4:    if   (temp >100)
        then assert (fever)

 R5:    if   (headache)
        then assert (achiness)

 R6:    if   (fever)
             (achiness)
             (cough)
        then assert (viremia)

Execution:

  1. R2 fires, adding (nasal congestion) to working memory.
  2. R4 fires, adding (fever) to working memory.
  3. R5 fires, adding (achiness) to working memory.
  4. R6 fires, adding (viremia) to working memory.
  5. R1 fires, diagnosing the disease as (influenza), and the system exits, returning the diagnosis.
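The run above can be sketched as a tiny forward-chaining engine. The engine and the string encoding of assertions are illustrative assumptions; the rules R1-R6 are those of the text, with the temperature test pre-evaluated into the fact "temp>100".

```python
temperature = 101.7
facts = {"runny nose", "headache", "cough"}
if temperature > 100:          # R4's numeric test, evaluated up front
    facts.add("temp>100")

rules = [
    (("nasal congestion", "viremia"), "diagnose influenza"),  # R1
    (("runny nose",),                 "nasal congestion"),    # R2
    (("body-aches",),                 "achiness"),            # R3
    (("temp>100",),                   "fever"),               # R4
    (("headache",),                   "achiness"),            # R5
    (("fever", "achiness", "cough"),  "viremia"),             # R6
]

def forward_chain(facts, rules):
    """Fire the first rule whose conditions all hold and whose conclusion
    is new; repeat until a diagnosis fires or no rule is applicable."""
    facts = set(facts)
    while True:
        for conditions, conclusion in rules:
            if conclusion not in facts and all(c in facts for c in conditions):
                facts.add(conclusion)
                if conclusion.startswith("diagnose"):
                    return conclusion       # R1's "exit"
                break                       # rescan from the top, as in the trace
        else:
            return None                     # nothing fired: no diagnosis

print(forward_chain(facts, rules))          # diagnose influenza
```

Tracing it by hand gives exactly the firing order in the text: R2, R4, R5, R6, then R1.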

Appendix B — Backward-Chaining Example: Medical Diagnosis

This example uses the same rules and assertions as Appendix A.

Hypothesis/Goal: Diagnosis (influenza)

Execution:

  1. R1 fires, since the goal, diagnose (influenza), matches that rule's conclusion. New goals are created, (nasal congestion) and (viremia), and back-chaining is recursively called on each.
  2. R2 fires, matching the goal (nasal congestion). A new goal is created: (runny nose). Back-chaining is recursively called; since (runny nose) is in working memory, it returns true.
  3. R6 fires, matching the goal (viremia). Back-chaining recurses with the new goals (fever), (achiness) and (cough).
  4. R4 fires, adding the goal (temp > 100). Since (temperature = 101.7) is in working memory, it returns true.
  5. R3 fires, adding the goal (body-aches). On recursion, no assertion in working memory and no rule conclusion matches this goal, so it returns false and the next matching rule is chosen. That rule is R5, which fires, adding the goal (headache). Since (headache) is in working memory, it returns true.
  6. The goal (cough) is in working memory, so it returns true.
  7. Now that all recursive calls have returned true, the system exits, returning true: the hypothesis was correct and the subject has influenza.
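The recursion just traced can be sketched in a few lines. The recursive structure is an assumption; the rules and assertions are those of Appendix A, with the temperature test again reduced to the fact "temp>100".

```python
facts = {"runny nose", "headache", "cough", "temp>100"}

rules = [
    (("nasal congestion", "viremia"), "diagnose influenza"),  # R1
    (("runny nose",),                 "nasal congestion"),    # R2
    (("body-aches",),                 "achiness"),            # R3
    (("temp>100",),                   "fever"),               # R4
    (("headache",),                   "achiness"),            # R5
    (("fever", "achiness", "cough"),  "viremia"),             # R6
]

def backward_chain(goal, facts, rules):
    """A goal holds if it is in working memory, or if some rule concluding
    it has all of its conditions provable. Rules are tried in order, so R3
    is attempted and fails before R5 succeeds, as in step 5 of the trace."""
    if goal in facts:
        return True
    for conditions, conclusion in rules:
        if conclusion == goal and all(
                backward_chain(c, facts, rules) for c in conditions):
            return True
    return False

print(backward_chain("diagnose influenza", facts, rules))  # True
```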

Appendix C — Identification Tree Training

The identification tree will be trained on a set of eight balls, each described by its size, color, weight and whether it is made of rubber, together with the outcome we want to predict: whether the ball bounces. (The data table itself appeared here as an image that is no longer available.)

We greedily create a small ID tree from this. For each column (save the first and last, since the first is simply an identifier and the last is the result we’re trying to identify) we create a tree based solely on the divisions within that category. The resulting trees are as follows:

From these, we calculate the disorder of each:

 Disorder = Σb (nb/nt) * Σc -(nbc/nb) log2 (nbc/nb)

where nt is the total number of samples, nb the number of samples in branch b, and nbc the number of samples in branch b belonging to class c. For the Size test:

 Size_Disorder = (4/8) * (-(2/4) log2 (2/4) - (2/4) log2 (2/4))
               + (1/8) * 0 + (3/8) * 0
               = 0.5

The disorder for the ‘size’ test is 0.5. The other disorders are as follows:

 Size:    0.5
 Color:   0.69
 Weight:  0.94
 Rubber:  0.61
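The disorder calculation can be checked in code for the two tests whose splits are recoverable from the text, Size and Rubber (the color and weight columns were in the missing data table). The ball list below is reconstructed from the worked example, not copied from the original table.

```python
from math import log2

# (size, rubber, outcome) triples reconstructed from the text: two small
# rubber balls bounce, two small non-rubber balls don't, the one medium
# rubber ball bounces, and the three large balls (two rubber) don't bounce.
balls = ([("small", "yes", "bounce")] * 2 + [("small", "no", "no")] * 2
         + [("medium", "yes", "bounce")] + [("large", "yes", "no")] * 2
         + [("large", "no", "no")])

def disorder(attr):
    """Sum over branches b of (nb/nt) times the entropy of outcomes in b."""
    branches = {}
    for ball in balls:
        branches.setdefault(ball[attr], []).append(ball[2])
    total = sum(len(outcomes) for outcomes in branches.values())
    d = 0.0
    for outcomes in branches.values():
        for label in set(outcomes):
            p = outcomes.count(label) / len(outcomes)
            d += (len(outcomes) / total) * (-p * log2(p))
    return d

print(round(disorder(0), 2))  # Size test   -> 0.5
print(round(disorder(1), 2))  # Rubber test -> 0.61
```

The Color and Weight disorders (0.69 and 0.94) cannot be reproduced here because those attribute values are not recoverable from the surviving text.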

Since Size has the lowest disorder, we choose that test and further break down any non-homogeneous sets. There is only one: the 'small' branch. The following trees result from the further division of the size=small samples.

The Rubber test splits the remaining samples into perfect subsets with 0 disorder. Therefore, our final, simplest ID tree which represents the data is:

Appendix D — Creating Rules from ID trees

Given the final ID tree from appendix C, follow from the root test down to each outcome with each node visited becoming a condition of our rule. This gives us the following rules:

First, we’ll follow the rightmost path from the root node: size=large. Of the three in this branch, none bounce.

 R1:  if   (size = large)
      then (ball does not bounce)

Next, we examine the next branch: size=medium. There is only one ball in this branch, and it bounces. Based on this data (though this may change with a larger training set), all medium balls bounce.

 R2:  if   (size = medium)
      then (ball does bounce)

Third, we follow the leftmost branch: size=small. This leads us to another decision node. Taking its rightmost branch, rubber=no, gives us this rule:

 R3:  if   (size = small)
           (rubber = no)
      then (ball does not bounce)

And finally, we follow the leftmost branch, size=small, and at the next test follow rubber=yes. The following rule is produced:

 R4:  if   (size = small)
           (rubber = yes)
      then (ball does bounce)
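The root-to-leaf walk just described can be run mechanically. The nested-tuple encoding below is an assumed representation of the final tree from Appendix C; a plain string is a leaf outcome.

```python
# ("test", {branch value: subtree or outcome}) encodes a decision node.
tree = ("size", {
    "large": "ball does not bounce",
    "medium": "ball does bounce",
    "small": ("rubber", {
        "no": "ball does not bounce",
        "yes": "ball does bounce",
    }),
})

def tree_to_rules(node, conditions=()):
    """Every test visited on the way to a leaf becomes a condition; the
    leaf becomes the rule's conclusion."""
    if isinstance(node, str):
        return [(conditions, node)]
    test, branches = node
    rules = []
    for value, subtree in branches.items():
        rules += tree_to_rules(subtree, conditions + ((test, value),))
    return rules

for conditions, conclusion in tree_to_rules(tree):
    print("if", " and ".join(f"{t} = {v}" for t, v in conditions),
          "then", conclusion)
```

This prints the four rules R1-R4 above, one per root-to-leaf path.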

Appendix E — Eliminating unnecessary rule conditions

Given the rules provided in appendix D, we see if there’s any way to simplify those rules by eliminating unnecessary rule conditions.
The last two rules have two conditions each. Consider, for example, the first of these, R3:

 R3:  if (size = small)
         (rubber = no)
      then (ball does not bounce)

To see whether the first condition, (size = small), can be dropped, let A = (rubber = no), the remaining condition, and B = (ball does not bounce):

 P(A and B) = (3 non-rubber balls do not bounce / 8 total) = 0.375
 P(A) = (3 non-rubber balls / 8 total) = 0.375
 P(B|A) = P(A and B) / P(A) = 1

B already follows from A alone, so the conclusion is independent of (size = small).

If we were to eliminate the first condition: size=small, then this rule would trigger for every ball not made of rubber. There are 3 balls not made of rubber. They are 2, 3 and 8 – none of these bounce. Because none bounce, the size does not affect this and we can eliminate that condition.

 if   (rubber = no)
 then (ball does not bounce)

Examining the next condition, (rubber = no): let A = (size = small), the remaining condition, with B as before:

 P(A and B) = (2 small balls do not bounce / 8 total) = 0.25
 P(A) = (4 small balls / 8 total) = 0.5
 P(B|A) = 0.25 / 0.5 = 0.5

Since P(B|A) is less than 1, B does not follow from (size = small) alone, so (rubber = no) cannot be eliminated.

If you eliminate this condition, (rubber = no), the rule triggers for every small ball. Of the small balls, two bounce and two do not. Therefore the rubber condition does affect whether they bounce and cannot be eliminated: the small balls bounce only if they are rubber.

Now, the next rule with two conditions:

 R4:  if   (size = small)
           (rubber = yes)
      then (ball does bounce)

To see whether (size = small) can be dropped, let A = (rubber = yes), the remaining condition, and B = (ball does bounce):

 P(A and B) = (3 rubber balls bounce / 8 total) = 0.375
 P(A) = (5 rubber balls / 8 total) = 0.625
 P(B|A) = 0.375 / 0.625 = 0.6

Since P(B|A) is less than 1, B is not independent of (size = small).

If we eliminate the first condition, the rule fires for all rubber balls. Of the five rubber balls, two are small and both bounce; of the other three, one bounces and two do not. For this rule, (size = small) is important.

On to the next condition, (rubber = yes). Let A = (size = small), the remaining condition:

 P(A and B) = (2 small balls bounce / 8 total) = 0.25
 P(A) = (4 small balls / 8 total) = 0.5
 P(B|A) = 0.25 / 0.5 = 0.5

Again P(B|A) is less than 1, so B is not independent of (rubber = yes).

Eliminating the second condition makes the rule fire for all small balls. Of the four small balls, two bounce and two do not. Again, the condition is significant and cannot be dropped, so this rule must stay as it is.
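The drop-a-condition test above can also be run mechanically: delete one condition, fire the shortened rule over the training set, and keep the shortened form only if its conclusion still holds for every ball it matches. The ball list is again reconstructed from the text, since the original table is not reproduced.

```python
# (size, rubber, bounces) triples reconstructed from the worked example
balls = ([("small", "yes", True)] * 2 + [("small", "no", False)] * 2
         + [("medium", "yes", True)] + [("large", "yes", False)] * 2
         + [("large", "no", False)])

SIZE, RUBBER = 0, 1

def safe_to_shorten(conditions, conclusion):
    """True if every ball matching `conditions` has outcome `conclusion`,
    i.e. a rule with just these conditions never misfires."""
    matching = [b for b in balls if all(b[i] == v for i, v in conditions)]
    return all(b[2] == conclusion for b in matching)

# R3: if size=small and rubber=no then (ball does not bounce)
print(safe_to_shorten([(RUBBER, "no")], False))   # True:  size=small can go
print(safe_to_shorten([(SIZE, "small")], False))  # False: rubber=no must stay

# R4: if size=small and rubber=yes then (ball does bounce)
print(safe_to_shorten([(RUBBER, "yes")], True))   # False: size=small must stay
print(safe_to_shorten([(SIZE, "small")], True))   # False: rubber=yes must stay
```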

Appendix F — Eliminating unnecessary rules

We have the following simplified rules from Appendix E:

 R1:  if   (size = large)
     then (ball does not bounce)

 R2:  if   (size = medium)
     then (ball does bounce)

 R3:  if   (rubber = no)
     then (ball does not bounce)

 R4:  if   (size = small)
          (rubber = yes)
     then (ball does bounce)

These rules fall into two sets, each sharing a common conclusion: the first consists of R1 and R3, the second of R2 and R4.
We can eliminate one of these sets and replace it with the rule:

 if   (no other rule fires)
 then (perform these common actions)

Both sets have the same number of rules (two), but the second set has more conditions than the first (three versus two). So we eliminate the second set and replace it with:

 if   (no other rule fires)
 then (ball does bounce)

Our final rule-base is:

 R1: if   (size = large)
     then (ball does not bounce)

 R2: if   (rubber = no)
     then (ball does not bounce)

 R3: if   (no other rule fires)
     then (ball does bounce)
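As a sanity check, the final rule-base (with the default rule tried last) can be applied to the training set. Assuming the size/rubber/bounce values reconstructed from the text, it classifies all eight balls correctly.

```python
# (size, rubber, bounces) triples reconstructed from the worked example
balls = ([("small", "yes", True)] * 2 + [("small", "no", False)] * 2
         + [("medium", "yes", True)] + [("large", "yes", False)] * 2
         + [("large", "no", False)])

def classify(size, rubber):
    if size == "large":      # R1
        return False
    if rubber == "no":       # R2
        return False
    return True              # R3: no other rule fired

print(all(classify(size, rubber) == bounces
          for size, rubber, bounces in balls))  # True
```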

Appendix G — Additional Online Resources

  • A much more in-depth examination of the Rete Method
  • Another source about ID tree machine learning
  • CLIPS: A tool for building Expert Systems
  • FuzzyCLIPS is an extension of the CLIPS expert system shell.
  • A list of papers from CiteSeer
  • Companion for a book, there’s a section for Rule-Based Systems
  • Some class notes (not mine) on Rule-Based Systems
  • Some class notes (not mine) on Expert Systems

Naturally, Google and Yahoo are good places to start looking on your own.

Written by James Freeman-Hargis.
