The implementation

In the text that follows, the java documentation for a particular class or function is reachable by clicking on the corresponding name.

Overview

On a semantic level, the system models the interactions of three parties. There is a learner trying to infer a target language by querying a teacher, and a scribe which does not perform any other action than that it record the actions of others (and in this using the utility class StopWatch). After the target language has been acquired, the statistics thus derived - for example, the number of equivalence queries made by the Learner, the aggregated size of the counterexamples returned by the teacher - are formatted by Layout and reported to the user.

The interface between learner and teacher

The learner communicates with the teacher through a sequence of equivalence and membership queries. In practice, this means that the learner calls the teachers functions
	boolean membership (Tree tree) 
and
	Tree equivalence (FiniteTreeAutomata fta) , 
and the teacher responds in accordance with the MAT interface.

The teacher

The teacher is created with constructor

	void teacher (FiniteTreeAutomaton fta) , 
where the fta is a finite tree automaton that defines the target language U = L(fta) for the learning session. The class Language contains a number of static fta's that that represent different regular tree language. If, for example, the teacher is created with parameter Language.simpleEnglish, then Simple English is what will be taught. The teacher saves fta as teacher.fta and uses it as an internal representation of U.

If asked, through a membership query, whether a tree tree is a member of U, the teacher simply returns the result of invoking function

	boolean accepts (Tree t) 
of teacher.fta on tree.

When faced with an equivalence query with parameter fta, the teacher constructs a regular tree grammar G which recognises the symmetric difference between L(teacher.fta) and L(fta). It then generates a tree t in L(G) by first picking productions at random, but once that tree that is to become t has grown beyond a certain size, favouring terminating rules. When the derivation has terminated, t is returned as a counterexample to the conjecture that fta recognises U (that is, L(teacher.fta)).

The learner

When the learner is created, through

	learner (Teacher teacher) ,
the parameter teacher is a reference to the oracle that will teach it. The creation of both teacher and learner takes place in
	static public void main () 
of of class Test. After also the Scribe has been initialised, the learning session begins with a call to
	FiniteTreeAutomata Learner.learn () . 

In Learner.learn(), a sequence of tasks that can roughly be described as

    [List 1]
  1. build a candidate fta
  2. ask the teacher if the fta is correct
  3. if it is not, collect a counterexample from the teacher, and
  4. derive more information about the target language from the counterexample that the teacher provides
is repeated over and over again until the teacher finally confirms that the target language has been acquired by returning null on an equivalence query.

To build a candidate fta, the learner gathers one set of trees S to represent states, and another set of trees R to represent transitions. To store S and R in an efficient manner, a directed acyclic graph dag is used (also described at [Wik]).

The observation table

Just having S and R is, in it self, not sufficient to build an fta. The learner must also be able to decide what state or transition a tree in these sets should be taken to represent. For this purpose, the learner also keeps a set of contexts C that can be used to partition the trees in S and R into equivalence classes. Information about whether a tree c[t] is in U, where t in S or R, and c in C, is kept at position (c,t) in the ObservationTable learner.obs.

The observation table is not merely a static container for information. Since learner.obs contains all information that is needed to synthesise an fta, it is delegated this task by the learner. When it synthesises a new fta, learner.obs uses the trees of S as states, and translates each element in R into a Transition, which are then combined to to form a Transition table.

To save work, learner.obs saves the last automata in a public field fta, which it only updates through a call to

	void synthesise () , 
when absolutely necessary, that is, when a new state has been found. New transitions, on the other hand, can simply be added to the existing fta, so no re-synthetisation is required.

Apart from synthesising candidate fta's, the observation table is also responsible for the extraction of new states and transitions from the counterexamples that the teacher supplies. When the learner has received a counterexample from the teacher, it feeds this to learner.obs through a call to the function

	Tree extend (Tree context, Tree counterexample, boolean reference) , 
where context is initially the empty context, and reference is the teachers opinion about whether counterexample is in U or not. In extend, the observation table searches the counterexample in a bottom-up fashion, until it finds either a new transition, or a pair of a new state and an new context, which is promptly assimilated to complete the last task of List 1.