Architecture of the Existing (110.97) Match Compiler

================================================================================
These are early notes made during the "discovery" phase while trying
to understand the workings of the Aitken match compiler
[amc1992]. They are largely concerned with working out the types and
the roles played by different types or components.
Most of these notes are not relevant to the new match compiler [mc2020].
================================================================================

This version of the match compiler was written by Bill Aitken in the
summer or 1992 and appeared somewhere between version 0.74
(1991.10.10) and 0.93 (1992.02.15).

1. Types in MCCommon

The result of OR expanding a rule is an _OR-family_, a set of patterns with
a common rhs lexp, and a common set of bound pattern variables


type ruleno = int
   index, zero-based, of a rule in a match (pre- or post- ?) OR expansion
   E.g. "allRules" defined in matchComp, which is the list after OR expansion

type ruleset = ruleno list = int list
   a list (ordered set) of rule indices, maintained in strictly ascending order
   should be an abstract type?
   
type dconinfo = datacon * tyvar list
   a dataconstructor (occurring in a pattern) with an "associated"? list of tyvars
   [Q: what is the relation between the datacon and the tyvars?]

* datacons and constants appearing in patterns  

  datatype pcon
    DATApcon of dconinfo --  a dataconstructor annotated with tyvars
    INTpcon, WORDpcon -- integer and word constants, with value (int IntConst.t)
    STRINGpcon of string -- a string constant
    VLENpcon of int * ty -- vector construction, with length and element type

* destructuring paths
    paths for accessing the position corresponding to a pattern-bound variable
    not "linear" because of the RECORDPATH constructor

  datatype path
    RECORDPATH of path list    -- ???
    PIPATH of int * path       -- a path to the ith record component?
    VPIPATH of int * ty * path -- a path to the ith vector component, of type ty
    DELTAPATH of pcon * path   -- a path after stripping off the given pcon,
                                 where the pcon must be DATApcon, or possibly VLENpcon [Q:]
    ROOTPATH                   -- path corresponding to top node

* andor trees
    AND-OR trees where all children (subtrees) of an AND note have to be matched, 
    at least one of the children (cases) of an OR note must be matched.  LEAF nodes
    correspond to atomic matches of either constants or varialbes [Q:]
    Each node has a list of _bindings_.
    Each node has a list of _constraints_.
    [Q: explain the nature and role of bindings and constraints.]
      [Q: do constraints arrise from layered patterns?]
    OR nodes involve matching one of the dataconstructors of a datatype, whose "shape"
      is characterized by sign: DA.consig
    Produced by the call of makeAndor in matchComp.

  datatype andor
    AND   -- all subtrees must match
      subtrees : andor list 
      bindings : (ruleno * var) list
   [obs -- constraints : constraint list ]
    
    OR    -- one of the andor-cases must be matched, determined by pcon
      cases : andor_case list
      sign : DA.consig
      bindings : (ruleno * var) list  -- variables bound at this node
                                      -- and the rule in which they are bound
         [obs -- constraints : constraint list ]

    LEAF  -- analysis complete (except what to do with constraints)
      bindings : (ruleno * var) list
   [obs -- constraints : constraint list ]

   [obs --
    constraint
      dcon : dconinfo
      rules : ruleset
      andorOp : andor option ]
      
    what are bindings?
       binding = ruleno * var
         a variable bound at a node and the rule it belongs to

    what are constraints? where do they come from? and what do they mean?
       constraints used to arrise when processing general layered
       patterns where the left pattern was structured (not a var)
    should the common bindings, [constraints] fields be factored out of the datatype?

* decision

  datatype decision
    CASEDEC
    ABSCONDEC -- gone, not needed
    BINDDEC

* decision tree
    decision trees are the penultimate product, finally translated into an lexp
    produced by genDecisionTree,
    consumed by translation to lexp (codeTy) by generate

  datatype dectree
    CASETEST  -- switch on a datacon

    ABSTEST   -- ? arrise from ABSCONDEC in pass1, cause error in
                 pass2. Related to layered patterns?
        
    BIND      -- bind a variable and continue matching (layering)

    RHS       -- dispatch to corresponding RHS


* preProcessRule: preprocessing the match rules
  input: (type translation), rule (pat * lexp)
  output: multiRule

  type multiRule =
     {pats: (pat * path list) list,
        -- one for each orFamily pattern, thus just one if no OR;
        -- for each pat, path list gives access paths for the pattern
           bound vars
        -- INVARIANT: all path list components have same length = # of
           bound vars	      
      fname: lvar,
        -- a newly generated lvar uses as an "identifier" for the rule
           family, maintains connection between patterns and the shared rhs lexp
      rhsFun: Plambda.lexp}
        -- the previously translated shared RHS expression (lexp),
           abstracted over pat-bound variables (or over unit if none)
     

  val preProcessRule : toLtyTy    (* type to FLINT type translators *)
                       -> Absyn.pat * Plambda.lexp    (* a rule *)
                       -> multiRule

  type matchRep = multiRule list
    -- type of multiRules in matchComp, defined by mapping
       preProcessRule over original match rules

  type matchRepTy = ((path * pat) list * path list * lvar) list


* makeAndor: building AND-OR tree from expanded set of lhs patterns
  input: list of LHS patterns after OR-expansion
  output: andor

* flattenAndor: andor * path * ruleset -> decision list
  input: andor (produced by makeAndor), ruleset
  output: (path? * decision list) list
  -- called once in matchComp, result bound to "flattened"

* flattenAndors : (path * andor) list * ruleset -> (path list * decision list) list

* fireConstraint: path                                -- initially ROOTPATH
                  * (path list * decision list) list  -- from flattenAndors (needPaths, decisions)
                  * decision list                     -- ready: initially nil
		  * (path * decision list) list       -- delayed: initially nil
                  -> decision list                    -- new ready
		  * (path * decision list) list       -- new delayed
  input: path?, (path? * decision) list (from flattenAndors),
         (ready: decision list; accum), (delayed: (path * decision list) list; accum)
  output: ready, delayed
  called in matchComp, result bound to "ready_delayed"

* genDecisionTree: decision list                      -- "ready"
                * (path * decision list) list         -- "delayed"
		* ruleset                             -- initially allRules
		-> dectree
  input: (decision list (ready) * (path * decision list) list (delayed)),
         ruleSet (allRules)
  output: dectree (decision tree)
  construct decision tree from output of fireConstraint

ready : decision list
delayed : (path * decision list) list

makeAndor:      pat list  ---------------->  andor  (new version)

makeAndor:      matchRep               ---------------->  [(path,andor)]
flattenAndors : [(path,andor)],ruleset ---------------->  decision list * (path * decision list) list
fireConstraint: path,(paths,decisions),ready,delayed -->  ready, delayed
genDecisionTree: ready,delayed,ruleset ---------------->  decisiontree
