|
- Synopsis
- The LRSX Unification Tool implements a new algorithm for unification of meta-expression in lambda calculi with recursive and shared bindings (i.e. so-called letrec-expressions).
The underlying meta-language and the unification algorithm are designed to capture usual call-by-need lambda-calculi included their reduction strategies.
Besides the core utility to unify expressions of the meta-language, the tool can overlap reduction rules with program transformations on a given calculus description as input,
and compute the corresponding critical pairs,
The LRSX Unification Tool is implemented in Haskell and packaged via Cabal and comes with
a 3-clause BSD-like LICENSE.
- Download and Building
-
-
The source-distribution is available here:
lrsx-0.3.0.4.tar.gz
- A recent installation of the Glasgow Haskell Compiler, the base libraries, and Cabal-Install is required (see https://www.haskell.org/downloads)
-
To build lrsx unzip the archive, then run cabal, i.e.:
tar -xzf lrsx-0.3.0.4.tar.gz
cd lrsx-0.3.0.4/
cabal configure
cabal build
This will create the executable lrsx in dist/build/lrsx/ , e.g. type
dist/build/lrsx/lrsx help
to see the help-text of the LRSX Unification tool.
- The executable
lrsx can be installed via typing cabal install
If this completes successfully you will have the lrsx-binary in ~/.cabal/bin .
- Documentation
-
Using LSRX
The LRSX Unification Tool prints several help-texts by calling lrsx help or more command specific help by typing lrsx command help .
LRSX mainly supports two kinds of input files:
- Unification equations with constraints:
For those input files call lrsx unify file, where file contains the input.
LRSX will unify the input and print the output to stdout.
- Program calculus descriptions and overlap commands:
lrsx overlap file computes the critical pairs, requested by the commands in file.
Both formats share a common syntax for expressions and constraints as described below.
Unification equations with constraints
The allowed syntax can be represented by the following grammar in ENBF,
where nonterminals are in bold font and the side condition that
in derivations from EXPRESSION the symbol HOLE does not occur, except
for deriviation starting with CONTEXT where the symbol HOLE occurs exactly once.
CONSTRAINED_EQUATIONS | ::= | EQUATIONS |
| | | EQUATIONS where CONSTRAINTS |
EQUATIONS | ::= | EQUATION |
| | | EQUATION , EQUATIONS |
EQUATION | ::= | EXPRESSION =?= EXPRESSION |
EXPRESSION | ::= | letrec ENVIRONMENT in EXPRESSION |
| | | letrec { ENVIRONMENT } in EXPRESSION |
| | | SEXPRESSION |
SEXPRESSION | ::= | FUNSYMWITHINFO BINDEXPR1...BINDEXPRN |
| | | ATOM |
FUNSYMWITHINFO | ::= | FUNSYM | FUNSYM[ STRICTPOSITIONS] |
STRICTPOSITIONS | ::= | INTEGER,...,INTEGER |
BINDEXPR | ::= | ATOM | VARIABLE. BINDEXPR |
| | | * VARIABLE | (* VARIABLE) |
ATOM | ::= | var VARIABLE |
| | | ( EXPRESSION ) |
| | | METAVARIABLE |
| | | CONTEXTVAR [ EXPRESSION ] |
| | | HOLE |
ENVIRONMENT | ::= | ENVITEM |
| | | ENVITEM ; ENVIRONMENT |
ENVITEM | ::= | METAVARIABLE |
| | | METAVARIABLE [ VARIABLE , EXPRESSION ] |
| | | METAVARIABLE | VARIABLE , VARIABLE | |
| | | VARIABLE = EXPRESSION |
VARIABLE | ::= | CONCRETEVAR | METAVARIABLE |
CONCRETEVAR | ::= | (a |...|z )(a |...|z |A |...|Z |0 |...|9 )* |
FUNSYM | ::= | CONCRETEVAR |
METAVARIABLE | ::= | (B |D |...|S |U |...|Z )(a |...|z |A |...|Z |0 |...|9 )* |
CONTEXTVAR | ::= | ACONTEXT | TCONTEXT | CCONTEXT |
ACONTEXT | ::= | A (a |...|z |A |...|Z |0 |...|9 )* |
TCONTEXT | ::= | T (a |...|z |A |...|Z |0 |...|9 )* |
CCONTEXT | ::= | C (a |...|z |A |...|Z |0 |...|9 )* |
CONSTRAINTS | ::= | CONSTRAINT |
| | | CONSTRAINT , CONSTRAINTS |
CONSTRAINT | ::= | METAVAR /= {} |
| | | CONTEXTVAR /= [] |
| | | ( EXPRESSION , CONTEXT ) |
| | | ( ENVIRONMENT , CONTEXT ) |
CONTEXT | ::= | EXPRESSION |
An informal explanation of the syntax is
- A valid input is a sequence of equations where equations are separated by
,
and after the equations a sequence of constraints may occur which is
introduced by the keyword where
-
Equations are written with
=?= as equation symbols, and both sides of one
equation are LRSX-expressions.
-
LRSX-expressions are meta-expressions which comprise concrete syntax and meta-variables,
symbols beginning with upper-case letters are unification variables:
unification variables beginning with
A are interpreted as application context variables
T are interpreted as top context variables
C are interpreted as arbitrary context variables
Other unification variables are
- expression variables (usually denoted by
S )
- lambda- or let-variable variables (usually denoted by
X or Y )
- letrec-environment variables (usually denoted by
E )
- environment chain variables with special syntax
- Varname
[y,s]
represents a (non-empty) application context chain of
the form y=A[y1];y2=A[y2];...;yn=A[s] (usually denoted by EE[y,s] )
-
Varname
|y1,yn| represents a (non-empty) variable-to-variable chain of
the form y1 = (var y2);y2=(var y3);...;yn-1=(var yn) (usually denoted by VV|y1,yn| )
The syntax of expressions includes:
- the special function symbol
var to lift
let- and lambda-variables to expressions,
- other function symbols are recognized by the input,
where function symbols must begin with a lower case letter.
The algorithm assumes the first argument of a function symbol to be strict, while all
other positions are nonstrict.
An example is
(app S1 S2) which introduces the function symbol app with arity 2
However, strictness can be added by providing a list of strict positions, i.e. app[1,2] defined app
to be strict in both arguments.
- Higher-order expressions are written as
x1.x2....xn.s , i.e. lambda abstraction is a function symbol applied to
a higher-order expression with on binder, which can be written as \(var.body) but also without parentheses as \var.body .
- Variable-positions are denoted by the special symbol
* , e.g. in app S *X the function symbol
app only allows variables as second argument.
letrec Environment in body where body is the in-expression and Environment is a meta-letrec-environment:
environments may be enclosed by { and } . Parts of the environment are separated by ; and allowed parts are
- a meta-variable for an environment (usually denoted by
E )
- an environment chain (see above, usually denoted by
EE[y,s] or VV|y1,yn| )
- single bindings, written as
var = expr
Expressions must fulfill the weak distinct variable convention which requires that
names of bound variables are different from names of free variables, However, it is allowed that bound (let- or lambda-) variables occur several times,
where it is forbidden, that the same let-variable is bound twice in the same letrec-environment.
The following constraints can be added to the input equations (separated by , ):
-
non-emptiness constraints for contexts written as
CtxtName /= [] with the semantics
that the context represented by CtxtName must not be the empty context.
-
non-emptiness constraints for environment variables written as
EnvName /= {} with the
the semantics that the part of the environment represented by EnvName is not an empty environment
-
non-capture constraints written as
(expr,ctxt) or [env,ctxt] with the meaning that
the free variables of expression expr (environment env , resp.) is not captured by the hole of
the context ctxt . The syntax for contexts is the same as for expressions, where additionally
the terminal HOLE can (and must be) used to denote the context hole.
An example is (var Y,\X.HOLE) which ensures that the variable Y may not be instantiated with the
same variable as X is instantiated, since in this case Y is captured by the hole of \X.HOLE .
Program calculus descriptions and overlap commands
The grammar for the syntax of program calculus descriptions and
commands extends the above grammar as follows, where the start
symbol is RC
RC | ::= | PRULES COMMANDS |
PRULES | ::= | RULES | RULES TRANSFORMATIONS RULES |
RULES | ::= | RULE | RULES |
RULE | ::= | { RULENAME } EXPRESSION ==> EXPRESSION |
| | | { RULENAME } EXPRESSION ==> EXPRESSION where CONSTRAINTS |
RULENAME | ::= | SR, RNAME, Number | RNAME, Number
|
RNAME | ::= | (a |...|z )(a |...|z |A |...|Z |0 ...9 |_ )* |
COMMANDS | ::= | COMMAND | COMMAND COMMANDS |
COMMAND | ::= | overlap ( RULENAME) .DIR and ( RULENAME) .DIR |
| | | overlap ( RULENAME) .DIR all |
DIR | ::= | l | r |
The calculus description begins with a sequence of rules followed
by a sequence of commands.
The sequence of rules may be partitioned into two sets of rules
where the first set are standard reductions and the second set are
program transformations. The sets are separated by the keyword TRANSFORMATIONS .
Rules are written as {rulename} left hand side ==> right hand side
followed by an optional where -clause to add constraints to the rule.
Both sides of the rule are expressions. The rulename should be unique
for every rule and it must start with a lower-case letter. If it starts with SR, then it denotes a standard reduction, otherwise it denotes a transformation rule. The rulename also have a number. If no number is given, then the number is always 1.
Commands are of the form
overlap ( rulename1,Number) .dir1
and
( rulename2,Number) .dir2
where dir is either l or r , to express which side of the rule should
be used for overlapping.
For computing forking diagrams the left hand side of a transformation
has to be overlapped with the left hand sides of standard reductions,
while for computing commuting diagrams, the right hand side of a
transformation hat to be overlapped with the left hand sides of
standard reductions.
As a short form also the command
overlap (rulename1,Number).dir1 all
is available which computes all overlaps of rulename1.dir1 with
all left hand sides of all standard reductions (i.e. the rules which are named with SR,...
).
- Examples
The source distribution comes with several examples which can be found in examples .
Examples for Unification
The following table lists the example-input files and the corresponding output generated by the LRSX Unification Tool:
Examples for Computing Overlaps
We provide two examples of program calculi with recursive let:
- The calculus LNEED is a minimal lambda-calculus with letrec.
Its syntax comprises abstractions, applications, variables and letrec.
The description of the calculus and several program transformations
can be found in the file
LNEED.inp.
Computing forking overlaps can be done with the input file
lneed_fork.inp
which requires
LNEED.inp.
in the same directory.
Running
lrsx overlap lneed_fork.inp
produces the critical pairs which can be found
in the following text-file
.
Computing commuting overlaps can be done with the input file
lneed_commute.inp
which requires
LNEED.inp.
in the same directory.
Running
lrsx overlap lneed_commute.inp
produces the critical pairs which can be found
in the following text-file
.
-
The calculus LR extends the calculus LNEED by data constructors,
case-expressions and Haskell's seq-operator (LR can be seen as an untyped core language of Haskell)
The description of the calculus and several program transformations
can be found in the file
LR.inp.
Computing forking overlaps can be done with the input file
lr_fork.inp
which requires
LR.inp.
in the same directory.
Running
lrsx overlap LR_fork.inp
produces the critical pairs which can be found
in the following text-file
(~389 MB!).
Computing commuting overlaps can be done with the input file
lr_commute.inp
which requires
lr.inp.
in the same directory.
Running
lrsx overlap lr_commute.inp
produces the critical pairs which can be found
in the following text-file
(~455 MB!).
- Contact
-
David Sabel
|