Breaking Through the Normalization Barrier: A Self-Interpreter for F-omega

1. Introduction
2. Step 1: Installation
3. Step 2: Evaluation

The full paper is available here. The appendix contains proofs of all theorems stated in the paper.

1 Introduction

The main result of the paper is that System F-omega supports typed self-representation with several operations, including a self-interpreter. The purpose of the artifact is to provide tools that help build, type check, and experiment with representations and their operations. The artifact includes:

A parser for our concrete syntax of System F-omega, which includes syntactic sugar for defining abbreviations, loading definitions from other files, and building representations of terms and types.
An implementation of System F-omega, including a type checker, term and type quoters, an evaluator, and a beta-equivalence checker.
Source code in our concrete syntax for all the operations described in the paper.
A suite a System F-omega test files.

2 Step 1: Installation

2.1 Requirements

The tests should be run on a modern Unix-like system (e.g. Linux or Mac OS X) with at least 4 Gigabytes of RAM. We assume that the Haskell Platform has been installed and ghc and cabal are in $PATH.

2.2 Get the sources

Download from http://compilers.cs.ucla.edu/popl16/popl16-artifact.zip

Run the commands:

$ unzip popl16-artifact.zip
$ cd popl16-artifact/

2.3 Build the F-omega implementation

The implementation of F-omega is written in Haskell, and uses the cabal package manager. The following command will download and compile all the Haskell dependencies, and build and test the F-Omega implementation.

$ bin/build.sh

3 Step 2: Evaluation

3.1 Description of concrete syntax

Our concrete syntax is based on the formulation of System F-omega as a Pure Type System (PTS). In the paper, we use $\Lambda$ for abstraction over types in terms, while PTS notation uses $\lambda$ for all abstractions. Also, the paper uses $\forall$ for universal quantification and PTS uses $\Pi$. Our concrete syntax uses a backslash \ for all lambda abstractions, and Pi for universal quantification.

Syntax in Paper	Syntax in Artifact
$\forall \alpha:*. \alpha \to \alpha$	`Pi A:*. A -> A`
$\Lambda \alpha:*. \lambda x:\alpha. x$	`\A:*. \x:A. x`

Our parser supports the declaration of abbreviations via the %decl directive. For example, the following example (taken from fomega/lib/Id.fw) declares id to be the polymorphic application function, and Id to be its type.

%decl Id : *  = Pi A:*. A -> A;
%decl id : Id = \A:*. \x:A. x;

We can load all the declarations from a file in fomega/lib/ with the %load directive. For example, the following example (taken from fomega/id.fw) loads the above declarations from fomega/lib/Id.fw and then applies id to itself.

%load "Id";
id Id id

Note that fomega/id.fw ends with a term, while fomega/lib/Id.fw does not. The files in fomega/lib are called libraries and consist of zero or more %load and %decl statements. The files in fomega/ are called programs and consist of zero or more %load and %decl statements followed by a term to be evaluated.

Another important difference between the artifact and the paper is the notation for representations of terms and types. The paper uses an overline notation $\overline{\cdot}$ to denote the representations of terms and types, while the artifact uses $[\cdot]$. The artifact does not have syntax for pre-representations of terms or types.

	Syntax in Paper	Syntax in Artifact
Representation of a type	$\overline{\forall \alpha:*. \alpha \to \alpha}$	`[Pi A:*. A -> A]`
Representation of a term	$\overline{\Lambda \alpha:*. \lambda x:\alpha. x}$	`[\A:*. \x:A. x]`

3.2 The Test Types

This artifact includes three types of test. Each type of test is to be run by the corresponding test runner. The types are:

Typecheck tests, which check if a program typechecks successfully. These tests are run with the bin/runTypecheckTest.sh test runner. Examples:
```
$ bin/runTypecheckTest.sh fomega/typecheckTestSuccess.fw
Succeeded.
$ bin/runTypecheckTest.sh fomega/typecheckTestFailure.fw
runTypecheckTest: unexpected app type: 
l = Var (Der (Var (Der (Star (Der (Box TypeOfBox)))) "A")) "x"
ty l = Var (Der (Star (Der (Box TypeOfBox)))) "A"
```
The second example attempts to typecheck the term \A:*. \x:A. x x, which fails. This demonstrates that the test runner does in fact detect and report type errors (i.e. it doesn't simply output "Succeeded." for all inputs). It also demonstrates that our type errors are not user-friendly. In this artifact, we will not use the error messages. We use typecheck tests only to determine whether a program type checks.
An equivalence test, which checks if two terms or types are $\beta$-equivalent. The program should be a church pair or terms or types. The runner extracts each component, normalizes them, and tests that the normal forms are $\alpha$-equivalent. These tests are run with the bin/runEquivTest.sh test runner.
```
$ bin/runEquivTest.sh fomega/equivTestSuccess.fw
Succeeded.
$ bin/runEquivTest.sh fomega/equivTestFailure.fw
Failed.
Not alpha-equivalent: 
   Lam () "3593" (Star ()) (...
   Lam () "3608" (Star ()) (...
```
Again, the second test demonstrates that our equivalence test runner does catch and report failures when the two terms are not equivalent. In this case, it fails because the church encoding of 0 is not equivalent to the church encoding of 1. Again, we will not use these error messages.

3.3 The Tests

We now describe the various test files included in the artifact, and relate them to the results in the paper. You can run each test individually, or run them all via the command:

$ bin/runAllTests.sh

3.3.1 Section 5

In Section 5 we discuss our representation of types, and show two example representations. The following tests check each example, by comparing a representation of a type built by our quoter with one built by hand.

$ bin/runEquivTest.sh fomega/reprIdType1.fw
$ bin/runEquivTest.sh fomega/reprIdType2.fw

Theorem 5.1 states: if a type has kind $\kappa$, then its representation has type $\kappa \to *$. The file fomega/Theorem5-1.fw contains a number of examples:

$ bin/runTypecheckTest.sh fomega/Theorem5-1.fw

3.3.2 Section 6

In Section 6 we discuss our representation of terms, and give two examples. The following tests check each example, by comparing a representation of a type built by our quoter with one built by hand.

$ bin/runEquivTest.sh fomega/reprId.fw
$ bin/runEquivTest.sh fomega/reprSelfApply.fw

Theorem 6.1 states: a term of type $\tau$ has a representation of type $\tt{Exp}\ \tau$. The file fomega/Theorem6-1.fw contains a number of examples:

$ bin/runTypecheckTest.sh fomega/Theorem6-1.fw

3.3.3 Section 7

In Section 7 we discuss our operations, and state several correctness theorems. The appendix includes detail proofs of each theorem. The following tests provide some examples.

Theorem 7.3: Correctness of Id.

$ bin/runEquivTest.sh fomega/Theorem7-3a.fw
$ bin/runEquivTest.sh fomega/Theorem7-3b.fw
$ bin/runEquivTest.sh fomega/Theorem7-3c.fw
$ bin/runEquivTest.sh fomega/Theorem7-3d.fw

Theorem 7.4: Correctness of unquote.

$ bin/runEquivTest.sh fomega/Theorem7-4a.fw
$ bin/runEquivTest.sh fomega/Theorem7-4b.fw
$ bin/runEquivTest.sh fomega/Theorem7-4c.fw
$ bin/runEquivTest.sh fomega/Theorem7-4d.fw

Theorem 7.5: Correctness of KBool.

$ bin/runEquivTest.sh fomega/Theorem7-5a.fw
$ bin/runEquivTest.sh fomega/Theorem7-5b.fw
$ bin/runEquivTest.sh fomega/Theorem7-5c.fw
$ bin/runEquivTest.sh fomega/Theorem7-5d.fw

Theorem 7.6: Correctness of isAbs.

$ bin/runEquivTest.sh fomega/Theorem7-6a.fw
$ bin/runEquivTest.sh fomega/Theorem7-6b.fw
$ bin/runEquivTest.sh fomega/Theorem7-6c.fw
$ bin/runEquivTest.sh fomega/Theorem7-6d.fw

Theorem 7.7: Correctness of KNat.

$ bin/runEquivTest.sh fomega/Theorem7-7a.fw
$ bin/runEquivTest.sh fomega/Theorem7-7b.fw
$ bin/runEquivTest.sh fomega/Theorem7-7c.fw
$ bin/runEquivTest.sh fomega/Theorem7-7d.fw

Theorem 7.8: Correctness of size.

$ bin/runEquivTest.sh fomega/Theorem7-8a.fw
$ bin/runEquivTest.sh fomega/Theorem7-8b.fw
$ bin/runEquivTest.sh fomega/Theorem7-8c.fw

Theorem 7.9: Correctness of KBools.

$ bin/runEquivTest.sh fomega/Theorem7-9a.fw
$ bin/runEquivTest.sh fomega/Theorem7-9b.fw
$ bin/runEquivTest.sh fomega/Theorem7-9c.fw
$ bin/runEquivTest.sh fomega/Theorem7-9d.fw

Theorem 7.10: Correctness of nf.

$ bin/runEquivTest.sh fomega/Theorem7-10a.fw
$ bin/runEquivTest.sh fomega/Theorem7-10b.fw
$ bin/runEquivTest.sh fomega/Theorem7-10c.fw
$ bin/runEquivTest.sh fomega/Theorem7-10d.fw

Correctness of CPS (the operation on types that specifies the result type of cps).
```
$ bin/runEquivTest.sh fomega/CPSId.fw
```

Correctness of cps (the operation of terms).

$ bin/runEquivTest.sh fomega/cpsId1.fw
$ bin/runEquivTest.sh fomega/cpsId2.fw
$ bin/runEquivTest.sh fomega/cpsId3.fw
$ bin/runEquivTest.sh fomega/cpsSelfApply1.fw
$ bin/runEquivTest.sh fomega/cpsSelfApply2.fw
$ bin/runEquivTest.sh fomega/cpsSelfApply3.fw
$ bin/runEquivTest.sh fomega/cpsSelfApply4.fw
$ bin/runEquivTest.sh fomega/cpsSelfApply5.fw

3.3.4 Section 8

In Section 8, we state that each of our operations is self-applicable, meaning that it can be applied to a representation of itself. The following tests demonstrate this.

$ bin/runEquivTest.sh fomega/selfApply_unquote.fw
$ bin/runEquivTest.sh fomega/selfApply_isAbs.fw
$ bin/runTypecheckTest.sh fomega/selfApply_size.fw
$ bin/runEquivTest.sh fomega/selfApply_nf.fw
$ bin/runTypecheckTest.sh fomega/selfApply_cps.fw

Note that we only test that the self-applications of size and cps type-check, since it is not practical to express the expected result for an equivalence test.

Syntax in Paper	Syntax in Artifact
\(\forall \alpha:*. \alpha \to \alpha\)	`Pi A:*. A -> A`
\(\Lambda \alpha:*. \lambda x:\alpha. x\)	`\A:*. \x:A. x`

	Syntax in Paper	Syntax in Artifact
Representation of a type	\(\overline{\forall \alpha:*. \alpha \to \alpha}\)	`[Pi A:*. A -> A]`
Representation of a term	\(\overline{\Lambda \alpha:*. \lambda x:\alpha. x}\)	`[\A:*. \x:A. x]`