Virgil Language Overview

Syntax

// method declaration
method myMethod(a: type1): type2  
// field declaration
field myField: type1 = . . .;
// local declaration
local myLocal: type1 = . . .;
// switch cases don't fall through  
switch ( e ) {
   case (0) stmt;
   case (1, 2) { . . . } 
}

The syntax of Virgil is similar to languages such as C, Java, and C#, and is immediately readable to programmers familiar with those languages. Virgil contains familiar control structures such as if, for, while, and do...while. Like C, Virgil uses curly braces { } to represent blocks of code in loops, branches, method bodies, etc.

However, Virgil adjusts a few syntactic categories. In declarations of methods, fields, and locals, the type is placed after the name rather than before it. The syntax and semantics of switch statements also differ: cases do not fall through by default, and multiple case values can instead be grouped to share one case body.
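
For readers coming from Java, the difference in switch semantics can be sketched in Java itself. This is a comparison, not Virgil code; the class SwitchDemo and the method classify are hypothetical names. The explicit break statements and the stacked case labels are what the Virgil form shown above is described as making unnecessary.

```java
// Java sketch (not Virgil): the boilerplate that fall-through semantics require.
class SwitchDemo {
    static String classify(int e) {
        String result;
        switch (e) {
            case 0:
                result = "zero";
                break;              // without this break, Java falls through into the next case
            case 1:
            case 2:                 // Java groups values by stacking labels; Virgil writes case (1, 2)
                result = "one or two";
                break;
            default:
                result = "other";
                break;
        }
        return result;
    }
}
```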

Basic Types, Arrays, and Strings

local f: int = 42;
local g: char = 'A';
local h: boolean = true;
local i: int[] = { 0, 1, 7 }; 
local j: char[] = "Xanadu";

Virgil provides three basic primitive types. These primitive types are value types, and quantities of these types are never passed by reference.

int - a signed, 32-bit integer type with arithmetic operations
char - an 8-bit quantity for representing ASCII characters
boolean - a true / false type for representing conditions

Additionally, Virgil provides the array type constructor [] that can construct the array type T[] from any type T. Arrays have their length fixed upon allocation, and are always passed by reference. The index must be of type int, and all element accesses are checked against the bounds of the array and produce a language-level exception upon failure. However, unlike Java, Virgil arrays are not objects, and are not covariantly typed [1].

String constants in Virgil are represented with the char[] type, avoiding the need for a built-in string type and associated functionality. String constants are written in the familiar style with support for escape characters. String constants are immutable and stored in the read-only heap [2].

[1] = that is, B being a subtype of A does not imply that B[] is a subtype of A[].
[2] = read-only arrays are not yet implemented in the prototype compiler.
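
To illustrate footnote [1], the following Java sketch shows what covariant array typing permits. It is Java, not Virgil, and the classes Animal, Cat, Dog, and CovarianceDemo are hypothetical. In Java the assignment of a Cat[] to an Animal[] variable compiles, and the ill-typed store is caught only by a run-time check; a language without covariant arrays can reject the assignment statically instead.

```java
// Java sketch (not Virgil): covariant arrays defer a type error to run time.
class Animal { }
class Cat extends Animal { }
class Dog extends Animal { }

class CovarianceDemo {
    // Returns true if storing a Dog into a Cat[] viewed as an Animal[] fails at run time.
    static boolean storeFails() {
        Cat[] cats = new Cat[1];
        Animal[] animals = cats;        // legal in Java because Cat[] is a subtype of Animal[]
        try {
            animals[0] = new Dog();     // type-checks statically, but the array can only hold Cats
            return false;
        } catch (ArrayStoreException e) {
            return true;                // Java detects the mismatch with a run-time store check
        }
    }
}
```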

Components

component MyComponent {
   field value: int;
   method reset(v: int): int {    
      local old = value;
      value = v;
      return old;
   }
}

A Virgil component is a collection of fields and methods that form a cohesive unit of functionality. Each component is instantiated only once, and its non-private fields and methods are accessible from other code in the program through the component's name. Components thus provide global state and behavior, but that state is encapsulated and scoped within modules, which reduces the problems associated with global state in languages like C and C++.

Classes and Inheritance

class Parent {
   field value: int;
   method init(v: int) { ... }   
}
class Child extends Parent {
   method init(v: int) { ... }   
   method reset(v: int) { ... }   
}

Virgil is a class-based language that is most closely related to Java, C++, and C#. Like Java, Virgil provides only single inheritance between classes, and all methods are virtual by default except those declared private. Objects are always passed by reference, never by value. However, like C++, Virgil has no universal superclass akin to java.lang.Object from which classes inherit by default. Virgil differs from C++ in two important ways: it is strongly typed, which forces dynamic type tests for explicit casts, and it does not provide pointer types.
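
The two behaviors named here, virtual dispatch and dynamically checked casts, also exist in Java, so a Java sketch can show what they mean in practice. This is an analogy rather than Virgil code: the names Base, Derived, and CastDemo and the method bodies are hypothetical stand-ins for the Parent and Child classes above.

```java
// Java sketch (not Virgil): virtual dispatch and a dynamically checked downcast.
class Base {
    int value;
    void init(int v) { value = v; }
}

class Derived extends Base {
    @Override
    void init(int v) { value = v * 2; }    // hypothetical override, chosen so dispatch is observable
    void reset(int v) { value = v; }
}

class CastDemo {
    // The static type is Base, but the override in Derived runs.
    static int dispatch() {
        Base b = new Derived();
        b.init(3);
        return b.value;
    }

    // An explicit downcast is tested at run time and fails for a plain Base object.
    static boolean badCastFails() {
        Base b = new Base();
        try {
            Derived d = (Derived) b;
            d.reset(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```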

Compile-time Initialization

component Strobe {
   field tree: Tree = new Tree();
   constructor() {
      tree.add(6, 9, 11, 300);
      . . .;
      tree.balance();
   } 
   method runtime() {
      while ( true ) {
         local v = . . .;
         if ( tree.find(v) ) . . .;
      }
   }
}

The most significant feature of Virgil that is not in other mainstream languages is the concept of initialization time. To avoid the need for a large runtime system that dynamically manages heap memory and performs garbage collection, Virgil does not allow applications to allocate memory from the heap at runtime. Instead, the Virgil compiler runs the application's initialization routines while the program is being compiled. These routines allow the program to pre-allocate and initialize all data structures that it will need at runtime. This computation phase is Turing-complete, which means that the application can perform any computation, including building complex data structures such as trees and hash tables, building numerical approximation tables, allocating resource pools, and generally configuring itself.

When the application's initialization routines terminate, the live fields of the components of the program serve as the roots for computing the reachable heap of the program. Unreachable objects and arrays that were allocated during the initialization phase are discarded. The reachable heap is then compiled directly into the program binary and is immediately available to the program at runtime. The Virgil compiler also optimizes the implementation of the program against this reachable heap and removes dead code with an optimization known as reachable members analysis, which is described in more detail in the available publications.
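
The reachable-heap computation described above can be sketched as a graph traversal from the roots. The following Java code is only a model, not the Virgil compiler's implementation: HeapObject, its refs list, and ReachableHeap are hypothetical names, and a real compiler would follow typed fields and array elements rather than a generic reference list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model of a heap object: a name plus its outgoing references.
class HeapObject {
    final String name;
    final List<HeapObject> refs = new ArrayList<>();
    HeapObject(String name) { this.name = name; }
}

class ReachableHeap {
    // Worklist traversal from the roots (the live component fields).
    // Objects that are never visited correspond to the ones that would be discarded.
    static Set<HeapObject> reachable(List<HeapObject> roots) {
        Set<HeapObject> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<HeapObject> work = new ArrayDeque<>();
        for (HeapObject r : roots) {
            if (r != null && seen.add(r)) work.push(r);
        }
        while (!work.isEmpty()) {
            HeapObject o = work.pop();
            for (HeapObject next : o.refs) {
                if (next != null && seen.add(next)) work.push(next);
            }
        }
        return seen;
    }
}
```

In this model, an object allocated during initialization but referenced only by other discarded objects never enters the result set, which mirrors the statement that unreachable objects and arrays are dropped before the heap is compiled into the binary.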

Delegates

class List {
   field head: Link;
   method add(i: Item) { . . . }
   method apply(f: function(Item)) {
      local p = head;
      for ( ; p != null; p = p.next )  
         f(p.item);
   }
}
component Client {
   method printAll(list: List) {
      list.apply(print);
   }
   method append(src: List, dst: List) {
      src.apply(dst.add);
   }
   method print(i: Item) { . . . }
}

Virgil strikes a compromise between the functional and object-oriented paradigms by borrowing the delegate concept from C#. A delegate is a first-class value that represents a reference to a method. A delegate in Virgil may be bound to a component method or to an instance method of a particular object; both kinds can be used interchangeably, provided the argument and return types match. Delegate types are declared using the function type constructor. For example, function(int): int is the type of a function that takes a single integer argument and returns an integer.

Delegate syntax generalizes the common expr.method(args) notation for instance method calls in C# and Java by allowing expr.method to denote a delegate value and expr(args) to denote the application of the delegate value expr to the arguments. Unlike C#, the use of delegate values, either creation or application, does not require allocating memory from the heap. Delegates are implemented as a tuple of a pointer to the bound object and a pointer to the delegate's code.
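
Java's bound method references offer a rough analogy to the delegate usage in the List and Client example. The sketch below is Java, not Virgil; ItemList, DelegateDemo, and the use of Integer in place of Item are hypothetical simplifications. One difference is worth noting: a Java method reference may allocate an object, whereas Virgil delegates are described above as requiring no heap allocation.

```java
import java.util.ArrayList;
import java.util.function.Consumer;

// Java sketch (not Virgil): a list whose apply method takes a function value.
class ItemList {
    private final ArrayList<Integer> items = new ArrayList<>();

    void add(Integer i) { items.add(i); }
    int size() { return items.size(); }
    int get(int index) { return items.get(index); }

    // Counterpart of apply(f: function(Item)): call f on every element in order.
    void apply(Consumer<Integer> f) {
        for (Integer i : items) f.accept(i);
    }
}

class DelegateDemo {
    // Counterpart of Client.append: dst::add pairs the receiver dst with the code of add,
    // much like the (object, code) tuple described for Virgil delegates.
    static void append(ItemList src, ItemList dst) {
        src.apply(dst::add);
    }
}
```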

Raw Types

component RawTest {
   field f1: 7;  // a 7-bit field
   field f2: 16; // a 16-bit field
   field f3: 23; // a 23-bit field  

   field f4: 16 = 0xffea;
   field f5: 3  = 0b001;

   method mask(a: 32): 16 {
      return a & 0xffff; // mask with a 16-bit constant
   }
   method getBit(a: 32, b: int): 1 {  
      return a[b];
   }
}

Virgil makes an explicit distinction between the basic type int and bit-level quantities such as hexadecimal or binary constants. To represent bit-level values, Virgil defines a family of types called the raw types, each of which represents values of a particular known width in bits. These types are denoted simply by the integer constants 1 to 64. Syntactic context keeps raw types from being confused with integer constants that represent values: in field f1: 7, for example, the 7 appears in a type position and therefore denotes a 7-bit raw type.

The bitwise operators such as & (and), | (or), ^ (xor), << (shift left), >> (shift right), and ~ (one's complement) are defined on raw types, preserving more precise information about the width in bits of program quantities, which is both more convenient and less error prone than masking and shifting in other languages. Values of raw type can also be indexed like an array with the [] subscript operator, which allows reading or writing of individual bits with convenient notation. For convenience, values of type int can be implicitly converted to 32-bit raw values, values of type char can be converted to 8-bit raw values, and values of type boolean can be converted to 1-bit raw values.
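
For comparison, this is roughly the manual masking and shifting that the subscript notation replaces in a language without raw types. The sketch is Java, not Virgil; BitOps and its methods are hypothetical names, and it assumes that bit index 0 refers to the least significant bit, which the text above does not specify.

```java
// Java sketch (not Virgil): bit access on a 32-bit int using shifts and masks.
class BitOps {
    // Manual counterpart of reading a[b]; assumes bit 0 is the least significant bit.
    static int getBit(int a, int b) {
        return (a >>> b) & 1;
    }

    // Manual counterpart of writing a[b] = v; returns the updated word.
    static int setBit(int a, int b, int v) {
        return (v & 1) != 0 ? a | (1 << b) : a & ~(1 << b);
    }

    // Keeps the low 16 bits. The result is still a 32-bit int,
    // so the type system no longer records that only 16 bits are meaningful.
    static int low16(int a) {
        return a & 0xFFFF;
    }
}
```

The last method shows the information loss that raw types are meant to avoid: in Java the narrowed width exists only in the programmer's head or in a comment.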

Hardware Access and Interrupts

program P {
   entrypoint "main" = 
      Blink.entry;
   entrypoint "timer0_interrupt" = 
      Blink.interrupt;
}
component Blink {
   method entry() {
      device.TIMER = 0b1001;
   }
   method interrupt() {
      device.PIN0 ^= 0b0001;
   }
}

Virgil is designed to allow hardware drivers to be written in the language without relying on underlying unsafe features. Virgil allows hardware state in the form of device registers to be exposed to the program, which can then read and write these registers as if they were fields of a special component named device. Device registers have a raw type, which is the number of bits of storage in the register. The names and sizes of the registers available to the program are determined by the target hardware device.

Programmers can write interrupt handlers for device and microcontroller interrupts in Virgil and connect them directly to the hardware interrupt routine. This is done through a master program declaration, which attaches methods declared in the program to entrypoints corresponding to the main entrypoint and hardware interrupts of the device.
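
The relationship between the program declaration and the handlers can be simulated in Java as a table from entrypoint names to methods. This is only a model under stated assumptions: SimDevice, BlinkSim, and EntrypointTable are hypothetical names, the registers are modeled as plain int fields, and the table is built at run time, whereas the Virgil program declaration is resolved by the compiler.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the device component; registers are modeled as ints.
class SimDevice {
    int TIMER;
    int PIN0;
}

// Counterpart of the Blink component, operating on the simulated device.
class BlinkSim {
    final SimDevice device;
    BlinkSim(SimDevice device) { this.device = device; }

    void entry() { device.TIMER = 0b1001; }        // configure the timer
    void interrupt() { device.PIN0 ^= 0b0001; }    // toggle the lowest pin bit
}

// Counterpart of the program declaration: names bound to handler methods.
class EntrypointTable {
    private final Map<String, Runnable> table = new HashMap<>();

    void bind(String name, Runnable handler) { table.put(name, handler); }

    // Simulates the hardware invoking an entrypoint; unknown names are ignored.
    void fire(String name) {
        Runnable handler = table.get(name);
        if (handler != null) handler.run();
    }
}
```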