# Structure

## Modules¶

A module is defined by a single flowspec file. A module can contain several sections, for defining control-flow, data flow, types, and functions. Modules can import other modules.

module $module-id imports$module-ref*

$section*  ## Terms and Patterns¶ FlowSpec defines various data types, including terms, tuples, sets, and maps. These can be constructed by the user, or introduced by matching on the AST. term = ctor-id "(" {term ","}* ")" | "(" {term ","}* ")" | "{" {term ","}* "}" | "{" term "|" {term ","}* "}" | "{" {(term "|->" term) ","}* "}" | "{" term "|->" term "|" {(term "|->" term) ","}* "}"  Examples of these terms can be found in the Expressions subsection. Control flow and data flow rules can use patterns to define which rules apply to which AST nodes. pattern = ctor-id "(" {pattern ","}* ")" | "(" {pattern ","}* ")" | var-id "@" pattern | "_" | var-id  Example. The following shows an example of a pattern, matching a VarDec constructor with an Int child, binding some of the subterms. VarDec(n, _, num@Int(i))  ## Control Flow¶ The control-flow section contains the rules that define the control-flow for the subject language. control-flow rules$control-flow-rule*


### Control Flow Rules¶

A control-flow rule consists of a pattern and a corresponding list of control-flow chains.

control-flow-rule = "root"? pattern "=" {cfg-chain ","}+
| "node" pattern

cfg-chain = {cfg-chain-elem "->"}+

cfg-chain-elem = "entry"
| "exit"
| variable
| "node" variable
| "this"


Example. Module that specifies how the control-flow for the Add AST node goes from the lhs, the rhs, and then to the Add itself. It also specifies that Int must have a node in the control-flow graph.

module control

control-flow rules

node Int(_)
Add(l, r) = entry -> l -> r -> this -> exit


### Root Rules¶

A root of the control-flow defines the start and end nodes of a control-flow graph. You can have multiple control-flow graphs in the same AST, but not nested ones. Each control-flow graph has a unique start and end node. A root control-flow rule introduces the start and end node. In other control-flow rules these nodes can be referred to for abrupt termination.

cfg-chain-elem = ...
| "start"
| "end"


Example. Module that defines control-flow for a procedure, and the return statement that goes straight to the end of the procedure.

module control

control-flow rules

root Procedure(args, _, body) = start -> args -> body -> end
Return(_) = entry -> this -> end


## Data Flow¶

### Properties¶

The data flow section contains definitions of the properties to compute, and the rules that define how these properties are computed.

properties

property-definition*


A property has a name, and a corresponding lattice type. The result after analysis will be a lattice of this type for each node in the control-flow graph.

property-definition = name ":" lattice


Example. Lattice definition for a constant-value analysis.

properties

values: Map[name, Value]


### Rules¶

The data flow rules specify how data should flow across the control-flow graph.

property rules

property-rule*

property-rule = name "(" prop-pattern ")" "=" expr
prop-pattern = name "->" pattern
| pattern "->" name
| pattern "." "start"
| pattern "." "end"


Example. A simple specification for a constant-value analysis.

property rules

values(_.end) = Map[string, Value].bottom
values(prev -> VarDec(n, _, Int(i))) = { k |-> v | (k |-> v) <- values(prev), k != n } \/ {n |-> Const(i)}
values(prev -> VarDec(n, _, _))      = { k |-> v | (k |-> v) <- values(prev), k != n } \/ {n |-> Top()}
values(prev -> _) = values(prev)


## Types¶

Algebraic datatypes can be defined for use within lattices definitions. Users can directly match these datatypes, or construct new values.

types

type-definition*


An algebraic datatype consists of a constructor and zero or more arguments.

name =
("|" ctor-id "(" {type ","}* ")")+


Example. The definition for an algebraic type ConstProp used in constant value analysis.

types

ConstProp =
| Top()
| Const(int)
| Bottom()


## Lattices¶

Lattices are the main data type used in data-flow analysis, because of their desirable properties. Properties (the analysis results) must always be of type lattice. FlowSpec contains some builtin lattice types, but users can also specify their own.

lattices

lattice-definition*


Lattice definitions must include the following: the underlying datatype, a join operator (either least-upper bound or greatest-lower bound), a top, and a bottom.

name where
type = type
lub([name], name) = expr
top = expr
bottom = expr


Example. A lattice definition using the ConstProp above to define a Value type.

lattices
Value where
type = ConstProp
bottom = Bottom()
top = Top()
lub(l, r) = match (l, r) with
| (Top(), _) => Top()
| (_, Top()) => Top()
| (Const(i), Const(j)) => if i == j then Const(i) else Top()
| (_, Bottom()) => l
| (Bottom(), _) => r


## Functions¶

Functions make it possible to reuse functionality and avoid duplication of logic.

functions

function-definition*

name([{(name ":" type) ","}+]) =
expr


## Expressions¶

### Integers¶

Integer literals are written with an optional minus sign followed by one or more decimals.

Supported integer operations are:

• Addition [+]
• Subtraction [-]
• Multiplication [*]
• Division [/]
• Modulo [%]
• Negate [-]
• Comparison [<, <=, >, >=, ==, !=]

### Booleans¶

Boolean literals true and false are available as well as the usual boolean operations:

• And [&&]
• Or [||]
• Not [!]

### Sets and Maps¶

Set and map literals are both denoted with curly braces. A set literal contains a comma-separated list of elements: {elem1, elem2, elem3}. A map literal contains a comma-separated list of bindings of the form key |-> value: { key1 |-> value1, key2 |-> value2 }.

Operations on sets and maps include

• Union [\/]
• Intersection [/\]
• Set/map minus [\]
• Containment/lookup [in]

There are also comprehensions of the form { new | old <- set, conditions } or { newkey |-> newvalue | oldkey |-> oldvalue <- map, condition }, where new elements or bindings are gathered based on old ones from a set or map, as long as the boolean condition expressions hold. Such a condition expression may also be a match expression without a body for the arms. This is commonly used to filter maps or sets.

Example. The following are some examples of sets and maps.

// A map comprehension filtering the key n
{ k |-> v | (k |-> v) <- values(prev), k != n }

// A map literal
{n |-> Top()}

// A set comprehension filtering the value n
{ k | k <- live(prev), k != n }

// A set literal
{ n, "b", "foo" }


### Match¶

Pattern matching can be done with a match expression: match expr with | pattern1 => expr2 | pattern2 => expr2, where expr are expressions and pattern are patterns. Terms and patterns are defined at the start of the reference.

### Variables and References¶

Pattern matching can introduce variables. Other references include values in the lattice, such as MaySet.bottom or MustSet.top.

### Functions and Lattice Operations¶

User defined functions are invoked with functionname(arg1, arg2). Lattice operations can be similarly invoked, requiring the type name: MaySet.lub(s1, s2).

### Property Lookup¶

Property lookup is similar to a function call, although property lookup only ever has a single argument.

Example. The following property rule performs a set comprehension over the results of a property lookup, live(prev), where the property live has been declared in the properties section, and next is bound in the pattern.

live(VarDec(n, _, _) -> next) = { k | k <- live(next), k != n }


### Term Positions¶

FlowSpec provides a builtin function that returns the position of a term: position(term). This can be used to differentiate two terms from an AST that are otherwise equal.

## Lexical Grammar¶

### Identifiers¶

Most identifiers in FlowSpec fall into one of two categories, which we will refer to as:

Lowercase identifiers, that start with a lowercase character, and must match the regular expression [a-z][a-zA-Z0-9]*.
Uppercase identifiers, that start with an uppercase character, and must match the regular expression [A-Z][a-zA-Z0-9]*.


// ... single line ... for single-line comments