diff options
Diffstat (limited to 'doc/limbo/limbo.ms')
| -rw-r--r-- | doc/limbo/limbo.ms | 5070 |
1 files changed, 5070 insertions, 0 deletions
diff --git a/doc/limbo/limbo.ms b/doc/limbo/limbo.ms new file mode 100644 index 00000000..84f25533 --- /dev/null +++ b/doc/limbo/limbo.ms @@ -0,0 +1,5070 @@ +...FP lucidasans +...FP palatino +.FP palatino +." .fp 6 RR R +.nr PI 2n +.de P1 +.DS L +.ft CW +.ps -1 +.vs -1u +.ta .75i 1.5i 2.25i 3i 3.75i 4.5i 5.25i 6i 6.75i 7.5i +.. +.de P2 +.ft R +.ps +1 +.vs +1u +.DE +.. +.de s1 +.DS +.br +.ft I +.. +.de s2 +.br +.DE +.ft R +.. +...ds CF "Copyright 2000 Lucent Technologies Inc. Modifications Vita Nuova Limited. +.TL +The Limbo Programming Language +.AU +Dennis M. Ritchie +.br +(Revised 2005 by Vita Nuova) +.SP .5i exactly +.PP +Limbo is a programming language intended for applications +running distributed systems on small computers. +It supports modular programming, +strong type checking at compile- and run-time, +interprocess communication over typed channels, +automatic garbage collection, +and simple abstract data types. +It is designed for safe execution even on +small machines without hardware memory protection. +.PP +In its implementation for the Inferno operating system, +object programs generated by the Limbo compiler run +using an interpreter for a fixed virtual machine. +Inferno and its accompanying virtual machine run either stand-alone +on bare hardware +or as an application under conventional operating systems like +Unix, Windows 2000, Linux, FreeBSD, MacOSX, and Plan 9. +For most architectures, including +Intel x86, ARM, PowerPC, MIPS and Sparc, Limbo object programs +are transformed on-the-fly into instructions for the underlying hardware. +.NH 1 +Overview and introduction +.PP +A Limbo application consists of one or more +.I modules , +each of which supplies an interface declaration and +an implementation part. +A module that uses another module +includes its declaration part. +During +execution, a module dynamically attaches another module by +stating the other module's type identifier and a place from which to load +the object code for its implementation. +.PP +A module declaration specifies the functions and data it will make visible, +its data types, and constants. +Its implementation part defines the functions and data visible at its interface and +any functions associated with its data types; +it may also contain definitions for functions used only internally and +for data local to the module. +.PP +Here is a simple module to illustrate the flavour of the language. +.P1 +1 implement Command; + +2 include "sys.m"; +3 include "draw.m"; + +4 sys: Sys; + +5 Command: module + { +6 init: fn (ctxt: ref Draw->Context, argv: list of string); +7 }; +.P2 +.P1 +8 # The canonical "Hello world" program, enhanced +9 init(ctxt: ref Draw->Context, argv: list of string) +10 { +11 sys = load Sys Sys->PATH; +12 sys->print("hello world\en"); +13 for (; argv!=nil; argv = tl argv) +14 sys->print("%s ", hd argv); +15 sys->print("\en"); +16 } +.P2 +A quick glance at the program reveals that +the syntax of Limbo is influenced by C in its expressions, +statements, and some conventions (for example, look at lines 13-14), +and also by Pascal and its successors (the declarations on lines 4, 6, 9). +When executed in the Inferno environment, the program writes +.CW hello +.CW world +somewhere, then echoes its arguments. +.PP +Let's look at the program line-by-line. +It begins (line 1) by saying that this is the implementation of module +.CW Command . +Line 2 includes a file (found in a way analogous to C's +.CW #include +mechanism) named +.CW sys.m . +This file defines the interface to module +.CW Sys ; +.ds CF +it says, in part, +.P1 +Sys: module { + PATH: con "$Sys"; + . . . + print: fn (s: string, *): int; + . . . +}; +.P2 +This declares +.CW Sys +to be the type name for a module containing among other things a +function named +.CW print ; +the first argument of +.CW print +is a string. +The +.CW * +in the argument list specifies that further arguments, of +unspecified type, may be given. +.PP +Line 3 includes +.CW draw.m ; +only one piece of information, mentioned below, +is used from it. +Line 4 declares the variable +.CW sys +to be of type +.CW Sys ; +its name will be visible throughout the remainder of the file +describing this module. +It will be used later to refer to an instance of the +.CW Sys +module. +This declaration initializes it to +.CW nil ; +it still needs to be set to a useful value. +.PP +Lines 5-7 constitute the declaration of +.CW Command , +the module being implemented. +It contains only a function named +.CW init , +with two arguments, a +.CW ref +.CW Draw->Context +and a list of strings, +and it doesn't +return any value. +The +.CW ref +.CW Draw->Context +argument would be used if the program did any +graphics; it is a data type defined in +.CW draw.m +and refers to the display. +Since the program just writes text, it won't be used. +The +.CW init +function isn't special to the Limbo language, +but it is conventional in the environment, +like +.CW main +in C. +.PP +In a module designed to be useful +to other modules in an application, it would be wise to +take the module declaration for +.CW Command +out, put it in a separate file called +.CW command.m +and use +.CW "include +.CW command.m +to allow this module and others to refer to it. +It is called, for example, by the program loader in the Inferno +system to start the execution of applications. +.PP +Line 8 is a comment; everything from the +.CW # +to the end of line is ignored. +.PP +Line 9 begins the definition for the +.CW init +function that was promised in the module's declaration +(line 6). +The argument that is a list of strings is named +.CW argv . +.PP +Line 11 connects the program +being written to the +.CW Sys +module. +The first token after +.CW load +is the target module's name as defined by its interface +(here found in the +.CW include +on line 2) +The next token is the place +where the code for the module can be found; it is a string +that usually names a file. +Conventionally, in the Inferno system, +each module contains a constant declaration for the name +.CW PATH +as a string that names the file where +the object module can be found. +Loading the file is performed dynamically during +execution except for a few modules built +into the execution environment. +(These include +.CW Sys ; +this accounts for the peculiar file name +.CW "$Sys" +as +the value of +.CW PATH .) +.PP +The value of +.CW load +is a reference to the +named module; line 11 assigns it +to the variable +.CW sys +for later use. +The +.CW load +operator dynamically loads the code for the named +module if it is not already present and instantiates +a new instance of it. +.PP +Line 12 starts the work by printing a familiar message, +using the facilities provided by module +.CW Sys +through its handle +.CW sys ; +the notation +.CW sys->print(...) +means to call the +.CW print +function of the module referred to by +.CW sys . +The interface of +.CW Sys +resembles a binding to some of +the mechanisms of Unix and the ISO/ANSI C library. +.PP +The loop at lines 13-14 takes the +.CW "list +.CW of +.CW string +argument to +.CW init +and iterates over it using the +.CW hd +(head) and +.CW tl +(tail) operators. +When executed, this module combines the +traditional `Hello world' and +.CW echo . +.NH 1 +Lexical conventions +.PP +There are several kinds of tokens: +keywords, identifiers, constants, strings, expression operators, +and other separators. +White space (blanks, tabs, new-lines) is ignored except that +it serves to separate tokens; sometimes it is required +to separate tokens. +If the input has been parsed into tokens up to a particular +character, the next token is taken to include the longest +string of characters that could constitute a token. +.PP +The native character set of Limbo is Unicode, +which is identical with the first 16-bit plane of the ISO 10646 standard. +Any Unicode character may be used in comments, or in strings +and character constants. +The implementation assumes that source files use the UTF-8 representation, +in which 16-bit Unicode characters are represented as sequences +of one, two, or three bytes. +.NH 2 +Comments +.PP +Comments begin with the +.CW # +character and extend to the end of the line. +Comments are ignored. +.NH 2 +Identifiers +.PP +An identifier is a sequence of letters and digits +of which the first is a letter. +Letters are the Unicode characters +.CW a +through +.CW z +and +.CW A +through +.CW Z , +together with the underscore character, and +all Unicode characters with encoded values greater than 160 +(A0 hexadecimal, the beginning of the range corresponding to Latin-1). +.PP +Only the first 256 characters in an identifier +are significant. +.NH 2 +Keywords +.PP +The following identifiers are reserved for use as keywords, +and may not be used otherwise: +.P1 +.ta 1i 2i 3i 4i 5i + adt alt array big + break byte case chan + con continue cyclic do + else exit fn for + hd if implement import + include int len list + load module nil of + or pick real ref + return self spawn string + tagof tl to type + while +.P2 +The word +.CW union +is not currently used by the language. +.NH 2 +Constants +.PP +There are several kinds of constants for denoting values of the +basic types. +.PP +.NH 3 +Integer constants +.PP +Integer constants have type +.CW int +or +.CW big . +They can be represented in several ways. +.PP +Decimal integer constants consist of a sequence of decimal +digits. +A constant with an explicit radix +consists of a decimal radix followed by +.CW R +or +.CW r +followed by the digits of the number. +The radix is between 2 and 36 inclusive; +digits above 10 in the number +are expressed using letters +.CW A +to +.CW Z +or +.CW a +to +.CW z . +For example, +.CW 16r20 +has value 32. +.PP +The type of a decimal or explicit-radix number is +.CW big +if its value exceeds +.CW 2\u31\d\(mi1 , +otherwise it is +.CW int . +.PP +Character constants consist of a single Unicode +character enclosed within single-quote characters +.CW ' . +Inside the quotes the following escape sequences represent +special characters: +.DS +\f(CW\e\e\fR backslash +\f(CW\e'\fR single quote +\f(CW\e"\fR double quote +\f(CW\ea\fR bell (BEL) +\f(CW\eb\fR backspace (BS) +\f(CW\et\fR horizontal tabulation (HT) +\f(CW\en\fR line feed (LF) +\f(CW\ev\fR vertical tabulation (VT) +\f(CW\ef\fR form feed (FF) +\f(CW\er\fR carriage return (CR) +\f(CW\eu\fIdddd \fRUnicode character named by 4 hexadecimal digits +\f(CW\e0\fR NUL +.DE +Character constants have type +.CW int . +.NH 3 +Real constants +.PP +Real constants consist of a sequence of decimal digits +containing one period +.CW . +and optionally followed by +.CW e +or +.CW E +and then by a possibly signed integer. +If there is an explicit exponent, the period is +not required. +Real constants have type +.CW real . +.NH 3 +Strings +.PP +String constants are sequences of Unicode characters contained in double +quotes. +They cannot extend across source lines. +The same escape sequences listed above for character +constants are usable within string constants. +Strings have type +.CW string . +.NH 3 +The nil constant +.PP +The constant +.CW nil +denotes a reference to nothing. +It may be used where an object of a reference +type is expected; +otherwise uninitialized values of reference type +start off with this value, it can be assigned to +reference objects, and reference types can be +tested for equality with it. +(The keyword has other uses as well.) +.NH 2 +Operators and other separators +.PP +The operators are +.P1 +.ta .5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i 5i 5.5i 6.0i 6.5i + + - * / % & | ^ + == < > <= >= != << >> + && || <- :: + = += -= *= /= %= &= |= ^= <<= >>= + := + ~ ++ -- ! ** +.P2 +The other separators are +.P1 +.ta .5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i 5i 5.5i + : ; ( ) { } [ ] + , . -> => +.P2 +.NH 1 +Syntax notation +.PP +In this manual, Limbo syntax is described by a modified BNF +in which syntactic categories are named in an +.I italic +font, and literals in +.CW typewriter +font. +Alternative productions are listed on separate lines, and +an optional symbol is indicated with +the subscript ``opt.'' +.NH 1 +Types and objects +.PP +Limbo has three kinds of objects. +.I Data +objects exist in the storage associated with +a module; they can be manipulated by arithmetic operations, +assignment, selection of component entities, and other concrete +operations. +Each data object has a type that determines what can be stored +in it and what operations are applicable. +.PP +The second kind of object is the +.I function . +Functions are characterized by the types of the arguments they +accept and the values they return, and are associated with +the modules in which they are defined. +Their names can be made visible in their module's declaration, +or they can be encapsulated within the +.CW adt +(abstract data types) of their modules, +or they can exist privately within their module. +.PP +Finally, Limbo programs are organized into +.I modules : +a named collection of constants, abstract data types, +data, and functions made available by that module. +A module declaration displays the +members visible to other modules; +the module's implementation +defines both the publicly visible members and its +private parts, including the data objects it uses. +A module that wishes to use +the facilities of another includes its declaration in order to +understand what it exports, but +before using them it explicitly loads the new module. +.NH 2 +Types +.PP +Limbo has several basic types, some built-in higher abstractions, +and other ways of composing new types. +In declarations and some other places, constructions naming +a type are used. +The syntax is: +.s1 +type: + data-type + function-type +.s2 +Functions will be discussed in §7 below. +First, data types will be explored. +.NH 2 +Data types +.PP +The syntax of data types is +.s1 +data-type: + CbyteI + CintI + CbigI + CrealI + CstringI + tuple-type + Carray of Idata-type + Clist of Idata-type + Cchan of Idata-type + adt-type + Cref Iadt-type + Cref Ifunction-type + module-type + module-qualified-type + type-name + +data-type-list: + data-type + data-type-list C,I data-type +.s2 +Objects of most data types have +.I value +semantics; when they +are assigned or passed to functions, the destination receives a copy of the +object. +Subsequent changes to the assigned object itself have no effect on +the original object. +The value types are +.CW byte , +.CW int , +.CW big , +.CW real , +.CW string , +the +.CW tuple +types, and +abstract data types or +.CW adt . +The rest have +.I reference +semantics. +When they are assigned, the quantity actually assigned +is a reference to (a pointer to) an underlying object +that is not copied; thus changes or operations on +the assigned value affect the original object. +Reference types include lists, arrays, channels, modules, +.CW ref +.CW adt , +and +.CW ref +.CW fn +types. +.NH 3 +Basic types +.PP +The five basic data types are denoted by +.CW byte , +.CW int , +.CW big , +.CW real , +and +.CW string . +.PP +Bytes are unsigned 8-bit quantities. +.PP +Integers +.CW int ) ( +are 32-bit signed quantities represented in two's complement +notation. +Large integers +.CW big ) ( +are 64-bit signed quantities represented in two's complement notation. +.PP +Real numbers +.CW real ) ( +are 64-bit quantities represented in the +IEEE long floating notation. +.PP +The +.CW byte , +.CW int , +.CW big , +and +.CW real +types are collectively called arithmetic types. +.PP +Strings are rows of Unicode characters. +They may be concatenated and extended character-by-character. +When a string is indexed with a single subscript, it yields an integer +with the Unicode encoding of the character; +when it is indexed by a range, it yields another string. +.NH 3 +Tuple type +.PP +The +.I tuple +type, denoted +.s1 +tuple-type: + C( Idata-type-listC )I +.s2 +is a type consisting of an ordered collection of two or more objects, +each having its own data type. +For each tuple type, the types of the members are +fixed, but need not be identical; +for example, a function might return a tuple containing +an integer and a string. +Each tuple type is characterized solely by the +the order and identity of the types it contains. +Objects of tuple type may be assigned to a list of identifiers (to pick out the +components), and a parenthesized, comma-separated list of expressions +denotes a tuple. +.NH 3 +Array types +.PP +The +.I array +type describes a dynamically-sized row of objects, all of the same +type; it is indexed starting from 0. +An array type is denoted by +.s1 + Carray of Idata-type +.s2 +The size of an array is not part of its type; instead +it is part of the value. +The +.I data-type +may itself be an array, to achieve a multidimensional array. +.NH 3 +List types +.PP +A +.I list +is a sequence of like-typed objects; its denotation is +.s1 + Clist of Idata-type +.s2 +A list is a stack-like object, optimized for +a few operations: get the head (the first object), +get the tail (the rest of the list), place an object at the beginning. +.NH 3 +Channel types +.PP +A +.I channel , +whose type is written +.s1 + Cchan of Idata-type +.s2 +is a communication mechanism capable of sending and receiving objects of the +specified type to another agent in the system. +Channels may be used to communicate between local processes; +using library procedures, they may be connected +to named destinations. +In either case +.I send +and +.I receive +operations may be directed to them. +For example, +.P1 + chan of (int, string) +.P2 +is the type of a channel that transmits tuples consisting of +an integer and an string. +Once an instance of such a channel (say +.CW c ) +has been declared and initialized, +the statement +.P1 + c <-= (123, "Hello"); +.P2 +sends such a tuple across it. +.NH 3 +Abstract data types +.PP +An abstract data type or +.I adt +is an object that can contain data objects of several +different types and declare +functions that operate on them. +The syntax for declaring an +.CW adt +is given later. +Once an +.CW adt +has been declared, the identifier associated with it +becomes a data-type name. +.s1 +adt-type: + identifier + module-qualified-type +.s2 +.PP +There is also a +.CW ref +.CW adt +type representing a reference (pointer) to an +.CW adt . +It is denoted +.s1 + Cref Iadt-type +.s2 +where the identifier is the name of an +.CW adt +type. +.NH 3 +Module types +.PP +A module type name is an identifier: +.s1 +module-type: + identifier +.s2 +The identifier is declared as a module identifier by a +.I module-declaration , +as described in §6.5 below. +An object of module type serves as a handle for the +module, and is used to access its functions. +.NH 3 +Module-qualified type +.PP +When an +.CW adt +is declared within a module declaration, the type name of that +.CW adt +is not generally visible to the rest of the program unless a specific +.CW import +request is given (see §6.6, §10 below). +Without such a request, when +.CW adt +objects implemented by a module are declared by a client +of that module, the +.CW adt +type name is qualified: +.s1 +module-qualified-type: + identifier C->I identifier +.s2 +Here the first identifier is either the name of a module +or a variable of the module type; +the second is the name of a type +mentioned in the module declaration. +.NH 3 +Function reference types +.PP +A function reference type represents a reference to a function of a given type. +It is written as +.s1 + Cref Ifunction-type +.s2 +Function types are discussed in §4.3 below. +.NH 3 +Named types +.PP +Finally, data types may be named, using a +.CW type +declaration; this is discussed in §6.4 below. +.s1 +type-name: + identifier +.s2 +.NH 2 +Function types +.PP +A function type characterizes the arguments and return value of +a function. The syntax is +.s1 +function-type: + Cfn Ifunction-arg-ret + +function-arg-ret: + C( Iformal-arg-listOC ) IraisesO + C( Iformal-arg-listOC ) : Idata-type raisesO + +formal-arg-list: + formal-arg + formal-arg-listC , Iformal-arg + +formal-arg: + nil-or-ID-listC : Itype + nil-or-IDC : self refO Iidentifier + nil-or-IDC : self Iidentifier + C*I + +nil-or-ID-list: + nil-or-ID + nil-or-ID-list C, Inil-or-ID + +nil-or-ID: + identifier + CnilI + +raises: + Craises ( Inil-or-ID-listC )I + CraisesI nil-or-ID + +.s2 +That is, the denotation of a function type has the keyword +.CW fn +followed by a comma-separated list of its arguments +enclosed in parentheses, +and perhaps followed by the type the function returns. +Absence of a return value means that the function returns no +value: it is a procedure. +The names and types of arguments are specified. +However, the name of an argument may be replaced by +.CW nil ; +in this case it is nameless. +For example, +.P1 + fn (nil: int, nil: int): int + fn (radius: int, angle: int): int + fn (radius, angle: int): int +.P2 +all denote exactly the same type, +namely a function of two integers that returns an integer. +As another example, +.P1 + fn (nil: string) +.P2 +is the type of a function that takes a string argument +and returns no value. +.PP +The +.CW self +keyword has a specialized use within +.CW adt +declarations. +It may be used only for the first argument +of a function declared within an +.CW adt ; +its meaning is discussed in §6.3 below. +.PP +The star character +.CW * +may be given as the last argument in a function type. +It declares that +the function is variadic; during a call, actual arguments at its +position and following are passed in a manner +unspecified by the language. +For example, the type of the +.CW print +function of the +.CW Sys +module is +.P1 + fn (s: string, *): int +.P2 +This means that the first argument of +.CW print +is a string and that other arguments may be given when the function +is called. +The Limbo language itself has no way of accessing these arguments; +the notation is an artifice for describing facilities +built into the runtime system, such as the +.CW Sys +module. +.PP +The type of a function includes user-defined exceptions that it raises, +which must be listed in a corresponding +.CW raises +clause. +.NH 1 +Limbo programs +.PP +Limbo source programs that implement modules are stored in files, +conventionally named with the suffix +.CW .b . +Each such file begins with a single +.CW implement +directive naming the type of the module being implemented, +followed by a sequence of declarations. +Other files, conventionally named with the suffix +.CW .m , +contain declarations for things obtainable from other modules. +These files are incorporated by an +.CW include +declaration in the implementation modules that need them. +At the top level, a program consists of a sequence +of declarations. +The syntax is +.s1 +program: + Cimplement Iidentifier-listC ; Itop-declaration-sequence + +top-declaration-sequence: + top-declaration + top-declaration-sequence top-declaration + +top-declaration: + declaration + identifier-listC := IexpressionC ;I + identifier-listC = IexpressionC ;I + C( Iidentifier-listC ) := IexpressionC ;I + module-declaration + function-definition + adt-declaration +.s2 +The +.CW implement +declaration at the start identifies the type of the module that +is being implemented. +The rest of the program consists of a sequence of various kinds of +declarations and definitions that announce the names +of data objects, types, and functions, and also create +and initialize them. +It must include a module declaration for the module +being implemented and the objects it announces, +and may also include declarations for the functions, data +objects, types, and constants used privately within the module +as well as declarations for modules used by it. +.PP +Declarations are used both at the top +level (outside of functions) and also inside functions +and module declarations. +Some styles of declaration +are allowed only in certain of these places, +but all will be discussed together. +.PP +Most implementation modules provide an implementation for one type of module. +Several module types may be listed, however, in the +.CW implement +declaration, when the implementation module implements them all. +When the same name appears in more than one such module type, it must +have the same type. +.NH 1 +Declarations +.PP +Declarations take several forms: +.s1 +declaration: + identifier-listC : ItypeC ;I + identifier-listC : ItypeC = IexpressionC ;I + identifier-listC : con IexpressionC ;I + Iidentifier-listC : import Iidentifier C;I + identifier-listC : typeI typeC ;I + identifier-listC : exceptionI tuple-typeO + Cinclude Istring-constantC ;I + +identifier-list: + identifier + identifier-listC , Iidentifier + +expression-list: + expression + expression-listC , Iexpression +.s2 +.NH 2 +Data declarations +.PP +These forms constitute the basic way to declare and +initialize data: +.s1 + identifier-listC : ItypeC ;I + identifier-listC : ItypeC = IexpressionC ;I +.s2 +A comma-separated sequence of identifiers is followed by a colon +and then the name of a type. +Each identifier is declared as having that type and denotes a +particular object +for rest of its scope (see §11 below). +If the declaration contains +.CW = +and an expression, the type must be a data type, and +all the objects are initialized from +the value of the expression. +In a declaration at the top level +(outside of a function), the expression must be +constant (see §8.5) or an array initialized with constant expressions; +the bound of any array must be a constant expression. +Lists and +.CW ref +.CW adt +types may not be initialized at the top level. +If an object is not explicitly initialized, then +it is always set to +.CW nil +if it has a reference type; +if it has arithmetic type, then it is set to 0 +at the top level and is undefined if it occurs +within a function. +.PP +For example, +.P1 + i, j: int = 1; + r, s: real = 1.0; +.P2 +declares +.CW i +and +.CW j +as integers, +.CW r +and +.CW s +as real. +It sets +.CW i +and +.CW j +to 1, +and +.CW r +and +.CW s +to 1.0. +.PP +Another kind of declaration is a shorthand. +In either of +.s1 + identifierC := IexpressionC ;I + C( Iidentifier-listC ) := IexpressionC ;I + +.s2 +identifiers on the left are declared using the type of the expression, +and are initialized with the value of the expression. +In the second case, the expression must be a tuple or an +.CW adt , +and the types and values attributed to the identifiers +in the list are taken from the members of the tuple, or the +data members of the +.CW adt +respectively. +For example, +.P1 + x: int = 1; +.P2 +and +.P1 + x := 1; +.P2 +are the same. +Similarly, +.P1 + (p, q) := (1, 2.1); +.P2 +declares the identifiers on the left as +.CW int +and +.CW real +and initializes them to 1 and 2.1 respectively. +Declarations with +.CW := +can also be expressions, and are discussed again in §8.4.4 below. +.NH 2 +Constant declarations +.PP +The +.CW con +declaration +.s1 + Iidentifier-listC : conI expressionC ;I +.s2 +declares a name (or names) for constants. +The +.I expression +must be constant (see §8.5). +After the declaration, +each identifier in the list may be used anywhere a constant +of the appropriate type is needed; +the type is taken from the type of the constant. +For example, after +.P1 + Seven: con 3+4; +.P2 +the name +.CW Seven +is exactly the same as the constant 7. +.PP +The identifier +.CW iota +has a special meaning in the expression in a +.CW con +declaration. +It is equivalent to the integer constant +.CW 0 +when evaluating the expression for the first (leftmost) identifier declared, +.CW 1 +for the second, and so on numerically. +For example, the declaration +.P1 + M0, M1, M2, M3, M4: con (1<<iota); +.P2 +declares several constants +.CW M0 +through +.CW M4 +with the values 1, 2, 4, 8, 16 respectively. +.PP +The identifier +.CW iota +is not reserved except inside the expression +of the +.CW con +declaration. +.NH 2 +adt declarations +.PP +An +.CW adt +or abstract data type contains data objects and functions that +operate on them. +The syntax is +.s1 +adt-declaration: + IidentifierC : adt { Iadt-member-listOC } ;I + +adt-member-list: + adt-member + adt-member-list adt-member + +adt-member: + identifier-listC : cyclicO Idata-typeC ;I + identifier-listC : con IexpressionC ;I + identifier-listC : Ifunction-typeC ;I + Cpick { Ipick-member-listC }I +.s2 +After an +.I adt-declaration , +the identifier becomes the name of the type of that +.CW adt . +For example, after +.P1 + Point: adt { + x, y: int; + add: fn (p: Point, q: Point): Point; + eq: fn (p: Point, q: Point): int; + }; +.P2 +the name +.CW Point +is a type name for an +.CW adt +of two integers and two +functions; the fragment +.P1 + r, s: Point; + xcoord: int; + ... + xcoord = s.x; + r = r.add(r, s); +.P2 +makes sense. +The first assignment selects one of the data members of +.CW s ; +the second calls one of the function members of +.CW r . +.PP +As this example indicates, +.CW adt +members are accessed by mentioning an object with the +.CW adt +type, a dot, and then the name of the member; +the details will be discussed in §8.13 below. +A special syntactic indulgence is available for functions declared within an +.CW adt : +frequently such a function +receives as an argument the same object used to access it +(that is, the object before the dot). +In the example just above, +.CW r +was both the object being operated on and the first argument to the +.CW add +function. +If the first formal argument of a function declared in an +.CW adt +is marked with the +.CW self +keyword, then in any calls to the function, the +.CW adt +object is implicitly passed to the function, and +is not mentioned explicitly in the actual argument list +at the call site. +For example, in +.P1 + Rect: adt { + min, max: Point; + contains: fn(r: self Rect, p: Point): int; + }; + + r1: Rect; + p1: Point; + ... + if (r1.contains(p1)) ... +.P2 +because the first argument of the +.CW contains +function is declared with +.CW self , +the subsequent call to it automatically passes +.CW r1 +as its first argument. The +.CW contains +function itself is defined elsewhere with this first +argument explicit. +(This mechanism is analogous to the +.I this +construct in C++ and other languages, +but puts the special-casing at the declaration site and makes it explicit.) +.PP +If +.CW self +is specified in the declaration of a function, it must also be +specified in the definition as well. For example, +.CW contains +would be defined +.P1 + Rect.contains(r: self Rect, p: Point) + { + . . . + } +.P2 +.PP +The +.CW adt +type in Limbo +does not provide control over the visibility +of its individual members; if any are accessible, all are. +.PP +Constant +.CW adt +members follow the same rules as ordinary constants (§6.2). +.PP +The +.CW cyclic +modifier will be discussed in §11.1. +.NH 3 +pick adts +.PP +An +.CW adt +which contains a +.CW pick +member is known as a +.I pick +.I adt . +A +.CW pick +.CW adt +is Limbo's version of a +.I "discriminated union" . +An +.CW adt +can only contain one +.CW pick +member and it must be the last component of the +.CW adt . +Each +.I identifier +enumerated in the +.I pick-tag-list +names a variant type of the +.CW pick +.CW adt . +The syntax is +.s1 +pick-member-list: + pick-tag-listC =>I + pick-member-list pick-tag-listC =>I + pick-member-list identifier-listC : cyclicO Idata-typeC ;I +.s2 +.s1 +pick-tag-list: + identifier + pick-tag-listC or Iidentifier +.s2 +.PP +The +.I pick-member-list +contains a set of data members for each +.I pick-tag-list . +These data members are specific to those variants of the +.CW pick +.CW adt +enumerated in the +.I pick-tag-list . +The +.CW adt +data members found outside of the +.CW pick +are common to all variants of the +.CW adt . +A +.CW pick +.CW adt +can only be used as a +.CW ref +.CW adt +and can only be initialized from a value of one of its variants. +For example, if +.CW Constant +is a +.CW pick +.CW adt +and +.CW Constant.Real +is one of its variant types then +.P1 + c : ref Constant = ref Constant.Real("pi", 3.1); +.P2 +will declare +.CW c +to have type +.CW ref +.CW Constant +and initialize it with a value of the variant type +.CW ref +.CW Constant.Real . +.NH 2 +Type declarations +.PP +The type declaration +.s1 + Iidentifier-listC : typeI data-type ;I +.s2 +introduces the identifiers as synonyms for the +given type. +Type declarations are transparent; that is, +an object declared with the newly-named +type has the same type as the one it abbreviates. +.NH 2 +Module declarations +.PP +A module declaration collects and packages declarations of +.CW adt , +functions, constants and simple types, and creates an +interface with a name +that serves to identify the type of the module. +The syntax is +.s1 +module-declaration: + IidentifierC : module { Imod-member-listOC } ;I + +mod-member-list: + mod-member + mod-member-list mod-member + +mod-member: + identifier-listC : Ifunction-typeC ;I + identifier-listC : Idata-typeC ;I + adt-declarationC ;I + identifier-listC : con Iexpression C;I + identifier-listC : type Itype C;I +.s2 +After a module declaration, the named +.I identifier +becomes the name of the type of that module. +For example, the declaration +.P1 +Linear: module { + setflags: fn (flag: int); + TRUNCATE: con 1; + Vector: adt { + v: array of real; + add: fn (v1: self Vector, v2: Vector): Vector; + cross: fn (v1: self Vector, v2: Vector): Vector; + dot: fn (v1: self Vector, v2: Vector); + make: fn (a: array of real): Vector; + }; + Matrix: adt { + m: array of array of real; + add: fn (m1: self Matrix, m2: Matrix): Matrix; + mul: fn (m1: self Matrix, m2: Matrix): Matrix; + make: fn (a: array of array of real): Matrix; + }; +}; +.P2 +is a module declaration for a linear algebra package that +implements two +.CW adt , +namely +.CW Vector +and +.CW Matrix , +a constant, +and a function +.CW setflags . +The name +.CW Linear +is the type name for the module, and it may be used to declare +an object referring to an instance of the module: +.P1 + linearmodule: Linear; +.P2 +Before the module can be used, it must be loaded, for example in +the style: +.P1 + linearmodule = load Linear "/usr/dmr/limbo/linear.dis"; + if (linearmodule == nil) { + sys->print("Can't load Linear\en"); + exit; + } +.P2 +The +.CW load +operator is discussed more fully in §8.4.5 below. +.PP +To initialize data declared as part of a module +declaration, an assignment expression may be used at the top level. +For example: +.P1 + implement testmod; + testmod: module { + num: int; + }; + . . . + num = 5; +.P2 +The right side of the assignment must be a constant expression (§8.5). +.NH 2 +Declarations with +.CW import +.PP +These declarations take the form +.s1 + Iidentifier-listC : import Iidentifier C;I +.s2 +Identifiers for entities +declared within a module declaration are normally +meaningful only in a context that +identifies the module. +The +.CW import +declaration lifts the names of specified members of a module +directly into the current scope. +The use of +.CW import +will be discussed more fully in §8.1.4 below, after the syntax +for expressions involving modules has been presented. +.NH 2 +Exception declarations +.PP +Exceptions represent run-time errors not data objects or values. +Exception declarations have the form: +.s1 + identifier-listC : exceptionI tuple-typeO +.s2 +Each identifier gives a compile-time name to a distinct user-defined run-time error, +signaled at run-time by a +.CW raise +statement that quotes that identifier, as described below. +An exception optionally includes a tuple of data values that qualifies the exception; +the types of those values are provided by the tuple type in this declaration. +.NH 2 +Declarations with +.CW include +.PP +The string following the +.CW include +keyword names +a file, which is inserted into the program's +text at that point. +The included +text is treated like text literally present. +Conventionally, included files declare +module interfaces and are named with the suffix +.CW .m . +The directories to be searched for included files +may be specified to the Limbo compiler command. +Include files may be nested. +.NH 1 +Function definitions +.PP +All executable code +is supplied as part of a function definition. +The syntax is +.s1 +function-definition: + function-name-part function-arg-retC { IstatementsC }I + +function-name-part: + identifier + function-name-partC . Iidentifier +.s2 +The syntax of the statements in a function will be discussed in §9 below. +As a brief example, +.P1 + add_one(a: int): int + { + return a+1; + } +.P2 +is a simple function +that might be part of the top level of a module. +.PP +Functions that are declared within an +.CW adt +use the qualified form of definition: +.P1 + Point: adt { + x, y: int; + add: fn (p: Point, q: Point): Point; + eq: fn (p: Point, q: Point): int; + } + . . . + Point.add(p: Point, q: Point): Point + { + return Point(p.x+q.x, p.y+q.y); + } +.P2 +Because an +.CW adt +may contain an +.CW adt , +more than one qualification is possible. +.NH 1 +Expressions +.PP +Expressions in Limbo resemble those of C, although some +of the operators are different. +The most salient difference between Limbo's expression +semantics and those of C is that Limbo +has no automatic coercions between types; in Limbo every +type conversion is explicit. +.NH 2 +Terms +.PP +The basic elements of expressions are terms: +.s1 +term: + identifier + constant + real-constant + string-constant + CnilI + C( Iexpression-listC )I + termC . Iidentifier + termC -> Iterm + termC ( Iexpression-listOC )I + termC [ IexpressionC ]I + termC [ IexpressionC : IexpressionC ]I + termC [ IexpressionC : ]I + termC ++I + termC --I +.s2 +The operators on terms all associate to the left, +and their order of precedence, with tightest listed first, is as follows: +.P1 + . + -> + () [] ++ -- +.P2 +.NH 3 +Simple terms +.PP +The first five kinds of term are constants and identifiers. +Constants have a type indicated by their syntax. +An identifier used in an expression is often a previously declared +data object with a particular data type; when used as a term +in an expression +it denotes the value stored in the object, and the term has +the declared object's type. +Sometimes, as discussed below, identifiers used in expressions +are type names, function names, or module identifiers. +.NH 3 +Parenthesized terms +.PP +A comma-separated list of expressions enclosed in parentheses +is a term. +If a single expression is present in the list, +the type and value are those of the expression; +the parentheses affect only the binding +of operators in the expression of which the term +is a part. +If there is more than one expression in the list, +the value is a tuple. +The member types +and values are taken from those of the expressions. +.NH 3 +Selection +.PP +A term of the form +.s1 + termC . Iidentifier +.s2 +denotes selection of a member of an +.CW adt . +The term must be a +type name or yield an object; +its type must be +.CW adt +or +.CW ref +.CW adt ; +the identifier must be a member of the +.CW adt . +The result denotes the named member (either a data object +or a function). +.NH 3 +Module qualification +.PP +A term of the form +.s1 + termC -> Iterm +.s2 +denotes module qualification. +The first term identifies a module: either it is a module type name, +or it is an expression of module type. +The second term is a constant name, type, or function specified within +that module's declaration. +Either the module type name or +an object of the module's type suffices to qualify constants and types; +functions directly exported by the module or contained within its +.CW adt +must be qualified by an object of the module's type, initialized with +.CW load . +.PP +An example using an abridged version of an example above: given +.P1 + Linear: module { + setflags: fn(flag: int); + TRUNCATE: con 1; + Vector: adt { + make: fn(v: array of real): Vector; + v: array of real; + }; + }; +.P2 +one might say +.P1 + lin := load Linear "/dis/linear.dis"; + a: array of real; + + v1: lin->Vector; + v2: Linear->Vector; + lin->setflags(Linear->TRUNCATE); + v1 = lin->(Linear->Vector).make(a); + v1 = lin->v1.make(a); + v1 = lin->v1.add(v1); + v1.v = nil; +.P2 +Here, the declarations for +.CW v1 +and +.CW v2 +are equivalent; either a module type name (here, +.CW Linear ) +or a handle (here, +.CW lin ) +suffices to identify the module. +In the call to +.CW setflags , +a handle +is required for the call itself; +the type name is sufficient for the constant. +.PP +When calling a function associated with an +.CW adt +of another module, it is necessary to identify +both the module and the +.CW adt +as well as the function. +The two calls to the +.CW make +function illustrate two ways of doing this. +In the first, +.P1 + v1 = lin->(Linear->Vector).make(a); +.P2 +the module handle +.CW lin +is specified first, then +the type name of the +.CW Vector +.CW adt +within it, and then the function. +In the second call +.P1 + v1 = lin->v1.make(a); +.P2 +instead of using a type name to specify the +.CW adt , +an instance of an object of the appropriate type is +used instead. +In the first example, the parentheses are required because +the qualification operators associate to the left. +.P1 + v1 = lin->Vector.make(a); # Wrong + v1 = lin->Linear->Vector.make(a); # Wrong +.P2 +The first is wrong because the same +.CW lin +can't serve as a qualifier for both the type and the call; +the second is wrong because +.CW lin->Linear +is meaningless. +.PP +Using +.CW import +makes the code less verbose: +.P1 + lin := load Linear "/usr/dmr/limbo/linear.dis"; + Vector, TRUNCATE, setflags: import lin; + a: array of real; + + v1: Vector; + v2: Vector; + setflags(TRUNCATE); + v1 = Vector.make(a); + v1 = v1.make(a); + v1 = v1.add(v1); + v1.v = nil; +.P2 +.NH 3 +Function calls +.PP +The interpretation of an expression in the form +.s1 + termC ( Iexpression-listOC ) +.s2 +depends on the declaration of the term. +If it is the (perhaps qualified) name of an +.CW adt , +then the expression is a cast; this is discussed in §8.2.11 below. +If the term is either the (perhaps qualified) name of a function +or a value of a function reference type, and +the expression means a function call; this is discussed here. +.PP +A plain identifier as the +.I term +can name a function defined +in the current module or imported into it. +A term qualified by using the selection operator +.CW . +specifies a function member of an +.CW adt ; +a term using +.CW -> +specifies a function defined in another module. +.PP +The +.I term , +including a plain identifier denoting a variable of function reference type, +can also yield a function reference value. +The value specifies both a function and its module, +established when the value was created, +and cannot be qualified by the +.B -> +specifier. +.PP +Function calls in Limbo +create a copy of each argument of value type, +and the execution of a function cannot +affect the value of the corresponding actual argument. +For arguments of reference type, +execution of the function may affect the value of the object +to which the reference refers, although it cannot +change the argument itself. +The actual arguments to a function are evaluated +in an unspecified order, +although any side effects caused by argument evaluation +occur before the function is called. +.PP +Function calls may be directly or indirectly recursive; +objects declared within each function are distinct from +those in their dynamic predecessors. +.PP +Functions (§4.3, §7) may either return a value +of a specified type, or return no value. +If a function returns a value, it has the specified type. +A call to a function that returns no value may appear only as the +sole expression in a statement (§9.1). +.PP +A function name is converted to a reference to that function when it appears +in a context requiring a function reference type, including assignment to a variable, +as an actual parameter, or the return value of a function. +The resulting reference value includes the appropriate module value for the function name, +following the rules given above for implicit and explicit qualifiers, and imports. +For example, the following program fragment defines a table of commands: +.P1 + Cmd: adt { + c: int; + f: ref fn(a: array of string): int; + }; + + mkcmds(): array of Cmd + { + return array[] of { + ('.', editdot), + ('a', editadd), + ('d', editdel), + ('?', edithelp), + ('w', editwrite), + ('q', editquit), + }; + } + + editdot(a: array of string): int + { + ... + } + \&... + editquit(a: array of string): int + { + ... + } +.P2 +which might be used as follows: +.P1 + cmd := mkcmds(); + ... + for(i := 0; i < len cmd; i++) + if(cmd[i].c == c){ + cmd[i].f(args); + return; + } + error("unknown command"); +.P2 +.NH 3 +Subscripting and slicing +.PP +In a term of the form +.s1 + termC [ IexpressionC ]I +.s2 +the first term must be an array or a string, and the +bracketed expression must have +.CW int +type. +The whole term +designates a member of the array or string, indexed by the bracketed expression; +the index origin is 0. +For an array, the type of the whole term is +the type from which the array is constructed; +for a string, the type is an +.CW int +whose value is the Unicode character at that position in the string. +.PP +It is erroneous to refer to a nonexisting +part of an array or string. +(A single exception to this rule, discussed in §8.4.1 below, +allows extending a string by assigning a character at its end.) +.PP +In a term of the form +.s1 + termC [ IexpressionC : IexpressionC ]I +.s2 +the first term must be an array or a string, and the whole term +denotes a slice of it. +The first expression is the lower bound, and the second +is the upper. +If +.CW e1 +is the first expression and +.CW e2 +is the second, then in +.CW a[e1:e2] +it must be the case that +.CW "0<=e1, e1<=e2, e2<=len a" , +where +.CW len +gives the number of elements in the array or string. +When the term is an array, the value is an +array of the same type beginning at the indicated +lower bound and extending to the element just before +the upper bound. +When the term is a string, the value is similarly the substring +whose first character is indexed by the lower bound +and whose last character lies just before the upper bound. +.PP +Thus, for both arrays and strings, the number of elements in +.CW "a[e1:e2] +is equal to +.CW e2-e1 . +.PP +A slice of the form +.CW a[e:] +means +.CW "a[e:len a]. +.PP +When a string slice is assigned to another string or passed as an +argument, a copy of its value is made. +.PP +A slice of an array produces a reference to the designated subarray; +a change to an element of either the original array or +the slice is reflected in the other. +.PP +In general, slice expressions cannot be the subject of +assignments. +However, as a special case, an array slice expression of the +form +.CW a[e1:] +may be assigned to. +This is discussed in §8.4.1. +.PP +The following example shows how slices +can be used to accomplish what would +need to be done with pointer arithmetic in C: +.P1 + fd := sys->open( ... ); + want := 1024; + buf := array[want] of byte; + b := buf[0:]; + while (want>0) { + got := sys->read(fd, b, want); + if (got<=0) + break; + b = b[got:]; + want -= got; + } +.P2 +Here the array +.CW buf +is filled by successive calls to +.CW sys->read +that may supply fewer bytes than requested; each call +stores up to +.CW want +bytes +starting at +.CW b[0] , +and returns the number of bytes stored. +The invariant is that the slice +.CW b +always refers to the part of the array still to be stored into. +.NH 3 +Postfix increment and decrement +.PP +A term of the form +.s1 + termC ++I +.s2 +is called a +.I post-increment . +The term must be an lvalue (see §8.4 below) and must have an +arithmetic type. +The type and value of the whole term is +that of the incremented term. +After the value is taken, 1 of the appropriate +type is added to the lvalue. +The result is undefined if the same object is changed +more than once in the same expression. +.PP +The term +.s1 + termC --I +.s2 +behaves analogously to the increment case except +that 1 is subtracted from the lvalue. +.PP +.NH 2 +Monadic expressions +.PP +Monadic expressions are expressions with +monadic operators, together with a few more +specialized notations: +.s1 +monadic-expression: + term + monadic-operator monadic-expression + Carray [ IexpressionC ] of Idata-type + Carray [ IexpressionOC ] of { Iinit-listC }I + Clist of { Iexpression-listC }I + Cchan of Idata-type + Cchan [ IexpressionC ] of Idata-type + data-type monadic-expression + +monadic-operator: one of + C+ - ! ~ ref * ++ -- <- hd tl lenI +.s2 +.NH 3 +Monadic additive operators +.PP +The +.CW - +operator produces the negative of its operand, which +must have an arithmetic type. +The type of the result is the same as the type of +its operand. +.PP +The +.CW + +operator has no effect; it is supplied only for +symmetry. +However, its argument must have an arithmetic type +and the type of the result is the same. +.NH 3 +Logical negation +.PP +The +.CW ! +operator yields the +.CW int +value 1 if its operand +has the value 0, and yields 0 otherwise. +The operand must have type +.CW int . +.NH 3 +One's complement +.PP +The +.CW ~ +operator yields the 1's complement of its +operand, which must have type +.CW int +or +.CW byte . +The type of the result is the same as that of its operand. +.NH 3 +Reference and indirection operators +.PP +If +.I e +is an expression of an +.CW adt +type, then +.CW ref +.I e +is an expression of +.CW ref +.CW adt +type whose value refers to (points to) an anonymous object with value +.I e . +The +.CW ref +operator differs from the unary +.CW & +operator of C; it makes a new object and returns a reference +to it, rather than generating a reference to an existing object. +.PP +If +.I e +is an expression of type +.CW ref +.CW adt , +then +.CW * +.I e +is the value +of the +.CW adt +itself. +The value of +.I e +must not be +.CW nil . +.PP +For example, in +.P1 + Point: adt { ... }; + p: Point; + pp: ref Point; + p = Point(1, 2); + pp = ref p; # pp is a new Point; *pp has value (1, 2) + p = Point(3, 4); # This makes *pp differ from p + *pp = Point(4, 5); # This does not affect p +.P2 +the expression +.CW *pp +at first refers to a copy of the value stored in +.CW p , +so +.CW "*pp == p +is true; however, when +.CW p +is changed later, +.CW *pp +does not change. +.NH 3 +Prefix increment and decrement +.PP +A monadic expression of the form +.s1 + C++ Imonadic-expression +.s2 +is called a +.I pre-increment . +The monadic expression must be an lvalue (see §8.4 below) and must have an +arithmetic type. +Before the value is taken, 1 of the appropriate type +is added to the lvalue. +The type and value of the whole expression is +that of the now incremented term. +The result is undefined if the same object is changed +more than once in the same expression. +.PP +The term +.s1 + C-- Imonadic-expression +.s2 +behaves analogously to the increment case except +that 1 is subtracted from the lvalue. +.NH 3 +Head and tail +.PP +The operand of the +.CW hd +operator must be a non-empty list. +The value is the first member of the list +and has that member's type. +.PP +The operand of the +.CW tl +operator must be a non-empty list. +The value is the tail of the list, +that is, the part of the list after its +first member. +The tail of a list with one member is +.CW nil . +.NH 3 +Length +.PP +The operand of the +.CW len +operator is a string, an array, or a list. +The value is an +.CW int +giving the number of elements currently in the item. +.NH 3 +Tagof +.PP +The operand of the +.CW tagof +operator is a monadic expression of type +.CW ref +.CW adt +that refers to a +.CW pick +.CW adt . +or the type name of a +.CW pick +.CW adt +or one of its variants. +The value is an +.CW int +giving a unique value for each of the variants and for the +.CW pick +.CW adt +type itself. +.NH 3 +Channel communication +.PP +The operand of the communication operator +.CW <- +has type +.CW chan +.CW of +.I sometype . +The value of the expression +is the first unread object previously sent over that +channel, and has the type associated with the channel. +If the channel is empty, the program delays +until something is sent. +.PP +As a special case, the operand of +.CW <- +may have type +.CW array +.CW of +.CW chan +.CW of +.I sometype . +In this case, all of the channels in the array are tested; +one is fairly selected from those that have data. +The expression yields a tuple of type +.CW (int, +.I sometype +.CW ) ; +its first member gives the index of the channel from +which data was read, and its second member is the +value read from the channel. +If no member of the array has data ready, the expression delays. +.PP +Communication channels are treated more fully in §9.8 and +§9.13 below with the discussion of the +.CW alt +and +.CW spawn +statements. +.NH 3 +Creation of arrays +.PP +In the expressions +.s1 + Carray [ IexpressionC ] of Idata-type + Carray [ IexpressionOC ] of { Iinit-listC ,OC }I +.s2 +the value is a new array of the specified type. +In both forms, the +.I expression +must be of type +.CW int , +and it supplies the size of the array. +In the first form, the type is given, +and the values in the array are initialized as +appropriate to the underlying type. +In the second form, a comma-separated list of values to initialize +the array is given, optionally followed by a trailing comma. +The type of the array is taken from the types of +the initializers, which must all be the same. +The list of initializers has the syntax +.s1 +init-list: + element + init-listC , Ielement + +element: + expression + expressionC => Iexpression + C* => Iexpression +.s2 +In an +.I init-list +of plain expressions (without +.CW => ), +the members of the array +are successively initialized with the corresponding +elements of the init-list. +An element of the form +.CW e1=>e2 +initializes the member of the array at subscript +.CW e1 +with the expression +.CW e2 . +After such an element has been given, subsequent +simple elements (without +.CW => ) +begin initializing at position +.CW e1+1 +and so on. +Each of the first expressions must be of type +.CW int +and must evaluate to a constant (§8.5). +.PP +If an element of the form +.CW * +.CW =>e2 +is present, all members of the array not otherwise +initialized are set to the value +.CW e2 . +The expression +.CW e2 +is evaluated for each subscript position, +but in an undefined order. +For example, +.P1 + arr := array[3] of { * => array[3] of { * => 1 } }; +.P2 +yields a 2-dimensional array (actually an array of arrays) filled with +.CW 1 's. +.PP +If the expression giving the size of the array is omitted, its size +is taken from the largest subscript of +a member explicitly initialized. +It is erroneous to initialize a member twice. +.NH 3 +Creation of lists +.PP +The value of an expression +.s1 + Clist of { Iexpression-listC }I +.s2 +is a list consisting of the expressions given. +The types of the expressions must be identical, +and this type is the underlying type of the list. +The first expression is the head of the list, and the +remaining expressions are a list constituting its tail. +Where a list is expected, +.CW nil +specifies an empty list. +.NH 3 +Creation of channels +.PP +The value of +.s1 + Cchan of Idata-type +.s2 +is an initialized channel of the specified type. +Just a declaration of a channel leaves it initialized only to +.CW nil ; +before it can be used it must be created. For example, +.P1 + ch: chan of int; # just declares, sets ch to nil + . . . + ch = chan of int; # creates the channel and assigns it +.P2 +Such a channel is unbuffered. +The value of +.s1 + Cchan [ IexpressionC ] of Idata-type +.s2 +is an initialized channel of the specified type. +The +.I expression +must be of type +.CW int , +and sets the size of the channel's buffer. +If the size is zero, the channel is unbuffered, as for the first form. +.NH 3 +Casts +.PP +An expression of the form +.s1 + data-type monadic-expression +.s2 +in which a type name is followed by an expression +is called a +.I cast , +and converts the monadic expression to the named type. +Only certain specialized forms are provided for. +.NH 4 +Arithmetic casts +.PP +In arithmetic casts, the named type must be one of +.CW byte , +.CW int , +.CW big , +or +.CW real , +and the monadic-expression must have arithmetic type. +For example, +.P1 + byte 10 +.P2 +is an expression of +.CW byte +type and value 10. +When real values are converted to integral ones, +they are rounded to the nearest integer, and away from 0 +if there is a tie. +The effect of overflow during conversion is undefined. +.NH 4 +Casts to strings +.PP +Here the named data type is +.CW string . +In a first form, the monadic expression has arithmetic type +.CW byte , ( +.CW int , +.CW big , +or +.CW real ) +and the value is a string containing the decimal representation +of the value, which may be either positive or negative. +A +.CW real +operand is converted as if by format +.CW %g , +and if the result is converted back to +.CW real , +the original value will be recovered exactly. +.PP +In a second form, +the monadic expression has type +.CW array +.CW of +.CW byte . +The value is a new string containing the Unicode characters +obtained by interpreting the bytes in the array as a UTF-8 representation +of that string. +(UTF-8 is a representation of 16-bit Unicode characters as one, +two, or three bytes.) +The result of the conversion is undefined if the byte array +ends within a multi-byte UTF-8 sequence. +.NH 4 +Casts from strings +.PP +In a first form, the monadic expression is a string, +and the named type is an arithmetic type. +The value is obtained by converting the string to +that type. Initial white space is ignored; after a possible +sign, conversion +ceases at the first character not part of a number. +.PP +In a second form, the named type is +.CW array +.CW of +.CW byte +and the monadic-expression is a string. +The value is a new array of bytes containing the UTF-8 representation +of the Unicode characters in the string. +For example, +.P1 + s := "Ångström"; + a := array of byte s; + s = string a; +.P2 +takes the string +.CW s +apart into bytes in the second line, +and puts it back in the third. +The length of +.CW s +is 8, because it contains that many characters; +the length of +.CW a +is larger, because some of its characters require more than +one byte in the UTF-8 representation. +.NH 4 +Casts to +.CW adt +and +.CW ref +.CW adt +.PP +Here the named type is that of an +.CW adt +or +.CW ref +.CW adt , +and the monadic expression is a comma-separated list of expressions +within parentheses. +The value of the expression is an instance of an +.CW adt +of the named type whose data members are initialized with +the members of the list, or whose single data member +is initialized with the parenthesized expression. +In case the type is +.CW ref +.CW adt , +the value is a reference to the new +instance of the +.CW adt . +.PP +The expressions in the list, read in order, correspond with the data +members of the +.CW adt +read in order; their types and number must agree. +Placement of any function members of the +.CW adt +is ignored. +For example, +.P1 + Point: adt { + x: int; + eq: fn (p: Point): int; + y: int; + }; + . . . + p: Point; + p = Point(1, 2); +.P2 +puts in +.CW p +a +.CW Point +whose +.CW x +value is 1 and whose +.CW y +value is 2. +The declaration and assignment could also be written +.P1 + p := Point(1, 2); +.P2 +.NH 2 +Binary expressions +.PP +Binary expressions are either monadic expressions, +or have two operands and an infix operator; +the syntax is +.s1 +binary-expression: + monadic-expression + binary-expression binary-operator binary-expression + +binary-operator: one of + C** * / % + - << >> < > <= >= == != & ^ | :: && ||I +.s2 +All these binary operators are left-associative except for +.CW ** +and +.CW :: , +which associate to the right. +Their precedence is as listed here, with tightest first: +.P1 + ** + * / % + + - + << >> + < > <= >= + == != + & + ^ + | + :: + && + || +.P2 +.NH 3 +Exponentiation +.PP +The +.CW ** +operator accomplishes exponentiation. +The type of the left operand must be +.CW int , +.CW big +or +.CW real . +The type of the right operand must be +.CW int . +The result has the type of the left operand. +The operator is right associative, thus +.P1 + 3**4*2 = (3**4)*2 = 81*2 = 162 + -3**4 = (-3)**4 = 81 + 2**3**2 = 2**(3**2) = 2**9 = 512 +.P2 +.NH 3 +Multiplicative operators +.PP +The +.CW * , +.CW / , +and +.CW % +operators respectively accomplish multiplication, division, and remainder. +The operands must be of identical arithmetic type, and the result has that +same type. +The remainder operator does not apply to type +.CW real . +If overflow or division by 0 occurs, the result is undefined. +The absolute value of +.CW a%b +is less than the absolute value of +.CW b ; +.CW "(a/b)*b + a%b +is always equal to +.CW a ; +and +.CW a%b +is non-negative if +.CW a +and +.CW b +are. +.NH 3 +Additive operators +.PP +The +.CW + +and +.CW - +operators respectively accomplish addition and subtraction +of arithmetic operands of identical type; +the result has the same type. +The behavior on overflow or underflow is undefined. +The +.CW + +operator may also be applied to strings; +the result is a string that is the concatenation of the operands. +.NH 3 +Shift operators +.PP +The shift operators are +.CW << +and +.CW >> . +The left operand may be +.CW big , +.CW int , +or +.CW byte ; +the right operand is +.CW int . +The type of the value is the same as its left operand. +The value of the right operand must be non-negative +and smaller than the number of bits in the left operand. +For the left-shift operator +.CW << , +the fill bits are 0; +for the right-shift operator +.CW >> , +the fill bits are a copy of the sign for the +.CW int +case, and 0 for the +.CW byte +case. +.NH 3 +Relational operators +.PP +The relational operators are +.CW < +(less than), +.CW > +(greater than), +.CW <= +(less than or equal), +.CW >= +(greater than or equal), +.CW == +(equal to), +.CW != +(not equal to). +The first four operators, which generate orderings, +apply only to arithmetic types +and to strings; the types of their operands +must be identical, except that a string may be +compared to +.CW nil . +Comparison on strings is lexicographic over the +Unicode character set. +.PP +The equality operators +.CW == +and +.CW != +accept operands of arithmetic, string, and reference types. +In general, the operands must have identical type, +but reference types and strings may be compared for identity with +.CW nil . +Equality for reference types occurs when the operands +refer to the same object, or when both are +.CW nil . +An uninitialized string, or one set to +.CW nil , +is identical to the empty string denoted +.CW \&"" +for all the relational operators. +.PP +The value of any comparison is the +.CW int +value 1 if the stated +relation is true, 0 if it is false. +.NH 3 +Bitwise logical operators +.PP +The logical operators +.CW & +(and), +.CW ^ +(exclusive or) and +.CW | +(inclusive or) +require operands of the same type, +which must be +.CW byte , +.CW int , +or +.CW big . +The result has the same type and its +value is obtained by applying the operation +bitwise. +.NH 3 +List concatenation +.PP +The concatenation operator +.CW :: +takes a object of any data type +as its left operand and a list as its right operand. +The list's underlying type must be the same as +the type of the left operand. +The result is a new list with the left operand +tacked onto the front: +.P1 + hd (a :: l) +.P2 +is the same as +.CW a . +.NH 3 +Logical operators +.PP +The logical +.I and +operator +.CW && +first evaluates its left operand. +If the result is zero, then the value of the +whole expression is the +.CW int +value 0. +Otherwise the right operand is evaluated; if +the result is zero, the value of the whole +expression is again 0; otherwise it is 1. +The operands must have the same arithmetic type. +.PP +The logical +.I or +operator +.CW || +first evaluates its left operand. +If the result is non-zero, then the value of the +whole expression is the +.CW int +value 1. +Otherwise the right operand is evaluated; if +the result is non-zero, the value of the whole +expression is again 1; otherwise it is 0. +The operands must have the same arithmetic type. +.NH 2 +General Expressions +.PP +The remaining syntax for expressions is +.s1 +expression: + binary-expression + lvalue-expression assignment-operator expression + C( Ilvalue-expression-listC ) = Iexpression + send-expression + declare-expression + load-expression + +assignment-operator: one of + C= &= |= ^= <<= >>= += -= *= /= %=I +.s2 +The left operand of an assignment can take only certain forms, called lvalues. +.s1 +lvalue-expression: + identifier + CnilI + termC [ IexpressionC ]I + termC [ IexpressionC : ]I + termC . Iidentifier + C( Ilvalue-expression-listC )I + C* Imonadic-expression + +lvalue-expression-list: + lvalue + lvalue-expression-listC , Ilvalue +.s2 +.NH 3 +Simple assignments with +.CW = +.PP +In general, the types of the left and right operands +must be the same; this type must be a data type. +The value of an assignment is its new left operand. +All the assignment operators associate right-to-left. +.PP +In the ordinary assignment with +.CW = , +the value of the right side is assigned to the object +on the left. +For simple assignment only, the left operand may be a +parenthesized list of lvalues and the right operand +either a tuple or an +.CW adt +whose data members correspond +in number and type to the lvalues in the list. +The members of the tuple, or +the data members of the +.CW adt , +are assigned in sequence to +lvalues in the list. +For example, +.P1 + p: Point; + x, y: int; + (x, y) = p; +.P2 +splits out the coordinates of the point into +.CW x +and +.CW y . +These rules apply recursively, so that if one of the +components of the left side is a parenthesized list of lvalues, +it is assigned from a corresponding +.CW adt +or tuple on the right. +.PP +If the left operand of a simple assignment is an +.CW adt +and the right side is a tuple, then the assignment +assigns the members of the tuple to the +.CW adt +data members; these must correspond in number and type +with the members of the tuple. +.PP +The constant +.CW nil +may be assigned to an lvalue of any reference type. +This lvalue will compare equal to +.CW nil +until it is subsequently reassigned. +Such an assignment also +triggers the removal of the object referred to unless other references +to it remain. +.PP +The left operand of an assignment may be the constant +.CW nil +to indicate that a value is discarded. +This applies in particular to any of the lvalues in +a tuple appearing on the left; to extend the examples above, +.P1 + (x, nil) = p; +.P2 +assigns the +.CW x +member of the Point +.CW p +to the variable +.CW x . +.PP +A special consideration applies to +strings. +If an +.CW int +containing a Unicode character is assigned to a subscripted +string, the subscript +is normally required to lie within the string. +As a special case, the subscript's value may be equal to +the length of the string (that is, just beyond its end); +in this case, the character is appended to +the string, and the string's length increases by 1. +.PP +A final special case applies to array slices in the form +.CW e1[e2:] . +Such expressions may lie on the left of +.CW = . +The right side must be an array of the same type as +.CW e1 , +and its length must be less than or equal to +.CW "(len e1)-e2. +In this case, the +elements in the array on the right replace the elements of +.CW e1 +starting at position +.CW e2 . +The length of the array is unchanged. +.NH 3 +Compound assignments +.PP +A compound assignment with +.I op\f(CW=\fP +is interpreted in terms of the plain assignment; +.P1 + e1 \fIop\f(CW= e2; +.P2 +is equivalent to +.P1 + e1 \f(CW= (e1) \fIop \f(CW(e2); +.P2 +except that +.CW e1 +is evaluated only once. +.NH 3 +Send expressions +.PP +A +.I send-expression +takes the form +.s1 +send-expression: + lvalue-expressionC <- = Iexpression +.s2 +In the expression +.P1 + e1 <- = e2 +.P2 +the lvalue +.CW e1 +must have type +.CW chan +.CW of +.I type , +and +.CW e2 +must be of that type. +The value of +.CW e2 +is sent over the channel. +If no task is executing a +channel receive operation on the specified channel, and the channel is unbuffered or its buffer +is full, the sender blocks. +Task synchronization is discussed in §9.8 and §9.13 below. +.NH 3 +Declare-expressions +.PP +A +.I declare-expression +is an assignment that also declares identifiers on its left: +.s1 +declare-expression: + lvalue-expressionC := Iexpression +.s2 +Each of the constituent terms in the +.I lvalue-expression +must be an identifier or +.CW nil . +A plain identifier on the left +is declared as having the type of the expression, +and it is initialized with the expression's value. +When a parenthesized list of identifiers is given, the expression +must be a tuple or an +.CW adt , +and the individual identifiers in the list are declared and initialized +with the members of the tuple, or the data members of the +.CW adt . +As with ordinary assignments, the keyword +.CW nil +may stand for an identifier whose declaration and assignment +are skipped. +.PP +The value and type of a declare-expression are the same as those of the expression. +.NH 3 +Load expressions +.PP +A +.I load-expression +has the form +.s1 +load-expression: + Cload Iidentifier expression +.s2 +The identifier is the identifier of a module, that is, the type +name declared in a +.CW module +declaration. +The expression following +.CW load +has type +.CW string +and names a file containing the +compiled form of the module. +The +.CW load +expression yields a handle for referring to the functions provided +by a module and its +.CW adt . +.PP +Execution of +.CW load +brings the file containing the module into local memory and dynamically type-checks +its interface: the run-time system ascertains that +the declarations exported by the module are compatible +with the module declaration visible in the scope of the +.CW load +operator (see §11.2). +In the scope of a module declaration, the types and constants +exported by the module may be referred to without a handle, but +the functions and data exported by the module +(directly at its top level, or within its +.CW adt ) +may be called only using a valid +handle acquired by the +.CW load +operator. +.PP +The value of +.CW load +is +.CW nil +if the attempt to load fails, either because the file containing +the module can not be found, or because the found module does not +export the specified interface. +.PP +Each evaluation of +.CW load +creates a separate instance of the specified module; +it does not share data with any other instance. +.NH 2 +Constant expressions +.PP +In several places a constant expression is required. +Such an expression contains operands that are +identifiers previously declared with +.CW con , +or +.CW int , +.CW big , +.CW real , +or +.CW string +constants. +These may be connected by any of the following operators: +.P1 +.ta .5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i 5i 5.5i + + - * / % & | ^ + == < > <= >= != << >> + && || + ~ ! +.P2 +together with arithmetic and string casts, and parentheses for +grouping. +.NH 2 +Expression evaluation +.PP +Expressions in Limbo are not reordered by the compiler; +values are computed in accordance with the parse of the expression. +However there is no guarantee of temporal evaluation order for expressions +with side effects, except in the following circumstances: +function arguments are fully evaluated before the function +is called; the logical operators +.CW && +and +.CW || +have fully defined order of evaluation, as explained above. +All side effects from an expression in one statement are +completed before the next statement is begun. +.PP +In an expression containing a constant subexpression (in the +sense of §8.5), the constant subexpression is evaluated at +compile-time with all exceptions ignored. +.PP +Underflow, overflow, and zero-divide conditions during integer +arithmetic produce undefined results. +.PP +The +.CW real +arithmetic of Limbo is all performed in IEEE double precision, +although denormalized numbers may not be supported. +By default, +invalid operations, zero-divide, overflow, and underflow +during real arithmetic are fatal; inexact-result is quiet. +The default rounding mode is round-to-nearest-even. +A set of routines in the +.CW Math +library module permits independent control of these modes within each thread. +.NH 1 +Statements +.PP +The executable code within a function definition consists +of a sequence of statements and declarations. +As discussed in the Scope section §11 below, +declarations become effective at the place they appear. +Statements are executed in sequence except as discussed below. +In particular, the optional labels on some of the statements are used with +.CW break +and +.CW continue +to exit from or re-execute the labeled statement. +.s1 +statements: + (empty) + statements declaration + statements statement + +statement: + expressionC ;I + C;I + C{ IstatementsC }I + Cif ( IexpressionC ) Istatement + Cif ( IexpressionC ) IstatementC else Istatement + labelO Cwhile ( IexpressionOC ) Istatement + labelO Cdo IstatementC while ( IexpressionOC ) ;I + labelO Cfor ( IexpressionOC ; IexpressionOC ; IexpressionOC ) Istatement + labelO Ccase IexpressionC { Iqual-statement-sequenceC }I + labelO Calt { Iqual-statement-sequenceC }I + labelO Cpick IidentifierC := IexpressionC { Ipqual-statement-sequenceC }I + Cbreak IidentifierOC ;I + Ccontinue IidentifierOC ;I + Creturn IexpressionOC ;I + Cspawn ItermC ( Iexpression-listOC ) ;I + Cexit ;I + C{ IstatementsC } exceptionI identifierOC{ Iqual-statement-sequenceC }I + Craise IexpressionOC ;I +.s2 +.s1 +label: + identifier C:I +.s2 +.NH 2 +Expression statements +.PP +Expression statements consist of an expression followed by +a semicolon: +.s1 + expressionC ;I +.s2 +Most often expression statements are assignments, but other expressions +that cause effects are often useful, for example calling a function +or sending or receiving on a channel. +.NH 2 +Null statement +.PP +The null statement consists of a lone semicolon. +It is most useful for supplying an empty body +to a looping statement with internal side effects. +.NH 2 +Blocks +.PP +Blocks are +.I statements +enclosed in +.CW {} +characters. +.s1 + C{ IstatementsC }I +.s2 +A block starts a new scope. +The effect of any declarations within a block disappears +at the end of the block. +.NH 2 +Conditional statements +.PP +The conditional statement takes two forms: +.s1 + Cif ( IexpressionC ) Istatement + Cif ( IexpressionC ) IstatementC else Istatement +.s2 +The +.I expression +is evaluated; it must have type +.CW int . +If it is non-zero, then the first +.I statement +is executed. +In the second form, the second +.I statement +is executed if the +.I expression +is 0. +The statement after +.CW else +is connected to the nearest +.CW else -less +.CW if . +.NH 2 +Simple looping statements +.PP +The simple looping statements are +.s1 + labelO Cwhile ( IexpressionOC ) Istatement + labelO Cdo IstatementC while ( IexpressionOC ) ;I +.s2 +In both cases the expression must be of type +.CW int . +In the first form, the +.I expression +is first tested against 0; +while it is not equal, the +.I statement +is repeatedly executed. +In the second form, the +.I statement +is executed, and then, while the +.I expression +is not 0, the statement is repeatedly executed. +If the +.I expression +is missing, it is understood to be non-zero. +.NH 2 +.CW for +statement +.PP +The +.CW for +statement has the form +.s1 + labelO Cfor ( Iexpression-1OC ; Iexpression-2OC ; Iexpression-3OC ) Istatement +.s2 +It is equivalent to +.s1 + expression-1C ;I + Cwhile ( Iexpression-2C ) { + Istatement + expression-3C ; + C}I +.s2 +in the absence of +.CW continue +or +.CW break +statements. +Thus (just as in C), the first expression is an initialization, +the second a test for starting and continuing the loop, and the third +a re-initialization for subsequent travels around the loop. +.NH 2 +.CW case +statement +.PP +The +.CW case +statement transfers control to one of several places +depending on the value of an expression: +.s1 + labelO Ccase IexpressionC { Iqual-statement-sequenceC }I +.s2 +The expression must have type +.CW int , +.CW big +or +.CW string . +The +.CW case +statement is followed by sequence of +qualified statements, which are statements labeled by +expressions or expression ranges: +.s1 +qual-statement-sequence: + qual-listC =>I + qual-statement-sequence qual-listC =>I + qual-statement-sequence statement + qual-statement-sequence declaration + +qual-list: + qualifier + qual-listC or Iqualifier + +qualifier: + expression + expressionC to Iexpression + C*I +.s2 +A +.I qual-statement-sequence +is a sequence of +statements and declarations, each of which +is preceded by one or more qualifiers. +Syntactically, the qualifiers are +expressions, expression ranges with +.CW to , +or +.CW * . +If the expression mentioned after +.CW case +has +.CW int +or +.CW big +type, +all the expressions appearing in the qualifiers +must evaluate to integer constants of the same type (§8.5). +If the expression has +.CW string +type, all the qualifiers must be +string constants. +.PP +The +.CW case +statement is executed by comparing +the expression at its head with the constants +in the qualifiers. +The test is for equality in the case +of simple constant qualifiers; +in range qualifiers, the test determines +whether the expression is greater than or +equal to the first constant and less than +or equal to the second. +.PP +None of the ranges or constants may overlap. +If no qualifier is selected and +there is a +.CW * +qualifier, +then that qualifier is selected. +.PP +Once a qualifier is selected, control passes +to the set of statements headed by that +qualifier. +When control reaches the end of that set +of statements, control passes to the end +of the +.CW case +statement. +If no qualifier is selected, the +.CW case +statement is skipped. +.PP +Each qualifier and the statements following it +up to the next qualifier together form a separate +scope, like a block; declarations within this scope +disappear at the next qualifier (or at the end of +the statement.) +.PP +As an example, this fragment separates small numbers +by the initial letter of their spelling: +.P1 + case i { + 1 or 8 => + sys->print("Begins with a vowel\en)"; + 0 or 2 to 7 or 9 => + sys->print("Begins with a consonant\en"); + * => + sys->print("Sorry, didn't understand\en"); + } +.P2 +.NH 2 +.CW alt +statement +.PP +The +.CW alt +statement transfers control to one of several groups +of statements depending on the readiness of communication +channels. +Its syntax resembles that of +.CW case : +.s1 + labelO Calt { Iqual-statement-sequenceC }I +.s2 +However, the qualifiers take a form different +from those of +.CW case . +In +.CW alt , +each qualifier must be a +.CW * , +or an expression containing a communication +operator +.CW <- +on a channel; +the operator may specify either sending or receiving. +For example, +.P1 + outchan := chan of string; + inchan := chan of int; + alt { + i := <-inchan => + sys->print("Received %d\en", i); + + outchan <- = "message" => + sys->print("Sent the message\en"); + } +.P2 +The +.CW alt +statement is executed by testing each of +the channels mentioned in the +.I qual-list +expressions for ability to send or receive, +depending on the operator; +if none is ready, the program blocks +until at least one is ready. +Then a random choice from the ready channels is selected +and control passes to the associated set +of statements. +.PP +If a qualifier of the form +.CW * +is present, then the statement does not block; +if no channel is ready the statements associated with +.CW * +are executed. +.PP +If two communication operators are present +in the same qualifier expression, only the leftmost one is +tested by +.CW alt . +If two or more +.CW alt +statements referring to the same receive (or send) +channel are executed in different +threads, the requests are queued; +when the channel becomes unblocked, the thread +that executed +.CW alt +first is activated. +.PP +As with +.CW case , +each qualifier and the statements following it +up to the next qualifier together form a separate +scope, like a block; declarations within this scope +disappear at the next qualifier (or at the end of +the statement.) +Thus, in the example above, the scope of +.CW i +in the arm +.P1 + i := <-inchan => + sys->print("Received %d\en", i); +.P2 +is restricted to these two lines. +.PP +As mentioned in the specification +of the channel receive operator +.CW <- +in §8.2.8, that operator can take an array of channels as an argument. +This notation serves as a kind of simplified +.CW alt +in which all the channels have the same type +and are treated similarly. +In this variant, +the value of the communication expression is a tuple +containing the index of the +channel over which a communication was received and +the value received. +For example, in +.P1 + a: array [2] of chan of string; + a[0] = chan of string; + a[1] = chan of string; + . . . + (i, s) := <- a; + # s has now has the string from channel a[i] +.P2 +the +.CW <- +operator waits until at least one of the +members of +.CW a +is ready, selects one of them at random, +and returns the index and the transmitted string +as a tuple. +.PP +During execution of an +.CW alt , +the expressions in the qualifiers are evaluated in an undefined +order, and in particular subexpressions may be evaluated before +the channels are tested for readiness. +Therefore qualifying expressions should not invoke side effects, +and should avoid subparts that might delay execution. +For example, in the qualifiers +.P1 + ch <- = getchar() => # Bad idea + ich <- = next++ => # Bad idea +.P2 +.CW getchar() +may be called early in the elaboration of the +.CW alt +statement; if it delays, the entire +.CW alt +may wait. +Similarly, the +.CW next++ +expression may be evaluated before testing the readiness of +.CW ich . +.NH 2 +.CW pick +statement +.PP +The +.CW pick +statement transfers control to one of several groups of statements +depending upon the resulting variant type of a +.CW pick +.CW adt +expression. The syntax resembles that of +.CW case : +.s1 + labelO Cpick IidentifierC := IexpressionC { Ipqual-statement-sequenceC }I +.s2 +The expression must have type +.CW ref +.CW adt +and the +.CW adt +must be a +.CW pick +.CW adt . +The +.CW pick +statement is followed by a sequence of qualified statements, which are +statements labeled by the +.CW pick +variant names: +.s1 +pqual-statement-sequence: + pqual-listC =>I + pqual-statement-sequence pqual-listC =>I + pqual-statement-sequence statement + pqual-statement-sequence declaration + +pqual-list: + pqualifier + pqual-listC or Ipqualifier + +pqualifier: + identifier + C*I +.s2 +A +.I pqual-statement-sequence +is a sequence of statements and declarations, each of which +is preceded by one or more qualifiers. +Syntactically, the qualifiers are identifiers, identifier lists (constructed with +.CW or ), +or +.CW * . +The identifiers must be names of the variant types of the +.CW pick +.CW adt . +The +.CW pick +statement is executed by comparing the variant type of the +.CW pick +.CW adt +referenced by the expression at its head with the variant type names in the qualifiers. +The matching qualifier is selected. +None of the variant type names may appear more than once. +If no qualifier is selected and there is a +.CW * +qualifier, then that qualifier is selected. +.PP +Once a qualifier is selected, control passes +to the set of statements headed by that qualifier. +When control reaches the end of that set of statements, +control passes to the end of the +.CW pick +statement. +If no qualifier is selected, the +.CW pick +statement is skipped. +.PP +Each qualifier and the statements following it +up to the next qualifier together form a separate +scope, like a block; declarations within this scope +disappear at the next qualifier (or at the end of +the statement.) +.PP +The +.I identifier +and +.I expression +given in the +.CW pick +statement are used to bind a new variable to a +.CW pick +.CW adt +reference expression, and within the statements associated with the +selected qualifier the variable can be used as if it were of the corresponding +variant type. +.PP +As an example, given a +.CW pick +.CW adt +of the following form: +.P1 + Constant: adt { + name: string; + pick { + Str or Pstring => + s: string; + Real => + r: real; + } + }; +.P2 +the following function could be used to print out the value of +an expression of type +.CW "ref Constant" : +.P1 + printconst(c: ref Constant) + { + sys->print("%s: ", c.name); + pick x := c { + Str => + sys->print("%s\en", x.s); + Pstring => + sys->print("[%s]\en", x.s); + Real => + sys->print("%f\en", x.r); + }; + } +.P2 +.NH 2 +.CW break +statement +.PP +The +.CW break +statement +.s1 + Cbreak IidentifierO C;I +.s2 +terminates execution of +.CW while , +.CW do , +.CW for , +.CW case , +.CW alt , +and +.CW pick +statements. +Execution of +.CW break +with no identifier +transfers control to +the statement after the innermost +.CW while , +.CW do , +.CW for , +.CW case , +.CW alt , +or +.CW pick +statement in which it appears as a substatement. +Execution of +.CW break +with an identifier +transfers control to the next statement after the unique enclosing +.CW while , +.CW do , +.CW for , +.CW case , +.CW alt , +or +.CW pick +labeled with that identifier. +.NH 2 +.CW continue +statement +.PP +The +.CW continue +statement +.s1 + Ccontinue IidentifierO C;I +.s2 +restarts execution of +.CW while , +.CW do , +and +.CW for +statements. +Execution of +.CW continue +with no identifier +transfers control to the end of +the innermost +.CW while , +.CW do , +or +.CW for +statement in which the +.CW continue +appears as a substatement. +The expression that controls the loop is tested +and if it succeeds, execution continues in the loop. +The initialization portion of +.CW for +is not redone. +.PP +Similarly, execution of +.CW continue +with an identifier transfers control to the end of the enclosing +.CW while , +.CW do , +or +.CW for +labeled with the same identifier. +.NH 2 +.CW return +statement +.PP +The +.CW return +statement, +.s1 + Creturn IexpressionOC ;I +.s2 +returns control to the caller of a function. +If the function returns a value (that is, if its definition +and declaration mention a return type), +the expression must be given and it must have the same type that the +function returns. +If the function returns no value, the expression +must generally be omitted. +However, if a function returns no value, and its +last action before returning is to call +another function with no value, then it may +use a special form of +.CW return +that names the function being called. +For example, +.P1 + f, g: fn(a: int); + f(a: int) { + . . . + return g(a+1); + } +.P2 +is permitted. +Its effect is the same as +.P1 + f(a: int) { + . . . + g(a+1); + return; + } +.P2 +This +.I "ad hoc +syntax offers the compiler a cheap opportunity to recognize +tail-recursion. +.PP +Running off the end of a function is equivalent to +.CW return +with no expression. +.NH 2 +.CW spawn +statement +.PP +The +.CW spawn +statement creates a new thread of control. +It has the form +.s1 + Cspawn ItermC ( Iexpression-listOC ) ;I +.s2 +The term and expression-list are taken to be +a function call. +Execution of +.CW spawn +creates an asynchronous, independent thread +of control, which calls the function in the new thread context. +This function may access the accessible objects +in the spawning thread; the two threads share +a common memory space. +These accessible objects include the data global to +the current module and reference data passed to the +spawned function. +Threads are preemptively scheduled, so that +changes to objects used in common between +threads may occur at any time. +The Limbo language provides no explicit synchronization +primitives; §12.3 shows examples of how to use channel +communication to control concurrency. +.NH 2 +.CW exit +statement +.PP +The +.CW exit +statement +.s1 + Cexit ;I +.s2 +terminates a thread and frees any resources +belonging exclusively to it. +.NH 2 +.CW raise +statement +.PP +The +.CW raise +statement +.s1 + Craise IexpressionOC ;I +.s2 +raises an exception in a thread. +The +.I expression +is either a string describing the failure, or an exception name and its parameter values, if any. +If an expression is not given, the +.CW raise +statement must appear in the body of an exception handler; it raises the currently active exception. +.NH 2 +Exception handler +.PP +Various errors in a Limbo program can be detected only at run-time. +These include programming errors such as an attempt to index outside the bounds of an array, +system errors such as exhausting memory, and user-defined exceptions +declared at compile-time by exception declarations and caused at run-time by the +.CW raise +statement. +A group of statements can have an associated exception handler: +.s1 + C{ IstatementsC } exceptionI identifierOC{ Iqual-statement-sequenceC }I +.s2 +The first run-time exception raised by any of the +.I statements , +or functions they call, +that is not handled by an exception handler enclosing the statement raising the exception +will terminate execution of the +.I statements +at that point, and transfer control to the clause in the sequence of qualified statements +that matches the exception. +An exception represented by a string is matched by a qualifier that is either the same +string value, or a prefix of it followed by +.CW * . +The optional identifier following +.CW exception +is set to the value of the exception string for the execution of the qualified statement. +If execution of the qualified statement completes, control passes to the statement following +the exception-handling statement. +.PP +A qualified statement labeled by a user-defined exception name matches that exception. +If the exception has parameters, the identifier following +.CW exception +will be be declared and initialized as a tuple of the parameter values for the scope +of the qualified statement, allowing the values to be recovered by tuple assigment. +.PP +The qualifier +.CW * +matches any string or user-defined exception. +An exception that is raised and not successfully handled by a thread will terminate the thread. +.NH +Referring to modules; +.CW import +.PP +As discussed above, modules present +constants, functions, and types +in their interface. +Their names may be the same as names +in other modules or of local objects or types within +a module that uses another. +Name clashes are avoided because references +to the entities presented by a module are +qualified by the module type name or an object +of that module type. +.PP +For example, +after the module and variable declarations +.P1 + M: module { + One: con 1; + Thing: adt { + t: int; + f: fn(); + }; + g: fn(); + }; + m: M; +.P2 +the name +.CW One +refers to the constant defined in +module +.CW M +only in the contexts +.CW M->One +or +.CW m->One ; +the name +.CW Thing +as the particular data type +associated with the +.CW M +module can be referred to only in contexts +like +.P1 + th1: M->Thing; + th2: m->Thing; +.P2 +Finally, to call a function defined either as a top-level +member of the module, or as a member of one of its +.CW adt , +it is necessary to declare, and also dynamically initialize using +.CW load , +a handle for the module. +Then calls of the form +.P1 + m->g(); + m->th1.f(); +.P2 +become appropriate. +It is possible to use just the type name of a module to qualify +its constants and types because constants and types can be understood +without having the code and data present. +Calling a function declared by a module or one of its +.CW adt +requires loading the module. +.PP +The +.CW import +declaration +.s1 + Iidentifier-listC : import Iidentifier C;I +.s2 +lifts the identifiers in the +.I identifier-list +into the scope in which +.CW import +appears, so that they are usable without a qualifier. +The identifier after the +.CW import +keyword is either +a module identifier, or an identifier declared as having +that type. +The initial list of identifiers specifies those +constants, +types, +and functions of the module whose names are promoted. +In the case of constants and types, +.CW import +merely makes their names accessible without using a qualifier. +In the example above, if the +.CW module +declaration above had been followed by +.P1 + One, Thing: import M; +.P2 +then one could refer to just +.CW One +instead of +.CW M->One ; +similarly an object could be declared like +.P1 + th: Thing; +.P2 +For functions, and also +.CW adt +with functions as members, +.CW import +must specify a module +variable (as opposed to a module identifier). +Each imported name is associated with the specified module +variable, and the current value of this module variable +controls which instance of the module will +be called. +For example, after +.P1 + g, Thing: import m; +.P2 +then +.P1 + g(); +.P2 +is equivalent to +.P1 + m->g(); +.P2 +and +.P1 + th: Thing; + th.f(); +.P2 +is equivalent to +.P1 + th: M.Thing; + m->th.f(); +.P2 +When the module declaration for the module being +implemented is encountered, an implicit +.CW import +of all the names of the module is executed. +That is, given +.P1 + implement Mod; + . . . + Mod: module { + . . . + }; +.P2 +the constants and types of +.CW Mod +are accessed as if they had been imported; +the functions declared in +.CW Mod +are imported as well, and refer dynamically to the +current instance of the module being implemented. +.NH +Scope +.PP +The scope of an identifier is the lexical range of +a program throughout which the identifier means a particular +type of, or instance of, an object. +The same identifier may be associated with several +different objects in different parts of the same program. +.PP +The names of members of an +.CW adt +occupy a separate, nonconflicting space from other identifiers; +they are declared in a syntactically distinct position, +and are always used in a distinguishable way, namely +after the +.CW . +selection operator. +Although the same scope rules apply to +.CW adt +members as to other identifiers, their names may +coincide with other entities in the same scope. +.PP +Similarly, the names of constants, functions, and +.CW adt +appearing +within a +.CW module +declaration are ordinarily qualified either with +the name of the module or with a module variable +using the +.CW -> +notation. +As discussed above, the +.CW import +declaration lifts these names into the current scope. +.PP +Identifiers declared in a top-declaration +(§5) have scope that lasts from the +declaration throughout the remainder of the +file in which it occurs, unless it is overridden +by a redeclaration of that name within an inner +scope. +Each function definition, and each block +within a function, +introduces a new scope. +A name declared within the block or function +(including a formal argument name of a function) +has a scope that begins +at the completion of its declaration and lasts until +the end of the block or function. +If an already-declared identifier is redeclared within +such an inner scope, the declaration previously in +force is used in any initialization expression +that is part of the new declaration. +.PP +As discussed above, within +.CW case +.CW alt +and +.CW pick , +each qualifier +and the statements following it form an inner +scope just like a block. +.PP +The scope of a label is restricted to the +labeled statement, +and label names may coincide with those of other +entities in the same scope. +.NH 2 +Forward referencing +.PP +In general, names must be declared before they are used. +.PP +The first exception to this rule is that a +function local to a module need not have a +declaration at all; it is sufficient to give +its definition, and that definition may appear anywhere +in the module. +.PP +The general rule implies that no +.CW adt +may contain, as a member, an +.CW adt +not previously declared (including an instance of itself). +A second exception to this rule applies to +.CW ref +.CW adt +types. +An +.CW adt +may contain a member whose type is a +.CW ref +to itself, or to another +.CW adt +even if the second +.CW adt +has not yet been declared. +Unless a special notation is used, such +references are restricted: +all mutual or self references among +.CW adt +are checked statically throughout all the +.CW adt +visible in a module to determine which +members refer to other +.CW adt . +Any member of an +.CW adt +of +.CW ref +.CW adt +type that refers directly, or indirectly through a chain of references, +back to its own underlying type may not be assigned to individually; +it can gain a value only by an assignment to the +.CW adt +as a whole. +For example, in +.P1 + Tree: adt { + l: ref Tree; + r: ref Tree; + t: ref Ntree; + }; + Ntree: adt { + t: ref Tree; + }; + + t1 := Tree(nil, nil, nil); # OK + t2 := Tree(ref t1, ref t1, nil); # OK + t1 = Tree(ref t1, ref t2, nil); # OK + t1.l = ... ; # not OK + + nt := ref Ntree(nil); # OK + nt.t = ... # not OK +.P2 +the first three assignments are correct, but +any assignment to +.CW t1.l +is forbidden, because it is self-referential. +The situation is the same with the mutually +referential fields of the +.CW Tree +and +.CW Ntree +.CW adt . +.PP +These restrictions suffice +to prevent the creation of circular data structures. +Limbo implementations guarantee to +destroy all data objects not involved in such circularity +immediately after they become non-referenced by active +tasks, whether because +their names go out of scope or because they are assigned new values. +This property has visible effect because certain system resources, +like windows and file descriptors, can be seen outside the program. +In particular, if a reference to such a resource is held only within an +.CW adt , +then that resource too is destroyed when the +.CW adt +is. +.PP +The default rules are burdensome because they impede the construction even +of harmless structures like trees. +Therefore an escape is provided: using the word +.CW cyclic +before the type in an +.CW adt +member removes the circular-reference restriction for that member. +For example, +.P1 + Tree: adt { + l: cyclic ref Tree; + r: cyclic ref Tree; + t: ref Ntree; + }; + Ntree: adt { + t: cyclic ref Tree; + }; + + t1 := Tree(nil, nil, nil); # OK + t2 := Tree(ref t1, ref t1, nil); # OK + t1 = Tree(ref t1, ref t2, nil); # OK + t1.l = ... ; # OK now + + nt := ref Ntree(nil); # OK + nt.t = ... # OK now +.P2 +With the use of +.CW cyclic , +circular data structures can be created. +When they become unreferenced except by themselves, they will +be garbage-collected eventually, but not instantly. +.NH 2 +Type equality and compatibility +.PP +In an assignment and in passing an actual argument to a function, +the types of the target and the expression being assigned or +passed must be equal (with certain exceptions, e.g. assignment of +.CW nil +to a reference type). +When a function is defined, its type must be equal to the type +of a function with the same name if one is in scope. +Type equality is determined as follows. +.PP +Two basic types are equal if and only if they are identical. +.PP +Two tuple types are equal if and only if they are composed +of equal types in the same order. +.PP +Two array types are equal if and only if they are arrays +of equal types. +The size of an array is not part of its type. +.PP +Two list types are equal if and only if they are composed +of equal types. +.PP +Two channel types are equal if and only if they transmit +equal types. +.PP +Two +.CW adt +types are equal if and only if their data members +have the same names and correspondingly +equal types, including any +.CW cyclic +attribute. +The order of member declaration is insignificant, and +constant and function members of an +.CW adt +do not enter into the comparison, +nor does the name of the +.CW adt +type itself. +In particular, with the declarations +.P1 + A: adt { x: ref B; }; + B: adt { x: ref A; }; +.P2 +the types +.CW A +and +.CW B +are equal. +.PP +Two +.CW ref +.CW adt +types are equal if and only if they are references to equal +.CW adt +types. +.PP +Two module types are equal if and only if their data and function members +have the same names and correspondingly equal types; the order +of their mention is insignificant. +Constant members and type members do not enter into the comparison. +.PP +Two function types are equal if and only if their return +values have the same type +and their argument lists have correspondingly equal types. +Any +.CW self +attributes given to arguments much match. +Names given to arguments do not enter into the comparison. +.PP +A type name has the same type as the type from +which it was constructed. +.PP +When a module is loaded, the module stored +in the file system must have a type that is +.I compatible +with the type mentioned in the +.CW load +expression. +The type of the stored module +type is compatible with the mentioned type if and only if +all data members of the two types are equal in name and type, +and all +.CW adt +or functions actually mentioned by the program executing +.CW load +have names and types equal to corresponding members of +the stored module. +.NH +Examples +.PP +Because Limbo was designed for the Inferno environment, several +of these examples consist of simplified versions of already simple +Inferno applications in a prototype Inferno implementation. +Some appreciation for the resources available in this environment +should become evident, but its full description is available +elsewhere; +the discussion here will focus on language features. +However, several of the programs use facilities +from the module +.CW Sys , +which provides an interface to a file system and its methods +resembling those of Unix or Plan 9, +as well as other useful library facilities. +.PP +Some of the programs are annotated with line numbers; +they are there only for descriptive purposes. +.NH 2 +A simple command interpreter module +.PP +This version of a shell program reads from a keyboard and +executes `commands' typed by the user. +Its own interface has the type of a +.CW Command +module, and that is the type of the things it executes. +In particular, it can call modules like the +.CW hello +example at the beginning of the paper. +.P1 +1 implement Command; + +2 include "sys.m"; +3 include "draw.m"; + +4 sys: Sys; +5 stdin: ref Sys->FD; + +6 Command: module +7 { +8 init: fn(nil: ref Draw->Context, nil: list of string); +9 }; +.P2 +After the boilerplate on lines 1-3, the variables +.CW sys +and +.CW stdin +are declared on lines 4 and 5. +The I/O operations of the +.CW Sys +module use the +.CW ref +.CW FD +type to refer to open files. +.P1 +10 init(ctx: ref Draw->Context, nil: list of string) +11 { +12 +13 +14 buf := array[256] of byte; + +15 sys = load Sys Sys->PATH; +16 stdin = sys->fildes(0); + +17 for(;;) { +18 sys->print("$ "); +19 n := sys->read(stdin, buf, len buf); +20 if(n <= 0) +21 break; +22 (nw, arg) := + sys->tokenize(string buf[0:n], " \et\en"); +23 if(nw != 0) +24 exec(ctx, arg); +25 } +26 } +.P2 +Line 10: conventionally, stand-alone modules are started +by calling their +.CW init +functions. +The +.CW Command +module follows this convention. +The arguments are presented as a list of strings. +In this simple example, the command interpreter itself +ignores its argument, so it need not be given a name. +.PP +Local variables are declared on lines 12-14; line 15 +loads the +.CW Sys +module and stores a handle for it in the variable +.CW sys . +Line 16 creates an +.CW FD +for the standard input by calling the +.CW fildes +function of the +.CW Sys +module using the +.CW -> +operator; the notation +.CW modhandle->func(...) +specifies a call to the function named +.CW func +in the module currently referred to by +.CW modhandle . +(In general there can be several modules of the same type and name +active, and there can also be unrelated modules containing identically +named functions. +The +.CW import +declaration, described in §6.6 above, can be used to abbreviate +the references when names do not clash.) +.PP +The loop on lines 17-25 prints a prompt (line 18), reads a line from +the standard input (line 19), parses it into tokens (line 22), and +executes the command. +.PP +The function call +.CW sys->tokenize +is worth discussing as an example of style. +It takes two strings as arguments. +The characters in the second string are interpreted as separators +of tokens in the first string. +It returns a tuple whose first member is the number of +tokens found, and whose second is a list of strings +containing the tokens found: its declaration is +.P1 + tokenize: fn (s: string, sep: string): (int, list of string); +.P2 +In the example, the second argument is +\f(CW" \et\en"\fP, +so that the routine returns the number of, and a list of, +`words' separated by blanks, tabs, and new-lines. +The free use of strings, lists, and tuple-returning +functions is common in Limbo. +.PP +The +.CW sys->read +routine gathers an array of bytes into +.CW buf . +Thus the expression for the first argument of +.CW sys->tokenize +converts this array to a string by slicing the +array with +.CW [0:n] , +using the actual number of bytes +gathered by the +.CW read , +and using a cast. +.PP +At lines 23-24, if there were any words found, +.CW exec +is called: +.P1 +27 exec(ctx: ref Draw->Context, args: list of string) +28 { +29 c: Command; +30 cmd, file: string; + +31 cmd = hd args; + +32 file = cmd + ".dis"; +33 c = load Command file; +34 if(c == nil) +35 c = load Command "/dis/"+file; + +36 if(c == nil) { +37 sys->print("%s: not found\en", cmd); +38 return; +39 } +40 c->init(ctx, args); +41 } +.P2 +On lines 31 and 32 of +.CW exec , +.CW cmd +is set to the first of the words in the argument list, +and the string +.CW .dis +is concatenated to it (to account for the fact that Limbo +object program files are conventionally named using this suffix). +On line 33 an attempt is made to load the named module +from the derived file name; it will fail if the file +does not exist. +The attempt will succeed, +and a non-nil handle assigned to +.CW c , +if the file is found, and if +the module stored in that file does in fact implement the +.CW Command +module type. +In case this fails, lines 34-35 make another attempt, after prefixing +.CW /dis/ +to the file name. +.PP +If either attempt to get a handle to the named module +succeeds, +.CW c +will contain a valid handle to it; line 40 calls its +.CW init +function, passing it the whole argument list. +When it returns, the +.CW exec +function returns, and the main loop resumes. +.NH 2 +Infrared remote control +.PP +This example shows two instances of a module +for interfacing to a TV remote control; one +is for the real remote, which in this case +is connected to a serial port on a set-top +box, and the other is simulated for testing +programs running on a regular operating +system. +The techniques of special interest are the +dynamic use of modules and the communication +using a channel. +.PP +The module is used by creating a channel and passing +it to the module's +.CW init +function, +which returns a success/error indicator and starts an +asynchronous process to read the remote control. +The user of the module executes a receive +on the channel whenever it wishes to accept +a button-push. +.PP +The (abridged) module declaration is +.P1 +Ir: module +{ + # Codes buttons on IR remote control + Zero: con 0; + One: con 1; + . . . + Mute: con 23; + Error: con 9999; + + init: fn(chan of int): int; + PATH: con "/dis/ir.dis"; + SIMPATH: con "/dis/irsim.h"; +}; +.P2 +The implementation for the `real' remote control is +.P1 +implement Ir; + +include "ir.m"; +include "sys.m"; +FD, Dir: import Sys; + +sys: Sys; + +init(keys: chan of int): int +{ + cfd, dfd: ref FD; + + sys = load Sys Sys->PATH; + + cfd = sys->open("/dev/eia1ctl", sys->OWRITE); + if(cfd == nil) + return -1; + sys->fprint(cfd, "b9600"); + + dfd = sys->open("/dev/eia1", sys->OREAD); + cfd = nil; + + spawn reader(keys, dfd); + return 0; +} +.P2 +The +.CW init +routine accepts a +.CW chan +argument; it will be used by the module to +send codes for the buttons pressed by the user. +In this routine, the calls to +.CW sys->open +and +.CW sys->fprint +open and set up the device data and control files +.CW /dev/eia1 +and +.CW /dev/eia1ctl +used to communicate with the device itself. +The important step is at the end: the +.CW spawn +statement creates a new, +asynchronous task to read the device, using a routine +that is passed the communications channel and the +FD for the device: +.P1 +reader(keys: chan of int, dfd: ref FD) +{ + n, ta, tb: int; + dir: Dir; + b1:= array[1] of byte; + b2:= array[1] of byte; + + # find the number of bytes already + # queued and flush that many + (n, dir) = sys->fstat(dfd); + if(n >= 0 && dir.length > 0) { + while(dir.length) { + n = sys->read(dfd, + array[dir.length] of byte, + dir.length); + if(n < 0) + break; + dir.length -= n; + } + } +.P2 +.P1 +loop: for(;;) { + n = sys->read(dfd, b1, len b1); + if(n <= 0) + break; + ta = sys->millisec(); + # Button pushes are pairs of characters + # that arrive closer together than + # 200 ms. Longer than that is likely + # to be noise. + for(;;) { + n = sys->read(dfd, b2, 1); + if(n <= 0) + break loop; + tb = sys->millisec(); + if(tb - ta <= 200) + break; + ta = tb; + b1[0] = b2[0]; + } + # map the character pair; the significant + # bits are the lowest 5. + case ((int b1[0]&16r1f)<<5) | (int b2[0]&16r1f) { + 975 => n = Ir->Zero; + 479 => n = Ir->One; + . . . + 791 => n = Ir->Mute; + * => n = Ir->Error; + } + # found a button-push; send the value + keys <-= n; + } + keys <-= Ir->Error; +} +.P2 +The code in the middle is related to noise-filtering +and is uninteresting in detail except as it illustrates +some of the methods provided by the +.CW Sys +module; the crucial actions are found at the bottom, +where the routine sends either +a true button-push or an error code over the channel to +the module's client. +.PP +Here is another implementation of the same interface. +Its +.CW init +function performs the same kind of initialization +as the other version, but using the operating system's +keyboard files +.CW /dev/cons +and +.CW /dev/consctl . +In the Inferno environment, operations corresponding to the Unix +`stty' primitive are accomplished by writing messages to +a control file associated with the file that handles the data. +.P1 +implement Ir; + +include "ir.m"; +include "sys.m"; +FD: import Sys; + +sys: Sys; +cctlfd: ref FD; + +init(keys: chan of int): int +{ + dfd: ref FD; + + sys = load Sys Sys->PATH; + + cctlfd = sys->open("/dev/consctl", sys->OWRITE); + if(cctlfd == nil) + return -1; + sys->write(cctlfd, array of byte "rawon", 5); + + dfd = sys->open("/dev/cons", sys->OREAD); + if(dfd == nil) + return -1; + + spawn reader(keys, dfd); + return 0; +} +.P2 +A fine point: the variable +.CW cctlfd +that contains the FD for the control device is +declared external to the +init function, even though it appears to be used +only inside it. +Programming cleanliness suggests that +its declaration be moved inside, but here that +won't work; +device control files +in Inferno retain settings like `raw mode' only +while they remain open. +If +.CW cctlfd +were declared inside +.CW init , +then returning from +.CW init +would destroy the last reference to the FD for the control file, +and the device would be closed automatically. +.PP +The reader function for this module has the same structure as the first +example, but doesn't have to worry about a noisy infrared detector: +.P1 +reader(keys: chan of int, dfd: ref FD) +{ + n: int; + b:= array[1] of byte; + + for(;;) { + n = sys->read(dfd, b, 1); + if(n != 1) + break; + case int b[0] { + '0' => n = Ir->Zero; + '1' => n = Ir->One; + . . . + 16r7f => n = Ir->Mute; + * => n = Ir->Error; + } + keys <-= n; + } + keys <-= Ir->Error; +} +.P2 +The following module can be used to test the above code. +It simply prints the name of the button that was pressed. +.P1 +implement Irtest; + +include "sys.m"; +include "draw.m"; +FD: import Sys; +include "ir.m"; + +Irtest: module +{ + init: fn(nil: ref Draw->Context, nil: list of string); +}; +ir: Ir; +sys: Sys; +.P2 +.P1 +init(nil: ref Draw->Context, nil: list of string) +{ + c: int; + stderr: ref FD; + irchan := chan of int; + + sys = load Sys Sys->PATH; + stderr = sys->fildes(2); + + # If the real IR remote application can + # be found, use it, otherwise use the simulator: + ir = load Ir Ir->PATH; + if(ir == nil) + ir = load Ir Ir->SIMPATH; + if(ir == nil) { + # %r format code means the last system error string + sys->fprint(stderr, "load ir: %r\en"); + return; + } + if(ir->init(irchan) != 0) { + sys->fprint(stderr, "Ir.init: %r\en"); + return; + } + names := array[] of { + "Zero", + "One", + . . . + "Mute", + }; + for(;;) { + c = <-irchan; + if(c == ir->Error) + sys->print("Error %d\en", c); + else + sys->print("%s\en", names[c]); + } +} +.P2 +Finally, here is a snippet from a movie application that +uses the IR module; it demonstrates how +.CW alt +is useful for dealing with multiple events. +This is only one of the functions of the +movie module, so not everything is defined. +It uses the +.CW Mpeg +module, which actually +copies the MPEG data stream to the screen +asynchronously. +Its +.CW play +function takes, as one of its arguments, +a channel; +before starting to play it writes a +string on the channel. +An empty string indicates success at +locating the movie; a non-empty +string contains an error message. +When it finishes, it writes another string. +.P1 +movie(entry: ref Dbinfo, cc: chan of int) +{ + i: int; + m: Mpeg; + b: ref Image; + + m = load Mpeg Mpeg->PATH; + if (m == nil) + return; + # make a place on the screen + w := screen.window(screen.image.r); + + mr := chan of string; + s := m->play(w, 1, w.r, entry.movie, mr); + if(s != "") + return; + # wait for the end of the movie + # while watching for button pushes + for(;;) { + alt { + <-mr => + return; + i = <-cc => + case i { + Ir->Select => + m->ctl("stop"); + Ir->Up or Ir->Dn => + m->ctl("pause"); + } + } + } +} +.P2 +.NH 2 +Monitors +.PP +Statically allocated storage within a module is accessible to +all the functions of that module, +and there is no explicit mechanism in Limbo for synchronizing +concurrent updates to this storage from several tasks. +However, it is straightforward to build a variety of concurrency-control +mechanisms by using channel communications. +.PP +An example is a module that implements a +.CW Monitor +abstract data type. +Each instance of +.CW Monitor +has a +.CW lock +and an +.CW unlock +operation; +calling +.CW lock +delays if another task holds the lock; calling +.CW unlock +releases the lock and enables any other task attempting +to execute +.CW lock . +.P1 +implement Monitors; + +Monitors: module +{ + Monitor: adt { + create: fn(): Monitor; + lock: fn(m: self Monitor); + unlock: fn(m: self Monitor); + ch: chan of int; + }; +}; +.P2 +.P1 +Monitor.create(): Monitor +{ + m := Monitor(chan of int); + spawn lockproc(m.ch); + return m; +} +.P2 +.P1 +Monitor.lock(m: self Monitor) +{ + m.ch <- = 0; +} +.P2 +.P1 +Monitor.unlock(m: self Monitor) +{ + <- m.ch; +} +.P2 +.P1 +lockproc(ch: chan of int) +{ + for (;;) { + <- ch; # wait for someone to lock + ch <- = 0; # wait for someone to unlock + } +} +.P2 +It would be used like this: +.P1 +mp: Mon; +Monitor: import mp; +mp = load Mon "..."; +l := Monitor.create(); +\. . . +l.lock(); +# region of code to be protected; +# only one thread can execute here at once. +l.unlock(); +.P2 +The +.CW create +method of +.CW Monitor +allocates an instance of a +.CW Monitor +containing an initialized channel. +It also creates a thread executed in the +.CW lockproc +routine, which repeatedly reads from the channel, +then writes on it. +The values transmitted over the channel are of no +interest; it is the pure fact of communication that is put to use. +The +.CW lock +routine sends a message; in the idle state, the +.CW lockproc +thread reads it and the sender proceeds. +Meanwhile, +.CW lockproc +tries to send a message over the same channel. +If another thread attempts to +.CW lock , +there is no reader for the channel, and so its transmission will block. +At some point, the thread that gained the lock +calls +.CW unlock , +which receives from the channel. +Depending on timing, this reception enables execution of either +.CW lockproc +or one of the threads attempting to send via +.CW lock . +.PP +There is a simpler implementation of +.CW Monitor , +using a buffered channel. +The +.CW create +operation simply allocates a channel with a one-element buffer: +.P1 +Monitor.create(): Monitor +{ + return Monitor(chan[1] of int); +} +.P2 +The +.CW lock +and +.CW unlock +operations have the same implementation. +Because of the buffer, when a process locks an unlocked +.CW Monitor , +the send succeeds but fills the channel. +Subsequent attempts to +.CW lock +will therefore block as long as the channel is full. +.CW Unlock +removes the value from the channel, making it empty, +and allowing another +.CW lock +to proceed. +The +.CW lockproc +is not needed. +Note that a program using the module would not need to be recompiled to +use the new implementation, because the module's signature and use +remains the same. +This is the implementation of the +.CW Lock +module in the Limbo library for Inferno. +.PP +Limbo channels are usually unbuffered: +a sender blocks until there +is a receiver, and processes synchronise at each communication. +Buffered channels are used sparingly in Limbo programs, typically to improve throughput or, +less often, in specialized ways as in the monitor example above. +.NH 2 +Guarding sends and receives +.PP +In some applications, a process takes input from one channel, +and sends it on to another channel, possibly having transformed it. +In case the input and output processes run at different rates, +the process itself acts as a buffer, holding a queue of values internally. +If the input process were faster than the output process, the queue +would accumulate values faster than they are consumed, exhausting memory. +To prevent that, when the queue reaches a specified limit, the process should guard against +receiving from the input channel, but continue sending to the output channel. +Conversely, when the queue is empty, it should not attempt to send. +The +.CW alt +statement allows a process to choose between sending and receiving based on +which channels are ready, but the process must also account for the current state +of the queue. +This example shows a way to make a buffered +channel of strings from an unbuffered channel. +It is written as a module whose +.CW bufchan +function takes a +.CW chan +.CW of +.CW string +and a size as argument, and returns a new channel; +it creates an asynchronous task that accepts input from the argument +channel and saves up to +.CW size +strings, meanwhile trying to send them to its user. +.P1 +implement Bufchan; +Bufchan: module { + bufchan: fn(c: chan of string, size: int): chan of string; +}; + +xfer(oldchan, newchan: chan of string, size: int) +{ + temp := array[size] of string; + fp := 0; # first string in buffer + n := 0; # number of strings in buffer + dummy := chan of string; + sendch, recvch: chan of string; + s: string; + + for (;;) { + sendch = recvch = dummy; + if (n > 0) + sendch = newchan; + if (n < size) + recvch = oldchan; + alt { + s = <-recvch => + temp[(fp+n)%size] = s; + n++; + + sendch <- = temp[fp] => + temp[fp++] = nil; + n--; + if (fp>=size) + fp -= size; + } + } +} +.P2 +.P1 +bufchan(oldchan: chan of string, size: int): chan of string +{ + newchan := chan of string; + spawn xfer(oldchan, newchan, size); + return newchan; +} +.P2 +The module is somewhat specialized, but it illustrates +useful programming techniques. +The most interesting occurs in +.CW xfer , +which does the work. +The problem +.CW xfer +faces is that it doesn't want to receive input when its +buffer is full, nor to try to send when it has nothing to +transmit. +The solution here is to use a dummy channel +on which nothing is ever sent or received; in the +.CW alt +statement, +that channel substitutes for the real input channel +when the buffer is full, and for the output channel +when the buffer is empty. +.PP +The module could be used in the following way: +.P1 +Bufchan: module { + PATH: con "/dis/lib/bufchan.dis"; + bufchan: fn(c: chan of string, size: int): chan of string; +}; +\. . . +bufc := load Bufchan Bufchan->PATH; +sourcech := chan of string; + +# ... (here, hand off sourcech to a process that +# reads strings from it and copies them to ch) +ch: chan of string = bufc->bufchan(sourcech, 10); +\. . . +s := <- ch; +\. . . +.P2 +.NH 1 +Syntax summary +.PP +This section summarizes the grammar of Limbo +above the lexical level; constants and identifiers +are left undefined. +.PP |
