.so /sys/lib/tmac/tmac.uni
.TL
Dis Virtual Machine Specification
.AU
.I "Lucent Technologies Inc"
.I "30 September 1999"

.I "Extensively revised by Vita Nuova Limited"
.I "5 June 2000, 9 January 2003"
.NH 1
Introduction
.LP
The Dis virtual machine provides the execution environment for programs running under the Inferno operating system. The virtual machine models a CISC-like, three operand, memory-to-memory architecture. Code can either be interpreted by a C library or compiled on-the-fly into machine code for the target architecture.
.LP
This paper defines the virtual machine informally.
A separate paper by Winterbottom and Pike[2] discusses its design.
The Dis object file format is also defined here.
Literals and keywords are in
.CW typewriter
typeface.
.NH 1
Addressing Modes
.SH
Operand Size
.LP
Operand sizes are defined as follows: a byte is 8 bits, a word or pointer is 32 bits, a float is 64 bits, a big integer is 64 bits. The operand size of each instruction is encoded explicitly by the operand code. The operand size and type are specified by the last character of the instruction mnemonic:
.IP
.TS
lf(CW) lfR .
W	word, 32-bit two's complement
B	byte, 8-bit unsigned
F	float, 64-bit IEEE format
L	big, 64-bit two's complement
P	pointer
C	Unicode string encoded in UTF-8
M	memory
MP	memory containing pointers
.TE
.LP
Two more operand types are defined to provide `short'
types for use by languages other than Limbo:
signed 16-bit integers, called `short word'
here, and 32-bit IEEE format floating-point numbers, called `short float' or `short real' here.
Support for them is limited to conversion to and from words or floats respectively;
the instructions are marked below with a dagger (†).
.SH
Memory Organization
.LP
Memory for a thread is divided into several separate regions. The code segment stores either a decoded virtual machine instruction stream suitable for execution by the interpreter or flash compiled native machine code for the host CPU. Neither type of code segment is addressable from the instruction set. At the object code level, PC values are offsets, counted in instructions, from the beginning of the code space.
.LP
Data memory is a linear array of bytes, addressed using 32-bit pointers. Words are stored in the native representation of the host CPU. Data types larger than a byte must be stored at addresses aligned to
a multiple of the data size. A thread executing a module has access to two regions of addressable data memory. A module pointer
.CW "mp" \& (
register) defines a region of global storage for a particular module, a frame pointer
.CW "fp" \& (
register) defines the current activation record or frame for the thread. Frames are allocated dynamically from a stack by function call and return instructions. The stack is extended automatically from the heap.
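.LP
The alignment rule can be checked mechanically. The Python sketch below is only an illustration: the helper names are hypothetical and not part of Dis, and the sizes are those listed under Operand Size.

```python
# Hypothetical helpers illustrating the alignment rule; not part of the Dis definition.
# Sizes in bytes, taken from the Operand Size section.
OPERAND_SIZE = {"B": 1, "W": 4, "P": 4, "F": 8, "L": 8}

def is_aligned(addr, kind):
    """True if addr is a multiple of the size of the given operand type."""
    return addr % OPERAND_SIZE[kind] == 0

def align_up(addr, kind):
    """Round addr up to the next address suitable for the given operand type."""
    size = OPERAND_SIZE[kind]
    return (addr + size - 1) // size * size
```

Under these assumptions a big integer may sit at offset 8 or 16 of a frame but not at offset 12.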
.LP
The
.CW mp
and
.CW fp
registers cannot be addressed directly, and can therefore be modified only by call and return instructions.
.SH
Effective Addresses
.LP
Each instruction can potentially address three operands. The source and destination operands are general, but the middle operand can use any address mode except double indirect. If the middle operand of a three address instruction is omitted, it is assumed to be the same as the destination operand.
.LP
The general operands generate an effective address from three basic modes: immediate, indirect and double indirect. The assembler syntax for each mode is:
.IP
.TS
lf(CW) lfR .
10(fp)	30-bit signed indirect from fp
20(mp)	30-bit signed indirect from mp
$0x123	30-bit signed immediate value
10(20(fp))	two 16-bit unsigned offsets double indirect from fp
10(20(mp))	two 16-bit unsigned offsets double indirect from mp
.TE
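.LP
As a rough illustration of how the indirect and double indirect modes resolve to an address, consider the Python sketch below. It models data memory as a byte array; the function names are hypothetical, and little-endian byte order is assumed purely for the example, since words are stored in the host's native representation.

```python
import struct

def read_word(mem, addr):
    """Read a 32-bit word from memory (little-endian is an assumption of this sketch)."""
    return struct.unpack_from("<I", mem, addr)[0]

def effective_address(mem, base, inner, outer=None):
    """Resolve an operand to a byte address.

    base is the current value of fp or mp.
    inner(base) is the indirect mode, e.g. 10(fp).
    outer(inner(base)) is the double indirect mode, e.g. 10(20(fp)).
    """
    addr = base + inner
    if outer is None:
        return addr                   # indirect: the operand lives at base+inner
    pointer = read_word(mem, addr)    # double indirect: fetch the pointer at base+inner
    return pointer + outer            # then apply the outer offset to that pointer
```

With fp equal to 8, the operand 10(fp) resolves to address 18, while 10(20(fp)) first reads the pointer stored at address 28 and then adds 10 to it.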
.SH
Garbage Collection
.LP
The Dis machine performs both reference counted and real time mark and sweep garbage collection. This hybrid approach allows code to be generated in several styles: pure reference counted, mark and sweep, or a hybrid of the two approaches. Compiler writers have the freedom to choose how specific types are handled by the machine to optimize code for performance or language implementation. Instruction selection determines which algorithm will be applied to specific types.
.LP
When using reference counting, pointers are a special operand type and should only be manipulated using the pointer instructions in order to ensure the correct functioning of the garbage collector. Every memory location that stores a pointer must be known to the interpreter so that it can be initialized and deallocated correctly. The information is transmitted in the form of type descriptors in the object module. Each type descriptor contains a bit vector for a particular type where each bit corresponds to a word in memory. Type descriptors are generated automatically by the Limbo compiler. The assembler syntax for a type descriptor is:
.P1
desc	$10, 132, "001F"
.P2
The first parameter is the descriptor number, the second is the size in bytes, and the third a pointer map. The map contains a list of hex bytes where each byte maps eight 32-bit words. The most significant bit represents the lowest memory address.
A one bit indicates a pointer in memory. The map need not have an entry for every byte and unspecified bytes are assumed zero.
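.LP
The pointer map can be decoded as sketched below in Python. The function name and the list-of-offsets result are illustrative choices, not part of the object format; the bit order follows the description above.

```python
def pointer_offsets(size, ptrmap, word_size=4):
    """Return the byte offsets of the words that a descriptor map marks as pointers.

    size is the object size in bytes; ptrmap is the hex string from the desc line.
    Map bytes that are not given are treated as zero, so they add no offsets.
    """
    offsets = []
    for byte_index, value in enumerate(bytes.fromhex(ptrmap)):
        for bit in range(8):
            # The most significant bit of each map byte is the lowest address.
            if value & (0x80 >> bit):
                offset = (byte_index * 8 + bit) * word_size
                if offset < size:
                    offsets.append(offset)
    return offsets
```

For the example descriptor, the map 001F marks words 11 to 15, so the pointers lie at byte offsets 44, 48, 52, 56 and 60 of the 132-byte object.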
.LP
Throughout this description, the symbolic constant
.CW H
refers to a nil pointer.
.NH 1
Instruction Set
.SH
add\fIx\fP \- Add
.P1
Syntax:		addb	src1, src2, dst
		addf	src1, src2, dst
		addw	src1, src2, dst
		addl	src1, src2, dst
Function:	dst = src1 + src2
.P2
.LP
The 
.CW "add"
instructions compute the sum of the operands addressed by 
.CW "src1"
and 
.CW "src2"
and store the result in the
.CW "dst"
operand. For 
.CW "addb"
the result is truncated to eight bits.
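.LP
The truncation performed by the byte form can be modelled in Python as follows. This is a sketch of the arithmetic only; the function name simply mirrors the mnemonic.

```python
def addb(src1, src2):
    """Model of addb: add two unsigned bytes and keep only the low eight bits."""
    return (src1 + src2) & 0xFF
```

Adding 200 and 100 therefore yields 44 rather than 300.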
.SH
addc \- Add strings
.P1
Syntax:		addc	src1, src2, dst
Function:	dst = src1 + src2
.P2
.LP
The 
.CW "addc"
instruction concatenates the two UTF strings pointed to by
.CW "src1"
and 
.CW "src2" ;
the result is placed in the pointer addressed by 
.CW "dst" .
If both pointers are 
.CW "H"
the result will be a zero length string rather than 
.CW "H" .
.SH
alt \- Alternate between communications
.P1
Syntax:		alt	src, dst
.P2
The 
.CW "alt"
instruction selects between a set of channels ready to communicate. The
.CW src
argument is the address of a structure of the following form:
.P1
struct Alt {
	int nsend;		/* Number of senders */
	int nrecv;		/* Number of receivers */
	struct {
		Channel* c;		/* Channel */
		void*	val;	/* Address of lval/rval */
	} entry[];
};
.P2
The vector is divided into two sections; the first lists the channels ready to send values, the second lists channels either ready to receive or an array of channels each of which may be ready to receive. The counts of the sender and receiver channels are stored as the first and second words addressed by
.CW src .
An 
.CW "alt"
instruction proceeds by testing each channel for readiness to communicate. A ready channel is added to a list. If the list is empty after each channel has been considered, the thread blocks at the 
.CW "alt"
instruction waiting for a channel to become ready; otherwise, a channel is picked at random from the ready set.
.LP
The
.CW "alt"
instruction then uses the selected channel to perform the communication using the 
.CW "val"
address as either a source for send or a destination for receive. The numeric index of the selected vector element is placed in 
.CW "dst" .
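.LP
The selection step can be sketched in Python as below. The sketch covers only the choice among ready entries; blocking and the transfer of the value are left out, and representing readiness as a list of booleans is an assumption made for the example.

```python
import random

def alt_select(ready_flags, rng=random):
    """Pick the index of one ready entry at random.

    ready_flags[i] is True when entry i of the vector is ready to communicate.
    Returns None when nothing is ready, which is where a thread would block.
    """
    ready = [i for i, is_ready in enumerate(ready_flags) if is_ready]
    if not ready:
        return None
    return rng.choice(ready)
```

The returned index corresponds to the value that the instruction places in the destination operand.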
.SH
and\fIx\fP \- Logical AND
.P1
Syntax:		andb	src1, src2, dst
		andw	src1, src2, dst
		andl	src1, src2, dst
Function:	dst = src1 & src2
.P2
The instructions compute the bitwise AND of the two operands addressed by 
.CW "src1"
and 
.CW "src2"
and store the result in the 
.CW "dst"
operand.
.SH
beq\fIx\fP \- Branch equal
.P1
Syntax:		beqb	src1, src2, dst
		beqc	src1, src2, dst
		beqf	src1, src2, dst
		beqw	src1, src2, dst
		beql	src1, src2, dst
Function:	if src1 == src2 then pc = dst
.P2
If the 
.CW "src1"
operand is equal to the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand.
.SH
bge\fIx\fP \- Branch greater or equal
.P1
Syntax:		bgeb	src1, src2, dst
		bgec	src1, src2, dst
		bgef	src1, src2, dst
		bgew	src1, src2, dst
		bgel	src1, src2, dst
Function:	if src1 >= src2 then pc = dst
.P2
If the 
.CW "src1"
operand is greater than or equal to the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand. This instruction performs a signed comparison.
.SH
bgt\fIx\fP \- Branch greater
.P1
Syntax:		bgtb	src1, src2, dst
		bgtc	src1, src2, dst
		bgtf	src1, src2, dst
		bgtw	src1, src2, dst
		bgtl	src1, src2, dst
Function:	if src1 > src2 then pc = dst
.P2
If the 
.CW "src1"
operand is greater than the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand. This instruction performs a signed comparison.
.SH
ble\fIx\fP \- Branch less than or equal
.P1
Syntax:		bleb	src1, src2, dst
		blec	src1, src2, dst
		blef	src1, src2, dst
		blew	src1, src2, dst
		blel	src1, src2, dst
Function:	if src1 <= src2 then pc = dst
.P2
If the 
.CW "src1"
operand is less than or equal to the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand. This instruction performs a signed comparison.
.SH
blt\fIx\fP \- Branch less than
.P1
Syntax:		bltb	src1, src2, dst
		bltc	src1, src2, dst
		bltf	src1, src2, dst
		bltw	src1, src2, dst
		bltl	src1, src2, dst
Function:	if src1 < src2 then pc = dst
.P2
If the 
.CW "src1"
operand is less than the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand.
.SH
bne\fIx\fP \- Branch not equal
.P1
Syntax:		bneb	src1, src2, dst
		bnec	src1, src2, dst
		bnef	src1, src2, dst
		bnew	src1, src2, dst
		bnel	src1, src2, dst
Function:	if src1 != src2 then pc = dst
.P2
If the 
.CW "src1"
operand is not equal to the 
.CW "src2"
operand, then control is transferred to the program counter specified by the 
.CW "dst"
operand.
.SH
call \- Call local function
.P1
Syntax:		call	src, dst
Function:	link(src) = pc
		frame(src) = fp
		mod(src) = 0
		fp = src
		pc = dst
.P2
The 
.CW "call"
instruction performs a function call to a routine in the same module. The 
.CW "src"
argument specifies a frame created by 
.CW "new" .
The current value of 
.CW "pc"
is stored in link(src), the current value of 
.CW "fp"
is stored in frame(src) and the module link register is set to 0. The value of 
.CW "fp"
is then set to 
.CW "src"
and control is transferred to the program counter specified by
.CW dst .
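.LP
The bookkeeping in the Function lines can be modelled in Python as below. The Frame and Thread classes are hypothetical stand-ins for the interpreter's own structures; only the assignments come from the description above.

```python
class Frame:
    """Minimal stand-in for an activation record."""
    def __init__(self):
        self.link = None    # saved program counter
        self.frame = None   # saved frame pointer of the caller
        self.mod = None     # saved module link; 0 for a local call

class Thread:
    """Minimal stand-in for the registers of one thread."""
    def __init__(self, pc, fp):
        self.pc = pc
        self.fp = fp

def call(thread, src, dst):
    """Model of call: save the caller's state in the new frame, then transfer."""
    src.link = thread.pc
    src.frame = thread.fp
    src.mod = 0
    thread.fp = src
    thread.pc = dst
```

After the call the new frame holds everything needed to resume the caller.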
.SH
case \- Case compare integer and branch
.P1
Syntax:		case	src, dst
Function:	pc = 0..i: dst[i].pc where
		  dst[i].lo <= src && src < dst[i].hi
.P2
The 
.CW "case"
instruction jumps to a new location specified by a range of values. The 
.CW "dst"
operand points to a table in memory containing
.CW "i"
entries. Each entry is three words long: the first word specifies a low value, the second word specifies a high value, and the third word specifies a program counter. The first word of the table gives the number of entries. The 
.CW "case"
instruction searches the table for the first matching value where the 
.CW "src"
operand is greater than or equal to the low word and less than the high word. Control is transferred to the program counter stored in the third word of the matching entry.
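.LP
The table search can be sketched in Python as follows. The table is modelled as a flat list of words laid out as described: a count followed by triples of low value, high value and program counter. Returning None when no entry matches is a choice made for this sketch; the text above does not say what happens in that case.

```python
def case_target(src, table):
    """Return the pc of the first entry whose range contains src, or None."""
    count = table[0]
    for i in range(count):
        lo, hi, pc = table[1 + 3 * i : 4 + 3 * i]
        # The range includes the low value and excludes the high value.
        if lo <= src < hi:
            return pc
    return None
```

With two entries covering 0 to 9 and 10 to 19, the value 10 selects the second entry because the high bound of the first is exclusive.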
.SH
casec \- Case compare string and branch
.P1
Syntax:		casec	src, dst
Function:	pc = 0..i: dst[i].pc where
		   dst[i].lo <= src && src < dst[i].hi
.P2
The 
.CW "casec"
instruction jumps to a new location specified by a range of string constants. The table is the same as described for the
.CW case
instruction.
.SH
cons\fIx\fP \- Allocate new list element
.P1
Syntax:		consb	src, dst
		consc	src, dst
		consf	src, dst
		consl	src, dst
		consm	src, dst
		consmp	src, dst
		consp	src, dst
		consw	src, dst
Function:	p = new(src, dst)
		dst = p
.P2
The 
.CW "cons"
instructions add a new element to the head of a list. A new list element is composed from the 
.CW "src"
operand and a pointer to the head of an extant list specified by 
.CW "dst" .
The resulting element is stored back into 
.CW "dst" .
.SH
cvtac \- Convert byte array to string
.P1
Syntax:		cvtac	src, dst
Function:	dst = string(src)
.P2
The 
.CW "src"
operand must be an array of bytes, which is converted into a character string and stored in 
.CW "dst" .
The new string is a copy of the bytes in 
.CW "src" .
.SH
cvtbw \- Convert byte to word
.P1
Syntax:		cvtbw	src, dst
Function:	dst = src & 0xff
.P2
A byte is fetched from the 
.CW "src"
operand, extended to the size of a word, and then stored into 
.CW "dst" .
.SH
cvtca \- Convert string to byte array
.P1
Syntax:		cvtca	src, dst
Function:	dst = array(src)
.P2
The 
.CW "src"
operand must be a string which is converted into an array of bytes and stored in 
.CW "dst" .
The new array is a copy of the characters in
.CW "src" .
.SH
cvtcf \- Convert string to real
.P1
Syntax:		cvtcf	src, dst
Function:	dst = (float)src
.P2
The string addressed by the 
.CW "src"
operand is converted to a floating point value and stored in the 
.CW "dst"
operand. Initial white space is ignored; conversion ceases at the first character in the string that is not part of the representation of the floating point value.
.SH
cvtcl \- Convert string to big
.P1
Syntax:		cvtcl	src, dst
Function:	dst = (big)src
.P2
The string addressed by the 
.CW "src"
operand is converted to a big integer and stored in the 
.CW "dst"
operand. Initial white space is ignored; conversion ceases at the first non-digit in the string.
.SH
cvtcw \- Convert string to word
.P1
Syntax:		cvtcw	src, dst
Function:	dst = (int)src
.P2
The string addressed by the 
.CW "src"
operand is converted to a word and stored in the 
.CW "dst"
operand. Initial white space is ignored; after a possible sign, conversion ceases at the first non-digit in the string.
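.LP
The scanning rule can be sketched in Python as below. Several points are assumptions of the sketch and are not stated above: digits are taken to be decimal, a string with no digits yields zero, and no attempt is made to model overflow of the 32-bit result.

```python
def cvtcw(src):
    """Model of cvtcw: skip white space, accept an optional sign, then read digits."""
    i = 0
    n = len(src)
    while i < n and src[i].isspace():
        i += 1
    sign = 1
    if i < n and src[i] in "+-":
        if src[i] == "-":
            sign = -1
        i += 1
    value = 0
    # Conversion stops at the first character that is not a decimal digit.
    while i < n and "0" <= src[i] <= "9":
        value = value * 10 + (ord(src[i]) - ord("0"))
        i += 1
    return sign * value
```

Trailing text after the digits is ignored, so a string such as "  -42abc" converts to -42.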
.SH
cvtfc \- Convert real to string
.P1
Syntax:		cvtfc	src, dst
Function:	dst = string(src)
.P2
The floating point value addressed by the 
.CW "src"
operand is converted to a string and stored in the 
.CW "dst"
operand. The string is a floating point representation of the value.
.SH
cvtfw \- Convert real to word
.P1
Syntax:		cvtfw	src, dst
Function:	dst = (int)src
.P2
The floating point value addressed by 
.CW "src"
is converted into a word and stored into 
.CW "dst" .
The floating point value is rounded to the nearest integer.
.SH
cvtfl \- Convert real to big
.P1
Syntax:		cvtfl	src, dst
Function:	dst = (big)src
.P2
The floating point value addressed by 
.CW "src"
is converted into a big integer and stored into 
.CW "dst" .
The floating point value is rounded to the nearest integer.
.SH
cvtfr \- Convert real to short real†
.P1
Syntax:		cvtfr	src, dst
Function:	dst = (short float)src
.P2
The floating point value addressed by 
.CW "src"
is converted to a short (32-bit) floating point value and stored into 
.CW "dst" .
The value is rounded to the nearest value representable in the short format.
.SH
cvtlc \- Convert big to string
.P1
Syntax:		cvtlc	src, dst
Function:	dst = string(src)
.P2
The big integer addressed by the 
.CW "src"
operand is converted to a string and stored in the 
.CW "dst"
operand. The string is the decimal representation of the big integer.
.SH
cvtlw \- Convert big to word
.P1
Syntax:		cvtlw	src, dst
Function:	dst = (int)src
.P2
The big integer addressed by the 
.CW "src"
operand is converted to a word and stored in the 
.CW "dst"
operand.
.SH
cvtsw \- Convert short word to word†
.P1
Syntax:		cvtsw	src, dst
Function:	dst = (int)src
.P2
The short word addressed by the 
.CW "src"
operand is converted to a word and stored in the 
.CW "dst"
operand.
.SH
cvtwb \- Convert word to byte
.P1
Syntax:		cvtwb	src, dst
Function:	dst = (byte)src;
.P2
The 
.CW "src"
operand is converted to a byte and stored in the 
.CW "dst"
operand.
.SH
cvtwc \- Convert word to string
.P1
Syntax:		cvtwc	src, dst
Function:	dst = string(src)
.P2
The word addressed by the 
.CW "src"
operand is converted to a string and stored in the 
.CW "dst"
operand. The string is the decimal representation of the word.
.SH
cvtwl \- Convert word to big
.P1
Syntax:		cvtwl	src, dst
Function:	dst = (big)src;
.P2
The word addressed by the 
.CW "src"
operand is converted to a big integer and stored in the 
.CW "dst"
operand.
.SH
cvtwf \- Convert word to real
.P1
Syntax:		cvtwf	src, dst
Function:	dst = (float)src;
.P2
The word addressed by the 
.CW "src"
operand is converted to a floating point value and stored in the 
.CW "dst"
operand.
.SH
cvtws \- Convert word to short word†
.P1
Syntax:		cvtws	src, dst
Function:	dst = (short)src;
.P2
The word addressed by the 
.CW "src"
operand is converted to a short word and stored in the 
.CW "dst"
operand.
.SH
cvtlf \- Convert big to real
.P1
Syntax:		cvtlf	src, dst
Function:	dst = (float)src;
.P2
The big integer addressed by the 
.CW "src"
operand is converted to a floating point value and stored in the 
.CW "dst"
operand.
.SH
cvtrf \- Convert short real to real†
.P1
Syntax:		cvtrf	src, dst
Function:	dst = (float)src;
.P2
The short (32-bit) floating point value addressed by the 
.CW "src"
operand is converted to a 64-bit floating point value and stored in the 
.CW "dst"
operand.
.SH
div\fIx\fP \- Divide
.P1
Syntax:		divb	src1, src2, dst
		divf	src1, src2, dst
		divw	src1, src2, dst
		divl	src1, src2, dst
Function:	dst = src2/src1
.P2
The 
.CW "src2"
operand is divided by the 
.CW "src1"
operand and the quotient is stored in the 
.CW "dst"
operand. Division by zero causes the thread to terminate.
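.LP
Because the operand order is easy to misread, the word form is sketched below in Python. Raising an exception stands in for termination of the thread, and truncation of the quotient toward zero is an assumption of the sketch; the text above does not specify the rounding of integer division.

```python
def divw(src1, src2):
    """Model of divw: the second operand is divided by the first."""
    if src1 == 0:
        raise ZeroDivisionError("a Dis thread would terminate here")
    quotient = abs(src2) // abs(src1)
    # Assumed: the quotient is truncated toward zero, as in C.
    return quotient if (src1 < 0) == (src2 < 0) else -quotient
```

A call written with operands 4 and 20 therefore computes 20 divided by 4.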
.SH
exit \- Terminate thread
.P1
Syntax:		exit
Function:	exit()
.P2
The executing thread terminates. All resources held in the stack are deallocated.
.SH
frame \- Allocate frame for local call
.P1
Syntax:		frame	src1, src2
Function:	src2 = fp + src1->size
		initmem(src2, src1);
.P2
The frame instruction creates a new stack frame
for a call to a function in the same module. The frame is initialized according to the type descriptor supplied as the
.CW src1
operand. A pointer to the newly created frame is stored in the
.CW src2
operand.
.SH
goto \- Computed goto
.P1
Syntax:		goto	src, dst
Function:	pc = dst[src]
.P2
The 
.CW "goto"
instruction performs a computed goto. The 
.CW "src"
operand must be an integer index into a table of PC values specified by the 
.CW "dst"
operand.
.SH
head\fIx\fP \- Head of list
.P1
Syntax:		headb	src, dst
		headf	src, dst
		headm	src, dst
		headmp	src, dst
		headp	src, dst
		headw	src, dst
		headl	src, dst
Function:	dst = hd src
.P2
The 
.CW "head"
instructions make a copy of the first data item stored in a list. The 
.CW "src"
operand must be a list of the correct type. The first item is copied into the 
.CW "dst"
operand. The list is not modified.
.SH
indc \- Index by character
.P1
Syntax:		indc	src1, src2, dst	
Function:	dst = src1[src2]
.P2
The 
.CW "indc"
instruction indexes Unicode strings. The 
.CW "src1"
operand must be a string. The 
.CW "src2"
operand must be an integer specifying the origin-0 index in
.CW src1
of the (Unicode) character to store in the 
.CW "dst"
operand.
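.LP
Indexing is by character, not by byte, which matters once a string holds characters outside ASCII. The Python sketch below models the string as its UTF-8 encoding, which is an assumption about representation made only for the example.

```python
def indc(src1, src2):
    """Model of indc: return the Unicode value of character number src2 (origin 0)."""
    text = src1.decode("utf-8")
    return ord(text[src2])
```

In the three-character string made of a, n with tilde (U+00F1) and b, index 2 selects b even though that character starts at byte offset 3 of the encoding.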
.SH
indx \- Array index
.P1
Syntax:		indx	src1, dst, src2
Function:	dst = &src1[src2]
.P2
The 
.CW "indx"
instruction computes the effective address of an array element. The 
.CW "src1"
operand must be an array created by the 
.CW "newa"
instruction. The 
.CW "src2"
operand must be an integer. The effective address of the 
.CW "src2"
element of the array is stored in the 
.CW "dst"
operand.
.SH
ind\fIx\fP \- Index by type
.P1
Syntax:		indb	src1, dst, src2
		indw	src1, dst, src2
		indf	src1, dst, src2
		indl	src1, dst, src2
Function:	dst = &src1[src2]
.P2
The 
.CW "indb" , 
.CW "indw" , 
.CW "indf"
and 
.CW "indl"
instructions index arrays of the basic types. The 
.CW "src1"
operand must be an array created by the 
.CW "newa"
instruction. The 
.CW "src2"
operand must be a non-negative integer index less than the array size. The effective address of the element at the index is stored in the 
.CW "dst"
operand.
.SH
insc \- Insert character into string
.P1
Syntax:		insc	src1, src2, dst
Function:	src1[src2] = dst
.P2
The 
.CW "insc"
instruction inserts a character into an existing string.
The index in
.CW "src2"
must be a non-negative integer less than the length of the string plus one.
(The character will be appended to the string if the index is equal to
the string's length.)
The
.CW "src1"
operand must be a string (or nil).
The character to insert must be a valid 21-bit Unicode value represented as a word.
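.LP
One reading of these rules is sketched below in Python. Following the Function line, the character at the given index is replaced, or appended when the index equals the length. The argument order of the helper, the use of None for nil, and returning the new string rather than updating it in place are all choices made for the sketch.

```python
H = None  # stands in for the nil pointer in this sketch

def insc(string, index, char):
    """Model of insc: store the character char at position index of string."""
    text = "" if string is H else string
    if not 0 <= index <= len(text):
        raise IndexError("index must lie between 0 and the length of the string")
    if not 0 <= char <= 0x10FFFF:
        raise ValueError("not a valid Unicode value")
    if index == len(text):
        return text + chr(char)          # index equal to the length appends
    return text[:index] + chr(char) + text[index + 1:]
```

Storing at index 3 of a three-character string extends it to four characters, and storing at index 0 of nil produces a one-character string.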
.SH
jmp \- Branch always
.P1
Syntax:		jmp	dst
Function:	pc = dst
.P2
Control is transferred to the location specified by the 
.CW "dst"
operand.
.SH
lea \- Load effective address
.P1
Syntax:		lea	src, dst
Function:	dst = &src
.P2
The 
.CW "lea"
instruction computes the effective address of the 
.CW "src"
operand and stores it in the 
.CW "dst"
operand.
.SH
lena \- Length of array
.P1
Syntax:		lena	src, dst
Function:	dst = nelem(src)
.P2
The 
.CW "lena"
instruction computes the length of the array specified by the 
.CW "src"
operand and stores it in the 
.CW "dst"
operand.
.SH
lenc \- Length of string
.P1
Syntax:		lenc	src, dst
Function:	dst = utflen(src)
.P2
The 
.CW "lenc"
instruction computes the number of characters in the UTF string addressed by the 
.CW "src"
operand and stores it in the 
.CW "dst"
operand.
.SH
lenl \- Length of list
.P1
Syntax:		lenl	src, dst
Function:	dst = 0;
		for(l = src; l; l = tl l)
			dst++;
.P2
The 
.CW "lenl"
instruction computes the number of elements in the list addressed by the 
.CW "src"
operand and stores the result in the 
.CW "dst"
operand.
.SH
load \- Load module
.P1
Syntax:		load	src1, src2, dst
Function:	dst = load src2 src1
.P2
The 
.CW "load"
instruction loads a new module into the heap. The module might optionally be compiled into machine code depending on the module header. The 
.CW "src1"
operand is a pathname to the file containing the object code for the module. The 
.CW "src2"
operand specifies the address
of a linkage descriptor for the module (see below).
A reference to the newly loaded module is stored in the 
.CW "dst"
operand.
If the module could not be loaded for any reason, then 
.CW "dst"
will be set to
.CW H .
.LP
The linkage descriptor referenced by the
.CW src2
operand is a table in data space that lists the functions
imported by the current module from the module to be loaded.
It has the following layout:
.P1
int nentries;
struct {	/* word aligned */
	int	sig;
	byte	name[];	/* UTF encoded name, 0-terminated */
} entry[];
.P2
The
.CW nentries
value gives the number of entries in the table and can be zero.
It is followed by that many linkage entries.
Each entry is aligned on a word boundary; there can therefore
be padding before each structure.
The entry names the imported function in the UTF-encoded string in
.CW name ,
which is terminated by a byte containing zero.
The MD5 hash of the function's type signature is given in the value
.CW sig .
For each entry, the
.CW load
instruction checks that a function with the same name and the same
signature exists in the newly loaded module.
Otherwise the load will fail and
.CW dst
will be set to
.CW H .
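.LP
The layout of the linkage descriptor can be walked as in the Python sketch below. The function name is hypothetical, and little-endian byte order is assumed for the example because words are stored in the host's native representation. The signature is read as an unsigned word for readability.

```python
import struct

def parse_linkage(data):
    """Return a list of (sig, name) pairs from a linkage descriptor held in bytes."""
    (nentries,) = struct.unpack_from("<i", data, 0)
    pos = 4
    entries = []
    for _ in range(nentries):
        pos = (pos + 3) & ~3                  # each entry starts on a word boundary
        (sig,) = struct.unpack_from("<I", data, pos)
        pos += 4
        end = data.index(0, pos)              # the name ends at the first zero byte
        entries.append((sig, data[pos:end].decode("utf-8")))
        pos = end + 1
    return entries
```

A loader built along these lines would then compare each pair against the functions exported by the loaded module and fail the load on any mismatch.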
.LP
The entries in the linkage descriptor form an array of linkage records
(internal to the virtual machine) associated with the
module pointer returned in
.CW dst ,
that is indexed by operators
.CW mframe ,
.CW mcall
and
.CW mspawn
to refer to functions in that module.
The linkage scheme provides a level of indirection that allows
a module to be loaded using any module declaration that is a valid
subset of the implementation module's declaration,
and allows entry points to be added to modules without invalidating
calling modules.
.SH
lsr\fIx\fP \- Logical shift right
.P1
Syntax:		lsrw	src1, src2, dst
		lsrl	src1, src2, dst
Function:	dst = (unsigned)src2 >> src1
.P2
The 
.CW "lsr"
instructions shift the 
.CW "src2"
operand right by the number of bits specified by the 
.CW "src1"
operand, replacing the vacated bits by 0, and store the result in the 
.CW "dst"
operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
This instruction is included for support of languages other than Limbo,
and is not used by the Limbo compiler.
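.LP
The difference between a logical and an arithmetic right shift shows up only for negative operands.
The sketch below models both on Python integers; the function names and the
.CW bits
parameter are illustrative, and the result is reinterpreted as a signed value of the same width.
.P1
```python
def lsr(src1, src2, bits=32):
    """Logical shift right: dst = (unsigned)src2 >> src1."""
    if not 0 <= src1 <= bits:
        raise ValueError("shift count has undefined results")
    mask = (1 << bits) - 1
    result = (src2 & mask) >> src1  # vacated bits are filled with 0
    # Reinterpret the bit pattern as a signed value of the same width.
    if result >> (bits - 1):
        result -= 1 << bits
    return result

def shr(src1, src2, bits=32):
    """Arithmetic shift right, shown for comparison."""
    if not 0 <= src1 <= bits:
        raise ValueError("shift count has undefined results")
    return src2 >> src1  # Python's >> on ints keeps the sign
```
.P2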
.SH
mcall \- Inter-module call
.P1
Syntax:		mcall	src1, src2, src3
Function:	link(src1) = pc
		frame(src1) = fp
		mod(src1) = current_moduleptr
		current_moduleptr = src3->moduleptr
		fp = src1
		pc = src3->links[src2]->pc
.P2
The 
.CW "mcall"
instruction calls a function in another module. The first argument specifies a new frame for the called procedure and must have been built using the 
.CW "mframe"
instruction.
The 
.CW "src3"
operand is a module reference generated by a successful 
.CW "load"
instruction.
The 
.CW "src2"
operand specifies the index for the called
function in the array of linkage records associated with that module reference
(see the
.CW load
instruction).
.SH
mframe \- Allocate inter-module frame
.P1
Syntax:		mframe	src1, src2, dst
Function:	dst = fp + src1->links[src2]->t->size
		initmem(dst, src1->links[src2])
.P2
The
.CW mframe
instruction allocates a new frame for a procedure call into another module. The
.CW src1
operand specifies the location of a module pointer created as the result of a successful load instruction. The
.CW src2
operand specifies the index for the called function in
the array of linkage records associated
with that module pointer (see the
.CW load
instruction).
A pointer to the initialized frame is stored in
.CW dst .
The
.CW src2
operand specifies the linkage number of the function to be called in the module specified by
.CW src1 .
.SH
mnewz \- Allocate object given type from another module
.P1
Syntax:		mnewz	src1, src2, dst
Function:	dst = malloc(src1->types[src2]->size)
		initmem(dst, src1->types[src2]->map)
.P2
The
.CW mnewz
instruction allocates and initializes a new
area of memory.
The
.CW src1
operand specifies the location of a module pointer created as the result of a successful load instruction.
The size of the new memory area and the location of
pointers within it are specified by the
.CW src2
operand, which gives a
type descriptor number within that module.
Space not occupied by pointers is initialized to zero.
A pointer to the initialized object is stored in
.CW dst .
This instruction is not used by Limbo; it was added to implement other languages.
.SH
mod\fIx\fP \- Modulus
.P1
Syntax:		modb	src1, src2, dst
		modw	src1, src2, dst
		modl	src1, src2, dst
Function:	dst = src2 % src1
.P2
The modulus instructions compute the remainder of the 
.CW "src2"
operand divided by the 
.CW "src1"
operand and store the result in 
.CW "dst" .
The operator preserves the condition that the absolute value of a%b is less than the absolute value of 
.CW "b" ; 
.CW "(a/b)*b + a%b"
is always equal to
.CW a .
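.LP
The two properties stated above can be checked directly.
The sketch below assumes that division truncates toward zero, as in C;
the text gives only the two invariants, so that choice is an assumption,
and the helper names are hypothetical.
.P1
```python
def trunc_div(a, b):
    """Integer division truncating toward zero (assumed, as in C)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def trunc_mod(a, b):
    """Remainder that pairs with trunc_div: a - (a/b)*b."""
    return a - trunc_div(a, b) * b

def mod(src1, src2):
    """Model of the mod instructions: dst = src2 % src1."""
    return trunc_mod(src2, src1)
```
.P2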
.SH
mov\fIx\fP \- Move scalar
.P1
Syntax:		movb	src, dst
		movw	src, dst
		movf	src, dst
		movl	src, dst
Function:	dst = src
.P2
The move operators perform assignment. The value specified by the 
.CW "src"
operand is copied to the 
.CW "dst"
operand.
.SH
movm \- Move memory
.P1
Syntax:		movm	src1, src2, dst
Function:	memmove(&dst, &src1, src2)
.P2
The 
.CW "movm"
instruction copies memory from the 
.CW "src1"
operand to the 
.CW "dst"
operand for 
.CW "src2"
bytes. The 
.CW "src1"
and 
.CW "dst"
operands specify the effective address of the memory rather than a pointer to the memory.
.SH
movmp \- Move memory and update reference counts
.P1
Syntax:		movmp	src1, src2, dst
Function:	decmem(&dst, src2)
		memmove(&dst, &src1, src2->size)
		incmem(&src1, src2)
.P2
The 
.CW "movmp"
instruction performs the same function as the 
.CW "movm"
instruction but increments the reference count of pointers contained in the data type. For each pointer specified by the 
.CW "src2"
type descriptor, the corresponding pointer reference count in the destination is decremented. The 
.CW "movmp"
instruction then copies memory from the 
.CW "src1"
operand to the 
.CW "dst"
operand for the number of bytes described by the type descriptor. For each pointer specified by the type descriptor the corresponding pointer reference count in the source is incremented.
.SH
movp \- Move pointer
.P1
Syntax:		movp	src, dst
Function:	destroy(dst)
		dst = src
		incref(src)
.P2
The 
.CW "movp"
instruction copies a pointer, adjusting the reference counts to reflect the new pointers.
.SH
movpc \- Move program counter
.P1
Syntax:		movpc	src, dst
Function:	dst = PC(src);
.P2
The 
.CW "movpc"
instruction computes the actual address of an immediate PC value. The 
.CW "dst"
operand is set to the actual machine address of the instruction addressed by the 
.CW "src"
operand. This instruction must be used to calculate PC values for computed branches.
.SH
mspawn \- Module spawn function
.P1
Syntax:		mspawn	src1, src2, src3
Function:	fork();
		if(child){
			link(src1) = 0
			frame(src1) = 0
			mod(src1) = src3->moduleptr
			current_moduleptr = src3->moduleptr
			fp = src1
			pc = src3->links[src2]->pc
		}
.P2
The 
.CW "mspawn"
instruction creates a new thread, which starts executing a function in another module.
The first argument specifies a new frame for the called procedure and must have been built using the 
.CW "mframe"
instruction.
The 
.CW "src3"
operand is a module reference generated by a successful 
.CW "load"
instruction.
The
.CW "src2"
operand specifies the index for the called function in
the array of linkage records associated with that module reference (see the
.CW load
instruction above).
.SH
mul\fIx\fP \- Multiply
.P1
Syntax:		mulb	src1, src2, dst
		mulw	src1, src2, dst
		mulf	src1, src2, dst
		mull	src1, src2, dst
Function:	dst = src1 * src2
.P2
The
.CW src1
operand is multiplied by the
.CW src2
operand and the product is stored in the
.CW dst
operand.
.SH
nbalt \- Non-blocking alternate
.P1
Syntax:		nbalt	src, dst
.P2
The 
.CW "nbalt"
instruction has the same operands and function as 
.CW "alt" ,
except that if no channel is ready to communicate, the instruction does not block. When no channels are ready, control is transferred to the PC in the last element of the table addressed by
.CW dst .
.SH
negf \- Negate real
.P1
Syntax:		negf	src, dst
Function:	dst = -src
.P2
The floating point value addressed by the 
.CW "src"
operand is negated and stored in the 
.CW "dst"
operand.
.SH
new, newz \- Allocate object
.P1
Syntax:		new	src, dst
		newz	src, dst
Function:	dst = malloc(src->size);
		initmem(dst, src->map);
.P2
The 
.CW "new"
instruction allocates and initializes a new area of memory. The size of the area and the locations of pointers within it are specified by the type descriptor number given as the 
.CW "src"
operand. A pointer to the newly allocated object is placed in 
.CW "dst" .
Any space not occupied by pointers has undefined value.
.LP
The
.CW "newz"
instruction additionally guarantees that all non-pointer values are set to zero.
It is not used by Limbo.
.SH
newa, newaz \- Allocate array
.P1
Syntax:		newa	src1, src2, dst
		newaz	src1, src2, dst
Function:	dst = malloc(src2->size * src1);
		for(i = 0; i < src1; i++)
			initmem(dst + i*src2->size, src2->map);
.P2
The 
.CW "newa"
instruction allocates and initializes an array. The number of elements is specified by the 
.CW "src1"
operand. The type of each element is specified by the type descriptor number given as the 
.CW "src2"
operand.
Space not occupied by pointers has undefined value.
The
.CW newaz
instruction additionally guarantees that all non-pointer values are set to zero;
it is not used by Limbo.
.SH
newc\fIx\fP \- Allocate channel
.P1
Syntax:		newcw	dst
		newcb	dst
		newcl	dst
		newcf	dst
		newcp	dst
		newcm	src, dst
		newcmp	src, dst
Function:	dst = new(Channel)
.P2
The 
.CW "newc"
instruction allocates a new channel of the specified type and stores a reference to the channel in 
.CW "dst" .
For the 
.CW "newcm"
instruction the source specifies the number of bytes of memory used by values sent on the channel (see the
.CW movm
instruction above).
For the 
.CW "newcmp"
instruction the first operand specifies a type descriptor giving the length of the structure and the location of pointers within the structure (see the
.CW movmp
instruction above).
.SH
or\fIx\fP \- Logical OR
.P1
Syntax:		orb	src1, src2, dst
		orw	src1, src2, dst
		orl	src1, src2, dst
Function:	dst = src1 | src2
.P2
These instructions compute the bitwise OR of the two operands addressed by 
.CW "src1"
and 
.CW "src2"
and store the result in the 
.CW "dst"
operand.
.SH
recv \- Receive from channel
.P1
Syntax:		recv	src, dst
Function:	dst = <-src
.P2
The 
.CW "recv"
instruction receives a value from some other thread on the channel specified by the 
.CW "src"
operand. Communication is synchronous, so the calling thread will block until a corresponding 
.CW "send"
or 
.CW "alt"
is performed on the channel. The type of the received value is determined by the channel type and the 
.CW "dst"
operand specifies where to place the received value.
.SH
ret \- Return from function
.P1
Syntax:		ret
Function:	npc = link(fp)
		mod = mod(fp)
		fp = frame(fp)
		pc = npc
.P2
The 
.CW "ret"
instruction returns control to the instruction after the call of the current function.
.SH
send \- Send to channel
.P1
Syntax:		send	src, dst
Function:	dst <-= src
.P2
The 
.CW "send"
instruction sends a value from this thread to some other thread on the channel specified by the 
.CW "dst"
operand. Communication is synchronous, so the calling thread will block until a corresponding 
.CW "recv"
or 
.CW "alt"
is performed on the channel. The type of the sent value is determined by the channel type and the 
.CW "src"
operand specifies where to retrieve the sent value.
.SH
shl\fIx\fP \- Shift left arithmetic
.P1
Syntax:		shlb	src1, src2, dst
		shlw	src1, src2, dst
		shll	src1, src2, dst
Function:	dst = src2 << src1
.P2
The 
.CW "shl"
instructions shift the 
.CW "src2"
operand left by the number of bits specified by the 
.CW "src1"
operand and store the result in the 
.CW "dst"
operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
.SH
shr\fIx\fP \- Shift right arithmetic
.P1
Syntax:		shrb	src1, src2, dst
		shrw	src1, src2, dst
		shrl	src1, src2, dst
Function:	dst = src2 >> src1
.P2
The 
.CW "shr"
instructions shift the 
.CW "src2"
operand right by the number of bits specified by the 
.CW "src1"
operand and store the result in the 
.CW "dst"
operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.
.SH
slicea \- Slice array
.P1
Syntax:		slicea	src1, src2, dst
Function:	dst = dst[src1:src2]
.P2
The 
.CW "slicea"
instruction creates a new array, which contains the elements from the index at 
.CW "src1"
to the index
.CW "src2-1" .
The new array is a reference array which points at the elements in the initial array. The initial array will remain allocated until both arrays are no longer referenced.
.SH
slicec \- Slice string
.P1
Syntax:		slicec	src1, src2, dst
Function:	dst = dst[src1:src2]
.P2
The 
.CW "slicec"
instruction creates a new string, which contains characters from the index at 
.CW "src1"
to the index
.CW "src2-1" .
Unlike 
.CW "slicea" ,
the new string is a copy of the elements from the initial string.
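.LP
The contrast between a reference slice and a copied slice can be modelled loosely in Python.
This is an analogy only: a
.CW memoryview
stands in for the reference array produced by
.CW slicea ,
and an ordinary string slice stands in for the new string produced by
.CW slicec .
The function names mirror the instructions but are otherwise hypothetical.
.P1
```python
def slicea(src1, src2, array):
    """Model of slicea: a view of elements src1..src2-1 sharing storage."""
    return memoryview(array)[src1:src2]

def slicec(src1, src2, string):
    """Model of slicec: a new string holding characters src1..src2-1."""
    return string[src1:src2]

backing = bytearray([10, 20, 30, 40, 50])
view = slicea(1, 4, backing)
view[0] = 99  # writes through to the initial array

text = "hello"
piece = slicec(1, 3, text)  # an independent string value
```
.P2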
.SH
slicela \- Assign to array slice
.P1
Syntax:		slicela	src1, src2, dst
Function:	dst[src2:] = src1
.P2
The 
.CW "src1"
and 
.CW "dst"
operands must be arrays of equal types. The 
.CW "src2"
operand is a non-negative integer index. The 
.CW "src1"
array is assigned to the array slice 
.CW "dst[src2:]" ; 
.CW "src2 + nelem(src1)"
must not exceed 
.CW "nelem(dst)" .
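.LP
The bound on the assignment can be expressed as a short model.
The sketch below uses Python lists for arrays; raising
.CW IndexError
when the bound is violated is an assumption, since the text says only that the bound must hold.
.P1
```python
def slicela(src1, src2, dst):
    """Assign array src1 into dst[src2:], checking the stated bound."""
    if src2 < 0 or src2 + len(src1) > len(dst):
        raise IndexError("src2 + nelem(src1) exceeds nelem(dst)")
    # Overwrite exactly nelem(src1) elements; dst keeps its length.
    dst[src2:src2 + len(src1)] = src1
```
.P2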
.SH
spawn \- Spawn function
.P1
Syntax:		spawn	src, dst
Function:	fork();
		if(child)
			dst(src);
.P2
The 
.CW "spawn"
instruction creates a new thread and calls the function specified by the 
.CW "dst"
operand. The argument frame passed to the thread function is specified by the 
.CW "src"
operand and should have been created by the 
.CW "frame"
instruction.
.SH
sub\fIx\fP \- Subtract
.P1
Syntax:		subb	src1, src2, dst
		subf	src1, src2, dst
		subw	src1, src2, dst
		subl	src1, src2, dst
Function:	dst = src2 - src1
.P2
The 
.CW "sub"
instructions subtract the operand addressed by
.CW "src1"
from the operand addressed by
.CW "src2"
and store the result in the
.CW "dst"
operand. For 
.CW "subb" ,
the result is truncated to eight bits.
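.LP
The operand order and the eight-bit truncation can be illustrated as follows.
This is a sketch: bytes are treated as unsigned, and the wraparound shown for the word form is an assumption that the text does not state.
.P1
```python
def subb(src1, src2):
    """dst = src2 - src1, truncated to eight bits."""
    return (src2 - src1) & 0xFF

def subw(src1, src2):
    """dst = src2 - src1 as a signed 32-bit word (wraparound assumed)."""
    r = (src2 - src1) & 0xFFFFFFFF
    return r - (1 << 32) if r & 0x80000000 else r
```
.P2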
.SH
tail \- Tail of list
.P1
Syntax:		tail	src, dst
Function:	dst = src->next
.P2
The 
.CW "tail"
instruction takes the list specified by the 
.CW "src"
operand and creates a reference to a new list with the head removed, which is stored in the 
.CW "dst"
operand.
.SH
tcmp \- Compare types
.P1
Syntax:		tcmp	src, dst
Function:	if(typeof(src) != typeof(dst))
			error("typecheck");
.P2
The 
.CW "tcmp"
instruction compares the types of the two pointers supplied by the 
.CW "src"
and 
.CW "dst"
operands. The comparison will succeed if the two pointers were created from the same type descriptor or the 
.CW "src"
operand is 
.CW "nil" ;
otherwise, the program raises an error. The 
.CW "dst"
operand must be a valid pointer.
.SH
xor\fIx\fP \- Exclusive OR
.P1
Syntax:		xorb	src1, src2, dst
		xorw	src1, src2, dst
		xorl	src1, src2, dst
Function:	dst = src1 ^ src2
.P2
These instructions compute the bitwise exclusive-OR of the two operands addressed by 
.CW "src1"
and 
.CW "src2"
and store the result in the 
.CW "dst"
operand.
.NH 1
Object File Format
.LP
An object file defines a single module. The file has the following structure:
.P1
Objfile
{
	Header;
	Code_section;
	Type_section;
	Data_section;
	Module_name;
	Link_section;
};
.P2
The following data types are used in the description of the file encoding:
.IP
.TS
lf(CW) lw(4i)fR .
OP	T{
encoded integer operand, encoding selected by the two most significant bits as follows:
.nf
00 signed 7 bits, 1 byte
.br
10 signed 14 bits, 2 bytes
.br
11 signed 30 bits, 4 bytes
T}
B	unsigned byte
W	32 bit signed integer
F	canonicalized 64-bit IEEE754 floating point value
SO	16 bit unsigned small offset from register
SI	16 bit signed immediate value
LO	30 bit signed large offset from register
.TE
.LP
All binary values are encoded in two's complement format, most significant byte first.
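.LP
The
.CW OP
encoding can be sketched as a pair of functions.
The names
.CW decode_op
and
.CW encode_op
are hypothetical.
The sketch treats any first byte whose top bit is clear as the one-byte form,
with bit 6 acting as the sign of the 7-bit value; that reading of `signed 7 bits' is an assumption.
.P1
```python
def decode_op(buf, pos=0):
    """Decode one OP at buf[pos]; return (value, next_pos)."""
    c = buf[pos]
    if (c & 0x80) == 0:
        # One byte: 7-bit two's complement (bit 6 is the sign).
        return (c - 0x80 if c & 0x40 else c), pos + 1
    if (c & 0xC0) == 0x80:
        nbytes, bits = 2, 14
    else:
        nbytes, bits = 4, 30
    # Most significant byte first; drop the two selector bits.
    raw = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if raw >> (bits - 1):
        raw -= 1 << bits  # sign-extend
    return raw, pos + nbytes

def encode_op(value):
    """Encode value in the shortest OP form that can hold it."""
    if -64 <= value < 64:
        return bytes([value & 0x7F])
    if -8192 <= value < 8192:
        return ((value & 0x3FFF) | 0x8000).to_bytes(2, "big")
    if -(1 << 29) <= value < (1 << 29):
        return ((value & 0x3FFFFFFF) | 0xC0000000).to_bytes(4, "big")
    raise ValueError("value does not fit in 30 bits")
```
.P2
.LP
Under these assumptions the value 819248 needs the four-byte form and encodes as the bytes C0 0C 80 30.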
.SH
The Header Section
.P1
Header
{
	OP: magic_number;
	Signature;
	OP: runtime_flag;
	OP: stack_extent;
	OP: code_size;
	OP: data_size;
	OP: type_size;
	OP: link_size;
	OP: entry_pc;
	OP: entry_type;
};
.P2
The magic number is defined as 819248
(symbolically
.CW XMAGIC ),
for modules that have not been signed cryptographically, and 923426
(symbolically
.CW "SMAGIC" ),
for modules that contain a signature.
On the Inferno system, the symbolic names
.CW "XMAGIC"
and
.CW SMAGIC
are defined by the C include file
.CW "/include/isa.h"
and the Limbo module
.CW /module/dis.m .
.LP
The signature field is only present if the magic number is
.CW "SMAGIC" .
It has the form:
.P1
Signature
{
	OP: length;
	array[length] of byte: signature;
};
.P2
A digital signature is defined by a length, followed by an array of untyped bytes.
Data within the signature should identify the signing authority, algorithm, and data to be signed.
.LP
The
.CW runtime_flag
is a bit mask that defines various execution options for a Dis module. The flags currently defined are:
.P1
MUSTCOMPILE	= 1<<0
DONTCOMPILE	= 1<<1
SHAREMP		= 1<<2
.P2
The 
.CW "MUSTCOMPILE"
flag indicates that a 
.CW "load"
instruction should raise an error if the implementation is unable to compile the module into native instructions using a just-in-time compiler.
.LP
The 
.CW "DONTCOMPILE"
flag indicates that the module should not be compiled into native instructions, even though it is the default for the runtime environment. This flag may be set to allow debugging or to save memory.
.LP
The 
.CW "SHAREMP"
flag indicates that all instances of the module should share the same module data. There is no implicit synchronization between threads using the shared data.
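.LP
Since
.CW runtime_flag
is a bit mask, a reader of the header can test each bit independently.
The following sketch uses the three values listed above; the helper
.CW describe_runtime_flag
is a hypothetical name.
.P1
```python
MUSTCOMPILE = 1 << 0
DONTCOMPILE = 1 << 1
SHAREMP = 1 << 2

FLAG_NAMES = {
    MUSTCOMPILE: "MUSTCOMPILE",
    DONTCOMPILE: "DONTCOMPILE",
    SHAREMP: "SHAREMP",
}

def describe_runtime_flag(flag):
    """Return the names of the defined bits set in runtime_flag."""
    return [name for bit, name in FLAG_NAMES.items() if flag & bit]
```
.P2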
.LP
The
.CW stack_extent
value indicates the number of bytes by which the thread stack of this module should be extended in the event that procedure calls exhaust the allocated stack. While stack extension is transparent to programs, increasing this value may improve the efficiency of execution at the expense of using more memory.
.LP
The
.CW code_size
is a count of the number of instructions stored in the Code_section.
.LP
The
.CW data_size
gives the size in bytes of the module's global data, which is initialized
by evaluating the contents of the data section.
.LP
The
.CW type_size
is a count of the number of type descriptors stored in the Type_section.
.LP
The
.CW link_size
is a count of the number of external linkage directives stored in the Link_section.
.LP
The
.CW entry_pc
is an integer index into the instruction stream that is the default entry point for this module. The
.CW entry_pc
should point to the first instruction of a function. Instructions are numbered from a program counter value of zero.
.LP
The
.CW entry_type
is the index of the type descriptor that corresponds to the function entry point set by
.CW entry_pc .
.SH
The Code Section
.LP
The code section describes a sequence of instructions for the virtual machine. An instruction is encoded as follows:
.P1
Instruction
{
	B: opcode;
	B: address_mode;
	Middle_data;
	Source_data;
	Dest_data;
};
.P2
.LP
The
.CW opcode
specifies the instruction to execute, encoded as follows:
.IP
.TS
tab(:);
l l l l l .
00 nop:20 headb:40 mulw:60 blew:80 shrl
01 alt:21 headw:41 mulf:61 bgtw:81 bnel
02 nbalt:22 headp:42 divb:62 bgew:82 bltl
03 goto:23 headf:43 divw:63 beqf:83 blel
04 call:24 headm:44 divf:64 bnef:84 bgtl
05 frame:25 headmp:45 modw:65 bltf:85 bgel
06 spawn:26 tail:46 modb:66 blef:86 beql
07 runt:27 lea:47 andb:67 bgtf:87 cvtlf
08 load:28 indx:48 andw:68 bgef:88 cvtfl
09 mcall:29 movp:49 orb:69 beqc:89 cvtlw
0A mspawn:2A movm:4A orw:6A bnec:8A cvtwl
0B mframe:2B movmp:4B xorb:6B bltc:8B cvtlc
0C ret:2C movb:4C xorw:6C blec:8C cvtcl
0D jmp:2D movw:4D shlb:6D bgtc:8D headl
0E case:2E movf:4E shlw:6E bgec:8E consl
0F exit:2F cvtbw:4F shrb:6F slicea:8F newcl
10 new:30 cvtwb:50 shrw:70 slicela:90 casec
11 newa:31 cvtfw:51 insc:71 slicec:91 indl
12 newcb:32 cvtwf:52 indc:72 indw:92 movpc
13 newcw:33 cvtca:53 addc:73 indf:93 tcmp
14 newcf:34 cvtac:54 lenc:74 indb:94 mnewz
15 newcp:35 cvtwc:55 lena:75 negf:95 cvtrf
16 newcm:36 cvtcw:56 lenl:76 movl:96 cvtfr
17 newcmp:37 cvtfc:57 beqb:77 addl:97 cvtws
18 send:38 cvtcf:58 bneb:78 subl:98 cvtsw
19 recv:39 addb:59 bltb:79 divl:99 lsrw
1A consb:3A addw:5A bleb:7A modl:9A lsrl
1B consw:3B addf:5B bgtb:7B mull:9B eclr
1C consp:3C subb:5C bgeb:7C andl:9C newz
1D consf:3D subw:5D beqw:7D orl:9D newaz
1E consm:3E subf:5E bnew:7E xorl
1F consmp:3F mulb:5F bltw:7F shll
.TE
.LP
The
.CW address_mode
byte specifies the addressing mode of each of the three operands: middle, source and destination. The source and destination operands are encoded by three bits and the middle operand by two bits. The bits are packed as follows:
.P1
bit	 7  6  5  4  3  2  1  0
	m1 m0 s2 s1 s0 d2 d1 d0
.P2
The middle operand is encoded as follows:
.IP
.TS
lf(CW) lf(CW)	lw(3i)fR .
00	\fInone\fP	no middle operand	
01	$SI	small immediate
10	SO(FP)	small offset indirect from FP
11	SO(MP)	small offset indirect from MP
.TE
.LP
The source and destination operands are encoded as follows:
.IP
.TS
lf(CW) lf(CW)	lw(3i)fR .
000	LO(MP)	offset indirect from MP
001	LO(FP)	offset indirect from FP
010	$OP	30 bit immediate
011	\fInone\fP	no operand
100	SO(SO(MP))	double indirect from MP
101	SO(SO(FP))	double indirect from FP
110	\fIreserved\fP
111	\fIreserved\fP
.TE
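.LP
The bit packing and the two tables above can be combined into a small decoder for the
.CW address_mode
byte.
This is a sketch; the function and table names are hypothetical.
.P1
```python
MIDDLE_MODES = {0: "none", 1: "$SI", 2: "SO(FP)", 3: "SO(MP)"}
OPERAND_MODES = {
    0: "LO(MP)", 1: "LO(FP)", 2: "$OP", 3: "none",
    4: "SO(SO(MP))", 5: "SO(SO(FP))", 6: "reserved", 7: "reserved",
}

def split_address_mode(mode):
    """Split the address_mode byte into (middle, source, dest) names."""
    middle = (mode >> 6) & 0x3  # bits 7-6
    source = (mode >> 3) & 0x7  # bits 5-3
    dest = mode & 0x7           # bits 2-0
    return MIDDLE_MODES[middle], OPERAND_MODES[source], OPERAND_MODES[dest]
```
.P2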
.LP
The
.CW middle_data
field is only present if the middle operand specifier of the address_mode is not `none'.
If the field is present it is encoded as an 
.CW "OP" .
.LP
The
.CW source_data
and
.CW dest_data
fields are present only if the corresponding
.CW address_mode
field is not `none'.
For offset indirect and immediate modes the field contains a single 
.CW "OP" .
For double indirect modes the values are encoded as two 
.CW "OP"
values: the first value is the register indirect offset, and the second value is the final indirect offset. The offsets for double indirect addressing cannot be larger than 16 bits.
.SH
The Type Section
.LP
The type section contains type descriptors describing the layout of pointers within data types. The format of each descriptor is:
.P1
Type_descriptor
{
	OP: desc_number;
	OP: size;
	OP: number_ptrs;
	array[number_ptrs] of B: map;
};
.P2
.LP
The
.CW desc_number
is a small integer index used to identify the descriptor to instructions such as 
.CW "new" .
.LP
The 
.CW "size"
field is the size in bytes of the memory described by this type.
.LP
The
.CW number_ptrs
field gives the size in bytes of the 
.CW "map"
array.
.LP
The 
.CW "map"
array is a bit vector where each bit corresponds to a word in memory.
The most significant bit corresponds to the lowest address.
For each bit in the map,
the word at the corresponding offset in the type is a pointer iff the bit is set to 1.
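.LP
The bit ordering of the map can be made concrete with a short sketch that lists the byte offsets of the pointer words.
It assumes 4-byte words, which the text does not state at this point, and
.CW pointer_offsets
is a hypothetical name.
.P1
```python
WORD = 4  # assumed word size in bytes

def pointer_offsets(type_map, word=WORD):
    """Return byte offsets of the words that the map marks as pointers."""
    offsets = []
    for i, byte in enumerate(type_map):
        for bit in range(8):
            # The most significant bit describes the lowest address.
            if byte & (0x80 >> bit):
                offsets.append((i * 8 + bit) * word)
    return offsets
```
.P2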
.SH
The Data Section
.LP
The data section encodes the contents of the
.CW "MP"
data for the module. The section contains a sequence of items; each item contains
a control byte and an offset into the section,
followed by one or more data items.
A control byte of zero marks the end of the data section.
Otherwise, it gives the type of data to be loaded and selects between
two representations of an item:
.P1
Short_item
{
	B: code;
	OP: offset;
	array[code & 16rF] of type[code>>4]: data;
};
.P3
Long_item
{
	B: code;
	OP: count;
	OP: offset;
	array[count] of type[code>>4]: data;
};
.P2
A
.CW Short_item
is generated for 15 or fewer items, otherwise a
.CW "Long_item"
is generated. In a
.CW "Long_item"
the count field (bottom 4 bits of code) is set to zero and the count follows as an
.CW "OP" .
The top 4 bits of code determine the type of the datum.
The defined values are:
.IP
.TS
lf(CW)	lw(3i)f(R) .
0001	8 bit bytes
0010	32 bit words
0011	utf encoded string
0100	real value IEEE754 canonical representation
0101	Array
0110	Set array address
0111	Restore load address
1000	64 bit big
.TE
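.LP
The control byte can be split into its two fields as sketched below.
The notation
.CW 16rF
in the item layout is Limbo's way of writing hexadecimal F.
The function name and the returned tuple shape are hypothetical choices made for illustration.
.P1
```python
DATA_TYPES = {
    1: "byte", 2: "word", 3: "string", 4: "real",
    5: "array", 6: "set array address", 7: "restore load address",
    8: "big",
}

def split_control_byte(code):
    """Return (type_name, count, is_long) for a data-item control byte.

    count is None for a Long_item, where it follows as a separate OP.
    """
    if code == 0:
        return ("end", None, False)  # zero marks the end of the section
    kind = DATA_TYPES[code >> 4]     # top 4 bits select the type
    count = code & 0x0F              # bottom 4 bits hold a short count
    if count == 0:
        return (kind, None, True)
    return (kind, count, False)
```
.P2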
.LP
The byte, word, real and big operands are encoded as sequences
of bytes (of appropriate length) in big-endian form, converted to native
format before being stored in the data space.
The `string' code takes a UTF-encoded sequence of
.CW count
bytes, which is converted to an array of 21-bit Unicode values stored in an
implementation-dependent structure on
the heap; a 4-byte pointer to the string descriptor is stored in the data space.
The `array' code takes two 4-byte operands: the first is the index of the array's type
descriptor in the type section; the second is the length of the array to be created.
The result in memory is a 4-byte pointer to an implementation-dependent
array descriptor in the heap.
.LP
Each item's data is stored at the address formed by adding the
.CW offset
in that item to a base address maintained by the loader.
Initially that address is the base of the data space of the module instance.
A new base for loading subsequent items can be set or restored by
the following operations, used to initialize arrays.
The `set array address' item must appear immediately following an `array'
item.
Its operand is a 4-byte big-endian integer that gives an index into that
array, at which address subsequent data should be loaded; the
previous load address is stacked internally.
Subsequent data will be loaded at offsets from the new base address.
The `restore load address' item has no operands; it pops a load address
from the internal address stack and makes that the new
base address.
.SH
The Module Name
.LP
The module name immediately follows the data section.
It contains the name of the implementation module, in UTF encoding,
terminated by a zero byte.
.SH
The Link Section
.LP
The link section contains an array of external linkage items:
the list of functions exported by this module.
Each item describes one exported function in the following form:
.P1
Linkage_item
{
	OP: pc;
	OP: desc_number;
	W: sig;
	array[] of byte: name;
};
.P2
The
.CW pc
is the instruction number of the function's entry point.
The
.CW desc_number
is the index, in the type section, of the type descriptor for the function's stack frame.
The
.CW sig
word is a 32-bit hash of the function's type signature.
Finally,
the name of the function is stored as a variable length array of bytes
in UTF-8 encoding,
with the end of the array marked by a zero byte.
The names of member functions of an exported adt are qualified
by the name of the adt.
The next linkage item, if any, follows immediately.
.NH 1
Symbol Table File Format
.LP
The object file format does not include type information for debuggers.
The Limbo compiler can optionally produce a separate symbol table file.
Its format is defined in the entry
.I sbl (6)
of [1].
.NH 1
References
.IP 1.
.I "Inferno Programmer's Manual"
(Third Edition),
Volume 1 (`the manual'),
Vita Nuova Holdings Limited, June 2000.
.IP 2.
P Winterbottom and R Pike,
``The Design of the Inferno Virtual Machine'',
reprinted in this volume.