The Taos Operating System (1991)

{This material originally appeared in the March, 1991 International Edition of Byte magazine, a McGraw-Hill publication} 


To describe TAOS (pronounced 'dowse' as in the oriental religion Daoism) as an

innovatory operating system would be a profound understatement. It has so few

points of similarity with operating systems like MS-DOS that many observers,

myself included, have initially had trouble believing in it at all. For one

thing TAOS is strongly Object Oriented and makes almost no distinction between

data files and executable programs. It brings together many advanced ideas

about Persistent Objects, Object Oriented Virtual Memory and Dataflow

programming into one wholly novel system. It is a parallel operating system

which configures itself automatically to run on a network of distributed

processors and load-balances itself. More incredibly still, it promises to run

exactly the same code on different processors in a heterogeneous network (eg.

a mixture of Inmos Transputers, Intel 80386 and Acorn ARMs); it offers CPU

transparency.

Before proceeding I must stress that TAOS is not finished yet and there are no

commercial ports of it in use. Some of the radical features mentioned above

are only partially implemented at the time of writing. This article is

therefore more of a preview of the concepts in TAOS than a full 'review'.

TAOS is being developed by Tao Systems UK Ltd., a small firm that has existed

for about 9 months, and whose leading spokesmen are Nick Spicer and Chris

Hinsley, two experienced programmers from very different backgrounds. Spicer

was a founder member of the European arm of Microway, the US firm which

specialises in high-performance floating point hardware and more recently in

Transputer products. Spicer was Technical Manager and systems programmer.

Hinsley on the other hand is a veteran games programmer, an assembler wizard

who has been responsible for successful Atari ST and Amiga games like

'Verminator' and 'Onslaught'. Though writing exclusively in assembler, Hinsley

developed an Object Oriented style for writing games based on extensive macro

libraries, and this provided the original inspiration for TAOS. Spicer added

the parallel aspects to TAOS, based on his experience with Transputers and his

dissatisfaction with Occam. 


VP CODE

Perhaps the most important thing to understand about TAOS, before even

discussing the way it works, is that it is designed to execute a virtual

machine language called VP (for Virtual Processor). VP is the binary code of

an imaginary RISC-like processor with about 32 instructions, designed by Tao

Systems. In order to execute VP on a real processor, a translator will be

provided from VP into the native machine code of the target processor.

The ingenious part is that the translation takes place at program load time,

not during runtime as some erroneous reports have stated. This is NOT an

interpreted pseudo-code as used in the UCSD P-System or POP-11. When a program

module is loaded onto a Transputer node it will get turned into Transputer

code before it is run, whereas if it gets loaded onto an 80386 it will be

turned into Intel code. 
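
The difference between load-time translation and interpretation can be shown
with a toy sketch in Python. Everything in it is an assumption made for
illustration: the three VP opcodes, the module format and the per-target
output strings are invented, and the output is placeholder text rather than
valid Transputer or Intel assembly. The point is only that each module is
translated once, for whichever processor it lands on, before it runs.

```python
# Toy illustration of load-time translation. NOT real VP and NOT real
# native code: opcodes and output templates are invented placeholders.

# Hypothetical VP module: a list of (opcode, operand, ...) tuples.
VP_MODULE = [("LOAD", "r0", 5), ("ADD", "r0", 1), ("RET",)]

# Hypothetical per-target tables: VP opcode -> placeholder native template.
TRANSLATORS = {
    "t800": {"LOAD": "ldc {1}; stl {0}",
             "ADD": "ldl {0}; adc {1}; stl {0}",
             "RET": "ret"},
    "i386": {"LOAD": "mov {0}, {1}",
             "ADD": "add {0}, {1}",
             "RET": "ret"},
}


def load_module(vp_code, target):
    """Translate a whole VP module once, at load time, for one target."""
    table = TRANSLATORS[target]
    return [table[op].format(*args) for op, *args in vp_code]


# The same VP module yields different "native" text on each node type.
native_t800 = load_module(VP_MODULE, "t800")
native_i386 = load_module(VP_MODULE, "i386")
```

After load_module returns, nothing in the running program refers to VP any
more, which is what distinguishes this scheme from an interpreted p-code.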

Tao Systems is currently committed to produce translators for the Inmos T800

transputer, Motorola 680x0, PgC7600 (see article in this issue), Intel

80386/80486, Acorn ARM and Sun SPARC. Of these the first three have been

written. The TAOS program loader, called ALEX, is part of the kernel that runs

on each node in a parallel computer, and as ALEX loads each program module it

performs the native code translation and optimization. Because TAOS is

extremely modular in nature and employs a novel late binding scheme, the size

of each module to be translated will be very small, and since all the target

processors are powerful CPUs the time taken up in translation should be quite

small. ALEX is not yet completed and the prototype TAOS systems which have

been demonstrated employ a conventional VP assembler to produce executable

native code. Tao Systems seems confident that the design of the VP instruction

set is such that writing the individual translators should not be too

difficult, even to such un-RISC like processors as the 80486.

So the first point about TAOS is that everything in it (apart from the native

code kernels at each node) consists of VP code, including all the applications.

A parallel TAOS program consists of a number of tool modules which you can run

on any mixture of the supported target processors without even knowing what

they are. If TAOS becomes established as an operating system then compilers

from C and other languages into VP will undoubtedly emerge.

                            

OBJECTS

Everything in TAOS is an object, and as in traditional OOPs, objects contain

both code and data. Under TAOS the code part is called a 'tool', equivalent to

a 'method' in conventional OOPs. Even what would be considered data files

under a conventional OS are objects which have tools that can for example

display their name, and check the access rights of a user.

All the executable code in TAOS is contained in tools. Tools must be fully re-

entrant, relocatable and side-effect free and cannot therefore contain static

data; so TAOS groups data and code together into objects, but rigorously

segregates code from data within an object. Tools are the smallest units of

executable code under TAOS so you can think of them as being like subroutines,

or C functions, or Pascal procedures. However tools are independently

executable so a better analogy would be with Forth, where words can always be

executed directly without having to call them from a main program.

Some tools are called 'permanent tools' and get loaded up by the boot system;

these form the heart of the operating system itself. Some tools are 'semi-

permanent' and these are loaded at boot time by a user-created configuration

process, but once loaded they become part of the operating system and

cannot be de-activated. These tools are roughly equivalent to the device

drivers and extensions loaded by CONFIG.SYS and AUTOEXEC.BAT under PC-DOS.

Finally there are 'application tools' which are the user executable programs

of TAOS. Application tools can be 'virtual' or 'non-virtual'. Non-virtual

tools are loaded with the application and remain in memory until it is de-

activated (ie. quits). Virtual tools only get loaded into memory when the

application calls them, and when they cease to execute they can be marked as

'freeable' and the space they occupy can be reclaimed for use by another tool.

Virtual tools are quite similar in concept to Borland's VROOM system or to

OS/2's Dynamic Link Libraries. Virtual and non-Virtual tools are collectively

referred to as 'external' tools and they have names that resemble PC-DOS or

Unix pathnames by which they are accessed. They are normally cached so that a

subsequent access does not need to reload them from disk, unless memory is in

very short supply. An application in TAOS thus consists largely of named

references to external tools, allowing a great degree of reuse of code, and

keeping the individual executable units very small.
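
The life cycle of a virtual tool (load on first call, mark as freeable when it
finishes, keep it cached unless memory runs short) can be sketched in a few
lines of Python. The class name ToolCache, its methods and the pathname
'tools/sort' are all hypothetical; TAOS itself does this inside the kernel and
I have not seen the actual data structures.

```python
class ToolCache:
    """Hypothetical cache of external tools, keyed by pathname."""

    def __init__(self, loader):
        self.loader = loader    # callable: pathname -> tool (stands in for disk)
        self.loaded = {}        # pathname -> tool currently in memory
        self.freeable = set()   # virtual tools whose space may be reclaimed
        self.disk_loads = 0     # how many times the "disk" was touched

    def call(self, path, virtual=True):
        """Load the tool on first use, then reuse the cached copy."""
        if path not in self.loaded:
            self.loaded[path] = self.loader(path)
            self.disk_loads += 1
        self.freeable.discard(path)     # the tool is in use again
        result = self.loaded[path]()
        if virtual:
            self.freeable.add(path)     # finished: space may be reclaimed
        return result

    def reclaim(self):
        """Called when memory is short: drop every freeable tool."""
        freed = len(self.freeable)
        for path in self.freeable:
            del self.loaded[path]
        self.freeable.clear()
        return freed


cache = ToolCache(loader=lambda path: (lambda: "ran " + path))
cache.call("tools/sort")
cache.call("tools/sort")            # served from the cache, no second load
loads_before_reclaim = cache.disk_loads
cache.reclaim()                     # simulate a memory shortage
cache.call("tools/sort")            # has to be reloaded from "disk"
```

A non-virtual tool would simply be called with virtual=False, so that it is
never placed on the freeable list while its application is active.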

As well as external tools TAOS has 'library' tools and 'local' tools.

Library tools are kept in named libraries and can be shared by many

applications; like external tools they are cached for future use. Local

tools are part of the actual code of an application and have single unique

names, not pathnames. A local tool always overrides any external tool with the

same name. 

All TAOS programs are started up by a 'control object'. A control object is

primarily a data object, a sort of template, which contains all the

information necessary to load and run the tools which make up the program, but

it also has code which sets the loading of tools in motion. A control object

roughly corresponds to a task or process in other operating systems. It will

contain a list of the pathnames of all its component tools, the stack space

each one requires and a bit-mask which defines what kind of messages it will

accept. Control objects are executed by the permanent TAOS kernel which

resides on each node of a parallel system, which then loads and runs the

necessary tools. Control objects can spawn children, and it is by this means

that TAOS programs distribute themselves around a parallel processor network.

Control objects can be inactivated without being removed from memory, and may

then be woken by other control objects sending them mail messages.
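
As a rough picture of what a control object holds, here is a minimal Python
sketch. The field names, the 4096 stack figure and the rule that any mail
wakes the receiver are my own assumptions for illustration; the article's
sources only establish that the template lists tool pathnames, stack sizes
and a message bit-mask, and that objects can spawn children and be woken by
mail.

```python
from dataclasses import dataclass, field


@dataclass
class ControlObject:
    """Hypothetical template describing one TAOS program."""
    name: str
    tool_stack: dict        # tool pathname -> stack space it requires
    accept_mask: int        # bit-mask of the message kinds it accepts
    active: bool = True
    mailbox: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def spawn(self, name):
        """Create a child built from a copy of this object's template."""
        child = ControlObject(name, dict(self.tool_stack), self.accept_mask)
        self.children.append(child)
        return child

    def inactivate(self):
        self.active = False     # stays in memory, just stops running

    def send(self, other, body):
        other.mailbox.append((self.name, body))
        other.active = True     # assumption: mail always wakes the receiver


parent = ControlObject("mandel", {"tools/plot": 4096}, accept_mask=0b0110)
child = parent.spawn("mandel.0")
child.inactivate()
parent.send(child, "start")     # the sleeping child is woken by mail
```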

One way of looking at the TAOS tool system is that all programs employ very

late, ie. loadtime or runtime, binding. Programs assembled or compiled under

TAOS will not need to be linked, for linkage occurs when a program is run,

under the control of its control object. You can recompile a single tool

without recompiling the whole application, and even design applications that

load different tools according to the runtime environment they encounter.


MAILBOXES

TAOS is a message passing operating system which uses 'mailboxes' for

communication between processes. All program I/O takes place through the mail

system, as does the distribution of executable code. Every control object is

automatically given a mailbox by TAOS, and can send or receive mail from any

other object whose mailing address it knows. These will always include its own

parents and children and any named resources like disk drives and VDU displays,

and may also include its siblings if measures are taken to record their

addresses when they are created. The kernel on each TAOS node has a permanent

Mail Guardian tool which handles all incoming mail for local objects on that

node, all outgoing mail, and mail which is to be forwarded to another node. 

The format of a TAOS mail message includes a header that contains the size and

type of a message. There are 16 predefined message types including system,

executable code, error data, and debugging data and the programmer can use

numbered types up to 255 to make private communications between two objects.

The type mask allows the Mail Guardian to trap system messages such as

executable code destined for the kernel, and it also allows user objects to

prioritize the way they read their waiting mail by type.
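
Reading mail by type amounts to testing one bit per message type. The Python
sketch below assumes a header of just two fields and assigns type numbers
arbitrarily (1 for executable code, 200 for a private type); the real header
layout and the real numbering of the 16 predefined types are not known to me.

```python
from dataclasses import dataclass


@dataclass
class MailHeader:
    size: int
    msg_type: int   # 0-15 predefined, up to 255 for private use (per the text)


def type_mask(*types):
    """Build a bit mask with one bit set per wanted message type."""
    mask = 0
    for t in types:
        if not 0 <= t <= 255:
            raise ValueError("message type out of range")
        mask |= 1 << t
    return mask


def read_by_type(mailbox, mask):
    """Remove and return the first waiting message whose type is in mask."""
    for i, (header, _body) in enumerate(mailbox):
        if mask & (1 << header.msg_type):
            return mailbox.pop(i)
    return None


# Hypothetical numbering: 1 = executable code, 200 = a private type.
mailbox = [(MailHeader(3, 200), b"abc"), (MailHeader(2, 1), b"vp")]
first = read_by_type(mailbox, type_mask(1))     # skips the private message
```

An object that wants its private traffic handled last would simply read with
a mask of the system types first and fall back to a wider mask afterwards.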

The header also contains the addresses of the originator and destination of the

message, of any other intended recipients for forwarding, and where a reply (if

one is requested) is to be sent. Mail can be sent to an inactivated object,

and may re-activate it as a child of the sender object so it can read the

message and respond immediately. In other circumstances such a message may

just sit in the inactive object's mailbox until some other event wakens it. If

such an inactive object gets removed due to a shortage of free memory, any

pending mail is stored in the object's filefolder (ie. is written to disk) and

re-sent next time the object is loaded. This might also happen when the TAOS

system is closed down, so that messages as well as objects can be persistent.

TAOS dynamically allocates the buffers that receive mail messages. In the

event that there is not enough memory to create the buffer, the Guardian reads

the header, discards the message data and replies to the sender that the

message has been trashed. The sender can then retry using a different route. 

 

FILEFOLDERS

TAOS stores objects onto mass storage media like hard disks via the mailbox

system, in objects called 'filefolders'. A filefolder has some similarities to

a DOS or Unix directory, but rather than being a passive storage structure it

is an active object (in fact a control object). The tools attached to every

filefolder are responsible for storing and retrieving objects from the folder

and for negotiating with the hardware device drivers that are necessary for

this transfer. Because filefolders are actually instances of a broader category

of TAOS object called 'filters' they may also process the data they transfer,

for example compressing and expanding it transparently to the user.

I mentioned above that TAOS tools have pathnames like those used to identify

files in a hierarchical directory system such as DOS. Because TAOS cannot

permit the duplication of files that DOS allows, pathnames are actually

handled rather differently. Where DOS relies on the PATH command and the

user's own knowledge of the directory tree and its contents, TAOS must

always be able automatically to locate objects. The pathname of a TAOS object

IS the object name, rather than just a qualifying prefix indicating its

current location. Therefore TAOS has built-in defaults related to the object

type; tools are expected by default to be in the 'tools' filefolder. So if you

request 'mystuff' TAOS will look for 'tools/mystuff' unless you explicitly

specify another filefolder. The path 'tools/mystuff' is actually a message to

the control object 'tools' requesting it to retrieve an object called

'mystuff'.

In large or multi-user systems there may be multiple server devices, and so

multiple 'tools' folders. In this case TAOS adds the mailbox ID of a

'master server' to the path, roughly equivalent to a DOS drive; for example

'/dicks_server:/tools/mystuff'. All future objects will be retrieved from this

server until you explicitly change it.
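
The naming rules just described (a default filefolder per object type, and an
optional master server prefix) can be sketched as a small Python function.
The function name, the returned triple and the idea of passing the current
server as an argument are assumptions of mine; only the 'tools' default and
the example paths come from Tao Systems.

```python
# Per-type default folders. Only the 'tools' entry is taken from the
# article; the table itself is a hypothetical structure.
DEFAULT_FOLDER = {"tool": "tools"}


def resolve(name, obj_type="tool", current_server=None):
    """Expand an object name into a (server, folder, object) triple."""
    server = current_server
    if ":" in name:                     # e.g. '/dicks_server:/tools/mystuff'
        server_part, name = name.split(":", 1)
        server = server_part.strip("/")
    parts = [p for p in name.split("/") if p]
    if len(parts) == 1:                 # bare name: use the type's default
        parts.insert(0, DEFAULT_FOLDER[obj_type])
    folder, obj = "/".join(parts[:-1]), parts[-1]
    return server, folder, obj


bare = resolve("mystuff")
served = resolve("/dicks_server:/tools/mystuff")
# The chosen server stays in force for later requests until changed.
later = resolve("mystuff", current_server=served[0])
```

In TAOS itself the folder part is not looked up in a table but is delivered
as a message to the filefolder object of that name, which does the retrieval.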

Filters in general are control objects that receive input from a mailbox, and a

parameter string, and send output to another mailbox. The destination address

for the output is contained in the input message as explained above. The

parameter string (which may be null) may tell the filter which of several

actions to perform on its input. Filefolders are just one type of filter. In

fact all TAOS applications constructed from TaoScript commands (see later) are

just pipelines of filters running in parallel and passing data from one to the

other. TAOS is a dataflow operating system, extending the Unix pipe concept to

its logical conclusion.
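
A pipeline of filters can be imitated in Python with threads and queues
standing in for control objects and mailboxes. This is only a sketch under
stated assumptions: the DONE marker, the (data, route) message shape and the
start_filter helper are invented here, and real TAOS filters would run as
separate objects, possibly on separate processors.

```python
import queue
import threading

DONE = object()     # end-of-stream marker (an assumption of this sketch)


def start_filter(action, param, inbox):
    """Run one filter: read (data, route) messages from its mailbox,
    apply action(data, param), and mail the result to the next stop."""
    def body():
        while True:
            msg = inbox.get()
            if msg is DONE:
                return
            data, route = msg
            # The destination travels inside the message itself.
            route[0].put((action(data, param), route[1:]))
    thread = threading.Thread(target=body)
    thread.start()
    return thread


upper_in, suffix_in, sink = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    start_filter(lambda d, p: d.upper(), "", upper_in),
    start_filter(lambda d, p: d + p, "!", suffix_in),
]

# Both filters are already running before any data is sent.
upper_in.put(("hello", [suffix_in, sink]))
result, _ = sink.get(timeout=5)

for q in (upper_in, suffix_in):
    q.put(DONE)
for t in threads:
    t.join()
```

Neither filter knows anything about the other; the shape of the pipeline is
carried entirely by the addresses in the message.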


PARALLEL PROCESSING

Tao Systems is seeking a patent on the algorithm used to distribute parallel

programs over a network of processors, and so I am not allowed to describe it

in detail. It's basically a smarter enhancement of the flood-fill type of

algorithm, which can take into account the loading of each processor node as

measured by the number of processes running on it. An application like a

Mandelbrot plot, which requires a large number of identical processes, will get

distributed in a manner which resembles water running down a mountain, where

the flow seeks out the 'gullies' or lowest points. Tao Systems claims that this

mechanism achieves a high degree of automatic load balancing.

Very crudely, each node knows the current loading of all its nearest neighbors

and any control object can spawn children onto the neighbor that is least

busy, using a kernel call which invokes the mail system to transfer the

executable VP code. Deadlock due to circular network paths is avoided by an

incrementally created routing table that remembers which paths have already

been traversed. This is the part of the algorithm which is the subject of the

patent, and it ought to work for any network topology at all. The routing

table allows messages to find their destination mailboxes. The routing

algorithm may, hardware permitting, generate multiple paths between two nodes

which can be used as alternatives when mail gets blocked due to a shortage of

buffer memory or a hardware failure.
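
The 'water running downhill' picture can be illustrated with a deliberately
naive Python sketch. It is emphatically not Tao Systems' patented algorithm,
which I cannot describe: it has no routing table, no deadlock avoidance and
no mail transfer. It only shows the crude rule given above, that each new
process moves to a less busy nearest neighbor until none is less busy. The
four-node line topology and all names are invented.

```python
def spawn_all(topology, start, count):
    """Place `count` processes one at a time. Each starts at `start` and
    keeps moving to the least-loaded nearest neighbor while that neighbor
    is strictly less busy than the node it is currently on."""
    load = {node: 0 for node in topology}
    for _ in range(count):
        node = start
        while True:
            # A node only knows its own load and that of its neighbors.
            best = min([node] + topology[node], key=lambda n: load[n])
            if load[best] >= load[node]:
                break           # no neighbor is less busy: settle here
            node = best         # flow "downhill" to the less busy node
        load[node] += 1
    return load


# Hypothetical network: four nodes connected in a line, a - b - c - d.
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
load = spawn_all(line, "a", 8)
```

Because every move goes to a strictly less loaded node, each placement is
guaranteed to terminate. With purely local knowledge the result is not
perfectly even, which suggests why the real algorithm needs to be smarter.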

When necessary you can eschew the automatic mechanism and specify the

particular node that an object is to run on. You can also invoke a partially

automatic mechanism in which you explicitly send a number of objects to a node

but let TAOS distribute them from there. For example you could specify that a

1000 process Mandelbrot calculation should be run by sending 10 groups of 100

objects to different nodes of the network but letting TAOS complete the

distribution. The TAOS kernel contains a time-slice scheduler that allows many

objects to run on the same processor. Box 1 lists the TAOS Kernel Calls

(similar to DOS BIOS interrupts), and if you look at the entries under Control

Object Management whose names begin with OPEN you will get some idea of how

distribution is achieved.

From a programmer's point of view remote interprocess communication is

transparent under TAOS. You only need to know the mailbox address of a remote

object, and then TAOS will route messages to it without you specifying the

route. Tao Systems claims that near optimal routings are usually achieved.


TAOSCRIPT

You interact with TAOS via a job control language editor/interpreter called

TaoScribe which processes a language called TaoScript. This is a dataflow

language which describes the flow of data through pipelines of tools specified

by their paths. For example :-


/data2 < proc2 < proc1 < /data1


would pass the data object 'data1' through the two tools 'proc1' and 'proc2'

and then store the output into data object 'data2'. Tools 1 and 2 will be 

executed concurrently, possibly on different processors, and both will be

started up before any data is transferred.


/hd/data2 /hd/data3 > proc3


would send both data streams into the same tool, while :-


/dtp_stuff/chapter2 "parameter" > tformat


sends the parameter string as well as the data into a 'tformat' tool. The only

operators in TaoScript are /, < and > but these can be used in combination to

create forks and joins as well as linear pipes. The resulting programs are in

my opinion difficult to read though. There are no control constructs such as

IF..THEN or DO..WHILE in TaoScript because all such control flow is governed

inside the tools themselves, by interpreting parameter strings. So :-


/data2 "TRUE, FALSE" > proc3


might tell proc3 to send its output to one of the objects TRUE or FALSE

according to the value of /data2. Groups of TaoScript commands can be saved

as named objects and used as single commands, just like DOS batch files. The

ordering of commands within a TaoScript program is very often irrelevant since

the tools are all running in parallel and the dataflow governs the order of

their execution.
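
To make the direction of the two arrow operators concrete, here is a small
Python sketch that splits a linear TaoScript command into stages in dataflow
order. It is my own reading of the examples above, not TaoScribe's parser:
the function name is invented, forks and joins are rejected, and an arrow
character inside a quoted parameter string would confuse it.

```python
import shlex


def dataflow_order(command):
    """Return the stages of a linear TaoScript pipeline in the order the
    data flows. Each stage is a list of tokens (objects, a tool name, or
    parameter strings with their quotes removed)."""
    if "<" in command and ">" in command:
        raise ValueError("forks and joins are not handled by this sketch")
    sep = "<" if "<" in command else ">"
    stages = [shlex.split(part) for part in command.split(sep)]
    if sep == "<":
        stages.reverse()    # '<' pipelines are written sink-first
    return stages


leftward = dataflow_order("/data2 < proc2 < proc1 < /data1")
rightward = dataflow_order('/dtp_stuff/chapter2 "parameter" > tformat')
```

In the first command the data starts at '/data1' on the right, whereas in
the second it starts on the left, so the same tool order can be written in
either direction.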


THE GUI

There is also a simple Graphical User Interface built into TAOS. Any control

object can create a window object, which uses a buffer local to the object for

maximum speed. These windows are sensitive to mouse events such as button

clicks and dragging. Control objects do not need to worry about window

movement or overlapping and refreshing which are handled transparently by the

GUI supervisor, but they can receive mouse X,Y and time data from button

clicks.

Only one window, the Input Window, can receive user input at any one time and

this window is indicated visually by a change of border color and by

automatically being brought to the top of the pile. An object can send

messages to its window to define 2D screen areas as Area Event triggers; the

GUI monitors such areas and sends an Area Event back to the object if the

mouse pointer enters the area. Areas can range from one pixel to the whole

window. The GUI can also monitor areas by sending a continuous stream of X,Y

data whenever the pointer is in the area. These simple facilities allow you to

implement pop-up menus and dialog boxes.
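
The Area Event mechanism is easy to picture as rectangle hit-testing with a
memory of where the pointer was last. In the Python sketch below the class
names, the rectangle representation and the pointer_moved method are all
hypothetical; in TAOS the object and its window would exchange these details
as mail messages.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class Window:
    """Hypothetical window that reports when the pointer enters an area."""

    def __init__(self):
        self.areas = []
        self.inside = set()     # names of areas the pointer is now in

    def define_area(self, area):
        self.areas.append(area)

    def pointer_moved(self, px, py):
        """Return Area Events for areas the pointer has just entered."""
        now = {a.name for a in self.areas if a.contains(px, py)}
        events = sorted(now - self.inside)
        self.inside = now
        return events


win = Window()
win.define_area(Area("menu", 0, 0, 50, 10))
entered = win.pointer_moved(5, 5)       # pointer enters the menu area
still_in = win.pointer_moved(6, 5)      # no new event while it stays inside
win.pointer_moved(100, 100)             # pointer leaves
re_entered = win.pointer_moved(1, 1)    # entering again raises a new event
```

A pop-up menu then needs little more than one area per menu item and a tool
that reacts to the event it receives.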

The GUI supports 8-bit color palettes, of which the first 8 colors are defined

as EGA compatible defaults.


CONCLUSIONS

I have seen prototype versions of TAOS running on 5 or 6 Inmos Transputers in

a PC host. The GUI is lightning fast and the demonstrations of multiple

windowed Mandelbrot plots are very impressive indeed. However I have had no

direct experience of programming under it, nor do I have complete

documentation. Certain parts of TAOS, including the ALEX language translators,

were not complete when this article went to press and some aspects of the

design are still undergoing evolution. Tao Systems' claims for the efficiency

of parallel program distribution and message routing are very strong and

will need to be independently tested by some parallel computer vendors.

There is no doubt though that TAOS brings together many clever ideas in an

elegant way. The philosophy it embodies is right on target for the next

generation of operating systems, with its emphasis on persistent objects,

dataflow and re-usable modular tools. The VP system for supporting multiple

processor types is so ingenious that you wonder why other operating system vendors

have not tried it, until you reflect that it is only really viable for a

starting-from-scratch OS; the effort of re-writing all PC, or Mac or Unix

software in VP would clearly be prohibitive.

Tao Systems is now busy demonstrating TAOS to a number of big names in parallel

processing, including some from Japan, and I shall be following the progress of

TAOS with great interest.


Dick Pountain, Nov 1990

-------------------------------------------------------------------------------

BOX 1.

     

The TAOS Kernel Calls listed in functional groups:-


Memory Management

ALLOCFAST       allocate memory block from fast memory pool

ALLOCMIN        allocate memory block given minimum size required

ALLOCMAX        allocate memory block given maximum size required

FREEMEM         free memory block


Object Memory Management

COPYNODE        copy object

COPYHEAD        copy object to head of linked list

COPYTAIL        copy object to tail of linked list

DUMPLIST        free memory of all nodes on a list

FREELIST        UNALEX and free memory of all nodes on a list


Mailbox Management

SENDMAIL        Send mail message

COPYMAIL        Copy mail message then send copy

READMAIL        Read mailbox


Control Object Management

STARTCONTROL    Start a control object locally

OPENCONTROL     Copy a control object and start locally

OPENCHILD       Distribute and start a control object in network

CLOSECONTROL    Close a local control object

OPENARRAY       Distribute and open a number of control objects

                 in the network

OPENFARM        Distribute and open multiple copies of a control

                 object in the network

OPENDEVICE      Transport a control object to a specified network

                 node and start

OPENREMOTE      Transport a control object to a specified network

                 node for distribution from that node, then start


Tool Object Management

VCALL           Virtual Call tool object

FINDTOOL        Enquire if tool is available locally

OPENTOOL        Request tool load

FLUSHPETL       Flush un-referenced tools from Permanent and

                 External Tool List


General Object Management

VADDR           Obtain address of an embedded object

OBJPROC         Process an object in same thread

LISTPROC        Process a linked list of objects in same thread

LISTTEST        List enquiry for types of objects

LISTINFO        Enquire general list information


Global Variable Management

DECLARE         Declare named 64 bit integer value to child control

                 objects

ENQUIRE         Obtain 64 bit value for declared variable name from

                 family tree

UNDECLARE       Locally remove last instance of a named variable

                 declaration from declaration stack


Processor Type Node Identification

FINDTYPE        Enquire processor ID of processor node conforming

                 to specified processor type and minimum memory

                 requirement


Local Timer Functions

GETTIME         Obtain local processor timer count

DELAY           Timed deschedule


Template Object Management

ALEX            Convert template to Process Ready

UNALEX          Invalidate Process Ready status of object