
Parallel Course (July 1994)

{This material originally appeared in the July, 1994 issue of Byte magazine, a McGraw-Hill publication} 

The Taos operating system uses objects from the ground up to enable
processors based on different architectures to work together on the
same problem

Nowadays many computers can generate pretty ray-traced images, and
some of them can even do it fast. What impressed me about the
demonstration I was witnessing was that it was running on a no-name
486 PC clone with some co-processor boards in it, not a Silicon
Graphics workstation. On the other hand parallel-processing
accelerator boards for PCs are not exactly new - no, what made the
event REALLY special was that the PC contained an Intel 486 CPU, four
Inmos T800 transputers and four MIPS R3000s, and it was running the
same binary ray-tracing program in parallel on all of them at once.
What made this feat possible was Taos, a radically different,
object-oriented parallel operating system.

Most operating systems are created either by large hardware
manufacturers or by university researchers, but Taos came from neither
- it's the product of a devoted group of enthusiasts with an idea that
was well ahead of its time. The principal architect of Taos is Chris
Hinsley, who was a professional games programmer with hit titles for
the Atari ST and the Commodore Amiga to his credit. Although writing
solely in assembler, Hinsley devised his own object-oriented
development style based on macros, which sparked the original idea for
Taos. (Incidentally, you pronounce its name 'dowse', like the Chinese
religion rather than the New Mexico town.) Fired by the launch of
Inmos's Transputer, Hinsley wanted to create a real-time operating
system that could harness the parallel-processing power he believed
would be needed for future multimedia systems.

When I first wrote about Taos in Byte International some four years
ago it seemed outrageously far removed from the mainstream, but the
rest of the commercial operating system world is catching up. Everyone
wants a micro-kernel now, but Taos is already a nano-kernel system,
with its tiny 12 Kbyte kernel running on each processor in a parallel
network. Taligent promises objects-from-the-ground-up with dynamic
binding; Taos has had them from the start. However, Taos doesn't
really aspire to mainstream desktop status; it is rather a
fast-and-skinny system for embedded applications, and Tao Systems is
now promoting it into the multimedia and games console markets.

By far the most radical aspect of Taos is its hardware independence.
Taos programs are all written in the machine code of a virtual
processor (VP), which is called VPcode. The Taos kernel translates
VPcode into the native machine code of each real processor immediately
before running it - there is little or no runtime penalty, unlike
earlier interpreted systems like the UCSD p-system which were very
slow. Taos's fine-grained object orientation and dynamic binding make
this translation strategy feasible since VPcode modules are always
small (typically a few hundred bytes) and so can be translated
on-the-fly as they load from disk into memory. Huge monolithic
applications like Excel or WordPerfect wouldn't lend themselves to
this approach, though Tao Systems' translator supremo Andy Henson did
stress to me that a fast modern CPU can actually translate VPcode
faster than the hard disk can transfer data.

The imaginary VP processor is a 32-bit little-endian RISC machine with
16 registers. It supports data types from 8-bit bytes up to 64-bit
double integers and 32 or 64-bit IEEE floats. Hence the VP machine is
a reasonably good match to most real RISC chips like the Alpha, MIPS,
ARM and PowerPC, if somewhat short of registers by today's standards.
It supports around 60 simple RISC-like arithmetic, logical and
branching instructions and a few special pseudo-instructions, like TAO
which calls Taos kernel routines, and LIT which marks literal data
that needs to be translated (e.g. from little- to big-endian).
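
To see why literal data has to be flagged, consider what a translator
for a big-endian target must do with a 32-bit constant stored in
little-endian order. The Python sketch below is purely illustrative: it
is not Taos code, and the name swap_literal32 is invented here.

```python
import struct


def swap_literal32(le_bytes: bytes) -> bytes:
    """Re-encode a 32-bit little-endian literal as big-endian."""
    (value,) = struct.unpack("<I", le_bytes)   # read as little-endian
    return struct.pack(">I", value)            # write as big-endian


literal = struct.pack("<I", 0x12345678)        # little-endian, as the VP stores it
swapped = swap_literal32(literal)
print(literal.hex(), "->", swapped.hex())      # 78563412 -> 12345678
```

Instructions need no such treatment because the translator regenerates
them anyway; only raw data embedded in the code has to be marked so
that it can be reordered.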

The Taos assembler VPASM outputs VPcode, which you can run directly or
you can invoke the appropriate translator manually to convert it to
native code (Text box 1 shows a sample of VPASM source code).
Currently Tao Systems has translators for the Intel 286, 386 and 486,
the Inmos T8 and T9000 transputers, MIPS R3000 and ARM 601. PowerPC
and DEC Alpha are next in the pipeline; it takes around 6 man-months
to produce a new translator.

Taos is a message-passing operating system whose software model is
based on objects, processes, and messages. An object is a bundle of
data and code that consumes memory, while a process executes an object
and consumes processor time. The Taos hardware model involves multiple
processors each with a local memory, connected by a network of
communication links. Every processor in this network runs a copy of
the Taos kernel and the translator from VPcode to its own native code.
Whenever Taos creates a new object, it allocates the object to a
processor and then starts a process to execute the object.

All Taos objects are constructed from variably-sized blocks of
contiguous memory called 'nodes' which contain two link fields so that
the kernel can manage them in doubly-linked lists. Nodes can contain
data or code, and they have a type field that identifies the type of
object they hold. Taos itself doesn't type-check the application of
operations to data, though you can implement such type-checking at a
higher level within an object-oriented programming language.
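
As a minimal sketch of the node structure just described, and assuming
nothing about the real memory layout, a typed node with two link fields
and a list to manage it might be modelled in Python as follows. The
class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Node:
    """A block holding data or code, tagged with the type of object it holds."""
    node_type: str
    payload: bytes
    prev: Optional["Node"] = field(default=None, repr=False)
    next: Optional["Node"] = field(default=None, repr=False)


class NodeList:
    """Doubly-linked list of nodes, as a kernel might keep them."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def append(self, node: Node) -> None:
        node.prev, node.next = self.tail, None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def remove(self, node: Node) -> None:
        # Two link fields make unlinking O(1): no search for the predecessor.
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def of_type(self, node_type: str) -> Iterator[Node]:
        """Yield the nodes whose type field matches node_type."""
        cur = self.head
        while cur is not None:
            if cur.node_type == node_type:
                yield cur
            cur = cur.next


nodes = NodeList()
tool = Node("tool", b"\x01\x02")
bitmap = Node("bitmap", b"\xff" * 4)
nodes.append(tool)
nodes.append(bitmap)
nodes.remove(tool)
print([n.node_type for n in nodes.of_type("bitmap")])   # ['bitmap']
```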

While stored on disk, or in transit over a communications link, nodes
exist as unbound 'templates' but once loaded into memory they are
converted to 'process-ready' form, and it's at this time that any
translation of VPcode to native code takes place. The Taos kernel on a
particular processor inserts a process-ready node onto a list of other
process-ready objects, from where it can be processed according to the
type of object it holds.

Taos's pre-defined system object types are Tools, Control Objects,
Bitmaps, Graphical Objects and Class Objects but programmers are free
to define new types. A Tool is a node containing executable code that
can act upon the data contained in an object, to perform calculations
or send and receive messages. A Control Object is the Taos equivalent
of a program, consisting of one or more component tools which are
executed in sequence. Control objects are the smallest unit of
parallel distribution and execution under Taos, but not the smallest
unit of memory management since individual tools can be retrieved from
disk and made process-ready. The kernel which creates a new control
object distributes its template (using a special load-balancing
algorithm) onto some processor which starts a process to execute the
object. When the last component of the control object is finished, the
control object closes and its process terminates. Every component has
at least two tools associated with it, one that executes it and one to
clean up after it dies.

A control object's template contains only the text names of its
component tools, not their actual code. When the kernel creates a new
control object, it first checks whether any of these specified
components are already in memory, and if so just points to them -
otherwise it fetches them from disk and makes them process-ready
(first translating them if necessary). Binding under Taos is thus
fully dynamic, so that no module gets loaded until it's needed and
only one copy is ever present in memory. The Taos processes which
execute control objects are lightweight, equivalent to 'threads' in
operating systems like OS/2, and more than one process can share the
same tool's code in multi-threaded fashion.
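
The name-based binding described above can be sketched in a few lines
of Python. This is an illustration under stated assumptions, not the
Taos mechanism: the ToolStore class, the open_control function, the
tool names and the stand-in translator are all invented for the
example.

```python
class ToolStore:
    """Keeps at most one process-ready copy of each named tool."""

    def __init__(self, disk, translate):
        self.disk = disk            # tool name -> VPcode (stands in for the disk)
        self.translate = translate  # VPcode -> native code (stand-in translator)
        self.resident = {}          # tool name -> translated tool in memory
        self.loads = 0              # how many times a tool had to be fetched

    def bind(self, name):
        tool = self.resident.get(name)
        if tool is None:
            # Not in memory yet: fetch, translate and remember it.
            tool = self.translate(self.disk[name])
            self.resident[name] = tool
            self.loads += 1
        return tool


def open_control(store, template):
    """Bind a control object's template (a list of tool names) to tools."""
    return [store.bind(name) for name in template]


disk = {"GUI/BROWSE": b"vp-browse", "LIB/SEND": b"vp-send", "LIB/TRACE": b"vp-trace"}
store = ToolStore(disk, translate=lambda vp: bytearray(b"native:" + vp))

first = open_control(store, ["GUI/BROWSE", "LIB/SEND"])
second = open_control(store, ["LIB/TRACE", "LIB/SEND"])
print(store.loads)              # 3 fetches for 4 references
print(first[1] is second[1])    # True: both control objects share one copy
```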

Class objects provide the highest level of organization under Taos. A
class encapsulates a group of message-passing objects which can run in
parallel, hiding them behind an OOP method interface. Users of classes
like Window or PolygonWorld make method calls to the class object, for
example to open a new window, and are shielded from the complexity of
the underlying parallelism that's generated by the execution of the
objects hidden within the class.

The simplest version of the Taos kernel is just 12 Kbytes in size and
is responsible for multi-tasking (via a time-sliced process
scheduler), memory-management, and the mail and naming systems. Tao
Systems is currently working on a POSIX-compliant version of the
kernel which implements virtual memory and memory protection on
microprocessors that have suitable MMU hardware, but the 12 Kbyte
version does not offer these features.

All executable code in Taos is contained in tools (apart from the
small bootstrap loader on each processor, which must be in native
code). Even the kernel itself is built from tools and is largely
written in VPcode. Device drivers are simply processes like any other,
running outside and in parallel with the kernel. All message I/O is
handled by link drivers running outside the kernel, though the kernel
handles some local I/O support mechanisms such as data caching.

The lifetime of a Taos tool in memory is determined by its status,
according to four different degrees of volatility:

1) VIRTUAL tools are only loaded, translated and bound when called by
another tool, the translated code remaining in memory until the tool's
reference count (kept by the kernel) falls to zero, after which it may
be flushed whenever the kernel needs memory. The kernel may relocate a
virtual tool at any time;

2) NON-VIRTUAL tools get loaded and bound at the same time as the tool
that references them, and they remain in memory, exempt from
relocation, for at least the lifetime of their caller;

3) SEMI-VIRTUAL tools are only loaded and bound when called, like
virtual tools, but they then remain in memory like non-virtual tools;

4) Non-virtual tools can also be flagged as EMBED, which causes the
translator to embed them as inline code in their caller's code. This
is a speed optimization which is extensively used in the kernel code.

A process called the Migrator, running outside the kernel, is
responsible for actually relocating objects in memory and for
incremental garbage collection.
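
The flush rule for virtual tools lends itself to a small sketch. The
Python below is a deliberate simplification with invented names: it
treats semi-virtual and non-virtual tools as never flushed, and it
ignores relocation and the EMBED flag altogether.

```python
from enum import Enum


class Volatility(Enum):
    VIRTUAL = 1
    SEMI_VIRTUAL = 2
    NON_VIRTUAL = 3


class Tool:
    def __init__(self, name, volatility):
        self.name = name
        self.volatility = volatility
        self.refcount = 0   # in Taos this count is kept by the kernel


def flush_candidates(tools):
    """Names of tools that may be discarded when memory runs short."""
    return [t.name for t in tools
            if t.volatility is Volatility.VIRTUAL and t.refcount == 0]


tools = [
    Tool("a", Volatility.VIRTUAL),        # unreferenced: may be flushed
    Tool("b", Volatility.VIRTUAL),        # still referenced below
    Tool("c", Volatility.SEMI_VIRTUAL),   # stays resident once loaded
]
tools[1].refcount = 2
print(flush_candidates(tools))            # ['a']
```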

Since Taos does not support shared memory, the only way for objects
existing in the address spaces of different processors to interact is
by exchanging messages. The lightweight asynchronous mail system works
through just two kernel operations, SENDMAIL and READMAIL, and it's
non-blocking so that the sender continues executing without waiting
for a reply.

All Taos messages are sent to 'mailboxes' belonging to processes,
which act as queues for incoming messages. When a control object is
created and executed it automatically receives a default mailbox,
whose mail address is simply the ID of the child process which
executes the object. The new control object can then send mail to any
other object whose mailbox address it knows, which will always include
its own parents and children, and named resources like disk drives and
VDU displays. Messages may contain a whole list of successive
destination addresses for forwarding, along with the address of their
sender in case a reply is requested.

Taos messages are typed, with 16 reserved types used by the kernel
that include arrays, streams and executable code; error and debugging
data; screen refresh, mouse and keyboard events. A further 16 types
are free for programmers' use. The kernel on each processor handles
all incoming mail for its local objects, all outgoing mail, and mail
to be forwarded to another processor. The typing system enables the
kernel to trap system messages (e.g. executable code) and also allows
user-defined objects to prioritize the way they read their waiting
mail. Objects can employ the READMAIL kernel call to read messages
from their mailbox, adding a list of the desired message types as a
parameter. The result of such a call might be a message of the
required type, or the news that there are no such messages - if the
mailbox contains no messages at all then READMAIL suspends the calling
process until some mail arrives, so that you can use mail messages to
awaken sleeping objects.
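
The three possible outcomes of a typed read can be modelled with a
short Python sketch. It is an assumption-laden illustration rather than
the READMAIL interface itself: the Mailbox class, the type numbers and
the returned status strings are invented, and suspension is only
reported, not performed.

```python
from collections import deque

# Hypothetical message type numbers, for illustration only.
MOUSE, KEYBOARD = 1, 2


class Mailbox:
    """Queue of (type, body) messages belonging to one process."""

    def __init__(self):
        self.queue = deque()

    def send(self, msg_type, body):
        # Non-blocking: the sender just queues the message and carries on.
        self.queue.append((msg_type, body))

    def read(self, wanted):
        """Return the oldest waiting message whose type is in `wanted`."""
        if not self.queue:
            return ("suspend", None)      # empty mailbox: caller would sleep
        for i, (msg_type, body) in enumerate(self.queue):
            if msg_type in wanted:
                del self.queue[i]
                return ("message", (msg_type, body))
        return ("no_match", None)         # mail is waiting, but none wanted


box = Mailbox()
print(box.read({KEYBOARD}))    # ('suspend', None)
box.send(MOUSE, "click")
box.send(KEYBOARD, "a")
print(box.read({KEYBOARD}))    # ('message', (2, 'a'))
print(box.read({KEYBOARD}))    # ('no_match', None)
```

Reading by type is what lets an object take its keyboard events ahead
of older mouse events that are still waiting in the same mailbox.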

Taos's link drivers hide the details of the physical transport
mechanism that implements the communication links from user programs
(though real-time performance issues may sometimes intrude). In the PC
demonstration I mentioned at the start of this article the transputers
were connected via their serial links, while the MIPS R3000 chips were
connected together through FIFO chips, and all of them talked to the
486 host CPU via the PC's VL local-bus.

Taos is a fully distributed operating system which doesn't attempt to
exert central control over the execution of parallel applications.
Obviously in practice you must pick out one processor from which to
boot the system, but once all the kernels are booted Taos programs
tend to spread out over the network of processors in an almost organic
manner, controlled by a distributed load-balancing algorithm. Text Box
2 lists some of the Taos kernel calls, and if you examine the
subsection on 'control object management' you'll see the kind of
services that are available for spawning remote processes. These
kernel calls use the mail system to transfer executable VPcode from
one processor to another.

Information about the system's performance and current loading is
stored in the link drivers that control each communication channel. At
boot-time each link driver benchmarks the processors to which it's
attached (by timing the VPcode translator) and this number is divided
by the number of processes currently running to give a measure of
available power for each processor. The automatic load-balancing
algorithm uses these power figures in the allocation of new processes.
When a tool object arrives at a processor, the local kernel inspects
all the links leading outwards and asks "is there one of my nearest
neighbours who's got more spare power than I have?" - if so the object
is passed on, if not it executes here. Applications that dynamically
spawn many parallel processes spread out like water running down a
mountain, the flow seeking out the 'gullies' or lowest points in
processor-loading space.
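
A rough Python sketch of this nearest-neighbour rule follows. It rests
on several assumptions not stated in the description above: an idle
processor is counted as running one process, the object is handed to
the neighbour with the most spare power (rather than to any neighbour
with more), and the function names and the example network are
invented.

```python
def spare_power(benchmark, running):
    """Benchmark figure divided by the number of running processes."""
    # Assumption: an idle processor counts as running one process,
    # which avoids dividing by zero.
    return benchmark / max(running, 1)


def place(start, neighbours, power):
    """Pass an object from neighbour to neighbour until no neighbour
    has more spare power than the current processor."""
    here = start
    while True:
        best = max(neighbours[here], key=lambda p: power[p], default=here)
        if power[best] <= power[here]:
            return here          # nobody nearby is better off: execute here
        here = best              # otherwise pass the object along the link


# Hypothetical three-processor chain: A - B - C
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
power = {
    "A": spare_power(benchmark=100, running=4),   # 25.0
    "B": spare_power(benchmark=100, running=2),   # 50.0
    "C": spare_power(benchmark=300, running=3),   # 100.0
}
print(place("A", neighbours, power))              # C
```

Because each step looks only at nearest neighbours, an object can
settle on a processor that is merely better off than those around it,
which is consistent with the image of water collecting in gullies.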

Each link driver also maintains a table of encoded information about
the network topology, used by the kernel to route messages. These
tables are dynamically updated at runtime so that if a new processor
is added to the system, news of its existence spreads outwards like a
wave. The nature of the routing algorithm reduces the probability of
deadlock due to circular message paths, and it can usually find
multiple paths between two processors (if they exist) which provides a
degree of fault tolerance if a link fails.

A programmer can always override the automatic load-balancing and
allocate objects to specified processors, by using the OPENDEVICE or
OPENGLOBAL calls, while OPENREMOTE invokes a partially automatic
mechanism where you explicitly send a number of objects to a
particular processor but let Taos distribute them automatically from
there. For example you could specify that a 1000-process ray-tracing
calculation should be run by sending groups of 100 objects to 10
different processors, with Taos completing the distribution.

Though Taos can support its own file and display systems, the current
release version is PC-hosted, using the MS-DOS file system and a
SuperVGA graphics adaptor for display. I received Taos on six 1.44
Mbyte floppies, though more than three of these were filled with
bitmaps and MPEG animations. I was able to run Taos quite happily on
my 486DX2/66 Elonex PC as a single processor operating system,
coexisting on the same hard disk with Windows (though hardly
surprisingly it would not run under Windows).

Taos comes with a very simple GUI whose look-and-feel is loosely
modelled on Motif (fig.2). Control objects which you store in the
taos/control directory automatically appear on a pop-up menu from
where you can execute them with a mouse click. To supplement this GUI
you can open a shell window and use a command line interface, with a
syntax that resembles DOS. However, unlike DOS, Taos command lines
represent genuine pipelines in which each successive command launches
a separate process whose output is fed to the next.

The most immediately striking attribute of Taos is its blazing
graphics speed; you can grab a window in which an image is being
ray-traced and whirl it vigorously around the screen while tracing
continues unhindered. The GUI, which is packaged as a Taos class
object, works to a device-independent virtual screen with only two
hardware-dependent primitives to put and get bitmaps to the real
screen. Apart from SVGA adaptors Tao Systems currently implements the
GUI for several of Inmos's graphical TRAMs (Transputer Modules).
Processes running on remote processors can open screen windows by
sending messages to the processor running the GUI, rather like a
lightweight version of the X Window system.

Taos also encapsulates the MS-DOS filing system within its own object
model, so that DOS disk drives are mapped into Taos servers to which
you can send messages about the objects they hold. For example a control
object called TRACE.CTL which is referenced in Taos by the message
@PC1 is a Taos server object that aliases my C:\TAOS directory.

At present Taos is very deficient in the sort of development tools
that programmers under UNIX or DOS expect to find - the small Taos
team has devoted most of its time over the last two years to getting
the kernel and VPcode translation system robust, and to building a
variety of graphical tools for manipulating and displaying ray-traced
images and MPEG animations, all written directly in VPASM assembler.
There's a Basic compiler which uses a QBasic-like dialect but as yet
no C compiler - there is however a library called the Taos HLL
Toolset, accessible from VP assembler or Basic, which provides the
functionality of the ANSI C library including malloc, sprintf, fscanf
and all the rest. Work is underway on an in-house C++ implementation.

The much hyped 'multimedia revolution' which puts a new premium on
cheap but high-performance graphics may prove to be a window of
opportunity for Taos. SGS-Thomson/Inmos has made a technology sharing
agreement with Tao Systems to use Taos on its next generation
processors (code-named 'Chameleon') in the games, visualisation and
multimedia markets. Tao Systems is presently negotiating with a large
Japanese communications corporation which is evaluating Taos as an
operating system for the TV 'set-top boxes' that will control the new
domestic multimedia services. These units will have to decrypt,
decompress, decode and otherwise mangle real-time data streams for
'video on demand', videophone communications and other services yet to
be invented - this will require large amounts of processing power, but
must be delivered at domestic electrical appliance prices. A small,
hardware-independent parallel operating system begins to look very
attractive; you can shop around for this week's best processor deals
and issue cheap upgrade cards to provide more processing power.

Dick Pountain - 08/03/94 15:04

Tao Systems
PO Box 2320,
London NW11 6PW,


Text box 1: An example tool written in VP assembly language. This tool
changes the backdrop picture shown on the Taos GUI desktop to another
bitmap file selected by the user from a browser window.

include 'tao.inc'

    control 0,1024,DATATP,0,0,0,0
    component plmthd2-bproc
    tstring plmthd2,'DESK/BACKDROP'
nodeend bproc

    tool 'DESK/BACKDROP'
    ;r6=control object pointer

    allocstruct 80,r7
        ;request filename from user
        cpy r7,r0
        lea bpath,r1
        qcall GUI/GI_BROWSE
        breakif r8=0

        ;gui will be parent
        enquire 'BACKDROP',0
        breakif r8=0

        ;send name to backdrop
        cpy r7,r1
        qcall LIB/PA_SEND
    freestruct 80

    string bpath,'BITMAPS/*.TBM'

    toolend tl_b
nodeend tl_b


Text box 2: This selection from among the 64 Taos kernel calls gives
some impression of the kind of services that the kernel provides.

Mailbox Management
SENDMAIL      Send a mail message
COPYMAIL      Copy a mail message then send the copy
READMAIL      Read a mail message from a mailbox
READTYPE      Read a mail message from a mailbox; wait until
              specified type arrives

Control Object Management
STARTCONTROL  Start a control object locally
OPENCONTROL   Create a control object and start locally
OPENCHILD     Create, distribute and start a control object in the
OPENARRAY     Create, distribute and open a number of different
              control objects
OPENFARM      Create, distribute and open multiple instances of a
              control object
OPENDEVICE    Create and transport a control object to a specified
              processor and start it
OPENGLOBAL    Create, distribute and open multiple instances of a
              control object. Guarantee one control object on every
OPENREMOTE    Create and transport a control object to a specified
              processor for distribution from that processor, then
              start it
OPENPIPE      Create, distribute and open a pipeline of control objects

Tool Object Management
VCALL         Virtual Call tool object
OPENTOOL      Request tool object load
CLOSETOOL     Close tool object
FLUSHNAMES    Flush named tools from local tool list
FLUSHTOOLS    Flush un-referenced tools from local tool list
UNCLOSETOOL   Increments a tool object's reference count

General Object Management
VADDR         Obtain address of embedded object
OBJPROC       Process an object using the existing thread
LISTPROC      Process a linked list of objects using the existing
LISTTEST      Search list for types of node

Processor Type Identification & Mailbox ID
FINDTYPE      Enquire processor id of a processor node of
              specified processor type and with a minimum memory
GETMYID       Enquire mailbox id of own control object
GETPARENT     Enquire mailbox id of parent control object
GETSERVER     Enquire server mailbox id for an object
NETINFO       Enquire processor and network information