VR language wishlist

(Still under development)

I have become a little worried about the future of VR. VRML was a mighty step in the right direction, but things seem to have faltered lately. VR needs an open, standard language like VRML which is easy to use. This paper presents some of my humble suggestions.

A VR language needs to be geared for the future. Accordingly we should be planning the syntax for things we can't do yet as well as the things we can. Computer graphics hardware progresses at such a rate that if we only define things that we can do now, then the language will be so out of date by the time of the next revision that nobody will want to use it. We are in great danger of this happening right now -- there has not been a major upgrade for about 4 years. The world is marching on, but we seem stuck and in danger of becoming an irrelevant backwater.

First a quick list of things that need to be considered for the future:

I know some of this stuff will be impractical, or even impossible in the very near future, but much of it can be achieved... and even if it can't be used yet, that shouldn't stop us planning for when it will be.

We have much to learn from others. ActiveWorlds has efficient worlds that work well even on slow computers. Its worlds are collected together in conceptually clear universes with the ability to jump easily between them. We should strive for ease of interchange with other open 3D efforts such as POVRay. That would make available to us the enormous resources of textures, shapes, and lighting effects they have built up over the last decade or more. Other open efforts such as OpenFX and CrystalSpace have much to offer too. We are poorer for ignoring them.

Defining a VR language should not stop at just setting out the syntax; it should extend to making sample implementations available for all functions. The language should be modular and extensible as a series of fully documented libraries. If someone finds a bug or a better way of doing something then everybody benefits immediately because the new or repaired building block is easily added to the whole. The UMEL (Universal Media Elements Library), FreeWRL, LibVRML97, and VRWave efforts have made some inroads here but they remain external to the language itself. They need to be more actively supported.

More than one way to do things should be available to the world builder. Sometimes giving position in absolute coordinates will be easiest; sometimes relative coordinates will. If the world builder must pause, start up a calculator and be diverted from the task at hand then it is A Bad Thing. It is always easier for a program to convert the coordinates than for a human. We need to attract people to VR not deter them.


Writing good error checking into a program should be something of a programming art in itself: the computer should help the world builder pinpoint problems easily instead of simply throwing up obscure errors. The error check should also suggest solutions. In many programming languages, one error often causes ripple-on effects where many other errors are generated because of that original error. These spurious errors should either not be thrown up or should be related sensibly, and the scan should track these kinds of effects. If world building is to become popular then it must be made easy for people to do. Obscurity is not a virtue.

The size and complexity of the programs that do the initial checking and conditioning of the file do not affect the size and complexity of the renderer, so having complex error checking with lucid, helpful output doesn't mean the world renderer will be bloated and slow. I suggest that running a world be broken into at least 3 (possibly 4) separate programs: a lint kind of program that checks the syntax and logic of the file and delivers warnings and errors; a conditioner which converts arbitrary units and syntax variants into standard format (e.g. degrees into radians, relative into absolute); a compiler that creates the internal representation of the scene; and I guess the actual scene renderer. (Being somewhat ignorant here, I don't know whether the scene compiler should be separate from the renderer.) Breaking these functions up means that different people can create improved versions of different parts of this chain, and the world author can use the tools that suit them best. An expert builder can even omit the first 2 parts of the preprocessing, and create files that can be directly used by the compiler.
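As a sketch, that chain might look something like this (the stage names are invented for illustration):

   world.wrl -> checker     (lint: syntax and logic warnings and errors)
             -> conditioner (normalise units and variants: degrees to radians, relative to absolute)
             -> compiler    (build the internal scene representation)
             -> renderer    (draw the scene)

An expert builder could feed an already-conditioned file straight to the compiler, skipping the first two stages.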


There are some excellent tools for creating VRML objects and worlds using a GUI; Spazz3D and the ParallelGraphics tools spring to mind. These are very important to the future of VR. There are some things that are extremely difficult to write in a text editor. Some people will say that the language should be designed purely with GUIs in mind and that nobody in their right mind will create worlds in text editors in the future. I disagree. Some things are not suited to GUIs, but are simple to write out in a symbolic language. This will always be so. Perhaps most of VR construction will require GUIs in the future, but new concepts and ways of creating and manipulating things will always be developed and they will always require a language -- a clear, concise, human-readable language.

The language

Some global settings should be allowed for the convenience of the world builder. Although these settings would normally be used globally they can also be used multiple times in a file to switch the setting locally.
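For instance (a sketch only; the units setting is hypothetical):

   units degrees        # global setting: angles below are in degrees
   Sphere {
      rotate 0 45 0
   }
   Cone {
      units radians     # switched locally, just for this node
      rotate 0 0.785 0
   }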

Textures or maps should have a better naming convention. Making each begin with map- groups them together in a reference which makes it easier to learn and look up the available options.
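The available maps would then group together like this (only mapImage appears elsewhere in this document; the other names are invented illustrations of the convention):

   mapImage    { url "brick.jpg" }
   mapBump     { url "brick-bump.png" }
   mapSpecular { url "brick-shine.png" }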

Multiple maps can be applied to the one surface and are simply added together. There may be a case for allowing other functions than simple adding (e.g. OR, AND, XOR, NOT, subtract, multiply, etc.) to enable negative lights, embossing, masking, and other possibilities, but I have not looked into this properly yet.
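A sketch of how such combining operators might read (the operator keyword here is purely illustrative):

   Sphere {
      mapImage { url "wood.jpg" }
      mapImage { url "label.png" } subtract   # mask the label out of the base map
   }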

Lights should use a simpler naming convention too, with a single name (Light), and their type naturally defined by their attributes. It would make learning much easier.

Each light can have many more attributes of course, but the 3 main types of light are chosen by the position and direction attributes. If neither is given then it defaults to a directional light shining down the -y axis. The default direction for DirectionalLight in VRML currently is down the Z axis, which is fine for building small, standalone models but is silly for worlds.
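The three types might then be written like this (a sketch of the proposed syntax):

   Light { position 0 10 0 }                   # point light: position only
   Light { direction 0 0 -1 }                  # directional light: direction only
   Light { position 0 10 0  direction 0 -1 0 } # spot light: both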

Sensors should use a simplified naming system too. Each should begin with "Sense" so they group together naturally in any reference on the subject. Cylinder, Plane, and Sphere sensors are really variants of a node that senses dragging actions. Instead of sensors affecting sibling geometry as VRML currently does it would be more consistent if affected nodes are enclosed in the Sense node's curly braces.
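A sketch of how that might look (the node names are invented, following the proposed convention):

   SenseTouch {
      Sphere {}        # only the enclosed geometry responds to touch
   }
   SenseDrag {
      shape cylinder   # replaces CylinderSensor
      Box {}
   }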

VRML's TimeSensor is terrible -- a real monster to use. It should really be called Timer and be more intuitive to use. It is not really a sensor like the others. The worst parts of the current TimeSensor are the startTime and stopTime attributes, and the cycleTime eventOut. They almost always require another node to be useful. While there may be some obscure use for the time in seconds since the 1st of January 1970, in almost all cases it would be much more sensible to start the timer at 0 and count upwards from that like a stopwatch. If the actual time is needed, as seconds since 1 Jan 1970, or in any other format then it could be got from the system clock via a javascript call. However, if kept, startTime should default to seconds since 1 Jan 1970 for backward compatibility, and have a variety of other selectable formats, including human readable time.

A few new attributes (start, alarm, and alarmLength) let Timer work like a stopwatch. When start is sent a TRUE signal it begins counting up from 0. When the Timer reaches the alarm value it sends a TRUE event, the alarm exposed field is set to TRUE for alarmLength amount of time, then the Timer sends a FALSE, and alarm reverts to FALSE until the next alarm trigger. There can be as many alarm values as wished. There should also be an end attribute, which is the same as the cycleInterval attribute but with a shorter name that makes more sense when the Timer is not set to loop.
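Putting those attributes together, a Timer might read (a sketch of the proposed node):

   Timer {
      start TRUE        # begin counting up from 0, like a stopwatch
      alarm 5           # alarm goes TRUE when the count reaches 5 seconds
      alarmLength 0.5   # alarm stays TRUE for half a second, then reverts
      end 10            # same as cycleInterval, but a more sensible name
   }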

Interpolators desperately need simplifying. (See near the bottom of this document for the piece about animation.)

Direction should be able to be given in any format that the world builder chooses.

Rotation likewise may have more than one way of being written, and should be able to be explicitly defined as radians or degrees regardless of the global setting. You should have the choice of absolute or relative rotation, and need not give all 3 axes -- just one axis or two may suffice.

Position can be given as absolute or relative, and as coordinates or direction, and distance.
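For example, all of the following might be allowed (every keyword here is illustrative only):

   Box { position 10 0 5 }                  # absolute coordinates
   Box { position relative 0 2 0 }          # offset from the parent
   Box { direction 45 degrees  distance 12  # heading and distance instead of coordinates
         rotate y 90 degrees }              # one axis suffices; degrees regardless of the global setting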

The range of primitives should be extended, not reduced. Also, smooth primitives that aren't built of facets are needed. Smooth objects should not replace faceted objects but should supplement them. If a complex world is built and is expected to be run on slow machines then the designer will want to use faceted objects, but a simple world being run on fast machines opens the choice of smooth objects.

Many primitives should have more than one way of being defined. The language should let the designer build in whatever way is most convenient for them.
Example (these all define the same box):
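A sketch of what such alternatives might look like (all three forms are invented illustrations):

   Box { size 2 4 6 }                       # width, height, depth
   Box { corners -1 -2 -3  1 2 3 }          # two opposite corners, POVRay-style
   Box { center 0 0 0  size 2 4 6 }         # centre point plus size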

The syntax of VRML is too clumsy and wordy. Every extra word, every extra parenthesis is another chance for error. Compact files are easier to debug, quicker to write, and faster to transmit over the net. As an example of what I mean, compare how a blue egg is defined in POVRay vs VRML.
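The comparison might run along these lines (both snippets are reconstructions, not taken from either manual). In POVRay a blue egg needs only:

   sphere {
      <0, 0, 0>, 1
      scale <1, 1, 1.3>
      pigment { color rgb <0, 0, 1> }
   }

while VRML97 requires something like:

   Transform {
      scale 1 1 1.3
      children [
         Shape {
            appearance Appearance {
               material Material { diffuseColor 0 0 1 }
            }
            geometry Sphere { radius 1 }
         }
      ]
   }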

ROUTEs are unnecessary. It would simplify matters if the sending node made the connection from inside itself, and a signal should be able to be sent to more than one target. Also the keyword TO should really be reserved for programming so that we can use looping code directly in the language. I think '->' may be a more sensible choice for routing signals.

TouchSensor {
   isOver -> switcher.choice, light3.enabled
}

The 'children' grouper is unnecessary. Everything under a grouping node is its child anyway. This lets translate, rotate and scale be used inside any curly braces simplifying the syntax tremendously. Because of this the Transform and TextureTransform nodes are unnecessary too. Transforms are applied to the object that owns the curly braces. This is similar to how colors and textures currently work in VRML (except that VRML forces you to refer to an object back over 2 or more curly braces). This should speed up parsing the scene while giving world builders more flexibility.

Group {
   translate 100 0 0
   Sphere {}
   Box {}
}
Anchor {
   url "http://gobble.de/gook.html"
   Sphere {
       translate 0 20 44
       scale 2 0 0
       mapImage {
           url "test.jpg"
           rotate 45 degrees
       }
   }
   Box {}
}

We should be able to define variables instead of just parts of the scene graph. If these variables could then be mathematically manipulated and even used in conditional statements then parts of the scene could be built or not depending on their values. These variables should not be restricted to just numbers -- they should be able to be strings as well.

I would suggest a loosely typed language for this, like javascript. This is like incorporating the Script node into all the other nodes, and extending the Switch node to make it more general purpose.

DEF val 133
DEF touch 0
DEF ltBlue [127 127 255]
DEF dkGreen [31 195 15]
TouchSensor {
   isOver -> val++
   isActive -> touch
   Sphere {
       translate 0 val 0
       if touch then color ltBlue else color dkGreen
   }
}
Box { size val val*2 val }

If javascript is used it should be optionally compiled to improve speed.

All parts of the html page that the vrml scene is embedded in should be available to the wrl, and conversely all of the wrl should be available to the html page. Javascript in the wrl should be able to call javascript functions in the html <script> node, and the reverse should be true too.

Requiring special variable types (MFString, SFVec3d, etc.) makes it difficult to allow the use of variables to manipulate attributes. They can be simple numbers/strings and the software should be able to do any conversions on the fly as required. Numbers should also be able to be given in any number base, or at least the common ones: decimal, hexadecimal, octal, and binary. Boolean values are represented by numbers too: 0 is FALSE and any other number is TRUE. Some would argue that this makes for bugs that are difficult to weed out, but a preprocessor can easily check and warn if you place the result of an addition into a boolean, asking "Did you mean to do this?" It is really counterproductive to enforce strict rules that prevent the programmer from taking advantage of speed savings. It is time we made machines do our bidding instead of forcing us to think like machines. Some would point out that specialised number types let the world parse and run faster, but I have no argument with allowing the use of specialised number types -- I just don't think they should be required.
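For example (assuming the familiar C-style prefixes for the non-decimal bases):

   DEF mask 0xFF00      # hexadecimal
   DEF bits 0b1011      # binary
   DEF perm 0o755       # octal
   DEF lit 1            # boolean TRUE: any non-zero number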

DEF val 20
TouchSensor {
   isOver -> val
   translate 0 val 0
   Sphere { color 64 8 val }
}

Alpha should be one of the color components; however, it should always remain optional:

color r g b a

Animation is way over-complicated and needs to be vastly simplified with definable libraries of movements able to be INCLUDEd so that complex actions and gestures can be built up and re-used in part and in whole. The current system should remain for backward compatibility, but needs some extensions.

If the start and end positions are set and the length of time given for the movement, then the acceleration can be defined with a simple curve. Mechanical movements would be given with a flat curve -- a straight line -- no acceleration. Biological movements use a curve to ease in, build up speed, and ease out. A cannonball fired into the air starts fast, slows to a halt, reverses, and gathers speed falling to the ground. It may be advantageous to use one curve for the whole movement or to use a simpler curve to describe half and repeat it in reverse. A yo-yo might use a curve that is an ellipse of some sort, or else a simpler curve which is reversed then repeated over and over.

It should be possible to make a repeating movement change rate too, so a curve can be set for the repeat rate. This would be useful for simple things like a bouncing ball where the first bounce is high, but succeeding ones aren't as high and take less time.

I have been talking of a curve but how is it defined? Initially it would be given as a simple formula (like a sine or parabola), but after a while people would build up libraries of common curves. It may be necessary on slow machines to specify that these be precalculated by the parser, but on fast machines it should be possible to calculate them in realtime. They don't need incredible accuracy so simple approximations are all that is needed -- even simple wave tables would suffice. They are just a way of defining how close or far apart successive images are on subsequent frames. The curves don't affect how long a movement takes. They just let the program work out what fraction of the trajectory the object has travelled for the next frame rendering.
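A curve definition might then be as simple as this sketch (the curve keyword and both forms are invented):

   DEF ease   curve { formula sin(t * pi/2) }     # calculated in realtime
   DEF bounce curve { table 0 0.4 0.7 0.9 1.0 }   # precalculated wave table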

There should be a trigger for the animation if it is meant to begin at a certain time or on some particular event. These should be able to be combined. Time should be able to be given in absolute or relative form and should be human-readable if desired. Seconds should be the default unit but can be optionally stated to enhance readability.


TouchSensor {
   isActive -> touchEvent
   Sphere {}
}
move {
   start 0 0 0
   end 0 20 0
   speed sin(t)
   trigger touchEvent + 5 seconds
}

Inverse kinematics needs to be part of the animation syntax too.


(This document is not finished, as I feel a VR language should never be either...)
Miriam English, 8th October 2001