VRML Problems

(last updated 2006-07-21)

ROUTEs
IS
Implied events
DEF
EXTERNPROTO
Children
Transform
naming inconsistency
Anchor and the other sensor nodes
repetition
arbitrary division of nodes and ambiguous, confusing names
unnecessary typing of data
wasteful processing
LOD
Viewpoint
predefined objects
Lack of CSG (Constructive Solid Geometry)
texture names
texture mappings
PROTO
Events
Interpolators
Lack of exposedField in Script node
Lack of absolute position/angle query
Confusing field and event names
Sensors
data storage
Sounds

ROUTEs

They unnecessarily complicate a file. They make a conceptual three-part hassle out of any simple movement of data (eventOut-->ROUTE, ROUTE_in-->ROUTE_out, ROUTE-->eventIn) instead of directly connecting events. Even worse, they hide that connection down at the end of the file instead of putting it where the connection actually occurs, so you have to look in three places in the file (eventOut, ROUTE, eventIn).
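
A minimal sketch of the three places, using standard VRML97 nodes (the DEF names are mine): pointing at the box switches on the light, but nothing at either end hints at the connection -- it lives only in the ROUTE at the bottom.

    DEF Hotspot TouchSensor { }
    Shape { geometry Box { } }
    DEF Lamp DirectionalLight { on FALSE }

    ROUTE Hotspot.isOver TO Lamp.set_on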

The argument that ROUTEs make the movement of data around the scene obvious doesn't hold up either; they transfer only two kinds of data, events in and events out. Any data sent to fields must be sent via a Script. The lack is worse inside PROTOs, where ROUTEs can't communicate events out to the header at all; that has to be done with IS or, failing that, a Script.

IS

IS is actually another kind of ROUTE -- as if one kind wasn't bad enough. But at least IS maps a value directly in the file instead of putting the connection at the end of the file. Unfortunately the name 'IS' makes it difficult to understand what it does; it would have been much better named FROM, to show that the variable gets its value FROM another, or TO, to show that it sends its value TO another. And it should have been usable anywhere, instead of ROUTEs.
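
For what it's worth, IS does read tolerably at the point of use. A minimal sketch (the PROTO name and field are mine):

    PROTO TintedBox [ field SFColor tint 1 0 0 ] {
      Shape {
        appearance Appearance {
          material Material { diffuseColor IS tint }
        }
        geometry Box { }
      }
    }

Even here, 'diffuseColor IS tint' gives no clue which way the value flows; 'diffuseColor FROM tint' would.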

Implied events

At the moment eventIns and eventOuts are implied in the object where they are actually used, and only made explicit in the ROUTE or IS somewhere else in the file. This makes following data around a file a major headache. (The Script node is the only exception.) Weirdly, events can be made explicit in all nodes, but only inside PROTOs, and even then only when used with IS to communicate with the PROTO header. That is terribly inconsistent. (For more on this see PROTO and Events, below.)

DEF

DEF only works on objects actually instanced in the world. This makes it impossible to build large libraries of common definitions the way the POV-Ray community has. In the POV scene description language, #declare simply defines something; to instance it you must then use it in the world. This lets the community easily build up big libraries of colors, textures, shapes, constants, math functions, cameras, etc. It helps advance the language too, because things can be tried out in library files until they are used so often that it's time to hard-code them into the program proper.

Another problem with DEF is that it can only be used with nodes, not with attributes of nodes. There is no way to DEF a color as "olive" or "pink". You can DEF the whole damn Material node, but that runs into a limitation of the structure of VRML and a problem with USE. If you have 4 Material qualities that you want to define as "red", "shiny", "glow", and "translucent", and you need to use them on many different objects in different combinations... you can't. Only one Material node is allowed per Appearance node, and only one Appearance node per Shape node. The only way to do it with DEF/USE is to DEF every one of the 16 possible combinations of the 4 and USE those, which is plain ridiculous.
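
A sketch of the dead end (DEF names mine). Note that even defining "red" and "shiny" forces you to instance two throwaway Shapes first, and then no third Shape can USE both:

    Shape {
      appearance Appearance {
        material DEF RED Material { diffuseColor 1 0 0 }
      }
      geometry Sphere { }
    }
    Shape {
      appearance Appearance {
        material DEF SHINY Material { shininess 0.9 specularColor 1 1 1 }
      }
      geometry Sphere { }
    }
    Shape {
      appearance Appearance { material USE RED }   # red OR shiny -- never both
      geometry Sphere { }
    }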

USE is one of those VRML keywords (and there seem to be many of them) that is superfluous. I've never really seen any rational reason for its existence. PROTOs manage fine without an equivalent term.

EXTERNPROTO

Currently EXTERNPROTO must be used once for every PROTO you want to refer to in an external file. You should only need to refer to the library file once, after which the PROTOs within it should be treated as if they were cached inside the referencing file. It would work like the #include directive in POV-Ray. Perhaps it could be done using the Inline node, though currently a wrl can't see PROTOs in an Inlined file. Using Inline might also lose the current nice ability to put an example world at the end of a PROTO file -- Inlining it would insert that example into the current wrl.
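
As it stands, every PROTO pulled from a library needs its own declaration, complete with a re-statement of its interface (file and PROTO names here are made up):

    EXTERNPROTO TintedBox [ field SFColor tint ]
      "shapes.wrl#TintedBox"
    EXTERNPROTO TintedBall [ field SFColor tint ]
      "shapes.wrl#TintedBall"

One declaration per PROTO per referencing file, even though both come from the same library.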

Children

The children keyword is superfluous. Anything in a grouping node is a child of it anyway.

Transform

Position, angle, and size would be most easily and most often used as attributes of geometry. When they are needed above the level of individual geometries they can be attributes of a grouping node.

naming inconsistency

There are awful inconsistencies. Transforms can only be used in the special Transform grouping node... except that equivalents also turn up, under different names, in certain other nodes:

Viewpoint has position (not translation) and orientation (not rotation)
Box has size (not scale)
DirectionalLight has direction (not rotation)
Extrusion has scale and orientation (not rotation), though as lists
FontStyle has size (not scale)
PointLight has location (not translation)
ProximitySensor has size (not scale), and its events use position (not translation) and orientation (not rotation)
Sound has location (not translation)
Sphere has radius (not scale)
SpotLight has location (not translation), direction (not rotation), and radius (not scale)
VisibilitySensor has size (not scale)

If all objects were allowed to have (consistently named!) transform attributes it would make learning and using VRML a lot easier and more economical.
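
A hypothetical sketch of what that could look like (this is not valid VRML):

    # hypothetical -- not valid VRML
    Sphere {
      radius      1
      translation 0 2 0
      rotation    0 1 0 1.57
      scale       2 1 1
    }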

Anchor and the other sensor nodes

The sensors should be grouping nodes like Anchor and Collision so that they affect their children rather than their siblings (or else other things like transforms should be able to operate on siblings rather than children). Making them grouping nodes would be more consistent with how other VRML nodes work (like Transform, Billboard, Anchor, Collision, Switch, and LOD) and result in a more consistent and economical language. (See near the end of this list for more on sensors.)

repetition

appearance Appearance
material Material
color Color { color [ ] }
and others. This is wasteful and time-consuming for the author, and it bloats files.
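
Spelled out in full, the repetition looks like this -- three names said twice each, just to color a box green:

    Shape {
      appearance Appearance {
        material Material {
          diffuseColor 0 1 0
        }
      }
      geometry Box { }
    }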

arbitrary division of nodes and ambiguous, confusing names

The Shape node contains geometry (what is a shape but geometry?) and appearance (a name so vague as to be almost meaningless). Appearance is divided into two main parts: material (another bizarrely ambiguous name) and texture (a slightly more precise name, but why it is divided off from material is anyone's guess). The most absurd thing is that when authors model parts of a world they start with the geometry and generally think of the other aspects as attributes of it. I appreciate that keeping some qualities (color, glow, transparency, gloss, etc.) separate from others (geometry and texture-mapping) is useful for VRML's awfully broken DEF, but it neither went far enough nor made enough sense to justify imposing such confused and arbitrary categories on poor authors. Either DEF should have been fixed, in which case all attributes could sit under the one Shape node, or each attribute could have its own node -- silly overkill, but it would effectively place all the attributes under the one Shape node again.

unnecessary typing of data

If you want to send data from one part of the scene to another you find, far too often, that strict data-typing means you must add a Script node, bring the data in to convert it using Javascript (which is essentially typeless), and then send it back out to the scene, usually through one of those awful ROUTEs. The usual argument for strict data-typing is speed, but all the slow conversion needed in anything beyond the simplest wrls throws any possible speed advantage out the window. I think the real reason is that it makes the viewer easier for the programmer to design. The trouble is that the viewer is created once; every wrl created after that is hamstrung by the shortcut.

Strict data-typing becomes particularly obnoxious with things like color, position, or angular info. Life would be easy if these were standard arrays, but they aren't; they are special datatypes that require special functions to create and often to manipulate. If you want to send a PlaneSensor slider's X value to an object's transparency you are forced to add a Script node just to extract and convert one array element.
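
The slider case in full, as a sketch (the DEF names and the one-line script are mine). An entire Script node exists only to pull .x out of an SFVec3f and pass it along:

    DEF Slider PlaneSensor {
      minPosition 0 0
      maxPosition 1 0
    }
    Shape {
      appearance Appearance { material DEF Mat Material { } }
      geometry Box { }
    }
    DEF Extract Script {
      eventIn  SFVec3f set_pos
      eventOut SFFloat x_out
      url "javascript:
        function set_pos(v) { x_out = v.x; }"
    }
    ROUTE Slider.translation_changed TO Extract.set_pos
    ROUTE Extract.x_out              TO Mat.set_transparency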

wasteful processing

There are many situations where VRML wastes time and effort processing things it doesn't need. TouchSensors and other sensors process mouse movement when often the only thing ever used is mouse-clicks. ProximitySensors are an example of terrible waste. Not only do they constantly process your position and heading whether required or not, there is also a lopsidedness to the node that makes it wasteful to calculate your position with respect to a set of markers: VRML requires each marker to have its own ProximitySensor, and you check whether you have entered any of them. Much simpler would be a single ProximitySensor centered on yourself and special trigger objects that set a flag when they are in its area. An advantage is that this could easily be extended into the basis of a third-person collision-detection system.
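
The doubling-up, sketched with just two markers (positions made up) -- each zone ticks over constantly whether anyone is near or not, and each must be routed and handled separately:

    DEF Zone1 ProximitySensor { center  10 0 0   size 4 4 4 }
    DEF Zone2 ProximitySensor { center -10 0 5   size 4 4 4 }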

LOD

LOD is a great idea, but because computers keep getting faster and more capable of displaying complex graphics, what is an appropriate use of LOD for one machine may be entirely unsuitable for another. There should be some wiggle-room: the user should be able to extend or shorten all the LODs in the current wrl by some percentage. LOD should not be strictly tied to distance but ought to be a more abstract value. It should also be possible to specify LOD as frames per second, so that if the machine slows below the target fps the defined LODs are activated; they would need to be prioritised so that a tradeoff is made between fps and proximity to detailed objects. Again, users of slow machines should be able to alter the rules if they are willing to put up with a slow framerate in order to see maximum detail; other people prefer high framerates over detail. The user should be able to affect the LOD decisions the viewer makes. It is important that this be catered for in the language instead of saying "Oh, that is a browser issue -- it has nothing to do with defining the language." The two are inextricably linked: how user-adjustable LOD is defined must be part of the language itself.
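
For reference, the current node bakes absolute viewer distances into the file. This sketch switches a shape through three representations at 10 and 50 meters, and those numbers are fixed forever, however fast or slow the viewing machine:

    LOD {
      range [ 10, 50 ]
      level [
        Shape { geometry Sphere { } }           # nearer than 10 m
        Shape { geometry Box { } }              # 10 m to 50 m
        Group { }                               # beyond 50 m: nothing
      ]
    }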

Viewpoint

The camera in VRML is an implied object which is never really stated explicitly. You can bind to a certain Viewpoint, but Viewpoints simply reposition the current camera; they are not the camera itself. Once at a Viewpoint the camera can roam, and whilst ostensibly "bound" to that Viewpoint, the Viewpoint in fact has little effect on the camera except as a placeholder for working out the "next" Viewpoint in a list that can be jumped to.

The camera needs to be explicitly named and directly manipulable. An example of the trouble with using Viewpoints as the only way to affect the camera: a Viewpoint is jumped to, then the user's position is updated under program control by altering a Transform wrapped around that Viewpoint. If the user later jumps to that Viewpoint again, it is no longer at its expected, original position -- it has moved!
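
The moved-bookmark problem, as a sketch (DEF names mine). Animate Carrier's translation to steer the user, and the bookmark is dragged along with them; a later jump "back" to Start lands wherever Carrier last stopped:

    DEF Carrier Transform {
      children DEF Start Viewpoint {
        position 0 1.6 10
        description "start"
      }
    }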

In VRML there is a natural confusion that arises because the only way to manipulate the camera is via Viewpoints. But Viewpoints should really be like bookmarks that can be written into the original file, or saved, updated, deleted under program control. They should not be confused with the camera itself.

Another problem with current VRML Viewpoints is that unbinding a Viewpoint will jump the view to the next Viewpoint even if the user was nowhere near the previous one. In other words, even when the user has wandered far from any programmed Viewpoint, the VRML view is still intimately associated with those Viewpoints. There is no way to program for a freely wandering camera -- it has to be hacked using Viewpoints.

predefined objects

VRML did a great thing by including some predefined objects (Sphere, Box, Cylinder, Cone) but it doesn't take them far enough. It doesn't allow negative radii, heights, or sizes, nor zero values for them. Negative dimensions would let the predefined objects' interiors be seen. Even that is only a partial solution, as it doesn't allow both inside and outside to be seen without creating the object twice -- once with positive dimensions, once with negative. Another limitation of the current predefined objects is that there is only one way to apply a texture to them, when there should be many. For instance you can't apply a texture to just one face of a box, or apply the same texture the same way to all faces without reversal, or wrap a single large texture over all the faces. I discuss this problem at more length below under the heading "texture mappings".

Lack of CSG (Constructive Solid Geometry)

CSG would have been an extremely useful feature to have in VRML, at little cost to viewer complexity. It would allow the building up of complex objects using boolean functions UNION, DIFFERENCE, INTERSECTION, MERGE, and NOT.

UNION uses an OR operation. It is almost what VRML currently does, but a texture applied to a single UNION object behaves differently from textures applied separately to two overlapping objects.

DIFFERENCE uses an AND NOT operation. It subtracts one object from another, cutting away the parts of the first that the second overlaps.

INTERSECTION uses an AND operation. It shows only those parts that overlap.

MERGE is like UNION, but with the surfaces inside the intersection removed -- useful for transparent objects.

NOT inverts an object, swapping its inside and outside; combined with AND it is what makes DIFFERENCE work.
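
None of this exists in VRML, but a node for it need not have looked alien. A purely hypothetical sketch:

    # hypothetical -- VRML has no CSG node
    CSG {
      operation "DIFFERENCE"
      children [
        Shape { geometry Box { size 2 2 2 } }       # the block
        Shape { geometry Sphere { radius 1.2 } }    # carved out of it
      ]
    }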

texture names

There should be one texture node. The different variations -- PixelTexture, ImageTexture, and MovieTexture (and procedural, reflection, bump, and other future texture forms) -- should be selected by attributes. Having to remember several different kinds of texture node is silly when they all do basically the same thing, and it makes future extensions to the language more difficult than they need to be.
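
A hypothetical sketch of the single node, with the variation moved into an attribute:

    # hypothetical -- one node instead of three
    Texture { kind "movie"  url "clip.mpg"  loop TRUE }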

texture mappings

VRML currently gives you the option to automatically map a texture onto a surface, so that you can avoid painstakingly matching points on an object to points in the texture. But it doesn't go far enough. The automated wrap is useful, but it gives only one of several possible ways to wrap a texture onto an object. Other ways of mapping a texture onto an object are planar, spherical, cylindrical, and toroidal. It would also be nice to map a texture onto all faces identically, so that, for instance, a box had a label displayed on each side and the back wasn't reversed. A square cylinder map would also be very useful for many objects, as would a box-mapping like the one currently used for the standard VRML Background node.

PROTO

These clumsy nodes are really VRML's only way of expanding, and so should have been the frontier for furious and rapid development of the language, showing the future directions it would take. Unfortunately that never happened, because they are terribly difficult to use. The overly verbose header format followed by the body makes writing a PROTO more unwieldy than it needs to be.

Passing data in and out of a PROTO can be a headache in another way too. If you have data in a PROTO which is inside another PROTO, which itself might be inside another PROTO, your only way to get the data out is to bubble it up through successive layers of PROTOs -- there is no way to get at it directly. There is no global data.
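
A sketch of the bubbling (the PROTO and event names are mine). The one event has to be re-declared and re-mapped in every intervening header:

    PROTO Inner [ eventOut SFTime fired ] {
      TouchSensor { touchTime IS fired }
    }
    PROTO Middle [ eventOut SFTime fired ] {
      Inner { fired IS fired }
    }
    PROTO Outer [ eventOut SFTime fired ] {
      Middle { fired IS fired }
    }

Three declarations of the one event, and still nothing like a global variable to reach it directly.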

If you want to do anything but the simplest building upon existing VRML then you need the ability to efficiently build upon past developments, and PROTOs make that almost impossible. It is no wonder so little use was made of this, the only way of extending the language.

Events

Events going into nodes and being sent from them are implied rather than explicit in all nodes except the Script node. This is the reason ROUTEs are needed, and it has a few other really bad side-effects. One of the worst is that PROTOs that need to send data in or out beyond what a direct IS mapping can express must do so via an internal Script node, and if that traffic involves a lot of data (for example continuous position and rotation updates) it is extremely wasteful of CPU resources. Another bad result of implied events is that following data around a scene is far more difficult than it needs to be. That is no great problem in simple worlds, but it is a real obstacle to creating and maintaining complex ones.

Much better would have been to make events explicit. For an eventOut, the event name followed by a TO keyword and a destination, much as ROUTEs currently specify one, would solve a lot of the above problems; likewise an eventIn could use a FROM keyword followed by a source. The world author would decide which half of the event to make explicit.
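
A hypothetical sketch of events written at the point of use, with no ROUTE section at all (either half alone would make the connection):

    # hypothetical -- not valid VRML
    DEF Hotspot TouchSensor { touchTime TO Clock.startTime }
    DEF Clock   TimeSensor  { startTime FROM Hotspot.touchTime }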

Interpolators

These are really clumsy, though I can't blame the language designers because it is a very hard concept to grasp and implement.

The problems with interpolators:

The names -- PositionInterpolator, OrientationInterpolator, ColorInterpolator, CoordinateInterpolator, NormalInterpolator, ScalarInterpolator -- describe what they do, but they are ridiculously wordy, and they duplicate what is basically one interpolation function applied to different datatypes. They could have been named with more readily understood terms, like move, turn, recolor, reshape, changeNormals, changeNumber, or all called one thing -- changer -- with capabilities selected by a keyword.

A TimeSensor is necessary to drive every interpolator, so timer functions should have been built into the interpolator nodes themselves.

The attribute names key and keyValue convey nothing at all to distinguish them from each other. Much better if they'd been named more descriptively, like timing and path (or angles, or such).

I come from an animated-cartoon background, so I can grok the keyframe concept with only moderate difficulty, but others without such experience have a terrible time wrestling with what could have been a simple concept if the film analogy had been abandoned in favor of a more natural one.

Requiring key and keyValue to hold the same number of items makes anything but very simple movements almost impossible to figure out. If timing and position/angle/color were separated it would be easy to specify a natural ease-in and ease-out with just a start position and an end position. It would also be possible to make a complex path with legs of different lengths run at constant speed -- something that is very difficult in the current flawed state.

It is difficult to get a simple ball to bounce in VRML. Imagine how hard it is to move avatars -- files end up with massive, opaque lists of numbers that can't be re-used or built upon.
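
For reference, here is roughly what even the simple bounce takes (a sketch; DEF names mine) -- three nodes and two ROUTEs, and the ball still rises and falls at a flat linear speed because interpolation between keys is strictly linear:

    DEF Ball Transform {
      children Shape { geometry Sphere { radius 0.5 } }
    }
    DEF Clock TimeSensor { cycleInterval 1  loop TRUE }
    DEF Bounce PositionInterpolator {
      key      [ 0, 0.5, 1 ]
      keyValue [ 0 0 0,  0 2 0,  0 0 0 ]
    }
    ROUTE Clock.fraction_changed TO Bounce.set_fraction
    ROUTE Bounce.value_changed   TO Ball.set_translation

Faking gravity's ease-in and ease-out means adding many more key/keyValue pairs by hand -- the massive, opaque lists of numbers in question.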

Very few movements in nature use flat speed; most movements are timed as a curve -- they start slowly, gather speed, then slow to a stop. Often it is sufficient to specify an ease-in rate, the top speed, and the ease-out rate. Some mechanical movements operate at a constant speed throughout -- mostly cyclical ones like a rotating planet.

Like timing, paths in nature rarely follow straight lines; curves are far more common. Unfortunately curved paths are extremely complex to create in VRML. There should be a simple way to specify common and arbitrary curves in an anim node. Common curves are circles, ellipses, parabolas, sines, exponentials. Arbitrary curves could be defined by bezier-like control points (or, even better, control surfaces) of attraction and repulsion.

Currently only absolute positions, angles, etc. may be used for keyValues in interpolators, which is inflexible. Moving an object to a certain spot from an arbitrary starting position is extremely difficult; moving an object to another spot wherever that spot happens to be is likewise extremely difficult, especially if the spot is itself moving; rotating an object to point in its direction of travel is ridiculously hard; pointing an object to look at a moving object is also very hard; pointing an arbitrarily moving object to face a stationary spot is much too hard. Complex scripts that are neither flexible nor adaptable are needed for what should be simple tasks.

How to fix interpolators: separate timing from values; build the timer functions into the interpolator node itself; provide ease-in/ease-out and constant-speed options; allow common and arbitrary curves for both timing and paths; and allow relative as well as absolute keyValues.

Lack of exposedField in Script node

This is a well known annoyance. It has apparently been fixed in X3D.

Lack of absolute position/angle query

There needs to be a function that returns the absolute position and angle of any object. This would be useful for animations in which one or both of the start and stop values are unknown.

Confusing field and event names

Some field and event names are confusing. The ordinary English meaning of "enabled" is close to that of "isActive", and I've seen at least one person confuse the two fields. Some nodes use "on" instead of "enabled", which seems more logical, but such state-altering fields should be named consistently. A better name for "isActive" in the TimeSensor would be "running", to distinguish it from the conceptually quite different "isActive" in most of the other sensors, such as the TouchSensor, where it means a user has clicked on associated geometry. The TouchSensor's "isActive" would be better named "click" or "touch" or "mousebutton". (I know it is meant to work with devices other than mice, but mice are what is used in more than 99% of cases.)

Sensors

TimeSensor is not really a sensor at all but a clock, and its use is quite convoluted. The "enabled" switch doesn't actually start the timer as you would expect; it merely lets you see outputs that have been quietly ticking away all along. This has some counterintuitive results. It is not enough to set cycleInterval and then enable the timer: you must find out the current time and send it to startTime. How do you find out the time? If the signal being used doesn't carry the time then usually a Script must be brought into play. If you want a timer that counts up from 0 to n seconds then the time stream has to be run through a Script that subtracts the start time from the stream's values. Scripts suck a lot of computing time; to require one for such a common task seems absurd. The current time is so rarely needed that it would have been smarter to make the timer simply count up from 0 each time it was started, with a start-value option for anyone who really wanted the current time.
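
The count-up-from-zero dance, sketched (the Script is mine): a click starts the clock, and a whole Script exists just to turn the raw time stream into elapsed seconds:

    DEF Touch TouchSensor { }
    Shape { geometry Box { } }
    DEF Clock TimeSensor { cycleInterval 10 }
    DEF Zeroed Script {
      field    SFTime  t0 0
      eventIn  SFTime  set_start
      eventIn  SFTime  set_time
      eventOut SFFloat elapsed
      url "javascript:
        function set_start(t) { t0 = t; }
        function set_time(t)  { elapsed = t - t0; }"
    }
    ROUTE Touch.touchTime TO Clock.set_startTime
    ROUTE Touch.touchTime TO Zeroed.set_start
    ROUTE Clock.time      TO Zeroed.set_time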

VisibilitySensor doesn't really "sense" in the normal way of thinking; it simply reports whether something is inside the view-frustum. It would work much better as an attribute that can be enabled or disabled on any object. By making the VisibilitySensor box a separate object from the bounding box around the objects, the poor author has, in a sense, to double up those parts of the scene where they want visibility reporting. This adds unnecessary complexity to the authoring process.

ProximitySensor reports on user position and would work better as either an attribute of the current avatar (a problem, because VRML doesn't attach any actual object to the camera -- not even the Viewpoint itself) or an attribute of any object; I suspect it would mostly be used on a null object. The ProximitySensor is several functions trying to live in one node, and many of its uses would be better served by simpler, dedicated low-level pieces. Distance between objects would be better given by a simple distance/angle function, which would beat a ProximitySensor that can only work such things out relative to the current user position. HUDs would be best served by a HUD node, or by making the viewpoint an attribute of an object. The inside/outside boundary trigger would be best as an attribute of any object; that could then be used to enable distance reporting between the user and a point, or between two points.

There should be one pointer-sensor node, with all the others (TouchSensor, PlaneSensor, CylinderSensor, and SphereSensor) simply variations of it, selected by attributes. It should be able to report more than whether a pointer is over or touching: left-mouse-button (lmb) up, down, single-click, double-click, triple-click, right-mouse-button (rmb) click, middle-mouse-button (mmb) click, scroll-wheel turn, mouse-drag, and both buttons pressed simultaneously.

If drag were made available as separate X and Y values, that one general-purpose sensor would cover PlaneSensor, CylinderSensor, SphereSensor, and more. But that requires overcoming another deficiency of VRML: the language needs easy access to the parts of translation and rotation values without going through a Script node. If all such values were simply elements of ordinary arrays it would be no problem to send the X and/or Y drag values to any combination of X-move, Y-move, Z-move, X-turn, Y-turn, Z-turn. This would make it a very easy node to use and would simplify VRML considerably. It would also make it very simple to build gauges and controllers that directly affect more than just position or angle: color, transparency, texture repetition, sound volume, light intensity, and so on.
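
A hypothetical sketch of drag values being sent straight to other properties, no Script in between (PointerSensor and its report field are inventions; Mat and Lamp would be DEFed elsewhere):

    # hypothetical -- not valid VRML
    DEF Knob PointerSensor { report [ "dragX", "dragY" ] }
    ROUTE Knob.dragX TO Mat.transparency
    ROUTE Knob.dragY TO Lamp.intensity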

data storage

VRML doesn't have a data-storage node to simply hold arbitrary data. An MFString attribute of a node can be hacked to hold arbitrary data as multiple strings, but that is still unsatisfactory because of its wordiness, and because strict datatyping prevents mixed data in a single array.
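
The hack in question usually presses WorldInfo's MFString info field into service as a makeshift store, with every value flattened to a string that a Script must parse back apart:

    DEF Store WorldInfo {
      info [
        "name=spinning top",
        "rpm=45",
        "owner=me"
      ]
    }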

There is no easy way to build and access associative arrays for databases inside VRML. Javascript arrays inside Script nodes work for a small number of elements, but they are unwieldy for even moderately large databases, because all the array elements must be stored again each time the script runs -- unless incredibly cumbersome hacks are used to store the data in MFString attributes on the first pass, which means the info has to be written out and then retrieved again in a time-consuming operation before it can be stored once more.

Sounds

There are a couple of different ways to insert sound into a world -- the AudioClip node and, for movie soundtracks, the MovieTexture node -- but mp3 is not among the formats the AudioClip node must accept. This is perhaps an accidental result of mp3 files not being planned for, but they have been around since before VRML; they should be usable the same way other sound files can be. Also notable for its absence from the file formats accepted by the Sound node is .au, which has been one of the few formats accepted by most web browsers since the early days.

It would have made great sense to have incorporated the various tracker formats into the VRML spec. Trackers did for sound what VRML tried to do for 3D. They are incredibly efficient in their use of sound to create long-form music. I believe omitting them was a terrible mistake.

Sound in VRML is incredibly primitive: there are options for spatialising, looping, and changing pitch, and that's all. When you consider everything that can be done with light in VRML (colors, ambient/direct lighting, transparency, shininess, emissiveColor, textures, and later extensions bringing environment mapping), it is amazing that sound didn't get at least a few extra options: filtering, echo, fade, envelopes (attack/sustain/decay), procedural sounds and sound primitives (white noise/triangle/sine/square/custom), pitch, and speed.