Parts of A VR Program
Just what is required of a VR program? The basic parts of the system can be
broken down into an Input Processor, a Simulation Processor, a Rendering
Process, and a World Database. All of these parts must account for the time
they require for processing: every delay in response time degrades the feeling
of 'presence' and the realism of the simulation.
Input Processes of a VR program control the devices used to input information
to the computer. There are a wide variety of possible input devices: keyboard,
mouse, trackball, joystick, 3D & 6D position trackers (glove, wand, head
tracker, body suit, etc.). A networked VR system would add inputs received over
the network. A voice recognition system is also a good augmentation for VR, especially
if the user's hands are being used for other tasks. Generally, the input
processing of a VR system is kept simple. The object is to get the coordinate
data to the rest of the system with minimal lag time. Some position-sensor
systems add filtering and data-smoothing steps. Some glove systems
add gesture recognition. This processing step examines the glove inputs and
determines when a specific gesture has been made. Thus it can provide a higher
level of input to the simulation.
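As a concrete sketch of this idea, the hypothetical C fragment below classifies
one frame of glove data into a 'fist' or 'point' gesture by thresholding
finger-bend values. The sensor range (0.0 straight to 1.0 fully curled), the
threshold values, and the finger ordering are all assumptions made for
illustration, not the API of any particular glove.

    #include <stdio.h>

    #define NUM_FINGERS 5
    #define BEND_CLOSED 0.8f  /* assumed: sensor reads 0.0 (straight)..1.0 (curled) */
    #define BEND_OPEN   0.2f

    typedef enum { GESTURE_NONE, GESTURE_FIST, GESTURE_POINT } Gesture;

    /* Classify one frame of glove data. fingers[0] is taken to be the index
     * finger (a hypothetical ordering; real gloves document their own). */
    Gesture classify_gesture(const float fingers[NUM_FINGERS])
    {
        int curled = 0;
        for (int i = 0; i < NUM_FINGERS; i++)
            if (fingers[i] > BEND_CLOSED) curled++;
        if (curled == NUM_FINGERS)
            return GESTURE_FIST;                      /* all fingers curled */
        if (fingers[0] < BEND_OPEN && curled == NUM_FINGERS - 1)
            return GESTURE_POINT;                     /* index straight, rest curled */
        return GESTURE_NONE;
    }

    int main(void)
    {
        float pointing[NUM_FINGERS] = { 0.1f, 0.9f, 0.9f, 0.95f, 0.9f };
        printf("gesture = %d\n", classify_gesture(pointing)); /* 2 = GESTURE_POINT */
        return 0;
    }

A real system would also hold the classification steady over several frames, so
that a gesture is reported only once it has been made deliberately.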
The core of a VR program is the simulation system. This is the process that
knows about the objects and the various inputs. It handles the interactions,
the scripted object actions, simulations of physical laws (real or imaginary)
and determines the world status. This simulation is basically a discrete
process that is iterated once for each time step or frame. A networked VR
application may have multiple simulations running on different machines, each
with a different time step. Coordination of these can be a complex task.
It is the simulation engine that takes the user inputs, along with any tasks
programmed into the world (collision detection, scripts, etc.), and
determines the actions that will take place in the virtual world.
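A minimal sketch of such a loop, in C, under simplifying assumptions: a single
object stands in for the World Database, the time step is fixed at 50 ms, and
gravity is the only scripted physical law.

    #include <stdio.h>

    #define TIME_STEP 0.05   /* 50 ms per frame, i.e. 20 fps */

    typedef struct { double x, y, z; } Vec3;
    typedef struct { Vec3 pos, vel; } Object;  /* stand-in for a World Database entry */

    /* One simulation step: apply inputs, then the scripted physics
     * (here just gravity; imaginary worlds may pick any law they like). */
    static void simulate(Object *obj, Vec3 input_force, double dt)
    {
        obj->vel.x += input_force.x * dt;
        obj->vel.y += (input_force.y - 9.8) * dt;
        obj->vel.z += input_force.z * dt;
        obj->pos.x += obj->vel.x * dt;
        obj->pos.y += obj->vel.y * dt;
        obj->pos.z += obj->vel.z * dt;
    }

    int main(void)
    {
        Object ball = { {0, 10, 0}, {0, 0, 0} };
        Vec3 no_input = { 0, 0, 0 };

        /* The core loop: read inputs, iterate the simulation once per
         * time step, then hand the new world state to the renderers. */
        for (int frame = 0; frame < 5; frame++) {
            simulate(&ball, no_input, TIME_STEP);
            printf("frame %d: ball at y = %.3f\n", frame, ball.pos.y);
            /* render_visual(...); render_audio(...); would follow here */
        }
        return 0;
    }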
The Rendering Processes of a VR program are those that create the sensations
that are output to the user. A networked VR program would also output data to
other network processes. There would be separate rendering processes for
visual, auditory, haptic (touch/force), and other sensory systems. Each
renderer would take a description of the world state from the simulation
process or derive it directly from the World Database for each time step.
The visual renderer is the most common process and it has a long history from
the world of computer graphics and animation. The reader is encouraged to
become familiar with various aspects of this technology.
The major consideration of a graphic renderer for VR applications is the frame
generation rate. It is necessary to create a new frame every 1/20 of a second
or faster. 20 frames per second (fps) is roughly the minimum rate at which the
human brain will merge a stream of still images and perceive a smooth
animation. (24 fps is the standard rate for film, 25 fps for PAL TV, 30 fps
for NTSC TV, and 60 fps for Showscan film.) This requirement eliminates a number of
rendering techniques such as raytracing and radiosity. These techniques can
generate very realistic images but often take hours to generate a single frame.
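To make the budget concrete: at 20 fps, input processing, simulation, and
rendering together must finish in 1/20 s = 50 ms. A small C sketch of checking
that budget each frame (the structure is illustrative; a real system would time
individual stages as well):

    #include <stdio.h>
    #include <time.h>

    #define TARGET_FPS   20
    #define FRAME_BUDGET (1.0 / TARGET_FPS)   /* 0.05 s = 50 ms per frame */

    int main(void)
    {
        clock_t start = clock();
        /* ... one frame's input, simulation, and rendering would run here ... */
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;

        if (elapsed > FRAME_BUDGET)
            printf("frame overran its %.0f ms budget (%.1f ms used)\n",
                   FRAME_BUDGET * 1000.0, elapsed * 1000.0);
        else
            printf("frame fit in budget: %.1f of %.0f ms used\n",
                   elapsed * 1000.0, FRAME_BUDGET * 1000.0);
        return 0;
    }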
Visual renderers for VR use other methods, such as a 'painter's algorithm', a
Z-buffer, or another scanline-oriented algorithm. Many areas of visual
rendering have been augmented with specialized hardware. The painter's
algorithm is favored by many low-end VR systems since it is relatively fast,
easy to implement, and light on memory resources. However, it has many
visibility problems. For a discussion of this and other rendering algorithms,
see one of the computer graphics reference books listed in a later section.
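As an illustration of the painter's algorithm idea, the C sketch below sorts
polygons back to front by a single representative depth and 'draws' them in
that order, so nearer polygons simply paint over farther ones. The Polygon
structure and the depth values are made up for the example.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        const char *name;
        double depth;   /* representative distance from the eye, e.g. centroid z */
    } Polygon;

    /* Sort farthest-first, so nearer polygons are drawn last and overwrite. */
    static int by_depth_desc(const void *a, const void *b)
    {
        double da = ((const Polygon *)a)->depth;
        double db = ((const Polygon *)b)->depth;
        return (da < db) - (da > db);
    }

    int main(void)
    {
        Polygon polys[] = { {"chair", 2.0}, {"wall", 9.0}, {"table", 5.0} };
        size_t n = sizeof polys / sizeof polys[0];

        qsort(polys, n, sizeof polys[0], by_depth_desc);

        /* "Draw" back to front; a real renderer would rasterize here. */
        for (size_t i = 0; i < n; i++)
            printf("draw %s (depth %.1f)\n", polys[i].name, polys[i].depth);
        return 0;
    }

That single-depth sort is also the source of the visibility problems:
intersecting or mutually overlapping polygons have no one correct drawing order.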
The visual rendering process is often referred to as a rendering pipeline.
This refers to the series of sub-processes that are invoked to create each
frame. A sample rendering pipeline starts with a description of the world, the
objects, lighting, and camera (eye) location in world space. A first step would
be to eliminate all objects that are not visible to the camera. This can be
done quickly by clipping each object's bounding box or sphere against the
viewing pyramid of the camera. The remaining objects then have their geometry
transformed into the eye coordinate system (eye point at origin). Then the
hidden surface algorithm and actual pixel rendering is done.
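A sketch of those first two stages in C, under simplifying assumptions: the
camera sits at eye with no rotation looking down -z (so the world-to-eye
transform is a pure translation rather than a full 4x4 view matrix), the
viewing pyramid has a 90-degree field of view, and each object is bounded by a
sphere.

    #include <stdio.h>
    #include <math.h>

    typedef struct { double x, y, z; } Vec3;
    typedef struct { Vec3 center; double radius; } BoundingSphere;

    /* Transform a world-space point into eye space. Simplified to a pure
     * translation by the assumption of an unrotated camera. */
    static Vec3 world_to_eye(Vec3 p, Vec3 eye)
    {
        Vec3 e = { p.x - eye.x, p.y - eye.y, p.z - eye.z };
        return e;
    }

    /* Quick cull against a 90-degree viewing pyramid: in eye space a point is
     * inside when -z > 0 and |x|, |y| < -z. A sphere survives if it is within
     * 'radius' of that region; the sqrt(2) accounts for the 45-degree tilt of
     * the side planes. */
    static int sphere_maybe_visible(BoundingSphere s, Vec3 eye)
    {
        Vec3 c = world_to_eye(s.center, eye);
        double depth = -c.z;                      /* distance in front of the eye */
        if (depth + s.radius <= 0.0) return 0;    /* entirely behind the camera */
        if (fabs(c.x) > depth + s.radius * sqrt(2.0)) return 0;  /* off to a side */
        if (fabs(c.y) > depth + s.radius * sqrt(2.0)) return 0;  /* above/below  */
        return 1;    /* keep: pass on to hidden-surface removal and rendering */
    }

    int main(void)
    {
        Vec3 eye = { 0, 0, 0 };
        BoundingSphere in_front = { { 0, 0, -10 }, 1.0 };
        BoundingSphere behind   = { { 0, 0,  10 }, 1.0 };
        printf("in_front visible? %d\n", sphere_maybe_visible(in_front, eye)); /* 1 */
        printf("behind visible?   %d\n", sphere_maybe_visible(behind, eye));   /* 0 */
        return 0;
    }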
The pixel rendering is also known as the 'lighting' or 'shading' algorithm.
There are a number of different methods that are possible depending on the
realism and calculation speed available. The simplest method is called flat
shading and simply fills the entire area with the same color. The next step up
provides some variation in color across a single surface. Beyond that is the
possibility of smooth shading across surface boundaries and the addition of
specular highlights, as in the classic Gouraud and Phong shading methods.
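As an illustration of the simplest end of that range, the C sketch below
computes a single flat-shading (Lambert) intensity for a polygon from its
normal and the light direction; that one value would then fill the entire
area. The vectors and the ambient term are example inputs.

    #include <stdio.h>
    #include <math.h>

    typedef struct { double x, y, z; } Vec3;

    static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    static Vec3 normalize(Vec3 v)
    {
        double len = sqrt(dot(v, v));
        Vec3 n = { v.x/len, v.y/len, v.z/len };
        return n;
    }

    /* Flat (Lambert) shading: one intensity for the whole polygon,
     * proportional to the cosine between its normal and the light direction. */
    static double flat_shade(Vec3 face_normal, Vec3 to_light, double ambient)
    {
        double diffuse = dot(normalize(face_normal), normalize(to_light));
        if (diffuse < 0.0) diffuse = 0.0;   /* facing away: ambient light only */
        return ambient + (1.0 - ambient) * diffuse;  /* 0.0 black .. 1.0 full */
    }

    int main(void)
    {
        Vec3 normal = { 0, 0, 1 };          /* polygon facing the viewer */
        Vec3 light  = { 0, 1, 1 };          /* light above and behind the eye */
        printf("intensity = %.3f\n", flat_shade(normal, light, 0.1));
        return 0;
    }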
An effective shortcut for visual rendering is the use of "texture" or "image"
maps. These are pictures that are mapped onto objects in the virtual world.
Instead of calculating lighting and shading for the object, the renderer
determines which part of the texture map is visible at each visible point of
the object. The resulting image appears to have significantly more detail than
is otherwise possible. Some VR systems have special 'billboard' objects that
always face towards the user. By mapping a series of different images onto the
billboard, the user can get the appearance of moving around the object.
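A sketch of that billboard technique in C: given the viewer's position relative
to the object, pick which of several pre-rendered images to map onto the
always-facing quad. The choice of 8 images at 45-degree intervals, and the
image layout, are assumptions made for the example.

    #include <stdio.h>
    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    #define NUM_VIEWS 8   /* assumed: 8 images rendered at 45-degree intervals */

    /* Choose which pre-rendered image to show, based on the horizontal angle
     * from the object to the viewer. Image 0 is taken to be the view from the
     * +x direction, with the rest proceeding counterclockwise. */
    static int billboard_image(double obj_x, double obj_z,
                               double viewer_x, double viewer_z)
    {
        double angle  = atan2(viewer_z - obj_z, viewer_x - obj_x); /* -pi..pi */
        double sector = 2.0 * M_PI / NUM_VIEWS;
        int index = (int)floor((angle + sector / 2.0) / sector);   /* nearest view */
        return ((index % NUM_VIEWS) + NUM_VIEWS) % NUM_VIEWS;      /* wrap 0..7 */
    }

    int main(void)
    {
        /* As the viewer circles the object, the billboard swaps images. */
        for (int i = 0; i < 4; i++) {
            double a = i * (M_PI / 2.0);    /* 0, 90, 180, 270 degrees */
            printf("viewer at %3.0f deg -> image %d\n",
                   a * 180.0 / M_PI, billboard_image(0, 0, cos(a), sin(a)));
        }
        return 0;
    }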
I need to correct my earlier statement that radiosity cannot be used for VR
systems due to the time requirements. There have recently been at least two
radiosity renderers announced for walkthrough-type systems: Lightscape from
Lightscape Graphics Software of Canada and Real Light from Atma Systems of
Italy. These packages compute the radiosity lighting beforehand, in a long,
time-consuming preprocess. The user can interactively control the camera view but
cannot interact with the world. An executable demo of the Atma product is
available for SGI systems from ftp.iunet.it (126.96.36.199) in the directory
A VR system is greatly enhanced by the inclusion of an audio component. This
may produce mono, stereo or 3D audio. The latter is a fairly difficult
proposition. It is not enough to do stereo-pan effects as the mind tends to
locate these sounds inside the head. Research into 3D audio has shown that
there are many aspects of our head and ear shape that affect the recognition of
3D sounds. It is possible to apply a rather complex mathematical function
(called a Head Related Transfer Function or HRTF) to a sound to produce this
effect. The HRTF is a very personal function that depends on the individual's
ear shape, etc. However, there has been significant success in creating
generalized HRTFs that work for most people and most audio placements. A number
of problems remain, such as the 'cone of confusion', wherein sounds behind the
head are perceived to be in front of it.
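A sketch of the basic operation in C: the mono source is convolved with a
left-ear and a right-ear impulse response drawn from the HRTF for the desired
direction. The four-tap impulse responses below are made-up placeholders;
measured HRTF filters run to hundreds of taps and vary with direction.

    #include <stdio.h>

    #define IR_LEN 4   /* made-up length; real HRTFs use far more taps */

    /* Direct convolution: out must hold n + IR_LEN - 1 samples. */
    static void convolve(const float *in, int n, const float ir[IR_LEN], float *out)
    {
        for (int i = 0; i < n + IR_LEN - 1; i++) out[i] = 0.0f;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < IR_LEN; j++)
                out[i + j] += in[i] * ir[j];
    }

    int main(void)
    {
        /* Placeholder responses for a source off to the listener's left:
         * the left ear hears it sooner and louder than the shadowed right. */
        float hrtf_left[IR_LEN]  = { 0.9f, 0.3f, 0.1f, 0.0f };
        float hrtf_right[IR_LEN] = { 0.0f, 0.4f, 0.2f, 0.1f };

        float mono[] = { 1.0f, 0.5f, 0.25f };   /* a short mono source signal */
        int n = sizeof mono / sizeof mono[0];
        float left[3 + IR_LEN - 1], right[3 + IR_LEN - 1];

        convolve(mono, n, hrtf_left, left);
        convolve(mono, n, hrtf_right, right);

        for (int i = 0; i < n + IR_LEN - 1; i++)
            printf("sample %d: L=%.3f R=%.3f\n", i, left[i], right[i]);
        return 0;
    }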
Sound has also been suggested as a means to convey other information, such as
surface roughness. Dragging your virtual hand over sand would sound different
than dragging it through gravel.
Haptics is the generation of touch and force-feedback information. This is a
very new area of science, and there is much to be learned. There have been very few
studies done on the rendering of true touch sense (such as liquid, fur, etc.).
Almost all systems to date have focused on force feedback and kinesthetic
senses. These systems can provide good clues to the body regarding the touch
sense, but are considered distinct from it. Many of the haptic systems built
thus far have been exoskeletons that can be used for position sensing as well
as for providing resistance to movement or active force application.