instantreality 1.0

Vision Tracking Device

Keywords:
Instant IO, devices, vision
Author(s): Mario Becker
Date: 2008-06-10

Summary: This tutorial shows you how to do vision-based tracking. As an example, we will create a marker tracker.

Introduction

This example uses an IOSensor to instantiate a vision tracking device. For basic information about sensors, refer to "iosensor".

Tracking in General

The tracking module is based on a secondary library. For that reason the configuration of a tracking system does not follow the X3D or VRML style; it is kept in an additional XML file, which must be referenced from your scene definition. Some terms used in this tutorial:
World
:
The description of the world as the tracking system sees it. It contains one or more TrackedObjects.

TrackedObject
:
Describes an object to be tracked. This node contains all information about that object.

Marker
:
One way to track things is to use a marker. To describe a TrackedObject by a marker, this node is added to the TrackedObject.

Camera
:
A camera is used in every vision tracking system. The node contains one IntrinsicData and one ExtrinsicData node.

ExtrinsicData
:
Part of the camera description. Contains the parameters describing the position and orientation of the camera in the world.

IntrinsicData
:
Second part of the camera description, describes the internal parameters of a camera like resolution, focal length or distortion.

ActionPipe
:
The execution units (Actions) in InstantVision are arranged in an execution pipe which is called an ActionPipe.

DataSet
:
All data used in InstantVision are placed in the DataSet and have a key (name) by which they are referenced.

The Example

Two files are needed to set up an InstantReality scene with a vision tracking device. The first is a VisionLib configuration file, which describes the tracking setup; the second is a scene file for Instant Player.

The VisionLib config (visionlib.pm) comes first. All images and cameras used in the VisionLib config are exported to Instant Player, so you can use the images as textures or backgrounds and the cameras as transformations for Viewpoint, Viewfrustum or ComponentTransform. The names of the images are the same as in the VisionLib config. Each camera is exported under four names: the first part is the key of the TrackedObject from which the camera is derived, and the suffix names the output type. In this example the names are "TrackedObjectCamera_ModelView", "TrackedObjectCamera_Projection", "TrackedObjectCamera_Position" and "TrackedObjectCamera_Orientation". The camera here is derived from World.TrackedObject, which gives it its name.

Tracking multiple markers can be achieved by duplicating the TrackedObject section and giving each TrackedObject a distinct key and marker code. As described above, you will get the camera (inverted object) transformations under names formed from the TrackedObject key plus a suffix such as "Camera_ModelView".
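
As a rough sketch of that duplication, the World section of the config could carry a second TrackedObject. The key "TrackedObject2", the marker key "Marker2" and the second marker code are made up for illustration; only the structure is copied from the visionlib.pm listing in this tutorial.

```xml
<World key="World">
  <TrackedObject key="TrackedObject">
    <!-- ExtrinsicData and Marker "Marker1" unchanged from visionlib.pm -->
  </TrackedObject>
  <!-- Hypothetical second object: key and marker code are assumptions -->
  <TrackedObject key="TrackedObject2">
    <ExtrinsicData calibrated="0">
      <R rotation="1 0 0 &#xA;"/>
      <t translation="0 0 0 &#xA;"/>
      <Cov covariance="0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;"/>
    </ExtrinsicData>
    <Marker BitSamples="2" MarkerSamples="6" NBPoints="4" key="Marker2">
      <!-- Arbitrary example code, chosen only to differ from Marker1 -->
      <Code Line1="1000" Line2="0110" Line3="0010" Line4="0001"/>
      <Points3D nb="4">
        <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="0" y="6" z="0"/>
        <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="6" y="6" z="0"/>
        <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="6" y="0" z="0"/>
        <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="0" y="0" z="0"/>
      </Points3D>
    </Marker>
  </TrackedObject>
</World>
```

Following the naming rule above, the second object's outputs would be expected under names such as "TrackedObject2Camera_ModelView", and the IOSensor in the scene file would need matching field declarations.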

Code: The visionlib.pm file

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<VisionLib2 Version="2.0">

  <Plugins size="0">
  </Plugins>

  <ActionPipe category="Action" name="AbstractApplication-AP">
    <VideoSourceAction category="Action" name="VideoSourceAction">
      <Keys size="2">
        <key val="VideoSourceImage"/>
        <key val=""/>
      </Keys>
      <ActionConfig preferred_height="480" preferred_width="640" shutter="-1" source_url="ds"/>
    </VideoSourceAction>
    <ImageConvertActionT__ImageT__RGB_FrameImageT__GREY_Frame category="Action" name="ImageConvertActionT">
      <Keys size="2">
        <key val="VideoSourceImage"/>
        <key val="ConvertedImage"/>
      </Keys>
    </ImageConvertActionT__ImageT__RGB_FrameImageT__GREY_Frame>
    <MarkerTrackerAction category="Action">
      <Keys size="5">
        <key val="ConvertedImage"/>
        <key val="IntrinsicData"/>
        <key val="World"/>
        <key val="MarkerTrackerInternalContour"/>
        <key val="MarkerTrackerInternalSquares"/>
      </Keys>
      <ActionConfig MTAThresh="140" MTAcontrast="0" MTAlogbase="10" WithKalman="0" WithPoseNlls="1"/>
    </MarkerTrackerAction>
    <TrackedObject2CameraAction category="Action" name="TrackedObject2Camera">
      <Keys size="3">
        <key val="World"/>
        <key val="IntrinsicData"/>
        <key val="Camera"/>
      </Keys>
    </TrackedObject2CameraAction>
  </ActionPipe>

  <DataSet key="">
    <IntrinsicDataPerspective calibrated="1" key="IntrinsicData">
      <!--Image resolution (application-dependent)-->
      <Image_Resolution h="480" w="640"/>
      <!--Normalized principal point (invariant for a given camera)-->
      <Normalized_Principal_Point cx="5.0037218855e-01" cy="5.0014036507e-01"/>
      <!--Normalized focal length and skew (invariant for a given camera)-->
      <Normalized_Focal_Length_and_Skew fx="1.6826109287e+00" fy="2.2557202465e+00" s="-5.7349563803e-04"/>
      <!--Radial and tangential lens distortion (invariant for a given camera)-->
      <Lens_Distortion k1="-1.6826758076e-01" k2="2.5034542035e-01" k3="-1.1740904370e-03" k4="-4.8766380599e-03" k5="0.0000000000e+00"/>
    </IntrinsicDataPerspective>
    <World key="World">
      <TrackedObject key="TrackedObject">
        <ExtrinsicData calibrated="0">
          <R rotation="1 0 0 &#xA;"/>
          <t translation="0 0 0 &#xA;"/>
          <Cov covariance="0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;0  0  0  0  0  0  &#xA;"/>
        </ExtrinsicData>
        <Marker BitSamples="2" MarkerSamples="6" NBPoints="4" key="Marker1">
          <Code Line1="1100" Line2="1100" Line3="0100" Line4="0000"/>
          <Points3D nb="4">
            <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="0" y="6" z="0"/>
            <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="6" y="6" z="0"/>
            <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="6" y="0" z="0"/>
            <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="0" y="0" z="0"/>
          </Points3D>
        </Marker>
      </TrackedObject>
    </World>
  </DataSet>

</VisionLib2>

The scene file instantiates an IOSensor based on the VisionLib config file. The output of this IOSensor is then routed to a texture and a Viewfrustum node.

Code: The visionlib.x3d file

<?xml version="1.0" encoding="UTF-8"?>
<X3D>
  <Engine DEF='engine'>
    <TimerJob DEF='timer'/>
    <SynchronizeJob DEF='synchronize'/>
    <RenderJob DEF='render'>
      <WindowGroup>
        <Window position='10 50' size='640,480' fullScreen='false' />
      </WindowGroup>
    </RenderJob>
  </Engine>

  <Scene DEF='scene'>
    <IOSensor DEF='VisionLib' type='VisionLib' configFile='visionlib.pm'>
      <field accessType='outputOnly' name='VideoSourceImage' type='SFImage'/>
      <field accessType='outputOnly' name='TrackedObjectCamera_ModelView' type='SFMatrix4f'/>
      <field accessType='outputOnly' name='TrackedObjectCamera_Projection' type='SFMatrix4f'/>
      <field accessType='outputOnly' name='TrackedObjectCamera_Position' type='SFVec3f'/>
      <field accessType='outputOnly' name='TrackedObjectCamera_Orientation' type='SFRotation'/>
    </IOSensor>

    <Viewfrustum DEF='vf' />

    <PolygonBackground>
      <Appearance positions='0 0, 1 0, 1 1, 0 1' >
        <TextureTransform rotation='0' scale='1 -1'/>
        <PixelTexture2D DEF='tex' autoScale='false'/>
      </Appearance>
    </PolygonBackground>

    <Transform translation='0 0 0'>
      <Shape DEF='geo2'>
        <Appearance>
          <Material emissiveColor='1 0.5 0' />
        </Appearance>
        <Teapot size='5 5 5' />
      </Shape>
    </Transform>

    <ROUTE fromNode='VisionLib' fromField='VideoSourceImage' toNode='tex' toField='image'/>
    <ROUTE fromNode='VisionLib' fromField='TrackedObjectCamera_ModelView' toNode='vf' toField='modelview'/>
    <ROUTE fromNode='VisionLib' fromField='TrackedObjectCamera_Projection' toNode='vf' toField='projection'/>
  </Scene>
</X3D>

Instead of the Viewfrustum, you can also use a Viewpoint (excerpt).

Code: The visionlib_vp.x3d file

    <Viewpoint DEF='vf' fieldOfView='0.5' />

    <ROUTE fromNode='VisionLib' fromField='VideoSourceImage' toNode='tex' toField='image'/>
    <ROUTE fromNode='VisionLib' fromField='TrackedObjectCamera_Position' toNode='vf' toField='position'/>
    <ROUTE fromNode='VisionLib' fromField='TrackedObjectCamera_Orientation' toNode='vf' toField='orientation'/>

Code: The visionlib.wrl file

#VRML V2.0 utf8

DEF trackingSensor IOSensor {
  type "VisionLib"
  configFile "visionlib.pm"
  eventOut SFImage    VideoSourceImage
  eventOut SFMatrix4f TrackedObjectCamera_ModelView
  eventOut SFMatrix4f TrackedObjectCamera_Projection
  eventOut SFVec3f    TrackedObjectCamera_Position
  eventOut SFRotation TrackedObjectCamera_Orientation
}

DEF trans Transform {
  children [
    Shape {
      appearance Appearance {
        texture DEF tex PixelTexture2D {
        }
      }
      geometry Box {
      }
    }
  ]
}

ROUTE trackingSensor.VideoSourceImage TO tex.image
ROUTE trackingSensor.TrackedObjectCamera_Orientation TO trans.rotation

Modifications

This section gives you some clues about what to change to get your setup running.

VideoSource

The example above uses DirectShow (or QuickTime on the Mac) to access a camera. This should work for all cameras that support it; these usually need a WDM driver to be installed. To use other cameras, change the VideoSourceAction:ActionConfig:source_url field in the .pm file. A number of arguments can also be passed to the video source driver. They are appended to the source URL like this: driver://parameter1=value;parameter2=value.
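
For illustration, a source_url with arguments might look as follows in the VideoSourceAction of the .pm file. The device name "MyWebcam" and the frame rate of 30 are placeholder values, not taken from a real setup; device and framerate are parameters of the ds driver as listed in this section.

```xml
<VideoSourceAction category="Action" name="VideoSourceAction">
  <Keys size="2">
    <key val="VideoSourceImage"/>
    <key val=""/>
  </Keys>
  <!-- Assumed values: replace "MyWebcam" with the name your camera reports -->
  <ActionConfig preferred_height="480" preferred_width="640" shutter="-1"
                source_url="ds://device=MyWebcam;framerate=30"/>
</VideoSourceAction>
```

Only the source_url attribute differs from the tutorial config; the other attributes are kept as they are.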

Some drivers and their parameters are (available on platform in parentheses):

ds
:
(win32, darwin) Windows driver as mentioned above; on a Mac this is the same as "qtvd". Parameters are: device - string name of the camera, mode - string name of the mode, framerate - integer. The driver compares the given parameters to whatever DirectShow reports about the camera. If there is a match, the matching parameters are used; other values are ignored.

vfw
:
(win32) Old Video for Windows driver. It may or may not work with your camera; no parameters are implemented.

v4l
:
(linux) Works with video4linux (old version 1). There is no parameter support yet, but you can pass something like v4l:///dev/myvideodev to select a device. The driver also reads the environment variable VIDEO_SIZE, an integer value [0-10] that selects a video size between 160x120 and 768x576.

ieee1394
:
(win32) FireWire DC cameras which run with the CMU driver http://www.cs.cmu.edu/~iwan/1394/index.html on Windows. No parameters are available for now.

ieee1394
:
(linux) FireWire DC cameras which run with the video1394 kernel module and libdc1394 (coriander), which includes PGR devices. Parameters are: unit - integer value for selecting a camera on the bus, trigger - boolean [0,1], 1 switches on the external trigger, downsample - boolean [0,1], downsamples a Bayer-coded image to half size, device - string like "/dev/video1394/0", the device file to use.

ieee1394pgr
:
(win32) PointGreyResearch cameras, license needed. Parameters are: unit - integer value for selecting a camera on the bus, trigger - boolean [0,1], 1 switches on the external trigger, downsample - boolean [0,1], downsamples a Bayer-coded image to half size, mode - string value to select a mode; for example, passing "mode=320" selects a mode with a resolution of 320x240.

ueye
:
(win32, linux) IDS imaging uEye cameras, license needed (more adjustments than the ds drivers). Parameters are: downsample - boolean [0,1], downsamples a Bayer-coded image to half size.

vrmc
:
(win32) VRmagic cameras, license needed. No parameters are supported yet.

qtvd
:
(darwin) Mac QuickTimeVideoDigitizer. No parameters yet.

Some of these drivers require additional libraries/DLLs, which must be installed on your system and be in the path.

Marker

A marker in InstantVision (IV) is described by a 4x4 code mask and four corner points. You can easily change the marker code by editing the fields DataSet:World:TrackedObject:Marker:Code:LineX. The code consists of four lines, Line1 = 1100, Line2 = 1100, Line3 = 0100, Line4 = 0000, where 0 = black and 1 = white, e.g.

Image: marker code

The real marker must have a black square around this code and a white square around that. It will look like this:

Image: full marker

One way to create and print such markers is to open Word and create an 8x8 table, make the rows and columns the same size, and color the cell backgrounds black and white.
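
To make the table layout concrete, the marker from this tutorial can be written out cell by cell. The sketch below assumes that Line1 is the top row of the code and that each line reads left to right; the tutorial does not state this explicitly, so check the printed result against your tracker. W stands for a white cell, B for a black one.

```xml
<!--
  8x8 print layout for the example code (orientation is an assumption):

    W W W W W W W W    white surround
    W B B B B B B W    black border
    W B W W B B B W    Line1 = 1100
    W B W W B B B W    Line2 = 1100
    W B B W B B B W    Line3 = 0100
    W B B B B B B W    Line4 = 0000
    W B B B B B B W    black border
    W W W W W W W W    white surround
-->
<Code Line1="1100" Line2="1100" Line3="0100" Line4="0000"/>
```

The inner 6x6 block (black border plus code) is the field whose corners the Points3D entries describe.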

You can also change the position of the marker in the world by changing the ...Marker:Points3D:HomgPoint3Covd:[xyz] values; make sure the marker stays rectangular and planar. The points describe the outer black border (the 6x6 field), not the white surround. They correspond to the upper left, upper right, lower right and lower left corners of the image.

You can also use multiple markers in one TrackedObject: just duplicate DataSet:World:TrackedObject:Marker and change one of them to reflect its physical position on the object you want to track.
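
A sketch of such a TrackedObject with two markers is given below. The second marker's key "Marker2", its code and its placement (same plane, shifted by 10 units along x) are assumptions chosen for illustration; the corner order follows the one used for Marker1.

```xml
<TrackedObject key="TrackedObject">
  <!-- ExtrinsicData and Marker "Marker1" unchanged from visionlib.pm -->

  <!-- Hypothetical second marker on the same object -->
  <Marker BitSamples="2" MarkerSamples="6" NBPoints="4" key="Marker2">
    <!-- Arbitrary example code, chosen only to differ from Marker1 -->
    <Code Line1="1010" Line2="0100" Line3="0011" Line4="0000"/>
    <Points3D nb="4">
      <!-- upper left, upper right, lower right, lower left of the 6x6 field -->
      <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="10" y="6" z="0"/>
      <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="16" y="6" z="0"/>
      <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="16" y="0" z="0"/>
      <HomgPoint3Covd Cov3x3="0  0  0  &#xA;0  0  0  &#xA;0  0  0  &#xA;" w="1" x="10" y="0" z="0"/>
    </Points3D>
  </Marker>
</TrackedObject>
```

The coordinates must match where the second marker physically sits relative to the first, measured in the same units as the Marker1 points.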
