instantreality 1.0

Component: Sound
Status: interface only
Structure type: concrete
Standard: Avalon

AudioTTS

Text-To-Speech (TTS) node. Transforms text into audio data using a synthetic computer voice. Besides creating the audio data, this node can also provide weights to morph between different geometries, which can be used to animate the lips of avatars.
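
The following is a minimal usage sketch, not taken from the specification: it assumes that an AudioTTS node can be plugged into a standard Sound node as its source (the way an AudioClip would be) and that playback is started by routing a time value to startTime. The TouchSensor, the DEF names and the containerField values are illustrative assumptions; only the AudioTTS fields themselves are taken from the interface below.

<Sound>
  <!-- AudioTTS used as the sound source; the Voice child configures the synthetic voice -->
  <AudioTTS DEF='tts' containerField='source'
            text='Hello, I am your avatar.'
            loop='FALSE'>
    <Voice containerField='voice'/>
  </AudioTTS>
</Sound>
<!-- illustrative trigger: clicking some geometry starts the speech -->
<TouchSensor DEF='clickToSpeak'/>
<ROUTE fromNode='clickToSpeak' fromField='touchTime' toNode='tts' toField='startTime'/>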

Code

XML encoding
<AudioTTS text=''
          visemeKey=''
          weightValue=''
          visemeDurationScale='0.5'
          autoSilentIndex='-1'
          enabled='TRUE'
          description=''
          loop='FALSE'
          pitch='1.0'
          startTime='0'
          stopTime='0'
          pauseTime='0'
          resumeTime='0'
          triggerName='Sound'
          logFeature='' />
Classic encoding
AudioTTS {
	text ""
	visemeKey [""]
	weightValue []
	visemeDurationScale 0.5
	autoSilentIndex -1
	enabled TRUE
	description ""
	loop FALSE
	pitch 1.0
	startTime 0
	stopTime 0
	pauseTime 0
	resumeTime 0
	triggerName "Sound"
	logFeature [""]
}

Interface

Name | DataType | PartType | Default | Description
triggerName | SFString | initializeOnly | "Sound" | Name of the dynamic context-slot which is used by the run-time environment (e.g. Jobs) to trigger the node. Life-Nodes automatically connect the context eventOut to the triggerSlot eventIn slot.
visemeKey | MFString | initializeOnly | | Maps visemes to geometries. E.g. when the viseme "a" is represented by the first geometry, put "a" at index 0 of this field, etc.
voice | SFNode | initializeOnly | | Contains a Voice node that describes the synthetic computer voice.
weightValue | MFFloat | initializeOnly | | Can be used to remap weights. It contains (number of visemes) * (number of geometries) weight values; when the node has to display viseme i, it uses the (number of geometries) values starting at index i * (number of geometries) as weights. If this field is empty, the node sets the weight of the geometry that corresponds to the viseme to 1 and the weights of all other geometries to 0. See the configuration sketch after this table.
triggerSlot | SFTime | inputOnly | | Slot that is used internally to connect the dynamic context-slot whose name is set by the triggerName value. It is used automatically to install the run-time environment trigger.
autoSilentIndex | SFInt32 | inputOutput | -1 | The index of the silent geometry. When this field is >= 0, the node ensures that the corresponding geometry is shown at the end of the animation sequence.
description | SFString | inputOutput | | Text description to be displayed for the action of this node. Hint: many XML tools substitute XML character references automatically if needed.
enabled | SFBool | inputOutput | TRUE | Enables or disables the audio source. A disabled audio source does not produce audio data.
logFeature | MFString | inputOutput | | Controls the logging of changes. Possible values: state (log state changes, e.g. live), child (log child add/remove), parent (log parent add/remove), route (log route add/remove), eventIn (log receiving of events), eventOut (log sending of events), guiView (runtime system should create a node view), guiEdit (runtime system should create a node editor), everything (log everything).
loop | SFBool | inputOutput | FALSE | Repeat indefinitely when loop=TRUE, play only once when loop=FALSE.
metadata | SFNode | inputOutput | | Container for payload metadata (a MetadataObject node, e.g. inside a MetadataSet element).
pauseTime | SFTime | inputOutput | 0 | When time now >= pauseTime, isPaused becomes true and the node becomes paused. Absolute time: number of seconds since Jan 1, 1970, 00:00:00 GMT. Hint: usually receives a ROUTEd time value.
pitch | SFFloat | inputOutput | 1.0 | Multiplier for the rate at which the sampled sound is played. Changing pitch also changes playback speed.
resumeTime | SFTime | inputOutput | 0 | When time now >= resumeTime, isPaused becomes false and the node resumes playing. Absolute time: number of seconds since Jan 1, 1970, 00:00:00 GMT. Hint: usually receives a ROUTEd time value.
startTime | SFTime | inputOutput | 0 | Time at which playback starts. Absolute time: number of seconds since Jan 1, 1970, 00:00:00 GMT. Hint: usually receives a ROUTEd time value.
stopTime | SFTime | inputOutput | 0 | Time at which playback stops. Absolute time: number of seconds since Jan 1, 1970, 00:00:00 GMT. Hint: usually receives a ROUTEd time value.
text | SFString | inputOutput | | The text that is spoken by the computer voice.
visemeDurationScale | SFFloat | inputOutput | 0.5 | Determines how long visemes are displayed during the animation. With the default value of 0.5, the current viseme is displayed for half of its time slot, a quarter of the time is used to interpolate from the previous viseme to the current one, and a quarter to interpolate from the current viseme to the next one.
cycleTime | SFTime | outputOnly | | Sends a time event at startTime and at the beginning of each new cycle (useful for synchronization with other time-based objects).
duration_changed | SFTime | outputOnly | | Length of time in seconds for one cycle of audio.
elapsedTime | SFTime | outputOnly | | Current elapsed time since the node was activated, cumulative in seconds, not counting any paused time.
fraction_changed | SFFloat | outputOnly | | Continuously sends a value in the range [0,1] showing time progress in the current cycle.
isActive | SFBool | outputOnly | | isActive true/false events are sent when playback starts/stops.
isPaused | SFBool | outputOnly | | isPaused true/false events are sent when the node is paused/resumed.
marker_reached | SFString | outputOnly | | Sends the name of a marker in the text when the computer voice reaches the position of the marker. Use this outslot to synchronize animations with the spoken text. Not yet implemented.
ready | SFTime | outputOnly | | Sends the current timestamp when audio data is available. Currently not implemented.
time | SFTime | outputOnly | | Continuously sends the absolute time (seconds since Jan 1, 1970) for a given simulation tick.
weights_changed | MFFloat | outputOnly | | Provides the weights for the different geometries used to create the final geometry. Usually there is one geometry for each viseme provided by the text-to-speech system. For each geometry you get one weight, and the sum of all weights is 1. Multiplying each geometry by its weight and summing the results yields a lip animation that is synchronous with the speech. See the sketch after this table.
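
To make the interplay of visemeKey, weightValue, and weights_changed more concrete, here is a hedged configuration sketch. It assumes a text-to-speech system that reports three visemes ("sil", "a", "o") and a scene with three matching mouth geometries, so weightValue holds 3 * 3 = 9 values (here the identity mapping, i.e. no remapping). The Script node, its field name, and the blending comment are illustrative stand-ins for whatever morphing mechanism the scene actually uses.

<!-- three visemes mapped to three mouth geometries; geometry 0 ("sil") is also the silent/rest shape -->
<AudioTTS DEF='tts'
          text='Hello'
          visemeKey='"sil" "a" "o"'
          weightValue='1 0 0  0 1 0  0 0 1'
          visemeDurationScale='0.5'
          autoSilentIndex='0'/>
<!-- hypothetical receiver: blends the mouth geometries with the incoming weights -->
<Script DEF='mouthMorpher'>
  <field accessType='inputOnly' name='set_weights' type='MFFloat'/>
  <![CDATA[ecmascript:
    function set_weights(w) {
      // w[i] is the weight of mouth geometry i; the weights sum to 1.
      // A real implementation would compute the blended coordinates here,
      // e.g. blended = w[0]*rest + w[1]*mouthA + w[2]*mouthO.
    }
  ]]>
</Script>
<ROUTE fromNode='tts' fromField='weights_changed' toNode='mouthMorpher' toField='set_weights'/>

With autoSilentIndex='0' the animation ends on the silent mouth shape, and visemeDurationScale='0.5' keeps half of each viseme slot for the viseme itself while splitting the rest between the blend from the previous viseme and the blend to the next one.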