Cognigy
Cognigy.AI provides a graphical conversation editor, AI-based understanding of text, and integration with backend systems such as CRMs and ERPs.
Project Setup
To build voicebots with Cognigy and CVG, you need an account for both Cognigy and CVG. If you still need an account, please contact support@vier.ai.
On the Cognigy side, there must be a project with an endpoint of type Socket pointing to a flow.
To set up the Cognigy-backed project in CVG, only the endpoint URL is required. It can be found on Cognigy’s Endpoint page within an agent. A valid endpoint URL should look like this:
https://endpoint-trial.cognigy.ai/nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
To connect your Cognigy bot with CVG, configure your CVG project like this under the bot configuration section:
Select the “Cognigy” template
Fill in the Cognigy endpoint URL as supplied by Cognigy.
Optionally fill in the Cognigy channel, which is used in Cognigy analytics to group dialogs from certain projects together. (If Cognigy is used in conjunction with the Provisioning API, this option is called channel.)
To start using the VIER Voice Extension for Cognigy, log in to Cognigy and load the newest version of the extension via the Cognigy Marketplace for Extensions.
Alternatively, you can download the newest version of the VIER Voice Extension for Cognigy from our GitHub account. Download the vier_voice.tar.gz file and use it to install or update the extension in Cognigy.
You need to install the VIER Voice Extension for Cognigy in every Cognigy Agent where you want to use it.
VIER Voice Extension for Cognigy
With the VIER Voice Extension for Cognigy, VIER provides a set of nodes that make it much easier for dialog designers to develop voicebots and control telephony features.
These nodes offer extensive functionality for the call control and voice features of CVG. Technical documentation of the nodes can be found in the corresponding GitHub project, and our online manual describes how to use them.
The information exchanged between CVG and Cognigy is described in detail in the following section, “Communication”.
Communication
From CVG to Cognigy (Events)
All events from CVG are transmitted to Cognigy. Every event contains
the dialog ID as the session,
the remote phone number as the user ID. If the caller suppressed the phone number, the user ID will instead be suppressed-remote-number.
Additionally, all events carry extra data that can be accessed using Cognigy’s ci.data object:
dialogId: The globally unique ID assigned to the dialog by CVG.
projectContext: An object that contains the resellerToken and projectToken, which can be required for certain API calls.
timestamp: The point in time (Unix timestamp in milliseconds) at which the event occurred in CVG.
Text messages generated by the transcription of user input or by pressing DTMF keys also carry the language of the transcriber, the confidence value (integer in [0, 100]), the vendor, an indicator triggeredBargeIn that states whether the user input triggered barge-in, and the type (string, either SPEECH or DTMF), which allows differentiating between DTMF tones and actual speech.
Example additional data for such a text message:
{
"dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1",
"projectContext": {
"projectToken": "6681aeef-137f-465f-962c-01fa344a1b27",
"resellerToken": "f0ffe847-b84a-40fc-bd23-bd79100fe7b0"
},
"timestamp": 1679219476949,
"type": "SPEECH",
"text": "Stopp.",
"confidence": 82,
"vendor": "MICROSOFT",
"language": "de-DE",
"triggeredBargeIn": true
}
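As a rough illustration of how a bot might use these fields, the following Python sketch labels a text message by its type, confidence, and triggeredBargeIn values. The function name and the confidence threshold of 60 are assumptions made for this example only; CVG does not define a threshold.

```python
from typing import Any, Dict

# Hypothetical threshold chosen for this sketch; it is not defined by CVG.
MIN_CONFIDENCE = 60


def classify_text_event(data: Dict[str, Any]) -> str:
    """Return a coarse label for a transcription or DTMF text message."""
    if data.get("type") == "DTMF":
        return "dtmf"
    # confidence is documented as an integer in [0, 100].
    if data.get("confidence", 0) < MIN_CONFIDENCE:
        return "low-confidence-speech"
    if data.get("triggeredBargeIn"):
        return "barge-in-speech"
    return "speech"


example = {
    "type": "SPEECH",
    "text": "Stopp.",
    "confidence": 82,
    "vendor": "MICROSOFT",
    "language": "de-DE",
    "triggeredBargeIn": True,
}
print(classify_text_event(example))
```

A flow could use such a label to re-ask the caller when the transcription confidence is low, or to treat DTMF input differently from speech.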
Events that are not related to speech carry no text. Instead, they have a status field with one of the following values:
greeting: This status is sent once before anything else to allow the bot to respond, e.g. with a greeting. It contains additional data, for example:
{ "status": "greeting", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "local": "+49123412341234", "remote": "+49567856785678", "language": "de-DE", "transcriberLanguage": "de-DE", "synthesizerLanguage": "de-DE", "callType": "INBOUND", "customSipHeaders": { "X-SomeCustomHeader": ["value"] }, "customDialogData": {}, "projectConfiguration": { "recordingsAllowed": false, "inactivityTimeout": 10000, "enabledFeatureFlags": [] } }
The remote number will be an empty string when the caller suppressed the number.
termination: This status signals that the conversation has been terminated by the user. It contains additional data, for example:
{ "status": "termination", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "reason": "botDisconnected" }
inactive: Signals that the inactivity timeout has been triggered due to a lack of user input. It contains additional data, for example:
{ "status": "inactive", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "duration": 5000 }
recording-available: Signals a change in the recording status, for example:
{ "status": "recording-available", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "recordingId": "default" }
The recording ID will be default if no ID has been specified when starting the recording.
answer: The result of a prompt (see next section). Here’s an example answer for a number prompt:
{ "status": "answer", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "confidence": 100, "language": "en-US", "type": { "name": "Number", "value": "8342" } }
For a multiple choice prompt, it would look something like this instead:
{ "status": "answer", "dialogId":"09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp":1535546718115, "confidence": 83, "language": "en-US", "type": { "name": "MultipleChoice", "id": "no", "synonym": "never" } }
In case the prompt timed out because the user didn’t answer in time the answer looks like this:
{ "status": "answer", "dialogId":"09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp":1535546718115, "confidence": 100, "language": "en-US", "type": { "name": "Timeout" } }
outbound-success: The success result of forward or bridge (see next section). It signals that the outgoing call has been successfully established. An example:
{ "status": "outbound-success", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "ringTime": 12545, "ringStartTimestamp": 1535546718225 }
outbound-failure: The failure result of forward or bridge (see next section). It signals that the outgoing call could not be established and provides some details as to why. An example:
{ "status": "outbound-failure", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115, "ringTime": 12545, "ringStartTimestamp": 1535546718225, "reason": "RING_TIMED_OUT" }
Depending on the exact reason (see the OutboundCallFailure model in the API specification for all possible reasons), there might not be a ringStartTimestamp and the ringTime could be zero.
refer-failure: The failure result of refer (see next section). It signals that the SIP REFER operation could not be performed. An example:
{ "status": "refer-failure", "dialogId": "09e59647-5c77-4c02-a1c5-7fb2b47060f1", "projectContext": { "resellerToken": "ed4aff6d-c6f8-4ac9-ab67-d072ef45d9a0", "projectToken": "d30b1c38-b2fd-49c8-bec2-b268871338b0" }, "timestamp": 1535546718115 }
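The status values listed above lend themselves to a simple dispatch. The following Python sketch is only an illustration of that idea: the function name and the returned descriptions are made up for this example, and the field names are taken from the sample events in this section.

```python
from typing import Any, Dict


def describe_event(data: Dict[str, Any]) -> str:
    """Summarize the extra data of a CVG event in one line."""
    status = data.get("status")
    if status is None:
        # Text messages (speech or DTMF) carry text instead of a status.
        return "text message"
    if status == "greeting":
        # remote is an empty string when the caller suppressed the number.
        return f"call started from {data.get('remote') or 'suppressed number'}"
    if status == "termination":
        return f"call ended: {data.get('reason')}"
    if status == "inactive":
        return "inactivity timeout triggered"
    if status == "recording-available":
        return f"recording available: {data.get('recordingId')}"
    if status == "answer":
        return f"prompt answered: {data.get('type', {}).get('name')}"
    if status == "outbound-success":
        return "outbound call established"
    if status == "outbound-failure":
        return f"outbound call failed: {data.get('reason')}"
    if status == "refer-failure":
        return "SIP REFER failed"
    return f"unknown status: {status}"


print(describe_event({"status": "termination", "reason": "botDisconnected"}))
print(describe_event({"status": "greeting", "remote": ""}))
```

Handling the unknown case explicitly keeps a bot from failing silently if further status values are introduced later.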
From Cognigy to CVG (Commands)
Generally, any standard “say” functionality is supported by CVG, no matter how it is triggered (Say node, Code node, Process, …). The text is passed through to the synthesizers without any modifications, so SSML can be used with synthesizers that support it.
In addition to the text, Cognigy also supports supplying custom data together with or instead of the text. Together with CVG, this can be used to pass additional information to the call. Some of these custom data payloads use the text message supplied by Cognigy; other payloads ignore it.
All possible custom payloads are described in the following sections.
Say
Say can be used for messages that need some customization. This payload requires Cognigy’s text. In its simplest form, this payload does not contain any data, since all options are optional.
Options
language (optional): Overrides the synthesizer language for specific messages. (string, e.g. “de-DE”, defaults to the project language)
synthesizers (optional): If specified, this parameter overrides the synthesizer list from the project settings. Additional synthesizers are used as a fallback (in order) in case a service is currently unreachable. Please refer to the Say API specification for more details.
interpretAs (optional): Explicitly states what the given text should be interpreted as. If omitted, CVG tries to detect SSML and otherwise assumes plain text. Use TEXT if the text sent by your bot might contain XML-like text that could lead to a false SSML detection. Use SSML if the text sent by your bot should always be interpreted as SSML, even if it does not start with a <speak> tag.
bargeIn (optional): Allows the message to be interrupted by the speaker. (boolean, default false)
Examples
{
"bargeIn": true,
"language": "de-CH",
"synthesizers": ["MICROSOFT", "c84dba09-8c2a-4e5e-98b5-d54e59812ee5", {"vendor":"GOOGLE","voice":"wavenet-Z"}]
}
Termination
This payload hangs up the call after all other messages have been synthesized. Messages with bargeIn enabled will be interrupted by the termination. For messages without bargeIn, the termination happens after the end of the speech output.
Options
No options available.
Examples
{
"status": "termination"
}
Forward
This payload forwards a call to an external phone number (restricted to specific countries; ask us if you want to forward to a country that is currently not enabled) or SIP URI.
If the outbound call could not be established, the bot will receive an outbound-failure event (see previous section). Otherwise, the bot will receive an outbound-success event as soon as the call is fully established.
Options
destinationNumber (required): The phone number (+E.164 format, e.g. “+49721480848680”) or SIP URI (e.g. sip:user@example.org) to forward to.
callerId (optional): The phone number displayed to the callee. (This is a best-effort option; correct display cannot be guaranteed.)
customSipHeaders (optional): An object where each property is the name of a header, and the value is a list of strings. All header names must begin with X-. For example: { "X-SomeHeader": ["some value", "another value"] }
ringTimeout (optional): The maximum time the call will be ringing (in milliseconds) before the attempt is cancelled. By default, this is 120 seconds (120000 milliseconds).
acceptAnsweringMachines (optional): Whether the bot should accept answering machines picking up. Answering machine detection is a best-effort functionality and bots should not rely on an exact detection. It also cuts off up to 5 seconds of the beginning of the call for detection purposes.
data (optional): An object with key-value pairs to be attached as custom data to the dialog.
experimentalEnableRingingTone (optional, experimental): Enables the playback of a ringing tone while the call is pending. This option will change in the future.
Examples
{
"status": "forward",
"destinationNumber": "+49721480848680",
"callerId": "+49721480848680"
}
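The constraints on the forward options can be checked before the payload is sent. The Python sketch below is a hypothetical helper, not part of CVG or Cognigy: the function name, the simplified E.164 pattern, and the loose SIP URI check (a plain sip: prefix test) are all assumptions made for this example.

```python
import re
from typing import Any, Dict, List, Optional

# Simplified E.164 check: a plus sign followed by up to 15 digits.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def build_forward(
    destination_number: str,
    caller_id: Optional[str] = None,
    custom_sip_headers: Optional[Dict[str, List[str]]] = None,
    ring_timeout_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a forward payload and reject obviously invalid options."""
    is_phone = E164_PATTERN.fullmatch(destination_number) is not None
    is_sip = destination_number.startswith("sip:")  # loose check, assumption
    if not (is_phone or is_sip):
        raise ValueError("destinationNumber must be +E.164 or a SIP URI")

    payload: Dict[str, Any] = {
        "status": "forward",
        "destinationNumber": destination_number,
    }
    if caller_id is not None:
        payload["callerId"] = caller_id
    if custom_sip_headers:
        for name in custom_sip_headers:
            # All custom header names must begin with "X-".
            if not name.startswith("X-"):
                raise ValueError(f"invalid header name: {name}")
        payload["customSipHeaders"] = custom_sip_headers
    if ring_timeout_ms is not None:
        payload["ringTimeout"] = ring_timeout_ms
    return payload


print(build_forward("+49721480848680", caller_id="+49721480848680"))
```

Validating early gives the dialog designer a clear error message instead of an outbound-failure event at call time.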
Bridge
This payload bridges a call to an external phone number (restricted to specific countries; ask us if you want to bridge to a country that is currently not enabled) for the Assist use case.
If the outbound call could not be established, the bot will receive an outbound-failure event (see previous section). Otherwise, the bot will receive an outbound-success event as soon as the call is fully established.
Options
headNumber (required): The phone number prefix to bridge to. (+E.164 format, e.g. “+49721480848680”)
extensionLength (required): The range of extensions to choose a number from.
callerId (optional): The phone number displayed to the callee. (This is a best-effort option; correct display cannot be guaranteed.)
customSipHeaders (optional): An object where each property is the name of a header, and the value is a list of strings. All header names must begin with X-. For example: { "X-SomeHeader": ["some value", "another value"] }
ringTimeout (optional): The maximum time the call will be ringing (in milliseconds) before the attempt is cancelled. By default, this is 120 seconds (120000 milliseconds).
acceptAnsweringMachines (optional): Whether the bot should accept answering machines picking up. Answering machine detection is a best-effort functionality and bots should not rely on an exact detection. It also cuts off up to 5 seconds of the beginning of the call for detection purposes.
data (optional): An object with key-value pairs to be attached as custom data to the dialog.
experimentalEnableRingingTone (optional, experimental): Enables the playback of a ringing tone while the call is pending. This option will change in the future.
Examples
{
"status": "bridge",
"headNumber": "+49721480848680",
"extensionLength": 3
}
Refer
This payload transfers a call to an external phone number or SIP URI using SIP REFER. Make sure that the involved call parties actually support SIP REFER.
If the outbound call could not be established, the bot will receive a refer-failure event (see previous section). Otherwise, the bot will receive a termination event with the reason callReferred as soon as the call is fully established.
Options
destination (required): The phone number (+E.164 format, e.g. “+49721480848680”) or SIP URI (e.g. sip:user@example.org) to refer to.
Examples
{
"status": "refer",
"destination": "+49721480848680"
}
Play
This payload can be used to play audio files to be heard by the caller.
Note the following requirements and limitations:
The audio file must be hosted at an Internet-accessible HTTP(S) endpoint. In case of HTTPS the server hosting the audio file must present a valid, trusted SSL certificate. Self-signed certificates cannot be used.
The audio file must be a valid wav file (waveform audio file format).
The file format must be one of the following:
Linear PCM with signed 16 bits per sample, with a sample rate of 8000 Hz or 16000 Hz
A-law with a sample rate of 8000 Hz
µ-law with a sample rate of 8000 Hz
Options
url (required): The location of the audio file.
bargeIn (optional): Allows the message to be interrupted by the speaker. (boolean, default false)
Examples
{
"status": "play",
"url": "https://example.org/some-audio.wav",
"bargeIn": true
}
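The format requirements above can be checked locally before an audio file is uploaded. The following Python sketch uses the standard library wave module and covers only the linear PCM case; the wave module cannot read A-law or µ-law files, so those are reported as unsupported here even though CVG accepts them at 8000 Hz. Both function names are invented for this example.

```python
import io
import wave


def is_supported_linear_pcm(wav_bytes: bytes) -> bool:
    """Check the linear PCM case: 16 bits per sample at 8000 Hz or 16000 Hz."""
    try:
        with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
            # getsampwidth() returns bytes per sample, so 2 means 16 bits.
            return wav.getsampwidth() == 2 and wav.getframerate() in (8000, 16000)
    except (wave.Error, EOFError):
        # Not a WAV file, truncated, or an encoding the wave module cannot read.
        return False


def make_silence(sample_rate: int, sample_width: int = 2) -> bytes:
    """Create a short mono WAV file in memory for demonstration purposes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * sample_width * 80)
    return buffer.getvalue()


print(is_supported_linear_pcm(make_silence(8000)))
print(is_supported_linear_pcm(make_silence(44100)))
```

A check like this only inspects the file itself. It does not verify that the URL is reachable from the Internet or that the server presents a trusted certificate.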
Recording Start
This payload can be used to start the recording.
Options
maxDuration (optional): Maximum recording duration in milliseconds. After the duration, the recording will be stopped automatically.
recordingId (optional): An arbitrary string to identify the recording in case multiple recordings are created in the same dialog.
speakers (optional): A list of audio channels to record. Possible values are CUSTOMER and AGENT.
Examples
{
"status": "recording-start",
"maxDuration": 20000,
"recordingId": "string",
"speakers": [
"CUSTOMER",
"AGENT"
]
}
Recording Stop
This payload can be used to stop the recording.
Options
recordingId (optional): An arbitrary string to identify the recording in case multiple recordings are created in the same dialog.
terminate (optional): Whether the recording should be terminated, rather than just paused. If terminated, the recording will be processed as soon as possible instead of deferring processing until the dialog has ended.
Examples
{
"status": "recording-stop",
"recordingId": "string",
"terminate": false
}
Data
This payload can be used to attach custom data to the dialog.
Options
data (required): An object that can have arbitrary properties. Each property is expected to have a string value.
Examples
{
"status": "data",
"data": {
"UsedLanguage": "language={{input.data.language}}",
"CallerNumber": "{{input.data.remote}}"
}
}
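Since each property is expected to have a string value, non-string values need to be converted before they are attached. The Python sketch below shows one possible way to do that by serializing non-string values as JSON text. The helper name and the choice of JSON serialization are assumptions for this example, not something CVG prescribes.

```python
import json
from typing import Any, Dict


def build_data_payload(values: Dict[str, Any]) -> Dict[str, Any]:
    """Build a data payload whose property values are all strings."""
    data = {
        # Strings are kept as they are; everything else is serialized to JSON text.
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in values.items()
    }
    return {"status": "data", "data": data}


print(build_data_payload({"UsedLanguage": "de-DE", "Attempts": 3, "Verified": True}))
```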
Debug
This payload can be used to log bot state to CVG for debugging purposes.
Options
details (required): The information to be logged as arbitrary JSON.
Examples
{
"status": "debug",
"details": {
"same-field": 123.4
}
}
Prompt
This payload starts one of various prompts. This payload requires Cognigy’s text.
Options
message (required): The message to introduce the prompt to the caller.
timeout (required): The duration (in milliseconds) after which the prompt will be cancelled.
type (required): The type of prompt and all its details in a nested object. Please consult the Prompt API specification.
language (optional): Overrides the synthesizer language for specific messages. (string, e.g. “de-DE”, defaults to the project language)
synthesizers (optional): If specified, this parameter overrides the synthesizer list from the project settings. Additional synthesizers are used as a fallback (in order) in case a service is currently unreachable. Please refer to the Prompt API specification for more details.
interpretAs (optional): Explicitly states what the given text should be interpreted as. If omitted, CVG tries to detect SSML and otherwise assumes plain text. Use TEXT if the text sent by your bot might contain XML-like text that could lead to a false SSML detection. Use SSML if the text sent by your bot should always be interpreted as SSML, even if it does not start with a <speak> tag.
bargeIn (optional): Allows the message to be interrupted by the speaker. (boolean, default false)
Examples
Number Prompt
Here is another sample of a custom payload. It requests CVG to collect at most 4 digits via DTMF input, using # as the DTMF signal to terminate input collection (number prompt).
{
"status": "prompt",
"language": "en-US",
"bargeIn": false,
"timeout": 5000,
"type": {
"name": "Number",
"maxDigits": 4,
"submitInputs": ["DTMF_#"]
}
}
Some things to note:
There has to be at least one stop condition (so either maxDigits or submitInputs must be specified).
The caller response to such a prompt request will be a message with the answer status and type name Number (see previous section).
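The stop-condition rule can be enforced when the payload is assembled. The following Python sketch is a hypothetical builder (its name and parameters are not part of CVG or Cognigy) that mirrors the number prompt example and refuses to create a prompt without maxDigits or submitInputs.

```python
from typing import Any, Dict, List, Optional


def build_number_prompt(
    timeout_ms: int,
    max_digits: Optional[int] = None,
    submit_inputs: Optional[List[str]] = None,
    language: Optional[str] = None,
    barge_in: bool = False,
) -> Dict[str, Any]:
    """Build a number prompt payload with at least one stop condition."""
    if max_digits is None and not submit_inputs:
        raise ValueError("specify maxDigits or submitInputs")

    prompt_type: Dict[str, Any] = {"name": "Number"}
    if max_digits is not None:
        prompt_type["maxDigits"] = max_digits
    if submit_inputs:
        prompt_type["submitInputs"] = list(submit_inputs)

    payload: Dict[str, Any] = {
        "status": "prompt",
        "bargeIn": barge_in,
        "timeout": timeout_ms,
        "type": prompt_type,
    }
    if language is not None:
        payload["language"] = language
    return payload


print(build_number_prompt(5000, max_digits=4, submit_inputs=["DTMF_#"], language="en-US"))
```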
Multiple Choice Prompt
Besides number prompts, there are also multiple choice prompts. In the following example, the two available “yes” and “no” choices can each be triggered by one of the provided synonyms. The synonyms also include DTMF digits, in case the user prefers to simply press ‘0’ or ‘1’.
{
"status": "prompt",
"timeout": 5000,
"type": {
"name": "MultipleChoice",
"choices": {
"yes": [
"yes",
"yeah",
"affirmative",
"DTMF_1"
],
"no": [
"no",
"never",
"negative",
"DTMF_0"
]
}
}
}
Some things to note:
The caller response to such a prompt request will be a message with the answer status and type name MultipleChoice (see previous section).
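On the receiving side, a bot has to tell the different answer types apart. The Python sketch below shows one way to read the answer events documented in the previous section. The function name is invented for this example, and returning None for a timeout is a design choice of the sketch, not CVG behavior.

```python
from typing import Any, Dict, Optional


def interpret_answer(data: Dict[str, Any]) -> Optional[str]:
    """Extract the result of a prompt from an answer event.

    Returns the entered digits for a Number answer, the choice id for a
    MultipleChoice answer, and None if the prompt timed out.
    """
    if data.get("status") != "answer":
        raise ValueError("not an answer event")
    answer_type = data.get("type", {})
    name = answer_type.get("name")
    if name == "Number":
        return answer_type.get("value")
    if name == "MultipleChoice":
        return answer_type.get("id")
    if name == "Timeout":
        return None
    raise ValueError(f"unsupported answer type: {name}")


choice_event = {
    "status": "answer",
    "confidence": 83,
    "language": "en-US",
    "type": {"name": "MultipleChoice", "id": "no", "synonym": "never"},
}
print(interpret_answer(choice_event))
```

The synonym field is ignored here because the choice id is usually what a flow branches on; it remains available in the event if the exact wording matters.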