New and Exciting “Contact Center AI Solution”

Just imagine: you call customer support on 123-456-7890 and your experience is very bad. Would you want to call them again? Umm... no.

What are the chances that you switch away from the product brand you are using? Speaking conservatively, at least 50%.

“As reported in Huffington Post, 91% of unhappy customers will simply leave a brand without ever complaining.”

Nowadays, “customer experience” is key to retaining customers and to making every call memorable for the people calling in. This small but impactful goal needs AI technology to back it up and give the organisation its competitive edge.

Looking at current market trends, Grand View Research valued the global call center AI market at USD 1.16B in 2021 and expects it to grow at a CAGR of at least 23% until 2030.

As per GlobeNewswire, this market was valued at USD 1.42B in 2021 and is expected to grow at a 25% CAGR until 2026.

In this blog, we will explore conversational AI solutions in general, and the VoiceAI product from AudioCodes as one such solution.

What is a Voice-enabled Conversational AI Bot?

Before I talk about this subject, let me put something on your radar.

Contact Center AI technology is used in several use cases.

a) Predictive Call Routing

b) Interactive Voice Response

c) Conversational AI

d) Emotional Intelligence AI

e) AI Powered Recommendations

f) Call Analytics

We will be speaking about “Conversational AI” in this blog.

As the name suggests, this is a conversational AI-based bot (text or voice enabled) capable of responding to humans. Examples include AI-enabled virtual agents, chatbots, etc., which offer extended customer support services.

Consider this use case: you call your bank’s helpline to check your credit limit. A voice bot greets you and asks you to enter your card details. Based on your response, it checks for a match in its database before it solves your problem. Now consider that, instead of a bot, a human asked for your card details. Would you be comfortable telling them?

In the last few years, there have been significant advances in how bots are developed and used. The majority have been text-based bots that respond to specific queries. If the bot can’t resolve a query, the chat escalates to an actual human supervisor.

They are used in almost every industry; however, the service industry uses them the most.

To give more perspective, chatbots in the service sector are expected to be the fastest-growing market between 2022 and 2026, with a CAGR of 31.6%. Gartner predicts that, by 2022, 70% of interactions will involve emerging technologies such as AI, ML, chatbots and mobile messaging.

Now imagine that you add voice and telephony channels to your existing text-based chatbots, reusing your existing AI investments and voice lines. What a transformation that would be, and how much better the customer experience would become.

Benefits of this approach

To begin with, you are connecting the user to a voice, which is the most intuitive form of communication. I would always prefer to speak to someone than chat with a bot. Even home assistants like Alexa, Google Home and Siri are voice based, although you can still give them instructions through their apps.

Second, automated voice-based interaction for customer engagement is one of the best ways to reduce wait times, improve the experience and reduce live agent costs. Looked at holistically, not every incoming call requires a live agent. Some are simple queries that an automated bot can resolve.

And the list goes on.

First of all, let’s understand the current challenges.

What are the current challenges with today’s Solutions?

As you might have noticed, especially since the pandemic:

Whenever you open a website or app offering chat assistance, your messages are answered by a chatbot, e.g. Uber chat support, Amazon, etc.

Whenever you call a support IVR, either you are required to press DTMF keys to choose options, or the call is answered by a voice bot first and then by a human if the bot can’t answer the query, e.g. any airline.

The current market situation suggests that:

a) Most bot frameworks only offer chat-based solutions.

b) Even among those that offer a voice solution, none supply “native” telephony integration.

c) Many customers still use the legacy DTMF method to route calls to the correct queue.

d) There is demand for contextualisation and a more human-like connection.

e) There is a need to reduce the effort and time it takes to resolve queries.

f) The customer lifecycle experience needs managing when customers call the helpline number.

g) There is demand for customized call flows that are easy to build and operate.

Also, from a business perspective, a large team of live agents is expensive.

Let me cover some of these in detail:

Telephony System Integration – Cognitive services such as speech-to-text, text-to-speech and bot frameworks are largely HTTPS-based and lack telephony integration. Telephony relies on various VoIP protocols and codecs, which makes integration hard for bot developers.

Voice Quality and Voice Latency – Accurate speech detection and low voice latency are harder to achieve for bots on telephony channels because of low-bandwidth voice codecs.

User Experience – This is more complex for voice bots than for chatbots. Human voice interactions involve caller interruptions, periods with no user input and the expectation of more human-like communication.
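To make the user experience challenge more concrete, here is a minimal Python sketch of two things a voice bot has to deal with and a chatbot does not: the caller staying silent, and the caller interrupting the bot while it is still speaking. Every name here (handle_turn, should_stop_playback, the limit of two silent turns) is a hypothetical choice for illustration and not part of any product API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TurnResult:
    action: str  # "respond", "reprompt" or "escalate"
    detail: str


def handle_turn(
    utterance: Optional[str],
    silent_turns: int,
    max_silent_turns: int = 2,  # assumed limit, tune per call flow
) -> Tuple[TurnResult, int]:
    """Decide what to do after one listening window.

    `utterance` is None (or blank) when the caller said nothing
    before the timeout. Returns the decision and the updated
    count of consecutive silent turns.
    """
    if utterance is None or not utterance.strip():
        silent_turns += 1
        if silent_turns > max_silent_turns:
            # Too many silent turns: hand the call over.
            return TurnResult("escalate", "no input from caller"), silent_turns
        return TurnResult("reprompt", "Sorry, I did not hear you."), silent_turns
    # The caller spoke, so the silence counter resets.
    return TurnResult("respond", utterance.strip()), 0


def should_stop_playback(
    bot_is_speaking: bool,
    caller_is_speaking: bool,
    barge_in_enabled: bool = True,
) -> bool:
    """Stop the bot's prompt when the caller talks over it."""
    return barge_in_enabled and bot_is_speaking and caller_is_speaking
```

A text chatbot can simply wait for the next message. A voice bot needs rules like these on every turn, which is part of why the voice experience is harder to get right.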

Let’s now review the solution from AudioCodes that addresses these problems.

AudioCodes VoiceAI Solution

AudioCodes has recently introduced its conversational AI capability, called “VoiceAI Connect”.

Leveraging their extensive expertise in voice communications, AudioCodes VoiceAI Connect enables the integration of any cognitive voice service and bot framework with any voice or telephony channel, thus facilitating full voice functionality and creating an intelligent voice journey.

Some capabilities include:

a) Connects any contact center or SIP trunk to any bot framework

b) Native telephony and voice control from any bot framework via simple APIs

c) Superior voice quality drives fast and accurate bot response

d) Robust, secure and scalable solution architecture

e) Offered as a managed service or cloud SaaS

Let me explain the illustration below:

VoiceAI Connect acts as a bridge between telephony channels (VoIP protocols), cognitive speech services and bot frameworks. This means that you integrate your choice of bot framework, your choice of cognitive speech services (text-to-speech, speech-to-text, etc.) and your telephony channels, the latter through AudioCodes SBCs. You can use any public telephony carrier, contact center or enterprise communication platform.

VoiceAI Connect bridges any bot framework and any telephony system using a best-of-breed approach that lets you pick the best provider for each cognitive voice service.
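The bridge idea can be pictured with a small Python sketch. This is only a conceptual model under my own assumptions, not how VoiceAI Connect is implemented or configured: the VoiceBridge class and the three stand-in providers are hypothetical, and real providers would be network services rather than local functions.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical provider interfaces: each one is just a function here.
SpeechToText = Callable[[bytes], str]
Bot = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VoiceBridge:
    """Chains caller audio -> text -> bot reply -> audio."""

    stt: SpeechToText
    bot: Bot
    tts: TextToSpeech

    def handle_audio(self, caller_audio: bytes) -> bytes:
        text = self.stt(caller_audio)   # speech-to-text provider
        reply = self.bot(text)          # bot framework
        return self.tts(reply)          # text-to-speech provider


# Stand-in providers so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def echo_bot(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


bridge = VoiceBridge(stt=fake_stt, bot=echo_bot, tts=fake_tts)
```

Because each provider is passed in separately, swapping one of them (say, a different speech-to-text engine) does not touch the other two. That is the best-of-breed idea in miniature.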

VoiceAI Connect Use Cases

There are four main use cases for this:

Intelligent Virtual Agent (IVA) – Offloads simple and repetitive tasks from live agents to voice-bots. It allows live agents to concentrate on more complex customer interactions, reduces hold time and improves the customer experience without increasing the number of agents. If the interaction cannot be completed by the bot, the call will be transferred seamlessly, along with the relevant details, to a live agent for completion. This solution is characterized by high scalability, a high return on investment and an exceptional user experience.
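As a rough illustration of the handover described above, the following Python sketch lets a bot answer the intents it knows and otherwise transfers the call together with what it has collected so far. The intent names, the canned answers and the "live_agent" queue are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallContext:
    caller_id: str
    transcript: List[str] = field(default_factory=list)
    collected: Dict[str, str] = field(default_factory=dict)


# Hypothetical intents that the bot can resolve on its own.
SELF_SERVICE_ANSWERS = {
    "opening_hours": "We are open 9am to 5pm, Monday to Friday.",
    "card_limit": "Your credit limit has been sent to you by text message.",
}


def handle_intent(ctx: CallContext, intent: str, utterance: str) -> dict:
    """Answer simple requests; transfer everything else with context."""
    ctx.transcript.append(utterance)
    if intent in SELF_SERVICE_ANSWERS:
        return {"action": "answer", "text": SELF_SERVICE_ANSWERS[intent]}
    # The bot cannot complete this request: pass the details along so
    # the live agent does not have to ask the same questions again.
    return {
        "action": "transfer",
        "queue": "live_agent",
        "context": {
            "caller_id": ctx.caller_id,
            "transcript": list(ctx.transcript),
            "collected": dict(ctx.collected),
        },
    }
```

The important part is the "context" field in the transfer result: it is what makes the handover feel seamless to the caller.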

Conversational IVR – Legacy Interactive Voice Response (IVR) allows humans to interact with a computer-operated phone system using DTMF tones inputted via a keypad. It is usually a long and tedious process, which has a detrimental effect on the customers’ user experience. Conversational AI-based IVR uses natural language understanding to replace hierarchical menus with a free speech experience. It ascertains the customers’ needs and instantly routes the call to a virtual or live agent. The solution saves time, is more accurate and improves the user experience.
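The difference between the two routing styles can be sketched in a few lines of Python. Please treat this as a toy: the menu, the queue names and the keyword sets are invented, and simple keyword matching stands in for a real natural language understanding service.

```python
import re

# Legacy IVR: the caller has to know which key maps to which queue.
DTMF_MENU = {"1": "billing", "2": "technical_support", "3": "sales"}

# Toy stand-in for NLU: keywords that hint at each queue.
KEYWORDS = {
    "billing": {"bill", "invoice", "payment", "charge"},
    "technical_support": {"broken", "error", "internet", "outage"},
    "sales": {"buy", "upgrade", "plan", "offer"},
}


def route_dtmf(digit: str) -> str:
    """Route by key press; unknown keys go to the operator."""
    return DTMF_MENU.get(digit, "operator")


def route_utterance(utterance: str) -> str:
    """Route by what the caller says, using keyword overlap."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    best_queue, best_hits = "operator", 0
    for queue, keywords in KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_queue, best_hits = queue, hits
    return best_queue
```

With the menu, the caller listens to every option and presses 2. With the conversational version, the caller just says "My internet is broken" and lands in the same queue.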

Virtual Agent Assists – The bot assists contact center agents by listening to conversations between customers and agents, analyzing the data and sending real-time insights to the agents or their supervisors. The insights can be utilized to guide the human agents through the process of handling customer inquiries and suggesting relevant answers. It optimizes agent productivity and enhances the user experience. Customers are not aware of the AI tool’s involvement.
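A very small Python sketch of the agent assist idea follows. It assumes the conversation is already available as (speaker, text) pairs, and it uses a hypothetical table of trigger words and hints. A real deployment would analyze the live transcript with a bot or NLU engine instead of substring matching.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical trigger words and the hint shown to the agent.
HINTS: Dict[str, str] = {
    "cancel": "Offer the retention discount before processing a cancellation.",
    "refund": "Check the refund policy window for this order.",
}


def assist(transcript: Iterable[Tuple[str, str]]) -> List[str]:
    """Return hints for the agent based on what the customer says."""
    hints: List[str] = []
    for speaker, text in transcript:
        if speaker != "customer":
            continue  # only the customer's side triggers hints
        lowered = text.lower()
        for trigger, hint in HINTS.items():
            if trigger in lowered and hint not in hints:
                hints.append(hint)
    return hints
```

The hints go to the agent's screen only, so the customer never hears or sees them.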

Outbound Dialing – Allows third-party applications to use your voice-bot for automated outbound selling, setting and confirming appointments, and more.

VoiceAI Connect has 2 Offerings

Cloud Edition: A SaaS-based offering from AudioCodes. You can use BYOC (bring your own carrier) or phone numbers from AudioCodes globally.

Enterprise Edition: Can be deployed in any data center or cloud as a dedicated instance.

A range of Telephony Services are available:

a) Public telephony providers (SIP Trunk)

b) Contact centers

c) Enterprise communication platforms

d) Web-based calling (WebRTC)

VoiceAI Connect supports a range of speech service providers, including:

a) Azure Speech Services

b) Amazon Web Services (AWS): Amazon Polly (text-to-speech) and Amazon Transcribe (speech-to-text)

c) Almagu (text-to-speech only)

d) AmiVoice (speech-to-text only)

e) AudioCodes LVCSR (speech-to-text only)

f) Google Cloud Speech-to-Text and Text-to-Speech

g) Nuance Speech-to-Text and Text-to-Speech

h) Yandex

i) Uniphore (speech-to-text only)

VoiceAI Connect offers connectivity to a wide range of leading third-party bot frameworks:

a) Microsoft Azure Bot Framework (using Direct Line 3.0 API)

b) Microsoft Power Virtual Agents

c) Google Dialogflow ES

d) Google Dialogflow CX

e) Amazon Lex4 (beta – currently, not supported on VoiceAI Cloud Edition)

f) Koda

g) Inbenta

h) Creative Virtual

i) Membit

j) CoCoHub

Solution Benefits

On top of the fundamental capabilities of connectivity to any telephony channel, bot framework and speech services, the VoiceAI Connect solution has additional important benefits that improve any voice-bot solution:

Smooth integration – The solution easily connects any telephony system to any bot framework. Because a single vendor performs the telephony and cognitive services integrations, successful implementation and best performance are much more likely.

Voice quality and voice latency – VoiceAI Connect is the only solution today that relies on SBC architecture with direct connectivity from the telephony systems to the speech engines and bot frameworks. This minimizes the number of media hops and delivers excellent voice quality and minimal voice latency.

Best-of-breed approach – VoiceAI Connect offers a list of certified speech engine providers (text-to-speech and speech-to-text) that bot developers can choose from to meet their specific needs. The solution also provides public APIs that enable it to be integrated with any other speech engine.

Reduce traffic to speech services – Based on the SBC’s ability to detect silence and voice, and stop and start speech-to-text detection accordingly, VoiceAI Connect can reduce the traffic and cost of the speech-to-text service by up to 40%. For text-to-speech, the solution implements a caching mechanism that reduces both traffic and the cost of this engine and also minimizes voice latency.

Multiple deployment options – VoiceAI Connect is offered in two variations to suit any deployment option and to provide the highest possible flexibility for the bot developer.

Reducing speech service costs

VoiceAI Connect provides mechanisms that reduce the billing costs of third-party, text-to-speech and speech-to-text services. These services charge per consumption, which is a significant part of the cost of the whole voicebot solution. The VoiceAI Connect mechanisms can reduce the cost of speech services by up to 40%.

The first mechanism concerns text-to-speech services, whereby VoiceAI Connect caches the text-to-speech prompts from the bot, eliminating the need to involve the text-to-speech service provider for repeated prompts.
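The caching idea can be sketched as follows. This is a generic in-memory cache written for illustration, assuming a provider function that takes text and a voice name. It is not the VoiceAI Connect implementation, and the CachedTextToSpeech and fake_provider names are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class CachedTextToSpeech:
    """Wraps a text-to-speech provider and reuses audio for repeated prompts."""

    def __init__(self, synthesize: Callable[[str, str], bytes]):
        self._synthesize = synthesize      # the billable provider call
        self._cache: Dict[str, bytes] = {}
        self.provider_calls = 0            # how often we actually paid

    def speak(self, text: str, voice: str = "default") -> bytes:
        # The same text in a different voice is different audio,
        # so the voice is part of the cache key.
        key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._synthesize(text, voice)
            self.provider_calls += 1
        return self._cache[key]


def fake_provider(text: str, voice: str) -> bytes:
    """Stand-in for a real text-to-speech service."""
    return f"<{voice}>{text}".encode("utf-8")


tts = CachedTextToSpeech(fake_provider)
```

A greeting such as "Welcome to the helpline" is played on every call, so after the first call it would be served from the cache and the provider would not be invoked again for it.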

The second mechanism concerns speech-to-text services, whereby VoiceAI Connect utilizes its built-in Digital Signal Processors (DSP) to detect silence and voice from the human participant, and consumes speech-to-text services only when there is voice. In a typical voicebot scenario, the bot and human participant each speak about 50% during the entire conversation. Therefore, this mechanism can be a substantial cost saver. (This mechanism works best with speech providers that don’t charge a minimum per speech-to-text request.)
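To illustrate the second mechanism, here is a simplified Python sketch that forwards only the audio frames containing voice. It assumes audio arrives as short frames of 16-bit PCM samples and uses a plain energy threshold that I picked for the example. Detection on a real DSP is considerably more sophisticated than this.

```python
from typing import List, Sequence

Frame = Sequence[int]  # one short chunk of 16-bit PCM samples


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of a frame."""
    if not frame:
        return 0.0
    return sum(sample * sample for sample in frame) / len(frame)


def voiced_frames(frames: Sequence[Frame], threshold: float = 1_000_000.0) -> List[Frame]:
    """Keep only the frames loud enough to be treated as voice."""
    return [frame for frame in frames if frame_energy(frame) >= threshold]


def sent_fraction(frames: Sequence[Frame], threshold: float = 1_000_000.0) -> float:
    """Share of the audio that would be sent to speech-to-text."""
    if not frames:
        return 0.0
    return len(voiced_frames(frames, threshold)) / len(frames)


# Synthetic call: the caller speaks in every other frame.
speech = [3000, -3000] * 80
silence = [0] * 160
call_audio = [speech, silence, speech, silence]
```

In this synthetic call the caller is silent half of the time, so only half of the frames would reach the speech-to-text service. How much of that shows up as an actual saving depends on how the provider bills, for example on any minimum charge per request.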

Still Interested?

I am an enthusiastic technical blogger, speaker and writer with an interest in learning and sharing new capabilities.

I work as a Digital Workplace Consultant, with a primary focus on Microsoft Teams, Cisco Telephony, Zoom, Office 365 and Azure.

I like to talk about #FutureOfCollaboration #AgileManagedServices #AI #UCAAS #WorkplaceTransformation #HybridWorkplace #WXC #TimeManagement #Productivity

Professionally, I am an experienced Digital Communication and Workplace Transformation Consultant.

I have over 10 years of total experience. I currently lead a UC Presales Team and am based in London, UK. I am responsible for consulting EN and NN customers on:

• Continued innovation and automation potential through data analytics.

• Solution transformation or platform harmonization approaches.

• The potential of transforming traditional managed operations into next-gen agile ops.

• Helping customers understand the importance of experience transformation (CX) and technology adoption.

Apart from this, I have interests in spirituality, finance and investments, and physical sports.
