Getting started with speech analytics in your application

TLDR: Voice data can be captured during business communications, like video calls, and run through speech analytics to unlock insights found in human-to-human conversations.

It’s no secret that businesses generate a lot of data. It comes from the software we use to work, from customers using our products, and from our cell phones, computers, and tablets.

For the most part, we do an excellent job of using that data to help fuel innovation, create better products, and better serve our customers. However, we’re often able to capture far more data than we actually use.

Voice data is a great example of data that many businesses aren’t using to their advantage. Voice data is generated from the conversations we have with each other while doing business. This can be everything from meetings with colleagues to conversations with customers.

Capturing the valuable information that’s found in our conversations lets you create apps that can help businesses better serve their customers or provide better answers to patients in medical settings.

What kind of voice data can be captured (and how)?

In its simplest form, voice data is any kind of information that can be pulled from a conversation between two or more humans. When you don’t tap into this data, you end up doing more things manually, like having someone listen to calls after the fact to extract the key points you want.

Voice data contains insights like:

Follow-up meetings mentioned during a conversation
Transcriptions of the conversation itself
Action items discussed in the meeting
Any next steps that may have come up
Sentiment analysis from customer service calls

This data can be captured in two ways: asynchronously or in real time. With asynchronous capture, you get a recording of the conversation after it has happened and then process it with a speech analysis AI system. In real time, you use speech analytics to analyze the conversation as it happens.

Not surprisingly, real-time analysis of voice data is more challenging because you have to consider factors like what protocol is being used to transmit the data (SIP, PSTN, WebSocket) and whether or not you can even access the data stream.

Real-time data collection is especially hard if you’re not using a system that lets you add voice APIs, which are the easiest way to get started when you’re working with voice data.

How Symbl.ai can help you collect (and analyze) voice data

The good news is that even with the challenges around capturing real-time data, getting started with speech analytics is fairly straightforward.

The best way to get started is to use a communication system, such as a VoIP or video conferencing platform, that lets you build the custom functionality you need with Symbl.ai’s APIs.

These APIs help you stop worrying about how to handle and capture voice data in your applications, and instead, let you focus more on the insights gained from speech analysis AI.

For example, Symbl.ai’s Async APIs can help you process any recording (both audio and video) you have to reveal valuable insights. You can process files in several formats, but let’s start with an .mp3 audio recording.

First, you need to create an account on the Symbl.ai platform to get your appId and appSecret. Once you have those, you can use them to generate the authorization token necessary to make a call to Symbl.ai’s Async API.

Below are sample cURL requests for generating an authorization token and then sending an audio file to Symbl.ai for processing.

Getting your OAuth token:

curl --location --request POST 'https://api.symbl.ai/oauth2/token:generate' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "type": "application",
    "appId": "<Your APPID>",
    "appSecret": "<Your APPSecret>"
  }'
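To reuse the token in later commands, it helps to capture it in a shell variable. The sketch below is one way to do that, under two assumptions that are not stated in this article: that the response is a JSON object, and that the token lives in a string field named accessToken. The helper json_string_field is hypothetical and only handles simple string values; a dedicated JSON tool such as jq is more robust if you have one installed.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a string field from a simple JSON object.
# Usage: json_string_field FIELD < response.json
# Limitation: only handles plain string values without escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up response body for illustration; the field names are assumptions.
sample_response='{"accessToken": "abc123", "expiresIn": 86400}'

# Capture the token so later commands can reference $AUTH_TOKEN.
AUTH_TOKEN=$(printf '%s\n' "$sample_response" | json_string_field accessToken)
echo "$AUTH_TOKEN"   # prints abc123
```

In practice you would pipe the output of the token request into json_string_field instead of using a sample string.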

You will get a token in the response. Use it in the next cURL request below, which sends a locally stored audio recording for processing:

curl --location --request POST 'https://api.symbl.ai/v1/process/audio' \
  --header 'Content-Type: audio/mpeg' \
  --header "Authorization: Bearer $AUTH_TOKEN" \
  --data-binary '@/file/location/audio.mp3'
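The Content-Type header has to match the file being uploaded, so a script that handles more than one format needs to pick the header from the file name. The sketch below uses a hypothetical helper, audio_content_type, to do that. The MIME types are the commonly used ones for these extensions; whether the API accepts each of them is an assumption to confirm in the documentation.

```shell
#!/bin/sh
# Hypothetical helper: choose a Content-Type value from a file's extension.
# Only two formats are covered here; extend the case statement as needed.
audio_content_type() {
  case "$1" in
    *.mp3) echo "audio/mpeg" ;;
    *.wav) echo "audio/wav" ;;
    *) echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

audio_content_type "/file/location/audio.mp3"   # prints audio/mpeg
```

The result can then be passed to cURL as --header "Content-Type: $(audio_content_type "$file")", where file is a variable holding the recording's path.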

You will get a conversationId in the response to the second request, and you should save it. With the conversationId, you can use our Conversation API to extract the valuable information found in the recorded conversation (like action items, transcripts, etc.). The output is returned as JSON, which makes it possible for you to use it however you want.
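As a rough sketch of that last step, the helpers below build Conversation API URLs from a conversationId and fetch one type of insight with cURL. The path layout (/v1/conversations/<id>/<resource>) and the resource names messages, action-items, and follow-ups are assumptions that this article does not spell out, so check them against the documentation. Both function names are hypothetical.

```shell
#!/bin/sh
# Assumed base URL for the Conversation API (not stated in the article).
SYMBL_API_BASE="https://api.symbl.ai/v1"

# Hypothetical helper: build the URL for one insight resource of a conversation.
# Usage: conversation_url CONVERSATION_ID RESOURCE
conversation_url() {
  printf '%s/conversations/%s/%s\n' "$SYMBL_API_BASE" "$1" "$2"
}

# Hypothetical helper: fetch one resource as JSON. Requires AUTH_TOKEN to be set.
# Usage: fetch_insight CONVERSATION_ID RESOURCE
fetch_insight() {
  curl --silent --fail --location \
    --header "Authorization: Bearer $AUTH_TOKEN" \
    "$(conversation_url "$1" "$2")"
}

# Print the URLs that would be requested for a placeholder conversation ID.
for resource in messages action-items follow-ups; do
  conversation_url "1234567890" "$resource"
done
```

With a real token and ID, a call such as fetch_insight "$CONVERSATION_ID" action-items would write the JSON response to standard output, where it can be saved to a file or piped into another tool.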

These APIs not only help you mine this data from your conversations, but also let you do it without using a separate platform for each step. Instead of relying on a separate automatic speech recognition platform or analytics vendor, you can manage everything from the communication tool you’re already using.

What’s great about all of this is that it helps unlock the data found in conversations across the entire business and in a variety of industries. Call centers are one of the major uses for conversation analytics. Conversation AI can help call center agents navigate calls more easily and provide better customer experiences through real-time insights, sentiment analysis, knowledge base searches, and automatically generated follow-ups and action items.

Other use cases include:

Business meetings: Surface action items, follow-up meetings, and other useful information that arose during the conversation.
Sales calls: Not only can you pull the same kind of data you would from a business meeting, but you can also analyze how effective your sales teams are at closing deals.
Telehealth: AI analysis can help you train your system with contextual data so that it can support better diagnoses.
Distance education: E-learning platforms can provide real-time transcriptions to students watching live lectures, making the lectures more accessible. The AI could also collect key points made during the lessons and send out summaries to students after the lesson ends.
Sales staff: You can learn who the best salesperson is and, best of all, understand which phrases or tactics make them so effective.

Want help getting started with voice data?

Symbl.ai makes it easy to get started analyzing your voice data with voice APIs that provide advanced capture and analysis functionality out of the box, in both real time and asynchronously. This cuts the time you would normally spend building and training your own AI to virtually nothing, so you can accelerate time to value and scale with ease.

Check out our documentation page to explore all the different ways you can leverage voice data in your business.

Neeraj Chaudhary