How to Separate Speakers from Recorded Calls with Symbl.ai’s Python SDK

Symbl.ai is a Conversation Intelligence API platform that lets developers augment conversations in real time. Voice, video, messaging, and broadcast platforms typically offer little native functionality beyond enabling calls, streams, chats, or live events. With no more than a few API calls, Symbl.ai lets developers building on any of these platforms extend that functionality considerably.

Among these API calls is Symbl.ai’s Async API. With the Async API you can separate the voices of the speakers in a recorded call. Once the speakers are separated, you can connect, transform, or visualize the aspects of a conversation that are unique to each speaker.

Symbl.ai’s Python SDK provides convenience methods for calling API endpoints, such as the one that uploads a recorded call for processing with the Async API. In this guide you will use the Python SDK to upload a recorded call to the Async API, with the call configured to enable speaker separation.

Sign up

Register for an account at Symbl (i.e., https://platform.symbl.ai/). Grab both your appId and your appSecret. With those, authenticate with either a cURL command or Postman to receive your x-api-key. Here is an example with cURL:

curl -k -X POST "https://api.symbl.ai/oauth2/token:generate" \
     -H "accept: application/json" \
     -H "Content-Type: application/json" \
     -d "{ \"type\": \"application\", \"appId\": \"<appId>\", \"appSecret\": \"<appSecret>\"}"

Ideally, a token server would handle authentication (with code that makes the RESTful API call to generate the token) so that neither the appSecret nor the appId is ever exposed. For now, cURL gets you started immediately. With the x-api-key in hand, you are ready to make calls to the Async API.
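The same token request can be sketched in Python with only the standard library. This is a minimal sketch, not Symbl.ai’s official client: the helper names build_token_request and fetch_token are hypothetical, and the accessToken field read from the response is an assumption you should confirm against what the cURL command prints for you.

```python
import json
import urllib.request

TOKEN_URL = "https://api.symbl.ai/oauth2/token:generate"


def build_token_request(app_id, app_secret):
    """Build (but do not send) the same POST request as the cURL command."""
    body = json.dumps(
        {"type": "application", "appId": app_id, "appSecret": app_secret}
    ).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(app_id, app_secret):
    """Send the request. Assumes the JSON response has an 'accessToken' field."""
    with urllib.request.urlopen(build_token_request(app_id, app_secret)) as response:
        return json.load(response)["accessToken"]
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before exposing real credentials.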

Setting Up Symbl.ai’s Python SDK

Symbl.ai’s Python SDK is designed to enable Pythonistas to program brand new experiences around conversations. It gives developers access to programmable intelligence directly in their mobile or web applications, on the client or the server.

Installation

Before you install the Symbl.ai Python SDK, please ensure that you have installed Python 2.7+ or Python 3.4+ (PyPy supported). Symbl.ai recommends installing Python through Homebrew, the open source package manager for macOS.

Homebrew

To install Homebrew, open Terminal or your favorite OS X terminal emulator and run:

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

After installing Homebrew, run the following command:

brew install python

After the installation, check to see whether your version of Python is compatible with Symbl.ai’s installation requirements:

python --version

If the version is at least Python 2.7 or Python 3.4, then you’re all set to download Symbl.ai’s Python SDK.
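The version check can also be done from inside Python. The following is a small sketch of the rule stated above (2.7 or later on the 2.x line, 3.4 or later on the 3.x line); the function name meets_requirement is hypothetical and not part of the SDK.

```python
import sys


def meets_requirement(version_info):
    """Return True for Python 2.7+ on the 2.x line or 3.4+ on the 3.x line."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 7
    if major == 3:
        return minor >= 4
    # Anything newer than the 3.x line is treated as compatible here.
    return major > 3


# Check the interpreter that is running this script.
current_ok = meets_requirement(sys.version_info)
print(current_ok)
```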

Installing Symbl.ai’s Python SDK

Now that Python is installed, the next step is to install Symbl.ai’s Python SDK. To install it, use pip in the following way:

pip install --upgrade symbl

Alternatively, you can install the same package by invoking pip as a module:

python -m pip install --upgrade symbl
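To confirm that the installation succeeded without importing the package, you can ask Python whether the module can be found. This is a sketch using the standard library’s importlib; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once `pip install --upgrade symbl` has completed successfully.
print(is_installed("symbl"))
```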

Configuration

In a testing environment, you can pass credentials in-line. Since you are only quickly testing the code, you can use the credentials variable to pass your appId and appSecret as part of the code itself. However, you must add the credentials variable every time you make a new API call.

import symbl

local_path = r'c:/Users/john/Downloads/business_meeting.mp3'

# Process audio file
conversation_object = symbl.Audio.process_file(
    file_path=local_path,
    # Optional if you have set up the symbl.conf file in your home directory.
    credentials={'app_id': '<app_id>', 'app_secret': '<app_secret>'},
)
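Because the credentials dictionary has to accompany every call, one way to avoid hard-coding it is to build it once from environment variables. This is only a sketch: the variable names SYMBL_APP_ID and SYMBL_APP_SECRET and the helper credentials_from_env are hypothetical and are not something the SDK reads on its own.

```python
import os


def credentials_from_env(environ=None):
    """Build the credentials dictionary from (hypothetical) environment variables."""
    environ = os.environ if environ is None else environ
    try:
        return {
            "app_id": environ["SYMBL_APP_ID"],
            "app_secret": environ["SYMBL_APP_SECRET"],
        }
    except KeyError as missing:
        # Name the missing variable so the failure is easy to diagnose.
        raise RuntimeError("Missing environment variable: {}".format(missing.args[0]))
```

The returned dictionary has the same app_id and app_secret keys as the in-line version, so it can be passed as the credentials argument of each call.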

Making a Call to the Async API with the Python SDK

To start the process of hacking your conversations with Symbl.ai’s Python SDK you import the Symbl SDK.

import symbl

The next step is to configure a payload in much the same way that you would configure a payload for an ordinary HTTP request in any one of your favorite programming languages. Here is a payload.

payload = {'url':'https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav',}

After configuring the payload, create a dictionary for storing credentials in the following way:

credentials_dict = {'app_id': '<app_id>', 'app_secret': '<app_secret>'}

Last but not least, create a conversation object. The conversation object is the result of a call to Symbl.ai’s Async API for processing audio through a URL. After it is created, the next step is to select from the response the data you want to work with.

conversation = symbl.Audio.process_url(payload=payload, credentials=credentials_dict)

Parameters for Speaker Separation

To direct the Async API to both count and separate the speakers, you pass a parameters dictionary of key-value pairs along with the conversation payload.

Note: If you made the call without specifying the number of speakers, you might have received the following message:

{"message":"diarizationSpeakerCount must be greater than zero and needs to be passed to accurately determine unique speakers in the audio"}

To properly configure the conversation payload, you have to enable speaker separation and specify the speaker count in the following way:

parameters={'enableSpeakerDiarization': True, 'diarizationSpeakerCount': 2}
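Putting the payload, credentials, and parameters together, the full call might be assembled as in the sketch below. The helpers build_request_kwargs and submit are hypothetical names, and passing parameters as a keyword argument to symbl.Audio.process_url is an assumption based on this guide rather than a confirmed signature; check it against the SDK version you have installed.

```python
def build_request_kwargs(audio_url, app_id, app_secret, speaker_count):
    """Assemble the keyword arguments for an Async API call with speaker separation."""
    if speaker_count < 1:
        # Mirrors the API's message that diarizationSpeakerCount must be greater than zero.
        raise ValueError("speaker_count must be greater than zero")
    return {
        "payload": {"url": audio_url},
        "credentials": {"app_id": app_id, "app_secret": app_secret},
        "parameters": {
            "enableSpeakerDiarization": True,
            "diarizationSpeakerCount": speaker_count,
        },
    }


def submit(kwargs):
    """Send the request. Needs `pip install symbl` and real credentials."""
    import symbl  # imported lazily so the helper above works without the SDK

    # Assumption: process_url accepts a `parameters` keyword, as this guide implies.
    return symbl.Audio.process_url(**kwargs)


kwargs = build_request_kwargs(
    "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
    "<app_id>",
    "<app_secret>",
    speaker_count=2,
)
```

Validating the speaker count locally surfaces the most common misconfiguration before any audio is uploaded.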

Separate Speakers

By adding the parameters above to your conversation payload, you direct the Async API to both count and separate the speakers in the recorded call.

With the proper configuration set up in your conversation payload, make your API call. After the API call is made, you receive a conversationId, which you use to check whether and how the numbered speakers were separated.
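One way to check the result is to request the conversation’s messages over REST using the conversationId. The sketch below only builds the request with the standard library. The helper name build_messages_request is hypothetical, and sending the token as a Bearer Authorization header is an assumption; this guide refers to the same token as the x-api-key, so use whichever header your account expects.

```python
import urllib.request

API_BASE = "https://api.symbl.ai/v1"


def build_messages_request(conversation_id, access_token):
    """Build (but do not send) a GET request for a conversation's messages."""
    url = "{}/conversations/{}/messages".format(API_BASE, conversation_id)
    return urllib.request.Request(
        url,
        # Assumption: the token is accepted as a Bearer token.
        headers={"Authorization": "Bearer " + access_token},
        method="GET",
    )


request = build_messages_request("0000-0000-0000-0000", "<access_token>")
print(request.full_url)
```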

Conversation API’s Message API

After the speakers are separated, numbered speaker labels appear within the messages returned by the Conversation API’s Message API. The data looks like the following:

{
  "messages": [
    {
      "id": "5543101041999872",
      "text": "But what?",
      "from": {
        "id": "f0fe6722-49c7-47c4-9708-03e4e5182508",
        "name": "Speaker 2"
      },
      "startTime": "2021-08-16T19:38:01.747Z",
      "endTime": "2021-08-16T19:38:02.147Z",
      "conversationId": "0000-0000-0000-0000",
      "phrases": []
    }
  ]
}
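Once you have this response, grouping the text by speaker label takes only a few lines. The sketch below works on a trimmed copy of the sample above; the second message ("Speaker 1") is made up for illustration, and the helper name group_text_by_speaker is hypothetical.

```python
def group_text_by_speaker(response):
    """Map each speaker name to the list of texts that speaker said, in order."""
    grouped = {}
    for message in response.get("messages", []):
        # Fall back to "Unknown" if a message carries no speaker label.
        speaker = message.get("from", {}).get("name", "Unknown")
        grouped.setdefault(speaker, []).append(message["text"])
    return grouped


sample_response = {
    "messages": [
        {"text": "But what?", "from": {"name": "Speaker 2"}},
        # Hypothetical second message, added only to show the grouping.
        {"text": "Let me explain.", "from": {"name": "Speaker 1"}},
    ]
}

by_speaker = group_text_by_speaker(sample_response)
for speaker, texts in by_speaker.items():
    print(speaker, texts)
```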

What’s Next

In addition to separating speakers, consider how Symbl.ai’s Python SDK provides access to many of Symbl.ai’s APIs. In particular, you can use the Python SDK with the real-time Telephony API or with any one of the Async APIs for voice, video, or messaging. A complete list is available in Symbl.ai’s documentation.

In addition, Symbl.ai’s Python SDK provides methods for enabling Symbl.ai on its Telephony APIs. The methods for dialing into a call work over the Session Initiation Protocol (SIP). With the Python SDK dialed into a call, you’re in a position to expand the experiences around calls in real time. After connecting to a SIP call, you can subscribe to Symbl.ai’s events, such as events for contextual insights like follow-ups, questions, or action items, or you can create a request to deliver a post-meeting summary directly to your inbox.

Community

Symbl.ai invites developers to reach out to us via email, join our Slack channels, participate in our hackathons, fork our Postman public workspace, or git clone our repos at Symbl.ai’s GitHub.
