How to Use Symbl’s Voice API Over WebSocket to Generate Real-time Insights
Real-time voice streaming over the web is becoming more relevant as companies that once met in person build applications to communicate remotely. If you stream real-time audio through desktop apps, podcast interfaces, or WebRTC-based platforms, you can broaden your conversation intelligence capabilities by using Symbl’s Voice API with WebSocket integration. Integrate our API with CPaaS platforms like Tokbox and Agora.io to receive actionable insights, transcriptions, and summary topics in real time. It’s instant gratification at its finest.
Before we can get started, you’ll need to make sure to have:
First, create an index.js file where we will write all our code.
To work with the API, you must have a valid app id and app secret. If you don’t already have your app id or app secret, log in to the Symbl platform to get your credentials. For any invalid appId and appSecret combination, the API will throw a 401 Unauthorized error.
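As a rough illustration of exchanging those credentials for an access token, here is a minimal sketch. The endpoint URL, the request body fields, and the accessToken response field are assumptions that are not stated in this post, so check them against Symbl’s authentication documentation before relying on them. The sketch also assumes Node 18 or later for the built-in fetch.

```javascript
// Assumption: the endpoint URL and body shape are not given in this post.
const TOKEN_URL = 'https://api.symbl.ai/oauth2/token:generate';

// Build the HTTP request that exchanges app credentials for an access token.
function buildTokenRequest(appId, appSecret) {
  if (!appId || !appSecret) {
    throw new Error('Both appId and appSecret are required');
  }
  return {
    url: TOKEN_URL,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ type: 'application', appId, appSecret }),
    },
  };
}

// Send the request. Global fetch requires Node 18 or later.
async function fetchAccessToken(appId, appSecret) {
  const { url, options } = buildTokenRequest(appId, appSecret);
  const response = await fetch(url, options);
  if (response.status === 401) {
    throw new Error('401 Unauthorized: check your appId and appSecret');
  }
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  const { accessToken } = await response.json(); // assumed response field
  return accessToken;
}
```

Keeping the request builder separate from the network call makes it easy to inspect what would be sent without contacting the API.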
The WebSocket API needs a stream of audio as an input through a medium such as Twilio Media Streams. For this blog, we’ll configure your microphone to push speaking events to the API.
In the example below, we’ve used the websocket npm package for the WebSocket client and mic for getting the raw audio from the microphone. Install both packages with npm from your terminal.
Add the code below to your index file:
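A minimal sketch of what the top of index.js might look like, assuming both packages were installed with npm install websocket mic. The microphone settings are illustrative values rather than requirements, and the createClientAndMic helper is a hypothetical name used only for this example.

```javascript
// Illustrative microphone settings (assumed values). The sample rate here
// must match the one later reported to the API in the start request.
const micOptions = {
  rate: '44100',    // sample rate in Hz
  channels: '1',    // mono audio
  debug: false,
  exitOnSilence: 6, // consecutive silent frames before a 'silence' event
};

// Load the dependencies and create the microphone stream. Wrapped in a
// function so nothing touches the audio hardware until it is called.
function createClientAndMic(options = micOptions) {
  const WebSocketClient = require('websocket').client;
  const mic = require('mic');

  const micInstance = mic(options);
  // Readable stream that emits raw audio chunks once the mic is started.
  const micInputStream = micInstance.getAudioStream();
  const ws = new WebSocketClient();

  return { ws, micInstance, micInputStream };
}
```

Calling createClientAndMic() gives you the three objects used in the rest of this post: the WebSocket client, the microphone instance, and its audio stream.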
Next we need to create a websocket client instance and handle the connection. This is where we will send speaker events through the websocket and configure the insights we want to generate.
For this example, we time out the call after 2 minutes, but you would most likely want to make the stop_request call when your call or meeting ends.
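The connection handling described above could be sketched roughly as follows. The onConnect helper is a hypothetical name, and the microphone and connection objects are passed in as arguments so the function can be tried without real audio hardware. It assumes the connection object comes from the websocket package, which provides sendUTF, sendBytes, and close.

```javascript
// Wire a connected WebSocket to the microphone. `connection` is the object
// the websocket client passes to its 'connect' event.
function onConnect(connection, micInstance, micInputStream, startRequest, stopAfterMs = 120000) {
  connection.on('close', () => console.log('WebSocket closed.'));
  connection.on('error', (err) => console.error('WebSocket error.', err));

  // Text frames carry the JSON results returned by the API.
  connection.on('message', (message) => {
    if (message.type === 'utf8') {
      console.log(message.utf8Data);
    }
  });

  console.log('Connection established.');

  // 1. Tell the API a conversation is starting and how to process it.
  connection.sendUTF(JSON.stringify(startRequest));

  // 2. Start recording and forward each raw audio chunk as a binary frame.
  micInstance.start();
  micInputStream.on('data', (chunk) => connection.sendBytes(chunk));

  // 3. End the call after a fixed delay (2 minutes by default).
  return setTimeout(() => {
    micInstance.stop();
    connection.sendUTF(JSON.stringify({ type: 'stop_request' }));
    connection.close();
  }, stopAfterMs);
}

// Usage with the real packages (not executed here):
//   ws.on('connectFailed', (err) => console.error('Connection failed.', err));
//   ws.on('connect', (conn) => onConnect(conn, micInstance, micInputStream, startRequest));
```

The function returns the timer handle so a caller can cancel the automatic stop with clearTimeout if the call ends earlier.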
Let’s break apart the code blocks and analyze them closely.
This simply begins streaming audio through your microphone. For this blog, we are using your computer’s microphone but you can use a variety of mediums as the input to stream audio data such as a phone call.
This block is just for illustration purposes to see the data that is being streamed. Depending on your use case, you can do many things with this data prior to calling our API on it.
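If you want to do more than log the raw text, one option is to parse each frame and route it by message type. The sketch below assumes the API’s JSON messages carry a top-level type field, which this post does not confirm, and the handler keys are hypothetical placeholders rather than actual Symbl message types.

```javascript
// Turn one incoming WebSocket frame into a parsed object, or null if the
// frame is not JSON text. `frame` has the shape the websocket package passes
// to 'message' handlers: { type: 'utf8', utf8Data } for text frames.
function parseFrame(frame) {
  if (frame.type !== 'utf8') return null; // ignore binary frames
  try {
    return JSON.parse(frame.utf8Data);
  } catch (err) {
    return null; // malformed payload
  }
}

// Route a parsed message to a handler keyed by its `type` field
// (assumed field name). Falls back to handlers.default when present.
function dispatch(frame, handlers) {
  const data = parseFrame(frame);
  if (data === null) return false;
  const handler = handlers[data.type] || handlers.default;
  if (!handler) return false;
  handler(data);
  return true;
}

// Usage inside the connection handler (not executed here):
//   connection.on('message', (frame) => dispatch(frame, {
//     default: (data) => console.log(data),
//   }));
```

Returning false for frames that could not be handled lets the caller decide whether to log or ignore them.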
The above code is the payload configuration for the API call. This is where you set the start_request option, the insightTypes we want to identify, as well as other configurations for confidence and speech recognition.
We also pass in a speaker object that has the user’s name and email for the speaker event that we are sending.
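A minimal sketch of how such a payload could be assembled. Only start_request, insightTypes, and the speaker object are named in this post. The exact insight type strings and the field names under config and speaker (confidenceThreshold, languageCode, speechRecognition, userId) are assumptions to verify against Symbl’s API reference, and buildStartRequest is a hypothetical helper.

```javascript
// Build the first message sent over the WebSocket. Field names other than
// `type`, `insightTypes`, and `speaker` are assumptions, not confirmed here.
function buildStartRequest({ name, email, sampleRateHertz = 44100, confidenceThreshold = 0.9 }) {
  return {
    type: 'start_request',
    // Insights to detect in the conversation (assumed identifiers).
    insightTypes: ['question', 'action_item'],
    config: {
      // Minimum confidence an insight needs before it is reported.
      confidenceThreshold,
      languageCode: 'en-US',
      speechRecognition: {
        encoding: 'LINEAR16',
        // Must match the sample rate the microphone is recording at.
        sampleRateHertz,
      },
    },
    // Identifies who is speaking in this audio stream.
    speaker: { userId: email, name },
  };
}

// Usage (not executed here):
//   connection.sendUTF(JSON.stringify(
//     buildStartRequest({ name: 'Jane', email: 'jane.doe@example.com' })
//   ));
```

Taking the sample rate as a parameter keeps it in one place, so the microphone settings and the speech recognition settings cannot drift apart.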
The above code just sends the data when audio is detected in your microphone.
Finally, for this demo, we use setTimeout to trigger our stop_request option after 2 minutes to stop the call. In production, however, you would trigger this option once your call or meeting actually ends.
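One way to support both the demo timer and a real end-of-call trigger is to wrap the stop logic in a function that is safe to call more than once. This is a sketch under that assumption, and createStopper is a hypothetical helper rather than part of any Symbl SDK.

```javascript
// Create a stop function that only acts the first time it is called, so the
// same function can be attached to a timer, a hang-up event, and shutdown.
function createStopper(connection, micInstance) {
  let stopped = false;
  return function stopCall() {
    if (stopped) return false; // already stopped, nothing to do
    stopped = true;
    micInstance.stop();
    connection.sendUTF(JSON.stringify({ type: 'stop_request' }));
    connection.close();
    return true;
  };
}

// Example wiring (commented out because it needs a live connection):
//   const stopCall = createStopper(connection, micInstance);
//   setTimeout(stopCall, 2 * 60 * 1000); // demo: stop after 2 minutes
//   process.once('SIGINT', stopCall);    // also stop when the process is interrupted
```

Because repeated calls are ignored, the timer firing after the call has already ended will not send a second stop_request.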
To make the connection, we call ws.connect with the API endpoint and the access token that we generated at the start.
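As a hedged sketch of that call: the endpoint path and the X-API-KEY header name below are assumptions, since neither is spelled out in this post, so confirm both in Symbl’s documentation. The argument order follows the websocket package’s client.connect(requestUrl, requestedProtocols, origin, headers), and buildConnectArgs is a hypothetical helper.

```javascript
// Assumed base endpoint; a unique meeting id is appended per conversation.
const REALTIME_ENDPOINT = 'wss://api.symbl.ai/v1/realtime/insights';

// Build the argument list for the websocket client's connect() method.
function buildConnectArgs(accessToken, meetingId) {
  if (!accessToken) {
    throw new Error('An access token is required');
  }
  const url = `${REALTIME_ENDPOINT}/${encodeURIComponent(meetingId)}`;
  // No sub-protocols and no origin; the token travels in a request header
  // (assumed header name).
  return [url, null, null, { 'X-API-KEY': accessToken }];
}

// Usage (not executed here):
//   ws.connect(...buildConnectArgs(accessToken, 'my-meeting-1'));
```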
Running your code
Now that we have everything set up, let’s test our code.
In your terminal, run node index.js. In your console, you should see logs like Connection established. If you begin speaking, you should also start seeing data logs, and if you speak phrases that are detected as questions or action items, you’ll see that data get logged as well.
This is an example of the summary page you can expect to receive at the end of your call:
Tuning your Summary Page
You can choose to tune your Summary Page with the help of query parameters to play with different configurations and see how the results look.
You can configure the summary page by passing in the configuration through query parameters in the summary page URL that gets generated at the end of your meeting. See the end of the URL in this example:
|Query Parameter||Default Value||Supported Values||Description|
||0.8||0.5 to 1.0||Minimum score that the summary page should use to render the insights|
||false||[true, false]||Enable or disable rendering of the assignee and due date of the insight|
||true||[true, false]||Enable or disable the add-to-calendar suggestion when applicable on insights|
||true||[true, false]||Enable or disable the title of an insight. The title indicates the originating person of the insight and, if there is one, the assignee of the insight.|
||true||[true, false]||Enable or disable the summary topics in the summary page|
||‘score’||[‘score’, ‘position’]||Ordering of the topics. score: order topics by the topic importance score. position: order topics by the position in the transcript where they first surfaced.|
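To experiment with these settings from code, you could append them to the summary page URL with a small helper. This is only a sketch: tuneSummaryUrl is a hypothetical function, the URL in the usage note is a placeholder, and the parameter names insights.minScore and topics.orderBy are assumptions, so confirm the exact names in Symbl’s documentation.

```javascript
// Append configuration options to the end of a summary page URL.
// Works on the URL as a plain string so any existing path or fragment
// is left untouched.
function tuneSummaryUrl(summaryUrl, options) {
  const query = new URLSearchParams(
    Object.entries(options).map(([key, value]) => [key, String(value)])
  ).toString();
  if (!query) return summaryUrl; // nothing to add
  const separator = summaryUrl.includes('?') ? '&' : '?';
  return summaryUrl + separator + query;
}

// Usage with a placeholder URL and assumed parameter names:
const tunedUrl = tuneSummaryUrl('https://example.com/meeting/abc', {
  'insights.minScore': 0.95,   // within the supported 0.5 to 1.0 range
  'topics.orderBy': 'position',
});
console.log(tunedUrl);
```

Open the resulting URL in a browser to see how each setting changes the rendered page.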
Congratulations, you now know how to use Symbl’s real-time WebSocket API to generate your own insights.
Sign up to start building!