Symbl.ai’s WebSocket API enables developers who build real-time communication (RTC) or real-time engagement (RTE) applications for voice, video, chat, or broadcast to capture Artificial Intelligence insights from conversations in real time. Although Symbl.ai also provides a Telephony API, its WebSocket API integrates directly with JavaScript’s native WebSocket support for enabling real-time conversations. In this blog you set up voice communications through Twilio’s Software Development Kit (SDK), create a connection with Symbl.ai’s WebSocket, and feed the audio from the one into the other. It is really simple.

Symbl.ai’s WebSocket

Symbl.ai’s WebSocket enables developers to connect a live conversation, or aspects of it, to third-party software-as-a-service APIs through the primary providers of real-time voice, video, messaging, or broadcast SDKs, such as Twilio, Vonage, or Agora.

Set up

Register for an account at Symbl (https://platform.symbl.ai/). Grab both your appId and your appSecret. With both of those, authenticate with either a cURL command or Postman so that you receive your x-api-key, the access token used in the rest of this post. Here is an example with cURL:

$ curl -k -X POST "https://api.symbl.ai/oauth2/token:generate" \
      -H "accept: application/json" \
      -H "Content-Type: application/json" \
      -d "{ \"type\": \"application\", \"appId\": \"<appId>\", \"appSecret\": \"<appSecret>\"}"

Ideally a token server would handle authentication (with code that makes the RESTful API call for generating a token) so that neither the appSecret nor the appId is ever exposed to the client. However, cURL gets you set up immediately. With the x-api-key handy you are now ready to establish a WebSocket endpoint for performing live transcription.
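As a rough illustration of what such a token server might do internally, the sketch below repeats the cURL request with fetch. The URL and request body mirror the cURL example above; the helper names (buildTokenRequest, fetchAccessToken) are hypothetical, and the accessToken field read from the response is an assumption that should be checked against Symbl.ai’s documentation.

```javascript
// Hypothetical helpers for a small server-side token endpoint.
// URL and body mirror the cURL example; nothing here is Symbl.ai's own code.
const SYMBL_TOKEN_URL = 'https://api.symbl.ai/oauth2/token:generate';

// Build the fetch() arguments without performing any network I/O.
function buildTokenRequest(appId, appSecret) {
    return {
        url: SYMBL_TOKEN_URL,
        options: {
            method: 'POST',
            headers: {
                accept: 'application/json',
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({ type: 'application', appId, appSecret }),
        },
    };
}

// Perform the request. `fetchImpl` is injectable so the helper can be
// exercised with a stub instead of the real network.
async function fetchAccessToken(appId, appSecret, fetchImpl = fetch) {
    const { url, options } = buildTokenRequest(appId, appSecret);
    const response = await fetchImpl(url, options);
    if (!response.ok) {
        throw new Error(`Token request failed with status ${response.status}`);
    }
    const payload = await response.json();
    return payload.accessToken; // assumed response field name
}
```

Run on a server, a helper like this keeps the appSecret out of the browser; the client only ever receives the short-lived token.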

WebSocket’s Endpoint

To enable the WebSocket, you configure two values that are fed directly into the WebSocket API’s endpoint: a unique meeting ID, which forms part of the path, and an access token, which is passed as a query parameter. In turn you feed the WebSocket API’s endpoint directly into JavaScript’s native WebSocket API. Here are the two values:

const uniqueMeetingId = btoa('[email protected]'); // any unique string, Base64-encoded
const accessToken = ''; // paste the token you received in the Set up step

With these two values set, you interpolate them into the WebSocket API’s endpoint:

const symblEndpoint = `wss://api.symbl.ai/v1/realtime/insights/${uniqueMeetingId}?access_token=${accessToken}`;

If you want to test your WebSocket before any further integration, you load that endpoint with the data for uniqueMeetingId together with the accessToken into Hoppscotch.io, a free, fast, sleek API request builder.
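The endpoint construction can also be wrapped in a small helper, sketched below. The function name buildSymblEndpoint is hypothetical, and the URL-encoding of both values is an added precaution rather than something the original code does: Base64 output can contain characters such as "/" and "=" that are safer when escaped in a URL.

```javascript
// Base URL taken from the endpoint shown above.
const SYMBL_REALTIME_BASE = 'wss://api.symbl.ai/v1/realtime/insights';

// Hypothetical helper: builds the WebSocket URL from a meeting name and token.
function buildSymblEndpoint(meetingName, accessToken) {
    if (!accessToken) {
        // An empty token would only fail later, at connection time.
        throw new Error('accessToken is required');
    }
    const uniqueMeetingId = btoa(meetingName);
    // Escape both values so Base64 characters cannot alter the URL structure.
    return `${SYMBL_REALTIME_BASE}/${encodeURIComponent(uniqueMeetingId)}` +
        `?access_token=${encodeURIComponent(accessToken)}`;
}
```

Failing early on an empty token is a design choice here: a placeholder such as `const accessToken = ''` is easy to forget, and the resulting connection error is harder to diagnose than a thrown exception.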

Creating an Instance of JavaScript’s native WebSocket API

The next step is to create an instance of JavaScript’s native WebSocket API:

const ws = new WebSocket(symblEndpoint);

On that instance, you assign event handlers specific to handling live transcription.

// Fired when a message is received from the WebSocket server 
ws.onmessage = (event) => {
    console.log(event);
};

// Fired when the WebSocket encounters an error, such as a lost connection
ws.onerror = (err) => {
    console.error(err);
};
// Fired when the WebSocket connection has been closed
ws.onclose = (event) => {
    console.info('Connection to websocket closed');
};
// Fired when the connection succeeds.
ws.onopen = (event) => {
    ws.send(JSON.stringify({
        type: 'start_request',
        meetingTitle: 'Websockets How-to', // Conversation name    
        insightTypes: ['question', 'action_item'], // Will enable insight generation 
        config: {
            confidenceThreshold: 0.5,
            languageCode: 'en-US',
            speechRecognition: {
                encoding: 'LINEAR16',
                sampleRateHertz: 44100,
            }
        },
        speaker: {
            userId: '[email protected]',
            name: 'Example Sample',
        }
    }));
};
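The onmessage handler above only logs the raw event. A sketch of how the payload might be unpacked follows. The function name summarizeSymblMessage is hypothetical, and the message shapes it checks (message_response, insight_response, and an interim recognition_result nested under message) are assumptions about Symbl.ai’s payloads that should be verified against the logged output before relying on them.

```javascript
// Hypothetical parser for the JSON text carried in event.data.
// All field names below are assumptions, not a confirmed schema.
function summarizeSymblMessage(rawData) {
    let data;
    try {
        data = JSON.parse(rawData);
    } catch {
        return { kind: 'unparseable' };
    }
    if (!data || typeof data !== 'object') {
        return { kind: 'unknown' };
    }
    if (data.type === 'message_response') {
        // Finalized transcript sentences
        const texts = (data.messages || []).map((m) => m.payload?.content ?? '');
        return { kind: 'transcript', texts };
    }
    if (data.type === 'insight_response') {
        // Questions and action items requested via insightTypes
        const items = (data.insights || []).map((i) => ({
            type: i.type,
            text: i.payload?.content ?? '',
        }));
        return { kind: 'insights', items };
    }
    if (data.type === 'message' && data.message?.type === 'recognition_result') {
        // Interim speech recognition result
        return {
            kind: 'interim',
            text: data.message.punctuated?.transcript ?? '',
            isFinal: Boolean(data.message.isFinal),
        };
    }
    return { kind: 'unknown' };
}
```

It could be wired in with `ws.onmessage = (event) => console.log(summarizeSymblMessage(event.data));`.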

To set up the stream for accessing a user’s media devices, such as their laptop’s microphone, call the browser’s navigator.mediaDevices API:

const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: false
});

The code above asks the user for permission to access their device through a pop-up. In mobile applications, permission for access to devices is declared in the application’s configuration, such as the manifest for Android or the property list for iOS. A web page has no such manifest, so there is nothing to declare in advance: the browser shows the permission prompt at runtime when getUserMedia is called. To set up the stream to handle events, program the following:

const handleSuccess = (stream) => {
    // Fall back to the prefixed constructor on older Safari versions
    const AudioContext = window.AudioContext || window.webkitAudioContext;
    // Note: context.sampleRate should match sampleRateHertz in the start_request
    const context = new AudioContext();
    const source = context.createMediaStreamSource(stream);
    const processor = context.createScriptProcessor(1024, 1, 1);
    const gainNode = context.createGain();
    source.connect(gainNode);
    gainNode.connect(processor);
    processor.connect(context.destination);
    processor.onaudioprocess = (e) => {
        // Convert the 32-bit float samples to a 16-bit payload
        const inputData = e.inputBuffer.getChannelData(0);
        const targetBuffer = new Int16Array(inputData.length);
        for (let index = 0; index < inputData.length; index++) {
            // Clamp to [-1, 1] before scaling to the 16-bit range
            targetBuffer[index] = 32767 * Math.max(-1, Math.min(1, inputData[index]));
        }
        // Send to websocket
        if (ws.readyState === WebSocket.OPEN) {
            ws.send(targetBuffer.buffer);
        }
    };
};
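The sample conversion inside onaudioprocess can also be pulled out into a pure function, which makes it easy to check in isolation. The sketch below is one common variant, not the code from the handler above: the name floatTo16BitPCM is hypothetical, and it scales negative samples by 32768 and positive samples by 32767 so that the full signed 16-bit range is used.

```javascript
// Convert Web Audio float samples (nominally in [-1, 1]) to 16-bit PCM.
function floatTo16BitPCM(input) {
    const output = new Int16Array(input.length);
    for (let i = 0; i < input.length; i++) {
        // Clamp first so out-of-range samples cannot wrap around
        const sample = Math.max(-1, Math.min(1, input[i]));
        // Int16 spans -32768..32767, hence the asymmetric scale factors
        output[i] = sample < 0 ? sample * 0x8000 : sample * 0x7fff;
    }
    return output;
}
```

Inside the handler, `ws.send(floatTo16BitPCM(e.inputBuffer.getChannelData(0)).buffer)` would then replace the inline loop.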

Here is the complete code:

const uniqueMeetingId = btoa('[email protected]');
const accessToken = '';
const symblEndpoint = `wss://api.symbl.ai/v1/realtime/insights/${uniqueMeetingId}?access_token=${accessToken}`;
const ws = new WebSocket(symblEndpoint);
// Fired when a message is received from the WebSocket server 
ws.onmessage = (event) => {
    console.log(event);
};
// Fired when the WebSocket encounters an error, such as a lost connection
ws.onerror = (err) => {
    console.error(err);
};
// Fired when the WebSocket connection has been closed 
ws.onclose = (event) => {
    console.info('Connection to websocket closed');
};
// Fired when the connection succeeds. 
ws.onopen = (event) => {
    ws.send(JSON.stringify({
        type: 'start_request',
        meetingTitle: 'Websockets How-to', // Conversation name
        insightTypes: ['question', 'action_item'], // Will enable insight generation
        config: {
            confidenceThreshold: 0.5,
            languageCode: 'en-US',
            speechRecognition: {
                encoding: 'LINEAR16',
                sampleRateHertz: 44100,
            }
        },
        speaker: {
            userId: '[email protected]',
            name: 'Example Sample',
        }
    }));
};
const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: false
});
const handleSuccess = (stream) => {
    // Fall back to the prefixed constructor on older Safari versions
    const AudioContext = window.AudioContext || window.webkitAudioContext;
    // Note: context.sampleRate should match sampleRateHertz in the start_request
    const context = new AudioContext();
    const source = context.createMediaStreamSource(stream);
    const processor = context.createScriptProcessor(1024, 1, 1);
    const gainNode = context.createGain();
    source.connect(gainNode);
    gainNode.connect(processor);
    processor.connect(context.destination);
    processor.onaudioprocess = (e) => {
        // Convert the 32-bit float samples to a 16-bit payload
        const inputData = e.inputBuffer.getChannelData(0);
        const targetBuffer = new Int16Array(inputData.length);
        for (let index = 0; index < inputData.length; index++) {
            // Clamp to [-1, 1] before scaling to the 16-bit range
            targetBuffer[index] = 32767 * Math.max(-1, Math.min(1, inputData[index]));
        }
        // Send to websocket
        if (ws.readyState === WebSocket.OPEN) {
            ws.send(targetBuffer.buffer);
        }
    };
};
handleSuccess(stream);
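The complete code starts a session but never ends one. A minimal sketch of a shutdown routine follows. The function name stopSymblSession is hypothetical, the stop_request message type is an assumption to confirm in Symbl.ai’s documentation, and handleSuccess would need to return its processor for this to be callable.

```javascript
const WS_OPEN = 1; // same numeric value as WebSocket.OPEN

// Hypothetical cleanup: stop capturing audio, then ask the server to stop.
// Returns true if the stop request was sent, false if the socket was not open.
function stopSymblSession(ws, processor, stream) {
    if (processor) {
        // Stop producing audio buffers
        processor.onaudioprocess = null;
        processor.disconnect();
    }
    if (stream) {
        // Release the microphone so the browser's recording indicator clears
        stream.getTracks().forEach((track) => track.stop());
    }
    if (ws.readyState === WS_OPEN) {
        ws.send(JSON.stringify({ type: 'stop_request' })); // assumed message type
        return true;
    }
    return false;
}
```

Stopping the audio source before sending the request avoids streaming further buffers into a session that is about to end.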

Twilio SDK for Real-Time Communication

The next step to enable real-time conversation intelligence in your Twilio application is to create an app with the Twilio Programmable Video JS SDK. Since the app is already live on Twilio’s GitHub (https://github.com/twilio/twilio-video-app-react), the only thing that you need to do is to tie Symbl.ai’s WebSocket to Twilio’s.

How to Add Symbl.ai’s WebSocket to Twilio

There are a few prerequisite steps that you have to take to tie Symbl.ai’s WebSocket directly to Twilio’s. These are a couple of account-specific steps:

  1. Navigate to Symbl.ai’s GitHub, where you find the Symbl and Twilio video app at https://github.com/symblai/symbl-twilio-video-react. It is a multi-party video conferencing application that demonstrates Symbl’s Real-time APIs. This application is inspired by Twilio’s video app and is built using twilio-video.js and Create React App.
  2. Sign up for a free Twilio account. After logging in, make sure to note your Twilio account credentials, which the token server needs.

With those two steps out of the way, your next step is to add a Symbl-Twilio Connector to your application.

Twilio’s WebSocket

Since the entire app is already built for you, the next step is to tie Symbl.ai’s WebSocket to Twilio’s WebSocket. From Symbl.ai’s GitHub repository, navigate to the SymblWebSocketAPI.js file, where you find nearly the same code for running the WebSocket in an app built on Twilio’s Programmable Video JS SDK as you used for running it in the browser. In that file you find account-specific details such as appId and appSecret, together with a base URL for the WebSocket API. After the account-specific details, there is a class named SymblWebSocketAPI. Its methods and properties are similar, if not identical, to those shown earlier for Symbl.ai’s WebSocket in the browser; if you were to match the two, there would be a one-to-one mapping of parts.

The last step is to tie the two WebSockets together. To tie Twilio’s Programmable Video JS SDK’s WebSocket to Symbl.ai’s WebSocket, pass targetBuffer.buffer into the ws.send() function of Symbl.ai’s WebSocket.
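One way to picture the tie-in is to reuse the browser pipeline from earlier with Twilio’s audio track as the source instead of getUserMedia. The sketch below is an illustration under assumptions, not the code in SymblWebSocketAPI.js: the helper name mediaStreamFromTwilioTrack is hypothetical, and it relies on a twilio-video LocalAudioTrack exposing the underlying browser track as mediaStreamTrack.

```javascript
// Hypothetical adapter: wrap a Twilio LocalAudioTrack in a MediaStream so it
// can be handed to the same handleSuccess(stream) pipeline used in the browser.
// `MediaStreamCtor` is injectable so the helper can be exercised outside a browser.
function mediaStreamFromTwilioTrack(localAudioTrack, MediaStreamCtor = globalThis.MediaStream) {
    if (!localAudioTrack || !localAudioTrack.mediaStreamTrack) {
        throw new Error('Expected a Twilio audio track with a mediaStreamTrack');
    }
    return new MediaStreamCtor([localAudioTrack.mediaStreamTrack]);
}

// Possible usage inside the Twilio app (assumes `room` is a connected Room
// and handleSuccess is the function defined earlier in this post):
//
//   const publication = [...room.localParticipant.audioTracks.values()][0];
//   handleSuccess(mediaStreamFromTwilioTrack(publication.track));
```

With this adapter the conversion loop and ws.send(targetBuffer.buffer) stay unchanged; only the origin of the stream differs.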

Setup and Deploy

This application offers two options for authorizing your Symbl account: in the application or via the included token server. Your Twilio account is authorized via the token server. By default, your Symbl account is authorized in-app, and a dialog box is shown automatically the first time you open the app. In the config.js file you will find enableInAppCredentials set to true. With this option you are not required to update the .env file with Symbl credentials.

Credentials

Demo

If you would like to test out our live demo, feel free to stage your own app on Heroku. If you would like to check out the code on our GitHub repo, please follow this link: https://github.com/symblai/symbl-twilio-video-react.

Twilio & Symbl.ai

Symbl.ai’s APIs are already running hundreds of thousands of open WebSocket connections simultaneously. Symbl.ai is a cross-domain, platform-agnostic, general-purpose Conversation Intelligence API platform, designed to enable builders of voice, video, messaging, or broadcast applications to augment their real-time engagements with Artificial Intelligence. Its APIs are designed to operate with any real-time communication application or platform, and Symbl.ai already runs on Agora.io, Dolby.io, Telnyx, and Twilio. Sign up for your free transcription account today to discover what Symbl.ai can do for you.

Community

Symbl.ai invites developers to reach out to us via email at [email protected], join our Slack channels, participate in our hackathons, fork our Postman public workspace, or git clone our repos at Symbl.ai’s GitHub.

Eric Giannini
Lead Developer Evangelist & Advocate