Capturing audio and deriving real-time insights is not as hard as you may think. Twilio Media Streams provide real-time raw audio and give developers the flexibility to integrate this audio into the voice stack of their choice. Couple that with the power of Symbl, and you can surface actionable insights from customer interactions through the Symbl WebSocket API.

What can you expect upon successful installation?

  A post-conversation email with the topics generated, a list of action items, and a link to view the full summary output. This blog post will guide you step-by-step through integrating the Symbl WebSocket API into Twilio Media Streams.

Requirements

Before we can get started, you’ll need

Setting up the Local Server

Twilio Media Streams use the WebSocket API to live stream the audio from the phone call to your application. Let’s get started by setting up a server that can handle WebSocket connections. Open your terminal, create a new project folder, and create an index.js file.

$ mkdir symbl-websocket
$ cd symbl-websocket
$ touch index.js

To handle HTTP requests we will use Node's built-in http module and Express. For WebSocket connections we will use ws, a lightweight WebSocket library for Node that accepts the incoming connection from Twilio, and websocket, whose client opens the outgoing connection to Symbl. In the terminal, run this command to install ws, websocket, and Express:

$ npm install ws websocket express

To set up the server, open your index.js file and add the following code.

 const WebSocket = require("ws");
 const express = require("express");
 const app = express();
 const server = require("http").createServer(app);
 const ws = new WebSocket.Server({
     server
 });
 const WebSocketClient = require("websocket").client;
 const wsc = new WebSocketClient();
 // Handle Web Socket Connection 
 ws.on("connection", function connection(ws) {
     console.log("New Connection Initiated");
 });
 //Handle HTTP Request 
 app.get("/", (req, res) => res.send("Hello World"));
 console.log("Listening at Port 8080");
 server.listen(8080);

Save and run index.js with

$ node index.js

Open your browser and navigate to http://localhost:8080. Your browser should show Hello World.

Setting up the Symbl WebSocket API

Let’s connect our Twilio number to our WebSocket server. First, we need to modify our server to handle the WebSocket messages that Twilio sends when our phone call starts streaming. There are four main message events we want to listen for: connected, start, media and stop.

- Connected: sent when Twilio makes a successful WebSocket connection to a server
- Start: sent when Twilio starts streaming Media Packets
- Media: the encoded Media Packets (this is the raw audio)
- Stop: sent when streaming ends

Modify your index.js file to log a message when each of these events arrives at your server, and to forward the audio to Symbl.
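To make the four events concrete, here is a minimal sketch of how a single incoming message could be inspected. The helper name describeTwilioMessage and the sample message values are assumptions made up for illustration; only the event names and the base64-encoded media.payload field come from the description above.

```javascript
// Hypothetical helper: summarize one raw Twilio Media Streams message.
function describeTwilioMessage(raw) {
  const msg = JSON.parse(raw); // each event arrives as a JSON string
  switch (msg.event) {
    case "connected":
      return { event: "connected" };
    case "start":
      return { event: "start", streamSid: msg.streamSid };
    case "media": {
      // The audio is base64-encoded; decode it into raw bytes.
      const audio = Buffer.from(msg.media.payload, "base64");
      return { event: "media", bytes: audio.length };
    }
    case "stop":
      return { event: "stop" };
    default:
      return { event: "unknown" };
  }
}

// Illustrative sample messages (values invented for this sketch).
const sampleStart = JSON.stringify({ event: "start", streamSid: "MZ-example" });
const sampleMedia = JSON.stringify({
  event: "media",
  streamSid: "MZ-example",
  media: { payload: Buffer.from([0xff, 0x7f, 0x00, 0x80]).toString("base64") },
});

console.log(describeTwilioMessage(sampleStart));
console.log(describeTwilioMessage(sampleMedia));
```

Keeping the parsing in a small function like this makes it easy to check the message handling without placing a real phone call.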

const WebSocket = require("ws");
const express = require("express");
const app = express();
const server = require("http").createServer(app);
const ws = new WebSocket.Server({ server });
const WebSocketClient = require("websocket").client;
const wsc = new WebSocketClient();
let connection;
let client_connection;

// Handle WebSocket client failures
wsc.on('connectFailed', (e) => {
    console.error('Connection Failed.', e);
});

// Handle WebSocket server connection
ws.on("connection", (conn) => {
    connection = conn;
    connection.on('close', () => {
        console.log('WebSocket closed.');
    });
    connection.on('error', (err) => {
        console.log('WebSocket error.', err);
    });
    connection.on("message", (data) => {
        // Twilio sends every event as a JSON string
        const msg = JSON.parse(data);
        switch (msg.event) {
            case "connected":
                console.log(`A new call has connected.`);
                break;
            case "start":
                console.log(`Starting Media Stream ${msg.streamSid}`);
                break;
            case "media":
                if (client_connection) {
                    let buff = Buffer.from(msg.media.payload, 'base64');
                    client_connection.send(buff);
                }
                break;
            case "stop":
                console.log(`Call Has Ended`);
                // Send stop request, but only if the Symbl connection was established
                if (client_connection) {
                    client_connection.sendUTF(JSON.stringify({
                        "type": "stop_request"
                    }));
                    client_connection.close();
                }
                break;
        }
    });
});

// Handle WebSocket client connection
wsc.on("connect", (conn) => {
    client_connection = conn;
    client_connection.on('close', () => {
        console.log('WebSocket closed.');
    });
    client_connection.on('error', (err) => {
        console.log('WebSocket error.', err);
    });
    client_connection.on('message', (data) => {
        if (data.type === 'utf8') {
            const { utf8Data } = data;
            data = JSON.parse(utf8Data);
            console.log(utf8Data);
        }
    });

    // Tell Symbl how to interpret the audio we are about to stream
    client_connection.send(JSON.stringify({
        "type": "start_request",
        "insightTypes": ["question", "action_item"],
        "config": {
            "confidenceThreshold": 0.5,
            "timezoneOffset": 480, // Your timezone offset from UTC in minutes
            "languageCode": "en-US",
            "speechRecognition": {
                "encoding": "MULAW",
                "sampleRateHertz": 8000 // Twilio Media Streams audio is 8 kHz mu-law
            },
            "meetingTitle": "My meeting"
        },
        "speaker": {
            "userId": "<[email protected]>",
            "name": ""
        }
    }));
});

wsc.connect('wss://api.symbl.ai/v1/realtime/insights/121', null, null, {
    'X-API-KEY': '<your_auth_token>'
});

// Handle HTTP Request
app.get("/", (req, res) => res.send("Hello World"));

console.log("Listening at Port 8080");
server.listen(8080);
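If you want to reuse the start request for different agents or calls, one option is to build it with a small function. This is only a sketch: the function name buildStartRequest and its parameters are hypothetical, and the field values are copied from the listing above rather than from any additional Symbl documentation.

```javascript
// Hypothetical helper: build the start_request payload used in the listing above.
function buildStartRequest({ userId, name, meetingTitle = "My meeting", timezoneOffset = 480 }) {
  return {
    type: "start_request",
    insightTypes: ["question", "action_item"],
    config: {
      confidenceThreshold: 0.5,
      timezoneOffset, // offset from UTC in minutes, as in the listing above
      languageCode: "en-US",
      speechRecognition: {
        encoding: "MULAW",
        sampleRateHertz: 8000, // matches the audio format of the phone call
      },
      meetingTitle,
    },
    speaker: { userId, name },
  };
}

// Example usage with placeholder speaker details.
const startRequest = buildStartRequest({ userId: "agent@example.com", name: "Agent" });
console.log(JSON.stringify(startRequest, null, 2));
```

Inside the "connect" handler you could then call client_connection.send(JSON.stringify(startRequest)) instead of writing the object inline.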

Now we need to set up a Twilio number to start streaming audio to our server. We can control what happens when we call our Twilio number using TwiML. We’ll create an HTTP route that will return TwiML instructing Twilio to stream audio from the call to our server. Add the following POST route to your index.js file.

const WebSocket = require("ws");
const express = require("express");
const app = express();
const server = require("http").createServer(app);
const ws = new WebSocket.Server({ server });
const WebSocketClient = require("websocket").client;
const wsc = new WebSocketClient();
let connection;
let client_connection;

// Handle WebSocket client failures
wsc.on('connectFailed', (e) => {
    console.error('Connection Failed.', e);
});

// Handle WebSocket server connection
ws.on("connection", (conn) => {
    connection = conn;
    connection.on('close', () => {
        console.log('WebSocket closed.');
    });
    connection.on('error', (err) => {
        console.log('WebSocket error.', err);
    });
    connection.on("message", (data) => {
        // Twilio sends every event as a JSON string
        const msg = JSON.parse(data);
        switch (msg.event) {
            case "connected":
                console.log(`A new call has connected.`);
                break;
            case "start":
                console.log(`Starting Media Stream ${msg.streamSid}`);
                break;
            case "media":
                if (client_connection) {
                    let buff = Buffer.from(msg.media.payload, 'base64');
                    client_connection.send(buff);
                }
                break;
            case "stop":
                console.log(`Call Has Ended`);
                // Send stop request, but only if the Symbl connection was established
                if (client_connection) {
                    client_connection.sendUTF(JSON.stringify({
                        "type": "stop_request"
                    }));
                    client_connection.close();
                }
                break;
        }
    });
});

// Handle WebSocket client connection
wsc.on("connect", (conn) => {
    client_connection = conn;
    client_connection.on('close', () => {
        console.log('WebSocket closed.');
    });
    client_connection.on('error', (err) => {
        console.log('WebSocket error.', err);
    });
    client_connection.on('message', (data) => {
        if (data.type === 'utf8') {
            const { utf8Data } = data;
            data = JSON.parse(utf8Data);
            console.log(utf8Data);
        }
    });

    // Tell Symbl how to interpret the audio we are about to stream
    client_connection.send(JSON.stringify({
        "type": "start_request",
        "insightTypes": ["question", "action_item"],
        "config": {
            "confidenceThreshold": 0.5,
            "timezoneOffset": 480, // Your timezone offset from UTC in minutes
            "languageCode": "en-US",
            "speechRecognition": {
                "encoding": "MULAW",
                "sampleRateHertz": 8000 // Twilio Media Streams audio is 8 kHz mu-law
            },
            "meetingTitle": "My meeting"
        },
        "speaker": {
            "userId": "<[email protected]>",
            "name": ""
        }
    }));
});

wsc.connect('wss://api.symbl.ai/v1/realtime/insights/121', null, null, {
    'X-API-KEY': '<your_auth_token>'
});

// Handle HTTP Request
app.get("/", (req, res) => res.send("Hello World"));

// Return TwiML that tells Twilio to stream the call audio to this server.
// NOTE: this route was missing from the listing; the TwiML below is an assumed
// sketch based on Twilio's <Start><Stream> verbs, so adjust it to your call flow.
app.post("/", (req, res) => {
    res.set("Content-Type", "text/xml");
    res.send(`
<Response>
    <Start>
        <Stream url="wss://${req.headers.host}/" />
    </Start>
    <Say>This call is being streamed.</Say>
    <Pause length="60" />
</Response>`.trim());
});

console.log("Listening at Port 8080");
server.listen(8080);

For Twilio to connect to your local server, the port must be reachable from the internet. We will use ngrok to create a tunnel to our localhost port and expose it publicly. In a new terminal window run the following command:

$ ngrok http 8080

You should get an output with a forwarding address like this. Copy the URL onto the clipboard. Make sure you save the HTTPS URL.

Forwarding https://xxxxxxxx.ngrok.io -> http://localhost:8080

Open a new terminal window and run your index.js file.

$ node index.js

Setting up your Twilio Studio

Now that our WebSocket server is ready, the remaining configuration needed to bring Symbl into your customer and agent conversations is done through your Twilio Studio dashboard. Navigate to Studio and create a new flow.

Twilio offers three different triggers that you can use to build out this integration. Depending on your use case, you can begin the flow from the message, call, or REST API trigger. In our example, we want Symbl to join a voice conversation when a customer calls our agent, so we will use the Incoming Call trigger to build out our flow.

First, use the Fork Stream widget and connect it to the Incoming Call trigger. In the configuration, the URL should match your ngrok domain.

NOTE: Use the WebSocket protocol wss instead of https for the ngrok URL.
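The scheme swap for the Fork Stream URL is easy to get wrong by hand, so here is a small sketch that derives the WebSocket URL from the forwarding address ngrok prints. The function name toWebSocketUrl is hypothetical and the ngrok address is the placeholder used earlier in this post.

```javascript
// Hypothetical helper: turn an http(s) forwarding URL into its ws(s) equivalent.
function toWebSocketUrl(httpUrl) {
  if (!/^https?:\/\//i.test(httpUrl)) {
    throw new Error(`Expected an http or https URL, got: ${httpUrl}`);
  }
  // http:// becomes ws:// and https:// becomes wss://
  return httpUrl.replace(/^http(s?):\/\//i, "ws$1://");
}

console.log(toWebSocketUrl("https://xxxxxxxx.ngrok.io"));
```

Paste the resulting wss URL into the Fork Stream widget's URL field.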

Next connect this widget to the `Flex Agent` widget which will connect the call to the Flex Agent:


Finally, we need to end the stream once the call is complete. To do so, use the same `Fork Stream` widget but the configuration for `stream action` should be `Stop`.


Test the integration

To test the integration, navigate to the Flex tab and click on Launch Flex. On your Flex dashboard, locate your Twilio phone number and call that number from your mobile device. When the agent accepts the call, the audio will stream through the WebSocket API. At the end of the call, you will get an email with the transcript and insights generated from the conversation.

Wrapping up

What else can you do with the data? You can fetch the conversation data and push it to downstream channels such as Trello, Slack, or Jira. Use the GET conversation request to retrieve the conversation, including its ID:

GET https://api.symbl.ai/v1/conversations/{conversationId}

This is a sample API call:

  const request = require('request');
  const your_auth_token = '';
  request.get({
      url: 'https://api.symbl.ai/v1/conversations/{conversationId}',
      headers: {
          'x-api-key': your_auth_token
      },
      json: true
  }, (err, response, body) => {
      console.log(body);
  });
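If you would rather not depend on the request package, the same GET call can be sketched with the fetch function that is built into Node 18 and later. The helper names buildConversationRequest and getConversation are hypothetical; the URL and the x-api-key header are taken from the sample call above.

```javascript
// Hypothetical helper: assemble the URL and options for the GET conversation call.
function buildConversationRequest(conversationId, authToken) {
  return {
    url: `https://api.symbl.ai/v1/conversations/${encodeURIComponent(conversationId)}`,
    options: {
      method: "GET",
      headers: { "x-api-key": authToken },
    },
  };
}

// Hypothetical helper: perform the request (assumes Node 18+ for global fetch).
async function getConversation(conversationId, authToken) {
  const { url, options } = buildConversationRequest(conversationId, authToken);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Building the request does not touch the network, so it is safe to inspect.
console.log(buildConversationRequest("5179649407582208", "<your_auth_token>").url);
```

Calling getConversation with a real conversation ID and token should resolve to the same kind of conversation object as the request-based call.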

The above request returns a response structured like this:

  {
      "id": "5179649407582208",
      "type": "meeting",
      "name": "Project Meeting #2",
      "startTime": "2020-02-12T11:32:08.000Z",
      "endTime": "2020-02-12T11:37:31.134Z",
      "members": [{
          "name": "John",
          "email": "[email protected]"
      }, {
          "name": "Mary",
          "email": "[email protected]"
      }, {
          "name": "Roger",
          "email": "[email protected]"
      }]
  }
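As one example of preparing this response for a downstream channel, the sketch below turns it into a one-line summary that could be posted as a message. The function name summarizeConversation and the output format are assumptions for illustration; the field names come from the sample response, with the email fields left out.

```javascript
// Hypothetical helper: build a short text summary from a conversation object.
function summarizeConversation(conversation) {
  const durationMs = new Date(conversation.endTime) - new Date(conversation.startTime);
  const minutes = Math.floor(durationMs / 60000);
  const seconds = Math.floor((durationMs % 60000) / 1000);
  const names = conversation.members.map((member) => member.name).join(", ");
  return `${conversation.name} (${minutes}m ${seconds}s) with ${names}`;
}

// Sample data copied from the response shown above.
const sampleConversation = {
  id: "5179649407582208",
  type: "meeting",
  name: "Project Meeting #2",
  startTime: "2020-02-12T11:32:08.000Z",
  endTime: "2020-02-12T11:37:31.134Z",
  members: [{ name: "John" }, { name: "Mary" }, { name: "Roger" }],
};

console.log(summarizeConversation(sampleConversation));
// prints "Project Meeting #2 (5m 23s) with John, Mary, Roger"
```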

Congratulations! You can now harness the power of Symbl and Media Streams to extend your application capabilities. Need additional help? You can refer to our API Docs for more information and view our sample projects on GitHub.