In today’s increasingly competitive landscape, applications that provide real-time data exchange and communication are crucial for enhancing user experiences, carving out market share, and, ultimately, driving business success. WebSockets and SIP (Session Initiation Protocol) are fundamental technologies for facilitating smooth, reliable online interactions.

In this guide, we explore the concepts of WebSockets and SIP and the role they play in developing performant modern applications. We also detail how to use these protocols to integrate your application with Symbl.ai’s conversational intelligence capabilities to draw maximum insights from your messages, calls, video conferences, and other interactions.  

What is WebSocket?

WebSocket is a widely used protocol for facilitating the exchange of data between a client and a server. It is well suited for any application that requires real-time, two-way communication between a web browser and a server, such as messaging applications, collaborative editing tools, stock tickers, live sports results, and even online gaming.

How do WebSockets Work?

WebSockets sit on top of the Transmission Control Protocol/Internet Protocol (TCP/IP) stack and use it to establish a persistent connection between a client and server. To achieve this, WebSockets first use the Hypertext Transfer Protocol (HTTP, as used to serve websites to browsers) to establish a connection, i.e., a “handshake”. Once the connection is established, WebSocket replaces HTTP as the application-layer protocol to create a persistent two-way, or “full-duplex”, connection, over which the server can send new data to the client as soon as it is available.

This is in contrast to how HTTP transmits data, whereby the client continually has to request data from the server and only receives it if new data is available, as in HTTP long-polling. By maintaining a persistent connection, WebSockets eliminate the overhead of repeatedly establishing connections and sending HTTP request/response headers, which significantly reduces latency and opens the door to a wider range of applications that rely on real-time communication.
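
To make the handshake more concrete, the sketch below shows one small part of it as defined in RFC 6455: the server proves it understood the upgrade request by hashing the client's Sec-WebSocket-Key header together with a fixed GUID and returning the result as Sec-WebSocket-Accept. The function name is our own and is not part of any library; in practice a WebSocket library performs this step for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    combined = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    digest = hashlib.sha1(combined).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key and expected result taken from RFC 6455, section 1.3.
print(compute_accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once the client has verified this value, both sides stop speaking HTTP and exchange WebSocket frames over the same TCP connection.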

What Are the Benefits of WebSockets?

  • Speed: as a low-latency protocol, WebSockets are ideal for applications that need to exchange data instantaneously. 
  • Simplicity: as WebSocket sits atop TCP/IP and uses HTTP to establish its initial connection, it does not require the installation of any additional hardware or software. 
  • Constant Ongoing Updates: WebSockets enable the server to transmit new data to the client without the need for requests, i.e. GET operations, allowing for continuous updates. 

What is SIP?

As useful as WebSockets are for general-purpose, bi-directional communication, they lack the mechanisms for real-time media transmission; this is where the Session Initiation Protocol (SIP) comes into play. SIP is a signaling protocol that’s used to establish interactive communication sessions, such as phone calls or video meetings. As an essential component of  Voice over Internet Protocol (VoIP), SIP can be used in a variety of multimedia applications, including IP telephony and video conferencing applications. 

How Does SIP Work?

SIP functions much like a call manager: establishing the connection between endpoints, handling the call, and closing the connection once it is finished. This starts with one of the endpoints initiating the call by sending an INVITE message to the other endpoint(s), which includes its credentials and the nature of the call, e.g., voice or video. The other endpoints receive the INVITE message and respond with an OK message containing their own information so the connection can be established. Upon receiving the OK message, the initiating endpoint sends an acknowledgement (ACK) message and the call can begin. 

These messages can be sent via TCP, as with WebSockets, as well as UDP (User Datagram Protocol) or TLS (Transport Layer Security). Once the connection is established, SIP hands over the transmission of media to another protocol, typically the Real-time Transport Protocol (RTP) together with its companion, the RTP Control Protocol (RTCP). This is why it is called the Session Initiation Protocol: its role is solely to establish communication between endpoints. 
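
The message exchange described above can be sketched as a small state machine. This is a deliberately simplified illustration rather than a SIP implementation: real SIP also involves provisional responses (such as 180 Ringing), headers, and retransmissions, and the class and state names below are our own.

```python
from dataclasses import dataclass, field

# Simplified call setup and teardown:
# (current state, received message) -> next state
TRANSITIONS = {
    ("idle", "INVITE"): "inviting",
    ("inviting", "200 OK"): "answered",
    ("answered", "ACK"): "established",
    ("established", "BYE"): "terminated",
}


@dataclass
class CallSession:
    """Tracks where a call is in the INVITE / 200 OK / ACK exchange."""

    state: str = "idle"
    log: list = field(default_factory=list)

    def handle(self, message: str) -> str:
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {message!r} in state {self.state!r}")
        self.state = TRANSITIONS[key]
        self.log.append(message)
        return self.state


session = CallSession()
for message in ("INVITE", "200 OK", "ACK"):
    session.handle(message)
print(session.state)  # established
```

Only once the session reaches the established state does media start to flow, and it does so over a separate protocol rather than over SIP itself.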

What Are the Benefits of SIP?

  • Interoperability: SIP is agnostic to the type of media being transmitted, with the ability to handle voice, video, and multimedia calls. 
  • Adaptability: SIP is compatible with a large variety of devices and components. Additionally, it works with legacy systems such as the Public Switched Telephone Network (PSTN) and is designed to accommodate emerging technologies. 
  • Scalability: SIP can be used in both small and large-scale communication networks, with the ability to establish and terminate connections as necessary to utilize resources efficiently. 

How to Integrate Your Application with Symbl.ai via WebSocket

Now that we have explored WebSockets and how they work, let us move on to integrating your application with Symbl.ai’s conversational intelligence capabilities via WebSocket, which is accomplished through Symbl.ai’s Streaming API.

In this example, the code samples are in Python, using functions from the Symbl.ai Python SDK; however, Symbl.ai also provides SDKs in JavaScript and Go. 

Prepare Environment

Before you begin, you will need to install the Symbl.ai Python SDK, as shown below: 

# For Python Version < 3
pip install symbl

# For Python Version 3 and above
pip3 install symbl


Additionally, to connect to Symbl.ai’s APIs, you will need access credentials, i.e., an app id and app secret, which you can obtain by signing into the developer platform.  
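
Rather than hard-coding the app id and app secret, you may prefer to read them from the environment. The helper below is a minimal sketch of that idea; the function name and the environment variable names SYMBL_APP_ID and SYMBL_APP_SECRET are assumptions made for this example, not names defined by Symbl.ai.

```python
import os


def load_credentials(env=os.environ) -> dict:
    """Build a credentials dictionary from environment variables.

    SYMBL_APP_ID and SYMBL_APP_SECRET are hypothetical variable names
    chosen for this sketch; use whatever your deployment defines.
    """
    required = ("SYMBL_APP_ID", "SYMBL_APP_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"app_id": env["SYMBL_APP_ID"], "app_secret": env["SYMBL_APP_SECRET"]}
```

The returned dictionary has the same app_id and app_secret keys as the credentials parameter used when starting a connection.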

Create WebSocket Connection

The first step is to establish a connection to Symbl.ai’s servers. This creates a connection object, which accepts the following parameters:

Parameter | Description
credentials | Your app id and app secret from Symbl.ai’s developer platform.
speaker | Speaker object containing a name and userId field.
insight_types | The insights to be returned in the WebSocket connection, i.e., Questions and Action Items.
config | Optional configurations for the conversation. For more details, see the config parameter in the Streaming API documentation.

The code snippet below is used to start a connection: 

import symbl

connection_object = symbl.Streaming.start_connection(
    # Replace the placeholders with your own app id and app secret
    credentials={"app_id": "<app_id>", "app_secret": "<app_secret>"},
    insight_types=["question", "action_item"],
    speaker={"name": "John", "userId": "[email protected]"},
)


Receive Insights via Email

You can opt to receive insights from the interactions within your application via email. This will provide you with a link to view the conversation transcripts, as well as details such as the topics discussed, generated follow-ups, action items, etc., through Symbl.ai’s Summary UI.

To receive the insights via email, add the code below to the instantiation of the connection object:

actions = [
        {
          "invokeOn": "stop",
          "name": "sendSummaryEmail",
          "parameters": {
            "emails": [
              emailId #The email address associated with the user’s account in your application 
            ],
          },
        },
      ]


This results in a connection object like the one shown below:

connection_object = symbl.Streaming.start_connection(
    credentials={"app_id": "<app_id>", "app_secret": "<app_secret>"},
    insight_types=["question", "action_item"],
    speaker={"name": "John", "userId": "[email protected]"},
    actions=[
        {
            "invokeOn": "stop",
            "name": "sendSummaryEmail",
            "parameters": {
                # The email address associated with the user’s account in your application
                "emails": [emailId],
            },
        },
    ],
)
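
If you build this actions list in more than one place, a small helper keeps the structure consistent. The sketch below simply wraps the dictionary layout shown above; the function name is our own, and the address used in the example is a placeholder.

```python
def summary_email_action(emails):
    """Return an actions list that emails the summary when the connection stops."""
    recipients = [address.strip() for address in emails if address and address.strip()]
    if not recipients:
        raise ValueError("at least one email address is required")
    return [
        {
            "invokeOn": "stop",
            "name": "sendSummaryEmail",
            "parameters": {"emails": recipients},
        }
    ]


# Placeholder address for illustration only.
actions = summary_email_action(["user@example.com"])
```

The resulting list can then be supplied as the actions value when the connection object is instantiated.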


Subscribe to Events

Once the WebSocket connection is established, you can get live updates on conversation events such as the generation of a transcript, action items, or questions. Subscribing to events tells the server which updates to push to the client, so the client does not have to request them explicitly.

subscribe is a method of the connection object that listens for events from an interaction and lets you react to them in real time. It takes a dictionary in which each key is an event name and each value is a callback function to execute when that event occurs.

The table below summarizes the different events you can subscribe to: 

Event | Description
message_response | Generates an event whenever a transcription is available.
message | Generates an event for live transcriptions. This will include the isFinal property, which is False initially, signifying that the transcription is not finalized.
insight_response | Generates an event whenever an action_item or question is identified in the transcription.
topic_response | Generates an event whenever a topic is identified in the transcription.

An example of how to set up events is shown below, with the events stored in a dictionary before being passed to the subscribe method:

events = {
    "message_response": lambda response: print(
        "Final Messages -> ",
        [message["payload"]["content"] for message in response["messages"]],
    ),
    "message": lambda response: print(
        "live transcription: {}".format(
            response["message"]["punctuated"]["transcript"]
        )
    )
    if "punctuated" in response["message"]
    else print(response),
    "insight_response": lambda response: [
        print(
            "Insights Item of type {} detected -> {}".format(
                insight["type"], insight["payload"]["content"]
            )
        )
        for insight in response["insights"]
    ],
    "topic_response": lambda response: [
        print(
            "Topic detected -> {} with root words, {}".format(
                topic["phrases"], topic["rootWords"]
            )
        )
        for topic in response["topics"]
    ],
}


connection_object.subscribe(events)
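
As callbacks grow, named functions tend to be easier to read and test than nested lambdas. The sketch below is one possible arrangement, assuming the same response shapes that the lambdas above rely on; the function names are our own.

```python
def final_messages(response):
    """Collect the text of each finalized message in a message_response event."""
    return [message["payload"]["content"] for message in response["messages"]]


def live_transcript(response):
    """Return the in-progress transcript from a message event, or None if absent."""
    punctuated = response["message"].get("punctuated")
    return punctuated["transcript"] if punctuated else None


def insight_lines(response):
    """Format each insight in an insight_response event as 'type: content'."""
    return [
        f'{insight["type"]}: {insight["payload"]["content"]}'
        for insight in response["insights"]
    ]


# Same dictionary structure as before, now delegating to the named helpers.
events = {
    "message_response": lambda response: print("Final Messages ->", final_messages(response)),
    "message": lambda response: print("live transcription:", live_transcript(response)),
    "insight_response": lambda response: print("\n".join(insight_lines(response))),
}
```

Because the helpers only take a dictionary and return plain values, they can be exercised with hand-written sample payloads before connecting to a live stream.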


Send Audio From a Mic

This allows you to stream audio over the WebSocket directly from your microphone. It is recommended that first-time users use this function when sending audio to Symbl.ai, to ensure that audio from their application works as expected.  

connection_object.send_audio_from_mic()


Send Audio Data

You can send custom binary audio data from some other library using the following code. 

connection_object.send_audio(data)
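
As an illustration of where such binary data might come from, the sketch below reads a WAV file in fixed-size chunks using Python's standard wave module and passes each chunk to send_audio. The helper names and chunk size are our own choices, and the sketch assumes the file is already in an encoding and sample rate that the Streaming API accepts; check the Streaming API documentation for the supported formats.

```python
import wave


def iter_pcm_chunks(path, frames_per_chunk=4096):
    """Yield raw PCM byte chunks from a WAV file, one block of frames at a time."""
    with wave.open(path, "rb") as wav_file:
        while True:
            chunk = wav_file.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk


def stream_file(connection_object, path):
    """Send every chunk of the file through the connection's send_audio method."""
    for chunk in iter_pcm_chunks(path):
        connection_object.send_audio(chunk)
```

Reading in chunks avoids loading a long recording into memory at once and mirrors how audio arrives from a live source.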


Stop the Connection

Lastly, you need to close the WebSocket, with the code below:

connection_object.stop()
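
If an exception is raised while audio is being sent, stop() might never be reached. One way to guard against this is a small context manager, sketched below with Python's standard contextlib; the name managed_connection is our own and is not part of the Symbl.ai SDK.

```python
from contextlib import contextmanager


@contextmanager
def managed_connection(connection_object):
    """Yield the connection and always call stop() afterwards, even on error."""
    try:
        yield connection_object
    finally:
        connection_object.stop()


# Intended usage, with connection_object created as shown earlier:
#
# with managed_connection(connection_object) as connection:
#     connection.send_audio_from_mic()
```

The same wrapper works for any connection object that exposes a stop() method.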


How to Integrate your Application with Symbl.ai via SIP

In this section, we take you through the process of integrating your application via SIP through Symbl.ai’s Telephony API. As with our WebSocket implementation above, the code snippets are in Python, but the Symbl.ai SDK is also available in JavaScript and Go. 

Prepare Environment

Before you begin, you will need to install the Symbl.ai Python SDK, as shown below: 

# For Python Version < 3
pip install symbl

# For Python Version 3 and above
pip3 install symbl

Additionally, to connect to Symbl.ai’s APIs, you’ll need access credentials, i.e., an app id and app secret, which you can obtain by signing into the developer platform.  

Create SIP Connection

After setting up your environment accordingly, the initial step requires you to establish a SIP connection. You will need to include a valid SIP URI to dial out to. 

The code snippet below allows you to start a Telephony connection with Symbl.ai via SIP:

connection_object = symbl.Telephony.start_sip(uri="sip:[email protected]") 


Receive Insights via Email

As with a WebSocket integration, you can choose to receive insights from the call via email. This will provide you with a link to view the conversation transcripts, as well as details such as the topics discussed, generated follow-ups, action items, etc., through Symbl.ai’s Summary UI.

To receive the insights via email, add the code below to the instantiation of the connection object:

actions = [
        {
          "invokeOn": "stop",
          "name": "sendSummaryEmail",
          "parameters": {
            "emails": [
              emailId #The email address associated with the user’s account in your application 
            ],
          },
        },
      ]


This results in a connection object like the one shown below:

connection_object = symbl.Telephony.start_sip(
    uri="sip:[email protected]",
    actions=[
        {
            "invokeOn": "stop",
            "name": "sendSummaryEmail",
            "parameters": {
                # The email address associated with the user’s account in your application
                "emails": [emailId],
            },
        },
    ],
)


Subscribe to Events

Once the SIP connection is established, you can get live updates on conversation events such as the generation of a transcript, action items, questions, etc.

connection_object.subscribe is a method of the connection object that listens to the events of a live call and lets you react to them in real time. It takes a dictionary in which each key is an event name and each value is a callback function to execute when that event occurs.

The table below summarizes the different events you can subscribe to: 

Event | Description
message_response | Generates an event whenever transcription is available.
insight_response | Generates an event whenever an action_item or question is identified in the message.
tracker_response | Generates an event whenever a tracker is identified in the transcription.
transcript_response | Also generates transcription values; however, these will include an isFinal property that will be False initially, meaning the transcription is not finalized.
topic_response | Generates an event whenever a topic is identified in any transcription.

An example of how to set up events is shown below, with the events stored in a dictionary before being passed to the subscribe method:

events = {
    'transcript_response': lambda response: print('printing the first response ' + str(response)), 
    'insight_response': lambda response: print('printing the first response ' + str(response))
    }

connection_object.subscribe(events)


Stop the Connection

Finally, to end an active call, use the code below:

connection_object.stop()


Querying the Conversation Object

Whether implementing a WebSocket or SIP connection, you can use the conversation parameter associated with the Connection object to query Symbl.ai’s Conversation API to access specific elements of the recorded interaction. 

The table below highlights a selection of the functions provided by the Conversation API and their purpose. 

Function | Description
connection_object.conversation.get_conversation_id() | Returns a unique conversation_id for the conversation being processed. This can then be passed to the other functions described below.
connection_object.conversation.get_messages(conversation_id) | Returns a list of messages from a conversation. You can use this to produce a transcript for a video conference, meeting or telephone call.
connection_object.conversation.get_topics(conversation_id) | Returns the most relevant topics of discussion from the conversation, generated based on the overall scope of the discussion.
connection_object.conversation.get_action_items(conversation_id) | Returns action items generated from the conversation.
connection_object.conversation.get_follow_ups(conversation_id) | Returns follow-up items generated from the conversation, e.g., sending an email, making subsequent calls, booking appointments, setting up a meeting, etc.
connection_object.conversation.get_members(conversation_id) | Returns a list of all the members in a conversation.
connection_object.conversation.get_questions(conversation_id) | Returns explicit questions or requests for information that come up during the conversation.
connection_object.conversation.get_conversation(conversation_id) | Returns the conversation metadata, such as meeting name, member name and email, start and end time of the meeting, meeting type and meeting id.
connection_object.conversation.get_entities(conversation_id) | Extracts entities from the conversation, such as locations, people, dates, organization, datetime, daterange, and custom entities.
connection_object.conversation.get_trackers(conversation_id) | Returns the occurrence of certain keywords or phrases from the conversation.
connection_object.conversation.get_analytics(conversation_id) | Returns the speaker ratio, talk time, silence, pace and overlap from the conversation.
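
To show how several of these calls might be combined, the sketch below gathers a handful of results into a single dictionary. It assumes the method names and the conversation_id argument exactly as listed in the table above; the helper name collect_summary is our own, and the conversation argument is expected to be an object such as connection_object.conversation.

```python
def collect_summary(conversation, conversation_id):
    """Gather several Conversation API results into one dictionary.

    `conversation` is assumed to expose the methods listed in the table
    above, for example connection_object.conversation.
    """
    return {
        "messages": conversation.get_messages(conversation_id),
        "topics": conversation.get_topics(conversation_id),
        "action_items": conversation.get_action_items(conversation_id),
        "follow_ups": conversation.get_follow_ups(conversation_id),
        "questions": conversation.get_questions(conversation_id),
    }
```

Keeping the calls in one place makes it straightforward to add or drop fields, such as trackers or analytics, as your application's needs change.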

Conclusion 

To recap:

  • WebSocket is a widely used protocol for facilitating the exchange of data between a client and a server
  • SIP is a signaling protocol that is used to establish interactive communication sessions, such as phone calls or video meetings
  • The benefits of WebSockets include:
    • Speed
    • Simplicity
    • Constant ongoing updates
  • The benefits of SIP include:
    • Interoperability
    • Adaptability
    • Scalability
  • Integrating your application via WebSocket is done through Symbl.ai’s Streaming API and includes:
    • Preparing the environment
    • Creating a WebSocket connection
    • Subscribing to events
    • Sending audio from a mic, or prepared binary audio data
    • Stopping the connection
  • Integrating your application via SIP is done through Symbl.ai’s Telephony API and includes:
    • Preparing the environment
    • Creating a SIP connection
    • Subscribing to events
    • Stopping the connection
  • You can use the conversation parameter associated with the Connection object to query Symbl.ai’s Conversation API to access specific elements of the recorded interaction. 

To discover more about Symbl.ai’s powerful APIs and how you can tailor them to best fit the needs of your application, visit the Symbl.ai documentation.  Additionally, sign up for the development platform to gain access to the innovative large language model (LLM) that powers Symbl.ai’s conversational intelligence solutions, Nebula, to better understand how you can extract more value from the interactions that take place throughout your organization.  

Team Symbl

The writing team at Symbl.ai