High-Level Architecture
What to Consider Before Integration Starts
Cloud Services
Recorded files:
- Must be available at a publicly accessible cloud storage URL so that Symbl Async API requests can fetch them for conversation analysis.
- Recording length – Symbl conversation analysis supports files up to 4 hours long.
Database:
- Store the webinar recording’s metadata: meeting name, number of participants, participant names (if possible), meeting date/time, and more.
- Async request concurrency management – Symbl allows up to 50 active (scheduled, in_progress) Async API requests per account.
Note: If your concurrency requirements are greater than 50, consider Symbl’s private tenant package.
- Conversation analysis results
Serverless functions:
- Serverless functions should have access to the relevant database tables and/or storage endpoints. They are the key component for adding, managing, and updating conversation analysis status to and from Symbl.
- An HTTPS endpoint for the serverless function that serves as the Async API webhook endpoint.
Symbl
- Registered user account
- Valid token to use Symbl APIs
- Here are the Symbl APIs to review; they will be used for this MVP integration:
  - Generate Token – Generates a valid token for using Symbl’s APIs. The token is valid for 24 hours.
  - Async POST Video URL API – This API is used to:
    - Digest and analyze the webinar recordings.
    - Pass most of the metadata, such as the meeting name and the number of participants used for speaker diarization.
    - Obtain a conversationId and a jobId. The conversationId is used later for Conversation API analysis, and the jobId is used to check the status of the analysis job.
    - Register a webhook endpoint that points to a specific serverless function, which receives the jobId status so you know when the job is completed.
  - Create Pre-Built UI using Experience API – The response provides a link to the Pre-Built UI of the webinar, which includes the transcripts, insights, and more. For the MVP, the link will be wrapped in an iFrame.
  - Summarization API (Labs non-production feature) – This is a pre-alpha feature that can be used to show the meeting summary before, or along with, the Pre-Built UI option.
  - Get job status (Optional) – It is good practice to check the jobId in case the processing time goes beyond 40% of the recording duration.
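Because the token is valid for 24 hours, a serverless function can reuse it instead of generating a new one for every request. The following is a minimal sketch of such a cache. It assumes a hypothetical fetch_token callable that wraps the Generate Token call and returns the token string; the five-minute safety margin is also an assumption, not a Symbl requirement.

```python
import time
from typing import Callable, Optional

TOKEN_TTL_SECONDS = 24 * 60 * 60  # token validity stated in the text above
SAFETY_MARGIN_SECONDS = 5 * 60    # assumption: refresh slightly before expiry


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token: Callable[[], str],
                 clock: Callable[[], float] = time.time) -> None:
        # fetch_token is a hypothetical wrapper around the Generate Token API.
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one when it is stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + TOKEN_TTL_SECONDS - SAFETY_MARGIN_SECONDS
        return self._token
```

Injecting the clock keeps the class easy to test; in a real serverless deployment the cached value would more likely live in a shared store than in process memory.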
Integration Steps – Initial infrastructure
Step 1: Database design
Here are a few things to consider storing when you design your database tables and fields:
Conversation-metadata table (prior to Symbl conversation analysis):
- Unique identifier
- File name
- Datetime created
- File duration length
- Recording url
- Language code
- Time Zone
- Custom Vocabulary – Words that are uncommon in the spoken language of the conversation; providing them helps Speech to Text produce better results
- Company/Business name – In case you have different segments.
- Number of participants
- Recorded in separate channels (optional, for a future separate-channel option) – A recording can be mono (single channel), or each participant can be recorded in a separate channel
- Participants names (Optional for future)
- Participants emails (Optional for future)
- Any other metadata you think should be included.
Symbl-Async-Request table – Holds the unique request identifiers and status, associated with the request metadata, that are created per Async API request:
- conversationId
- jobId
- jobId status
- Datetime created
- The unique identifier from the Conversation-metadata table
- Conversation-Intelligence-<names> tables:
  - For the MVP, have a Conversation-Intelligence-PreBuiltUI table with the following fields:
    - conversationId
    - Link – Taken from the Experience API response (Symbl Pre-Built UI usage)
  - In case you would like to manage/aggregate intelligence data across different calls and/or remove the analyzed data from your Symbl account, store the analyzed data in separate tables per Conversation API, keyed by the unique conversationId of each conversation request.
- Business-UX table (optional) – Used with the Symbl Pre-Built UI to determine which design to provide to each company/business:
  - Company/Business name
  - Logo url
  - Favicon url
  - Additional UX configurations
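The tables described in this step could be expressed, for example, as the SQLite sketch below. The table and column names are illustrative assumptions derived from the field lists above, not a schema prescribed by Symbl, and only a subset of the optional fields is included.

```python
import sqlite3

# Illustrative schema; all table and column names are assumptions.
SCHEMA = """
CREATE TABLE conversation_metadata (
    id                 INTEGER PRIMARY KEY,
    file_name          TEXT NOT NULL,
    created_at         TEXT NOT NULL,
    duration_seconds   REAL NOT NULL,
    recording_url      TEXT NOT NULL,
    language_code      TEXT,
    time_zone          TEXT,
    custom_vocabulary  TEXT,              -- e.g. a JSON-encoded list of words
    company_name       TEXT,
    participant_count  INTEGER,
    separate_channels  INTEGER DEFAULT 0  -- 1 if each speaker has a channel
);
CREATE TABLE symbl_async_request (
    id               INTEGER PRIMARY KEY,
    metadata_id      INTEGER NOT NULL REFERENCES conversation_metadata(id),
    conversation_id  TEXT,                -- empty until Symbl responds
    job_id           TEXT,                -- empty until Symbl responds
    job_status       TEXT NOT NULL DEFAULT 'pending',
    created_at       TEXT NOT NULL
);
CREATE TABLE conversation_intelligence_prebuilt_ui (
    conversation_id  TEXT PRIMARY KEY,
    link             TEXT NOT NULL        -- from the Experience API response
);
CREATE TABLE business_ux (
    company_name  TEXT PRIMARY KEY,
    logo_url      TEXT,
    favicon_url   TEXT
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the four example tables on the given connection."""
    conn.executescript(SCHEMA)
```

Any relational or document database works equally well here; SQLite is used only because it runs without setup.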
Step #2.1: Create serverless function for adding/reusing new/existing recording
When a new recording and/or new metadata are created:
- If the recording file was not added yet, add the new recording file to your storage
- Add the conversation metadata to the Conversation-metadata table
- Add a new record to the Symbl-Async-Request table:
  - jobId status – pending
  - Datetime – Current time
  - jobId and conversationId with no value
  - Unique identifier value from the Conversation-metadata table
- Trigger the serverless function for managing requests in the next step
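The bookkeeping part of this step can be sketched as follows. In-memory containers stand in for the two database tables, and the names AsyncRequest and register_recording are hypothetical; uploading the file and triggering the next function are left out.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AsyncRequest:
    """One row of the Symbl-Async-Request table (field names are assumptions)."""
    metadata_id: str
    job_status: str = "pending"
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    job_id: Optional[str] = None           # filled in after Symbl responds
    conversation_id: Optional[str] = None  # filled in after Symbl responds


def register_recording(metadata: dict, metadata_table: dict,
                       request_table: list) -> AsyncRequest:
    """Store the metadata and queue a pending request that references it."""
    metadata_id = str(uuid.uuid4())
    metadata_table[metadata_id] = dict(metadata)
    request = AsyncRequest(metadata_id=metadata_id)
    request_table.append(request)
    return request
```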
Step #2.2: Create serverless function for managing requests
This serverless function can be triggered in two ways:
#1 – Serverless function trigger from step #2.1:
- Query Symbl-Async-Request for the number of active (scheduled, in_progress) requests:
  - If the number of active requests is less than the maximum concurrency (50 requests):
    - Query the oldest pending request record with no conversationId and jobId in Symbl-Async-Request
    - Send the selected record to Symbl’s Async POST Video URL API with:
      - The serverless function webhook endpoint in the request body
      - Relevant fields taken from the metadata, such as name, url (Recording url), customVocabulary, etc., in the request body
      - Speaker separation method:
        - In case each speaker was recorded in a separate channel, use the field enableSeparateRecognitionPerChannel with the value true and channelMetadata with the relevant data in the request body
        - In case the recording was done in only one channel, use the field enableSpeakerDiarization with the value true and diarizationSpeakerCount with the number of participants in the conversation, as query params
    - Update the selected request in the Symbl-Async-Request table with the conversationId and jobId values returned in the Async API response
Note: In case the file is large, allow up to 3 minutes for the response before updating the status
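The concurrency gate and the request assembly described above could look roughly like this. The record and metadata keys are hypothetical, and the webhookUrl field name is an assumption about the Async API body that should be checked against the Symbl reference; the placement of the diarization fields in the query params follows the description in this step. The sketch only builds the request and does not send it.

```python
from typing import Optional, Tuple

MAX_CONCURRENT = 50                            # per-account limit from the text
ACTIVE_STATUSES = {"scheduled", "in_progress"}


def next_pending(requests: list) -> Optional[dict]:
    """Return the oldest pending record, or None if the limit is reached."""
    active = sum(1 for r in requests if r["job_status"] in ACTIVE_STATUSES)
    if active >= MAX_CONCURRENT:
        return None
    pending = [r for r in requests
               if r["job_status"] == "pending"
               and r.get("job_id") is None
               and r.get("conversation_id") is None]
    return min(pending, key=lambda r: r["created_at"], default=None)


def build_async_request(metadata: dict, webhook_url: str) -> Tuple[dict, dict]:
    """Build (body, query_params) for the Async POST Video URL request."""
    body = {
        "url": metadata["recording_url"],
        "name": metadata["file_name"],
        "webhookUrl": webhook_url,  # assumed field name for the webhook
    }
    if metadata.get("custom_vocabulary"):
        body["customVocabulary"] = metadata["custom_vocabulary"]
    params = {}
    if metadata.get("separate_channels"):
        # Each speaker was recorded in a separate channel.
        body["enableSeparateRecognitionPerChannel"] = True
        body["channelMetadata"] = metadata.get("channel_metadata", [])
    else:
        # Single channel: rely on speaker diarization.
        params["enableSpeakerDiarization"] = "true"
        params["diarizationSpeakerCount"] = str(metadata["participant_count"])
    return body, params
```

Returning the body and the query params separately keeps the function independent of any particular HTTP client.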
#2 – Webhook triggers the serverless function:
- Query for the jobId in the Symbl-Async-Request table, update the jobId status, get the conversationId, and return a 200 success response to the webhook
Note: In case the query returns an empty result, retry after a few seconds and then complete the update
- If the jobId status returned in the webhook POST request is completed:
  - Repeat #1 – Serverless function trigger from step #2.1
  - Trigger the serverless function for managing Conversation API requests and pass it the conversationId
- If the jobId status returned in the webhook POST request is failed, check the file and the logs to see what went wrong, and share the conversationId or jobId with the Symbl team for additional analysis.
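A webhook handler following the flow above might be sketched like this. It assumes the webhook payload carries the jobId in an id field and the state in a status field, which should be verified against the Symbl documentation. find_request is a hypothetical lookup into the Symbl-Async-Request table, and the action names are placeholders for triggering the other functions.

```python
import time
from typing import Callable, Optional


def handle_webhook(payload: dict,
                   find_request: Callable[[str], Optional[dict]],
                   retries: int = 3,
                   delay_seconds: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Update the stored job status and decide which follow-up to trigger."""
    job_id = payload["id"]      # assumed payload field for the jobId
    status = payload["status"]  # assumed payload field for the job status

    # The webhook can arrive before the jobId was written to the table,
    # so retry the lookup a few times before giving up.
    record = find_request(job_id)
    attempts = 0
    while record is None and attempts < retries:
        sleep(delay_seconds)
        record = find_request(job_id)
        attempts += 1
    if record is None:
        return {"statusCode": 404, "conversation_id": None, "actions": []}

    record["job_status"] = status
    if status == "completed":
        # Free slot: start the next pending request, then fetch the results.
        actions = ["manage_requests", "fetch_results"]
    elif status == "failed":
        actions = ["investigate"]
    else:
        actions = []
    return {"statusCode": 200,
            "conversation_id": record["conversation_id"],
            "actions": actions}
```

Passing sleep as a parameter lets the retry path be exercised in tests without waiting.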
Step #2.3: Create serverless function for getting the conversation analysis results
Once this serverless function is triggered with the conversationId, you can use the Conversation APIs to get the conversation’s Speech to Text, Action Items, Follow Ups, Topics, Questions, Conversational Analytics, and more, and optionally store the results in your database.
In addition, for a quick Summary UI implementation of the conversation, you can use the same conversationId with the Business-UX table to set the Symbl Pre-Built UI style and wrap it in an iFrame as part of your website/app.
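As an illustration, the helper below builds the Conversation API URLs for a given conversationId and wraps a Pre-Built UI link in an iFrame. The base URL and path segments reflect the Conversation API as commonly documented, but treat them as assumptions and confirm them against the current Symbl API reference. The function names and iFrame dimensions are hypothetical.

```python
import html

# Assumed base URL and path segments; verify against the Symbl API reference.
BASE_URL = "https://api.symbl.ai/v1/conversations"
INSIGHT_PATHS = {
    "speech_to_text": "messages",
    "action_items": "action-items",
    "follow_ups": "follow-ups",
    "topics": "topics",
    "questions": "questions",
    "analytics": "analytics",
}


def conversation_urls(conversation_id: str) -> dict:
    """Map each insight name to the GET URL for the given conversation."""
    return {name: f"{BASE_URL}/{conversation_id}/{path}"
            for name, path in INSIGHT_PATHS.items()}


def iframe_html(link: str, width: str = "100%", height: str = "600") -> str:
    """Wrap a Pre-Built UI link in an iFrame, escaping it for HTML."""
    safe_link = html.escape(link, quote=True)
    return f'<iframe src="{safe_link}" width="{width}" height="{height}"></iframe>'
```

Each URL would be requested with the valid token in the Authorization header; that call is omitted here.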
Step #2.4: Create a cronjob to check if jobId status is as expected (Optional)
Our Symbl.ai system is resilient and stable, but as with all systems it is best to verify that the jobId statuses from webhook POST requests are not stuck. For this purpose you can implement another function with a cronjob that checks the jobId status every hour by:
- Querying all the active jobs (scheduled, in_progress) and calculating whether the time elapsed since the request was made exceeds ~40% of the recording length, which is the time within which Symbl.ai should complete the analysis of the recorded file.
- Checking the jobId status
In case one of the above is not as expected, please share this feedback with the Symbl.ai team so we can debug it.
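The elapsed-time check can be sketched as below. For a 60-minute recording the budget is 0.4 × 60 = 24 minutes, so a job still active after 25 minutes would be flagged. The record keys are hypothetical, and the actual call to the Get job status API is omitted.

```python
from datetime import datetime, timedelta

EXPECTED_FRACTION = 0.4                        # ~40% of the recording length
ACTIVE_STATUSES = {"scheduled", "in_progress"}


def is_overdue(requested_at: datetime, duration_seconds: float,
               now: datetime, fraction: float = EXPECTED_FRACTION) -> bool:
    """True if more time has passed than the expected processing budget."""
    budget = timedelta(seconds=duration_seconds * fraction)
    return (now - requested_at) > budget


def overdue_jobs(jobs: list, now: datetime) -> list:
    """Return the active jobs that have exceeded their processing budget."""
    return [job for job in jobs
            if job["job_status"] in ACTIVE_STATUSES
            and is_overdue(job["requested_at"], job["duration_seconds"], now)]
```

The jobs returned by overdue_jobs are the ones worth checking with the Get job status API before reporting them.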
Step 3: Create serverless function for managing requests
This serverless function can be triggered in two ways:
#1 – Serverless function trigger from step 2:
- Query Symbl-Async-Request for the number of active (scheduled, in_progress) requests:
  - If the number of active requests is less than the maximum concurrency (50 requests):
    - Query the oldest pending request record with no conversationId and jobId in Symbl-Async-Request
    - Send the selected record to Symbl’s Async Audio API with:
      - The serverless function webhook endpoint in the request body
      - Relevant fields taken from the metadata, such as name, url (Recording url), customVocabulary, etc., in the request body
      - Speaker separation method:
        - In case each speaker was recorded in a separate channel, use the field enableSeparateRecognitionPerChannel with the value true and channelMetadata with the relevant data in the request body
        - In case the recording was done in only one channel, use the field enableSpeakerDiarization with the value true and diarizationSpeakerCount with the number of participants in the conversation, as query params
    - Update the selected request in the Symbl-Async-Request table with the conversationId and jobId values returned in the Async API response
Note: In case the file is large, allow up to 3 minutes for the response before updating the status
#2 – Webhook triggers the serverless function:
- Query for the jobId in the Symbl-Async-Request table, update the jobId status, get the conversationId, and return a 200 success response to the webhook
Note: In case the query returns an empty result, retry after a few seconds and then complete the update
- If the jobId status returned in the webhook POST request is completed:
  - Repeat #1 – Serverless function trigger from step 2
  - Trigger the serverless function for managing Conversation API requests and pass it the conversationId
- If the jobId status returned in the webhook POST request is failed, check the file and the logs to see what went wrong, and share the conversationId or jobId with the Symbl team for additional analysis.
Step 4: Create serverless function for getting the conversation analysis results
Once this serverless function is triggered with the conversationId, you can use the Conversation APIs to get the conversation’s Speech to Text, Analytics, Trackers, Topics, and Summary, and optionally store the results in your database.
In addition, for a quick Summary UI implementation of the conversation, you can use the same conversationId with the Business-UX table to set the Symbl Pre-Built UI style and wrap it in an iFrame as part of your website/app.
Step 5: Create a cronjob to check if jobId status is as expected (Optional)
Our Symbl system is resilient and stable, but as with all systems it is best to verify that the jobId statuses from webhook POST requests are not stuck. For this purpose you can implement another function with a cronjob that checks the jobId status every hour by:
- Querying all the active jobs (scheduled, in_progress) and calculating whether the time elapsed since the request was made exceeds ~40% of the recording length, which is the time within which Symbl should complete the analysis of the recorded file.
- Checking the jobId status
Next steps: Try it yourself!
If you have any questions on how to build a webinar solution for business recordings with Symbl.ai, please reach out to us at [email protected].