Natural language understanding is advancing at a breakneck pace, but most solutions focus on narrow, task-specific problems. Moving beyond models trained on written text to handle spoken conversations is critical. Building your own solution for contextual conversation intelligence is the best way to extract the insight you need.

Natural language processing (NLP) has progressed by leaps and bounds over the last few years.

And while advances in the natural language understanding (NLU) space (a subset of NLP) have equipped developers with incredible capabilities, they come with a catch: most of the existing solutions are laser-focused on specific problems.

Technical advances in NLU have improved our ability to solve specific problems with ever-increasing accuracy. You can use a model to do text classification on each line of a given passage. You can perform sentiment analysis on each line. And you can paraphrase every paragraph within a text.

But your toolkit is still missing the tools that enable a more intelligent, contextual understanding of everyday natural language.

Context is critical for understanding everyday language. In human-to-human (H2H) conversations, we rely on our memory of what was said earlier in the conversation, our ability to recall past conversations, and our knowledge of what the other person knows.

Lack of contextual understanding is the fundamental problem that most developers working in the natural language space face today.

Solving the problem of contextual understanding

Developers in the natural language space create machine learning and deep learning (ML/DL) models. The ultimate goal is to build models that can process written and spoken language the way a human can.

But so far, advances in the NLP space haven’t yielded a contextual understanding of our actual everyday language.

However, developers can now build models that can process the linguistic relationships and patterns of thought beneath the surface-level language to go beyond task-specific models and deliver real conversation intelligence.

Developers have access to a wealth of existing, task-specific solutions in the natural language space. And that's the key: we can combine these tools to create models that offer a genuine contextual understanding of language.

Conversation summaries: the challenge

Consider a common task in most organizations: generating written summaries of conversations. Let’s think through how to build a model that can accomplish this effectively — and why it requires more than just using existing natural language solutions.

There are plenty of resources available for summarizing written text. But spoken language contains many complexities and irregularities that written language doesn't. Text classification and sentiment analysis, for example, don't tell the whole story about what someone is actually talking about.

In spoken language, we use lots of repetition, circling back to the same thought again and again. We use filler words like yeah, uh, you know, kinda. And we use more informal, casual language than in written text.
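
To make these irregularities concrete, here is a minimal Python sketch of the kind of transcript cleanup that often precedes summarization. The filler list and regexes are illustrative only, not an exhaustive disfluency filter:

```python
import re

# Illustrative (not exhaustive) list of common spoken-language fillers,
# each optionally followed by a trailing comma.
FILLERS = r"\b(?:yeah|uh|um|erm|you know|kinda|sorta)\b,?"

def clean_utterance(text: str) -> str:
    """Strip filler words and immediate word repetitions from one utterance."""
    text = re.sub(FILLERS, "", text, flags=re.IGNORECASE)
    # Collapse immediate repetitions such as "we we should" -> "we should".
    text = re.sub(r"\b(\w+)(\s+\1\b)+", r"\1", text, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", text).strip()

print(clean_utterance("Yeah, um, we we should, you know, kinda revisit the the rollout."))
# -> "we should, revisit the rollout."
```

Even this rough pass shows why spoken transcripts need different handling: the cleaned sentence is still not polished prose, but it is far closer to something a downstream model can use.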

Since most summarization models are trained on written text, they can't generate accurate summaries of human conversations. If we fed the transcript of a typical thirty-minute conversation into such a model, the resulting summary would likely include a lot of irrelevant information and leave out much of what was actually critical to the conversation.

We can’t apply a model that was trained on written text to spoken conversations. But if we use other algorithms strategically in combination, we can build a multi-step model that generates clean conversation summaries.

Conversation summaries: 3 steps to building a model

If we want to summarize a conversation, it's important to consider the role structure plays. Conversations are dialogues in which speakers share their thoughts with each other in real time, clarifying and expanding on their ideas; individual remarks often require knowledge of the immediate context to make sense.

Piecing together existing tools gives us a powerful way to achieve contextual understanding of conversations — but making it happen requires careful thought and planning. Curious about the exact steps you need to take? Here’s a closer look.

1. Identify topics

The first step in building a conversation summary model is topic modeling.

Consider a typical recording from an all-hands department meeting. Discussion during the meeting might cover the introduction of new team members, wins from the previous month, a review of results from the last quarter, the rollout of a new product update, news from the marketing team, updates on the company's remote work status, and more.

To surface these and other key topics mentioned during the conversation, we can use topic modeling algorithms like probabilistic Latent Semantic Analysis (pLSA), a technique that models information under a probabilistic framework, where topics are treated as latent (hidden) variables. Another common option is Latent Dirichlet Allocation (LDA), a method that infers the topics most likely to have generated a collection of words.
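
As a rough illustration, here is a minimal sketch of step 1 using scikit-learn's LDA implementation. The utterances are hypothetical stand-ins for a real meeting transcript:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical utterances standing in for a cleaned meeting transcript.
utterances = [
    "Please welcome our two new engineers joining the platform team",
    "Quarterly revenue grew and churn dropped versus last quarter",
    "The product update rolls out to all customers next Tuesday",
    "Marketing is launching the new campaign alongside the rollout",
]

# Convert utterances into a bag-of-words document-term matrix.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(utterances)

# Fit LDA, treating each topic as a latent distribution over words.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(doc_term)

# Print the highest-weight words for each discovered topic.
terms = vectorizer.get_feature_names_out()
for idx, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-4:][::-1]]
    print(f"topic {idx}: {top}")
```

On a real transcript you would fit over hundreds of utterances, and the number of topics (`n_components`) becomes a tuning decision rather than a given.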

2. Build timelines

We know what was discussed. Now we need to know when.

Topic modeling gives us an idea of what the main topics in the conversation are. Next, we can break up the conversation using topic segmentation so that we know the specific points in time when speakers were discussing each of the key topics.

Topic segmentation provides us with a timeline, breaking the conversation down into blocks of time based on the topics discussed. For example, in our team meeting summary, we might see that the department head made general introductions from 0:00 to 1:10, new team members were introduced from 1:10 to 4:23, team members shared recent wins from 4:23 to 9:37, and so on.
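
One simple way to sketch this step is to assign each utterance its dominant topic and merge consecutive utterances that share one. This assumes the fitted `lda` and `vectorizer` from the step-1 sketch, and the timestamps (in seconds) are hypothetical, chosen to echo the timeline above:

```python
# Hypothetical timestamped utterances (start time in seconds, text).
timed_utterances = [
    (0,   "Thanks everyone for joining, let's get started"),
    (70,  "Please welcome our two new engineers joining the platform team"),
    (263, "Last month we closed three major accounts"),
    (577, "Quarterly revenue grew and churn dropped versus last quarter"),
]

def dominant_topic(text):
    """Return the most probable LDA topic for a single utterance."""
    return lda.transform(vectorizer.transform([text]))[0].argmax()

# Merge consecutive utterances that share a dominant topic into one block.
blocks = []
for start, text in timed_utterances:
    topic = dominant_topic(text)
    if blocks and blocks[-1]["topic"] == topic:
        blocks[-1]["texts"].append(text)
    else:
        blocks.append({"topic": topic, "start": start, "texts": [text]})

for b in blocks:
    print(f"{b['start'] // 60}:{b['start'] % 60:02d}  topic {b['topic']}: {len(b['texts'])} utterances")
```

The exact block boundaries depend on the fitted model; production systems typically use dedicated segmentation algorithms rather than this naive merge, but the output shape is the same: a timeline of topic blocks.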

3. Create summaries

It’s time to generate your summary. The last step is to use a text summarization algorithm to generate summaries of each conversation block. Once these have been generated, you can assemble them into an overall summary of the conversation.

Here, it’s critical to ensure that your model is trained on short blocks of conversation data. If your summary model was trained on heavily edited informational articles, it won’t be able to identify the most important points when generating a summary.
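
Here is a minimal sketch of this final step, assuming the `blocks` list from the step-2 sketch. The model name is just one example of a publicly available summarizer fine-tuned on dialogue data (the SAMSum corpus) rather than on edited articles:

```python
from transformers import pipeline

# Example of a summarizer fine-tuned on dialogue data rather than articles.
summarizer = pipeline("summarization", model="philschmid/bart-large-cnn-samsum")

# Summarize each topic block independently.
section_summaries = []
for block in blocks:
    text = " ".join(block["texts"])
    summary = summarizer(text, max_length=60, min_length=10)[0]["summary_text"]
    section_summaries.append(summary)

# Assemble the per-block summaries into one overall meeting summary.
print("\n".join(f"- {s}" for s in section_summaries))
```

Summarizing block by block, rather than feeding the whole transcript in at once, is what lets the final summary preserve the conversation's structure.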

Create your own conversation intelligence models

Building your own solution for contextual conversation intelligence requires strategic thinking, technical expertise, and plain old hard work. While you could build your own model, Symbl.ai makes it possible to skip that effort with solutions like our GET Topics API.

Additional reading

What It Really Means to Add Context To Your Conversation AI

The What, Where, and Why of Contextual AI

What’s That, Human? The Challenges of Capturing Human to Human Conversations

Contextual AI: The Next Frontier of Conversational Intelligence

Sekhar Vallath
Data Scientist (AI in NLP)